THOR THUNDERSCAN · THESIS Anti-Pattern Analysis

Database Design Anti-Patterns:
God Tables & EAV Model Abuse

How two common shortcuts — the God Table monolith and the Entity-Attribute-Value flexibility trap — transform high-performing systems into unmaintainable legacy nightmares.

Isaac Shi, Co-Founder & GP of Golden Section VC
October 8, 2025
18 min read · Database Architecture
Share on LinkedIn Share on X
Vector Databases & Embedding Architecture All Articles From CRUD to Cognition

Database design anti-patterns represent a fundamental challenge in software systems that can transform a high-performing, maintainable application into a legacy nightmare. These anti-patterns emerge when developers seek shortcuts to avoid proper relational design — either through laziness, poor understanding, or misguided attempts at "flexibility." The consequences of these decisions often surface years later, when the system has accumulated significant technical debt and changing the design becomes prohibitively expensive.

This essay explores two of the most problematic database design anti-patterns: the God Table and the Entity-Attribute-Value (EAV) model. Both represent attempts to solve perceived problems with relational databases that ultimately create far more severe, systemic issues than they solve. We examine God Tables first — using the Electronic Health Record (EHR) patients table as a primary example, because no domain illustrates this anti-pattern's real-world consequences more clearly — before turning to EAV, whose performance consequences are technically subtle but equally catastrophic.

Key Insight: Anti-patterns in database design cause real performance, cost, and velocity damage. They create compounding technical debt that affects every layer of an application — from query performance to developer onboarding time — and their costs grow exponentially the longer they remain in production.

God Tables: The Monolith That Swallows Everything

Understanding God Table Syndrome

God tables, also known as "universal tables," represent the anti-pattern of storing unrelated entities within a single massive table. These tables emerge gradually, often starting as a clean, simple entity that progressively accumulates unrelated columns until it becomes an unmanageable monolith. No industry illustrates this pathology more clearly — or more consequentially — than Electronic Health Records (EHR) and Health IT systems, where a patients table is one of the most common victims of god table growth. [5]

The EHR Patient Table: A Classic God Table

In a well-designed EHR system, a patients table should contain only core identity data — name, date of birth, MRN (Medical Record Number), and contact information. In practice, EHR systems routinely evolve the patient table into a god table that conflates at least six distinct clinical and administrative domains into a single row, often exceeding 80–120 columns in production systems.

TABLE: patients ⚠ GOD TABLE
86 columns · 1 sample row shown
Identity Demographics Clinical Insurance Care Team Legacy
# Column Name Data Type Domain Sample Value
Showing all 86 columns · This entire structure is one database table row ⚠ Every query against this table loads all 86 columns into memory

This is not a theoretical worst case — it is the default outcome of most EHR implementations. Diagnosis codes, insurance claims, care team assignments, and legacy HL7 migration artifacts accumulate into the same row because adding a column to the existing patients table is always the path of least resistance for each new feature team. The billing team adds insurance columns. The clinical team adds diagnosis fields. The interoperability team adds FHIR and HL7 references. The result is a table that no single developer fully understands.

🏥 WHY EHR SYSTEMS ARE ESPECIALLY VULNERABLE

EHR databases face three compounding pressures that accelerate god table growth: (1) regulatory mandates requiring new data fields (ICD-10 codes, HIPAA identifiers, Meaningful Use attestation fields) are added directly to existing tables rather than through schema evolution; (2) HL7/FHIR interoperability migrations deposit legacy reference columns that are never cleaned up; and (3) multi-specialty clinical workflows — from oncology to primary care — each append domain-specific columns to the shared patient record rather than introducing properly linked tables.

The psychological drivers behind god table creation follow predictable patterns:

  • Convenience over conception — Adding a column to patients is faster than designing a patient_diagnoses relationship; every team makes this trade-off once, and the damage compounds
  • Fear of schema complexity — Clinical teams worry that more tables will make the system "harder to query," failing to recognize that normalized structures make individual queries simpler and faster
  • Regulatory-driven column sprawl — Each new compliance requirement (ICD-10, SNOMED CT, HIPAA, CMS quality measures) appends fields to the existing table instead of extending the domain model
  • Accumulated migration artifacts — HL7 v2, HL7 v3, and FHIR R4 migrations each leave legacy reference columns behind, never deprecated

Systemic Failures of the EHR Patient God Table

⚡ TIGHT COUPLING ACROSS CLINICAL DOMAINS

When clinical diagnoses, insurance billing codes, and care team assignments all live in the patients table, a schema change for any one domain locks the entire table. A CMS-mandated update to ICD-10 code fields can block the scheduling system from writing appointment records. A billing integration change can trigger a full-table migration that takes the patient registration workflow offline. Domains that should be completely independent share a single point of failure.

📉 HIPAA SURFACE AREA EXPLOSION

Every column added to the patients table expands the Protected Health Information (PHI) footprint of every query. A scheduling query that only needs patient_id and appointment_time must touch a row containing SSN, diagnosis codes, insurance IDs, and financial data — because they all live in the same row. This violates the principle of minimum necessary access and dramatically increases HIPAA audit scope. Every application that touches the patients table becomes a potential PHI exposure vector. [6]

💥 INDEX FAILURE CASCADES IN CLINICAL WORKFLOWS

A hospital system documented a failure where a patients table with 94 columns experienced an index corruption on the insurance_id column. Because billing, scheduling, lab ordering, and admission workflows all read from the same table, every clinical workflow degraded simultaneously — not just billing. A problem in the financial domain caused patient check-in to time out. The on-call team could not isolate the failure because there was no domain boundary to contain it.

🔧 THE QUERY THAT FETCHES EVERYTHING TO SHOW NOTHING

A typical nurse station dashboard query in an EHR god table system reads: SELECT * FROM patients WHERE patient_id = ? — returning all 86 columns including SSN, outstanding balance, legacy HL7 identifiers, and deprecated fields from a 2009 migration, just to display the patient's name and allergies. Every page load in the application carries the weight of the entire table's history. This is not a query optimization problem — it is a schema design problem.

The Correct EHR Schema: Domain Decomposition

A properly normalized EHR design separates the six domains into independent tables, each owned by a distinct clinical or administrative team:

-- BEFORE: God Table — 86 columns, 6 domains, 1 lock surface CREATE TABLE patients ( patient_id, mrn, first_name, last_name, dob, -- identity address_line1, city, phone_home, race, -- demographics primary_diagnosis_icd10, chief_complaint, -- clinical insurance_provider, outstanding_balance, -- billing primary_care_physician_id, care_team_note, -- care coord fhir_resource_id, legacy_emr_id ... -- interop legacy ); -- AFTER: Normalized — 6 focused tables, independent domains CREATE TABLE patients (patient_id, mrn, first_name, last_name, dob, created_at); CREATE TABLE patient_demographics (patient_id REFERENCES patients, address, phone, language, race, ethnicity); CREATE TABLE patient_diagnoses (id, patient_id REFERENCES patients, icd10_code, onset_date, status, recorded_by); CREATE TABLE patient_insurance (id, patient_id REFERENCES patients, provider, member_id, group_number, copay, deductible); CREATE TABLE patient_care_team (patient_id REFERENCES patients, physician_id, role, assigned_at); CREATE TABLE patient_interop_refs (patient_id REFERENCES patients, system, external_id, version, imported_at);
Failure ModeRoot Cause in EHR God TableSeverityTimeline
Billing index failure takes check-in offlineClinical & financial domains share one tableCriticalImmediate
ICD-10 update requires full-table migrationDiagnosis columns mixed with identity columnsCriticalAny regulatory cycle
HIPAA audit scope covers entire applicationEvery query returns PHI fields unnecessarilyCriticalOngoing / compounding
New specialty workflow requires god table changeNo domain-specific tables to extend independentlyHighEvery new feature
HL7→FHIR migration leaves 30+ dead columnsNo interop isolation layerHighEvery migration cycle

Case Studies: God Tables Across Industries

Regional Hospital Network — EHR Patient Table Collapse
patients table grew from 18 → 94 columns over 11 years across 3 HL7 migration cycles

A regional hospital network's EHR system accumulated 11 years of clinical workflow additions, regulatory mandates, and interoperability migrations into a single patients table that reached 94 columns. The system broke down when a CMS quality reporting requirement mandated adding structured social determinants of health (SDOH) fields. The migration — which should have been an independent patient_sdoh table — instead required a full-table schema change that caused 4 hours of downtime across admissions, scheduling, lab ordering, and the nurse station dashboard simultaneously, because all four systems queried the same god table.

94
Columns in patients table
6
Clinical domains mixed together
4 hrs
System-wide downtime from one migration
11 yr
Accumulation across 3 HL7 cycles

Post-incident analysis revealed that 38 of the 94 columns were orphaned legacy fields from prior HL7 v2 and HL7 v3 migrations that no active application still read — yet they were included in every SELECT * query across the system, bloating row sizes and consuming buffer pool memory that should have served active clinical data. The HIPAA compliance officer flagged that 11 systems had been unnecessarily reading SSN and financial data during routine scheduling queries for years.

B2B SaaS Platform — The Same Pattern, Different Domain
accounts table evolved from 12 → 73 columns over 8 years

The god table anti-pattern is not unique to healthcare. A B2B SaaS platform's accounts table followed an identical accumulation pattern — reaching 73 columns combining authentication, billing, usage tracking, user preferences, and legacy migration data over eight years. When a billing adjustment required a schema change, it triggered a full migration spanning 47 dependent tables and consumed six months of engineering time that should have shipped customer-facing features. [5]

73
Columns in one table
47
Tables in migration blast radius
6 mo
Dev time lost to remediation
8 yr
Accumulation period
Common Thread: Whether it is a hospital's patients table or a SaaS platform's accounts table, the god table anti-pattern follows the same trajectory — a clean seed entity, years of cross-domain column accumulation, and a final catastrophic event that forces a painful remediation. The only difference in healthcare is that the stakes include patient safety and HIPAA liability alongside engineering cost.

The EAV Model: Flexibility at the Cost of Performance

Understanding EAV Architecture

The Entity-Attribute-Value model — also known as the "one true lookup table" or "sparse matrix" — represents an anti-pattern that forces a non-relational approach onto relational database systems. In its basic form, the EAV model uses three primary tables: objects, attributes, and attribute_values. This structure theoretically allows unlimited flexibility — new attributes can be added without modifying the database schema. [1]

-- The EAV "flexibility" trap: 3 tables to describe 1 product SELECT e.name, MAX(CASE WHEN a.name = 'price' THEN av.value END) AS price, MAX(CASE WHEN a.name = 'color' THEN av.value END) AS color, MAX(CASE WHEN a.name = 'size' THEN av.value END) AS size, MAX(CASE WHEN a.name = 'weight' THEN av.value END) AS weight, MAX(CASE WHEN a.name = 'material' THEN av.value END) AS material FROM entities e JOIN attribute_values av ON av.entity_id = e.id JOIN attributes a ON a.id = av.attribute_id WHERE e.type = 'product' GROUP BY e.name; -- vs. proper relational design — same result, one scan: SELECT name, price, color, size, weight, material FROM products;
Performance Benchmark: A study comparing EAV versus JSONB for 10 million entities in PostgreSQL found that JSONB required only 2.08 GB total storage vs. EAV's 6.43 GB — and executed queries 50,000× faster without indexing. [2]
6.43 GB
EAV — 10M entities
2.08 GB
JSONB — same 10M entities
Baseline
EAV unindexed query speed
50,000×
JSONB faster (unindexed)

The Performance Degradation Problem

Relational Database Management Systems operate most efficiently when data structures align with their optimization strategies. The EAV anti-pattern actively works against these mechanisms, creating what some database experts describe as "the worst possible design" for performance. [3]

🔀 QUERY COMPLEXITY EXPLOSION

Retrieving a single product with 15 attributes requires 15 separate joins on the same attribute_values table. A basic SELECT that would be trivial in a properly normalized schema becomes an incomprehensible maze of conditional aggregations. SQL complexity grows exponentially with every additional attribute.

📇 INDEXING PARALYSIS

Traditional RDBMS optimization relies heavily on column-specific indexes. In an EAV structure, all values exist within a generic Value column — making it impossible to build effective indexes on specific data types or attribute ranges. Indexes on foreign key relationships provide minimal benefit for attribute-specific queries.

✍️ WRITE OPERATION OVERHEAD

Creating a single entity requires multiple INSERT operations across separate tables. For an entity with 15 attributes, the system performs 16 separate INSERT statements instead of one. Every UPDATE multiplies similarly. Write throughput collapses under production load.

📊 STATISTICAL ANALYSIS COSTS

Basic analytics become prohibitively expensive. Calculating average prices across products requires scanning and joining hundreds of thousands of rows instead of performing simple columnar mathematics. Reporting latency measured in seconds degrades to minutes at scale.

Real-World EAV Failures

The EAV anti-pattern often emerges in organizations seeking to bypass governance processes. Development teams implement EAV structures as a technical workaround to avoid tedious approval processes for schema changes — trading short-term convenience for long-term technical debt. This anti-pattern appears most frequently in:

  1. Content Management Systems — Developers create "flexible" attribute systems that allow non-technical users to define custom fields
  2. Enterprise Software — Applications supporting multiple customer configurations without proper schema design for each
  3. Medical Data Systems — Systems attempting to store diverse clinical measurements without considering analytical requirements
Hospital System — EAV Patient Data Failure
Clinical reporting became functionally impossible

A hospital system using EAV for patient data discovered that generating simple daily reports required complex SQL with multiple CTEs (Common Table Expressions) and 15-second execution times for basic queries. The same data retrieved from properly normalized tables executed in milliseconds. The performance differential was not a configuration issue — it was a fundamental consequence of the EAV model's inability to leverage RDBMS optimization. [4]

Architectural Alternatives and Patterns

Contemporary database design recognizes that traditional relational approaches may not suit every data pattern. Rather than forcing data into inappropriate relational models or creating anti-patterns, developers have access to legitimate alternatives that provide genuine flexibility without sacrificing performance.

✅ JSONB FOR SEMI-STRUCTURED DATA

PostgreSQL's JSONB type offers a legitimate path for truly dynamic attributes. Unlike EAV tables, JSONB provides native indexing through GIN indexes, a 40% smaller storage footprint, query execution up to 50,000× faster than unindexed EAV approaches, and type safety options. [2]

-- JSONB: flexibility without EAV's join tax SELECT id, name, attributes->>'price' AS price, attributes->>'color' AS color, attributes->>'material' AS material FROM products WHERE (attributes->>'price')::numeric > 100; -- Add GIN index for fast attribute searches CREATE INDEX idx_products_attrs ON products USING GIN(attributes);
Document Databases for Highly Variable Data: Systems requiring extensive schema flexibility may benefit from document databases like MongoDB, where dynamic structures are a first-class feature rather than an anti-pattern workaround. This is a legitimate architectural choice — not a relational database shortcut.

God Table Remediation: Domain Decomposition

The cure for god tables is systematic domain decomposition — identifying the distinct business domains mixed into a single table and separating them into properly normalized structures:

-- BEFORE: God Table (73 columns, 5 domains) CREATE TABLE accounts ( id, email, full_name, -- core password_hash, mfa_secret, -- auth stripe_customer_id, mrr_cents, -- billing theme, dashboard_layout, -- preferences legacy_id, v1_user_id ... -- legacy ); -- AFTER: Proper decomposition — 5 focused tables CREATE TABLE users (id, email, full_name, created_at); CREATE TABLE user_auth (user_id REFERENCES users, password_hash, mfa_secret, last_login_ip); CREATE TABLE user_billing (user_id REFERENCES users, stripe_id, plan_id, mrr_cents); CREATE TABLE user_preferences (user_id REFERENCES users, theme, language, dashboard_layout); CREATE TABLE user_legacy_refs (user_id REFERENCES users, legacy_id, v1_user_id);
ApproachUse WhenAvoid WhenPerformance
Proper normalizationStructured, known schemaTruly unknown attributesExcellent
JSONB columnDynamic attributes, same entity typeFrequent attribute-level queriesVery Good
Document DBHighly variable schema by designComplex relational queries requiredGood
EAV modelAlmost never in RDBMSEverything in practicePoor
God tableNeverAlwaysDegrades rapidly

Consequences and Systemic Impact

Both EAV and god tables create technical debt that compounds exponentially over time. Database anti-patterns represent a special category of technical debt because they affect every tier of an application stack simultaneously.

🏗️
Application Layer Contamination

When the persistence layer relies on anti-patterns, the entire application must adapt. Business logic becomes intertwined with persistence mechanisms, making codebase evolution increasingly difficult.

⚙️
Operational Complexity

Backup and restore processes become slower and more resource-intensive. Performance tuning on flawed designs yields diminishing returns at every iteration.

👥
Team Velocity Loss

New developers require extensive time to understand anti-pattern structures. Simple features that should take hours require weeks. This is the most significant long-term cost — compounding developer friction. [4]

Industry Economic Impact

The broader economic consequences extend beyond individual projects. Organizations encountering entrenched anti-patterns typically face three outcomes:

  1. Forced System Rewrites — In severe cases, anti-pattern databases become unmaintainable, forcing complete rewrites that include data migration, application rewrites, and user experience disruption — a cost measured in millions.
  2. Competitive Disadvantage — While competitors advance customer-facing features, teams stuck with anti-patterns spend developer cycles on maintenance, while competitors ship customer-facing features.
  3. Market Perception Damage — Customers experience the symptoms through slow performance, unreliable data access, and inconsistent information — directly affecting retention and reputation.

Preventive Strategies and Recovery Patterns

Architectural Governance

  • Database Design Reviews — Mandated architectural reviews for any non-standard schema ensure anti-patterns receive proper consideration before implementation
  • Cross-Team Standards — Standardized approaches to common data modeling challenges prevent individual teams from implementing ad-hoc anti-pattern solutions
  • Tooling Investment — Providing development teams with schema evolution tooling (Flyway, Liquibase) removes the motivation for anti-pattern adoption
  • Automated Detection — Tools like SQLCheck can identify god tables and EAV structures during development, not production

Recovery and Migration Approaches

When anti-patterns are already in production, systematic recovery is crucial:

Phase 1 — Assess

Map all anti-patterns. Quantify business risk per table. Identify quick wins vs. complex migrations. Document all FK dependencies.

Phase 2 — Isolate

Freeze the god table — no new columns. Begin dual-write to new normalized tables. Write adapter layer for existing consumers.

Phase 3 — Migrate

Migrate consumers incrementally. Validate data consistency at each step. Gradually deprecate the anti-pattern table with a sunset date.

Critical: Anti-pattern recovery requires non-technical stakeholder buy-in since these changes affect development velocity and may require short-term feature deferment. Frame recovery as an investment with measurable ROI — not as "cleaning up technical debt."

The Path Forward: Mature Database Engineering

Modern database engineering recognizes that proper design represents an investment in system longevity rather than an implementation detail. Organizations succeeding with large-scale database systems demonstrate several critical characteristics:

  • They view database design as requiring senior-level technical expertise — not an entry-level concern
  • They invest in tooling and processes that make correct design choices easier than incorrect ones
  • They recognize when anti-patterns have been introduced and develop systematic approaches to address them
  • They treat schema evolution as a first-class engineering discipline with tooling, review, and governance

The EAV and god table anti-patterns serve as cautionary tales that illustrate how attempts to circumvent relational database design principles ultimately create far more complex problems than they solve. Understanding these anti-patterns helps database professionals recognize early warning signs and implement sustainable architectures that scale with business growth rather than becoming liabilities that constrain future development.

Detect Anti-Patterns in Your Production Database

ThunderScan automatically identifies god tables, EAV abuse, missing FK constraints, normalization violations, and 40+ other anti-patterns — with AI-generated remediation SQL and prioritized action plans.

Scan My Database Free

Sources & Further Reading

  1. Smithers, M. (2013). The Anti-Pattern: EAVil Database Design.
  2. Coussement, J. (2016). Replacing EAV with JSONB in PostgreSQL.
  3. CYBERTEC PostgreSQL. EAV Design in PostgreSQL: Don't Do It. (Page exists; access may require direct browser navigation.)
  4. Levitation Engineering. Anti-Patterns in Database Design: Lessons from Real-World Failures.
  5. CloudThat. Recognizing and Preventing Common Data Modeling Anti-Patterns.
  6. Level Up Coding. The God Object Anti-Pattern: Why You Should Avoid It At All Costs.

Further Reading from the Author

Isaac Shi writes about AI, software, and entrepreneurship at isaacshi.com. These essays provide the strategic and philosophical context behind this thesis.

Essay · Isaac Shi
The Narrative Trap
Anti-patterns survive because teams tell themselves a convincing story — why data and rigorous analysis must always overrule the comfortable narrative.
Essay · Isaac Shi
Don't Code Before Reading This
The engineering culture essay — bad patterns come from the same kitchen chaos Bourdain described, and fixing them starts with craft and discipline.
Found this useful?
Share with your team and help more engineers recognize costly database anti-patterns.
Share on LinkedIn Share on X Download PDF
Continue Reading
© 2026 Thor ThunderScan  ·  ← Back to Thesis  ·  Start Scanning →