Post

The Fork in the Road: Modeling Data Relationships with PostgreSQL JSONB vs. Traditional Tables

Title: The Fork in the Road: Modeling Data Relationships with PostgreSQL JSONB vs. Traditional Tables

Meta Description: Dive deep into a practical database design challenge. We compare using PostgreSQL’s flexible jsonb type against traditional linking tables for modeling a “study journey,” exploring the critical trade-offs between data integrity and developer flexibility.

Tags: PostgreSQL, Database Design, JSONB, SQL, Data Modeling, System Architecture, Tech Article


As developers and architects, we often stand at a crossroads. Down one path lies the well-trodden road of relational purity: structured, secure, and governed by the strict laws of normalization and foreign keys. Down the other is a newer, more flexible path paved with jsonb, offering a world of schema-on-read freedom and tantalizing simplicity.

Which path should you take? The answer, like most things in technology, is “it depends.” But exploring a real-world scenario can illuminate the signposts, helping us choose not just the easier path, but the wiser one.

Let’s walk through a common design problem: building a “study journey” for a learning platform.

The Scenario: Linking Courses to Avoid Repetition

Imagine our platform has dozens of courses, and courses are composed of individual lessons.

  • Course: “Beginner SQL”
  • Lessons: ‘SELECT Statements’, ‘JOIN Clauses’
  • Course: “Advanced SQL”
  • Lessons: ‘Window Functions’, ‘Advanced Aggregation’

The business has a key requirement: when a student signs up for an advanced course, the platform must know all the lessons from its prerequisites. This is crucial to prevent them from re-enrolling in lessons they’ve already mastered. “Advanced SQL,” for instance, requires “Beginner SQL.”

How do we model this requires relationship in our database? This is where we face our fork in the road.

Path 1: The jsonb Path to Flexibility

In a modern PostgreSQL world, our first instinct might be to reach for jsonb. It feels clean and self-contained. Why create another table when we can just embed the relationship data directly into the course itself?

Let’s add a jsonb column called journey_details to our courses table.

Table Structure:

1
2
3
4
5
6
7
8
9
10
11
CREATE TABLE courses (
    course_id serial PRIMARY KEY,
    course_title text NOT NULL,
    journey_details jsonb
);

-- Sample Data
INSERT INTO courses (course_id, course_title, journey_details) VALUES
(101, 'Beginner SQL', NULL),
(102, 'Advanced SQL', '{"prerequisites": [101]}'),
(103, 'SQL for Data Science', '{"prerequisites": [101, 102], "recommended": [201]}');

Look at that SQL for Data Science entry. We’ve not only defined its hard prerequisites but also added a “recommended” course—a completely new type of relationship—without a single ALTER TABLE statement. The flexibility is undeniable.

The Query: Finding Prerequisite Lessons

To solve our business problem, we can “un-nest” the prerequisite IDs from the JSON array and use them to find the lessons.

1
2
3
4
5
6
7
8
9
10
11
12
SELECT DISTINCT lesson.lesson_title
FROM lessons
JOIN course_lessons ON lesson.lesson_id = course_lessons.lesson_id
WHERE course_lessons.course_id IN (
    -- Dynamically extract prerequisite IDs from the JSON
    SELECT
        jsonb_array_elements_text(journey_details -> 'prerequisites')::int
    FROM
        courses
    WHERE
        course_id = 103
);

This works perfectly. But is it robust? What happens when data gets messy?

Path 2: The Traditional Path of Integrity

Let’s rewind and take the other path. A seasoned database administrator would likely advise creating a dedicated linking table. This is the textbook, third-normal-form approach.

Table Structure:

1
2
3
4
5
6
7
8
9
10
11
CREATE TABLE course_prerequisites (
    course_id int REFERENCES courses(course_id) ON DELETE CASCADE,
    prerequisite_id int REFERENCES courses(course_id) ON DELETE CASCADE,
    PRIMARY KEY (course_id, prerequisite_id)
);

-- Sample Data
INSERT INTO course_prerequisites (course_id, prerequisite_id) VALUES
(102, 101),
(103, 101),
(103, 102);

The structure is more verbose, but it comes with a superpower: FOREIGN KEY constraints. It is now impossible to create a prerequisite link to a course that doesn’t exist. The database itself guarantees the integrity of our study journey.

The Query: Finding Prerequisite Lessons

The query to get the same result involves a classic, highly optimized JOIN.

1
2
3
4
5
SELECT DISTINCT lesson.lesson_title
FROM lessons
JOIN course_lessons ON lesson.lesson_id = course_lessons.lesson_id
JOIN course_prerequisites cp ON course_lessons.course_id = cp.prerequisite_id
WHERE cp.course_id = 103;

This query is declarative and, for many developers, instantly readable. It describes the relationships we want to traverse, leaving the execution details to the query planner.

Head-to-Head: A Battle of Philosophies

So, which design is “better”? Let’s frame it not as a technical benchmark, but as a clash of design philosophies.

Considerationjsonb Approach (Flexibility First)Traditional Linking Table (Integrity First)The Deeper Question
Data IntegrityIt’s a free-for-all. You could insert a prerequisite ID of 9999 that points to nothing. The burden of validation is entirely on your application code.Rock-solid. The database enforces the rules. Bad data is rejected at the gate. ON DELETE CASCADE handles cleanup automatically.Who should be the guardian of your data’s truth: your application or your database?
FlexibilitySupreme. Want to add a “replaces_course” attribute? Just add the key to your JSON. No schema changes, no deployments needed.Rigid. Adding a new relationship type requires a schema migration (ALTER TABLE), a more deliberate and involved process.Is ultimate flexibility always a good thing, or does it invite chaos? When do we need guardrails?
MaintainabilityDeceptive. What happens when you delete the “Beginner SQL” course? You must remember to write code that hunts down and removes every instance of 101 from every jsonb column.Simple. Delete the “Beginner SQL” course, and the ON DELETE CASCADE rule automatically cleans up the course_prerequisites table. It just works.Where do you want to invest your debugging time: in complex application logic or in well-defined database constraints?
PerformanceExcellent. With a GIN index, PostgreSQL can query inside your jsonb objects at incredible speed. Don’t let anyone tell you jsonb is slow.Excellent. JOINs on indexed foreign keys are what relational databases were born to do. This is their home turf.Since performance is often comparable, what other factors should drive our decision?

Finding the Middle Ground: The Hybrid Approach

Perhaps the question isn’t “either/or.” What if we combine these paths?

  • Use a traditional linking table (course_prerequisites) for your strict, business-critical rules. This is your source of truth.
  • Use a jsonb column (course_metadata) on the courses table for everything else: soft recommendations, lists of suggested reading, instructor notes, or versioning information.

This hybrid model gives you the best of both worlds: the unyielding data integrity of the relational model for what truly matters, and the schema-less flexibility of jsonb for the descriptive, less-structured data that surrounds it.

The Journey’s End

There is no single “right” answer in the jsonb vs. tables debate. By choosing the jsonb path, you are voting for speed of development and structural flexibility, accepting the responsibility of maintaining data integrity at the application level. By choosing the traditional path, you are voting for data integrity and long-term maintainability, enforced at the database level.

The next time you stand at a design crossroads, pause and consider the nature of your data. Is it a strict rule or a loose suggestion? Is its structure stable or bound to evolve?

For your next project, which path will you choose, and why?

This post is licensed under CC BY 4.0 by the author.