Post

Data Cleanup Between Tests

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
layout: post
title:  "The Ghost in the Test Suite: A Detective Story on Rails Cleanup"
date:   2025-09-16
categories: Rails
---

It started with a whisper of a bug: a **flaky test**. Not a test that failed all the time, but one that only failed when run alongside a specific Rake task spec, or worse, only when the test runner was feeling especially moody. As developers, we love deterministic systems, and "moody" tests are our kryptonite.

Our main suspect was a brand new Rake task: `users:report`.

```ruby
# spec/tasks/users_report_spec.rb
RSpec.describe 'users:report', type: :task do
  # This setup creates users the Rake task should process
  before { create_list(:user, 5, is_active: true) }

  it 'processes only active users' do
    # ... logic to run the Rake task ...
    expect(User.processed.count).to eq(5)
  end

  # Another test in the same file
  it 'sends out the report email' do
    # ... logic to run the Rake task ...
    expect(ReportMailer).to have_received(:send_report)
  end
end

The specs passed beautifully when run in isolation. But when we ran the whole suite? Chaos. Other, unrelated user specs would fail, claiming the database already had 5 extra users that shouldn’t be there. The data was leaking.

It was time to investigate the core problem of testing in Rails: Data Isolation.

Chapter 1: The Fast & Flaky Suspect—The Transaction

In a standard Rails/RSpec setup, we rely on the fastest, most convenient cleanup strategy: Transactions.

The configuration, usually found in spec/rails_helper.rb, looks like this:

1
2
3
4
RSpec.configure do |config|
  # The default configuration and our primary cleanup tool
  config.use_transactional_fixtures = true
end

How Transactions Work

Every single it or example block is wrapped in a database BEGIN TRANSACTION.

  1. Start: BEGIN TRANSACTION
  2. Test Runs: Data is created (e.g., User.create!).
  3. End: ROLLBACK.

Since the database changes were never committed, they are simply undone, leaving the database in its pristine pre-test state. It’s blindingly fast because no data is ever written to disk.

Why it Failed the Rake Task

The flaw in the plan? The Rake task runs outside the main test process/connection.

Our Rake task’s code would run, perform its database operations, and—crucially—commit them to the database. When the RSpec runner finished its it block, it would perform its ROLLBACK, but the uncommitted data it created was only part of the story. The Rake task’s committed changes remained, waiting like ghosts to haunt the next test.

Diagnosis: The default Transaction strategy is fast, but it only works if your application code runs in the same transactional context. Rake tasks, background jobs, and feature specs with JavaScript drivers (Capybara/Selenium) always break this rule.


Chapter 2: The Aggressive Solution—DatabaseCleaner

To handle these “external process” tests, we need a tool that can enforce isolation by performing a permanent, commit-based cleanup. Enter DatabaseCleaner .

DatabaseCleaner isn’t a strategy; it’s a strategy manager that allows us to switch between fast (transactional) and thorough (physical) cleanup methods as needed.

The “Sledgehammer” Strategy: Truncation

The most effective strategy for our flaky Rake task is Truncation. This command wipes the slate clean, literally.

StrategySQL CommandSpeedPrimary Use Case
TransactionBEGIN/ROLLBACKFastest 🚀Default for unit/request specs.
DeletionDELETE FROM tableSlowest 🐌Rarely used; respects auto-increment IDs.
TruncationTRUNCATE TABLE tableSlow 🐢External Processes (Rake, JS Specs). Fast wipe and resets auto-increment IDs.

The Fix: Configuring Strategy Switching

To keep our suite fast while fixing the Rake task, we must configure DatabaseCleaner to use the fast Transaction strategy by default, and only switch to the slow Truncation for the Rake task specs.

  1. Install the gem: Add gem 'database_cleaner-active_record' to your Gemfile.
  2. Global Setup: In your spec/rails_helper.rb, disable RSpec’s transactions and set up DatabaseCleaner to handle everything.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
# In spec/rails_helper.rb or spec/support/database_cleaner.rb

RSpec.configure do |config|
  # 1. Disable RSpec's transaction wrapper, giving control to DatabaseCleaner
  config.use_transactional_fixtures = false 

  config.before(:suite) do
    # 2. Aggressively clean the DB once before the entire suite starts (good practice)
    DatabaseCleaner.clean_with(:truncation)
  end

  config.around(:each) do |example|
    # 3. Default to the FASTEST strategy (Transaction)
    DatabaseCleaner.strategy = :transaction

    # 4. 🚨 The Fix: Switch to the SLOW, THOROUGH strategy for Rake/JS specs
    if example.metadata[:rake] || example.metadata[:js]
      DatabaseCleaner.strategy = :truncation
    end

    # 5. Run the test inside the chosen cleanup strategy
    DatabaseCleaner.cleaning do
      example.run
    end
  end
end
  1. Tag the Flaky Test: Finally, we tag the Rake task spec so the conditional check in the setup runs.
1
2
3
4
# spec/tasks/users_report_spec.rb
RSpec.describe 'users:report', type: :task, **rake: true** do # ⬅️ ADD TAG
  # ... no more flaky data leaks!
end

By switching to Truncation specifically for the Rake task specs, we forced a full, committed database wipe after those tests run, eliminating the data leak and restoring order to the test suite.


Chapter 3: The Forgotten Cleanup—before(:all)

Our final piece of the cleanup puzzle involves one last edge case that can still cause leaks even with DatabaseCleaner: data created in a before(:all) hook.

Why before(:all) Leaks

A before(:all) (or before(:context)) hook runs once before any test examples in a group. Since it runs outside of any specific around(:each) block, the data it creates is not managed by the per-example transaction or truncation cleanup.

The Fix: Manual after(:all) Hook

Data created this way requires a corresponding manual cleanup hook.

Hook Used to Create DataCleanup Method Needed
before(:each) or letTransaction Rollback (automatic)
before(:all) or before(:context)Manual after(:all)/after(:context)

If you absolutely must use before(:all) (e.g., for speed with read-only data), you must pair it with a manual delete.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# The correct, manual way to cleanup before(:all) data
RSpec.describe User do
  # ⚠️ Data created here is OUTSIDE the per-test transaction!
  before(:all) do 
    @admin = create(:user, role: :admin) 
  end

  it 'does something' do
    # ... test logic ...
  end

  # ✅ MUST manually cleanup the committed data!
  after(:all) do 
    @admin.destroy 
  end
end

Conclusion: A Clean Database is a Happy Developer

Data isolation is non-negotiable for a reliable test suite. By understanding the mechanisms behind transactional fixtures and strategically applying the more aggressive Truncation strategy via DatabaseCleaner for tests that break the transaction rule (Rake tasks, JS-enabled feature specs), you can eliminate flaky tests and restore faith in your code. Default to Transaction, Fallback to Truncation.

This post is licensed under CC BY 4.0 by the author.