Why Your Fast PostgreSQL Query Suddenly Became Slow: A Deep Dive into MVCC and Index Bloat

Prabhu Srinivasan

Software Architect | Dec 19, 2025

Ever noticed a query using an index correctly that’s blazing fast, then suddenly slows down – especially on tables with high write volume? We recently experienced this exact issue in production. Here’s what we learned.

Our Setup

We have an event_logs table with the following structure:

CREATE TABLE event_logs (
  id BIGINT PRIMARY KEY,
  state VARCHAR,
  last_requeued_at TIMESTAMP,
  params JSON
);

A cron job runs every few minutes to pick up pending event logs and queue them for processing in batches:

Batch 1:

SELECT "event_logs".*
FROM "event_logs"
WHERE "event_logs"."state" = 'initial'
  AND "event_logs"."last_requeued_at" <= '2025-12-15 10:34:32.473436'
ORDER BY "event_logs"."id" ASC
LIMIT 1000;

Batch 2:

SELECT "event_logs".*
FROM "event_logs"
WHERE "event_logs"."state" = 'initial'
  AND "event_logs"."last_requeued_at" <= '2025-12-15 10:34:32.473436'
  AND "event_logs"."id" > 1000
ORDER BY "event_logs"."id" ASC
LIMIT 1000;

To optimize this query, we created a partial index that only covers records in the 'initial' state:

CREATE INDEX index_event_logs_on_initial_id_last_requeued_at
ON event_logs USING btree (id, last_requeued_at)
WHERE state = 'initial';

The Problem

In production, we were processing around 900 writes/second on the event_logs table:

  • 300 new inserts/second (state = ‘initial’)
  • 300 updates/second (state from initial → processing)
  • 300 updates/second (state from processing → processed)

The table had around 1 billion rows.

Initially, query latency was around 3 milliseconds—excellent performance.

However, after a few hours, the batching query started experiencing high latency. Running EXPLAIN ANALYZE showed the query was taking several seconds, even with the index in place.
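
A command along these lines reproduces the check (ANALYZE executes the query and reports actual timings; BUFFERS is an optional extra we like because it adds page-access counts):

EXPLAIN (ANALYZE, BUFFERS)
SELECT "event_logs".*
FROM "event_logs"
WHERE "event_logs"."state" = 'initial'
  AND "event_logs"."last_requeued_at" <= '2025-12-15 10:34:32.473436'
ORDER BY "event_logs"."id" ASC
LIMIT 1000;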

Our first instinct was to suspect the index. We ran REINDEX, which rebuilt the index from scratch. Immediately after, the query was blazing fast again. But within hours, the problem returned.
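
For reference, rebuilding the partial index looks like this; the CONCURRENTLY option is our assumption (it avoids blocking writes but needs PostgreSQL 12 or newer), and a plain REINDEX INDEX works too at the cost of a lock:

REINDEX INDEX CONCURRENTLY index_event_logs_on_initial_id_last_requeued_at;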

Something deeper was going on.

Understanding PostgreSQL's MVCC (Multi-Version Concurrency Control)

PostgreSQL uses MVCC to handle concurrent transactions. Instead of modifying rows in place, PostgreSQL keeps multiple versions of the same row:

  • When a row is updated, PostgreSQL marks the old version as dead and creates a new version
  • When a row is deleted, PostgreSQL marks it as dead but doesn’t physically remove it
  • The original row remains on disk until cleanup occurs (we’ll cover the cleanup later in this post)
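
You can peek at this version bookkeeping through PostgreSQL’s hidden system columns. A small sketch (ctid is the tuple’s physical location, xmin the transaction that created the version, xmax the one that deleted or superseded it; a plain SELECT only returns the currently visible version):

SELECT ctid, xmin, xmax, id, state
FROM event_logs
WHERE id = 1;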

Let’s trace what happens to our event logs as they move through states, and how each transition affects our partial index:

Initial Insert

INSERT INTO event_logs (id, state, last_requeued_at, params)
VALUES (1, 'initial', '2024-01-01 10:00:00', '{"key": "value"}');

Physical storage:
  • Row version 1: id=1, state='initial', last_requeued_at='2024-01-01 10:00:00'
  • Partial index: Entry for (id=1, last_requeued_at) → row version 1

Update 1: 'initial' → 'processing'

UPDATE event_logs SET state='processing' WHERE id=1;

Physical storage:
  • Row version 1: state='initial' (dead tuple)
  • Row version 2: state='processing' (current)
  • Partial index: The existing entry now points to a dead row version and stays in the index until VACUUM runs (we’ll see what VACUUM does shortly). No new entry is added, since state='processing' doesn’t match the index’s WHERE clause.

Hence, each event log that transitions from initial to processing state creates a dead entry in the partial index.
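
One way to watch this accumulation, assuming the pgstattuple extension from contrib is installed, is pgstatindex, which reports how densely packed the index’s leaf pages are; a steadily falling avg_leaf_density is a good hint that dead entries are piling up:

-- Requires the contrib extension: CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('index_event_logs_on_initial_id_last_requeued_at');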

Update 2: 'processing' → 'processed'

UPDATE event_logs SET state='processed' WHERE id=1;

Physical storage:
  • Row version 1: state='initial' (dead)
  • Row version 2: state='processing' (dead)
  • Row version 3: state='processed' (current)
  • Partial index: Still has 1 dead entry from the original 'initial' state

Result: 3 physical row copies on disk, with accumulated dead entries in the partial index.

(The update from 'processing' to 'processed' doesn’t touch the partial index, since the index only covers rows in the 'initial' state.)

How PostgreSQL Cleans Up Dead Tuples

VACUUM

When VACUUM runs, it processes both the table and its indexes, physically removing dead entries and marking space as reusable.

Important: Regular VACUUM doesn’t compact indexes or shrink files—it only marks space as reusable. For true compaction, you need VACUUM FULL, REINDEX, or pg_repack.
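
For a one-off cleanup you can also run VACUUM by hand; a minimal sketch (VERBOSE prints per-index removal counts, and ANALYZE refreshes planner statistics in the same pass):

VACUUM (VERBOSE, ANALYZE) event_logs;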

AUTOVACUUM

PostgreSQL’s automatic background process that runs VACUUM periodically. It’s enabled by default, so why did we still have problems?

The Root Cause: Default Autovacuum Settings

PostgreSQL’s default autovacuum settings aren’t aggressive enough for high-write tables, especially large ones.

Default settings:
  • autovacuum_vacuum_threshold = 50 rows
  • autovacuum_vacuum_scale_factor = 0.2 (20%)
  • autovacuum_vacuum_cost_limit = 200 (Number of vacuum cost units an autovacuum worker can accumulate before it must sleep)
  • autovacuum_vacuum_cost_delay = 2ms
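
You can confirm the values in effect on your cluster with pg_settings (per-table overrides live separately, in pg_class.reloptions):

SELECT name, setting, unit
FROM pg_settings
WHERE name LIKE 'autovacuum%';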

Autovacuum triggers when:

dead_tuples > autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor × total_rows)

For our event_logs table with 1 billion rows, autovacuum would only trigger after:

50 + (0.2 × 1,000,000,000) = ~200 million dead tuples

Each update leaves behind roughly one dead tuple, so our two update streams generate about 600 dead tuples/second:

600 dead tuples/sec × 86,400 seconds/day = ~51.8 million dead tuples/day

It would take nearly 4 days to hit the autovacuum threshold, during which our index would accumulate millions of dead entries, causing severe query degradation.
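
You can estimate the same trigger point for any table straight from the catalogs. A rough sketch, using the planner’s row estimate and the global settings (it ignores any per-table reloptions overrides):

SELECT
  relname,
  reltuples::bigint AS estimated_rows,
  current_setting('autovacuum_vacuum_threshold')::bigint
    + current_setting('autovacuum_vacuum_scale_factor')::numeric * reltuples::numeric
    AS dead_tuples_to_trigger_autovacuum
FROM pg_class
WHERE relname = 'event_logs';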

The Solution

We made autovacuum much more aggressive for this specific table:

ALTER TABLE test_schema.event_logs SET (
  autovacuum_vacuum_scale_factor = 0.005,
  autovacuum_analyze_scale_factor = 0.01 -- Keeps table stats fresh
);

This sets the scale factor to 0.5% instead of 20%.

New trigger threshold for 1 billion rows:

50 + (0.005 × 1,000,000,000) = ~5 million dead tuples

This means autovacuum now runs approximately every 2.3 hours instead of every 4 days, keeping dead tuple accumulation under control and maintaining fast query performance.

To make each autovacuum pass finish faster, you could also set autovacuum_vacuum_cost_delay to zero so the worker never pauses to throttle I/O.
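
On a per-table basis that could look like the following; zero delay removes the throttling sleep entirely, so keep an eye on disk load while autovacuum runs:

ALTER TABLE test_schema.event_logs SET (
  autovacuum_vacuum_cost_delay = 0
);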

Monitoring Autovacuum Activity

You can check dead tuple counts and autovacuum statistics with this query:

SELECT
  schemaname,
  relname AS table_name,
  n_live_tup,
  n_dead_tup,
  last_autovacuum,
  last_vacuum,
  autovacuum_count,
  vacuum_count
FROM pg_stat_user_tables
WHERE relname = 'event_logs'
  AND schemaname = 'your_schema';
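
While a vacuum is in flight, you can also follow it in pg_stat_progress_vacuum, which is a quick way to confirm that the more aggressive settings are actually kicking in:

SELECT pid, relid::regclass AS table_name, phase,
       heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_vacuum;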

Key Takeaways

  • MVCC creates dead tuples – Every update creates a new row version and leaves the old one as a dead tuple
  • Indexes accumulate dead entries – Including partial indexes, which must be cleaned by VACUUM
  • Dead entries slow down queries – PostgreSQL must check visibility and skip dead entries during index scans
  • Default autovacuum settings don’t scale – The 20% threshold is too high for large, high-write tables
  • Tune per-table settings – Aggressive autovacuum settings on hot tables prevent bloat-related performance issues
