
CREATE TABLE event_logs (
  id BIGINT PRIMARY KEY,
  state VARCHAR,
  last_requeued_at TIMESTAMP,
  params JSON
);
A cron job runs every few minutes to pick up pending event logs and queue them for processing in batches:
SELECT "event_logs".*
FROM "event_logs"
WHERE "event_logs"."state" = 'initial'
AND "event_logs"."last_requeued_at" <= '2025-12-15 10:34:32.473436'
ORDER BY "event_logs"."id" ASC
LIMIT 1000;
SELECT "event_logs".*
FROM "event_logs"
WHERE "event_logs"."state" = 'initial'
AND "event_logs"."last_requeued_at" <= '2025-12-15 10:34:32.473436'
AND "event_logs"."id" > 1000
ORDER BY "event_logs"."id" ASC
LIMIT 1000;
CREATE INDEX index_event_logs_on_initial_id_last_requeued_at
ON event_logs USING btree (id, last_requeued_at)
WHERE state = 'initial';
In production, we were processing around 900 writes/second on the event_logs table.
The table had around 1 billion rows.
Initially, query latency was around 3 milliseconds—excellent performance.
However, after a few hours, the batching query started experiencing high latency. Running EXPLAIN ANALYZE showed the query was taking several seconds, even with the index in place.
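A sketch of that check against the batching query (the BUFFERS option is an optional addition that also reports how many pages the scan touches):
EXPLAIN (ANALYZE, BUFFERS)
SELECT "event_logs".*
FROM "event_logs"
WHERE "event_logs"."state" = 'initial'
AND "event_logs"."last_requeued_at" <= '2025-12-15 10:34:32.473436'
ORDER BY "event_logs"."id" ASC
LIMIT 1000;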
Our first instinct was to suspect the index. We ran REINDEX, which rebuilt the index from scratch. Immediately after, the query was blazing fast again. But within hours, the problem returned.
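A rebuild along these lines recreates the partial index from scratch; the CONCURRENTLY option (available since PostgreSQL 12) is an optional variant that avoids blocking writes while it runs:
REINDEX INDEX CONCURRENTLY index_event_logs_on_initial_id_last_requeued_at;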
Something deeper was going on.
PostgreSQL uses MVCC to handle concurrent transactions. Instead of modifying rows in place, PostgreSQL keeps multiple versions of the same row: each UPDATE writes a new version and leaves the old one behind as a dead tuple until VACUUM cleans it up.
Let’s trace what happens to our event logs and how this affects our partial index as they move through states:
INSERT INTO event_logs (id, state, last_requeued_at, params)
VALUES (1, 'initial', '2024-01-01 10:00:00', '{"key": "value"}');
UPDATE event_logs SET state='processing' WHERE id=1;
Hence, each event log that transitions from initial to processing state creates a dead entry in the partial index.
UPDATE event_logs SET state='processed' WHERE id=1;
Result: 3 physical row copies on disk, with accumulated dead entries in the partial index.
(The update from processing to processed doesn't add an entry to the partial index, since the index only covers rows in the 'initial' state.)
When VACUUM runs, it processes both the table and its indexes, physically removing dead entries and marking space as reusable.
Important: Regular VACUUM doesn’t compact indexes or shrink files—it only marks space as reusable. For true compaction, you need VACUUM FULL, REINDEX, or pg_repack.
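To quantify the bloat rather than guess at it, the pgstattuple extension can report dead-tuple and leaf-density figures. A minimal sketch, keeping in mind that these functions scan the whole relation and are therefore slow on a billion-row table (pgstattuple_approx is a cheaper alternative for the table itself):
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Dead tuples still carried by the table
SELECT dead_tuple_count, dead_tuple_percent
FROM pgstattuple('event_logs');

-- How densely packed the partial index's leaf pages are
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('index_event_logs_on_initial_id_last_requeued_at');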
Autovacuum is PostgreSQL's automatic background process that runs VACUUM (and ANALYZE) periodically. It's enabled by default, so why did we still have problems?
PostgreSQL’s default autovacuum settings aren’t aggressive enough for high-write tables, especially large ones.
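You can check the values your cluster is running with; out of the box, autovacuum_vacuum_threshold is 50 and autovacuum_vacuum_scale_factor is 0.2:
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_vacuum_threshold', 'autovacuum_vacuum_scale_factor');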
Autovacuum triggers when:
dead_tuples > autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor × total_rows)
For our event_logs table with 1 billion rows, autovacuum would only trigger after:
50 + (0.2 × 1,000,000,000) = ~200 million dead tuples
Of the ~900 writes/second, roughly 600 were updates (two per event log), and each update leaves behind one dead tuple:
600 dead tuples/sec × 86,400 seconds/day = ~51.8 million dead tuples/day
It would take nearly 4 days to hit the autovacuum threshold, during which our index would accumulate millions of dead entries, causing severe query degradation.
We made autovacuum much more aggressive for this specific table:
ALTER TABLE test_schema.event_logs SET (
  autovacuum_vacuum_scale_factor = 0.005,
  autovacuum_analyze_scale_factor = 0.01  -- Keeps table stats fresh
);
This sets the vacuum scale factor to 0.5% instead of the default 20%.
New trigger threshold for 1 billion rows:
50 + (0.005 × 1,000,000,000) = ~5 million dead tuples
This means autovacuum now runs approximately every 2.3 hours instead of every 4 days, keeping dead tuple accumulation under control and maintaining fast query performance.
To make each autovacuum run finish faster, you can also set autovacuum_vacuum_cost_delay to zero, which disables the cost-based throttling that forces vacuum to pause periodically.
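That can be done with the same per-table storage parameter mechanism. A sketch (we're not prescribing this value for every workload, since removing the throttle trades more I/O during vacuum for faster completion), followed by a quick way to confirm which per-table overrides are in effect:
ALTER TABLE test_schema.event_logs SET (
  autovacuum_vacuum_cost_delay = 0  -- Disable cost-based throttling for this table
);

-- Confirm the per-table overrides currently applied
SELECT reloptions FROM pg_class WHERE relname = 'event_logs';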
You can check dead tuple counts and autovacuum statistics with this query:
SELECT
  schemaname,
  relname AS table_name,
  n_live_tup,
  n_dead_tup,
  last_autovacuum,
  last_vacuum,
  autovacuum_count,
  vacuum_count
FROM pg_stat_user_tables
WHERE relname = 'event_logs'
AND schemaname = 'your_schema';