Series: Partitioning in the Wild | Post 1 of 10
Platform: PostgreSQL 18.3 (Percona Distribution) | Tested on Docker
Introduction:
- Partitioning is one of those topics every DBA claims to understand until something breaks at 3 am and the on-call phone starts ringing.
- In this post, we will run three different PostgreSQL partitioning approaches side by side with live IoT sensor data flowing at 10 rows per second. We will watch what happens at the exact moment the clock crosses an hour boundary. One of the three approaches will fail.
- This is Post 1 of a 10 part series covering partitioning across PostgreSQL, MySQL, MariaDB, Oracle and MongoDB.
- By the end of this post we will understand why constraint-based partitioning is a liability in any system where data arrives continuously and why native declarative partitioning is only half a solution.
- We will also see what pg_partman adds that makes the difference between a table that survives production and one that does not.
Note:
- All commands in this article are tested on PostgreSQL 18.3 running in Docker on a Windows laptop.
- It is always best to test in a non-production environment before applying any partitioning changes to a live system.
The Problem We Are Solving:
- When a table grows large, queries slow down, vacuums take longer and backups become painful. Partitioning splits the data into smaller physical chunks while keeping one logical table our application queries normally.
- PostgreSQL gives you three ways to do this:
- Constraint-based (legacy trigger based routing)
- Native declarative partitioning (PG v10+)
- pg_partman (automated lifecycle on top of native)
- The catch: partitions do not create themselves. Native declarative throws an error if no matching partition exists. Constraint-based raises a trigger exception. Only pg_partman creates the next partition in advance so data always has somewhere to go. We are going to prove all three behaviours with real data in real time.
The Setup:
Three tables, identical schema, one Python script feeding all three simultaneously.
Lab Environment:
| Component | Version / Detail |
|---|---|
| PostgreSQL | 18.3 (Percona Distribution) |
| pg_partman | 5.4.3 |
| Docker | Desktop (Windows) |
| Python | 3.x with psycopg2-binary |
| Sensors | 10 virtual IoT sensors |
| Data rate | 10 rows/sec per approach |
Sensor Locations:
| Sensor ID | Location |
|---|---|
| 101 | Server Room A |
| 102 | Server Room B |
| 103 | Data Hall North |
| 104 | Data Hall South |
| 105 | UPS Room |
| 106 | Cooling Unit 1 |
| 107 | Cooling Unit 2 |
| 108 | Office Floor 1 |
| 109 | Office Floor 2 |
| 110 | Rooftop Station |
Each sensor pushes temperature, humidity and pressure readings into all three partition approaches at the same time. Same data, same timestamps, no advantage to any approach.
Approach A: Native Declarative Partitioning
PostgreSQL native declarative partitioning defines the parent table structure and lets you attach child tables for specific value ranges. The planner understands the partition boundaries and can skip irrelevant partitions at query time. This is called partition pruning.
The problem: partitions must exist before data arrives. We need to create them manually.
CREATE TABLE lab.sensor_native ( reading_id BIGSERIAL, sensor_id INT NOT NULL, location TEXT NOT NULL, temperature NUMERIC(5,2) NOT NULL, humidity NUMERIC(5,2) NOT NULL, pressure NUMERIC(7,2) NOT NULL, recorded_at TIMESTAMPTZ NOT NULL DEFAULT now() ) PARTITION BY RANGE (recorded_at); -- Create partitions manually -- current hour and next hour only CREATE TABLE lab.sensor_native_p_current PARTITION OF lab.sensor_native FOR VALUES FROM (date_trunc('hour', now())) TO (date_trunc('hour', now()) + interval '1 hour'); CREATE TABLE lab.sensor_native_p_next PARTITION OF lab.sensor_native FOR VALUES FROM (date_trunc('hour', now()) + interval '1 hour') TO (date_trunc('hour', now()) + interval '2 hours'); CREATE INDEX ON lab.sensor_native_p_current (sensor_id, recorded_at); CREATE INDEX ON lab.sensor_native_p_next (sensor_id, recorded_at);
We created two partitions. Current hour and next hour. That is two hours of runway before this table starts rejecting inserts.
Approach B: Constraint-Based Partitioning
This is the pre-PG10 pattern. It uses PostgreSQL table inheritance with CHECK constraints on each child table and a trigger function that inspects every incoming row and routes it to the correct child table.
CREATE TABLE lab.sensor_constraint ( reading_id BIGSERIAL, sensor_id INT NOT NULL, location TEXT NOT NULL, temperature NUMERIC(5,2) NOT NULL, humidity NUMERIC(5,2) NOT NULL, pressure NUMERIC(7,2) NOT NULL, recorded_at TIMESTAMPTZ NOT NULL DEFAULT now() ); -- Child table with CHECK constraint locking it to the current hour CREATE TABLE lab.sensor_constraint_p_current ( CHECK (recorded_at >= date_trunc('hour', now()) AND recorded_at < date_trunc('hour', now()) + interval '1 hour') ) INHERITS (lab.sensor_constraint); CREATE TABLE lab.sensor_constraint_p_next ( CHECK (recorded_at >= date_trunc('hour', now()) + interval '1 hour' AND recorded_at < date_trunc('hour', now()) + interval '2 hours') ) INHERITS (lab.sensor_constraint);
The trigger function inspects the row timestamp and decides which child table to insert into:
CREATE OR REPLACE FUNCTION lab.sensor_constraint_insert() RETURNS TRIGGER LANGUAGE plpgsql AS $$ DECLARE v_hour TIMESTAMPTZ := date_trunc('hour', NEW.recorded_at); BEGIN IF NEW.recorded_at >= v_hour AND NEW.recorded_at < v_hour + interval '1 hour' THEN INSERT INTO lab.sensor_constraint_p_current VALUES (NEW.*); ELSIF NEW.recorded_at >= v_hour + interval '1 hour' AND NEW.recorded_at < v_hour + interval '2 hours' THEN INSERT INTO lab.sensor_constraint_p_next VALUES (NEW.*); ELSE RAISE EXCEPTION 'No partition for recorded_at = %', NEW.recorded_at; END IF; RETURN NULL; END; $$; CREATE TRIGGER sensor_constraint_insert_trigger BEFORE INSERT ON lab.sensor_constraint FOR EACH ROW EXECUTE FUNCTION lab.sensor_constraint_insert();
Look at the CHECK constraint carefully. date_trunc('hour', now()) is evaluated at CREATE TABLE time and stored as a fixed boundary. The trigger function recalculates it at INSERT time. When the clock moves to the next hour, those two evaluations point to different hours. That mismatch is the bug.
Approach C: pg_partman
pg_partman manages the partition lifecycle automatically. We call one function, tell it the table, the control column, the interval, and how many future partitions to keep ready. A background worker handles the rest while our application keeps running.
CREATE TABLE lab.sensor_partman ( reading_id BIGSERIAL, sensor_id INT NOT NULL, location TEXT NOT NULL, temperature NUMERIC(5,2) NOT NULL, humidity NUMERIC(5,2) NOT NULL, pressure NUMERIC(7,2) NOT NULL, recorded_at TIMESTAMPTZ NOT NULL DEFAULT now() ) PARTITION BY RANGE (recorded_at); SELECT partman.create_parent( p_parent_table => 'lab.sensor_partman', p_control => 'recorded_at', p_interval => '1 hour', p_premake => 3 );
After that single function call, check what was created:
SELECT child.relname AS partition, pg_size_pretty(pg_relation_size(child.oid)) AS size FROM pg_inherits JOIN pg_class parent ON parent.oid = pg_inherits.inhparent JOIN pg_class child ON child.oid = pg_inherits.inhrelid WHERE parent.relname = 'sensor_partman' ORDER BY child.relname;
sensor_partman_p20260522_000000 0 bytes sensor_partman_p20260522_010000 0 bytes sensor_partman_p20260522_020000 0 bytes sensor_partman_p20260522_030000 0 bytes <-- current at lab time sensor_partman_p20260522_040000 0 bytes sensor_partman_p20260522_050000 0 bytes sensor_partman_p20260522_060000 0 bytes sensor_partman_default 0 bytes <-- safety net
Eight partitions. Zero manual CREATE TABLE statements. Three hours ahead at all times. One default partition that catches anything that does not fit anywhere else.
Starting the Data Flow
[ ADD SCREENSHOT: Python terminal showing live counter -- Sent: 1850 rows | Rate: 9.6/s | Native: 1850 | Constraint: 1850 | Partman: 1850 | Errors: 0 ]
The Python script ran for a few minutes without incident. All three counters moved in sync. Row counts were identical across all three approaches. Then the clock hit 04:00 UTC.
The Moment It Broke
At 04:00:00.073 UTC, this appeared in the terminal:
[ERROR] new row for relation "sensor_constraint_p_current" violates
check constraint "sensor_constraint_p_current_recorded_at_check"
DETAIL: Failing row contains (2157, 108, Office Floor 1, 39.21, 29.46,
1014.42, 2026-05-22 04:00:00.07341+00).
CONTEXT: SQL statement "INSERT INTO lab.sensor_constraint_p_current VALUES (NEW.*)"
PL/pgSQL function lab.sensor_constraint_insert() line 7 at SQL statement
[ ADD SCREENSHOT: Your actual Python terminal showing this error ]
Here is what happened. The trigger function recalculated v_hour as 04:00 and tried to insert into p_current. But p_current was created when now() was 03:xx, so its CHECK constraint was frozen at the 03:00 to 04:00 range. The insert violated that constraint. The trigger had no fallback. It just failed.
Native declarative and pg_partman kept going without interruption.
Checking the Damage
SELECT * FROM lab.v_row_counts; approach | total_rows -------------------+----------- sensor_native | 4120 sensor_constraint | 4120 sensor_partman | 4120
All three show 4120 rows. The constraint-based approach recovered on the next retry because the trigger eventually routed correctly to p_next. But this was not graceful recovery. It was luck. If the data had arrived two or more hours ahead, the trigger would have hit the ELSE branch and raised an exception with no path forward.
Now look at exactly which partition received the data at the boundary:
SELECT tableoid::regclass AS partition, count(*) AS rows, min(recorded_at) AS earliest, max(recorded_at) AS latest FROM lab.sensor_partman GROUP BY tableoid::regclass ORDER BY latest DESC; partition | rows | earliest | latest -------------------------------------+------+--------------------------------+------------------------------- lab.sensor_partman_p20260522_040000 | 1970 | 2026-05-22 04:00:00.18158+00 | 2026-05-22 04:03:24.978939+00 lab.sensor_partman_p20260522_030000 | 2150 | 2026-05-22 03:56:16.087545+00 | 2026-05-22 03:59:59.344688+00
Partman switched partitions at exactly 04:00:00. No gaps. No errors. The pre-created p20260522_040000 was already there waiting.
The EXPLAIN Plans: Where the Real Difference Shows
Same query on all three tables. Count rows from the current hour.
Native Declarative:
Aggregate
-> Append
Subplans Removed: 1
-> Bitmap Index Scan on sensor_native_p_next_sensor_id_recorded_at_idx
One partition scanned. One eliminated at plan time.
Constraint-Based:
Aggregate -> Append -> Seq Scan on sensor_constraint -> Index Only Scan on sensor_constraint_p_current -> Bitmap Heap Scan on sensor_constraint_p_next
Three scans. Parent table and both child tables. No pruning at all. The planner cannot eliminate child tables when the WHERE clause uses now() because it cannot evaluate runtime functions at plan time. Scale this to a year of hourly data (8760 partitions) and this query touches every single one on every execution.
pg_partman:
Aggregate -> Append Subplans Removed: 7 -> Seq Scan on sensor_partman_p20260522_040000
Seven of eight partitions removed at plan time. One scanned.
[ ADD SCREENSHOT: Side-by-side EXPLAIN outputs from psql. Highlight "Subplans Removed: 7" ]
Force the Failure Without Waiting
We do not need to wait for a real hour boundary to see the failures. Insert a row with a timestamp that has no matching partition:
-- Native declarative fails immediately INSERT INTO lab.sensor_native (sensor_id, location, temperature, humidity, pressure, recorded_at) VALUES (101, 'Server Room A', 25.5, 60.0, 1013.0, date_trunc('hour', now()) + interval '5 hours'); ERROR: no partition of relation "sensor_native" found for row DETAIL: Partition key of the failing row contains (recorded_at) = (2026-05-22 09:00:00+00).
-- Constraint-based also fails INSERT INTO lab.sensor_constraint (sensor_id, location, temperature, humidity, pressure, recorded_at) VALUES (101, 'Server Room A', 25.5, 60.0, 1013.0, date_trunc('hour', now()) + interval '5 hours'); ERROR: No partition for recorded_at = 2026-05-22 09:00:00+00 CONTEXT: PL/pgSQL function lab.sensor_constraint_insert()
-- pg_partman handles it silently via the default partition INSERT INTO lab.sensor_partman (sensor_id, location, temperature, humidity, pressure, recorded_at) VALUES (101, 'Server Room A', 25.5, 60.0, 1013.0, date_trunc('hour', now()) + interval '5 hours'); INSERT 0 1
Let us check where it landed:
SELECT tableoid::regclass AS partition, recorded_at FROM lab.sensor_partman ORDER BY recorded_at DESC LIMIT 3; partition | recorded_at ---------------------------+----------------------------- lab.sensor_partman_default | 2026-05-22 09:00:00+00
The default partition caught it. When partman.run_maintenance() runs next, it creates the future partition and the row moves there.
Comparison Summary
| Native Declarative | Constraint-Based | pg_partman | |
|---|---|---|---|
| Partition creation | Manual | Manual | Automatic |
| Hour boundary handling | Works if p_next exists | Fails then recovers (luck) | Seamless |
| Query pruning | Yes | No | Yes |
| Missing partition | Hard error | Trigger exception | Default partition |
| Trigger overhead | None | One call per insert | None |
| Ongoing maintenance | Manual | Manual | Automatic via BGW |
Key Takeaways
- Constraint-based partitioning is a pre-PG10 pattern. If you are on PG10 or later there is no valid reason to use it. The trigger overhead on every insert, the absence of partition pruning, and the boundary failure behaviour all make it the wrong choice for continuous data workloads.
- Native declarative partitioning works correctly and prunes efficiently but puts the maintenance responsibility entirely on you. Two manually created partitions means two hours of runway. In a system that runs 24 hours a day you need an automated process creating partitions ahead of time or you are one missed cron job away from an outage.
- pg_partman is that automated process. It sits on top of native declarative and adds the lifecycle management that PostgreSQL does not provide natively. The
premakeparameter keeps future partitions always ready. The default partition catches anything unexpected. The background worker runs maintenance without manual intervention.
Note:
- All error outputs shown in this post are real and captured during the live lab session, not constructed for illustration.
- The Python sensor generator script pushes data into all three tables simultaneously so the comparison is fair. Same data, same timestamps, same load.
- Lab files and scripts are available here: [Add GitHub link]
What is Next
Native declarative survived because p_next was pre-created and ready. What happens at the next boundary when p_next becomes the current partition and there is nothing beyond it? And how does pg_partman's background worker decide when to create new partitions and how far ahead to look? That is Post 2.
Series: Partitioning in the Wild
- Post 1: Your partitioned table will fail at midnight (this post)
- Post 2: Pre-creating partitions manually is how incidents start at 3am (coming soon)
- Post 3: Moving 50 million rows into a partitioned table without downtime (coming soon)
- Post 4: Partition pruning: the feature that only works if you write queries right (coming soon)
- Post 5: What breaks in your partitioned tables when you upgrade to PG18 (coming soon)

No comments:
Post a Comment