I downloaded the latest check out of PostgreSQL 8.3 (not even beta yet) just to see how things are progressing. Of particular interest to me was a recent conversation about sequential scans I had with Jeff Davis. In short, in 8.3 sequential scans should be faster.
I decided to test the theory. Using the exact same machine, and an optimized 8.1.8 installation, I compared 8.1.8 versus a out of the box unconfigured (except for the tcpip port change) 8.3.
First step, create table and populate on both 8.1.8 and 8.3dev.
CREATE TABLE seqtest (test integer); INSERT INTO seqtest (test) VALUES (generate_series(1,1000000));Next step execute a guaranteed sequential scan query on 8.3dev.
EXPLAIN ANALYZE SELECT COUNT(*) FROM seqtest;
QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=16905.05..16905.06 rows=1 width=0) (actual time=4633.981..4633.984 rows=1 loops=1) -> Seq Scan on seqtest (cost=0.00..14405.24 rows=999924 width=0) (actual time=0.363..2317.909 rows=1000000 loops=1) Total runtime: 4634.037 msI ran the above EXPLAIN ANALYZE on 8.3dev five times, all with extremely similar results. In fact the variance was only about 70 milliseconds one way or the other. Then I performed the exact same query on 8.1.9, five times.
QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=17904.90..17904.91 rows=1 width=0) (actual time=4905.377..4905.379 rows=1 loops=1) -> Seq Scan on seqtest (cost=0.00..15405.12 rows=999912 width=0) (actual time=0.025..2536.639 rows=1000000 loops=1) Total runtime: 4905.515 msAgain the numbers were consistent across five runs. This is great news. O.k. 300 milliseconds isn't alot... Actually yes it is, that is almost a third of full second, now think about it over 10 million rows. I also ran the same test against SELECT sum() and got similar results. 8.3 does indeed appear to be faster than 8.1 for seq scans. Updated: 08/24/07 I now have what I suspect is the reason for the speed up:
seqtest=# select version(),pg_relation_size('seqtest'); -[ RECORD 1 ]----+------------------------------------------------------------------------------------------------ version | PostgreSQL 8.1.8 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu3) pg_relation_size | 44285952
seqtest=# select version(),pg_relation_size('seqtest'); -[ RECORD 1 ]----+--------------------------------------------------------------------------------------------------------- version | PostgreSQL 8.3devel on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu4) pg_relation_size | 36093952You will note that 8.3 is approximately 19% smaller for the same data set. In 8.3 the hackers were able to reduce the size of the row header. Amazing what the little things can do.