mirror of https://github.com/cldellow/sqlite-parquet-vtable.git synced 2025-03-12 07:49:45 +00:00
Colin Dellow d3ab5ff3e7 Cache clauses -> row group mapping
Create a shadow table. For `stats`, it'd be `_stats_rowgroups`.

It contains three columns:

- the clause (eg `city = 'Dawson Creek'`)
- the initial estimate, as a bitmap of rowgroups based on stats
- the actual observed rowgroups, as a bitmap
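
As a rough illustration of the layout described above, here is a minimal Python sketch of such a shadow table. The column names (`clause`, `estimate`, `actual`), the BLOB bitmap encoding, and both helper functions are assumptions made for this example; the commit message does not spell out the real schema.

```python
import sqlite3


def create_shadow_table(conn, table_name):
    """Create the shadow table for a virtual table, e.g. stats -> _stats_rowgroups."""
    shadow = f"_{table_name}_rowgroups"
    # Column names are hypothetical; the source only says there are three columns.
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{shadow}" ('
        "clause TEXT PRIMARY KEY, "  # e.g. city = 'Dawson Creek'
        "estimate BLOB, "            # bitmap of rowgroups predicted from stats
        "actual BLOB)"               # bitmap of rowgroups that really matched
    )
    return shadow


def to_bitmap(rowgroups, num_rowgroups):
    """Pack rowgroup indices into a bitmap, one bit per rowgroup (assumed encoding)."""
    bits = bytearray((num_rowgroups + 7) // 8)
    for rg in rowgroups:
        bits[rg // 8] |= 1 << (rg % 8)
    return bytes(bits)
```

With this encoding, a file with 10 rowgroups where only rowgroups 0 and 9 matched would store a two-byte bitmap.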

This papers over poorly sorted parquet files, at the cost of some disk
space. It makes interactive queries much more natural -- drilldown style
queries are much faster, as they can leverage work done by previous
queries.

eg `SELECT * FROM stats WHERE city = 'Dawson Creek' and question_id >= 1935 and question_id <= 1940`
takes ~584ms on first run, but 9ms on subsequent runs.

We only create entries when the estimates don't match the actual
results.
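
The caching rule can be sketched as follows. This is a simplified in-memory model in Python, not the extension's actual code: the `cache` dict stands in for the shadow table, sets of rowgroup indices stand in for bitmaps, and both function names are hypothetical.

```python
def rowgroups_to_scan(cache, clause, estimate):
    """Return the rowgroups to read for a clause.

    If an earlier query recorded the rowgroups that actually matched,
    use those; otherwise fall back to the stats-based estimate.
    """
    entry = cache.get(clause)
    if entry is not None:
        _cached_estimate, actual = entry
        return actual
    return estimate


def record_observation(cache, clause, estimate, actual):
    """Store an entry only when the estimate and the observed rowgroups differ."""
    if actual != estimate:
        cache[clause] = (estimate, actual)
```

When the estimate was already exact, nothing is written, so well-sorted files cost no extra disk space under this model.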

Fixes #6
2018-03-24 23:57:15 -04:00