sqlite-parquet-vtable

Author	SHA1	Message	Date
Colin Dellow	d7c5002cee	Move some code out of ensureColumn. Saves ~4% on the cold census needle query (~425ms -> ~405ms).	2018-06-23 19:10:23 -04:00
Colin Dellow	d3ab5ff3e7	Cache clauses -> row group mapping. Create a shadow table; for `stats`, it'd be `_stats_rowgroups`. It contains three columns: (1) the clause (e.g. `city = 'Dawson Creek'`); (2) the initial estimate, as a bitmap of rowgroups based on stats; (3) the actual observed rowgroups, as a bitmap. This papers over poorly sorted parquet files, at the cost of some disk space. It makes interactive queries much more natural: drilldown-style queries are much faster, as they can leverage work done by previous queries. E.g. `SELECT * FROM stats WHERE city = 'Dawson Creek' and question_id >= 1935 and question_id <= 1940` takes ~584ms on first run, but 9ms on subsequent runs. We only create entries when the estimates don't match the actual results. Fixes #6	2018-03-24 23:57:15 -04:00
Colin Dellow	92ba5f94e0	Reuse FileMetaData. For the statscan dataset, parsing the file metadata takes ~30-40ms, so stash it away for future re-use.	2018-03-15 19:57:38 -04:00
Colin Dellow	824a416f51	better debug logs for xBestIndex	2018-03-08 13:21:33 -05:00
Colin Dellow	1de843fca8	Very rough first cut. Supports int32, double, strings.	2018-03-03 15:44:01 -05:00
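
The clause -> row group cache described in commit d3ab5ff3e7 can be illustrated with a small sketch. This is not the extension's actual code (the project itself is not written in Python); it only models the idea using the standard `sqlite3` module. The column names (`clause`, `estimate`, `actual`), the bitmap layout, and the `RowGroupCache` class are all assumptions made for illustration. Only the shadow table name pattern (`_stats_rowgroups` for `stats`) and the three-column shape come from the commit message.

```python
import sqlite3


def bitmap_from_rowgroups(rowgroups, total):
    """Pack row-group indices into a bytes bitmap, one bit per row group.

    The bit layout (least significant bit first) is an assumption.
    """
    bits = bytearray((total + 7) // 8)
    for rg in rowgroups:
        bits[rg // 8] |= 1 << (rg % 8)
    return bytes(bits)


def rowgroups_from_bitmap(bitmap):
    """Unpack a bitmap back into a sorted list of row-group indices."""
    return [i for i in range(len(bitmap) * 8) if (bitmap[i // 8] >> (i % 8)) & 1]


class RowGroupCache:
    """Hypothetical model of the shadow table keyed by constraint clause."""

    def __init__(self, conn, table):
        self.conn = conn
        # For a table named `stats`, the shadow table is `_stats_rowgroups`.
        self.shadow = f"_{table}_rowgroups"
        conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{self.shadow}" '
            "(clause TEXT PRIMARY KEY, estimate BLOB, actual BLOB)"
        )

    def lookup(self, clause):
        """Return the observed row groups for a clause, or None if not cached."""
        row = self.conn.execute(
            f'SELECT actual FROM "{self.shadow}" WHERE clause = ?', (clause,)
        ).fetchone()
        return None if row is None else rowgroups_from_bitmap(row[0])

    def record(self, clause, estimated, actual, total):
        """Store an entry only when the stats-based estimate was wrong."""
        if set(estimated) == set(actual):
            return
        self.conn.execute(
            f'INSERT OR REPLACE INTO "{self.shadow}" VALUES (?, ?, ?)',
            (
                clause,
                bitmap_from_rowgroups(estimated, total),
                bitmap_from_rowgroups(actual, total),
            ),
        )
```

In this model, a first query whose statistics suggest scanning every row group would call `record` after the scan with the row groups that actually contained matches. A later query with the same clause would call `lookup` first and read only those row groups, which is the kind of reuse the commit message credits for the faster repeat runs.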