sqlite-parquet-vtable/parquet
Colin Dellow 6648ff5968 add string == row group filter
For the StatsCan census data set, filtering on `== 'Dawson Creek'` takes
the query from 980ms to 660ms.

The improvement is modest, as expected, since the data isn't sorted by
that column.
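
As a rough illustration of why an unsorted column limits the gain, here is
a minimal C++ sketch of equality pruning against per-row-group min/max
statistics. It assumes the filter works from such statistics;
`RowGroupStats`, `mayContain`, and `candidateRowGroups` are hypothetical
names for this sketch, not the code in parquet_filter.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the min/max statistics of one column
// in one row group.
struct RowGroupStats {
  std::string min;
  std::string max;
};

// A row group can only contain `value` if min <= value <= max.
// A true result means "maybe"; a false result means "definitely not".
bool mayContain(const RowGroupStats& s, const std::string& value) {
  assert(!(s.max < s.min));  // sanity check on the statistics
  return !(value < s.min) && !(s.max < value);
}

// Returns the indexes of the row groups that still have to be scanned.
std::vector<std::size_t> candidateRowGroups(
    const std::vector<RowGroupStats>& groups, const std::string& value) {
  std::vector<std::size_t> out;
  for (std::size_t i = 0; i < groups.size(); ++i) {
    if (mayContain(groups[i], value)) out.push_back(i);
  }
  return out;
}
```

When the column is sorted, each row group covers a narrow range and most
groups are skipped. When it is unsorted, most ranges straddle the value, so
most groups still have to be read.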

I'll try adding some scaffolding to do filtering at the row level, too.

We could also try unpacking the dictionary and testing the individual
values, although we may want some heuristics to decide whether it's
worth doing -- e.g. only if the distinct values amount to < 10% of the
rows.
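
One way such a heuristic could look is sketched below in C++. It reads the
10% figure as the ratio of distinct dictionary values to rows, which is an
assumption, and `worthCheckingDictionary` and `dictionaryContains` are
hypothetical helpers that operate on an already-decoded dictionary rather
than on any real Parquet API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Decide whether scanning the dictionary is likely to be cheap relative
// to scanning the rows. The 10% default threshold is an assumption.
bool worthCheckingDictionary(std::size_t distinctValues,
                             std::size_t totalRows,
                             double maxRatio = 0.10) {
  assert(maxRatio > 0.0);
  if (totalRows == 0) return false;  // nothing to scan either way
  return static_cast<double>(distinctValues) /
             static_cast<double>(totalRows) < maxRatio;
}

// If the value is absent from the dictionary, no row in the row group
// can match an equality filter on it.
bool dictionaryContains(const std::vector<std::string>& dictionary,
                        const std::string& value) {
  return std::find(dictionary.begin(), dictionary.end(), value) !=
         dictionary.end();
}
```

A row group would be skipped only when the heuristic says the check is
worthwhile and `dictionaryContains` returns false.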

Ideally, this should be a ~1ms query.
2018-03-15 20:40:21 -04:00
.gitignore         Initial checkin of CSV table                 2018-03-02 18:59:34 -05:00
Makefile           reuse FileMetaData                           2018-03-15 19:57:38 -04:00
cmds.txt           Code to pretty print constraints             2018-03-10 10:59:53 -05:00
go                 Code to pretty print constraints             2018-03-10 10:59:53 -05:00
parquet.cc         Remove bool from Constraint                  2018-03-12 20:50:30 -04:00
parquet_cursor.cc  add string == row group filter               2018-03-15 20:40:21 -04:00
parquet_cursor.h   Add stub row group filters for text/int/dbl  2018-03-12 23:07:41 -04:00
parquet_filter.cc  add string == row group filter               2018-03-15 20:40:21 -04:00
parquet_filter.h   add string == row group filter               2018-03-15 20:40:21 -04:00
parquet_table.cc   reuse FileMetaData                           2018-03-15 19:57:38 -04:00
parquet_table.h    reuse FileMetaData                           2018-03-15 19:57:38 -04:00