sqlite-parquet-vtable

mirror of https://github.com/cldellow/sqlite-parquet-vtable.git synced 2025-07-22 18:33:29 +00:00

Author	SHA1	Message	Date
Colin Dellow	fd06ec5a23	test `rowid IS NULL`. Found via coverage.	2018-07-05 19:17:19 -04:00
Colin Dellow	ebb0eb7710	Add test case for #30	2018-07-04 19:59:55 -04:00
Colin Dellow	2e1ac92882	Revert "Add other random test case". This reverts commit 3bdc6f7078e39e56e21bb3fb965c01fd8447479e.	2018-07-04 19:49:42 -04:00
Colin Dellow	3bdc6f7078	Add other random test case	2018-07-04 19:48:36 -04:00
Colin Dellow	33f8dbe4f4	Add test case for #26	2018-07-04 19:45:08 -04:00
Colin Dellow	5b26a78c1f	Improve random query generation ...throw in the occasional `NOT`	2018-07-04 19:14:35 -04:00
Colin Dellow	0bdcc9895e	All-in-one build command. `./make-linux` clones and builds arrow, brotli, lz4, parquet, snappy, zlib, zstd, and this project as a statically linked binary. Two Boost libs are still pulled in as shared libs, should probably fix that, too, for ultimate portability.	2018-06-24 21:11:07 -04:00
Colin Dellow	d3ab5ff3e7	Cache clauses -> row group mapping. Create a shadow table. For `stats`, it'd be `_stats_rowgroups`. It contains three columns: (1) the clause (eg `city = 'Dawson Creek'`); (2) the initial estimate, as a bitmap of rowgroups based on stats; (3) the actual observed rowgroups, as a bitmap. This papers over poorly sorted parquet files, at the cost of some disk space. It makes interactive queries much more natural -- drilldown style queries are much faster, as they can leverage work done by previous queries. eg `SELECT * FROM stats WHERE city = 'Dawson Creek' and question_id >= 1935 and question_id <= 1940` takes ~584ms on first run, but 9ms on subsequent runs. We only create entries when the estimates don't match the actual results. Fixes #6	2018-03-24 23:57:15 -04:00
Colin Dellow	d2c736f25a	Add LIMIT/OFFSET to random queries	2018-03-24 19:02:30 -04:00
Colin Dellow	51d0f27a68	don't segfault on low memory Fixes #8	2018-03-24 12:48:29 -04:00
Colin Dellow	6fa7bc3d0b	Add harness for low memory testing	2018-03-24 11:27:06 -04:00
Colin Dellow	8bf890ab66	Fix incorrect row pruning for non-text BYTE_ARRAY	2018-03-18 19:43:09 -04:00
Colin Dellow	893e4c63f5	Add testcase generator. Very simplistic: select M fields, filter on N fields, with a slight bias to use values of the same type as the field being compared against. No segfaults yet, but one test case generates differing output when run against `nulls` and `nulls1`: `select rowid from nulls1 where binary_9 >= '56' and ts_5 < 496886400000;`	2018-03-18 19:11:26 -04:00
Colin Dellow	b0c7b229dd	Create queries from templates if needed	2018-03-18 17:50:39 -04:00
Colin Dellow	7f2042742b	Also compare queries against SQLite itself	2018-03-18 17:49:12 -04:00
Colin Dellow	e2af2a07a4	Make rowid start from 1, not 0. Unclear whether this is strictly required, but I'm going to start using SQLite as an oracle, and it'll be simpler if our rowids match theirs.	2018-03-18 17:03:46 -04:00
Colin Dellow	078754467e	Generate queries from templates. Huzzah, a bunch of failures have appeared.	2018-03-18 14:28:31 -04:00
Colin Dellow	e3f0dff083	Move queries/* to templates	2018-03-18 13:28:56 -04:00
Colin Dellow	65ea1b2f61	Rewrite tests for automatic generation. Regularize the parquets: nulls and nonulls each come in 3 variants, with 1, 10 and 99 rows per rowgroup. All test queries are written against nullsA, no_nullsA. Next commit will introduce a tool to expand these template queries to go against the actual tables.	2018-03-18 13:11:29 -04:00
Colin Dellow	3b557f7fb0	Add explicit test for file not found ...caching the metadata moved where ParquetTable did I/O, which introduced a segfault on not found	2018-03-18 11:58:23 -04:00
Colin Dellow	a3af16eb54	Row-filtering for other string ops	2018-03-17 15:28:51 -04:00
Colin Dellow	753a490687	Tests for blobs	2018-03-16 23:53:08 -04:00
Colin Dellow	cbf388698b	BOOL and INT96 tests	2018-03-16 16:02:11 -04:00
Colin Dellow	110e3e3668	row group skipping for is [not] null queries	2018-03-12 21:09:00 -04:00
Colin Dellow	acc15256ec	Add rowgroup filtering for rowid	2018-03-12 20:42:50 -04:00
Colin Dellow	1f938a005d	More test cases to deal with affinity. I'm not sure how these manifest - whether SQLite retypes them based on column affinity before we see them, or whether they're provided as is.	2018-03-11 19:18:44 -04:00
Colin Dellow	095b576cc2	Scaffolding for row group filters, tests. rowid is special since its column index is -1, so add explicit tests around it.	2018-03-11 15:44:51 -04:00
Colin Dellow	5559a7b563	Fix when last rowgroup is not same size as first ...change test data to use 99 rows, so that when we have rowgroup size 10 we exercise this code.	2018-03-11 15:15:27 -04:00
Colin Dellow	d28ae86d15	Test unusable constraints	2018-03-10 13:38:34 -05:00
Colin Dellow	96fcafcd2f	Add test cases	2018-03-10 13:25:13 -05:00
Colin Dellow	b7c134efc0	test-queries: can debug a testcase. `tests/test-queries regex` filters the test cases. If the resulting set has only one test case, run it under gdb.	2018-03-10 11:54:36 -05:00
Colin Dellow	2d616c54fb	More tests	2018-03-07 20:30:25 -05:00
Colin Dellow	35fcde926c	Rewrite SQL oracle harness	2018-03-07 20:20:34 -05:00
Colin Dellow	caefc23b1e	Add a pg oracle. Define `datetime`, `printf` fns in pg so it produces output similar to sqlite; tidy up input data to be less wide. To do: some fns to make it easy to generate a new test case. Probably want to mount all the 3 parquets simultaneously and refer to the sqlite table by the same name as the pg table.	2018-03-07 19:40:38 -05:00
Colin Dellow	0d4806ca6f	Rejig parquet generation. Rename "fixed_size_binary" -> "binary_10"; make null parquet use rowgroups of size 10: first rowgroup has no nulls, 2nd has all nulls, 3rd-10th have alternating nulls. This is prep for making a Postgres layer to use as an oracle for generating test cases so that we have good coverage before implementing advanced `xBestIndex` and `xFilter` modes.	2018-03-06 21:02:26 -05:00
Colin Dellow	56245c1d3d	test case for nulls	2018-03-04 22:48:39 -05:00
Colin Dellow	67005623df	`ensureColumn` catches up when rows are skipped	2018-03-04 22:29:35 -05:00
Colin Dellow	bb3a9440f7	Add query test framework, fix xFilter	2018-03-04 21:05:26 -05:00
Colin Dellow	7edb5e472f	Support BLOBs	2018-03-04 17:20:59 -05:00
Colin Dellow	a4f368af9c	Add tests for unsupported types	2018-03-04 13:02:42 -05:00

40 Commits
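
Illustration for commit d3ab5ff3e7 (clause -> row group cache). The sketch below is not the project's code, which is a C++ SQLite extension; it is a minimal Python model of the idea described in that commit message, using the standard `sqlite3` module. The shadow table name `_stats_rowgroups` and its three-column shape come from the commit message. The column names (`clause`, `estimate`, `actual`), the bitmap byte layout, and the `ClauseCache`, `to_bitmap` and `from_bitmap` names are assumptions made for illustration only.

```python
import sqlite3


def to_bitmap(rowgroups, num_rowgroups):
    """Pack row group indices into bytes; bit i set means row group i is included.
    The bit layout is an assumption, not the project's on-disk format."""
    bits = bytearray((num_rowgroups + 7) // 8)
    for rg in rowgroups:
        bits[rg // 8] |= 1 << (rg % 8)
    return bytes(bits)


def from_bitmap(bitmap):
    """Unpack a bitmap produced by to_bitmap into a sorted list of indices."""
    return [i for i in range(len(bitmap) * 8) if bitmap[i // 8] & (1 << (i % 8))]


class ClauseCache:
    """Hypothetical model of the shadow table for one virtual table."""

    def __init__(self, conn, table, num_rowgroups):
        self.conn = conn
        self.num_rowgroups = num_rowgroups
        self.shadow = f"_{table}_rowgroups"  # e.g. _stats_rowgroups
        conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{self.shadow}" '
            "(clause TEXT PRIMARY KEY, estimate BLOB, actual BLOB)"
        )

    def lookup(self, clause):
        """Return the observed row groups for a clause, or None if not cached."""
        row = self.conn.execute(
            f'SELECT actual FROM "{self.shadow}" WHERE clause = ?', (clause,)
        ).fetchone()
        return None if row is None else from_bitmap(row[0])

    def record(self, clause, estimated, actual):
        """Store an entry only when the stats-based estimate was wrong."""
        if set(estimated) == set(actual):
            return
        self.conn.execute(
            f'INSERT OR REPLACE INTO "{self.shadow}" VALUES (?, ?, ?)',
            (
                clause,
                to_bitmap(estimated, self.num_rowgroups),
                to_bitmap(actual, self.num_rowgroups),
            ),
        )
```

In this model, a first query on `city = 'Dawson Creek'` would scan every row group the statistics could not rule out and then call `record` with the row groups that actually contained matches; a later drilldown query reusing the same clause would call `lookup` and read only those row groups. Entries are skipped when the estimate already matches, mirroring the "only create entries when the estimates don't match" rule in the commit message.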