A SQLite vtable extension to read Parquet files
Colin Dellow cbf388698b BOOL and INT96 tests 2018-03-16 16:02:11 -04:00
parquet Don't use accessors 2018-03-15 23:04:11 -04:00
parquet-generator Fix when last rowgroup is not same size as first 2018-03-11 15:15:27 -04:00
tests BOOL and INT96 tests 2018-03-16 16:02:11 -04:00
.gitignore test-queries: can debug a testcase 2018-03-10 11:54:36 -05:00
LICENSE Initial commit 2018-03-02 18:37:08 -05:00
README.md Note about versions 2018-03-16 00:19:25 -04:00
build-sqlite Add script to fetch+build sqlite 2018-03-02 18:46:40 -05:00


sqlite-parquet-vtable

A SQLite virtual table extension to expose Parquet files as SQL tables.

Caveats

I'm not an experienced C/C++ programmer. This library is definitely not bombproof. It's good enough for my use case, and may be good enough for yours, too.

  • I don't use sqlite3_malloc and sqlite3_free for C++ objects
    • Maybe this doesn't matter, since portability isn't a goal
  • The C -> C++ interop definitely lets some C++ exceptions escape into SQLite
    • Obvious cases like file not found and unsupported Parquet types are handled
    • Low-memory conditions aren't handled gracefully

Building

  1. Install parquet-cpp
    1. The parquet-cpp master branch appears to be broken for text row group stats; see https://github.com/cldellow/sqlite-parquet-vtable/issues/5 for which versions to use
  2. Run ./build-sqlite to fetch and build the SQLite dev bits
  3. Run ./parquet/make to build the module
    1. You will need to fix up the paths in that script to point at your local parquet-cpp folder.

Use

$ sqlite/sqlite3
sqlite> .load parquet/libparquet
sqlite> CREATE VIRTUAL TABLE demo USING parquet('parquet-generator/100-rows-1.parquet');
sqlite> SELECT * FROM demo;
...if all goes well, you'll see data here!...
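
The same session can be driven from a host program. Here is a minimal sketch using Python's standard sqlite3 module. It assumes your Python build permits extension loading and that the module was built at parquet/libparquet, as in the shell session above. The helper names create_statement and open_parquet are hypothetical and not part of this project.

```python
import sqlite3


def create_statement(table: str, parquet_path: str) -> str:
    """Build the CREATE VIRTUAL TABLE statement shown in the shell session."""
    if not table.isidentifier():
        raise ValueError(f"not a plain identifier: {table!r}")
    # Assumption: how the module treats quotes inside its argument is not
    # documented here, so refuse such paths rather than guess at escaping.
    if "'" in parquet_path:
        raise ValueError("path must not contain a single quote")
    return f"CREATE VIRTUAL TABLE {table} USING parquet('{parquet_path}')"


def open_parquet(extension_path: str, parquet_path: str,
                 table: str = "demo") -> sqlite3.Connection:
    """Open an in-memory database with the Parquet file exposed as `table`."""
    conn = sqlite3.connect(":memory:")
    conn.enable_load_extension(True)  # extension loading is off by default
    try:
        conn.load_extension(extension_path)
    finally:
        conn.enable_load_extension(False)
    conn.execute(create_statement(table, parquet_path))
    return conn
```

With the module built, open_parquet("parquet/libparquet", "parquet-generator/100-rows-1.parquet") should return a connection on which SELECT * FROM demo behaves as in the shell.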

Supported features

Index

Only full table scans are supported.
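
In practice this means a filter such as WHERE id = 5 should still return the right rows, but only after every row has been read. If you query the same file repeatedly, one option is to copy the rows into an ordinary SQLite table and index that copy. The sketch below is an assumption about a reasonable workflow, not a feature of this extension. The materialize function and the table and column names are hypothetical, and an ordinary table stands in for the virtual table so the snippet runs without the extension.

```python
import sqlite3


def materialize(conn, source, dest, index_column):
    """Copy `source` into an ordinary table `dest` and index one column."""
    for name in (source, dest, index_column):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    conn.execute(f"CREATE TABLE {dest} AS SELECT * FROM {source}")
    conn.execute(
        f"CREATE INDEX {dest}_{index_column}_idx ON {dest}({index_column})"
    )


conn = sqlite3.connect(":memory:")
# Stand-in for the Parquet-backed virtual table from the Use section.
conn.execute("CREATE TABLE demo (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)", [(i, f"row-{i}") for i in range(100)]
)

materialize(conn, "demo", "demo_local", "id")

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM demo_local WHERE id = 5"
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
```

Here plan_text should mention the new index, which shows that lookups on the copy no longer need a full scan. The copy is a snapshot: it does not follow later changes to the Parquet file.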

Types

These types are supported:

  • INT96 timestamps (exposed as milliseconds since the epoch)
  • INT8/INT16/INT32/INT64
  • UTF8 strings
  • BOOLEAN
  • FLOAT
  • DOUBLE
  • Variable- and fixed-length byte arrays
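
Since INT96 timestamps arrive as integer milliseconds since the epoch, they can be turned into readable text with SQLite's built-in datetime function after dividing by 1000. This is a minimal sketch that assumes the Unix epoch. The events table and ts column are hypothetical, and an ordinary table stands in for a Parquet-backed one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for a virtual table with an INT96 timestamp column.
conn.execute("CREATE TABLE events (ts INTEGER)")
conn.executemany("INSERT INTO events VALUES (?)", [(0,), (86_400_000,)])

# Integer division by 1000 converts milliseconds to whole seconds,
# which is the unit the 'unixepoch' modifier expects.
rows = conn.execute(
    "SELECT ts, datetime(ts / 1000, 'unixepoch') FROM events ORDER BY ts"
).fetchall()
```

The integer division discards the sub-second part, so keep the raw value if you need millisecond precision.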

These are not supported:

  • UINT8/UINT16/UINT32/UINT64
  • DECIMAL