A package for working with large persistent data sets.

JuliaDB is a package for working with large persistent data sets

We recognized the need for an all-Julia, end-to-end tool that can:

  • Load multi-dimensional datasets quickly and incrementally.

  • Index the data and perform filter, aggregate, sort and join operations.

  • Save results and load them efficiently later.

  • Use Julia's built-in parallelism to fully utilize any machine or cluster.

We built JuliaDB to fill this void.

JuliaDB is built on Dagger and IndexedTables

  • JuliaDB provides distributed array and table data structures, with convenient functions for loading data from CSV files.

  • JuliaDB is Julia all the way down. This means queries can be efficiently composed with packages from the entire Julia ecosystem.
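As a rough illustration of the load, query, and save workflow described above, here is a minimal sketch. The sample data, column names, and file names are made up for this example, and it assumes a single Julia process with the JuliaDB package installed; exact behavior may differ between JuliaDB versions.

```julia
using JuliaDB

# Hypothetical sample data, written to a temporary CSV file for illustration.
dir = mktempdir()
csv = joinpath(dir, "sales.csv")
write(csv, """
region,year,amount
east,2017,10
west,2017,20
east,2018,30
west,2018,40
""")

# Load the CSV into a table indexed (sorted) by region and year.
sales = loadtable(csv; indexcols = [:region, :year])

# Filter: keep only the rows from 2018 onward.
recent = filter(row -> row.year >= 2018, sales)

# Aggregate: total amount per region, using an ordinary Julia function (+).
totals = groupreduce(+, sales, :region; select = :amount)

# Save the result in JuliaDB's binary format and load it back later.
saved = joinpath(dir, "totals.jdb")
JuliaDB.save(totals, saved)
reloaded = JuliaDB.load(saved)
```

Because the predicate passed to `filter` and the reducer passed to `groupreduce` are plain Julia functions, any function from another Julia package could be used in their place.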

Get started