DuckLake support is experimental and disabled by default. Set enable_ducklake = true to use it, and expect the behavior to change. Only DuckLake catalogs hosted on PostgreSQL are supported.
This guide walks you through setting up a DuckLake catalog on PostgreSQL with Parquet data files on local disk, and then reading that data from Firebolt Core. You’ll create a table with DuckDB and query it from Firebolt using READ_DUCKLAKE and LIST_DUCKLAKE_FILES.
Prerequisites
- Firebolt Core
- PostgreSQL — this guide runs it as a Docker container
- DuckDB — tested with v1.5.2
Step 1: Start PostgreSQL
DuckLake stores its catalog metadata in a SQL database. Start a PostgreSQL container, setting the user, password, and database name:
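# Remove any earlier container with this name (an error is safe to ignore if none exists)
docker rm -f ducklake-postgres
# Start PostgreSQL 16 in the background and publish it on host port 5432
docker run -d \
  --name ducklake-postgres \
  -e POSTGRES_USER=dl_user \
  -e POSTGRES_PASSWORD=dl_pw \
  -e POSTGRES_DB=dl_db \
  -p 5432:5432 \
  postgres:16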
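PostgreSQL needs a moment after `docker run -d` returns before it accepts connections. The following is a minimal sketch of a retry helper you could use to wait for it. The function name `wait_for` and the `WAIT_INTERVAL` variable are hypothetical, and the usage line assumes the `pg_isready` utility that ships in the official `postgres` image.

```shell
# wait_for TRIES CMD...: rerun CMD until it succeeds or TRIES attempts are used up.
# WAIT_INTERVAL (seconds between attempts, default 1) is a hypothetical knob.
wait_for() {
  local tries=$1
  local i=0
  shift
  while [ "$i" -lt "$tries" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_INTERVAL:-1}"
  done
  return 1
}

# Usage against the container from this step:
#   wait_for 30 docker exec ducklake-postgres pg_isready -U dl_user -d dl_db
```

The helper returns a non-zero status if the command never succeeds, so it can gate the next step in a script.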
Step 2: Create a DuckLake table with DuckDB
- Install DuckDB and start the DuckDB shell:
curl https://install.duckdb.org | sh
duckdb
The remaining commands in this step run inside the DuckDB shell.
- Install and load the DuckLake extension:
INSTALL ducklake;
LOAD ducklake;
- Create the DuckLake catalog in PostgreSQL and attach it. Use the same credentials you set for the PostgreSQL container. The DATA_PATH option determines where the Parquet data files are written on local disk:
ATTACH 'ducklake:postgres:host=localhost port=5432 user=dl_user password=dl_pw dbname=dl_db'
AS pg_ducklake (DATA_PATH '/tmp/ducklake/', OVERRIDE_DATA_PATH true);
- Create a table and insert some data:
-- Use the DuckLake catalog
USE pg_ducklake;
-- Create a table in the DuckLake catalog
CREATE TABLE my_first_ducklake_table (a INT, r FLOAT);
-- Insert sample data
INSERT INTO my_first_ducklake_table SELECT x, random() FROM generate_series(1, 10000) g(x);
- Confirm the table exists, with its Parquet files written to local disk:
-- Optional: show query latency
.timer on
SELECT * FROM my_first_ducklake_table ORDER BY r DESC LIMIT 5;
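The query above confirms the table through the catalog. To check the second half of the claim, that Parquet files landed on local disk, you can count them from a regular shell (outside the DuckDB shell). This is a small sketch; the helper name `count_parquet` is hypothetical, and it makes no assumption about the directory layout DuckLake uses below DATA_PATH.

```shell
# count_parquet DIR: print how many .parquet files exist anywhere under DIR.
count_parquet() {
  find "$1" -type f -name '*.parquet' | wc -l | tr -d ' '
}

# Usage for the DATA_PATH from this guide:
#   count_parquet /tmp/ducklake
```

A count of at least 1 after the INSERT suggests the data files were written where Step 3 expects to mount them.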
Step 3: Start Firebolt Core
Start the Firebolt Core container. Two options matter for DuckLake:
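--network host makes the container share the host's network stack, so Firebolt can reach PostgreSQL on the port published in Step 1 (127.0.0.1:5432).
-v /tmp/ducklake:/tmp/ducklake mounts the local directory where DuckDB wrote the Parquet files into the Firebolt container at the same path, so that paths recorded in the catalog also resolve inside the container.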
docker run -it \
  --name firebolt-core \
  --rm \
  --privileged \
  --ulimit memlock=8589934592:8589934592 \
  --network host \
  -v /tmp/ducklake:/tmp/ducklake \
  ghcr.io/firebolt-db/firebolt-core:preview-rc
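Before moving on, it can help to check from a second terminal that Firebolt Core is answering. The sketch below assumes the query endpoint on port 3473 accepts a SQL statement as the body of an HTTP POST. The helper name `fb_query` and the `FIREBOLT_URL` variable are hypothetical.

```shell
# fb_query SQL: send one SQL statement to the Firebolt Core query endpoint.
# FIREBOLT_URL is a hypothetical override; the default assumes localhost:3473.
fb_query() {
  curl -sS --fail "${FIREBOLT_URL:-http://localhost:3473}" --data-binary "$1"
}

# Usage once the container is up:
#   fb_query "SELECT 42"
```

With `--fail`, curl exits with a non-zero status on an HTTP error, which makes the helper usable as a simple health check.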
Step 4: Query the DuckLake table from Firebolt
The container from Step 3 runs in the foreground, so open a new terminal to connect to Firebolt Core. Any supported client works — connect to the query endpoint on port 3473, then run the SQL below.
- Enable DuckLake support:
SET enable_ducklake = true;
- Create a location object that points to the DuckLake catalog. Use the same credentials you set for the PostgreSQL container. Because the data files are on local disk, no endpoint or storage credentials are needed:
CREATE LOCATION my_ducklake_loc WITH
SOURCE = DUCKLAKE
CATALOG = 'postgresql://dl_user:dl_pw@127.0.0.1:5432/dl_db';
For the full syntax, see CREATE LOCATION (DuckLake).
- List the Parquet files that make up your table:
SELECT *
FROM LIST_DUCKLAKE_FILES(
LOCATION => 'my_ducklake_loc',
SCHEMA => 'main', -- optional: 'main' is the default
TABLE => 'my_first_ducklake_table'
);
- Read the data. This is the Firebolt equivalent of the query you ran in DuckDB:
SELECT *
FROM READ_DUCKLAKE(
LOCATION => 'my_ducklake_loc',
TABLE => 'my_first_ducklake_table'
)
ORDER BY r DESC
LIMIT 5;
- Inspect the query plan to see how caching behaves across repeated runs. Run the statement twice and compare the two plans; the second run should read cached metadata and data:
EXPLAIN (ANALYZE)
SELECT *
FROM READ_DUCKLAKE(
LOCATION => 'my_ducklake_loc',
TABLE => 'my_first_ducklake_table'
)
ORDER BY r DESC
LIMIT 5;
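Steps 2 and 4 use the same PostgreSQL credentials in two different formats: a key/value string in the DuckDB ATTACH statement and a postgresql:// URI in CREATE LOCATION. If you change the credentials, both must change together. The sketch below generates both strings from one set of variables. The variable and function names are hypothetical, and it assumes the credentials contain no characters that need quoting or percent-encoding.

```shell
# Hypothetical variables holding the credentials from Step 1.
PG_USER=dl_user
PG_PASSWORD=dl_pw
PG_DB=dl_db
PG_PORT=5432

# Key/value form used in the DuckDB ATTACH statement (Step 2).
duckdb_conn() {
  printf 'ducklake:postgres:host=%s port=%s user=%s password=%s dbname=%s\n' \
    "$1" "$PG_PORT" "$PG_USER" "$PG_PASSWORD" "$PG_DB"
}

# URI form used as CATALOG in CREATE LOCATION (Step 4).
firebolt_catalog_uri() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    "$PG_USER" "$PG_PASSWORD" "$1" "$PG_PORT" "$PG_DB"
}

duckdb_conn localhost
# prints: ducklake:postgres:host=localhost port=5432 user=dl_user password=dl_pw dbname=dl_db
firebolt_catalog_uri 127.0.0.1
# prints: postgresql://dl_user:dl_pw@127.0.0.1:5432/dl_db
```

With the default values, the two printed strings match the ones used verbatim in this guide.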
Next steps
- READ_DUCKLAKE — Full reference, including reading from S3-compatible object storage, pinning a snapshot, and the supported data types.
- LIST_DUCKLAKE_FILES — Inspect a table’s data files and per-file statistics.
- CREATE LOCATION (DuckLake) — Store catalog connection details and credentials in a reusable location object.