Skip to main content

Snowflake

Connect Snowflake to Bauplan by treating Bauplan as an Iceberg REST catalog and creating externally managed Iceberg tables in Snowflake that point to your Bauplan tables and S3 storage. This lets you query Bauplan-managed data from Snowflake without copying it.

Snowflake uses the catalog integration to discover table metadata, and the external volume to read objects in your S3 bucket.

note

In Snowflake, this pattern relies on three objects: an external volume (S3 access), a catalog integration (Iceberg REST endpoint), and an externally managed Iceberg table. See Snowflake Documentation.

When to use this integration

  • You want Snowflake users to run SQL over Bauplan tables that live in object storage.
  • You prefer read access from Snowflake without moving data.
  • You deploy and evolve tables in Bauplan, then surface them in Snowflake for analytics.

Snowflake Open Catalog and Iceberg REST are based on the open Iceberg REST specification, which standardizes how engines discover table metadata.

Prerequisites

  • Bauplan Iceberg REST endpoint and API key
    • Endpoint: https://api.use1.aprod.bauplanlabs.com/iceberg
    • Auth: Bearer token (we suggest a read-only Bauplan user).
  • An S3 bucket that stores your Bauplan table data and Iceberg metadata.
  • Snowflake role with privileges to create external volumes, catalog integrations, and tables.

Step 1. Define an external volume in Snowflake

To define an external volume in Snowflake, you need to configure an S3 bucket and an IAM role so Snowflake can securely access the Iceberg tables in it. Next, create an IAM policy that grants Snowflake access to your S3 location. Attach the policy to the role, and use the security credentials generated by AWS for that role to access the files. Finally, register that bucket as an external volume in Snowflake to manage Iceberg tables directly.

Follow the configuration guide in Snowflake Documentation.

Step 2. Catalog federation with Iceberg REST

Create a catalog integration in Snowflake that points to an Iceberg-compatible REST catalog by supplying the REST endpoint, optional namespace/prefix, and choosing an auth method (OAuth, bearer token, or AWS SigV4/IAM). Then enable the integration and set an optional refresh interval so Snowflake can discover and manage Iceberg tables via that external catalog.

The ICEBERG_REST integration uses the REST OpenAPI specification to enumerate namespaces and tables. CATALOG_NAMESPACE can be set here or later at table creation time. The documentation changes often, so refer to the Snowflake Documentation.

CREATE OR REPLACE CATALOG INTEGRATION BAUPLAN_CATALOG_MAIN
CATALOG_SOURCE = ICEBERG_REST
TABLE_FORMAT = ICEBERG
CATALOG_NAMESPACE = 'bauplan' -- optional default namespace
REST_CONFIG = (
CATALOG_URI = 'https://api.use1.aprod.bauplanlabs.com/iceberg'
)
REST_AUTHENTICATION = (
TYPE = BEARER
BEARER_TOKEN = '<BAUPLAN_API_KEY>'
);

To check the catalog contents, use system functions to return JSON with namespaces and tables discovered from the REST catalog.

-- list namespaces
SELECT SYSTEM$LIST_NAMESPACES_FROM_CATALOG('BAUPLAN_CATALOG_MAIN');

-- list tables in a namespace
SELECT SYSTEM$LIST_ICEBERG_TABLES_FROM_CATALOG('BAUPLAN_CATALOG_MAIN', 'tpch_1');

Registering a Bauplan catalog is effectively equivalent to registering a certain data branch in that catalog. By default the branch is main.

It is possible to register a specific branch other than main by simply pointing to the branch in the endpoint specification.

CATALOG_URI[https://api.use1.aprod.bauplanlabs.com/iceberg/](https://api.use1.aprod.bauplanlabs.com/iceberg/ciao.ciro)<your_username>.<branch_name>

Step 3. Create an externally managed Iceberg table in Snowflake

Create a table object in Snowflake that points to a Bauplan table registered in the catalog and stored in your S3 bucket. CATALOG_TABLE_NAME identifies the table as it appears in the remote catalog. Use CATALOG_NAMESPACE to override the default namespace if needed. Do not prefix the table name with a namespace in CATALOG_TABLE_NAME. Refer to the Snowflake Documentation.

-- If you set CATALOG_NAMESPACE at the integration level, you can omit it here.
-- Otherwise, specify CATALOG_NAMESPACE at table creation.
CREATE ICEBERG TABLE dev_taxi_zones
CATALOG = 'BAUPLAN_CATALOG_MAIN'
EXTERNAL_VOLUME = 'BAUPLAN_EXTERNAL_ICEBERG_VOLUME'
CATALOG_TABLE_NAME = 'taxi_zones'. -- table name in the Bauplan catalog
-- CATALOG_NAMESPACE = 'tpch_1' -- optional override
AUTO_REFRESH = TRUE; -- optional

note

When a table is managed by an external catalog, Snowflake typically treats it as read-only from Snowflake’s side. Writes remain managed by the external engine and catalog. Confirm current write support in Snowflake for your catalog type and region if you plan to write. Snowflake Documentation