Handle schema conflicts during import
When importing a dataset, Bauplan scans your files to infer the table schema. If your files have inconsistent columns or types, the import will fail with a schema conflict.
This guide shows how to generate a custom schema plan, resolve conflicts, and apply the corrected plan.
When an import fails this way, you might see an error like:
2025-05-26 14:12:14 WRN The produced plan contains conflicts
2025-05-26 14:12:14 ERR cannot automatically create table from search string. here are conflicts
To fix this, you will have to generate a custom import plan and manually resolve the conflicts.
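If you drive imports from a script, one way to decide when to fall back to the plan workflow is to look for the warning shown above in the CLI output. This is a minimal sketch, not part of Bauplan itself: `has_schema_conflict` is a hypothetical helper, and matching on the log text assumes the message wording stays the same across versions.

```python
# Hypothetical helper: detect the schema-conflict warning in captured CLI output.
# Assumption: the warning text matches the log lines shown in this guide.
CONFLICT_MARKER = "The produced plan contains conflicts"


def has_schema_conflict(cli_output: str) -> bool:
    """Return True if the captured output contains the conflict warning."""
    return CONFLICT_MARKER in cli_output


sample_output = (
    "2025-05-26 14:12:14 WRN The produced plan contains conflicts\n"
    "2025-05-26 14:12:14 ERR cannot automatically create table from search string. "
    "here are conflicts\n"
)

if has_schema_conflict(sample_output):
    print("Schema conflict detected: generate and edit an import plan.")
```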
Create a table plan
Instead of letting Bauplan infer the schema automatically, you'll create a plan and save it to a yml file, where potential schema conflicts can be resolved either manually or programmatically.
To create an import plan, run:
bauplan table create-plan --name <your_table_name> --search-uri 's3://your/s3/bucket/*.parquet' --save-plan table_plan.yml
This generates a table_plan.yml file that includes:
- The inferred schema
- Detected conflicts
- Metadata about the data files
Understand the YAML structure
The plan has a schema_info section that looks like this:
schema_info:
  conflicts:
    - column_with_conflict: VendorID
      reconcile_step: In the destination_datatype, please choose between (long or int)
  detected_schemas:
    - column_name: VendorID
      src_datatypes:
        - datatype: long
        - datatype: int
      dst_datatype:
        - datatype: long
        - datatype: int
For each conflicting column:
- src_datatypes lists the types Bauplan found across your files.
- dst_datatype lists the possible target types, and you must pick one. The value you keep in this field tells the system which data type to cast the column to.
- The conflicts field explains what to do: its reconcile_step entry spells out the choice you need to make.
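To review conflicts before editing, the fields above can be read with a few lines of Python. This is a sketch under stated assumptions: the plan is written inline as a dict that mirrors the YAML example (in practice you would parse the file, for instance with PyYAML's `yaml.safe_load`), and `summarize_conflicts` is a hypothetical helper, not a Bauplan function.

```python
# Assumption: `plan` mirrors the parsed contents of the plan file.
# It is written inline so the sketch runs without extra dependencies.
plan = {
    "schema_info": {
        "conflicts": [
            {
                "column_with_conflict": "VendorID",
                "reconcile_step": "In the destination_datatype, "
                "please choose between (long or int)",
            }
        ],
        "detected_schemas": [
            {
                "column_name": "VendorID",
                "src_datatypes": [{"datatype": "long"}, {"datatype": "int"}],
                "dst_datatype": [{"datatype": "long"}, {"datatype": "int"}],
            }
        ],
    }
}


def summarize_conflicts(plan):
    """Return one summary dict per conflicting column (hypothetical helper)."""
    info = plan["schema_info"]
    # Index detected schemas by column name for direct lookup.
    schemas = {s["column_name"]: s for s in info["detected_schemas"]}
    summary = []
    for conflict in info["conflicts"]:
        name = conflict["column_with_conflict"]
        column = schemas[name]
        summary.append(
            {
                "column": name,
                "found": [t["datatype"] for t in column["src_datatypes"]],
                "candidates": [t["datatype"] for t in column["dst_datatype"]],
                "hint": conflict["reconcile_step"],
            }
        )
    return summary


for item in summarize_conflicts(plan):
    print(f"{item['column']}: found {item['found']}, candidates {item['candidates']}")
```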
Cast types and resolve conflicts
To resolve the conflict, edit the dst_datatype list to include only the type you want. For example:
From:
dst_datatype:
  - datatype: long
  - datatype: int
To:
dst_datatype:
  - datatype: int
Once you've resolved all conflicts, your conflicts section should be empty:
conflicts: []
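The same edit can be made programmatically. The sketch below is one possible approach, assuming the plan has already been parsed into a dict with the structure shown earlier (for instance with PyYAML, which is not used here so the example stays dependency-free). `resolve_conflicts` and the `choices` mapping are hypothetical names introduced for illustration.

```python
import copy


def resolve_conflicts(plan, choices):
    """Return a copy of the plan with each chosen column narrowed to one type.

    `choices` maps a column name to the datatype to keep (hypothetical helper).
    Conflicts without a choice are left in place so they stay visible.
    """
    resolved = copy.deepcopy(plan)
    info = resolved["schema_info"]
    schemas = {s["column_name"]: s for s in info["detected_schemas"]}
    remaining = []
    for conflict in info["conflicts"]:
        name = conflict["column_with_conflict"]
        if name not in choices:
            remaining.append(conflict)
            continue
        candidates = [t["datatype"] for t in schemas[name]["dst_datatype"]]
        if choices[name] not in candidates:
            raise ValueError(
                f"{choices[name]!r} is not a candidate for {name}: {candidates}"
            )
        # Keep only the chosen type, as in the manual edit above.
        schemas[name]["dst_datatype"] = [{"datatype": choices[name]}]
    info["conflicts"] = remaining
    return resolved


# Assumption: this dict mirrors the plan file from the example in this guide.
plan = {
    "schema_info": {
        "conflicts": [
            {
                "column_with_conflict": "VendorID",
                "reconcile_step": "In the destination_datatype, "
                "please choose between (long or int)",
            }
        ],
        "detected_schemas": [
            {
                "column_name": "VendorID",
                "src_datatypes": [{"datatype": "long"}, {"datatype": "int"}],
                "dst_datatype": [{"datatype": "long"}, {"datatype": "int"}],
            }
        ],
    }
}

fixed = resolve_conflicts(plan, {"VendorID": "int"})
print(fixed["schema_info"]["conflicts"])
print(fixed["schema_info"]["detected_schemas"][0]["dst_datatype"])
```

The helper works on a copy, so the original plan stays available for comparison, and it rejects a type that is not among the listed candidates rather than writing it silently.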
Apply the plan and import data
Apply your edited schema plan:
bauplan table create-plan-apply --plan table_plan.yml
Then import the data as usual:
bauplan table import --name <your_table_name> --search-uri 's3://your/s3/bucket/*.parquet'
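If you want to script these two steps, a small wrapper can assemble the same commands shown above. This is a sketch only: `apply_and_import_commands` is a hypothetical helper, the table name and URI are placeholders, and the commands are printed rather than executed so the example runs without the Bauplan CLI installed.

```python
import shlex


def apply_and_import_commands(plan_path, table_name, search_uri):
    """Build the two CLI invocations from this guide as argument lists."""
    apply_cmd = ["bauplan", "table", "create-plan-apply", "--plan", plan_path]
    import_cmd = [
        "bauplan", "table", "import",
        "--name", table_name,
        "--search-uri", search_uri,
    ]
    return [apply_cmd, import_cmd]


commands = apply_and_import_commands(
    "table_plan.yml",
    "your_table_name",  # placeholder table name
    "s3://your/s3/bucket/*.parquet",  # placeholder search URI
)

for cmd in commands:
    # Print a shell-safe rendering of each command. To execute it instead,
    # use subprocess.run(cmd, check=True) so a failed apply stops the import.
    print(shlex.join(cmd))
```

Passing each command as an argument list means the `*` in the search URI reaches the CLI unchanged instead of being expanded by a shell.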
This manual step ensures you make intentional decisions about your schema, which matters most when the choice between types like int, long, or double could affect downstream logic or validation.
You're in full control of how your data is interpreted.