Handle schema conflicts during import
When importing a dataset, Bauplan scans your files to infer the table schema. If your files have inconsistent columns or types, the import will fail with a schema conflict.
This guide shows how to generate a custom schema plan, resolve conflicts, and apply the corrected plan.
When this happens, you might see an error like:
2025-05-26 14:12:14 WRN The produced plan contains conflicts
2025-05-26 14:12:14 ERR cannot automatically create table from search string. here are conflicts
To fix this, you will have to generate a custom import plan and manually resolve the conflicts.
Create a table plan
Instead of letting Bauplan infer the schema automatically, you'll generate an import plan and save it as a YAML file, where schema conflicts can be resolved either manually or programmatically.
To create an import plan, run:
bauplan table create-plan --name <your_table_name> --search-uri 's3://your/s3/bucket/*.parquet' --save-plan table_plan.yml
This generates a table_plan.yml file that includes:
The inferred schema
Detected conflicts
Metadata about the data files
Understand the YAML structure
The plan has a schema_info section that looks like this:
schema_info:
  conflicts:
    - column_with_conflict: VendorID
      reconcile_step: In the destination_datatype, please choose between (long or int)
  detected_schemas:
    - column_name: VendorID
      src_datatypes:
        - datatype: long
        - datatype: int
      dst_datatype:
        - datatype: long
        - datatype: int
For each conflicting column:
src_datatypes lists the types Bauplan found across your files.
dst_datatype lists the possible target types; you must pick one. The value you leave in this field tells the system which data type to cast the column to.
The conflicts field explains what to do: the reconcile_step of each entry describes the choice you need to make for that column.
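To see at a glance which columns need a decision, the plan can also be read programmatically. The following is a minimal Python sketch, assuming the plan file has already been parsed into a dictionary with the layout shown above (for example with a YAML parser); the helper name list_conflicts is hypothetical and not part of Bauplan.

```python
# Hypothetical helper, not part of the Bauplan API.
# `plan` is assumed to be the parsed contents of the plan file,
# mirroring the schema_info layout shown above.

def list_conflicts(plan):
    """Map each conflicting column to the source types found for it."""
    info = plan["schema_info"]
    conflicting = {c["column_with_conflict"] for c in info.get("conflicts") or []}
    found = {}
    for col in info.get("detected_schemas") or []:
        if col["column_name"] in conflicting:
            found[col["column_name"]] = [t["datatype"] for t in col["src_datatypes"]]
    return found


# Sample plan copied from the YAML example above.
plan = {
    "schema_info": {
        "conflicts": [
            {
                "column_with_conflict": "VendorID",
                "reconcile_step": "In the destination_datatype, please choose between (long or int)",
            }
        ],
        "detected_schemas": [
            {
                "column_name": "VendorID",
                "src_datatypes": [{"datatype": "long"}, {"datatype": "int"}],
                "dst_datatype": [{"datatype": "long"}, {"datatype": "int"}],
            }
        ],
    }
}

print(list_conflicts(plan))  # {'VendorID': ['long', 'int']}
```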
Cast types and resolve conflicts
To resolve the conflict, edit the dst_datatype list to include only the type you want. For example:
From:
dst_datatype:
  - datatype: long
  - datatype: int
To:
dst_datatype:
  - datatype: int
Once you’ve resolved all conflicts, your conflicts: section should be empty:
conflicts: []
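When many columns conflict, the same edit can be scripted. The sketch below is one possible approach in Python, assuming the plan has been parsed into a dictionary with the structure shown earlier; resolve_conflicts and its choices argument are hypothetical names, not Bauplan functions. It narrows dst_datatype to the chosen type and drops the matching entries from conflicts, leaving the original dictionary untouched.

```python
import copy

# Hypothetical helper, not part of the Bauplan API.
def resolve_conflicts(plan, choices):
    """Return a copy of `plan` with one destination type per chosen column.

    `choices` maps a column name to the type to keep, e.g. {"VendorID": "int"}.
    """
    resolved = copy.deepcopy(plan)
    info = resolved["schema_info"]
    done = set()
    for col in info["detected_schemas"]:
        name = col["column_name"]
        if name not in choices:
            continue
        candidates = [t["datatype"] for t in col["dst_datatype"]]
        if choices[name] not in candidates:
            raise ValueError(f"{choices[name]!r} is not a candidate for {name}: {candidates}")
        # Keep only the chosen type, as in the manual edit above.
        col["dst_datatype"] = [{"datatype": choices[name]}]
        done.add(name)
    # Drop the conflict entries that now have a single destination type.
    info["conflicts"] = [
        c for c in info.get("conflicts") or []
        if c["column_with_conflict"] not in done
    ]
    return resolved


# Sample plan copied from the YAML example earlier in this guide.
plan = {
    "schema_info": {
        "conflicts": [
            {
                "column_with_conflict": "VendorID",
                "reconcile_step": "In the destination_datatype, please choose between (long or int)",
            }
        ],
        "detected_schemas": [
            {
                "column_name": "VendorID",
                "src_datatypes": [{"datatype": "long"}, {"datatype": "int"}],
                "dst_datatype": [{"datatype": "long"}, {"datatype": "int"}],
            }
        ],
    }
}

resolved = resolve_conflicts(plan, {"VendorID": "int"})
print(resolved["schema_info"]["conflicts"])                         # []
print(resolved["schema_info"]["detected_schemas"][0]["dst_datatype"])  # [{'datatype': 'int'}]
```

Rejecting a type that is not among the listed candidates keeps the script from writing a value the plan never offered.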
Apply the plan and import data
Apply your edited schema plan:
bauplan table create-plan-apply --plan table_plan.yml
Then import the data as usual:
bauplan table import --name <your_table_name> --search-uri 's3://your/s3/bucket/*.parquet'
This manual step ensures you're making intentional decisions about your schema, which is especially important when the choice between types like int, long, or double could affect downstream logic or validation.
You’re in full control of how your data is interpreted.
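As an illustration of why the choice matters, the example in this guide narrows VendorID from long to int. Assuming int is a 32-bit signed integer and long a 64-bit signed integer (the usual convention in Parquet and Iceberg, but an assumption about Bauplan here), that narrowing is only safe when every value fits the 32-bit range. A hypothetical check in Python, run on values you have sampled from the column yourself:

```python
# Assumption: "int" means 32-bit signed and "long" means 64-bit signed.
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def fits_int32(values):
    """True if every non-null value fits in a 32-bit signed integer."""
    return all(INT32_MIN <= v <= INT32_MAX for v in values if v is not None)

print(fits_int32([1, 2, None, 6]))        # True
print(fits_int32([1, 2, 3_000_000_000]))  # False
```

If the check fails, keeping long as the destination type avoids losing data in the cast.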