Handle schema conflicts during import¶

When importing a dataset, Bauplan scans your files to infer the table schema. If your files have inconsistent columns or types, the import will fail with a schema conflict.

This guide shows how to generate a custom schema plan, resolve conflicts, and apply the corrected plan.

When an import fails with a schema conflict, the log contains errors like:

2025-05-26 14:12:14 WRN The produced plan contains conflicts
2025-05-26 14:12:14 ERR cannot automatically create table from search string. here are conflicts

To fix this, generate a custom import plan and resolve the conflicts in it before importing.

Create a table plan¶

Instead of letting Bauplan infer the schema automatically, create a plan and save it to a YAML file, where you can resolve schema conflicts either manually or programmatically.

To create an import plan, run:

bauplan table create-plan --name <your_table_name> --search-uri 's3://your/s3/bucket/*.parquet' --save-plan table_plan.yml

This generates a table_plan.yml file that includes:

The inferred schema
Detected conflicts
Metadata about the data files
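
If you generate plans from a script rather than by hand, the command above can be assembled as an argument list. The following is a minimal sketch, not part of the Bauplan tooling: the helper name `build_create_plan_command` and the table name `my_table` are hypothetical, and it uses only the flags shown in this guide. It assumes the `bauplan` CLI is on your PATH when you actually run the command.

```python
import shlex


def build_create_plan_command(table_name, search_uri, plan_path):
    """Return the argv list for `bauplan table create-plan` (hypothetical helper)."""
    return [
        "bauplan", "table", "create-plan",
        "--name", table_name,
        "--search-uri", search_uri,
        "--save-plan", plan_path,
    ]


cmd = build_create_plan_command(
    "my_table",                        # hypothetical table name
    "s3://your/s3/bucket/*.parquet",   # search URI from the example above
    "table_plan.yml",
)

# Print a shell-safe rendering of the command for review.
print(shlex.join(cmd))

# To execute it, pass the list to subprocess.run(cmd, check=True).
```

Passing the arguments as a list means no shell is involved, so the `*` in the search URI reaches Bauplan unexpanded, which is what the single quotes achieve on the command line.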

Understand the YAML structure¶

The plan has a schema_info section that looks like this:

schema_info:
  conflicts:
    - column_with_conflict: VendorID
      reconcile_step: In the destination_datatype, please choose between (long or int)
  detected_schemas:
    - column_name: VendorID
      src_datatypes:
        - datatype: long
        - datatype: int
      dst_datatype:
        - datatype: long
        - datatype: int

For each conflicting column:

src_datatypes lists the types Bauplan found across your files.
dst_datatype lists the candidate target types. You must pick one: the type you leave in this field is the type the column will be cast to.
conflicts lists each conflicting column together with a reconcile_step that tells you how to resolve it.
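
To review conflicts programmatically, you can walk the same structure once the plan is loaded into a Python dictionary. The sketch below assumes the plan has already been parsed with a YAML library of your choice; here the dictionary is written out by hand to mirror the example above. The function name `summarize_conflicts` is hypothetical.

```python
# Hand-written copy of the schema_info example; in practice this would
# come from parsing the plan file with a YAML library (an assumption).
plan = {
    "schema_info": {
        "conflicts": [
            {
                "column_with_conflict": "VendorID",
                "reconcile_step": "In the destination_datatype, "
                                  "please choose between (long or int)",
            }
        ],
        "detected_schemas": [
            {
                "column_name": "VendorID",
                "src_datatypes": [{"datatype": "long"}, {"datatype": "int"}],
                "dst_datatype": [{"datatype": "long"}, {"datatype": "int"}],
            }
        ],
    }
}


def summarize_conflicts(plan):
    """Map each conflicting column to the candidate types in dst_datatype."""
    info = plan["schema_info"]
    conflicting = {c["column_with_conflict"] for c in info.get("conflicts") or []}
    return {
        col["column_name"]: [d["datatype"] for d in col["dst_datatype"]]
        for col in info["detected_schemas"]
        if col["column_name"] in conflicting
    }


summary = summarize_conflicts(plan)
print(summary)
```

For the example plan, the summary contains a single entry for VendorID with the candidates long and int.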

Cast types and resolve conflicts¶

To resolve the conflict, edit the dst_datatype list to include only the type you want. For example:

From:

dst_datatype:
  - datatype: long
  - datatype: int

To:

dst_datatype:
  - datatype: int

Once you’ve resolved all conflicts, your conflicts: section should be empty:

conflicts: []
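
The same edit can be scripted when a plan contains many conflicting columns. This is a minimal sketch under two assumptions: the plan has been loaded into a dictionary with the layout shown in this guide, and emptying the conflicts list after choosing a type for every column is sufficient, as described above. The function name `resolve_conflicts` and the `choices` mapping are hypothetical.

```python
import copy


def resolve_conflicts(plan, choices):
    """Return a copy of the plan with one dst_datatype per conflicting column.

    choices maps a column name to the type to keep. A missing choice or a
    type that is not among the candidates raises an error, so nothing is
    cast by accident.
    """
    resolved = copy.deepcopy(plan)
    info = resolved["schema_info"]
    columns = {c["column_name"]: c for c in info["detected_schemas"]}

    for conflict in info.get("conflicts") or []:
        name = conflict["column_with_conflict"]
        if name not in choices:
            raise KeyError(f"no type chosen for column {name!r}")
        candidates = [d["datatype"] for d in columns[name]["dst_datatype"]]
        if choices[name] not in candidates:
            raise ValueError(f"{choices[name]!r} is not one of {candidates}")
        columns[name]["dst_datatype"] = [{"datatype": choices[name]}]

    # Every conflict has been handled, so the section is left empty.
    info["conflicts"] = []
    return resolved


# Dictionary mirroring the example plan (normally parsed from the YAML file).
plan = {
    "schema_info": {
        "conflicts": [{"column_with_conflict": "VendorID"}],
        "detected_schemas": [
            {
                "column_name": "VendorID",
                "src_datatypes": [{"datatype": "long"}, {"datatype": "int"}],
                "dst_datatype": [{"datatype": "long"}, {"datatype": "int"}],
            }
        ],
    }
}

resolved = resolve_conflicts(plan, {"VendorID": "int"})
print(resolved["schema_info"])
```

The function works on a deep copy, so the original dictionary stays available for comparison before you write the resolved plan back to the YAML file.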

Apply the plan and import data¶

Apply your edited schema plan:

bauplan table create-plan-apply --plan table_plan.yml

Then import the data as usual:

bauplan table import --name <your_table_name> --search-uri 's3://your/s3/bucket/*.parquet'
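
In an automated workflow, the two commands above can be run in sequence so that the import starts only if the plan was applied successfully. The sketch below is one way to do that, using only the flags shown in this guide. The helper names, the table name `my_table`, and the injectable `runner` parameter are hypothetical, and the `bauplan` CLI is assumed to be on your PATH.

```python
import shlex
import subprocess


def build_apply_and_import_commands(table_name, search_uri, plan_path):
    """Return the apply and import commands, in the order they must run."""
    apply_cmd = ["bauplan", "table", "create-plan-apply", "--plan", plan_path]
    import_cmd = [
        "bauplan", "table", "import",
        "--name", table_name,
        "--search-uri", search_uri,
    ]
    return [apply_cmd, import_cmd]


def run_all(commands, runner=subprocess.run):
    """Run each command in order; check=True stops at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)


commands = build_apply_and_import_commands(
    "my_table",                        # hypothetical table name
    "s3://your/s3/bucket/*.parquet",
    "table_plan.yml",
)

# Dry run: print the commands instead of executing them.
for cmd in commands:
    print(shlex.join(cmd))

# To execute for real: run_all(commands)
```

With `check=True`, `subprocess.run` raises `CalledProcessError` on a non-zero exit code, so a failed apply step prevents the import from running.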

This manual step ensures that schema decisions are intentional, which matters most when the choice between types such as int, long, or double could affect downstream logic or validation.

You’re in full control of how your data is interpreted.