
Explore Data

Start by understanding what data is already available. Your Bauplan sandbox comes with pre-loaded public datasets.

Explore the tables in the main branch of the data lake that contain data about the taxi rides in NYC.

The agent will load the explore-data skill and use it to create a dedicated folder named data-exploration containing one or more Python files that run the data analysis.

You can use your agent to ask specific questions like the following:

Can you show me the schema of taxi_fhvhv in the main branch and tell me what time range of data it covers?
Give me a preview of 5 rows from the taxi_fhvhv table in the main branch and tell me if there are anomalies in the table that I should be aware of.
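To make the questions above concrete, here is a minimal sketch of the kind of SQL they might translate to. The helper names and the timestamp column pickup_datetime are assumptions made for illustration; check the actual schema of taxi_fhvhv first. The functions only build query strings and leave out how the agent would run them against the main branch.

```python
def time_range_query(table: str, ts_column: str) -> str:
    """Build a query returning the earliest and latest timestamp plus a row count."""
    return (
        f"SELECT MIN({ts_column}) AS first_ts, "
        f"MAX({ts_column}) AS last_ts, "
        f"COUNT(*) AS row_count "
        f"FROM {table}"
    )


def preview_query(table: str, limit: int = 5) -> str:
    """Build a query returning a small sample of rows."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM {table} LIMIT {limit}"


if __name__ == "__main__":
    # 'pickup_datetime' is an assumed column name, not confirmed by this page.
    print(time_range_query("taxi_fhvhv", "pickup_datetime"))
    print(preview_query("taxi_fhvhv"))
```

Both queries are plain SELECT statements, so they stay within the read-only scope of data exploration.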

The agent can fetch the individual CLI commands described in the file .claude/bauplan_reference/bauplan_cli.md and use them to explore the data and answer complex questions on the spot.

When exploring data, the agent may:

  • Use the Bauplan CLI commands documented in .claude/bauplan_reference/bauplan_cli.md directly for queries, schema inspection, and table listing.
  • Invoke the explore-data skill for comprehensive, reproducible profiling.
  • Generate a structured summary with schemas, row counts, and observations.
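As an illustration of what a structured summary with schemas, row counts, and observations could look like, the sketch below profiles a handful of in-memory rows. The TableSummary class, the summarize function, and the sample rows are all hypothetical; this is not the output format of the explore-data skill, and a real run would profile rows fetched from the data lake.

```python
from dataclasses import dataclass, field


@dataclass
class TableSummary:
    table: str
    row_count: int
    schema: dict[str, str]
    observations: list[str] = field(default_factory=list)


def summarize(table: str, rows: list[dict]) -> TableSummary:
    """Infer column types, count rows, and flag columns that contain nulls."""
    schema: dict[str, str] = {}
    null_counts: dict[str, int] = {}
    for row in rows:
        for column, value in row.items():
            if value is None:
                null_counts[column] = null_counts.get(column, 0) + 1
                schema.setdefault(column, "unknown")
            elif schema.get(column, "unknown") == "unknown":
                # Record the type of the first non-null value seen.
                schema[column] = type(value).__name__
    observations = [
        f"{column}: {count} null value(s) in {len(rows)} row(s)"
        for column, count in sorted(null_counts.items())
    ]
    return TableSummary(table, len(rows), schema, observations)


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        {"pickup_datetime": "2023-01-01 00:00:00", "trip_miles": 1.2},
        {"pickup_datetime": None, "trip_miles": 3.4},
    ]
    print(summarize("taxi_fhvhv", sample))
```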

Data exploration is read-only. While carrying out these operations, your agent must not import or modify data.
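If you want an extra check on your side, one option is to screen queries before they run. The sketch below is a crude keyword heuristic with a hypothetical is_read_only helper; it is not a SQL parser and not something Bauplan or the agent is documented to do. It is deliberately conservative, so a harmless query that mentions a forbidden keyword inside a string literal will also be rejected.

```python
import re

# Statement keywords that indicate a write or schema change (assumed list).
FORBIDDEN = {"INSERT", "UPDATE", "DELETE", "MERGE", "DROP", "ALTER", "CREATE", "TRUNCATE"}


def is_read_only(sql: str) -> bool:
    """Return True if the statement looks like a plain SELECT (or WITH ... SELECT)."""
    # Split into word tokens; underscores stay attached, so 'created_at' is one token.
    tokens = re.findall(r"[A-Za-z_]+", sql.upper())
    if not tokens or tokens[0] not in {"SELECT", "WITH"}:
        return False
    return not any(token in FORBIDDEN for token in tokens)


if __name__ == "__main__":
    print(is_read_only("SELECT * FROM taxi_fhvhv LIMIT 5"))
    print(is_read_only("DROP TABLE taxi_fhvhv"))
```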