Explore Data
Start by understanding what data is already available. Your Bauplan sandbox comes with pre-loaded public datasets.
Explore the tables in the main branch of the data lake that contain data about the taxi rides in NYC.
The agent will load the explore-data skill and use it to create a dedicated folder named data-exploration containing one or more Python files that run the data analysis. Expect this to take some time.
You can use your agent to ask specific questions like the following:
Can you show me the schema of taxi_fhvhv in the main branch and tell me what time range of data it covers?
Give me a preview of 5 rows from the taxi_fhvhv table in the main branch and tell me if there are anomalies in the table that I should be aware of.
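The two questions above come down to a min/max over a timestamp column plus a few sanity checks on the previewed rows. Here is a minimal sketch of that logic in plain Python, assuming the rows have already been fetched as dictionaries. The column names (`pickup_datetime`, `dropoff_datetime`, `trip_miles`) and the sample values are illustrative assumptions, not a confirmed schema of taxi_fhvhv, and the helpers `time_range` and `find_anomalies` are hypothetical.

```python
from datetime import datetime

# Hypothetical preview rows; real rows would come from the agent's query.
preview = [
    {"pickup_datetime": datetime(2023, 1, 1, 0, 5),
     "dropoff_datetime": datetime(2023, 1, 1, 0, 25), "trip_miles": 3.2},
    {"pickup_datetime": datetime(2023, 1, 2, 9, 0),
     "dropoff_datetime": datetime(2023, 1, 2, 8, 50), "trip_miles": 1.1},
    {"pickup_datetime": datetime(2023, 1, 3, 18, 30),
     "dropoff_datetime": datetime(2023, 1, 3, 19, 0), "trip_miles": -4.0},
]

def time_range(rows, column):
    """Return (earliest, latest) for a column, ignoring missing values."""
    values = [r[column] for r in rows if r.get(column) is not None]
    return (min(values), max(values)) if values else None

def find_anomalies(rows):
    """Flag rows whose timestamps or distances look implausible."""
    issues = []
    for i, r in enumerate(rows):
        if r["dropoff_datetime"] < r["pickup_datetime"]:
            issues.append((i, "dropoff before pickup"))
        if r["trip_miles"] is None or r["trip_miles"] < 0:
            issues.append((i, "missing or negative trip_miles"))
    return issues

start, end = time_range(preview, "pickup_datetime")
anomalies = find_anomalies(preview)
print(start, end)
print(anomalies)
```

A 5-row preview can only hint at anomalies; a time range computed this way is only meaningful when it runs over the full column rather than a sample.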
The agent can fetch the individual CLI commands described in the file .claude/bauplan_reference/bauplan_cli.md and use them to explore the data and answer complex questions on the spot.
When exploring data, the agent may:
- Use the Bauplan CLI commands described in .claude/bauplan_reference/bauplan_cli.md directly for queries, schema inspection, and table listing.
- Invoke the explore-data skill for comprehensive and reproducible profiling.
- Generate a structured summary with schemas, row counts, and observations.
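To give a sense of what such a structured summary might contain, here is a minimal sketch in plain Python. It is not the output format of the explore-data skill, which is not specified here; the `summarize` helper, the sample rows, and the column names are all hypothetical.

```python
import json

def summarize(table_name, rows):
    """Build a small profiling summary from rows given as dictionaries."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            col = columns.setdefault(name, {"types": set(), "nulls": 0})
            if value is None:
                col["nulls"] += 1
            else:
                col["types"].add(type(value).__name__)
    # Record one observation per column that contains null values.
    observations = [
        f"{name}: {col['nulls']} null value(s)"
        for name, col in columns.items() if col["nulls"]
    ]
    return {
        "table": table_name,
        "row_count": len(rows),  # counts the rows passed in, not the full table
        "schema": {name: sorted(col["types"]) for name, col in columns.items()},
        "observations": observations,
    }

# Hypothetical sample rows standing in for query results.
sample = [
    {"hvfhs_license_num": "HV0003", "trip_miles": 2.5},
    {"hvfhs_license_num": "HV0005", "trip_miles": None},
]
summary = summarize("taxi_fhvhv", sample)
print(json.dumps(summary, indent=2))
```

Keeping the summary as a plain dictionary makes it easy to serialize to JSON and compare between runs, which supports the reproducibility goal mentioned above.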
Data exploration is read-only. While carrying out these operations, your agent must not import or modify data.
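If you want an extra check on your side, one option is to screen SQL text before it is run. The sketch below is a hypothetical helper, not part of Bauplan, and it does not describe how Bauplan itself enforces read-only access. It is a deliberately conservative keyword filter rather than a SQL parser.

```python
import re

# Statement openers treated as read-only in this sketch (an assumption).
READ_ONLY_PREFIXES = ("select", "with", "describe", "show", "explain")

# Keywords that indicate a write or schema change.
WRITE_KEYWORDS = re.compile(
    r"\b(insert|update|delete|merge|drop|create|alter|truncate)\b",
    re.IGNORECASE,
)

def is_read_only(sql):
    """Return True only if the statement looks like a single read-only query."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False  # reject multi-statement input
    if not statement.lower().startswith(READ_ONLY_PREFIXES):
        return False
    return WRITE_KEYWORDS.search(statement) is None

print(is_read_only("SELECT * FROM taxi_fhvhv LIMIT 5"))
print(is_read_only("DROP TABLE taxi_fhvhv"))
```

Because the filter matches keywords anywhere in the text, it can reject a harmless query that merely contains a word such as `delete` inside a string literal. For a safety check, erring on the side of rejection is usually the better trade-off.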