Explore Data
Start by understanding what data is already available. Your Bauplan sandbox comes with pre-loaded public datasets.
Explore the tables in the main branch of the data lake that contain data about the taxi rides in NYC.
The agent will load the explore-data skill and use it to create a dedicated folder named data-exploration containing one or more Python files that run the data analysis.
You can use your agent to ask specific questions like the following:
Can you show me the schema of taxi_fhvhv in the main branch and tell me what time range of data it covers?
Give me a preview of 5 rows from the taxi_fhvhv table in the main branch and tell me if there are anomalies in the table that I should be aware of
The agent can fetch the individual CLI commands described in the file .claude/bauplan_reference/bauplan_cli.md and use them to explore the data and answer complex questions on the spot.
When exploring data, the agent may:
- Use the Bauplan CLI in .claude/bauplan_reference/bauplan_cli.md directly for queries, schema inspection, and table listing.
- Invoke the explore-data skill for comprehensive and reproducible profiling.
- Generate a structured summary with schemas, row counts, and observations.
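To make the idea of a structured summary concrete, here is a minimal sketch in plain Python. It is not the explore-data skill's actual output or implementation, and it does not call Bauplan at all. It runs on a few hypothetical in-memory rows, with invented column names that are not the real taxi_fhvhv schema, and shows the kind of schema, row count, time range, and anomaly observations a profiling script might report.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows standing in for a small table preview.
# Column names and values are illustrative assumptions, not real data.
SAMPLE_ROWS = [
    {"pickup_datetime": datetime(2023, 1, 1, 8, 30), "trip_miles": 2.4, "base_fare": 11.5},
    {"pickup_datetime": datetime(2023, 1, 1, 9, 5), "trip_miles": 0.0, "base_fare": 7.0},
    {"pickup_datetime": datetime(2023, 1, 2, 17, 45), "trip_miles": 5.1, "base_fare": None},
    {"pickup_datetime": datetime(2023, 1, 3, 23, 10), "trip_miles": 1.2, "base_fare": -3.0},
]


def summarize(rows):
    """Build a read-only summary: inferred schema, row count, observations."""
    schema = {}
    nulls = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None:
                nulls[column] += 1
            else:
                # Record the type of the first non-null value seen per column.
                schema.setdefault(column, type(value).__name__)

    observations = [f"{col}: {n} null value(s)" for col, n in sorted(nulls.items())]
    for column, type_name in schema.items():
        if type_name in ("int", "float"):
            negatives = sum(
                1 for r in rows if r[column] is not None and r[column] < 0
            )
            if negatives:
                observations.append(f"{column}: {negatives} negative value(s)")

    return {"schema": schema, "row_count": len(rows), "observations": observations}


def time_range(rows, column):
    """Return (earliest, latest) for a datetime column, ignoring nulls."""
    values = [r[column] for r in rows if r[column] is not None]
    return (min(values), max(values)) if values else None


summary = summarize(SAMPLE_ROWS)
covered = time_range(SAMPLE_ROWS, "pickup_datetime")
```

The sketch only reads its input and never writes anything back, which matches the read-only nature of data exploration. In a real session the rows would come from a query against the main branch rather than a hard-coded list.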
Data exploration is read-only. While carrying out these operations, your agent must not import or modify data.