Examples

Browse our collection of self-contained examples demonstrating various use cases with pre-built pipelines.

LLM to Tabular

From unstructured to structured data with LLMs

Convert PDFs into structured, analyzable tables using LLMs.

OpenAI, PDF Processing, Unstructured to Structured

Playlist recommendations with MongoDB

Embedding-based recommender system for music playlists.

MongoDB, Vector Search, Recs

Iceberg Lakehouse Example

Iceberg Lakehouse Pipeline

Orchestrated write-audit-publish (WAP) pattern for ingesting Parquet files into Iceberg tables.

Prefect, Pandas, Iceberg

PDF analysis with bauplan and OpenAI

Analyze PDFs using bauplan for data preparation and OpenAI’s GPT for text analysis.

PDF Processing, OpenAI

ML Pipeline Example

ML Model Training and Deployment Pipeline

End-to-end ML pipeline for predicting taxi trip tips.

Scikit-Learn, Pandas, Notebooks, Streamlit

Entity Matching Example

Entity Matching with OpenAI

Product matching across e-commerce catalogs using LLMs.

OpenAI, Streamlit, Pandas, DuckDB

Data Quality Example

Data Quality and Expectations

Implement data quality checks using expectations.

PyArrow, Pandas, DuckDB

Real-time Analytics Example

Near Real-time Analytics

Build a near real-time analytics pipeline with the WAP pattern and metrics visualization.

Prefect, Streamlit, DuckDB

Data Dashboard Example

Interactive Data Dashboard

Build an interactive dashboard to visualize taxi pickup locations in NYC.

Streamlit, Pandas