Examples

Serverless Data Product

Serverless data product with built-in quality checks using Lambda and Bauplan.

dataprod, lambda
RAG system with Pinecone

Build a RAG system with Pinecone and OpenAI over StackOverflow data.

pinecone, openAI
Medallion Architecture + WAP Pattern

End-to-end data engineering repo using Mage & the medallion architecture.

medallion, mage, polars
From unstructured to structured data with LLMs

Convert PDFs into structured, analyzable tables using LLMs.

openAI, PDF processing, unstructured to structured
Playlist recommendations with MongoDB

Embedding-based recommender system for music playlists.

mongoDB, vector search, recs
Iceberg Lakehouse Pipeline

Orchestrated WAP pattern for ingesting Parquet files into Iceberg tables.

prefect, pandas, iceberg
PDF analysis with bauplan and OpenAI

Analyze PDFs using Bauplan for data preparation and OpenAI's GPT for text analysis.

PDF processing, openAI
ML Model Training and Deployment Pipeline

End-to-end ML pipeline for predicting taxi trip tips.

scikit-learn, pandas, notebooks, streamlit
Entity Matching with OpenAI

Product matching across e-commerce catalogs using LLMs.

openAI, streamlit, pandas, duckDB
Data Quality and Expectations

Implement data quality checks using expectations.

pyArrow, pandas, duckDB
Near Real-time Analytics

Build a near real-time analytics pipeline with the WAP pattern and metrics visualization.

prefect, streamlit, duckDB
Interactive Data Dashboard

Build an interactive dashboard to visualize taxi pickup locations in NYC.

streamlit, pandas