Now
Selected work
TransLink Transit Data Warehouse
End-to-end data warehouse built on real Vancouver GTFS transit data using a medallion architecture. Handles domain-specific challenges such as stop times beyond 24:00 (trips that run past midnight of the service day), with dimensional models for time-based analysis and data quality checks embedded in every layer.
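To illustrate the beyond-24:00 challenge mentioned for the transit warehouse, here is a minimal sketch of one way to handle such values. It is not the project's actual code: the function names `parse_gtfs_time` and `to_calendar_datetime` are hypothetical, and the sketch assumes the service day starts at midnight, ignoring the daylight-saving nuance in the GTFS specification.

```python
from datetime import date, datetime, time, timedelta


def parse_gtfs_time(value: str) -> timedelta:
    """Parse a GTFS HH:MM:SS string into an offset from the start of the service day.

    Hours may be 24 or more, so the result is a timedelta rather than a
    datetime.time, which cannot represent values such as "25:30:00".
    """
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    if hours < 0 or not (0 <= minutes < 60) or not (0 <= seconds < 60):
        raise ValueError(f"invalid GTFS time: {value!r}")
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def to_calendar_datetime(service_date: date, gtfs_time: str) -> datetime:
    """Resolve a service date plus GTFS time to the calendar datetime it falls on.

    Assumption: the service day starts at local midnight (no DST handling).
    """
    start_of_day = datetime.combine(service_date, time.min)
    return start_of_day + parse_gtfs_time(gtfs_time)


# A stop time on the 2024-01-15 service day that falls after midnight.
late_stop = to_calendar_datetime(date(2024, 1, 15), "25:30:00")
```

Keeping the service date and the offset as separate values lets a time dimension group a late-night trip with the service day it belongs to, while still allowing the calendar timestamp to be derived when needed.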
Global Retail Lakehouse · Microsoft Fabric
Lakehouse platform for a multi-region retail scenario using Bronze → Silver → Gold medallion layers. Unified data models for customers, products, and sales enabling consistent cross-region reporting at scale.
Airflow + Spark + AWS Pipeline
Containerized ETL pipeline with orchestration, retries, scheduling, and alerting. Built around production reliability patterns, not just happy-path flows, to show operational maturity beyond basic ingestion.
Healthcare FHIR Data Pipeline
Transforms FHIR-formatted healthcare JSON into structured analytical tables with a Streamlit dashboard for clinical insights. Relevant to health authority reporting and data normalization use cases.
Databricks End-to-End Pipeline
Medallion-based pipeline using Delta Lake and Unity Catalog. Designed for scalable processing and governed data access, with transformations structured for clarity, reuse, and maintainability.
Customer Insights Pipeline
Integrates multiple customer datasets into unified reporting tables. Demonstrates multi-source join logic, transformation design, and analytics-ready output generation using Python and PostgreSQL.
Certifications
Microsoft Azure Data Fundamentals · DP-900
earned
Microsoft Fabric Analytics Engineer · DP-600
in progress
Databricks Data Engineer Associate
in progress
Connect