Data Engineer · Vancouver, BC

Building data systems
that hold up.

I design ETL pipelines and lakehouse architectures for organizations where data reliability is not optional. Focused on public sector and community-impact work.

Available for contracts & consulting

Completing DP-600 Fabric Analytics Engineer cert
Building lakehouse pipelines in Fabric & Databricks
Preparing Databricks Data Engineer Associate
Targeting BC public sector & government data roles

TransLink Transit Data Warehouse

End-to-end data warehouse built on real Vancouver GTFS transit data using a medallion architecture. Handles domain-specific challenges such as stop times beyond 24:00:00 (trips that run past midnight of the service day), with dimensional models for time-based analysis and data quality checks embedded in every layer.

Python · SQL · Medallion Architecture · GTFS · Data Quality
GitHub
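
A minimal sketch of how the beyond-24:00 handling in the transit warehouse could look; it is not the project's actual code. The function names are hypothetical, and the sketch assumes the service day starts at local midnight, which ignores daylight-saving transitions.

```python
from datetime import date, datetime, timedelta


def parse_gtfs_time(value: str) -> timedelta:
    """Parse a GTFS HH:MM:SS string into an offset from the start of the service day.

    Hours may exceed 23, because trips that continue past midnight keep
    counting up (for example 25:10:00) instead of wrapping to 01:10:00.
    """
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    if hours < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError(f"invalid GTFS time: {value!r}")
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def to_calendar_datetime(service_date: date, gtfs_time: str) -> datetime:
    """Resolve a GTFS time to a wall-clock datetime (assumes no DST shift)."""
    service_midnight = datetime.combine(service_date, datetime.min.time())
    return service_midnight + parse_gtfs_time(gtfs_time)
```

Keeping the value as an offset from the service day, rather than forcing it into a time-of-day type, lets a dimensional model sort and bucket late-night trips with the service day they belong to.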

Global Retail Lakehouse · Microsoft Fabric

Lakehouse platform for a multi-region retail scenario using Bronze → Silver → Gold medallion layers. Unified data models for customers, products, and sales enable consistent cross-region reporting.

Microsoft Fabric · PySpark · Delta Lake · Medallion
GitHub

Airflow + Spark + AWS Pipeline

Containerized ETL pipeline with orchestration, scheduling, retries, and alerting, built around production reliability patterns rather than happy-path flows alone. Focuses on how the pipeline behaves when tasks fail, not only on basic ingestion.

Apache Airflow · PySpark · AWS S3 · Docker
GitHub
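
The retry-and-alert pattern described for this pipeline can be illustrated in plain Python. This is a sketch of the general technique only: in the project itself the orchestrator would handle retries, and `run_with_retries`, its parameters, and the `on_failure` hook are hypothetical names chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    on_failure: Optional[Callable[[Exception], None]] = None,
) -> T:
    """Run task, retrying with exponential backoff; alert once if every attempt fails."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_attempts:
                # Final attempt failed: fire the alert hook, then re-raise.
                if on_failure is not None:
                    on_failure(exc)
                raise
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the backoff testable without real waiting, and the alert fires only after the last attempt so transient failures stay quiet.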

Healthcare FHIR Data Pipeline

Transforms FHIR-formatted healthcare JSON into structured analytical tables with a Streamlit dashboard for clinical insights. Relevant to health authority reporting and data normalization use cases.

Python · Pandas · Streamlit · FHIR · SQLite
GitHub
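
To show the kind of normalization involved, here is a small sketch that flattens FHIR Patient resources from a Bundle into rows and loads them into SQLite. It uses only the standard library; the `patient` table layout and the function names are assumptions made for this example, not the project's schema.

```python
import sqlite3
from typing import Any


def flatten_patient(resource: dict[str, Any]) -> dict[str, Any]:
    """Turn one FHIR Patient resource into a flat row (first listed name only)."""
    names = resource.get("name") or [{}]
    first = names[0]
    return {
        "patient_id": resource.get("id"),
        "family_name": first.get("family"),
        "given_name": " ".join(first.get("given", [])) or None,
        "gender": resource.get("gender"),
        "birth_date": resource.get("birthDate"),
    }


def patients_from_bundle(bundle: dict[str, Any]) -> list[dict[str, Any]]:
    """Pick the Patient resources out of a FHIR Bundle and flatten each one."""
    return [
        flatten_patient(entry["resource"])
        for entry in bundle.get("entry", [])
        if entry.get("resource", {}).get("resourceType") == "Patient"
    ]


def load_patients(conn: sqlite3.Connection, rows: list[dict[str, Any]]) -> None:
    """Upsert flattened rows into a hypothetical patient table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patient ("
        "patient_id TEXT PRIMARY KEY, family_name TEXT, given_name TEXT, "
        "gender TEXT, birth_date TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO patient VALUES "
        "(:patient_id, :family_name, :given_name, :gender, :birth_date)",
        rows,
    )
    conn.commit()
```

Missing fields become NULL rather than raising, since optional elements are common in FHIR data.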

Databricks End-to-End Pipeline

Medallion-based pipeline using Delta Lake and Unity Catalog. Designed for scalable processing and governed data access with transformations structured for clarity, reuse, and maintainability.

Databricks · Delta Lake · Unity Catalog · PySpark
GitHub

Customer Insights Pipeline

Integrates multiple customer datasets into unified reporting tables. Demonstrates multi-source join logic, transformation design, and analytics-ready output generation using Python and PostgreSQL.

Python · PostgreSQL · Apache Airflow
GitHub
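
A sketch of the multi-source join idea behind the reporting tables. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so the example runs anywhere; the table names (`crm_customers`, `orders`) and columns are hypothetical.

```python
import sqlite3

# A LEFT JOIN keeps customers with no orders in the report, and COALESCE
# turns their missing totals into 0 instead of NULL.
REPORT_SQL = """
SELECT c.customer_id,
       c.region,
       COUNT(o.order_id)                AS order_count,
       COALESCE(SUM(o.amount_cents), 0) AS total_spend_cents
FROM crm_customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.region
ORDER BY c.customer_id
"""


def build_customer_report(conn: sqlite3.Connection) -> list[tuple]:
    """Return one reporting row per customer, combining CRM and order data."""
    return conn.execute(REPORT_SQL).fetchall()
```

Storing amounts as integer cents avoids floating-point drift when totals are summed across sources.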

Microsoft Azure Data Fundamentals · DP-900

earned

Microsoft Fabric Analytics Engineer · DP-600

in progress

Databricks Data Engineer Associate

in progress