Writing
Notes on data engineering, AI systems, and building things that work in production.
When Streaming Is Overkill (And When It Isn't)
Kafka and Flink are powerful, but most data problems don't need them. A practical framework for deciding when streaming infrastructure is actually justified.
From PDF to Dashboard: Building an AI Financial Pipeline
Payment data locked in unstructured PDFs, a finance team spending 10 hours a week on manual extraction. Here's the architecture that replaced it — Claude API, Airflow, dbt, and BigQuery.
dbt Dimensional Modeling for Streaming Pipelines
Dimensional modeling principles don't change just because your data comes from Kafka. How I structured the dbt layer in AthleteOS to stay clean as the schema evolved.
Self-Hosting n8n on Oracle Cloud: The Full Setup
A complete walkthrough of running n8n on Oracle Free Tier — Docker Compose, DuckDNS domain, Redis deduplication, and why this beats every SaaS automation tool I've tried.
Building a RAG System Over Structured Biometric Data
How I built the coaching layer in AthleteOS — embedding historical training sessions in pgvector and wiring LLM APIs to answer natural language queries over structured Snowflake data.
Test Post
A short test post to verify the CMS → site pipeline.