Real-Time Weather Data Pipeline

Category: Data Engineering
Technology: Python, SQL
Skillset: dbt

This end-to-end analytics project fetches live weather data from the Weatherstack API, orchestrates ingestion and transformation pipelines with Dockerized services, loads the data into PostgreSQL, transforms it with dbt, and visualizes the results in Tableau.

Project Structure

  • airflow/: Airflow DAGs and configs

  • dbt/my_project/: dbt models and project files

  • dbt/.dbt/profiles.yml: dbt profile for Postgres

  • postgres/: Postgres volume and initialization

  • api_request/insert_record.py: Python ETL script that fetches and loads weather data

  • docker-compose.yml: Service orchestration

External Services

Weatherstack API
  • Sign up at Weatherstack to get a free API key.

  • Use this key in your ingestion scripts to fetch current weather data.

Pipeline Steps

API Request & Data Ingestion (Python)
  • insert_record.py fetches live weather data from the Weatherstack API and inserts it into PostgreSQL using psycopg2 (a minimal sketch is shown below).

  • Run locally: python3 api_request/insert_record.py
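
A minimal sketch of what this step could look like, assuming the Postgres credentials listed in the Tableau section (database db, user user, password password, schema dev) and a placeholder dev.raw_weather_data table; the real insert_record.py may differ:

```python
# Minimal sketch of the ingestion step (not necessarily the real insert_record.py).
# The dev.raw_weather_data table, its columns, and the WEATHERSTACK_API_KEY
# environment variable are assumptions.
import os

import psycopg2
import requests

API_KEY = os.environ["WEATHERSTACK_API_KEY"]


def fetch_current_weather(city: str) -> dict:
    """Call the Weatherstack /current endpoint and return the parsed JSON payload."""
    resp = requests.get(
        "http://api.weatherstack.com/current",
        params={"access_key": API_KEY, "query": city},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()


def insert_record(payload: dict) -> None:
    """Insert one raw weather observation into PostgreSQL via psycopg2."""
    conn = psycopg2.connect(
        host="localhost",  # "db" when running inside the Docker network
        port=8000,         # host-mapped port; 5432 inside the network
        dbname="db",
        user="user",
        password="password",
    )
    try:
        with conn, conn.cursor() as cur:
            cur.execute(
                """
                INSERT INTO dev.raw_weather_data (city, temperature, wind_speed, inserted_at)
                VALUES (%s, %s, %s, NOW())
                """,
                (
                    payload["location"]["name"],
                    payload["current"]["temperature"],
                    payload["current"]["wind_speed"],
                ),
            )
    finally:
        conn.close()


def main(city: str = "London") -> None:
    insert_record(fetch_current_weather(city))


if __name__ == "__main__":
    main()
```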

Setting up PostgreSQL
  • Defined in docker-compose.yml under service db.

  • Accessible locally on localhost:8000 (mapped from container port 5432).

  • Persisted to ./postgres/data.

Ingest Data into Postgres
  • The ingestion script connects to PostgreSQL: host db on port 5432 inside the Docker network, or localhost:8000 from the host machine (see the sketch below).
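
Since the host and port differ between the two environments, one option (an assumption, not necessarily how this project does it) is to switch on an environment variable:

```python
# One way to make the same script work both inside and outside Docker; the
# RUNNING_IN_DOCKER environment variable is an assumption, not part of the project.
import os

import psycopg2


def get_connection():
    in_docker = os.environ.get("RUNNING_IN_DOCKER") == "1"
    return psycopg2.connect(
        host="db" if in_docker else "localhost",
        port=5432 if in_docker else 8000,
        dbname="db",
        user="user",
        password="password",
    )
```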

Setting up Airflow
Creating an Airflow DAG
  • DAGs are stored in airflow/dags/.

  • The example DAG orchestrates the following tasks (a sketch follows the list):

  1. API ingestion (PythonOperator)

  2. dbt transformations (DockerOperator)
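
A sketch of what such a DAG could look like; the DAG id, task ids, dbt image, container paths, and Docker network name below are placeholders, not the project's actual values:

```python
# Hypothetical airflow/dags/weather_pipeline.py.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.docker.operators.docker import DockerOperator

# Assumes the ingestion script is importable on the Airflow worker (e.g. via PYTHONPATH).
from api_request.insert_record import main as ingest_weather

with DAG(
    dag_id="weather_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",
    catchup=False,
) as dag:
    # Step 1: fetch the latest observation from Weatherstack and load it into Postgres.
    ingest = PythonOperator(
        task_id="ingest_weather_data",
        python_callable=ingest_weather,
    )

    # Step 2: run the dbt models inside the dbt container image.
    dbt_run = DockerOperator(
        task_id="dbt_transform",
        image="ghcr.io/dbt-labs/dbt-postgres:1.7.4",                      # assumed image/tag
        command="run --project-dir /usr/app --profiles-dir /root/.dbt",  # assumed paths
        docker_url="unix://var/run/docker.sock",
        network_mode="weather_network",                                   # assumed network name
    )

    ingest >> dbt_run
```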

Setting up dbt
  • Runs in dbt_container.

  • Project files in /usr/app (mapped from ./dbt/my_project).

  • Profiles mounted from ./dbt/.dbt.

Creating dbt Models for Analytics
  • Models in dbt/my_project/models/ generate analytics tables (e.g., stg_weather_data, daily_average, weather_report).

Creating DAG Dependencies
  • The Airflow DAG ensures tasks run in order: Ingest ➡ Transform ➡ Export (illustrated below).
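
The ordering itself uses Airflow's bitshift syntax; here is a self-contained illustration with placeholder EmptyOperator tasks standing in for the real operators above:

```python
# Minimal illustration of task ordering with Airflow's bitshift syntax; in the
# real DAG these placeholders are the PythonOperator/DockerOperator tasks above.
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

with DAG(dag_id="ordering_demo", start_date=datetime(2024, 1, 1), schedule=None) as dag:
    ingest = EmptyOperator(task_id="ingest")
    transform = EmptyOperator(task_id="transform")
    export = EmptyOperator(task_id="export")

    ingest >> transform >> export  # Ingest -> Transform -> Export
```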

Exporting Data to CSV
  • Use Python (pandas or the csv module) or the SQL COPY command to export data from PostgreSQL to a CSV file in a mounted volume for use in BI tools (example below).
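
One possible pandas-based export, assuming a dev.weather_report table and the connection details above; the output path is a placeholder for a directory mounted into the container:

```python
# One possible export step; the weather_report table, connection URL, and output
# path are placeholders.
import pandas as pd
from sqlalchemy import create_engine


def export_weather_csv(output_path: str = "exports/weather_report.csv") -> None:
    """Read the reporting table from Postgres and write it to a CSV for BI tools."""
    engine = create_engine("postgresql+psycopg2://user:password@localhost:8000/db")
    df = pd.read_sql("SELECT * FROM dev.weather_report", engine)
    df.to_csv(output_path, index=False)


if __name__ == "__main__":
    export_weather_csv()
```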

Visualization & Reporting

Importing Data in Tableau

  • Option 1: Connect directly to PostgreSQL:
      • Host: localhost
      • Port: 8000
      • Database: db
      • User: user
      • Password: password
      • Schema: dev

  • Option 2: Import from the CSV export.

Visualizing the Data
  • Create dashboards for daily averages, trends, and weather analytics grouped by date and city (a quick pandas sanity check is sketched below).
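
As a quick sanity check before building dashboards, the same daily averages can be computed with pandas (table and column names follow the placeholders used earlier in this README):

```python
# Quick check of daily average temperature per city outside Tableau; the
# dev.raw_weather_data table and its columns are assumptions.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:password@localhost:8000/db")
df = pd.read_sql(
    "SELECT city, temperature, inserted_at FROM dev.raw_weather_data",
    engine,
)

daily_avg = (
    df.assign(day=df["inserted_at"].dt.date)
      .groupby(["city", "day"], as_index=False)["temperature"]
      .mean()
)
print(daily_avg.head())
```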