Airbnb Market Analysis
2023·Sole engineer·shipped

A machine learning and analytics project on Airbnb listing data. Six ML models spanning regression, classification, and clustering were built on 30+ engineered features to predict pricing and occupancy, and the findings were surfaced in dynamic Tableau dashboards.

Problem

Airbnb listing data is rich but noisy. Price prediction requires understanding location, amenities, reviews, host attributes, and seasonality simultaneously. Without proper feature engineering and model selection, pricing guidance is unreliable.

Solution

A Python/SQL pipeline cleaned the raw listing data and engineered 30+ features from it. Six ML models (linear regression, random forest, gradient boosting, k-means clustering, logistic classification, and a neural baseline) were trained and compared. Snowflake stored the modeled dataset, and Tableau dashboards presented the findings interactively.

Architecture

Data prep

A SQL and Python pipeline cleaned the Airbnb data, handled missing values, encoded categorical features, and engineered 30+ derived features, such as price per amenity, distance to the city center, and review velocity.
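The three derived features named above could be computed roughly as follows. This is a minimal sketch, not the project's actual code: the field names (`price`, `amenities`, `first_review`, `number_of_reviews`), the `CITY_CENTER` coordinates, and the exact definitions (haversine distance, reviews per 30 days) are all assumptions made for illustration.

```python
import math
from datetime import date

# Hypothetical city-center coordinates, used only for illustration.
CITY_CENTER = (40.7128, -74.0060)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def engineer_features(listing, as_of):
    """Return three derived features for one listing dict (assumed schema)."""
    n_amenities = len(listing["amenities"])
    # Leave the ratio undefined when a listing reports no amenities.
    price_per_amenity = listing["price"] / n_amenities if n_amenities else None
    distance = haversine_km(
        listing["latitude"], listing["longitude"], *CITY_CENTER
    )
    # Assumed definition: reviews per 30 days since the first review.
    days_active = max((as_of - listing["first_review"]).days, 1)
    review_velocity = listing["number_of_reviews"] / days_active * 30
    return {
        "price_per_amenity": price_per_amenity,
        "distance_to_center_km": distance,
        "review_velocity": review_velocity,
    }
```

Returning `None` for listings without amenities keeps the missing-value decision explicit, so it can be handled in the same step as the other missing values rather than hidden behind a default.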

Modeling

6 models: linear regression, random forest, and gradient boosting for price/occupancy prediction; k-means for neighborhood clustering; logistic classification for high-demand labeling; and a neural baseline.
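The write-up does not say how the models were compared or which library was used. One common approach is k-fold cross-validated RMSE, sketched below in plain Python. The two models here (a mean baseline and one-feature least squares) are stand-ins for illustration, not the project's models, and all function names are hypothetical.

```python
import math


def mean_baseline(train_x, train_y):
    """Fit a model that always predicts the training mean."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y


def simple_linear(train_x, train_y):
    """Fit one-feature ordinary least squares: y = intercept + slope * x."""
    n = len(train_x)
    mean_x = sum(train_x) / n
    mean_y = sum(train_y) / n
    sxx = sum((x - mean_x) ** 2 for x in train_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def cv_rmse(fit, xs, ys, k=5):
    """Average RMSE over k folds; fold i holds out rows where index % k == i."""
    fold_scores = []
    for i in range(k):
        train_x = [x for j, x in enumerate(xs) if j % k != i]
        train_y = [y for j, y in enumerate(ys) if j % k != i]
        test = [(x, y) for j, (x, y) in enumerate(zip(xs, ys)) if j % k == i]
        predict = fit(train_x, train_y)
        mse = sum((predict(x) - y) ** 2 for x, y in test) / len(test)
        fold_scores.append(math.sqrt(mse))
    return sum(fold_scores) / k


def compare_models(models, xs, ys, k=5):
    """Return (name, cv_rmse) pairs sorted from lowest to highest error."""
    scores = [(name, cv_rmse(fit, xs, ys, k)) for name, fit in models.items()]
    return sorted(scores, key=lambda pair: pair[1])
```

Calling `compare_models({"baseline": mean_baseline, "linear": simple_linear}, xs, ys)` returns the candidates ranked by held-out error. Including a trivial baseline shows whether a more complex model is learning anything beyond the average price.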

Warehouse

Snowflake stored raw, cleaned, and feature-engineered datasets. SQL views fed the Tableau dashboards directly.
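The view-over-feature-table pattern can be illustrated locally. The sketch below uses Python's built-in `sqlite3` as a stand-in for Snowflake, so the SQL dialect and connection details differ from the real warehouse. The table name, view name, columns, and sample rows are all invented for the example.

```python
import sqlite3


def build_warehouse():
    """Create an in-memory database with a feature table and a reporting view."""
    conn = sqlite3.connect(":memory:")  # local stand-in, not Snowflake
    conn.executescript(
        """
        CREATE TABLE listing_features (
            listing_id INTEGER PRIMARY KEY,
            neighborhood TEXT,
            price REAL,
            predicted_price REAL
        );
        -- Made-up rows, for illustration only.
        INSERT INTO listing_features VALUES
            (1, 'Downtown', 200.0, 190.0),
            (2, 'Downtown', 100.0, 110.0),
            (3, 'Harbor', 80.0, 90.0);
        -- The view a dashboard would query instead of the base table.
        CREATE VIEW neighborhood_pricing AS
        SELECT neighborhood,
               COUNT(*) AS listings,
               AVG(price) AS avg_price,
               AVG(predicted_price) AS avg_predicted_price
        FROM listing_features
        GROUP BY neighborhood;
        """
    )
    return conn


def neighborhood_summary(conn):
    """Read the view the way a BI tool would: a plain SELECT."""
    query = "SELECT * FROM neighborhood_pricing ORDER BY neighborhood"
    return conn.execute(query).fetchall()
```

Pointing the dashboard at a view rather than the base table keeps the aggregation logic in the warehouse, so the dashboard query stays a simple `SELECT` even if the underlying feature table changes.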

Visualization

Dynamic Tableau dashboards: pricing heatmaps, occupancy forecasts by neighborhood, model performance comparisons, and feature importance charts.

Highlights

  • 6 ML models across regression, classification, and clustering tasks.
  • 30+ engineered features capturing location, amenities, reviews, and seasonality.
  • Snowflake warehouse with SQL views feeding Tableau directly.
  • Dynamic Tableau dashboards with pricing heatmaps and occupancy forecasts.
  • R statistical analysis for model validation and residual diagnostics.