Machine Learning · Classification

Weather Predictor using Logistic Regression & SVM

A binary classification project that predicts whether it will rain the next day from historical weather data, comparing Logistic Regression and Support Vector Machines.

Overview

This project seeks to answer an age-old question: will it rain tomorrow? Using real weather observations such as temperature, humidity, and rainfall, the notebook walks through preprocessing, training, evaluation, and a comparison between Logistic Regression and SVM.

The goal is to understand the tradeoffs between a simple, interpretable linear model and a more flexible, more complex SVM classifier.

Data & Methods

Dataset

  • Historical daily weather observations (temperature, humidity, rainfall, etc.).
  • Target label indicating whether it rained the following day.

Preprocessing

  • Handled missing values and inconsistent entries.
  • Removed features with mostly absent values.
  • Standardized numeric features using StandardScaler and encoded categorical features as dummy variables.
  • Split into train and test sets.
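
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (Temp, Humidity, Rainfall, Evaporation, WindDir, RainTomorrow), the 50% missing-value cutoff, median imputation, and the 80/20 split are all assumptions chosen for the example, and the data is synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def preprocess(df, target="RainTomorrow", max_missing=0.5, test_size=0.2, seed=0):
    """Clean, encode, split, and scale a weather table (illustrative only)."""
    # Rows without a label cannot be used for supervised training.
    df = df.dropna(subset=[target])

    # Drop features that are mostly absent (the cutoff is an assumption).
    mostly_present = df.columns[df.isna().mean() <= max_missing]
    df = df[mostly_present]

    y = df[target].astype(int)
    X = df.drop(columns=[target])
    numeric_cols = list(X.select_dtypes(include="number").columns)

    # Turn categorical columns into 0/1 dummy variables.
    X = pd.get_dummies(X, drop_first=True, dtype=float)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    X_train, X_test = X_train.copy(), X_test.copy()

    # Fill remaining numeric gaps with medians learned from the training split.
    medians = X_train[numeric_cols].median()
    X_train[numeric_cols] = X_train[numeric_cols].fillna(medians)
    X_test[numeric_cols] = X_test[numeric_cols].fillna(medians)

    # Fit the scaler on the training split only, then apply it to both splits.
    scaler = StandardScaler()
    X_train[numeric_cols] = scaler.fit_transform(X_train[numeric_cols])
    X_test[numeric_cols] = scaler.transform(X_test[numeric_cols])
    return X_train, X_test, y_train, y_test


# Synthetic stand-in for the real weather table (hypothetical columns).
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "Temp": rng.normal(20, 5, n),
    "Humidity": rng.uniform(20, 100, n),
    "Rainfall": rng.exponential(2.0, n),
    "Evaporation": np.where(rng.random(n) < 0.9, np.nan, 5.0),  # ~90% missing
    "WindDir": rng.choice(["N", "E", "S", "W"], n),
    "RainTomorrow": rng.integers(0, 2, n),
})
demo.loc[::10, "Humidity"] = np.nan  # a few scattered gaps

X_train, X_test, y_train, y_test = preprocess(demo)
```

Fitting the imputation medians and the scaler on the training split alone keeps information from the test set out of the training features, which is one common way to avoid leakage.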

Models

  • Logistic Regression as a baseline linear classifier.
  • Support Vector Machine (SVM) classifier with various kernels.
  • Evaluated accuracy for each classifier and multiple SVM kernels.
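
A comparison of this kind might look like the sketch below. It assumes default hyperparameters, four common scikit-learn kernels, and synthetic data from make_classification in place of the weather table; the helper name compare_classifiers is hypothetical, and the kernels actually tried in the notebook may differ.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(X_train, X_test, y_train, y_test,
                        kernels=("linear", "poly", "rbf", "sigmoid")):
    """Return test-set accuracy for logistic regression and one SVM per kernel."""
    models = {"logistic_regression": LogisticRegression(max_iter=1000)}
    for kernel in kernels:
        models[f"svm_{kernel}"] = SVC(kernel=kernel)

    scores = {}
    for name, model in models.items():
        # Scaling inside the pipeline means it is fitted on training data only.
        clf = make_pipeline(StandardScaler(), model)
        clf.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, clf.predict(X_test))
    return scores


# Synthetic binary data standing in for the weather features (assumption).
X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0, stratify=y)

scores = compare_classifiers(X_tr, X_te, y_tr, y_te)
for name, acc in sorted(scores.items()):
    print(f"{name}: {acc:.3f}")
```

Accuracy alone can be misleading if rainy days are much rarer than dry ones, so it may be worth checking the class balance before reading too much into these numbers.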

Tech Stack

Python, scikit-learn, pandas, NumPy, matplotlib, Jupyter

Metrics

Support Vector Machine Classifier:


Logistic Regression Classifier:

Key Charts

SVM RBF kernel ROC Curve

The ROC curve gives a general indication of how well the classifier separates rainy from dry days across decision thresholds. Because an RBF-kernel SVM draws its decision boundary in a high-dimensional feature space, the boundary itself is difficult to visualize directly.
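
For reference, an ROC curve for an RBF-kernel SVM can be computed along these lines. This is a sketch under assumptions: the data is synthetic, the helper names svm_roc and plot_roc are hypothetical, and the SVM's decision_function values are used as ranking scores, which avoids needing probability estimates.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def svm_roc(X_train, X_test, y_train, y_test):
    """Fit an RBF-kernel SVM and return (fpr, tpr, auc) on the test split."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X_train, y_train)
    # Signed distance to the decision boundary; larger means "more likely rain".
    margin = clf.decision_function(X_test)
    fpr, tpr, _ = roc_curve(y_test, margin)
    return fpr, tpr, roc_auc_score(y_test, margin)


def plot_roc(fpr, tpr, auc):
    """Draw the ROC curve against the chance diagonal (requires matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(fpr, tpr, label=f"SVM RBF (AUC = {auc:.2f})")
    ax.plot([0, 1], [0, 1], linestyle="--", label="Chance")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.legend()
    return fig


# Synthetic binary data standing in for the weather features (assumption).
X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0, stratify=y)

fpr, tpr, auc = svm_roc(X_tr, X_te, y_tr, y_te)
```

Calling plot_roc(fpr, tpr, auc) in a notebook would render the curve; an AUC near 0.5 corresponds to chance-level ranking, while values closer to 1.0 indicate better separation.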

Analysis

Challenges & Learnings

Project Links