Video Lecture Summarizer
2023·Sole engineer·shipped

A Python/Flask web application that takes recorded Google Meet lecture videos, transcribes the audio with the SpeechRecognition library, and generates concise bullet-point summaries using LexRank extractive summarization, so students can review long lectures efficiently.

Problem

Online lectures often run 60–90 minutes with no structured notes. Students either re-watch the full recording or miss key content. No tool existed that could automatically summarize lecture recordings into digestible takeaways.

Solution

A Flask web interface accepts video uploads. PyDub extracts and preprocesses the audio, SpeechRecognition transcribes the speech to text, and NLTK tokenizes and cleans the transcript. LexRank extractive summarization then identifies the most important sentences, which are output as a structured summary.

Architecture

Ingestion

The Flask web UI accepts Google Meet recording uploads (MP4). PyDub extracts the audio track and splits it into manageable chunks for processing.
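
The chunking step could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: the 60-second chunk length, the mono 16 kHz conversion, the output file names, and the helper names chunk_bounds and extract_chunks are all hypothetical. PyDub needs ffmpeg available to decode MP4.

```python
from pathlib import Path


def chunk_bounds(total_ms, chunk_ms=60_000):
    """Return (start, end) millisecond offsets that cover [0, total_ms)."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return [(start, min(start + chunk_ms, total_ms))
            for start in range(0, total_ms, chunk_ms)]


def extract_chunks(video_path, out_dir, chunk_ms=60_000):
    """Extract the audio track of an MP4 and write it out as WAV chunks."""
    # Imported lazily so chunk_bounds stays usable without pydub/ffmpeg.
    from pydub import AudioSegment

    audio = AudioSegment.from_file(video_path, format="mp4")
    # Assumption: mono, 16 kHz audio is a reasonable input for speech recognition.
    audio = audio.set_channels(1).set_frame_rate(16_000)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    # len(audio) is the duration in milliseconds; slicing is also in milliseconds.
    for i, (start, end) in enumerate(chunk_bounds(len(audio), chunk_ms)):
        chunk_path = out_dir / f"chunk_{i:04d}.wav"
        audio[start:end].export(str(chunk_path), format="wav")
        paths.append(chunk_path)
    return paths
```

Keeping the boundary arithmetic in its own function makes the last, shorter chunk easy to reason about without decoding any audio.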

Transcription

The SpeechRecognition library processes the audio chunks and produces a raw text transcript. Background noise is handled by tuning the recognizer's energy threshold.
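
A minimal sketch of per-chunk transcription follows. Several things here are assumptions: the page does not say which recognition backend is used, so recognize_google (the free Google Web Speech API, which needs network access) is only a stand-in, and the threshold value of 300 and the helper names are illustrative. How the project applies the energy threshold is not described, so the sketch only shows where that setting lives and how chunk results could be stitched together.

```python
def make_recognizer(energy_threshold=300, dynamic=False):
    """Build a recognizer with an explicit energy threshold (illustrative value)."""
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    # Audio below this energy level is treated as silence when the library
    # segments speech (for example in Recognizer.listen).
    recognizer.energy_threshold = energy_threshold
    recognizer.dynamic_energy_threshold = dynamic
    return recognizer


def transcribe_chunk(wav_path, recognizer):
    """Transcribe one WAV chunk; return '' when no speech is recognized."""
    import speech_recognition as sr

    with sr.AudioFile(str(wav_path)) as source:
        audio = recognizer.record(source)  # read the whole chunk
    try:
        # Assumption: Google Web Speech API backend.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""  # unintelligible chunk; sr.RequestError is left to propagate


def transcribe_all(chunk_paths, transcribe):
    """Join per-chunk transcripts in order, skipping empty results."""
    parts = (transcribe(path).strip() for path in chunk_paths)
    return " ".join(part for part in parts if part)
```

A caller might wire these together as transcribe_all(paths, lambda p: transcribe_chunk(p, recognizer)). Passing the transcriber in as a function also lets the stitching logic be exercised without audio files or network access.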

NLP

An NLTK pipeline tokenizes the transcript, removes stopwords, and cleans the text. LexRank, a graph-based extractive summarization method, then ranks sentences by their centrality.
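
To make the ranking step concrete, here is a small pure-Python sketch of thresholded LexRank. It is not the project's implementation: a regex splitter and a tiny stopword set stand in for the NLTK steps described above so the example runs without downloads, and the similarity threshold (0.1), damping factor (0.85, in the PageRank convention), and iteration count are illustrative choices.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (stand-in for NLTK's stopword corpus).
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}


def split_sentences(text):
    """Naive sentence splitter (stand-in for an NLTK sentence tokenizer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def tfidf_vectors(token_lists):
    """One sparse {term: weight} vector per sentence, with smoothed IDF."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return [{t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
             for t, c in Counter(tokens).items()}
            for tokens in token_lists]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def lexrank_scores(sentences, threshold=0.1, damping=0.85, iterations=50):
    """Score sentences by centrality in a thresholded similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    vectors = tfidf_vectors([tokenize(s) for s in sentences])
    # Link sentences whose similarity exceeds the threshold; the self-link
    # keeps every row non-empty.
    adj = [[1.0 if i == j or cosine(vectors[i], vectors[j]) > threshold else 0.0
            for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in adj]
    scores = [1.0 / n] * n
    for _ in range(iterations):  # power iteration on the damped random walk
        scores = [(1 - damping) / n
                  + damping * sum(scores[i] * adj[i][j] / degree[i] for i in range(n))
                  for j in range(n)]
    return scores


def summarize(text, k=3):
    """Return the k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = lexrank_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

A sentence that shares vocabulary with many others collects score from all of them, which is why central sentences rise to the top. Returning the picks in transcript order keeps the summary readable.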

Output

The top-ranked sentences are formatted into a structured bullet-point summary, which is displayed in the Flask UI and can optionally be downloaded as a text file.
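
The formatting and export step might be sketched as below. The title text, the route path, the download file name, and the get_summary callback are hypothetical names introduced for illustration; the page does not describe the actual routes or storage.

```python
def format_summary(sentences, title="Lecture summary"):
    """Render ranked sentences as a plain-text bullet list."""
    bullets = [f"• {s.strip()}" for s in sentences if s.strip()]
    return "\n".join([title, ""] + bullets) + "\n"


def create_app(get_summary):
    """Build a Flask app that serves a summary as a downloadable text file.

    get_summary is a hypothetical callback mapping a job id to a list of
    summary sentences.
    """
    from flask import Flask, Response  # imported lazily; only needed for the web app

    app = Flask(__name__)

    @app.route("/summary/<job_id>/download")
    def download(job_id):
        text = format_summary(get_summary(job_id))
        return Response(
            text,
            mimetype="text/plain",
            headers={"Content-Disposition": "attachment; filename=summary.txt"},
        )

    return app
```

Keeping format_summary free of Flask means the same text can feed both the on-page display and the file download.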

Highlights

  • End-to-end pipeline: video upload → audio extraction → transcription → summarization.
  • PyDub audio preprocessing with chunk-based processing for long recordings.
  • LexRank extractive summarization: graph-based sentence ranking to surface key content.
  • Flask web interface with summary display and text export.