YouTube Comment Analysis with LLMs
This is a small personal project I built to analyze YouTube comments using Python, Natural Language Processing (NLP), and LLM-based summarization. It automatically gathers insights from the comment section of a YouTube video: sentiment analysis, topic clustering, and an LLM-generated summary of what people are saying.

Why Analyze YouTube Comments?
YouTube comments often contain a wealth of information—personal anecdotes, heartfelt reactions, detailed critiques, and sometimes heated discussions. If you’re a content creator or simply someone who wants to understand online communities better, these comment sections can be an incredible resource.
However, sifting through potentially hundreds or thousands of comments manually can be a chore. By automating the process, you can quickly discover the sentiment of a community, the common threads that tie people’s opinions together, and the standout comments that shape the overall conversation.
Give this a try if you’re …
- a content creator
- a market researcher
- curious about trends on viral videos.
How to Run
This tool is easy to set up. The youtube_comment_analysis GitHub repository explains how to install the dependencies and obtain the necessary API keys; after that, a single command generates the results.
Single Video Analysis
To analyze a single video, e.g., this video by Mark Rober titled “You’ve Never Seen A Wheelchair Like This”, run the following command:
python execution/run_comment_analysis.py --video_id="QpwJEYGCngI"
Note that the video ID is the part of the URL after watch?v= and before any & that might be contained in the URL.
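If you would rather not pick the ID out by hand, the rule above can be sketched in a few lines of Python. This is only an illustration: the helper name extract_video_id is made up for this example and is not part of the project, and it assumes a standard watch?v= style URL.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url: str) -> str:
    """Return the value of the `v` query parameter of a YouTube watch URL.

    Hypothetical helper for illustration; not part of the project.
    """
    # parse_qs splits the query string at every "&", so anything after
    # the video ID (timestamps, playlist parameters, ...) is ignored.
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    if not values:
        raise ValueError(f"No video ID found in URL: {url}")
    return values[0]


print(extract_video_id("https://www.youtube.com/watch?v=QpwJEYGCngI&t=42s"))
# prints: QpwJEYGCngI
```

The resulting string is what you would pass to --video_id.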
The script fetches all comments and replies, analyzes their sentiment, clusters the comments by topic, and produces an LLM-generated summary of the analysis.
Here’s the analysis summary for the Mark Rober video:
The sentiment analysis of the comments reveals that 78.8% of comments are positive, 20.3% are negative, and 0.8% are neutral. The majority of comments are overwhelmingly positive, with many expressing heartfelt appreciation, energetic enthusiasm, and emotional reactions to the heartwarming video.
The statements extracted from the comments show that most commenters agree that the wheelchair is amazing and impressive, Mark Rober is a kind and awesome person, the video is heartwarming and inspiring, and the kid and his parents are awesome and great. There are also comments praising Cash’s independence and inspiration, as well as the creativity and innovation in the video.
The clustering of topics reveals 11 distinct clusters, with the largest clusters being “Admiration and Appreciation” (19% of comments) and “Energetic Enthusiasm” (18% of comments). Other notable clusters include “Heartfelt Appreciation” (11% of comments), “Community Appreciation and Uplift” (14% of comments), and “Praise for Cash’s Father and Family” (5% of comments).
Overall, the comments are overwhelmingly positive, with many expressing admiration and appreciation for the video, Mark Rober, and the kid and his family.
Batch Mode (CSV File)
You can also analyze multiple videos in batch mode:
python execution/run_csv_table.py --csv_path your_csv_file.csv
Your CSV file needs to have a field URL containing the YouTube URLs (not video IDs) as the first column.
Since the analysis results are written back to the CSV file every time a video finishes, you can resume aborted runs just by restarting.
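To make the resume behavior more concrete, here is a minimal sketch of the general pattern, not the project's actual code. It assumes a result column called summary and a placeholder analyze_video function, both invented for this example; only the URL field comes from the description above.

```python
import csv
from pathlib import Path


def analyze_video(url: str) -> str:
    """Hypothetical stand-in for the real per-video analysis."""
    return f"summary for {url}"


def run_batch(csv_path: str, analyze=analyze_video) -> int:
    """Analyze every row that has no result yet; return how many were processed."""
    path = Path(csv_path)
    with path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    # "summary" is an assumed column name for the stored result.
    if "summary" not in fieldnames:
        fieldnames.append("summary")

    processed = 0
    for row in rows:
        if row.get("summary"):
            continue  # finished in an earlier run, so skip it
        row["summary"] = analyze(row["URL"])
        processed += 1
        # Rewrite the whole file after each video so that an aborted
        # run keeps everything completed up to that point.
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
    return processed
```

Because finished rows already carry a result, calling run_batch again on the same file only touches the videos that are still missing one.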
Closing Words
The most important aspect: have fun discovering new insights! Whether you’re analyzing feedback on your own content or you’re curious about how people react to major viral videos, this project can reveal trends and patterns you might miss at first glance.
Happy analyzing!