I used state-of-the-art language AI to analyze and index Joe Rogan’s podcast. Doing so resulted in a publicly available full-stack AI package. This blog demonstrates how I’ve made it happen. Stick around for a demo and amusing insights I distilled from the pods!
This summer, I had another bout of information FOMO: I’d never heard a Joe Rogan podcast.
I reached this “horrifying” conclusion as I met my techie friends at a pizza- and beer-fueled after-work. Part of the group were COVID transplants from the US. They were talking about some guy named Joe Rogan, who was either the most extraordinary person alive or a blight on the world, depending on who was talking. After half an hour of “aha”, “Oh, yes, interesting” and “Oh, no, 🤯”… I finally came clean: “I have no idea who Joe Rogan is”. 😬
Back home, I felt overwhelmed by FOMO. It sounded like Joe Rogan was the king of debate, often exploring controversial topics with his guests. I like to challenge myself by looking at life through the lens of other people, so I sat down to queue up a bunch of episodes. But when I looked at the queue, my brain exploded 🤯: 5,000+ hours of aliens, weed, and MMA.
I can deal with aliens and weed, but 5,000+ hours? That’s more than 200 days of listening back-to-back. Even if I consumed at 2x speed like some of my “entrepreneur” friends, it would take me almost a whole year if I stretched myself thin and listened 8 hours a day. And that’s not counting the 15 NEW hours of content each week. There was only one conclusion: I would never catch up. 😭
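For the skeptics, here is that back-of-the-envelope math as a few lines of Python. The inputs are the figures quoted above, with 5,000 hours taken as a lower bound. Treating the release rate as a constant 15 hours per week is a simplifying assumption.

```python
# Back-of-the-envelope check of the listening math.
BACKLOG_HOURS = 5_000          # existing content (lower bound of "5,000+")
PLAYBACK_SPEED = 2.0           # listening at 2x
LISTENING_HOURS_PER_DAY = 8    # stretching myself thin
NEW_HOURS_PER_WEEK = 15        # assumed constant release rate

# Listening around the clock at normal speed.
days_back_to_back = BACKLOG_HOURS / 24

# 8 hours a day at 2x, ignoring new episodes.
days_backlog_only = BACKLOG_HOURS / PLAYBACK_SPEED / LISTENING_HOURS_PER_DAY

# Each day clears 8 * 2 = 16 content hours while 15 / 7 new hours arrive.
net_hours_cleared_per_day = (
    LISTENING_HOURS_PER_DAY * PLAYBACK_SPEED - NEW_HOURS_PER_WEEK / 7
)
days_with_new_content = BACKLOG_HOURS / net_hours_cleared_per_day

print(f"Back-to-back at 1x:       {days_back_to_back:.1f} days")
print(f"8 h/day at 2x:            {days_backlog_only:.1f} days")
print(f"...plus weekly releases:  {days_with_new_content:.1f} days")
```

Once the weekly releases are factored in, catching up costs roughly a full year of 8-hour listening days.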
💡 Then I realized, “Enias, you work at an NLP startup. Just have the computer tell you what this guy thinks”.
AI to the rescue
What I needed was a solution that summarizes podcasts for me.
But of course, that wasn’t enough. I wanted a tool that responded to the random questions that popped into my brain. For example, this morning, I wondered: “What do Joe Rogan’s guests say about cocaine?” Spoiler alert: Joe Rogan and his guests like to talk about powdering their noses.
Luckily, AI is catching up. Speech-to-text turns podcasts into text. Language models summarize long-form content in seconds (and a hefty amount of 💰). It sounds like we have a solution. Let’s start building, shall we?
Disclaimer: I’m a founding engineer at Steamship. We’re building the fastest way to add language AI to products. To the NLP-savvy engineer, Steamship works like an NLP operating system that makes it easy to process and query natural language with a few lines of Python.
So logically, I ate my own dog food and built the podcast app using Steamship. It’s bundled in a package you can import and use on other podcasts and audio files. Here’s how I did it:
1. Transcribe the podcasts
I used two transcription providers, AWS and Assembly.AI, to turn audio into text. After some basic testing, I found Assembly.AI to be better than AWS, especially when figuring out who was speaking using speaker diarization.
2. Analyze the transcriptions
I used a mix of models from HuggingFace, Assembly.AI, and OneAI to analyze the transcriptions. No single model wins across the board at generating insights, and I found that poor transcription quality and long podcasts both degrade performance.
3. Orchestrate the NLP models
After plowing away in a Jupyter notebook for a day, I knew I couldn’t make it happen from just my laptop. Analyzing a single podcast takes 70 minutes: 40 minutes to transcribe the audio and another 30 minutes to generate summaries and highlight language AI features. Combine the high latency with an unstable internet connection as a digital nomad, and you get chaos.
This is where Steamship’s packages come in. You can use them to bundle long, diverse workloads that run in the cloud and save the results for querying later.
Inside my “audio-analytics” package, I added an endpoint to download, transcribe, and analyze podcasts:
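As a rough illustration of that flow, here is a minimal plain-Python sketch. It is not the actual package code and does not use the Steamship API: `AudioAnalyticsSketch`, `PodcastAnalysis`, and `analyze_url` are hypothetical names, and the download, transcription, and analysis steps are passed in as plain functions so the example runs offline with stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class PodcastAnalysis:
    """Everything kept for one episode."""
    url: str
    transcript: str
    insights: Dict[str, str] = field(default_factory=dict)


class AudioAnalyticsSketch:
    """Hypothetical stand-in for the package endpoint; NOT the Steamship API."""

    def __init__(
        self,
        download: Callable[[str], bytes],
        transcribe: Callable[[bytes], str],
        analyzers: Dict[str, Callable[[str], str]],
    ) -> None:
        self.download = download
        self.transcribe = transcribe
        self.analyzers = analyzers
        # Results are kept per URL so they can be looked up later.
        self.results: Dict[str, PodcastAnalysis] = {}

    def analyze_url(self, url: str) -> PodcastAnalysis:
        """Download, transcribe, and analyze one episode, then store the result."""
        audio = self.download(url)
        transcript = self.transcribe(audio)
        insights = {name: run(transcript) for name, run in self.analyzers.items()}
        analysis = PodcastAnalysis(url=url, transcript=transcript, insights=insights)
        self.results[url] = analysis
        return analysis


# Stub providers (made up for illustration); real ones would call cloud services.
pkg = AudioAnalyticsSketch(
    download=lambda url: b"fake-audio-bytes",
    transcribe=lambda audio: "Joe: Have you ever seen a UFO?",
    analyzers={"summary": lambda text: text[:20]},
)
result = pkg.analyze_url("https://example.com/episode.mp3")
print(result.insights)
```

Passing the providers in as functions keeps the orchestration independent of any one transcription or NLP vendor, which matters when no single model wins everywhere.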
4. Store and make the insights queryable
Once the Steamship package is bundled and deployed, I can just import and use it on my computer while the logic runs in the cloud.
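To give a feel for what “queryable” means here, the sketch below stores quotes and looks them up by keyword. It uses plain substring matching, a deliberately naive stand-in for whatever indexing the real package does, and `Quote` and `QuoteIndex` are hypothetical names rather than part of Steamship. The two sample quotes are the ones shown further down in this post.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Quote:
    speaker: str
    text: str


class QuoteIndex:
    """Naive in-memory index: case-insensitive substring search over quotes."""

    def __init__(self) -> None:
        self._quotes: List[Quote] = []

    def add(self, quote: Quote) -> None:
        self._quotes.append(quote)

    def query(self, keyword: str) -> List[Quote]:
        needle = keyword.lower()
        return [q for q in self._quotes if needle in q.text.lower()]


index = QuoteIndex()
index.add(Quote("Andrew Schulz", "Corona actually saves lives"))
index.add(Quote("Kevin Hart", "The Pandemic showed me how our economy really F* works"))

for hit in index.query("pandemic"):
    print(f"{hit.speaker}: {hit.text}")
```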
3 insights and a newsletter
Here are 3 insights I extracted from the podcast.
Have another beer
Sorry, I had to give in to the crowd. Indexing for Corona and the pandemic gives me these quotes:
- Andrew Schulz: “Corona actually saves lives”
- Kevin Hart: “The Pandemic showed me how our economy really F* works”
Most guests are Google fanboys
- Jordan Peterson idolizes Google employees as elite, 1% people. For those wondering, I’m part of the 99%. Call me ordinary.
- Edward Snowden praises Google’s encryption. Not bad when the most famous NSA agent praises your security. I feel safer as I type this in a Google Doc.
- Joe Rogan and Andrew Schulz google #spaceisfake and #worldisflat. That’s one way to spend your Sunday afternoon!
Social Media causes depression
Joe Rogan likes to ask his guests how they feel about social media, priming them to link it to depression. Funny, because most of his guests live off social media. Kevin Hart recommends you look at yourself in the mirror to index your strengths and flaws. Don’t let people tell you something you don’t already know. Meanwhile, the real fighters of the pod hint that special herbs pump you back up and improve your self-image.
Bonus: Calf implants are cool in Miami (link)
This one doesn’t need any commentary:
This whole project was so much fun to build that I’m creating a weekend newsletter out of it. I’ll share AI-generated insights and summaries from the Joe Rogan pod every weekend. You can subscribe here. Each e-mail will serve a summary of this week’s podcasts paired with 3 funny shots distilled from Joe’s 2,050+ podcasts 🥃. The first release is scheduled for next Saturday!
I built a public demo where you can explore Joe Rogan’s podcasts through the lens of AI. Have fun here. If you want to use this app on your own podcast or audio, reach out via the link in the demo.