Hey everyone,
I have a confession to make: I often don't have the time or patience to sit through a whole YouTube video. I can read much faster than most people can talk, and I just want to get to the point. If you're like me, you've probably wished for an easy way to just grab the text from a video and skim it.
That's why I created ytt.py, a simple command-line tool that does exactly that. It fetches the available transcript for any YouTube video and prints it right to your terminal.
How It Works
The script is pretty straightforward. It's built on top of the excellent youtube-transcript-api Python library. The main logic I added was a robust function to extract the video ID from all the various, weird YouTube URL formats you might encounter, from standard watch?v= links to shortened youtu.be links and even embed URLs.
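To make the URL handling a bit more concrete, here's a rough sketch of how the three URL shapes mentioned above could be told apart using only the standard library. This is an illustration, not the code from ytt.py: the function name sketch_extract_video_id, the VIDEO_ID_RE constant, and the exact list of accepted hostnames are assumptions made for the example, and it expects URLs to include a scheme such as https://.

```python
import re
from urllib.parse import parse_qs, urlparse

# YouTube video IDs are 11 characters drawn from letters, digits, "_" and "-".
VIDEO_ID_RE = re.compile(r"^[a-zA-Z0-9_-]{11}$")


def sketch_extract_video_id(url_or_id: str) -> str:
    """Illustrative only: pull a video ID out of a few common URL shapes."""
    # A bare ID is returned unchanged.
    if VIDEO_ID_RE.match(url_or_id):
        return url_or_id

    parsed = urlparse(url_or_id)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in {"youtube.com", "m.youtube.com"}:  # assumed host list
        if parsed.path.startswith("/embed/"):
            # Embed URLs look like /embed/<id>.
            candidate = parsed.path.split("/")[2]
        else:
            # Standard watch URLs keep the ID in the "v" query parameter.
            candidate = parse_qs(parsed.query).get("v", [None])[0]

    if not candidate or not VIDEO_ID_RE.match(candidate):
        raise ValueError(f"Invalid YouTube URL or video ID: {url_or_id}")
    return candidate
```

With this sketch, a watch link, a youtu.be link, and an embed link for the same video all resolve to the same 11-character ID, and anything else raises ValueError.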
Once it has the video ID, it simply requests the transcript from the API (defaulting to English, but you can specify other languages) and prints the clean, plain-text version.
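If you're wondering what "clean, plain-text" means in practice: a transcript arrives as a list of caption entries, each with its text plus timing information, and the plain-text version keeps only the text. The sketch below shows that idea with a hypothetical Snippet dataclass and a to_plain_text helper. Both names are made up for the example, and this is not the library's own formatter, just an approximation of what it produces.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    """Hypothetical stand-in for one caption entry in a transcript."""
    text: str
    start: float      # seconds from the start of the video
    duration: float   # seconds the caption stays on screen


def to_plain_text(snippets: list[Snippet]) -> str:
    """Drop the timing fields and join the caption text, one entry per line."""
    return "\n".join(s.text.strip() for s in snippets)


example = [
    Snippet(text="Hey everyone,", start=0.0, duration=1.5),
    Snippet(text="welcome back to the channel.", start=1.5, duration=2.0),
]
print(to_plain_text(example))
```

The result is text you can skim, pipe into a pager, or search with grep, which is the whole point of the tool.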
The Code
The script is self-contained and easy to use. Here's a look at the core logic for extracting the video ID and fetching the transcript:
import logging
import re
from urllib.parse import parse_qs, urlparse

from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import TextFormatter

logger = logging.getLogger(__name__)


def extract_video_id(url_or_id: str) -> str:
    """Extract YouTube video ID from URL or return ID if already provided."""
    # Check if it's already a valid video ID format
    if re.match(r"^[a-zA-Z0-9_-]{11}$", url_or_id):
        return url_or_id
    try:
        parsed = urlparse(url_or_id)
        # ... logic to handle various youtube domains and URL patterns ...
        # Standard watch URL format
        query_params = parse_qs(parsed.query)
        video_id = query_params.get("v", [None])[0]
        if not video_id or not re.match(r"^[a-zA-Z0-9_-]{11}$", video_id):
            raise ValueError(f"Invalid video ID format in URL: {url_or_id}")
        return video_id
    except Exception as e:
        # Any parsing failure is reported as a single, consistent error
        raise ValueError(f"Invalid YouTube URL or video ID: {url_or_id}") from e


async def fetch_youtube_transcript(video_id: str, lang: str = "en") -> str:
    """Get YouTube transcript for a video ID."""
    try:
        transcript = YouTubeTranscriptApi().fetch(video_id, languages=[lang])
        # Convert the fetched transcript into plain text
        formatter = TextFormatter()
        return formatter.format_transcript(transcript)
    except Exception as e:
        logger.error(f"Failed to get transcript for video {video_id}: {e}")
        raise
Get The Tool
You can grab the full script from the Gist I created for it. It includes instructions on how to install the single dependency it needs to run.
It's a simple tool, but it's one I use all the time. Hopefully, some of you will find it just as useful for saving time and getting information more efficiently.
As always,
Michael Garcia a.k.a. TheCrazyGM