Get subtitles from a YouTube video with Python

Reading Time: < 1 minute

Hello, today we’re going to learn how to get subtitles from a YouTube video using a Python library.

The first thing we need to do is install the library:

pip install youtube-transcript-api

You can find this library here: https://pypi.org/project/youtube-transcript-api/

To use it and get the subtitles from a video, we need to:

Import the following dependencies:

from youtube_transcript_api import YouTubeTranscriptApi
import json

Then, create a function that takes a video code and a language code as input and returns the video's transcript as a single string:

def getSubtitles(videoId, language):
    strText = ""
    try:
        srt = YouTubeTranscriptApi.get_transcript(videoId, languages=[language])
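        # Optionally serialize the raw segments (a list of dicts) to a JSON string;
        # the result is not used below.
        json.dumps(srt)

        # Concatenate the text of every segment, separated by spaces
        for i in srt:
            strText += i['text']
            strText += " "
    except Exception:
        # On any failure (no transcript, unavailable language, network error) return "-1"
        strText = "-1"
    return strText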
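The loop that builds the text can also be written with str.join. Here is a minimal sketch, assuming each segment is a dict with a 'text' key as in the function above. The helper name segments_to_text and the sample data are hypothetical, used only for illustration:

```python
def segments_to_text(segments):
    """Join the 'text' field of each transcript segment into one string."""
    # Captions may contain line breaks inside a segment; split() collapses
    # any run of whitespace so each segment becomes a single clean line.
    parts = [" ".join(seg["text"].split()) for seg in segments]
    # Skip segments that are empty after cleaning.
    return " ".join(part for part in parts if part)


# Hypothetical sample data in the same shape the function above iterates over
sample = [
    {"text": "Hello and\nwelcome", "start": 0.0, "duration": 1.5},
    {"text": "to the video", "start": 1.5, "duration": 2.0},
]
print(segments_to_text(sample))
```

Unlike the loop in getSubtitles, this version does not leave a trailing space at the end of the string.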

The function takes the video code (the value of the v parameter in the video URL) and the language code of the subtitles, for example en or es.
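If you only have the full URL, the video code can be pulled out with the standard library. This is a rough sketch that assumes the two common URL shapes (youtube.com/watch?v=... and youtu.be/...). The function name extract_video_id and the example code abc123XYZ_- are made up for illustration:

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video code from a YouTube URL, or None if not recognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<code>
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None

    # Regular links: https://www.youtube.com/watch?v=<code>
    if host == "youtube.com" or host.endswith(".youtube.com"):
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None

    return None


print(extract_video_id("https://www.youtube.com/watch?v=abc123XYZ_-"))
```

Other URL shapes (embeds, shorts, playlists) are not handled here and would return None.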

* You can find available languages with this code:

transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)

If you do not pass the languages argument, the library defaults to English (en).
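To see which language codes you can pass, you can loop over the transcript list. The sketch below assumes the static list_transcripts call shown above is available in your installed version of the library, and that each entry exposes language_code and is_generated. The helper names available_languages and print_languages are hypothetical:

```python
def available_languages(transcript_list):
    """Return (language_code, is_generated) pairs for each transcript."""
    return [(t.language_code, t.is_generated) for t in transcript_list]


def print_languages(video_id):
    """Print the subtitle languages offered for a video (needs network access)."""
    # Imported here so available_languages can be used without the library installed
    from youtube_transcript_api import YouTubeTranscriptApi

    transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
    for code, generated in available_languages(transcript_list):
        kind = "auto-generated" if generated else "manual"
        print(f"{code} ({kind})")
```

Calling print_languages with a video code prints one line per available language, which tells you what to pass as the language argument of getSubtitles.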

Finally, the complete code looks like this:

from youtube_transcript_api import YouTubeTranscriptApi
import json

def getSubtitles(videoId, language):
    strText = ""
    try:
        srt = YouTubeTranscriptApi.get_transcript(videoId, languages=[language])
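        # Optionally serialize the raw segments (a list of dicts) to a JSON string;
        # the result is not used below.
        json.dumps(srt)

        # Concatenate the text of every segment, separated by spaces
        for i in srt:
            strText += i['text']
            strText += " "
    except Exception:
        # On any failure (no transcript, unavailable language, network error) return "-1"
        strText = "-1"
    return strText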
