YouTube Full Text Search - Search all of a YouTube channel's subtitles from the command line

Primary language: Python. License: The Unlicense.

yt-fts

yt-fts is a simple Python script that uses yt-dlp to scrape all of a YouTube channel's subtitles and load them into an SQLite database that is searchable from the command line. You can query a channel for a specific keyword or phrase, and yt-fts will generate time-stamped YouTube URLs pointing to the moment in each video where the match occurs.

Installation

pip

pip install yt-fts

from source

git clone https://github.com/NotJoeMartinez/yt-fts
cd yt-fts
python3 -m venv .env
source .env/bin/activate
pip install -r requirements.txt
python3 -m yt-fts

Dependencies

This project requires yt-dlp to be installed globally. Platform-specific installation instructions are available on the yt-dlp wiki.

pip

python3 -m pip install -U yt-dlp

MacOS/Homebrew

brew install yt-dlp

Windows/winget

winget install yt-dlp

Usage

Usage: yt-fts [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  delete    delete [channel id]
  download  download [channel url]
  export    export [search text] [channel id]
  list      Lists channels
  search    search [search text] [channel id]

download

Downloads all of a channel's VTT subtitle files and loads them into your database.

yt-fts download "https://www.youtube.com/@TimDillonShow/videos"
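
For context, a VTT subtitle file is a sequence of cues, each made of a timing line followed by one or more lines of text. The sketch below shows one way to pull (start time, text) pairs out of such a file. The `parse_vtt` helper and the sample text are illustrative assumptions, not the parser yt-fts actually uses, and the pattern assumes time stamps in hh:mm:ss.ttt form.

```python
import re

# Matches a WebVTT cue timing line such as "00:30:44.580 --> 00:30:47.000".
CUE_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3})\s+-->\s+(\d{2}:\d{2}:\d{2}\.\d{3})"
)

def parse_vtt(text):
    """Return a list of (start_time, cue_text) tuples from WebVTT text."""
    cues = []
    start, lines = None, []
    for line in text.splitlines():
        stripped = line.strip()
        match = CUE_TIMING.match(stripped)
        if match or not stripped:
            # A timing line or a blank line ends the cue in progress.
            if start is not None and lines:
                cues.append((start, " ".join(lines)))
            start = match.group(1) if match else None
            lines = []
        elif start is not None:
            # Text belonging to the current cue (may span several lines).
            lines.append(stripped)
    if start is not None and lines:
        cues.append((start, " ".join(lines)))
    return cues

SAMPLE = """WEBVTT

00:30:44.580 --> 00:30:47.000
van in the driveway life in the big city

00:30:47.000 --> 00:30:49.500
second cue
spanning two lines
"""

cues = parse_vtt(SAMPLE)
```

Each tuple in `cues` could then be stored as one searchable row alongside its video id.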

--channel-id [channel_id]

If the download fails, you can supply the channel id manually with the --channel-id flag. The channel URL must still be passed as an argument.

yt-fts download --channel-id "UC4woSp8ITBoYDmjkukhEhxg" "https://www.youtube.com/@TimDillonShow/videos" 

--language [en/fr/es/etc..]

Specify the subtitle language

yt-fts download --language de "https://www.youtube.com/@TimDillonShow/videos" 

--number-of-jobs [num_threads]

Speed up downloads with multithreading

yt-fts download --number-of-jobs 6 "https://www.youtube.com/@TimDillonShow/videos"
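
The idea behind --number-of-jobs can be sketched with a thread pool: subtitle downloads are I/O-bound, so several can be in flight at once. This is an illustration only. `fetch_subtitles` is a hypothetical stand-in for the real download step and performs no network access, and `download_all` is not a function from yt-fts.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_subtitles(video_id):
    """Hypothetical placeholder: a real version would fetch the video's VTT file."""
    return f"{video_id}.vtt"

def download_all(video_ids, number_of_jobs=1):
    """Run fetch_subtitles over video_ids using number_of_jobs worker threads."""
    with ThreadPoolExecutor(max_workers=number_of_jobs) as pool:
        # map() returns results in input order even though work runs concurrently.
        return list(pool.map(fetch_subtitles, video_ids))

files = download_all(["dqGyCTbzYmc", "e79H5nxS65Q"], number_of_jobs=6)
```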

list

List all of your downloaded channels

yt-fts list

output:

Listing channels
  id  channel_name         channel_url
----  -------------------  ---------------------------------------------------------------
   1  The Tim Dillon Show  https://www.youtube.com/channel/UC4woSp8ITBoYDmjkukhEhxg/videos
   2  Lex Fridman          https://www.youtube.com/channel/UCSHZKyawb77ixDdsGog4iWA/videos
   3  Traversy Media       https://www.youtube.com/channel/UC29ju8bIPH5as8OGnQzwJyA/videos

search

You can specify which channel to search using its id or channel_name. Each result includes a URL that opens the video at the point where the match occurs.

Usage: yt-fts search [OPTIONS] SEARCH_TEXT [CHANNEL]

  Search for a specified text within a channel or all channels. SEARCH_TEXT is
  the text to search for. CHANNEL is the name or id of the channel to search
  in. CHANNEL is required unless the '--all' option is specified.

Options:
  --all   Search in all channels. If not specified, a channel name or id is
          required.

  • The search string does not have to be a word-for-word match.
  • Use the id if you have channels with the same name or channels with special characters in their names.
  • Search strings are limited to 40 characters.

Ex:

yt-fts search "life in the big city" "The Tim Dillon Show"
# or 
yt-fts search "life in the big city" 1  # assuming 1 is id of channel

output:

The Tim Dillon Show: "164 - Life In The Big City - YouTube"

    Quote: "van in the driveway life in the big city"
    Time Stamp: 00:30:44.580
    Link: https://youtu.be/dqGyCTbzYmc?t=1841
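
The link's `t` parameter is the cue's time stamp converted to whole seconds. In the sample output above, 00:30:44.580 is 1844 seconds while the link uses t=1841, which is consistent with backing up 3 seconds so playback starts just before the quote. A minimal sketch under that assumption follows; the `time_to_link` name and the `lead_in` parameter are illustrative and not taken from the yt-fts source.

```python
def time_to_link(video_id, time_stamp, lead_in=3):
    """Build a youtu.be link that starts lead_in seconds before time_stamp."""
    hours, minutes, seconds = time_stamp.split(":")
    # Drop the fractional part: the t parameter takes whole seconds.
    total = int(hours) * 3600 + int(minutes) * 60 + int(float(seconds))
    # Clamp at zero so quotes near the start of a video stay valid.
    return f"https://youtu.be/{video_id}?t={max(total - lead_in, 0)}"

link = time_to_link("dqGyCTbzYmc", "00:30:44.580")
```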

Search all channels

Use --all to search all channels in your database

Ex:

yt-fts search "text to search" --all

Advanced Search Syntax

The search string supports SQLite's Enhanced Query Syntax, which includes features such as prefix queries that match the beginning of a word.

Ex:

yt-fts search "rea* kni* Mali*" "The Tim Dillon Show" 

output:

The Tim Dillon Show: "#200 - Knife Fights In Malibu | The Tim Dillon Show - YouTube"

    Quote: "real knife fight down here in Malibu I"
    Time Stamp: 00:45:39.420
    Link: https://youtu.be/e79H5nxS65Q?t=2736
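
Prefix queries can be tried directly against SQLite's full-text search from Python. The sketch below assumes your SQLite build includes the FTS5 extension and uses a hypothetical one-column `subtitles` table; the schema yt-fts really uses may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for illustration; requires SQLite built with FTS5.
conn.execute("CREATE VIRTUAL TABLE subtitles USING fts5(text)")
conn.executemany(
    "INSERT INTO subtitles (text) VALUES (?)",
    [
        ("real knife fight down here in Malibu I",),
        ("van in the driveway life in the big city",),
    ],
)

# A term ending in * matches any token that starts with that prefix.
# Space-separated terms must all be present in the row (implicit AND).
rows = conn.execute(
    "SELECT text FROM subtitles WHERE subtitles MATCH ?",
    ("rea* kni* Mali*",),
).fetchall()
matches = [row[0] for row in rows]
```

Matching is case-insensitive with the default tokenizer, so `Mali*` finds "Malibu" regardless of how the query is capitalized.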

export

Similar to search, except that it exports all of the search results to a CSV file with the headers: Channel Name,Video Title,Quote,Time Stamp,Link

yt-fts export "life in the big city" "The Tim Dillon Show"

You can export from all channels in your database as well

yt-fts export "life in the big city" --all
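
The CSV layout described above can be reproduced with Python's csv module. This is a sketch that assumes the search results are already available as tuples in header order; `write_results` is an illustrative helper, not part of yt-fts.

```python
import csv
import io

HEADERS = ["Channel Name", "Video Title", "Quote", "Time Stamp", "Link"]

def write_results(rows, out):
    """Write the header row followed by one CSV row per search result."""
    writer = csv.writer(out)
    writer.writerow(HEADERS)
    writer.writerows(rows)

# An in-memory buffer stands in for the output file in this example.
buffer = io.StringIO()
write_results(
    [(
        "The Tim Dillon Show",
        "164 - Life In The Big City - YouTube",
        "van in the driveway life in the big city",
        "00:30:44.580",
        "https://youtu.be/dqGyCTbzYmc?t=1841",
    )],
    buffer,
)
csv_text = buffer.getvalue()
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` as `out`.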

delete

Deletes a channel from your database

yt-fts delete [channel_id]