pytubefix.exceptions.RegexMatchError, 3.1-rc1
Closed this issue · 2 comments
Describe the bug
Getting a RegexMatchError when extracting the video URL:
--> python3
Python 3.11.5 (main, Sep 7 2023, 00:00:00) [GCC 11.4.1 20230605 (Red Hat 11.4.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pytubefix
>>> YOUTUBE_URL = str('https://www.youtube.com/@pbsspacetime/videos')
>>> x = pytubefix.Channel(YOUTUBE_URL)
>>> for index, VIDEO in enumerate(x.video_urls, start=1):
...     yt = pytubefix.YouTube(str(VIDEO), use_oauth=True, allow_oauth_cache=True)
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/opt/projects/mytube/venv/lib64/python3.11/site-packages/pytubefix/__main__.py", line 96, in __init__
    self.video_id = extract.video_id(url)
                    ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/projects/mytube/venv/lib64/python3.11/site-packages/pytubefix/extract.py", line 133, in video_id
    return regex_search(r"(?:v=|\/)([0-9A-Za-z_-]{11}).*", url, group=1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/projects/mytube/venv/lib64/python3.11/site-packages/pytubefix/helpers.py", line 129, in regex_search
    raise RegexMatchError(caller="regex_search", pattern=pattern)
pytubefix.exceptions.RegexMatchError: regex_search: could not find match for (?:v=|\/)([0-9A-Za-z_-]{11}).*
Expected behavior
I expect it to parse the URL normally.
It handles a hard-coded video URL without problems, but URLs extracted from a channel trigger the error.
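For anyone reproducing this, the pattern from the traceback can be tried on its own with the standard `re` module. This is a minimal sketch, not pytubefix code: `find_video_id` is a hypothetical helper, and `abcdefghijk` is a made-up placeholder ID rather than a real video. It only shows which kinds of strings the pattern accepts, not what `video_urls` actually yielded in the session above.

```python
import re

# Pattern copied verbatim from the traceback (pytubefix/extract.py, video_id).
VIDEO_ID_PATTERN = r"(?:v=|\/)([0-9A-Za-z_-]{11}).*"


def find_video_id(url):
    """Return the 11-character ID the pattern captures, or None if nothing matches."""
    match = re.search(VIDEO_ID_PATTERN, url)
    return match.group(1) if match else None


# "abcdefghijk" is a placeholder ID used only for illustration.
watch_url = "https://www.youtube.com/watch?v=abcdefghijk"
channel_url = "https://www.youtube.com/@pbsspacetime/videos"

print(find_video_id(watch_url))
print(find_video_id(channel_url))
```

The watch-style URL yields an ID, while the channel URL yields nothing. According to the traceback, a missing match is the condition under which `regex_search` raises `RegexMatchError`.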
Desktop (please complete the following information):
- OS: Red Hat Enterprise Linux release 9.3 (Plow)
- Python Version 3.11.5
- pytubefix 3.1-rc1
Additional context
Testing a release candidate
The video_urls property is inherited from the Playlist class. Try using the .videos property instead; it returns YouTube objects.
If you just need the url of each video, try this:
from pytubefix import Channel

YOUTUBE_URL = 'https://www.youtube.com/@pbsspacetime/videos'
x = Channel(YOUTUBE_URL)
# x.videos yields YouTube objects; watch_url is each video's URL
for index, VIDEO in enumerate(x.videos, start=1):
    print(VIDEO.watch_url)
Ultimately, that worked. However, this code had been working since its inception and only broke after I upgraded to 3.0.0, so something changed. This is the change that worked for me:
# Iterate through the playlist
for index, VID in enumerate(x.videos, start=1):
    VIDEO = VID.watch_url
I had to keep VIDEO intact because other functions depend on it holding the video URL. Without the reassignment, the VIDEO variable would just contain the YouTube object.
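If downstream functions may receive either form, one option is a small normalizing helper. This is only a sketch: `as_watch_url` and `FakeVideo` are hypothetical names, and the only pytubefix detail assumed is the `watch_url` attribute used above. `FakeVideo` stands in for a YouTube object so the example runs without network access.

```python
class FakeVideo:
    """Stand-in for a YouTube object; only the watch_url attribute is modeled."""

    def __init__(self, watch_url):
        self.watch_url = watch_url


def as_watch_url(video):
    """Return a URL string for either a plain URL or an object exposing watch_url."""
    if isinstance(video, str):
        return video
    return video.watch_url


# Placeholder URL; "abcdefghijk" is not a real video ID.
url = "https://www.youtube.com/watch?v=abcdefghijk"

print(as_watch_url(url))
print(as_watch_url(FakeVideo(url)))
```

With this in place, the loop could assign `VIDEO = as_watch_url(VID)` and the rest of the code would keep receiving a string.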