subreddit.stream.submissions() stops after an hour or so
AltF02 opened this issue · 9 comments
Describe the bug
subreddit.stream.submissions() stops after an hour or so
To Reproduce
Steps to reproduce the behavior:
- Create a submissions stream
- Run for a couple of hours
- Observe
Expected behavior
subreddit.stream.submissions() never stops
Code/Logs
subreddit = await reddit.subreddit("dankmemes+memes+okbuddyretard+specialsnowflake+pewdiepiesubmissions")
async for submission in subreddit.stream.submissions(skip_existing=True):
    keywords = DataBase.get_keywords()
    matching = [s for s in keywords if s[0].lower() in submission.title.lower()]
    if matching:
        await self.send_notification(submission, matching[0])
Execution reaches the end of the for loop body before the stream just dies, so I don't think a blocking call is causing this.
System Info
- OS: Ubuntu 18.04.5 LTS x86_64
- Python: 3.8.3
- Async PRAW Version: 7.1.0
My current workaround is to break out of the loop after 100 posts or so and start it over again.
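For anyone wanting to try the same workaround, here is a minimal sketch of it. `restarting_stream` and `make_stream` are hypothetical names, not part of Async PRAW; `make_stream` is assumed to be a zero-argument callable that returns a fresh async generator each time it is called. Note that this only restarts the stream when items keep arriving, so it would not help with a stream that has already gone silent.

```python
import asyncio


async def restarting_stream(make_stream, restart_after=100):
    """Yield items forever, rebuilding the underlying stream periodically.

    make_stream: hypothetical zero-argument callable returning a new
    async generator on every call.
    """
    while True:
        stream = make_stream()
        count = 0
        try:
            async for item in stream:
                yield item
                count += 1
                if count >= restart_after:
                    break  # drop this stream and build a fresh one
        finally:
            # Close the old generator explicitly instead of waiting for GC.
            await stream.aclose()
```

With Async PRAW this would presumably be used as `restarting_stream(lambda: subreddit.stream.submissions(skip_existing=True))`, though I have only checked the helper against a fake stream.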
This makes sense in part, since Reddit marks the access token as invalid after an hour, so PRAW needs to request a new one. That refresh should happen in the background, though, so it should not stop the stream. Is any exception being printed? In asyncio tasks, unhandled exceptions do not terminate the program; they only get reported on STDERR.
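To make sure a failure in a background task is not missed, one option is to attach a done callback that reports it. This is a small standalone sketch using only asyncio; `report_failure` and `broken_stream` are made-up names, and `broken_stream` merely stands in for a stream task that dies.

```python
import asyncio

failures = []  # collected here only so the result is easy to inspect


def report_failure(task: asyncio.Task) -> None:
    """Done callback: surface an exception that would otherwise go unnoticed."""
    if task.cancelled():
        return
    exc = task.exception()  # retrieving it also marks the exception as handled
    if exc is not None:
        failures.append(exc)
        print(f"Task {task.get_name()} failed: {exc!r}")


async def broken_stream():
    # Stand-in for a stream coroutine that raises partway through.
    await asyncio.sleep(0)
    raise RuntimeError("stream stopped")


async def main():
    task = asyncio.create_task(broken_stream(), name="submissions")
    task.add_done_callback(report_failure)
    # The rest of the program keeps running even though the task has failed.
    await asyncio.sleep(0.05)
    return task.done()


if __name__ == "__main__":
    asyncio.run(main())
```

If the callback never fires with an exception while the stream is silent, that would suggest the stream is stalled rather than crashed.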
No, I'm not aware of any exceptions sadly
I ran across this issue. If the last item grabbed from the stream is deleted, asyncPRAW never grabs the next item. It has something to do with the "before" variable, but I don't fully understand why that is. There are no exceptions in the background.
To Reproduce
This might be a tad verbose, but it uses a mod account to both make a post and remove it, so appropriate mod permissions must be set on the account and environment variables set. This will run forever, so Ctrl+C to stop it when done.
import asyncio
import os

import asyncpraw

loop = asyncio.get_event_loop()
reddit = asyncpraw.Reddit(
    client_id=os.getenv("REDDIT_CLIENT_ID"),
    client_secret=os.getenv("REDDIT_CLIENT_SECRET"),
    user_agent=os.getenv("REDDIT_USERAGENT"),
    username=os.getenv("REDDIT_USERNAME"),
    password=os.getenv("REDDIT_PASSWORD"),
    loop=loop,
)
reddit.validate_on_submit = True


async def reddit_submissions():
    subreddit = await reddit.subreddit(os.getenv("REDDIT_SUBREDDIT"))
    exceptions = 0  # number of errors caught while streaming
    while True:
        try:
            async for submission in subreddit.stream.submissions(skip_existing=True):
                print(f"New submission: {submission.title}")
        except Exception as exc:
            print(f"Unexpected error occurred: {exc!r}")
            exceptions += 1


async def reddit_make_posts():
    subreddit = await reddit.subreddit(os.getenv("REDDIT_SUBREDDIT"))
    await asyncio.sleep(5)  # 5 seconds to ensure that reddit_submissions starts successfully
    submission = await subreddit.submit("This post will be deleted", "self")
    print("Submitted 1st")
    await asyncio.sleep(10)  # 10 seconds to ensure that reddit_submissions grabs it
    await submission.mod.remove()
    print("Removed")
    await asyncio.sleep(10)  # 10 seconds to ensure that reddit removes this
    submission = await subreddit.submit("This post will never be grabbed", "self")
    print("Submitted 2nd")


loop.create_task(reddit_submissions())
loop.create_task(reddit_make_posts())
loop.run_forever()
Is this reproducible in normal PRAW or what?
This is very interesting, and it would affect main PRAW as well. This could be why some streams randomly stop producing results. I wonder if this is related to praw-dev/praw#1025.
Does this also occur with items that are deleted?
In my testing, main PRAW does not have the same bug. Main PRAW seems to occasionally poll the new.json endpoint without the before parameter.
Looking through PRAW vs asyncPRAW, the only difference in the StreamGenerator that could trigger this is one line:
PRAW: if not exclude_before:
https://github.com/praw-dev/praw/blob/a1f7e015a8a80c08ef70069d341e45bd74f9145e/praw/models/util.py#L177
asyncPRAW: if not exclude_before and before_attribute:
asyncpraw/asyncpraw/models/util.py, line 185 in d9b61d3
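To illustrate why that one condition could matter, here is a toy simulation. It is not the library's code. It assumes, based on the report above, that a listing request whose before points at a removed post returns nothing, and that the request params persist between polls unless they are overwritten. `fake_new_listing` and `poll` are hypothetical names made up for this sketch.

```python
def fake_new_listing(posts, removed, before=None):
    """Toy stand-in for new.json: visible posts, newest first.

    Assumption: if `before` names a post that is no longer visible,
    the listing comes back empty.
    """
    visible = [p for p in posts if p not in removed]
    if before is None:
        return visible[::-1]
    if before not in visible:
        return []
    return visible[visible.index(before) + 1:][::-1]


def poll(rounds, guard_on_truthy_before):
    posts, removed, seen, yielded = ["t3_a"], set(), set(), []
    kwargs = {}  # persists across polls, like the stream's request kwargs
    before_attribute = None
    for step in range(rounds):
        if step == 1:
            removed.add("t3_a")  # the last item seen gets removed
        if step == 2:
            posts.append("t3_b")  # a new post arrives afterwards
        if guard_on_truthy_before:
            # asyncPRAW-style check: params only updated when before is set
            if before_attribute:
                kwargs["params"] = {"before": before_attribute}
        else:
            # PRAW-style check: params always overwritten, even with None
            kwargs["params"] = {"before": before_attribute}
        before = kwargs.get("params", {}).get("before")
        newest = None
        for name in reversed(fake_new_listing(posts, removed, before)):
            if name not in seen:
                seen.add(name)
                newest = name
                yielded.append(name)
        before_attribute = newest
    return yielded


print(poll(4, guard_on_truthy_before=False))  # ['t3_a', 't3_b']
print(poll(4, guard_on_truthy_before=True))   # ['t3_a']
```

Under these assumptions, the guarded version keeps sending the stale before value for the removed post, so the second post never shows up, which matches the behavior reported above.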
Verified the PR works with the sample code provided.