praw-dev/asyncpraw

subreddit.stream.submissions() stops after an hour or so

AltF02 opened this issue · 9 comments

Describe the bug
subreddit.stream.submissions() stops after an hour or so

To Reproduce
Steps to reproduce the behavior:

  1. Create a submissions stream
  2. Run for a couple of hours
  3. Observe

Expected behavior
subreddit.stream.submissions() never stops

Code/Logs

        subreddit = await reddit.subreddit("dankmemes+memes+okbuddyretard+specialsnowflake+pewdiepiesubmissions")
        
        async for submission in subreddit.stream.submissions(skip_existing=True):
            keywords = DataBase.get_keywords()
            matching = [s for s in keywords if s[0].lower() in submission.title.lower()]
            if matching:
                await self.send_notification(submission, matching[0])

It reaches the end of the for loop body before just dying, so I don't think a blocking call is causing this.

System Info

  • OS: Ubuntu 18.04.5 LTS x86_64
  • Python: 3.8.3
  • Async PRAW Version: 7.1.0

My current workaround is to break out of the loop after 100 posts or so and start it over again.
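A minimal sketch of that restart workaround, for anyone who wants to try it. Everything here is a stand-in: `fake_stream` replaces `subreddit.stream.submissions(skip_existing=True)`, and `consume_in_batches`, `batch_size`, and `restarts` are hypothetical names, not part of Async PRAW.

```python
import asyncio
import itertools

# Shared counter so that restarted streams keep producing new items.
_counter = itertools.count(1)


async def fake_stream():
    """Stand-in for an endless submission stream (not Async PRAW code)."""
    while True:
        await asyncio.sleep(0)
        yield f"post-{next(_counter)}"


async def consume_in_batches(make_stream, handle, batch_size=100, restarts=3):
    """Read batch_size items, then drop the stream and open a fresh one."""
    for _ in range(restarts):
        stream = make_stream()
        try:
            count = 0
            async for item in stream:
                handle(item)
                count += 1
                if count >= batch_size:
                    break  # abandon this stream; the outer loop opens a new one
        finally:
            await stream.aclose()  # close the abandoned async generator


handled = []
asyncio.run(consume_in_batches(fake_stream, handled.append, batch_size=2, restarts=3))
print(handled)
```

With a real stream, a restart with `skip_existing=True` may skip posts made between the break and the restart, so this only hides the underlying problem.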

This makes sense in part, since Reddit marks the access token as invalid after an hour and PRAW then needs to request a new one. What doesn't make sense is the stream stopping, because that refresh should happen in the background. Is any exception being printed? In async functions, unhandled exceptions do not terminate the program; instead, they get printed to STDERR.
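If nothing shows up on STDERR, one way to make sure a failure in a background task is not missed is to attach a done callback to the task. This is a generic asyncio sketch, not Async PRAW code: `failing_stream` and `report` are hypothetical names, and the `RuntimeError` only simulates a stream that dies.

```python
import asyncio


async def failing_stream():
    """Stand-in for a stream task that dies with an exception."""
    await asyncio.sleep(0)
    raise RuntimeError("stream died")


def report(task, sink):
    """Done callback: record the exception, if any, that ended the task."""
    if task.cancelled():
        return
    exc = task.exception()
    if exc is not None:
        sink.append(repr(exc))


async def main():
    errors = []
    task = asyncio.create_task(failing_stream())
    task.add_done_callback(lambda t: report(t, errors))
    await asyncio.sleep(0.05)  # give the task and its callback time to run
    return errors


errors = asyncio.run(main())
print(errors)
```

If the callback never fires with an exception and the stream still goes quiet, the task is probably still alive and just not receiving new items.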

No, I'm not aware of any exceptions sadly

I ran across this issue. If the last item grabbed from the stream is deleted, asyncPRAW never grabs the next item. It has something to do with the "before" variable, but I don't fully understand why that is. There are no exceptions in the background.

To Reproduce
This might be a tad verbose, but it uses a mod account to both make a post and remove it, so appropriate mod permissions must be set on the account and environment variables set. This will run forever, so Ctrl+C to stop it when done.

import asyncio, asyncpraw
import os

loop = asyncio.get_event_loop()

reddit = asyncpraw.Reddit(client_id=os.getenv("REDDIT_CLIENT_ID"),
                  client_secret=os.getenv("REDDIT_CLIENT_SECRET"),
                  user_agent=os.getenv("REDDIT_USERAGENT"),
                  username=os.getenv("REDDIT_USERNAME"),
                  password=os.getenv("REDDIT_PASSWORD"),
                  loop = loop)
reddit.validate_on_submit = True

async def reddit_submissions():
    subreddit = await reddit.subreddit(os.getenv("REDDIT_SUBREDDIT"))
    while True:
        try:
            async for submission in subreddit.stream.submissions(skip_existing=True):
                print(f"New submission: {str(submission.title)}")
        except Exception as exc:  # report the error, then restart the stream
            print(f"Unexpected error occurred: {exc!r}")

async def reddit_make_posts():
    subreddit = await reddit.subreddit(os.getenv("REDDIT_SUBREDDIT"))
    await asyncio.sleep(5) # 5 seconds to ensure that the reddit_submissions starts successfully
    submission = await subreddit.submit("This post will be deleted", "self")
    print("Submitted 1st")
    await asyncio.sleep(10) # 10 seconds to ensure that the reddit_submissions grabs it
    await submission.mod.remove()
    print("Removed")
    await asyncio.sleep(10) # 10 seconds to ensure that reddit removes this
    submission = await subreddit.submit("This post will never be grabbed", "self")
    print("Submitted 2nd")

loop.create_task(reddit_submissions())
loop.create_task(reddit_make_posts())
loop.run_forever()

Is this reproducible in normal PRAW or what?

This is very interesting and it would also affect main PRAW as well. This could be why some streams randomly stop producing results. I wonder if this is related to praw-dev/praw#1025.

Does this occur with deleted items as well?

My testing with main PRAW does not have the same bug. Main PRAW seems to occasionally poll the new.json endpoint without the before parameter.

Looking through PRAW vs asyncPRAW, the only difference in the StreamGenerator that could trigger this is one line:

PRAW: if not exclude_before:
https://github.com/praw-dev/praw/blob/a1f7e015a8a80c08ef70069d341e45bd74f9145e/praw/models/util.py#L177

asyncPRAW: if not exclude_before and before_attribute:
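One plausible reading of why that single condition matters, sketched as a toy model rather than either library's actual code. It assumes that the listing returns nothing when `before` names a removed post, and that the request parameters persist between polls. `listing`, `run_stream`, and the `t3_*` names are made up for illustration.

```python
def listing(posts, removed, before=None):
    """Toy stand-in for new.json: visible posts, newest first."""
    if before is None:
        return [p for p in posts if p not in removed][::-1]
    if before in removed:
        return []  # ASSUMPTION: a removed anchor yields an empty listing
    newer = posts[posts.index(before) + 1:]
    return [p for p in newer if p not in removed][::-1]


def run_stream(guard_on_before_attribute):
    """Poll six times while posts are submitted and removed in between."""
    posts, removed = ["t3_a"], set()
    events = {1: ("submit", "t3_b"), 2: ("remove", "t3_b"), 3: ("submit", "t3_c")}
    params, before_attribute, seen, yielded = {}, None, set(), []
    for round_no in range(6):
        kind, name = events.get(round_no, (None, None))
        if kind == "submit":
            posts.append(name)
        elif kind == "remove":
            removed.add(name)
        if guard_on_before_attribute:
            # asyncPRAW-style condition: params are only touched when
            # before_attribute is set, so an old "before" can linger.
            if before_attribute:
                params["before"] = before_attribute
        else:
            # PRAW-style condition: "before" is overwritten on every poll,
            # including with None after an empty response.
            params["before"] = before_attribute
        newest = None
        for item in reversed(listing(posts, removed, params.get("before"))):
            if item in seen:
                continue
            seen.add(item)
            newest = item
            yielded.append(item)
        before_attribute = newest
    return yielded


print(run_stream(guard_on_before_attribute=False))
print(run_stream(guard_on_before_attribute=True))
```

In this model the guarded variant keeps sending the removed post as `before` and never sees `t3_c`, while the unguarded variant falls back to a request without `before` and recovers. That would match the observation that main PRAW occasionally polls without the parameter.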

Verified the PR works with the sample code provided.