ivan-rivera/RedditExtractor

get_reddit() returning posts that do not include search terms.


Hello!

I have used the following code:

selfcare_politics <- get_reddit(
  search_terms = "politics OR political OR democracy OR vote OR voting OR voted
    OR republican OR democrat OR liberal OR conservative
    OR politician OR government OR country OR polarization
    OR president OR citizen OR activism OR protest OR boycott OR riot",
  subreddit = "selfcare",
  page_threshold = 1000000,
  cn_threshold = 2,
  sort_by = "comments"
)

This code runs without error and returns a normal dataframe of comments and metadata. However, after manually inspecting several rows, I found that it was returning posts with no mention of any of my search terms in the title, the post content, or any of the comments on the post.

Put differently, get_reddit() runs without any error messages but returns irrelevant posts. This could be affecting many users who do not realize it.
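The manual inspection described above could also be done programmatically. The following is only a sketch: it uses a small made-up data frame in place of the real get_reddit() output, and the column names `URL`, `title`, `post_text`, and `comment` are assumptions about that output rather than something confirmed in this thread.

```r
# Toy stand-in for the get_reddit() result (hypothetical rows, assumed columns).
posts <- data.frame(
  URL       = c("u1", "u1", "u2", "u2"),
  title     = c("Voting day self-care", "Voting day self-care",
                "Sleep tips", "Sleep tips"),
  post_text = c("How I cope", "How I cope", "No screens", "No screens"),
  comment   = c("Good luck", "Same here", "Thanks", "Works for me"),
  stringsAsFactors = FALSE
)

# The search terms from the original call.
terms <- c("politics", "political", "democracy", "vote", "voting", "voted",
           "republican", "democrat", "liberal", "conservative",
           "politician", "government", "country", "polarization",
           "president", "citizen", "activism", "protest", "boycott", "riot")

# Whole-word regex alternation; this "|" is regex syntax for grepl() only.
pattern <- paste0("\\b(", paste(terms, collapse = "|"), ")\\b")

# Does each row mention a term in its title, post text, or comment?
row_text <- paste(posts$title, posts$post_text, posts$comment)
posts$mentions_term <- grepl(pattern, row_text, ignore.case = TRUE)

# A post counts as relevant if any of its rows mentions a term.
post_has_term <- tapply(posts$mentions_term, posts$URL, any)

# Posts in which no term appears anywhere.
unmatched_urls <- names(post_has_term)[!post_has_term]
print(unmatched_urls)
```

Counting `unmatched_urls` against the total number of posts gives a rough measure of how many irrelevant posts were returned.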

Sorry for the delayed response; I haven't been giving this library enough attention lately.

Regarding the issue you've raised: I don't think you are using the OR operators correctly. I think you need to use "|" instead; more info here. Your sorting condition may also play a role: make sure it is set to "relevance" rather than "comments".
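As a rough sketch of that suggestion, the query could be built from a vector of terms joined with "|". Whether Reddit's search actually treats "|" as OR is the suggestion made above and is not verified here. The get_reddit() call is left commented out because it needs the package and network access; its arguments are copied from the original report, with only `search_terms` and `sort_by` changed.

```r
# Search terms from the original call, as a character vector.
terms <- c("politics", "political", "democracy", "vote", "voting", "voted",
           "republican", "democrat", "liberal", "conservative",
           "politician", "government", "country", "polarization",
           "president", "citizen", "activism", "protest", "boycott", "riot")

# Join with "|" instead of " OR " (assumption: the search honours "|").
query_pipe <- paste(terms, collapse = "|")
print(query_pipe)

# Same call as in the report, with the two suggested changes applied.
# library(RedditExtractoR)
# selfcare_politics <- get_reddit(
#   search_terms = query_pipe,
#   subreddit = "selfcare",
#   page_threshold = 1000000,
#   cn_threshold = 2,
#   sort_by = "relevance"
# )
```

Building the query from a vector also avoids the line breaks that the original multi-line string embedded in the search terms.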

Either way, this relates to the Reddit API itself rather than the RE package, so I'm closing this ticket for now.