AttributeError: 'Submission' object has no attribute 'd_'
AmeyHengle opened this issue · 1 comments
The API documentation describes a way to convert a submission object directly into a DataFrame using the special attribute 'd_'.
However, in practice, I am getting an error that there is no such attribute.
Posting the error below:
AttributeError Traceback (most recent call last)
in
24 ))
25
---> 26 df = pd.DataFrame([obj.d_ for obj in submissions])
27 df.to_csv('../Mental_Health/AskReddit.csv')
in (.0)
24 ))
25
---> 26 df = pd.DataFrame([obj.d_ for obj in submissions])
27 df.to_csv('../Mental_Health/AskReddit.csv')
~\Anaconda3\lib\site-packages\praw\models\reddit\base.py in __getattr__(self, attribute)
33 if not attribute.startswith("_") and not self._fetched:
34 self._fetch()
---> 35 return getattr(self, attribute)
36 raise AttributeError(
37 "{!r} object has no attribute {!r}".format(
~\Anaconda3\lib\site-packages\praw\models\reddit\base.py in __getattr__(self, attribute)
36 raise AttributeError(
37 "{!r} object has no attribute {!r}".format(
---> 38 self.__class__.__name__, attribute
39 )
40 )
AttributeError: 'Submission' object has no attribute 'd_'
Interesting! It's hard to debug this without seeing the code for how you requested obj. My suspicion here is that you instantiated the psaw.PushshiftAPI instance with a praw.Reddit instance, right? If that's the case, obj is an instance of praw.models.Submission (you can check this with type(obj)). That special d_ attribute is specific to the psaw Submission model. If you instantiate the psaw.PushshiftAPI instance without passing it a praw.Reddit instance, it will return objects with the d_ attribute. If you give it a praw.Reddit instance, it will return praw objects, which don't have this attribute.
Assuming this is what's going on, I think there are two main reasons to instantiate psaw with a praw instance. The first is to ensure you are fetching the current state of the Reddit Thing rather than the snapshot in Pushshift's archive (e.g. if you need the current score on the object or want to ignore deleted items). The second is that you are passing the results to code that was designed around praw objects and you want to ensure compatibility. If you don't fall into either of these use cases, you can probably just instantiate psaw without passing it a praw instance and you'll get the magic d_ attribute.
If you do need to keep the praw instance, you can build the dicts yourself. I haven't tested this, but per the praw docs I suspect you could do something like this:
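# One dict per praw object from its instance attributes, skipping private ones such as _reddit
subm_dicts = [{k: v for k, v in vars(praw_obj).items() if not k.startswith("_")} for praw_obj in submissions]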
df = pd.DataFrame(subm_dicts)
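If you aren't sure which kind of object you'll get back, a small helper can cover both cases. The following is only a sketch under assumptions. `to_records`, `FakePsawSubmission`, and `FakePrawSubmission` are hypothetical names made up for illustration; the two fake classes stand in for the real psaw and praw results so the snippet runs without network access. It uses `d_` when the attribute holds a dict and otherwise falls back to the object's public instance attributes.

```python
from typing import Any, Dict, Iterable, List


def to_records(objs: Iterable[Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: turn psaw-style or praw-style objects into dicts."""
    records = []
    for obj in objs:
        # getattr with a default swallows the AttributeError from the traceback.
        # Caveat: on a lazy praw object this lookup may trigger a fetch first,
        # as the __getattr__ frame in the traceback suggests.
        d = getattr(obj, "d_", None)
        if isinstance(d, dict):
            records.append(dict(d))
        else:
            # Fall back to public instance attributes (skip e.g. _reddit).
            records.append(
                {k: v for k, v in vars(obj).items() if not k.startswith("_")}
            )
    return records


class FakePsawSubmission:
    """Stand-in for a psaw result: exposes its data through d_."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self.d_ = dict(data)


class FakePrawSubmission:
    """Stand-in for a praw Submission: plain attributes, no d_."""

    def __init__(self, title: str, score: int) -> None:
        self.title = title
        self.score = score
        self._reddit = object()  # private attribute, filtered out above


records = to_records(
    [FakePsawSubmission({"title": "a", "score": 1}), FakePrawSubmission("b", 2)]
)
```

The resulting list of dicts can then be passed to pd.DataFrame(records) just like the list in the snippet above.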
If my suspicion is off target here, I'm going to need more information to help you figure out what's going on. Let me know if this answers your question. If you still need help, please share enough code to permit me to replicate the issue on my end.