dhudsmith/pull_twitter

"in_reply_to_user_id" is coming as float, but should be a string.

patrick-lee-warren opened this issue · 7 comments

The other ids seem to be correctly specified as str, but this one is coming in with a decimal.

Same deal for pinned_tweet_id, for the user pull.

I see this behavior as well. twitter alchemy correctly parses both of these as integers, so the issue must be on the pandas/saving side of things. We should see if things go wrong at this line, for example.

Carl's theory is that the missing values are causing the pandas series to be coerced to float.
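A minimal sketch of that theory, using made-up ids and a hypothetical two-row frame rather than the actual pull_twitter code: a column that mixes integers with a missing value ends up as float64, so the id gains a trailing ".0" when rendered as text.

```python
import pandas as pd

# Hypothetical data: the second tweet is not a reply, so its value is missing.
df = pd.DataFrame(
    {
        "id": [101, 102],
        "in_reply_to_user_id": [12345, None],
    }
)

# The fully populated column keeps an integer dtype.
id_kind = df["id"].dtype.kind

# The column containing a missing value is coerced to float64 (None becomes NaN).
reply_dtype = str(df["in_reply_to_user_id"].dtype)

# Rendering the coerced value as text shows the unwanted decimal.
first_reply_as_text = str(df["in_reply_to_user_id"].iloc[0])

print(id_kind, reply_dtype, first_reply_as_text)
```

Only the columns that can be empty (in_reply_to_user_id, pinned_tweet_id) would be affected under this theory, which matches what is reported above.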

We aren't ever going to do any mathematical operations on any of these ids. There is no reason to treat them as anything but strings, ever.

This has now been resolved in Pull Request #22. The float-typed id columns specified above are now converted to remove the decimal, so they can be interpreted correctly as strings.
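For illustration only, a conversion of this kind could look roughly like the sketch below. The helper name `float_ids_to_str` and the sample values are assumptions; this is not the code from the pull request.

```python
import pandas as pd


def float_ids_to_str(series: pd.Series) -> pd.Series:
    """Hypothetical helper: turn float-typed ids into digit strings.

    Missing values are kept as None instead of becoming the text "nan".
    """
    values = [None if pd.isna(v) else str(int(v)) for v in series]
    # dtype=object stops pandas from coercing the result back to a numeric type.
    return pd.Series(values, index=series.index, dtype=object)


# Made-up example column, as it would look after the float coercion.
raw = pd.Series([12345.0, float("nan")])
cleaned = float_ids_to_str(raw)
print(cleaned.tolist())
```

One caveat with any after-the-fact conversion: float64 cannot represent every integer above 2**53, so very large ids may already have lost digits by the time they are floats. Converting to strings before the values reach pandas avoids that.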

I made the necessary change in twitter alchemy: dhudsmith/twitter-alchemy@fa00f93#diff-ea93147253fd85a5596145f5e805a803cdad3416b82c4d75b81a99006f345784

I also changed the requirements file to match: d679581

@NickDeas, I tested and it looks ok. I think we just need to rip out the fix_floats logic to restore compatibility with python >= 3.6

fix_floats and its references have been removed, and the ReadMe has been reverted to the old compatibility. It also worked in my testing.