micahflee/semiphemeral

Allow deleting old tweets by downloading your twitter data

micahflee opened this issue · 4 comments

Similar to the --unlike setting, it would be cool to allow people to delete their old tweets in situations where the Twitter API only returns the most recent 3000 tweets, and refuses to return anything older.

Related to #54, #41, #12

This would be great!

I'm not really familiar with the semiphemeral codebase, but I just ended up writing a script for this myself since it seemed pretty simple: https://github.com/galonsky/delete_twitter_via_archive/blob/main/delete_via_archive.py

I'd be happy to add it to semiphemeral if someone could point me to the right place.

EDIT: oops, didn't mean to post too early.
i looked through the semiphemeral codebase and found a function that already imports tweets by id, but marks the imported tweets as excluded from deletion (the excluded_import function).

Warning: janky temporary solution, use at your own risk

first, clone the repo (i hope you know how to do that)

second, modify tweet.js so python can read it

i tweaked the code to avoid marking the tweets as excluded, changed the first line of tweet.js from window.YTD.tweet.part0 = [ to {"data": [, and appended } to the end of the file, which makes it valid json.
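The manual edit to tweet.js can also be scripted. This is a minimal sketch, assuming the archive file is a single assignment such as window.YTD.tweet.part0 = followed by a JSON array and nothing after it. The names archive_js_to_json and convert_file are hypothetical helpers and are not part of semiphemeral.

```python
import json


def archive_js_to_json(js_text):
    """Turn the text of tweet.js into the {"data": [...]} shape.

    Hypothetical helper, not part of semiphemeral. Assumes the file is one
    JavaScript assignment whose right-hand side is a plain JSON array.
    """
    # Everything before the first "[" is the assignment prefix, so skip it.
    start = js_text.index("[")
    entries = json.loads(js_text[start:])
    return {"data": entries}


def convert_file(src_path, dst_path):
    """Read the archive file and write the converted JSON to a new file."""
    with open(src_path, encoding="utf-8") as src:
        converted = archive_js_to_json(src.read())
    # A separate output file keeps the original archive untouched.
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(converted, dst)
```

Calling convert_file("tweet.js", "tweet.json") would write the converted copy next to the original; any later step would then need to read tweet.json instead of the hand-edited tweet.js.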

third, patch the code (with `git apply`)
diff --git a/semiphemeral/db.py b/semiphemeral/db.py
index 64cfa8d..4e6013a 100644
--- a/semiphemeral/db.py
+++ b/semiphemeral/db.py
@@ -56,7 +56,7 @@ class Tweet(Base):
         self.lang = status.lang
         self.source = status.source
         self.source_url = status.source_url
-        self.text = status.full_text
+        self.text = status.text # full_text as an attribute doesn't exist
         self.in_reply_to_screen_name = status.in_reply_to_screen_name
         self.in_reply_to_status_id = status.in_reply_to_status_id
         self.in_reply_to_user_id = status.in_reply_to_user_id
diff --git a/semiphemeral/import_export.py b/semiphemeral/import_export.py
index 829dfb7..2080a95 100644
--- a/semiphemeral/import_export.py
+++ b/semiphemeral/import_export.py
@@ -1,5 +1,6 @@
 import click
 import json
+import tweepy
 
 from .db import Tweet
 
@@ -50,17 +51,17 @@ class ImportExport:
                 self.common.session.query(Tweet).filter_by(status_id=status_id).first()
             )
             if tweet:
-                tweet.exclude_from_delete = True
                 self.common.session.add(tweet)
                 tweet.excluded_summarize()
             else:
                 try:
                     status = self.twitter.api.get_status(status_id)
                     tweet = Tweet(status)
-                    tweet.exclude_from_delete = True
                     self.common.session.add(tweet)
                     tweet.excluded_fetch_summarize()
                 except tweepy.error.TweepError as e:
-                    click.echo("Error for tweet {}: {}".format(tweet.status_id, e))
+                    click.echo("Error for tweet {}: {}".format(status_id, e))
 
         self.common.session.commit()

some of the changed lines are bug fixes, probably because this code path wasn't fully tested.

then, use this python script to extract the tweet ids from the modified tweet.js for semiphemeral
import json

# tweet.js must already be edited into valid json of the form {"data": [...]}
with open('tweet.js') as file:
    tweets = json.load(file)
tweet_ids = [int(i['tweet']['id']) for i in tweets['data']]
with open('tweet_ids.json', 'w') as file:
    json.dump(tweet_ids, file)

(this is a copy from what I can remember. if it's broken, let me know)
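Since the script is written from memory, it may be worth checking the id file before importing it. This is a rough sketch, assuming the import step expects the same flat JSON list of integer ids that the script writes; load_tweet_ids is a hypothetical helper, not a semiphemeral function.

```python
import json


def load_tweet_ids(path):
    """Load the id file and check that it is a flat JSON list of integers.

    Hypothetical helper, not part of semiphemeral.
    """
    with open(path, encoding="utf-8") as f:
        ids = json.load(f)
    if not isinstance(ids, list) or not all(type(i) is int for i in ids):
        raise ValueError("expected a JSON list of integer tweet ids")
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(ids))
```

For example, len(load_tweet_ids("tweet_ids.json")) gives the number of distinct ids, which can be compared against the tweet count shown in the archive.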

finally, "import" the tweets

run app.py from the repo with the excluded_import command, passing the id file via the --filename argument. example:

python3 app.py excluded_import --filename tweet_ids.json

you can then start deleting using the normal semiphemeral command (because the installed semiphemeral and the copy run from the repo both save their data in the same place).

the solution that I made worked for me, and deleted the old tweets.

EDIT: there seems to be #23, which was an attempt to implement this, but it seems it never got merged.

kees commented

With #23 now merged, I think this bug can be closed. If not, please reopen. :)