shahp7575/urban-dictionary-dataset

Script seems to be out of date.

First of all, thank you very much for your work. It helped me a lot.
Since the last dataset you shared on Kaggle is out of date, I tried to re-run your script to get the latest words.
But I got the error: "line 65, in get_char_pages_url results = raw.find('div', {'class': 'pagination-centered'}).find_all(ch)[-2:] AttributeError: 'NoneType' object has no attribute 'find_all'"

Could you please update your script?

Regarding the issue mentioned above, the following changes are needed to make the script work again:

Replace line #65 with "results = raw.find('div', {'class': 'pagination'}).find_all('a')[0:]"
Replace line #85 with "results = raw.find('ul', {'class': 'mt-3 columns-2 md:columns-3'}).find_all('a')[0:]"
Replace line #107 with "word = raw.find('div', {'class': 'definition'}).find('a', {'class': 'word'}).get_text('\n')"
Replace line #120 with "with open(data_file_path, 'a', encoding="utf-8") as data_file:"
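
For anyone wondering why the line #120 change matters: without an explicit encoding, Python's open() uses the platform's default, which on some systems (Windows in particular) cannot represent every character that appears in the definitions. Below is a minimal sketch of the append-and-read-back pattern. The sample row and the temporary path are made up for illustration and are not taken from the script; only the open(..., 'a', encoding="utf-8") call mirrors the suggested fix.

```python
import os
import tempfile

# Hypothetical sample row; the real script's row format may differ.
row = "café|a place that serves coffee and naïve opinions\n"

# Hypothetical path standing in for the script's data_file_path.
data_file_path = os.path.join(tempfile.mkdtemp(), "data.csv")

# Append mode ('a') as in the suggested fix, with an explicit encoding so the
# result does not depend on the platform's default encoding.
for _ in range(2):
    with open(data_file_path, "a", encoding="utf-8") as data_file:
        data_file.write(row)

# Read back with the same encoding to confirm the text round-trips unchanged.
with open(data_file_path, encoding="utf-8") as data_file:
    lines = data_file.readlines()

print(lines)
```

Reading the file later (for example when loading the dataset) should use the same encoding="utf-8" argument, otherwise the non-ASCII characters may come back garbled.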