ewenme/transfers

Let `club_involved_name` be easily linkable


First of all, thanks a lot for this incredible dataset.
However, I found a small flaw in it: the `club_involved_name` column contains club names as written in the text of the corresponding Transfermarkt entry, and these names are often inconsistent with the names in `club_name`. Having the same names in both columns would ease the analysis of the data, e.g., by making it possible to join the involved club with its own league in order to study flows between leagues.
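To illustrate the kind of analysis I mean, here is a minimal sketch in plain Python. The rows are made up, and `league_name` is a column name I am assuming; only `club_name` and `club_involved_name` are taken from the description above. The last row shows what happens today when the two spellings differ: the lookup fails and the flow cannot be attributed to a league.

```python
from collections import Counter

# Made-up rows for illustration; "league_name" is an assumed column name.
transfers = [
    {"club_name": "Arsenal FC", "league_name": "Premier League",
     "club_involved_name": "Juventus FC"},
    {"club_name": "Juventus FC", "league_name": "Serie A",
     "club_involved_name": "Arsenal FC"},
    # Inconsistent spelling: "Juventus" instead of "Juventus FC".
    {"club_name": "Arsenal FC", "league_name": "Premier League",
     "club_involved_name": "Juventus"},
]


def league_flows(rows):
    """Count transfers per (league, involved club's league) pair."""
    # Map every club to its league using the consistent club_name column.
    club_to_league = {r["club_name"]: r["league_name"] for r in rows}
    flows = Counter()
    for r in rows:
        # The lookup returns None when the involved club's name has no
        # exact match in club_name.
        other_league = club_to_league.get(r["club_involved_name"])
        flows[(r["league_name"], other_league)] += 1
    return flows


flows = league_flows(transfers)
for pair, count in flows.items():
    print(pair, count)
```

With consistent names, the `None` bucket would only contain clubs that are genuinely outside the leagues covered by the dataset.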
On Transfermarkt, the name to use is the `title` attribute of the very same `<a>` HTML tag, so this should be an easy fix. I'd love to help with a pull request, but I had a look at the source code and R is out of my league. In the future I might propose a Python alternative to scrape the data.
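As a rough idea of what I mean, here is a sketch using only Python's standard library. The HTML snippet is a simplified, hypothetical stand-in for a Transfermarkt table cell, not copied from the site, and `AnchorTitleParser` and `anchor_titles` are names I made up. It collects both the `title` attribute and the visible link text so the two can be compared.

```python
from html.parser import HTMLParser


class AnchorTitleParser(HTMLParser):
    """Collect (title attribute, visible text) for every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._in_a = False
        self._title = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_a = True
            # attrs is a list of (name, value) pairs.
            self._title = dict(attrs).get("title")
            self._text = []

    def handle_data(self, data):
        if self._in_a:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_a:
            self.links.append((self._title, "".join(self._text).strip()))
            self._in_a = False


def anchor_titles(html):
    parser = AnchorTitleParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup: short name as link text, full name in title.
snippet = '<td><a title="Juventus FC" href="#">Juventus</a></td>'
links = anchor_titles(snippet)
print(links)
```

The first element of each pair is the name I would suggest storing in `club_involved_name`.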
Again, congratulations on such a useful repo.