
Code for constructing MojimLyrics, a dataset of lyrics from popular Chinese artists.

Primary language: Python. License: MIT.

Mojim Popular Lyrics

A dataset of lyrics for popular Chinese songs, obtained from the website Mojim.com.

| Songs  | Artists | Tokens     | Size    |
|--------|---------|------------|---------|
| 39,747 | 230     | 14,406,854 | 38.4 MB |

The data is provided in the data folder. It is a subset of song lyrics from popular artists, and is provided for the purpose of natural language research under fair use.

If you want to reconstitute the dataset using the provided script, please be considerate of Mojim.com: keep the request rate low and avoid using too much bandwidth.
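One way to stay considerate is to pause between requests and never fetch the same page twice. The following is a minimal sketch of that idea, not the code used by the provided script: `polite_fetch`, its `fetch` and `sleep` parameters, and the 2-second default delay are all hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, Dict, Iterable


def polite_fetch(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 2.0,  # assumed default; tune to be gentle on the site
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Fetch each distinct URL once, pausing between consecutive requests.

    `fetch` is any function that downloads one URL and returns its text.
    It is passed in so the throttling logic stays independent of the
    HTTP library and can be exercised without network access.
    """
    results: Dict[str, str] = {}
    for url in urls:
        if url in results:
            continue  # skip duplicates so no page is requested twice
        if results:
            sleep(delay_seconds)  # wait before every request after the first
        results[url] = fetch(url)
    return results
```

Any downloader can be supplied as `fetch`, for example a small wrapper around `urllib.request.urlopen` from the standard library. Keeping the delay as a parameter makes it easy to slow the crawl further if needed.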

Citation

If this dataset is useful to your research, you can cite it as follows (a citation for the published paper is forthcoming):

@article{crothers2023bloom,
  title={In BLOOM: Creativity and Affinity in Artificial Lyrics and Art},
  author={Crothers, Evan and Viktor, Herna and Japkowicz, Nathalie},
  journal={arXiv preprint arXiv:2301.05402},
  year={2023}
}