Chinese-Poetry-and-Cultural-Sentiment-Dataset

The dataset for Chinese Poetry and Cultural Sentiment.

Content

How We Made this Dataset

Chinese museums hold relatively few sculptures from the Wei, Jin, Southern and Northern, and Sui dynasties, and the information on their official websites is insufficient for our research. To address this, we crawled the museum websites, used the collected data to query wiki pages, and had ChatGPT organize the search results into text that met our needs. We then fed ChatGPT's output back in as search input and again used ChatGPT to retrieve and generate the final text. The sculpture names that appear in this output are contemporaneous with the artifact described in the same text, so we used these names for further iterative searches and ultimately obtained the dataset.
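The iterative search described above can be read as a breadth-first loop over sculpture names. The sketch below is one possible reading, not the code we actually ran: `search` and `summarize` are hypothetical callables standing in for the wiki query and the ChatGPT step, `max_rounds` is an assumed cap on how many times output names are fed back in, and numbering the entries from 0 is also an assumption.

```python
from collections import deque
from typing import Callable, Iterable


def iterative_collect(
    seed_names: Iterable[str],
    search: Callable[[str], str],
    summarize: Callable[[str, str], tuple[str, list[str]]],
    max_rounds: int = 2,
) -> list[dict]:
    """Collect descriptions by re-searching on names found in earlier output.

    search(name): hypothetical helper returning raw reference text for a name.
    summarize(name, raw): hypothetical helper (the ChatGPT step) returning
    (description, related_names).
    """
    seen: set[str] = set()
    entries: list[dict] = []
    # Each queue item is (name, round in which the name was discovered).
    queue = deque((name, 0) for name in seed_names)

    while queue:
        name, depth = queue.popleft()
        if name in seen:
            continue  # skip names that were already processed
        seen.add(name)

        raw = search(name)
        description, related = summarize(name, raw)
        # Assumed numbering: ids start at 0 in collection order.
        entries.append({"id": len(entries), "title": name, "text": description})

        # Feed newly produced names back in, up to the assumed round limit.
        if depth < max_rounds:
            queue.extend((r, depth + 1) for r in related if r not in seen)

    return entries
```

Tracking visited names keeps a name from being searched twice, but it does not prevent two different names from producing the same description, which is one way duplicates can still end up in the result.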

How To Use this Dataset

The data is saved in the ‘dataset’ folder as ‘data.json’. Each entry in this file describes one sculpture artifact: ‘id’ is the assigned number, ‘title’ is the name of the cultural relic, and ‘text’ is its description.
Note that because the answers ChatGPT returned during the iterative search were not controlled, the dataset contains some duplicate entries as well as descriptions that do not pertain to sculpture relics.
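A minimal sketch of loading the file and dropping exact duplicates follows. It assumes ‘data.json’ is a JSON array of objects with the ‘id’, ‘title’ and ‘text’ keys described above; the function names are illustrative and not part of this repository.

```python
import json
from pathlib import Path


def load_entries(path):
    """Read the JSON file and return its entries (assumed to be a list of dicts)."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def drop_duplicates(entries):
    """Keep the first entry for each (title, text) pair, preserving order."""
    seen = set()
    unique = []
    for entry in entries:
        # Compare on whitespace-trimmed title and description.
        key = (entry["title"].strip(), entry["text"].strip())
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique
```

For example, `drop_duplicates(load_entries("dataset/data.json"))` returns the entries with repeated title and text pairs removed. This only catches exact repeats; descriptions that do not pertain to sculpture relics still need to be filtered by hand or by a separate check.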

Project We Reference

We gratefully used the dataset from the GitHub project chinese-poetry to access Tang poetry and Song lyrics.
Our sincere thanks go to its maintainers.