Repository to hold the code for our UMICH SIADS Master of Applied Data Science Capstone project, with a focus on fact checking OpenAI's GPT-3.
- Git clone this repository
- Create a new Python or Anaconda environment for this project
- Run `pip install -r requirements.txt`
This will install:
- Pandas (Operate on tabular data)
- TQDM (prints progress bars)
- Wikidata (structured queries to Wikidata; we also send SPARQL queries with the Python `Requests` package)
- OpenAI (GPT-3 interface, among other things)
- Seaborn (Visualization Package)
- Plotly (Visualization Package)
- python-dotenv (Secrets management)
- Pinned versions of common third-party Python 3 packages (Requests, matplotlib)
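
The SPARQL queries mentioned in the package list can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the example query, the helper names `run_sparql` and `flatten_bindings`, and the User-Agent string are assumptions made for this sketch. The endpoint is the public Wikidata Query Service, and `P225` is Wikidata's "taxon name" property.

```python
from typing import Any, Dict, List

# Public Wikidata Query Service endpoint.
WIKIDATA_SPARQL_URL = "https://query.wikidata.org/sparql"

# Hypothetical example query: five items that have a taxon name (P225).
EXAMPLE_QUERY = """
SELECT ?item ?taxonName WHERE {
  ?item wdt:P225 ?taxonName .
}
LIMIT 5
"""


def run_sparql(query: str, timeout: float = 60.0) -> Dict[str, Any]:
    """Send a SPARQL query to Wikidata and return the decoded JSON response."""
    # Requests is a third-party package; it is imported here so the parsing
    # helper below can be used without it.
    import requests

    response = requests.get(
        WIKIDATA_SPARQL_URL,
        params={"query": query, "format": "json"},
        # Assumed placeholder; replace with a descriptive User-Agent of your own.
        headers={"User-Agent": "capstone-example/0.1 (illustrative sketch)"},
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()


def flatten_bindings(payload: Dict[str, Any]) -> List[Dict[str, str]]:
    """Turn SPARQL JSON results into a list of {variable: value} rows."""
    rows = []
    for binding in payload["results"]["bindings"]:
        rows.append({name: cell["value"] for name, cell in binding.items()})
    return rows


# Usage (requires network access):
# rows = flatten_bindings(run_sparql(EXAMPLE_QUERY))
```

The flattened rows are plain dictionaries, so they can be passed directly to `pandas.DataFrame(rows)` for tabular work.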
To replicate this analysis, open the Notebooks folder and step through the numbered notebooks in order. The notebooks are numbered because later stages build on data retrieved in earlier stages, and exploratory data analysis (EDA) is kept separate from the long-running data-gathering code.
- Code wholly generated by this team is released under the CC-0 License ("Universal Public Domain")
- Other code used in this project that is released under a specific license can be reviewed in the `LICENSE` folder.
- Taxonomic Wikidata data was used in the creation of the prompts for this project. The data is available under CC-0 license, as per the Wikidata Data Access Policy.
- OpenAI GPT 3.5 was used in the creation of the responses. The use thereof is in accordance with the OpenAI Sharing & Publication Policy.