
Fact checking OpenAI's GPT-3 with structured queries from the Wikidata Taxonomy


Ground Truthing GPT3.5

This repository holds the code for our UMICH SIADS Master of Applied Data Science capstone project, which focuses on fact-checking OpenAI's GPT-3.

[Image: project poster]

Setup

  1. Clone this repository with Git
  2. Create a new Python or Anaconda environment for this project
  3. Run pip install -r requirements.txt

This installs:

  • pandas (operations on tabular data)
  • tqdm (progress bars)
  • Wikidata (structured queries to Wikidata; we also send SPARQL queries with the third-party Requests package)
  • OpenAI (GPT-3 interface, among other things)
  • Seaborn (visualization)
  • Plotly (visualization)
  • python-dotenv (secrets management)
  • Specific versions of Requests and Matplotlib (third-party packages, not Python built-ins)
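
As a rough illustration of the SPARQL-over-Requests approach mentioned in the list above, the sketch below builds a query and sends it to the public Wikidata Query Service. It is not taken from the project notebooks: the function names, the example class ID (Q146, house cat), and the User-Agent string are all assumptions made for this example.

```python
from __future__ import annotations

from typing import Any

# Public Wikidata Query Service endpoint.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_instance_query(class_qid: str, limit: int = 10) -> str:
    """Build a SPARQL query listing items that are an instance of (P31) class_qid.

    Hypothetical helper for illustration; not part of this repository.
    """
    # Validate the ID so arbitrary text cannot be spliced into the query.
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {class_qid!r}")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def run_sparql(query: str, timeout: float = 30.0) -> list[dict[str, Any]]:
    """Send the query to Wikidata and return the result bindings."""
    # Imported here so the query builder works without Requests installed.
    import requests

    response = requests.get(
        WIKIDATA_SPARQL_ENDPOINT,
        params={"query": query, "format": "json"},
        # Placeholder User-Agent; replace with your own contact details.
        headers={"User-Agent": "ground-truthing-example/0.1 (hypothetical)"},
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()["results"]["bindings"]
```

Calling `run_sparql(build_instance_query("Q146", limit=5))` would request up to five items that are instances of Q146. Keeping query construction separate from the network call makes the query text easy to inspect before anything is sent.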

Running the code

To replicate this analysis, open the Notebooks folder and step through the numbered notebooks in order. The numbering matters because later stages build on data retrieved in earlier stages, and exploratory data analysis (EDA) is kept separate from the long-running data-gathering code.
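
One common way to let later stages reuse data gathered by earlier ones is to cache each stage's output on disk. The sketch below shows that general pattern only; it is not necessarily how the notebooks are implemented. The `load_or_fetch` helper and the JSON cache format are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Callable


def load_or_fetch(cache_path: Path, fetch: Callable[[], list]) -> list:
    """Return cached records if present; otherwise fetch and cache them.

    Hypothetical helper: `fetch` stands in for a long-running
    data-gathering step, such as a batch of API calls.
    """
    if cache_path.exists():
        # A previous run already gathered this data, so reuse it.
        return json.loads(cache_path.read_text(encoding="utf-8"))
    records = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(records), encoding="utf-8")
    return records
```

With a helper like this, an EDA notebook could call `load_or_fetch(Path("data/stage1.json"), gather_stage1)` (both names hypothetical) and would only trigger the slow gathering step when the cache file is missing.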

Licenses

  • Code written entirely by this team is released under the CC0 1.0 Universal license (public domain dedication).
  • Other code used in this project is released under its own license; those licenses can be reviewed in the LICENSE folder.

Data Access Statement

[Image: data access statement]