This project analyzes ChatGPT conversations to extract and refine prompts, providing insights into common themes and patterns in user queries. It uses OpenAI's GPT-4o model to process the conversations and generate reusable prompts.
- Extracts questions from ChatGPT conversation JSON files
- Analyzes questions to identify common themes and patterns
- Generates reusable prompts based on the analysis
- Aggregates and refines prompts to create a concise set of high-quality prompts
- Uses asynchronous processing for improved performance
- Implements robust error handling and logging
- Python 3.7+
- Poetry for dependency management
-
Clone the repository:
git clone https://github.com/yourusername/chickadee.git cd chickadee
-
Install dependencies using Poetry:
poetry install
-
Set up your OpenAI API key as an environment variable:
export OPENAI_API_KEY='your-api-key-here'
The main script (chickadee.py
) uses the following configuration:
- OpenAI model: gpt-4o
- Max tokens per batch: 6000
- Input file:
test.json
(ChatGPT conversation export) - Output files:
refined_prompts.txt
: Contains the refined promptsrefinement_analysis.txt
: Contains the analysis of the refinement process
You can modify these settings in the main()
function of the script.
-
Ensure your ChatGPT conversation export (in JSON format) is in the same directory as the script, named
test.json
. -
Run the script using Poetry:
poetry run python chickadee.py
-
The script will process the conversations, analyze the questions, and generate refined prompts. Progress and results will be logged to the console.
-
After completion, check the
refined_prompts.txt
andrefinement_analysis.txt
files for the results.
chickadee.py
: Main script containing the conversation analysis logictest.json
: Input file containing ChatGPT conversations (not included in the repository)refined_prompts.txt
: Output file containing the refined promptsrefinement_analysis.txt
: Output file containing the analysis of the refinement processpyproject.toml
: Poetry configuration file for managing dependencies
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for providing the GPT-4 model
- The Instructor library for structured outputs from language models
- Logfire for efficient logging