This repository contains code for a tool named datesim, which processes images containing text (such as screenshots of conversations) and simulates conversations based on the extracted text using AI models.
- Image Processing: Upload images containing text (e.g., screenshots of WhatsApp conversations).
- Text Extraction: Extract text from the uploaded images using Tesseract OCR (Optical Character Recognition).
- Conversation Simulation: Simulate conversations based on the extracted text using AI models from OpenAI.
- Database Integration: Store formatted conversation data in a SQLite database.
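
The database feature above can be pictured with a minimal sketch using Python's standard `sqlite3` module. The table name `messages`, its columns, and both helper functions are hypothetical, chosen only for illustration; the schema datesim actually uses may differ.

```python
import sqlite3


def store_conversation(conn, username, messages):
    """Insert (speaker, text) pairs for one user and return the row count.

    The `messages` table layout is an assumption made for this sketch.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            username TEXT NOT NULL,
            speaker TEXT NOT NULL,
            text TEXT NOT NULL
        )
        """
    )
    rows = [(username, speaker, text) for speaker, text in messages]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO messages (username, speaker, text) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def load_conversation(conn, username):
    """Return the stored (speaker, text) pairs for a user in insertion order."""
    cur = conn.execute(
        "SELECT speaker, text FROM messages WHERE username = ? ORDER BY id",
        (username,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
    store_conversation(conn, "alice", [("Alice", "Hi!"), ("Bob", "Hey, how are you?")])
    print(load_conversation(conn, "alice"))
```

Parameterized queries (`?` placeholders) keep OCR output, which can contain arbitrary characters, from being interpreted as SQL.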
- Python 3.11
- Tesseract OCR installed
- OpenAI API key
- Together API key
- Dependencies listed in `requirements.txt`
- Clone the repository: `git clone https://github.com/your-username/datesim.git`
- Install dependencies: `pip install -r requirements.txt`
- Set up environment variables:
  - Create a `.env` file in the project directory.
  - Add your OpenAI API key and Together API key to the `.env` file, one variable per line: `OPENAI_API_KEY=your-openai-api-key` and `TOGETHER_API_KEY=your-together-api-key`
- Update the path to the Tesseract executable (`pytesseract.pytesseract.tesseract_cmd`) if necessary.
- Run the application: `streamlit run app.py`
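
Before launching, it can help to confirm that both API keys are set and that Tesseract is discoverable. The following is a standard-library sketch of such a check, not part of datesim itself: `parse_env_file`, `missing_keys`, and `find_tesseract` are hypothetical helpers, and the simple `KEY=VALUE` parser is only a stand-in for whatever mechanism the app uses to read `.env`.

```python
import os
import shutil

REQUIRED_KEYS = ("OPENAI_API_KEY", "TOGETHER_API_KEY")


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env):
    """Return the required keys that are absent or empty in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def find_tesseract():
    """Return the Tesseract executable found on PATH, or None."""
    return shutil.which("tesseract")


if __name__ == "__main__":
    env = {}
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as fh:
            env.update(parse_env_file(fh.read()))
    env.update(os.environ)  # variables already exported in the shell take precedence

    print("Missing keys:", missing_keys(env) or "none")
    path = find_tesseract()
    print("Tesseract executable:", path or "not found on PATH")
    # If Tesseract lives outside PATH, point pytesseract at it explicitly:
    #   pytesseract.pytesseract.tesseract_cmd = "<full path to the executable>"
```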
- Launch the application.
- Enter your username.
- Upload images containing text.
- Click "Process Images" to extract text and simulate conversations.
- Then, click "Run Simulation" to simulate conversations.
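
The text-extraction part of the "Process Images" step can be sketched roughly as follows. This is an illustration under assumptions rather than the code in `app.py`: `extract_text` and `_tesseract_ocr` are hypothetical names, and the sketch presumes `pytesseract` and Pillow are among the installed dependencies. The OCR backend is passed in as a parameter so the function can be tried without Tesseract present.

```python
def _tesseract_ocr(path):
    """Run Tesseract on one image file and return the raw recognized text."""
    # Imported lazily so the module loads even where Tesseract is not installed.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_string(img)


def extract_text(image_paths, ocr=_tesseract_ocr):
    """Return a dict mapping each image path to its whitespace-trimmed text.

    `ocr` is any callable taking a path and returning a string; it defaults
    to the Tesseract-backed helper above.
    """
    return {path: ocr(path).strip() for path in image_paths}
```

Calling `extract_text(["chat1.png", "chat2.png"])` would run Tesseract on each file, where the file names are placeholders. Supplying a different `ocr` callable makes it possible to swap in another engine or a stub for testing.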
Contributions are welcome! If you'd like to contribute to this project, please fork the repository.