Welcome to the Laravel LLM Dataset Explorer with Gemini AI! This project demonstrates how you can leverage the power of Large Language Models (LLMs), specifically Gemini AI, to interact with datasets in a natural, conversational way. By combining Laravel, Gemini AI, and CoreNLP, we enable intelligent querying and dynamic responses from structured data (such as a book dataset).
This project showcases how LLMs can transform structured datasets into interactive experiences where users can query and explore data via natural language, just like chatting with a human. Whether you're interacting with books, movies, or any other dataset, this is one example of how AI and data can converge to create powerful conversational tools.
- Gemini AI-Powered Chatbot: Uses Gemini AI to process and respond to natural language queries about the dataset.
- Dynamic Dataset Interaction: Query your dataset with plain text and get intelligent, context-aware responses.
- Context-Aware Conversations: AI understands the context and intent of your queries to retrieve the most relevant data.
- CoreNLP Integration: Enhances the model's ability to understand complex user queries and improves data extraction.
- Conversation Management: Save and reset conversation history with the chatbot to refine your testing and interactions.
- Docker & Laravel Sail: Built on Laravel Sail for a seamless local development environment using Docker.
Start by cloning this repository to your local machine:
```
git clone https://github.com/your-repo/laravel-llm-dataset-explorer.git
cd laravel-llm-dataset-explorer
```
- PHP & Composer: Install the required PHP packages.
```
composer install
```
Copy `.env.example` to `.env` and update the necessary configuration values, such as your Gemini AI API key and dataset settings.

```
cp .env.example .env
```

Make sure to set the following variables in your `.env` file:
- `GEMINI_API_KEY`: Your Gemini AI API key.
- `GEMINI_BASE_URL`: The base URL for the Gemini API.
- `GEMINI_REQUEST_TIMEOUT`: Request timeout for the Gemini API (in seconds).
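For example (placeholder values only; the base URL shown is the public Gemini REST endpoint and may differ from the one this project expects):

```
GEMINI_API_KEY=your-api-key-here
GEMINI_BASE_URL=https://generativelanguage.googleapis.com
GEMINI_REQUEST_TIMEOUT=30
```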
Ensure that Docker Desktop is running, and then start the Laravel Sail development environment:
```
./vendor/bin/sail up
```
This will start all the necessary Docker containers for your application, and you can access it locally (at http://localhost by default).
Set up your database by running the Laravel migrations (through Sail, so they execute inside the container):

```
./vendor/bin/sail artisan migrate
```
If you want to seed the database with a sample user for the login, you can run:

```
./vendor/bin/sail artisan db:seed
```
- Handle Query: `POST /api/chat`

  - Request Body:

    ```json
    {
      "query": "Hi, there! What are you designed to do?"
    }
    ```

  - Response:

    ```json
    {
      "response": {
        "text": "I am designed to help you explore and query a dataset of books. I can perform the following tasks:\n\n* **Find books** based on criteria like author, title, language, year, or other details. For example, you could ask me \"Find books written by Giacomo Leopardi\" or \"List all books published in 1922\".\n* **Count books** that match specific criteria. You might ask \"How many books are in the dataset written in French?\" or \"How many books were published before 1900?\".\n* **Reset our conversation** if you want to start over.\n\nEssentially, I'm your personal book dataset assistant! Let me know what you'd like to explore.\n",
        "source": "Laravel LLM ChatBot",
        "timestamp": "2015-10-26T12:00:00Z"
      }
    }
    ```
- User: "How many books do you have in your dataset?"
- ChatBot: "There are 100 books matching your criteria."
- User: "And how many of those are written in Italian?"
- ChatBot: "There are 5 books matching your criteria."
- User: "Show me Italian books under 500 pages."
- ChatBot:

  ```
  Here are some books that match your criteria:
  Title: Poems, Author: Giacomo Leopardi, Country: Italy, Language: Italian, Year: 1818, Pages: 184, ImageLink: images/poems-giacomo-leopardi.jpg, Link:

  Title: Confessions of Zeno, Author: Italo Svevo, Country: Italy, Language: Italian, Year: 1923, Pages: 412, ImageLink: images/confessions-of-zeno.jpg, Link: https://en.wikipedia.org/wiki/Zeno%27s_Conscience
  ```

- User: "Please delete our conversation history."
- ChatBot: "Conversation history has been reset."
- ChatBotService: Handles user queries and interacts with Gemini AI. It processes requests and uses the BookService to return dataset interaction responses.
- BookService: Manages the dataset of books, determines query intent with CoreNLP, and handles the filtering and retrieval of data based on user queries, such as "Italian books" or "books from the 14th century".
- API Controller: Handles incoming queries via the API endpoint. It uses the ChatBotService to interpret queries and provide relevant responses based on the dataset.
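As a rough illustration of how these pieces fit together, a minimal controller might look like the following sketch (class and method names are illustrative, not necessarily the repo's actual ones):

```php
<?php

namespace App\Http\Controllers;

use App\Services\ChatBotService;
use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;

class ChatBotController extends Controller
{
    public function __construct(private ChatBotService $chatBot)
    {
    }

    // Handles POST /api/chat: validates the query, delegates to the
    // chatbot service, and wraps the reply in the API's response shape.
    public function handleQuery(Request $request): JsonResponse
    {
        $validated = $request->validate([
            'query' => 'required|string|max:1000',
        ]);

        $reply = $this->chatBot->reply($validated['query']);

        return response()->json(['response' => $reply]);
    }
}
```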
Make sure to set these environment variables in your `.env` file:

- `CORE_NLP_SERVER_URL`: The endpoint for the CoreNLP server.
- `CORE_NLP_SERVER_PORT`: The port for the CoreNLP server.
- `GEMINI_API_KEY`: Your Gemini API key for LLM-powered responses.
- `GEMINI_BASE_URL`: The endpoint for accessing the Gemini AI services.
- `GEMINI_REQUEST_TIMEOUT`: Timeout for API requests (in seconds).
- `config/chatBot.php`: Contains chatbot instructions and function schemas.
- `config/gemini.php`: Configuration for Gemini AI API integration.
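A minimal sketch of how `config/gemini.php` might map those environment variables (key names are assumptions; the repo's actual config may differ):

```php
<?php

// config/gemini.php: reads the Gemini settings from .env with a default timeout.
return [
    'api_key'         => env('GEMINI_API_KEY'),
    'base_url'        => env('GEMINI_BASE_URL'),
    'request_timeout' => (int) env('GEMINI_REQUEST_TIMEOUT', 30),
];
```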
This project demonstrates the use of Gemini AI to interact with a structured dataset (books, in this case). The system leverages advanced technologies to interpret natural language queries and deliver relevant, human-like responses.
- CoreNLP: Analyzes and extracts meaning from user inputs, enabling precise filtering and querying of the dataset.
- Gemini AI with Function Calling: Dynamically invokes predefined functions when specific intents are detected in the user's query.
We implemented Gemini AI's function calling to handle structured queries. When a specific function is triggered (e.g., retrieving book details), the response is fixed and deterministic (a schema sketch follows the list below), ensuring:
- Cost-Efficiency: By using concise, predefined responses, we optimize token usage and reduce API costs.
- Predictability: The returned information is precise and contextually relevant, minimizing unexpected results.
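A minimal sketch of what a function declaration for `findBooks()` could look like in `config/chatBot.php` (the structure follows Gemini's function-calling schema; the specific filter parameters are assumptions, not necessarily the repo's):

```php
<?php

// config/chatBot.php (excerpt): one function declaration the model may invoke.
return [
    'functions' => [
        [
            'name'        => 'findBooks',
            'description' => 'Find books in the dataset matching the given filters.',
            'parameters'  => [
                'type'       => 'object',
                'properties' => [
                    'author'   => ['type' => 'string'],
                    'language' => ['type' => 'string'],
                    'year'     => ['type' => 'integer'],
                    'maxPages' => ['type' => 'integer'],
                ],
            ],
        ],
    ],
];
```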
To further optimize token usage, the results from `findBooks()` function calls are not re-processed by Gemini AI into a natural language answer. Additionally, these results are not saved in the conversation history (context). This design choice helps keep token usage within the Gemini AI free tier (a sketch of the dispatch flow follows the list below). However, for production:
- Result Optimization: Responses can be re-prompted to Gemini AI for a more natural language refinement.
- Efficient Context Management: Proper history context handling should be implemented to balance conversational continuity with cost efficiency.
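That dispatch might look roughly like this hypothetical excerpt from the ChatBotService (the response shape follows Gemini's REST API; `countBooks` is an assumed sibling function):

```php
// When Gemini replies with a functionCall part, route it to a local
// method and return the result directly, skipping a second LLM round-trip.
$part = $geminiResponse['candidates'][0]['content']['parts'][0] ?? [];

if (isset($part['functionCall'])) {
    $call = $part['functionCall'];

    $result = match ($call['name']) {
        'findBooks'  => $this->bookService->findBooks($call['args'] ?? []),
        'countBooks' => $this->bookService->countBooks($call['args'] ?? []),
        default      => null,
    };

    // Deliberately NOT appended to the conversation history and NOT sent
    // back to Gemini for rephrasing: this is the token-saving shortcut.
    return $result;
}
```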
This setup makes Gemini AI an effective bridge between human-like conversation and structured datasets, maintaining balance between dynamic interactivity and practical cost management.
This project is a proof of concept and a study prototype. While it showcases the potential of LLMs in dataset interaction, it is not optimized for production use.
- Database Integration: Replace the JSON-based dataset with Laravel's database system and Elasticsearch for robust and scalable querying.
- Caching Mechanisms: Use caching to optimize response times and reduce computational costs (a minimal sketch follows this list).
- Optimized Context Management: Currently, the results from `findBooks()` are not saved in the conversation context history, which avoids rapid token consumption under the Gemini AI free tier. For production, responses could be re-prompted to Gemini AI for natural language refinement, with proper history handling to balance conversational continuity and cost efficiency.
- Security Measures: Implement strong authentication, request validation, and data sanitization mechanisms.
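As one concrete direction for the caching item above, Laravel's cache could memoize deterministic function-call results (a hypothetical sketch; the key scheme and TTL are arbitrary choices):

```php
use Illuminate\Support\Facades\Cache;

// Cache findBooks() results for an hour, keyed by the normalized filters,
// so repeated queries skip the dataset scan entirely.
$books = Cache::remember(
    'books:'.md5(json_encode($filters)),
    now()->addHour(),
    fn () => $this->bookService->findBooks($filters)
);
```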
This project serves as a valuable starting point for exploring how conversational AI can interact with structured datasets. Use it as a foundation to build a production-ready solution tailored to your needs.
This project is licensed under the MIT License; see the LICENSE file for details.