Eloquent is a sophisticated, locally-hosted frontend for open-weight Large Language Models, designed around a dual-GPU architecture, advanced memory systems, and multi-modal AI interaction. It provides a powerful and private environment for interacting with, testing, and managing large language models.
Eloquent is designed to leverage a dual-GPU setup for optimal performance. One GPU is dedicated to primary model inference, while the second GPU handles memory-related tasks, ensuring smooth and responsive interactions even with large models.
The application features a persistent memory system that analyzes conversations and stores relevant information about user preferences, expertise, and context. This allows for truly personalized AI responses that improve over time.
-
Customizable AI Personas: Create and customize AI characters with detailed backgrounds, personalities, and speech patterns.

-
Dynamic Conversation Management: Enjoy features like auto-generated chat titles, editable messages, and a full conversation history.
-
World Lore Integration: Build rich, consistent fictional worlds with keyword-triggered contextual injection from a custom lore book.
Eloquent includes an advanced Auto-Character Creator — a tool that streamlines the process of creating AI characters through natural dialogue.
- Have a conversation with an AI about your new character idea.
- When you're ready, click the Auto-Generate button.
- Eloquent sends a prompt to the model, asking it to fill out a detailed character JSON structure.
- The response is parsed and rendered into a fully editable character sheet, complete with persona, dialogue style, memory usage, and visual portrait options.
This system enables rapid iteration and experimentation with new AI personas.
- Upload and parse a wide range of document types (PDF, DOCX, TXT, etc.).
- Eloquent intelligently chunks and embeds document content for efficient retrieval.
- Relevant document chunks are automatically injected into the conversation context.
- Single Model Testing: Evaluate the performance of individual models with automated LLM judges providing scores and feedback.
- Comparison Testing: Pit models against each other in head-to-head competitions with ELO rating calculations.
- Advanced ELO Algorithm: The system uses a category-weighted, performance-scaled rating system to provide a nuanced view of model capabilities.
- Image Generation: Generate images directly within the chat interface using either AUTOMATIC1111 or the built-in "EloDiffusion" engine powered by stable-diffusion.cpp.
- ADetailer Integration: Automatically enhance generated images with face and detail correction using ADetailer.
- Voice-to-Text and Text-to-Speech: Interact with the AI using your voice and receive spoken responses.
-
Focus Mode: A clean, distraction-free chat interface for immersive conversations.

-
Call Mode: An audio-only interaction mode for a more natural, hands-free experience.

- A Windows operating system.
- Node.js version 21.7.3 is required for guaranteed compatibility.
- An NVIDIA GPU is highly recommended for full feature support.
- Clone or download the repository to your local machine.
- Run the
install.batscript located in the root directory. This will install all necessary Python and Node.js dependencies. - Once the installation is complete, run the
run.batscript to launch the Eloquent application.
To enable Call Mode (real-time voice input and speech playback), Eloquent uses NVIDIA’s nemo-toolkit.
Due to complex dependencies, it must be installed after the main setup:
pip install nemo-toolkit==2.3.1While Eloquent was developed on Windows, the core technologies (Python, FastAPI, React) are cross-platform. However, running it on other operating systems requires manual setup.
- Compatibility: High
- Notes:
- Eloquent can run very effectively on Linux with an NVIDIA GPU, but requires manual setup.
- The
install.batandrun.batscripts are for Windows only. You must create equivalent.shshell scripts to perform the installation and launch steps. - You will need to manually install system-level dependencies like
espeak-ngfor the TTS service:sudo apt-get install espeak-ng
- Full functionality assumes an NVIDIA GPU and the appropriate CUDA drivers.
- Compatibility: Low / Not Recommended
- Limitations:
- The application is designed for a dual-GPU setup using NVIDIA's CUDA platform. Most Macs, especially modern ones with Apple Silicon (M1/M2/M3), do not have NVIDIA GPUs.
- The application would likely fall back to a very slow CPU-only mode, defeating its primary purpose.
- macOS cannot run
.batfiles. - Manual installation of dependencies like
espeak-ng(via Homebrew) would be required.
Eloquent is built with a React frontend and a Python backend using FastAPI. Here's a breakdown of the key components:
- AppContext.jsx: The core of the frontend, this context provider manages the application's state, including models, conversations, settings, and API interactions.
- Chat.jsx: The main chat interface component, responsible for displaying messages, handling user input, and integrating various features like character selection, RAG, and multi-modal interactions.
- ModelTester.jsx: This component provides the interface for the ELO testing system, allowing users to set up and run single-model or comparison tests.
- SimpleChatImageButton.jsx: Manages the image generation dialog and ADetailer integration.
- FocusModeOverlay.jsx & CallModeOverlay.jsx: These components provide the specialized interfaces for the Focus and Call modes.
- main.py: The main FastAPI application file, defining API endpoints and managing the application's lifecycle.
- inference.py: Handles the core text generation logic, providing a unified interface for interacting with different model backends.
- memory_intelligence.py: The heart of the memory system, responsible for semantic retrieval, creation, and curation of memories.
- tts_service.py: This service provides text-to-speech functionality, supporting multiple TTS engines.


