This API is a starting point for building AI chat applications! It's built with TypeScript and NestJS.
- Ensure Docker is running on your machine
- Copy `.env.dist` to `.env` and set API keys as needed
- Install dependencies: `yarn install`
- Start the containers and run the migrations:
  - `yarn docker:up`
  - `yarn prisma:migrate`
  - `yarn prisma:generate`
- Start the application: `yarn dev`
- `POST /chats` with a title, e.g. "test chat"
- `GET /chats/{id}/stream` and subscribe to the SSE stream on the front end
- `PATCH /chats/{id}` with a message "hello" and type "human" to send a message to the chat
- If you have subscribed to the SSE stream, you should see the AI's response streamed to the front end
- `PATCH /chats/{id}` with further messages to continue the conversation
- The full OpenAPI spec is here
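The walkthrough above can be sketched as a small TypeScript client. The endpoint paths come from the steps above, but the base URL, payload field names (`title`, `message`, `type`), and response shape are assumptions — check them against the OpenAPI spec.

```typescript
// Hypothetical client for the chat flow described above.
const BASE = "http://localhost:3000"; // assumed base URL

// Request bodies used in each step (field names assumed).
export const createChatBody = (title: string) => ({ title });
export const sendMessageBody = (message: string) =>
  ({ message, type: "human" as const });

// Minimal parser for one Server-Sent Events frame ("data: ..." lines),
// useful when consuming the stream outside a browser EventSource.
export function parseSseData(frame: string): string[] {
  return frame
    .split("\n")
    .filter((line) => line.startsWith("data: "))
    .map((line) => line.slice("data: ".length));
}

async function main(): Promise<void> {
  // 1. POST /chats with a title
  const chat = await fetch(`${BASE}/chats`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(createChatBody("test chat")),
  }).then((r) => r.json());

  // 2. GET /chats/{id}/stream — in a browser, subscribe with
  //    `new EventSource(`${BASE}/chats/${chat.id}/stream`)`.

  // 3. PATCH /chats/{id} with a human message; the AI's reply
  //    then arrives on the SSE stream.
  await fetch(`${BASE}/chats/${chat.id}`, {
    method: "PATCH",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(sendMessageBody("hello")),
  });
}
```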
The API loosely follows the principles of Domain-Driven Design (DDD) and Command Query Responsibility Segregation (CQRS).
NestJS promotes a modular structure. Each domain or bounded context can be represented as a module. Each module then has its application, domain, and infrastructure layers.
The application layer contains commands, queries, and their respective handlers. It's also where DTOs (Data Transfer Objects) are defined.
The domain layer includes your core business logic: entities (or aggregates), value objects, domain services, events, and domain-specific interfaces for repositories and providers.
The infrastructure layer houses technical implementations such as controllers (for HTTP routes), concrete repository implementations, and other providers.
The directory structure maintains the separation of concerns recommended by DDD while leveraging NestJS's modular system to organize code around domain boundaries effectively.
/src
|-- /ai (domain module)
|   |-- /application
|   |   |-- /command
|   |   |   |-- commands and command handlers
|   |   |-- /query
|   |   |   |-- queries and query handlers, DTOs, DTO providers and response builders
|   |-- /domain
|   |   |-- /model
|   |   |   |-- models and repository interfaces, provider interfaces
|   |   |-- /service
|   |   |   |-- domain services
|   |-- /infrastructure
|   |   |-- /persistence
|   |   |   |-- repository implementations
|   |   |-- /controllers
|   |   |   |-- controllers for HTTP routes
|   |   |-- /providers
|   |   |   |-- third party providers
|-- /common (cross-cutting concerns, shared utilities, core functionality)
|   |-- ... (similar structure as above)
Out of the box, the API uses the OpenAiChatResponseProvider, which generates responses via the OpenAI API. There's also a LangChainChatResponseProvider, which uses the LangChainJs library, and a HuggingFaceChatResponseProvider, which uses the HuggingFace API. Ollama is also supported via the OllamaChatResponseProvider. These providers all implement the ChatResponseProvider interface and can be swapped easily using dependency injection.
Out of the box, the API uses a Postgres database with Prisma as the ORM. This is also easily swappable.
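The persistence layer is swappable because repositories are defined as interfaces in the domain layer and implemented in infrastructure. A minimal sketch, with assumed method and entity names — `PrismaChatRepository` would implement the same interface against Postgres:

```typescript
// Assumed minimal shape of a chat aggregate.
export interface Chat {
  id: string;
  title: string;
}

// Repository interface from domain/model (method names assumed).
export interface ChatRepository {
  findById(id: string): Promise<Chat | null>;
  save(chat: Chat): Promise<void>;
}

// An in-memory implementation, useful for tests; the Prisma-backed
// implementation in infrastructure/persistence fulfils the same contract.
export class InMemoryChatRepository implements ChatRepository {
  private readonly rows = new Map<string, Chat>();

  async findById(id: string): Promise<Chat | null> {
    return this.rows.get(id) ?? null;
  }

  async save(chat: Chat): Promise<void> {
    this.rows.set(chat.id, { ...chat }); // copy to avoid shared mutation
  }
}
```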
As part of the Query side of CQRS, the API uses Redis to cache all DTOs for lightning-fast responses. The LangChainOpenAiRedisBufferChatResponseProvider uses Redis and memory to cache message history and responses from the OpenAI API.
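The DTO caching described above follows the familiar cache-aside pattern. In this sketch a `Map` stands in for Redis, and the key scheme and DTO shape are assumptions:

```typescript
// Assumed minimal DTO shape.
export interface ChatDto {
  id: string;
  title: string;
}

type Loader = (id: string) => Promise<ChatDto>;

export class ChatDtoCache {
  // A Map stands in for a Redis client here.
  constructor(private readonly store = new Map<string, string>()) {}

  // Return the cached DTO if present; otherwise load it (e.g. from the
  // query-side database) and cache the serialized result.
  async get(id: string, load: Loader): Promise<ChatDto> {
    const key = `chat:dto:${id}`; // assumed key scheme
    const hit = this.store.get(key);
    if (hit) return JSON.parse(hit);
    const dto = await load(id);
    this.store.set(key, JSON.stringify(dto));
    return dto;
  }

  // Command handlers would invalidate the entry after a write.
  invalidate(id: string): void {
    this.store.delete(`chat:dto:${id}`);
  }
}
```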
The REST API is documented using Swagger here
An AWS CDK application is included in the AWS directory. It deploys the API to AWS App Runner with an RDS database and an ElastiCache Redis cluster in a custom VPC.