Serverless document chat application

This sample application lets you ask natural language questions about any PDF document you upload. It combines the text generation and analysis capabilities of an LLM with a vector search of the document content. The solution uses serverless services such as Amazon Bedrock to access foundation models, AWS Lambda to run LangChain, and Amazon DynamoDB for conversational memory.

See the accompanying blog post on the AWS Serverless Blog for a detailed description and follow the deployment instructions below to get started.

Warning: This application is not ready for production use. It was written for demonstration and educational purposes. Review the Security section of this README and consult with your security team before deploying this stack. No warranty is implied in this example.

Note: This architecture creates resources that have costs associated with them. See the AWS Pricing page for details and make sure you understand the costs before deploying this stack.

Key features

Amazon Bedrock for serverless embedding and inference
LangChain to orchestrate a Q&A LLM chain
FAISS vector store
Amazon DynamoDB for serverless conversational memory
AWS Lambda for serverless compute
Frontend built in React, TypeScript, TailwindCSS, and Vite
Run locally or deploy to AWS Amplify Hosting
Amazon Cognito for authentication

How the application works

A user uploads a PDF document into an Amazon S3 bucket through a static web application frontend.
This upload triggers a metadata extraction and document embedding process. The process converts the text in the document into vectors. The vectors are loaded into a vector index and stored in S3 for later use.
When a user chats with a PDF document and sends a prompt to the backend, a Lambda function retrieves the index from S3 and searches for information related to the prompt.
An LLM then uses the results of this vector search, previous messages in the conversation, and its general-purpose capabilities to formulate a response to the user.
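
The flow above can be sketched in a few lines of plain Python. This is a toy illustration only, not the application's code: the bag-of-words `embed` function stands in for a real embedding model such as Titan Embeddings, the in-memory list stands in for a FAISS index stored in S3, and all function names here are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Embed every document chunk once, at upload time."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index: list[tuple[str, Counter]], prompt: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the prompt."""
    query = embed(prompt)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_llm_prompt(context: list[str], history: list[str], question: str) -> str:
    """Combine search results, conversation history, and the new question."""
    return (
        "Context:\n" + "\n".join(context)
        + "\n\nConversation so far:\n" + "\n".join(history)
        + "\n\nQuestion: " + question
    )


index = build_index([
    "The invoice total is 420 dollars",
    "Payment is due within 30 days",
    "The supplier is located in Berlin",
])
hits = search(index, "When is payment due", k=1)
print(build_llm_prompt(hits, ["User: Hi", "Assistant: Hello"], "When is payment due"))
```

In the deployed application, the embedding and the final answer both come from models hosted on Amazon Bedrock, and the conversation history is read from DynamoDB rather than passed in as a list.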

Deployment instructions

Prerequisites

AWS SAM CLI
Python 3.11 or greater

Cloning the repository

Clone this repository:

git clone git@github.com:aws-samples/serverless-pdf-chat.git

Amazon Bedrock setup

This application can be used with a variety of LLMs via Amazon Bedrock. See Supported models in Amazon Bedrock for a complete list.

By default, this application uses Titan Embeddings G1 - Text to generate embeddings and Anthropic's Claude v2 model for responses.

Important: Before you can use these models with this application, you must request access in the Amazon Bedrock console. See the Model access section of the Bedrock User Guide for detailed instructions. By default, this application is configured to use Amazon Bedrock in the us-east-1 Region. Make sure you request model access in that Region (it does not have to be the same Region that you deploy this stack to).

If you want to change the default models or Bedrock Region, edit Bedrock and BedrockEmbeddings in backend/src/generate_response/main.py and backend/src/generate_embeddings/main.py:

Bedrock(
    model_id="anthropic.claude-v2",  # adjust to use a different model
    region_name="us-east-1",  # adjust if not using us-east-1
)
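
If you would rather not hard-code these values, one option is to read them from environment variables and fall back to the defaults. The sketch below assumes two hypothetical variable names, `BEDROCK_MODEL_ID` and `BEDROCK_REGION`, which this sample does not define; you would have to set them on the Lambda functions yourself.

```python
import os

# Defaults taken from this README.
DEFAULT_MODEL_ID = "anthropic.claude-v2"
DEFAULT_REGION = "us-east-1"


def bedrock_settings(env=None) -> dict:
    """Return keyword arguments for Bedrock(...), with optional overrides.

    BEDROCK_MODEL_ID and BEDROCK_REGION are hypothetical variable names
    chosen for this example; they are not part of the sample application.
    """
    env = os.environ if env is None else env
    return {
        "model_id": env.get("BEDROCK_MODEL_ID", DEFAULT_MODEL_ID),
        "region_name": env.get("BEDROCK_REGION", DEFAULT_REGION),
    }


print(bedrock_settings({}))
print(bedrock_settings({"BEDROCK_REGION": "us-west-2"}))
```

The returned dictionary could then be unpacked into the existing constructor call, for example `Bedrock(**bedrock_settings())`.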

If you select models other than the default, you must also adjust the IAM permissions of the GenerateEmbeddingsFunction and GenerateResponseFunction resources in the AWS SAM template:

GenerateResponseFunction:
  Type: AWS::Serverless::Function
  Properties:
    # other properties
    Policies:
      # other policies
      - Statement:
          - Sid: "BedrockScopedAccess"
            Effect: "Allow"
            Action: "bedrock:InvokeModel"
            Resource:
              - "arn:aws:bedrock:*::foundation-model/anthropic.claude-v2" # adjust with different model
              - "arn:aws:bedrock:*::foundation-model/amazon.titan-embed-text-v1" # adjust with different model
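
The `Resource` entries follow a fixed pattern, so you can derive them from the model IDs you configured. The helper below is a small illustrative sketch (the function names are made up for this example) that builds the ARNs and the matching policy statement as a Python dictionary.

```python
def foundation_model_arn(model_id: str, region: str = "*") -> str:
    """Build a Bedrock foundation model ARN in the format used by the template."""
    return f"arn:aws:bedrock:{region}::foundation-model/{model_id}"


def bedrock_statement(model_ids: list[str]) -> dict:
    """Build an IAM statement that allows invoking only the given models."""
    return {
        "Sid": "BedrockScopedAccess",
        "Effect": "Allow",
        "Action": "bedrock:InvokeModel",
        "Resource": [foundation_model_arn(m) for m in model_ids],
    }


statement = bedrock_statement(["anthropic.claude-v2", "amazon.titan-embed-text-v1"])
for arn in statement["Resource"]:
    print(arn)
```

Copy the printed ARNs into the `Resource` list of both functions in the template so the permissions stay scoped to the models you actually use.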

Deploy the application with AWS SAM

Change to the backend directory and build the application:
```
cd backend
sam build
```
Deploy the application into your AWS account:
```
sam deploy --guided
```
For Stack Name, choose serverless-pdf-chat.
For the remaining options, keep the defaults by pressing the enter key.

AWS SAM will now provision the AWS resources defined in the backend/template.yaml template. Once the deployment is completed successfully, you will see a set of output values similar to the following:

CloudFormation outputs from deployed stack
-------------------------------------------------------------------------------
Outputs
-------------------------------------------------------------------------------
Key                 CognitoUserPool
Description         -
Value               us-east-1_gxKtRocFs

Key                 CognitoUserPoolClient
Description         -
Value               874ghcej99f8iuo0lgdpbrmi76k

Key                 ApiGatewayBaseUrl
Description         -
Value               https://abcd1234.execute-api.us-east-1.amazonaws.com/dev/
-------------------------------------------------------------------------------

You can find the same outputs in the Outputs tab of the serverless-pdf-chat stack in the AWS CloudFormation console. In the next section, you will use these outputs to run the React frontend locally and connect to the deployed resources in AWS.

Run the React frontend locally

Create a file named .env.development in the frontend directory. Vite uses this file to set environment variables when you run the application locally.

Copy the following file content and replace the values with the outputs provided by AWS SAM:

VITE_REGION=us-east-1
VITE_API_ENDPOINT=https://abcd1234.execute-api.us-east-1.amazonaws.com/dev/
VITE_USER_POOL_ID=us-east-1_gxKtRocFs
VITE_USER_POOL_CLIENT_ID=874ghcej99f8iuo0lgdpbrmi76k
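
If you prefer to generate this file rather than copy values by hand, the mapping from stack outputs to variables can be sketched as follows. The output keys and variable names come from this README; the `render_env` function and the way you obtain the `outputs` dictionary are assumptions made for illustration.

```python
# Maps each stack output key to the Vite variable that should receive it.
OUTPUT_TO_ENV = {
    "ApiGatewayBaseUrl": "VITE_API_ENDPOINT",
    "CognitoUserPool": "VITE_USER_POOL_ID",
    "CognitoUserPoolClient": "VITE_USER_POOL_CLIENT_ID",
}


def render_env(outputs: dict, region: str) -> str:
    """Render the content of .env.development from the stack outputs."""
    lines = [f"VITE_REGION={region}"]
    for output_key, env_var in OUTPUT_TO_ENV.items():
        lines.append(f"{env_var}={outputs[output_key]}")
    return "\n".join(lines) + "\n"


# Example values copied from the sample output shown earlier in this README.
outputs = {
    "CognitoUserPool": "us-east-1_gxKtRocFs",
    "CognitoUserPoolClient": "874ghcej99f8iuo0lgdpbrmi76k",
    "ApiGatewayBaseUrl": "https://abcd1234.execute-api.us-east-1.amazonaws.com/dev/",
}
print(render_env(outputs, "us-east-1"), end="")
```

Write the returned string to frontend/.env.development. A missing output key raises a `KeyError`, which makes an incomplete deployment easy to spot.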

Next, install the frontend's dependencies by running the following command in the frontend directory:

npm ci

Finally, to start the application locally, run the following command in the frontend directory:

npm run dev

Vite will now start the application under http://localhost:5173. As the application uses Amazon Cognito for authentication, you will be greeted by a login screen. In the next step, you will create a user to access the application.

Create a user in the Amazon Cognito user pool

Perform the following steps to create a user in the Cognito user pool:

Navigate to the Amazon Cognito console.
Find the user pool with an ID matching the output provided by AWS SAM above.
Under Users, choose Create user.
Enter an email address and a password that adheres to the password requirements.
Choose Create user.

Change back to http://localhost:5173 and log in with the new user's credentials.
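
As an alternative to the console, the same steps can be scripted with the AWS SDK for Python. The sketch below is an assumption-laden example, not part of this sample: it assumes the user pool accepts the email address as the username, and it expects you to pass in a client created with `boto3.client("cognito-idp")` using credentials that are allowed to administer the pool.

```python
def create_cognito_user(client, user_pool_id: str, email: str, password: str) -> None:
    """Create a user and set a permanent password.

    `client` is expected to be a boto3 "cognito-idp" client. Using the
    email address as the username is an assumption about the pool setup.
    """
    client.admin_create_user(
        UserPoolId=user_pool_id,
        Username=email,
        UserAttributes=[
            {"Name": "email", "Value": email},
            {"Name": "email_verified", "Value": "true"},
        ],
        MessageAction="SUPPRESS",  # do not send an invitation message
    )
    # Without this call the user would have to change the password at first login.
    client.admin_set_user_password(
        UserPoolId=user_pool_id,
        Username=email,
        Password=password,
        Permanent=True,
    )


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   create_cognito_user(
#       boto3.client("cognito-idp", region_name="us-east-1"),
#       "us-east-1_gxKtRocFs", "user@example.com", "<your password>",
#   )
```

The password must still satisfy the password requirements of the user pool, otherwise the second call fails.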

Optional: Deploying the frontend with AWS Amplify Hosting

You can optionally deploy the React frontend with Amplify Hosting. Amplify Hosting provides a fully managed deployment of the React frontend in an AWS-managed account using Amazon S3 and Amazon CloudFront.

To set up Amplify Hosting:

Fork this GitHub repository and take note of your repository URL, for example https://github.com/user/serverless-pdf-chat/.
Create a GitHub fine-grained access token for the new repository by following this guide. For the Repository permissions, select Read and write for Content and Webhooks.
Create a new secret called serverless-pdf-chat-github-token in AWS Secrets Manager and input your fine-grained access token as plaintext. Select the Plaintext tab and confirm your secret looks like this:
```
github_pat_T2wyo------------------------------------------------------------------------rs0Pp
```
Run the following command in the backend directory to prepare the application for deployment:
```
sam build
```
Next, to edit the AWS SAM deploy configuration, run the following command:
```
sam deploy --guided
```
This time, for Parameter Frontend, input amplify.
For Parameter Repository, input the URL of your forked GitHub repository.
Leave all other options unchanged by pressing the enter key.

AWS SAM will now deploy the React frontend with Amplify Hosting. Navigate to the Amplify console to check the build status. If the build does not start automatically, trigger it via the Amplify console.

Cleanup

Delete any secrets in AWS Secrets Manager created as part of this walkthrough.
Empty the Amazon S3 bucket created as part of the AWS SAM template.
Run the following command in the backend directory of the project to delete all associated resources:
```
sam delete
```

Security

This application was written for demonstration and educational purposes and not for production use. The Security Pillar of the AWS Well-Architected Framework can help you adapt the sample for a production deployment, in addition to your own established processes. Take note of the following:

The application uses encryption in transit and at rest with AWS-managed keys where applicable. Optionally, use AWS KMS with DynamoDB, SQS, and S3 for more control over encryption keys.
This application uses Powertools for AWS Lambda (Python) to log inputs and outputs to CloudWatch Logs. By default, this can include sensitive data contained in user input. Adjust the log level and remove log statements to fit your security requirements.
API Gateway access logging and usage plans are not activated in this code sample. Similarly, S3 access logging is currently not enabled.
To simplify the setup of the demo, this solution uses AWS managed policies attached to IAM roles that contain wildcards on resources. Consider scoping down the policies further according to your needs. Note that there is a resource wildcard on the AWS managed AWSLambdaSQSQueueExecutionRole. This is a known behaviour, see this GitHub issue for details.
If your security controls require inspecting network traffic, consider adjusting the AWS SAM template to attach the Lambda functions to a VPC via its VpcConfig.

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
