API documentation is available in Swagger UI at the following URL: http://cb-highered-brain-swagger.s3-website-us-east-1.amazonaws.com/
The static website serving the UI is hosted on the cb-highered-brain-swagger S3 bucket.
The automatically generated JSON file containing the Swagger/OpenAPI models that feed the UI is in docs/swagger_dist/docs/api-cb-highered-brain.json.
A helper script, swagger_update.py, is run periodically to export the JSON file from API Gateway and upload it to the corresponding S3 bucket.
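The export-and-upload step can be sketched as follows. This is not the contents of swagger_update.py, only a minimal illustration of the flow described above. The function name, its parameters, and the S3 object key are hypothetical; the clients are passed in so the function can be exercised without AWS credentials.

```python
import json


def export_swagger_to_s3(apigw_client, s3_client, rest_api_id, stage, bucket, key):
    """Export the API definition from API Gateway and upload it to S3.

    Hypothetical helper: the clients are expected to behave like
    boto3.client("apigateway") and boto3.client("s3").
    Returns the number of bytes uploaded.
    """
    # get_export returns the serialized definition as a streaming body.
    response = apigw_client.get_export(
        restApiId=rest_api_id,
        stageName=stage,
        exportType="swagger",
        accepts="application/json",
    )
    body = response["body"].read()

    # Fail early if the export is not valid JSON.
    json.loads(body)

    s3_client.put_object(
        Bucket=bucket,
        Key=key,  # assumed object key; the real script may use a different one
        Body=body,
        ContentType="application/json",
    )
    return len(body)
```

With boto3 installed and credentials configured, the two clients would typically be created with boto3.client("apigateway") and boto3.client("s3").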
All the data related to the chatbot is stored in DynamoDB NoSQL tables.
Baseline, "static" data includes information on students' characteristics (e.g. gender), degrees' characteristics (e.g. price), and credit options' characteristics (e.g. interest rate).
Student data is augmented with information collected during the interaction session; this information is stored in the sessions table.
*-students-*
- Partition key: web_id (String)
- Sort key: NA
- GSI: NA
*-degrees-*
- Partition key: option_id (String)
- Sort key: NA
- GSI: NA
*-credits-*
- Partition key: credit_id (String)
- Sort key: NA
- GSI: NA
*-sessions-*
- Partition key: session_id (String)
- Sort key: NA
- GSI:
  - Name: user_id-index
  - Partition key: user_id
  - Sort key: NA
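As an illustration, the sessions table spec listed above could be expressed as the keyword arguments of a DynamoDB CreateTable call. This is a sketch, not the definition used in the project: the function name is hypothetical, and the String type for user_id, the ALL projection, and on-demand billing are assumptions that the spec above does not state.

```python
def sessions_table_spec(table_name):
    """Build CreateTable keyword arguments for a sessions-style table.

    Hypothetical helper mirroring the spec above: session_id as the
    partition key, no sort key, and one GSI keyed on user_id.
    """
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "session_id", "AttributeType": "S"},
            # Assumption: user_id is a String; the spec does not give its type.
            {"AttributeName": "user_id", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "session_id", "KeyType": "HASH"}],
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "user_id-index",
                "KeySchema": [{"AttributeName": "user_id", "KeyType": "HASH"}],
                # Assumption: project every attribute into the index.
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        # Assumption: on-demand billing, so no provisioned throughput is needed.
        "BillingMode": "PAY_PER_REQUEST",
    }
```

The resulting dictionary could be passed to a boto3 DynamoDB client as create_table(**sessions_table_spec(name)), where name follows the *-sessions-* pattern.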
User-session data is stored in *-sessions-*, where session_id is the partition key and user_id is only an attribute. We avoid using user_id as a partition key (with session_id as a sort key) because DynamoDB scales by distributing a table's partitions across different nodes, and we expect the generic "public user" to experience a high number of concurrent operations, all of which would land on the same partition. Tables partitioned by session sidestep this issue. See here for more information.
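The effect of the key choice can be illustrated with a toy count of how much traffic the busiest partition key value would receive under each design. The traffic mix below (90 of 100 operations from a "public" user) is invented for illustration, and the count is per key value only; it does not model how DynamoDB actually maps keys to physical partitions.

```python
from collections import Counter


def busiest_key_share(events, key_field):
    """Return the fraction of events that hit the most frequent key value."""
    counts = Counter(event[key_field] for event in events)
    return max(counts.values()) / len(events)


# Made-up workload: 90 sessions from the generic public user, 10 from others.
events = (
    [{"user_id": "public", "session_id": f"s{i}"} for i in range(90)]
    + [{"user_id": f"u{i}", "session_id": f"s{90 + i}"} for i in range(10)]
)

share_by_user = busiest_key_share(events, "user_id")        # 0.9
share_by_session = busiest_key_share(events, "session_id")  # 0.01
```

Keyed by user_id, a single key value absorbs most of the operations; keyed by session_id, no key value receives more than one.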
Because accessing the sessions of a given user is important, we implement a Global Secondary Index (GSI) on the user_id attribute. The endpoint methods associated with this GSI are of the form /user/.
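A lookup through the GSI might look like the sketch below. It is not the code behind the /user/ endpoints; the function name and parameters are hypothetical, and user_id is assumed to be a String attribute. The client is expected to behave like boto3.client("dynamodb").

```python
def sessions_for_user(dynamodb_client, table_name, user_id):
    """Return all session items for one user by querying the user_id GSI.

    Hypothetical helper; follows LastEvaluatedKey so that results
    spanning several pages are all collected.
    """
    kwargs = {
        "TableName": table_name,
        "IndexName": "user_id-index",
        "KeyConditionExpression": "user_id = :uid",
        # Assumption: user_id is stored as a String ("S").
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
    }
    items = []
    while True:
        page = dynamodb_client.query(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Continue from where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

Querying the index avoids a full table scan, which would otherwise be the only way to find a user's sessions when session_id is the sole key.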
The two DynamoDB tables described above are generated using the program defined in createtable.py.
Run the program using the following syntax:
python createtable.py [-s [stage]] [-t [table]]
Options and defaults are:
-s, --stage=dev
-t, --table-type=logs
Table types are either sessions or logs. Individual table specs (i.e. naming, environment variables, primary keys, GSI) are all specified in the script itself.
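For reference, the command-line interface described above could be parsed along these lines. This is a sketch based only on the documented syntax and defaults, not the parser in createtable.py; the function name is hypothetical, and treating a flag given without a value as "use the default" is an assumption drawn from the `[-s [stage]]` notation.

```python
import argparse


def build_parser():
    """Build a parser matching the documented options and defaults."""
    parser = argparse.ArgumentParser(prog="createtable.py")
    parser.add_argument(
        "-s", "--stage",
        nargs="?",      # the value is optional, as in [-s [stage]]
        default="dev",
        const="dev",    # assumption: a bare -s falls back to the default
        help="deployment stage (default: dev)",
    )
    parser.add_argument(
        "-t", "--table-type",
        nargs="?",
        default="logs",
        const="logs",   # assumption: a bare -t falls back to the default
        choices=["sessions", "logs"],
        help="table type to create (default: logs)",
    )
    return parser
```

For example, build_parser().parse_args(["-t", "sessions"]) yields stage "dev" and table_type "sessions".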