askFSDL is a demonstration of a retrieval-augmented question-answering application.
You can try it out via the Discord bot frontend in the Full Stack Discord!
We use our educational materials as a corpus: the Full Stack LLM Bootcamp and the Full Stack Deep Learning course.
The resulting application is great at answering questions like:
- What are the differences between PyTorch, TensorFlow, and JAX?
- How do I build an ML team?
- Which is cheaper: running experiments on cheap, slower GPUs or fast, more expensive GPUs?
- What's a data flywheel?
We use langchain to organize our LLM invocations and prompt magic.
We stood up a MongoDB instance on Atlas to store our cleaned and organized document corpus.
See the Running ETL to Build the Document Corpus notebook.
For fast search of relevant documents to insert into our prompt, we use a FAISS index.
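To make the retrieve-then-prompt idea concrete, here is a minimal sketch in plain Python. It is not the project's code: the toy corpus, the bag-of-words `embed` function, and the `retrieve` and `build_prompt` helpers are all hypothetical stand-ins. The real application uses learned embeddings and a FAISS index in place of the brute-force similarity loop shown here.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the real corpus lives in the document store.
DOCS = [
    "A data flywheel is a loop where user feedback improves the model, which attracts more users.",
    "PyTorch, TensorFlow, and JAX are deep learning frameworks.",
    "Building an ML team requires defining roles such as ML engineer and ML researcher.",
]


def embed(text):
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    words = [w.strip(".,?!:;").lower() for w in text.split()]
    return Counter(w for w in words if w)


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, docs, k=2):
    """Brute-force top-k search; a vector index like FAISS does this step at scale."""
    query = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(query, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, sources):
    """Insert the retrieved documents into the prompt ahead of the question."""
    context = "\n".join(f"- {s}" for s in sources)
    return (
        "Answer the question using only these sources:\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Calling `build_prompt(q, retrieve(q, DOCS))` with a question `q` yields the text that would be sent to the language model. Swapping the brute-force loop for an index only changes `retrieve`; the prompt construction stays the same.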
We host the application backend on Modal, which provides serverless execution and scaling. That's also where we execute batch jobs, like writing to the document store and refreshing the vector index.
We host the Discord bot, written in discord.py, on a free-tier AWS EC2 instance, which we provision and configure with Pulumi.
We use Gantry to monitor model behavior in production and collect feedback from users.