Pinned Repositories
Build-Docker-for-LlamaIndex-Agentic-RAG-System
Docker implementation of a LlamaIndex Agentic RAG system. Developing a RAG system requires multiple components, such as an LLM, a vector database, and a UI. In this work we containerize the entire system.
Embedding-Quantization
To make an LLM application faster we need a faster retrieval system, and that is where embedding quantization comes in. Embedding quantization is a great technique for cutting vector-database costs and significantly speeding up retrieval while preserving retrieval performance.
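As a rough illustration of the idea (a minimal sketch in plain NumPy, not the repo's actual code; the corpus here is random stand-in embeddings), binary quantization keeps only the sign of each dimension and ranks candidates by Hamming distance:

```python
import numpy as np

# Sketch of binary embedding quantization: keep only the sign of each
# dimension, pack 8 dimensions per byte (32x smaller than float32),
# and rank by Hamming distance.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(10_000, 384)).astype(np.float32)  # stand-in embeddings
query = rng.normal(size=(384,)).astype(np.float32)

def binarize(x: np.ndarray) -> np.ndarray:
    """Quantize float embeddings to packed 1-bit codes."""
    return np.packbits(x > 0, axis=-1)

codes = binarize(corpus)   # shape (10000, 48), dtype uint8
qcode = binarize(query)    # shape (48,), dtype uint8

# Hamming distance = popcount of XOR; smaller distance = more similar.
dists = np.unpackbits(codes ^ qcode, axis=-1).sum(axis=-1)
print("nearest ids:", np.argsort(dists)[:5])
```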
Fine-tuning-BART
Fine-tuning is a cost-efficient way of preparing a model for specialized tasks: it reduces both the required training time and the amount of training data. With open-source pre-trained models available, we do not need to train from scratch every time we build a model.
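A minimal sketch of a single fine-tuning step with Hugging Face Transformers; the article/summary pair and hyperparameters are toy placeholders, and the repo itself may use a `Trainer` with a real dataset:

```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

# One supervised fine-tuning step for BART on a toy summarization pair.
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tokenizer("A long article about transformers ...", return_tensors="pt")
labels = tokenizer("A short summary.", return_tensors="pt").input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
loss = model(**inputs, labels=labels).loss  # seq2seq cross-entropy loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"loss: {loss.item():.3f}")
```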
Intro-to-RAG-with-CODEGEMMA-7B
An LLM is a very powerful tool, but it often does more than required (hallucinates) and tends to generate output in whatever pattern it finds best. RAG lets us harness the power of an LLM in a controlled manner. In this work we implement a simple RAG system with CodeGemma and an in-memory vector database.
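The retrieval half of such a system can be sketched in a few lines; the documents, query, and embedding model below are illustrative assumptions, and the assembled prompt would then be passed to CodeGemma for generation:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Sketch of the RAG retrieval step with an in-memory "vector DB"
# (just a NumPy array). Documents and model name are placeholders.
docs = [
    "Python lists support append, pop, and slicing.",
    "Docker images are built from a Dockerfile.",
    "LoRA adds low-rank adapter matrices to frozen weights.",
]
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

query = "How do I append to a list in Python?"
q_vec = embedder.encode(query, normalize_embeddings=True)
best = int(np.argmax(doc_vecs @ q_vec))  # cosine similarity via dot product

# The retrieved context constrains what the LLM can say.
prompt = f"Answer using only this context:\n{docs[best]}\n\nQuestion: {query}"
print(prompt)
```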
Llama-2-7B-Chat-PEFT
PEFT is a wonderful tool that enables training a very large model in a low-resource environment. Together, quantization and PEFT will enable widespread adoption of LLMs.
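A minimal sketch of the combination, assuming access to the gated `meta-llama/Llama-2-7b-chat-hf` checkpoint and the `bitsandbytes` library; the LoRA hyperparameters are illustrative, not the repo's exact settings:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load Llama-2-7B-Chat in 4-bit and attach LoRA adapters, so only a
# small fraction of the parameters needs training.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", quantization_config=bnb
)
lora = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections only
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% trainable
```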
LlamaIndex-Agent
A RAG system is just the beginning of harnessing the power of an LLM; the next step is building an intelligent agent. In agentic RAG, the agent uses the available tools, strategies, and the LLM to generate responses in a specialized way. Unlike a simple RAG pipeline, an agent can dynamically choose between tools, routing strategies, and so on.
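A minimal tool-using agent can be sketched with llama-index 0.10-style imports; the `multiply` tool and OpenAI backend are assumptions for illustration (the repo may wire up different tools and LLMs):

```python
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI  # any LlamaIndex LLM works here

def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

# The agent decides on its own whether (and how) to call the tool.
tool = FunctionTool.from_defaults(fn=multiply)
agent = ReActAgent.from_tools([tool], llm=OpenAI(model="gpt-3.5-turbo"), verbose=True)
print(agent.chat("What is 2.5 times 8?"))
```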
LlamaIndex-Agent-with-Reasoning-Loop
Simple agents are good for one-shot retrieval. For more complex tasks we need a multi-step reasoning loop, in which the agent breaks a complex task into subtasks and solves them step by step while maintaining conversational memory.
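The shape of such a loop, as a conceptual sketch only: `call_llm` is a hypothetical stand-in for whatever chat backend the agent wraps, not a real API:

```python
# Conceptual sketch of a multi-step reasoning loop with memory.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("hypothetical stand-in; plug in an LLM client")

def reasoning_loop(task: str, max_steps: int = 5) -> str:
    memory: list[str] = []  # conversational memory shared across steps
    plan = call_llm(f"Break this task into subtasks, one per line:\n{task}")
    for subtask in plan.splitlines()[:max_steps]:
        context = "\n".join(memory)
        answer = call_llm(f"Context so far:\n{context}\n\nSolve: {subtask}")
        memory.append(f"{subtask} -> {answer}")  # later steps see earlier results
    return call_llm("Combine these results into a final answer:\n" + "\n".join(memory))
```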
Meta-Llama3-8B-Chat-Instruct-LoRA
PEFT (LoRA) with Meta-Llama3-8B-Chat-Instruct
Phi3-No-GPU-No-Worry
GPU constrained? No more. Microsoft released Phi-3, designed specifically for memory- and compute-constrained environments. The model supports the ONNX CPU runtime, which offers impressive inference speed even on a mobile CPU.
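Generation on CPU can be sketched with the `onnxruntime-genai` package; the model path is an assumption (a locally downloaded Phi-3 ONNX export), and the exact API surface has shifted between package releases:

```python
import onnxruntime_genai as og

# CPU-only generation with a Phi-3 ONNX export (path is an assumption).
model = og.Model("phi-3-mini-4k-instruct-onnx/cpu")
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=200)
generator = og.Generator(model, params)
# Note: older package versions set input ids on `params` instead.
generator.append_tokens(tokenizer.encode("<|user|>Hello!<|end|><|assistant|>"))

while not generator.is_done():
    generator.generate_next_token()  # runs entirely on the CPU
print(tokenizer.decode(generator.get_sequence(0)))
```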
Vector_Database
Implementing a vector database on the CoNaLa dataset to retrieve code snippets relevant to user queries. This is a very simple simulation of a vector database.
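The core mechanics fit in a tiny class; this is a conceptual sketch with random stand-in embeddings rather than real CoNaLa snippets and a real embedding model:

```python
import numpy as np

class TinyVectorDB:
    """Toy in-memory vector database: store vectors, query by cosine similarity."""

    def __init__(self) -> None:
        self.vectors: list[np.ndarray] = []
        self.payloads: list[str] = []

    def add(self, vector: np.ndarray, payload: str) -> None:
        self.vectors.append(vector / np.linalg.norm(vector))
        self.payloads.append(payload)

    def query(self, vector: np.ndarray, k: int = 3) -> list[str]:
        q = vector / np.linalg.norm(vector)
        scores = np.stack(self.vectors) @ q  # cosine similarity on unit vectors
        return [self.payloads[i] for i in np.argsort(scores)[::-1][:k]]

# Usage with random stand-ins (real code would embed CoNaLa snippets):
db = TinyVectorDB()
rng = np.random.default_rng(0)
for snippet in ["x.sort()", "open('f').read()", "json.loads(s)"]:
    db.add(rng.normal(size=64), snippet)
print(db.query(rng.normal(size=64), k=2))
```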
swastikmaiti's Repositories
swastikmaiti/LlamaIndex-Agent
swastikmaiti/Embedding-Quantization
swastikmaiti/Llama-2-7B-Chat-PEFT
swastikmaiti/Intro-to-RAG-with-CODEGEMMA-7B
swastikmaiti/Phi3-No-GPU-No-Worry
swastikmaiti/Build-Docker-for-LlamaIndex-Agentic-RAG-System
swastikmaiti/Fine-tuning-BART
swastikmaiti/LlamaIndex-Agent-with-Reasoning-Loop
swastikmaiti/Meta-Llama3-8B-Chat-Instruct-LoRA
swastikmaiti/Vector_Database
swastikmaiti/ThesisWork
This repository contains code for the thesis work: Encoder Training for Neural Machine Translation in Resource-Constrained Settings.