Skills and projects covered in the LLM and ChatGPT folders:
Use Hugging Face datasets and large pretrained models; set up the tokenizer configuration
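For reference, a minimal sketch of the dataset + tokenizer setup, assuming the `datasets` and `transformers` packages; the IMDB dataset and `bert-base-uncased` are illustrative stand-ins, not the repo's exact choices:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("imdb", split="train[:1%]")  # small slice for a quick demo
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Pad/truncate to a fixed length so batches collate cleanly into tensors.
    return tokenizer(batch["text"], padding="max_length", truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True)
print(tokenized[0].keys())  # text, label, input_ids, attention_mask, ...
```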
Build a knowledge-based question answering / search system
- Convert a dataset to vectors and save them in a vector library (FAISS) or database (Chroma)
- Vectorize a query (with optional filters) and save the output as context
- Combine the retrieved context with the original prompt to form a new prompt and generate search results (a minimal FAISS sketch follows this list)
This module also has tutorials on Pinecone and Weaviate
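A minimal sketch of the FAISS flow above, assuming `sentence-transformers`, `faiss-cpu`, and `openai` are installed; the embedding model, chat model, and documents are illustrative placeholders:

```python
import faiss
from sentence_transformers import SentenceTransformer
from openai import OpenAI

docs = ["Doc about refunds ...", "Doc about shipping ...", "Doc about warranties ..."]
embedder = SentenceTransformer("all-MiniLM-L6-v2")

# 1) Convert the dataset to vectors and store them in a FAISS index.
doc_vecs = embedder.encode(docs, convert_to_numpy=True).astype("float32")
index = faiss.IndexFlatL2(doc_vecs.shape[1])
index.add(doc_vecs)

# 2) Vectorize the query and retrieve the nearest documents as context.
query = "How do I get a refund?"
q_vec = embedder.encode([query], convert_to_numpy=True).astype("float32")
_, ids = index.search(q_vec, 2)
context = "\n".join(docs[i] for i in ids[0])

# 3) Combine the context with the original question into a new prompt.
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
client = OpenAI()  # reads OPENAI_API_KEY from the environment
answer = client.chat.completions.create(
    model="gpt-4o-mini", messages=[{"role": "user", "content": prompt}]
)
print(answer.choices[0].message.content)
```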
Build three LLM-chain-based models
- LLM1 moderates the comments generated by LLM2 (see the sketch after this list)
- Use an LLM with LangChain agents (Wiki, Google, Python REPL) to perform automated data analysis
- An LLM agent that lets users chat freely with documents (e.g., Shakespeare's books)
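A rough sketch of the moderation chain (LLM1 reviews LLM2's draft), assuming the `openai` package; the model names and the moderation rubric are illustrative, not the repo's exact setup:

```python
from openai import OpenAI

client = OpenAI()

def ask(model, system, user):
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "system", "content": system}, {"role": "user", "content": user}],
    )
    return resp.choices[0].message.content

# LLM2 drafts a comment.
draft = ask("gpt-4o-mini", "You write short replies to customer reviews.",
            "Reply to: 'The package arrived late and damaged.'")

# LLM1 moderates the draft before it is published.
verdict = ask("gpt-4o", "You are a content moderator. Answer APPROVE or REJECT with one reason.",
              f"Moderate this reply for toxicity and policy violations:\n{draft}")
print(draft, verdict, sep="\n---\n")
```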
Fine-tune LLMs with Hugging Face, TensorBoard, and DeepSpeed (multi-GPU cluster support) on a traditional IMDB classification task; evaluate summarization performance with NLTK and ROUGE
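A condensed sketch of the IMDB fine-tune and the ROUGE check, assuming `transformers`, `datasets`, and `evaluate`; the model, data slice, and DeepSpeed config path are placeholders:

```python
import evaluate
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
ds = load_dataset("imdb", split="train[:2%]").map(
    lambda b: tok(b["text"], truncation=True, padding="max_length", max_length=256), batched=True)

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)
args = TrainingArguments(
    output_dir="imdb-demo",
    per_device_train_batch_size=8,
    num_train_epochs=1,
    logging_dir="logs",              # TensorBoard event files
    report_to=["tensorboard"],
    # deepspeed="ds_config.json",    # enable for multi-GPU runs (hypothetical config file)
)
Trainer(model=model, args=args, train_dataset=ds).train()

# Summarization quality via ROUGE (Hugging Face `evaluate`), with toy strings.
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=["the cat sat"], references=["the cat sat on the mat"]))
```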
Hugging Face Disaggregators (for quick demographic analysis) and Evaluate (for toxicity), gender-expression generation, and SHAP (for interpretability, i.e., token-level contributions to the final generated output)
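A small sketch of the toxicity check and SHAP attribution, assuming the `evaluate`, `shap`, and `transformers` packages; the classifier here is a sentiment pipeline used as a stand-in for the generation case:

```python
import evaluate
import shap
from transformers import pipeline

# Toxicity scores for model outputs.
toxicity = evaluate.load("toxicity")
print(toxicity.compute(predictions=["He is a kind person", "You are useless"]))

# Token-level contributions to a prediction via SHAP's text explainer.
clf = pipeline("sentiment-analysis",
               model="distilbert-base-uncased-finetuned-sst-2-english",
               return_all_scores=True)
explainer = shap.Explainer(clf)
shap_values = explainer(["The movie was surprisingly good"])
# shap_values holds per-token attributions; shap.plots.text(shap_values) renders them in a notebook.
```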
MLOps for a sample model with MLflow, focusing on model registration, versioning, monitoring/performance tracking, and pushing to production
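A compact MLflow sketch of that flow (log metrics, register a version, promote it), assuming a registry-capable tracking store; the model name and metric are illustrative:

```python
import mlflow
from mlflow.tracking import MlflowClient
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# mlflow.set_tracking_uri("sqlite:///mlflow.db")  # the registry needs a DB-backed or remote store

X, y = load_iris(return_X_y=True)
with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_metric("train_accuracy", model.score(X, y))          # performance tracking
    mlflow.sklearn.log_model(model, artifact_path="model",
                             registered_model_name="iris-demo")     # registration + new version

# Promote the newest version toward production.
client = MlflowClient()
latest = client.get_latest_versions("iris-demo", stages=["None"])[0]
client.transition_model_version_stage("iris-demo", latest.version, stage="Production")
```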
- use QLoRA and Hugging Face PEFT to tune a GPT-Neo model on causal language modeling (see the sketch after this list)
- use DeepSpeed to tune an MPT model on instruction data
- use xTuring to tune LLaMA 7B on question answering
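A minimal QLoRA sketch (4-bit base model plus LoRA adapters via PEFT), assuming `transformers`, `peft`, and `bitsandbytes` on a CUDA machine; the GPT-Neo checkpoint and LoRA hyperparameters are illustrative:

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neo-1.3B", quantization_config=bnb_config, device_map="auto")
model = prepare_model_for_kbit_training(model)

lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
# ...train with transformers.Trainer (or trl.SFTTrainer) on causal-LM data...
```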
- prompt engineering demos and guidelines
- use of LangChain, including prompt management, external agents, and evaluation tools (see the prompt-template sketch after this list)
- fine-tuning service for ChatGPT with data generated by GPT-4
- a tutorial on AutoGen, a multi-agent LLM package
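A short LangChain prompt-management sketch, assuming the `langchain-core` and `langchain-openai` packages; the template text and model name are illustrative:

```python
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

template = PromptTemplate.from_template(
    "You are a data analyst.\nSummarize the key trend in:\n{table}\nAnswer in one sentence.")
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = template.format(table="month, sales\nJan, 10\nFeb, 14\nMar, 21")
print(llm.invoke(prompt).content)
```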
- the Transformer architecture implemented in PyTorch
- fine-tuning with LoRA and PEFT
- fine-tuning with prompt tuning and PEFT
- quantization and deployment
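For the quantization step, a small post-training sketch using PyTorch dynamic quantization; the tiny model below is a stand-in for any Linear-heavy network being shrunk before deployment:

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 2)).eval()
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface, int8 weights for the Linear layers
```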
- L2: data ingestion using SQL and storage on GCP
- L3: use Kubeflow for automated orchestration + fine-tuning on Vertex AI with QLoRA (see the sketch after this list)
- L4: make predictions and extract safety-score metrics, such as harassment and severity
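A rough sketch of the Kubeflow-to-Vertex AI orchestration in L3, assuming the `kfp` v2 SDK and `google-cloud-aiplatform`; the project, region, bucket, and component body are placeholders:

```python
from kfp import compiler, dsl
from google.cloud import aiplatform

@dsl.component
def fine_tune(dataset_uri: str) -> str:
    # Placeholder step: a real component would launch a QLoRA fine-tuning job here.
    return f"tuned model from {dataset_uri}"

@dsl.pipeline(name="qlora-finetune-pipeline")
def pipeline(dataset_uri: str = "gs://my-bucket/train.jsonl"):
    fine_tune(dataset_uri=dataset_uri)

compiler.Compiler().compile(pipeline, "pipeline.json")

aiplatform.init(project="my-project", location="us-central1")
aiplatform.PipelineJob(display_name="qlora-demo", template_path="pipeline.json").run()
```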
Section 7: Azure AI APIs
This section covers several use cases of Azure AI services: Azure OpenAI, AI Studio, Speech (TTS & STT), AI Search, Document Intelligence, Storage, Vision, Language Service (translation, sentiment analysis, language detection, intent & entity recognition), and Semantic Kernel (similar to LangChain)
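As a quick reference, a minimal Azure OpenAI call with the `openai` (>=1.0) package; the endpoint, API version, and deployment name are placeholders for your own Azure resources:

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)
resp = client.chat.completions.create(
    model="my-gpt4o-deployment",  # the *deployment* name, not the base model name
    messages=[{"role": "user", "content": "Detect the language of: 'Bonjour le monde'"}],
)
print(resp.choices[0].message.content)
```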
Section 8: Evaluate with Langfuse
Iterative development is crucial for LLM apps. I use a platform called Langfuse, similar to W&B and MLflow. The code covers setting up Llama 3 on Groq, plus logs, prompts, and datasets for LLM monitoring and improvement.
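A sketch of the Llama 3 on Groq setup with Langfuse tracing, assuming the `groq` package and the Langfuse v2 Python SDK, with GROQ_API_KEY and the LANGFUSE_* environment variables set:

```python
from groq import Groq
from langfuse.decorators import observe

client = Groq()  # reads GROQ_API_KEY from the environment

@observe()  # records this call's inputs/outputs as a Langfuse trace
def answer(question: str) -> str:
    resp = client.chat.completions.create(
        model="llama3-8b-8192",
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

print(answer("Give one tip for evaluating LLM apps."))
```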
- A sample of my research links on LLM topics from my work as an Applied Scientist (NLP), including the latest model updates, prompt engineering, fine-tuning and alignment practice, literature, downstream tasks, and so on
- Fine-tuning Hugging Face LLMs on AWS SageMaker
- Fine-tuning Hugging Face LLMs within Azure ML
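A condensed SageMaker sketch for the Hugging Face fine-tuning case, assuming the `sagemaker` SDK, an existing execution role, and a hypothetical train.py script; instance type, container versions, and S3 paths are placeholders:

```python
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()  # or pass an IAM role ARN when running off-platform
estimator = HuggingFace(
    entry_point="train.py",            # your fine-tuning script (hypothetical)
    source_dir="scripts",
    instance_type="ml.g5.2xlarge",
    instance_count=1,
    role=role,
    transformers_version="4.28",
    pytorch_version="2.0",
    py_version="py310",
    hyperparameters={"epochs": 1, "model_name": "distilbert-base-uncased"},
)
estimator.fit({"train": "s3://my-bucket/imdb/train"})
```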