HUJA9's Stars
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
jindongwang/transferlearning
Transfer learning, domain adaptation, domain generalization, multi-task learning, etc.: papers, code, datasets, applications, and tutorials.
naklecha/llama3-from-scratch
llama3 implementation one matrix multiplication at a time
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Demo apps to showcase Meta Llama3 for WhatsApp & Messenger.
marcotcr/lime
Lime: Explaining the predictions of any machine learning classifier
py-why/dowhy
DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
uber/causalml
Uplift modeling and causal inference with machine learning algorithms
OpenBMB/ToolBench
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
llm-attacks/llm-attacks
Universal and Transferable Attacks on Aligned Language Models
PKU-Alignment/safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
yaodongC/awesome-instruction-dataset
A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca)
allenai/natural-instructions
Expanding natural instructions
causaltext/causal-text-papers
Curated research at the intersection of causal inference and natural language processing.
zhijing-jin/Causality4NLP_Papers
A reading list for papers on causality for natural language processing (NLP)
wjmaddox/swa_gaussian
Code repo for "A Simple Baseline for Bayesian Uncertainty in Deep Learning"
HowieHwong/TrustLLM
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
voidism/DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
SCLBD/BackdoorBench
AI-secure/DecodingTrust
A Comprehensive Assessment of Trustworthiness in GPT Models
P2333/Bag-of-Tricks-for-AT
Empirical tricks for training robust models (ICLR 2021)
LLM-Tuning-Safety/LLMs-Finetuning-Safety
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.
thunlp/OpenBackdoor
An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)
cooperleong00/Awesome-LLM-Interpretability
A curated list of LLM interpretability material: tutorials, libraries, surveys, papers, blogs, etc.
Aligner2024/aligner
Achieving Efficient Alignment through Learned Correction
causalNLP/corr2cause
Data and code for the Corr2Cause paper (ICLR 2024)
nrimsky/LM-exp
LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces
niconi19/LLM-Conversation-Safety
[NAACL 2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
GodXuxilie/PromptAttack
An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024)
wang2226/Trojan-Activation-Attack