soheeyang
PhD student/Intern at UCL/DeepMind. Previously MS student at KAIST AI and research engineer at Naver Clova. NLP & ML. Wherever curiosity leads me.
UCL/DeepMind · London, United Kingdom
soheeyang's Stars
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
meta-llama/llama3
The official Meta Llama 3 GitHub site
koekeishiya/yabai
A tiling window manager for macOS based on binary space partitioning
microsoft/JARVIS
JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
openai/chatgpt-retrieval-plugin
The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
karpathy/llama2.c
Inference Llama 2 in one file of pure C
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
BlinkDL/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
databrickslabs/dolly
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
OptimalScale/LMFlow
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
openai/transformer-debugger
TransformerLensOrg/TransformerLens
A library for mechanistic interpretability of GPT-style language models
Farama-Foundation/chatarena
ChatArena (or Chat Arena) is a multi-agent language game environment for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.
openai/automated-interpretability
allenai/natural-instructions
Expanding natural instructions
google-research-datasets/dstc8-schema-guided-dialogue
The Schema-Guided Dialogue Dataset
reasoning-machines/pal
PaL: Program-Aided Language Models (ICML 2023)
princeton-nlp/LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
google-deepmind/synjax
manyoso/haltt4llm
This project is an attempt to create a common metric for testing LLMs' progress in eliminating hallucinations, which is the most serious current obstacle to the widespread adoption of LLMs for many real purposes.
jax-ml/oryx
Oryx is a library for probabilistic programming and deep learning built on top of Jax.
TransformerLensOrg/CircuitsVis
Mechanistic Interpretability Visualizations using React
ArthurConmy/Automatic-Circuit-Discovery
OSU-NLP-Group/GrokkedTransformer
Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
google-research-datasets/presto
A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs
seonghyeonye/TAPP
[AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
edenbiran/RippleEdits
Evaluating the Ripple Effects of Knowledge Editing in Language Models
Nix07/finetuning
This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking".
edenbiran/HoppingTooLate
Exploring the Limitations of Large Language Models on Multi-Hop Queries