ilyalasy's Stars
karpathy/llm.c
LLM training in simple, raw C/CUDA
teacherpeterpan/Logic-LLM
The project page for "LOGIC-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning"
IBM/ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
EleutherAI/elk
Keeping language models honest by directly eliciting knowledge encoded in their activations.
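For intuition, here is a generic activation-probing sketch: fit a linear classifier on hidden states to recover a "truth" direction. This is the simpler supervised variant, not the repo's unsupervised CCS method, and the activations below are random stand-ins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins for hidden states; in practice these come from a chosen transformer layer.
rng = np.random.default_rng(0)
acts_true = rng.normal(loc=+0.5, size=(200, 64))    # activations for "true" statements
acts_false = rng.normal(loc=-0.5, size=(200, 64))   # activations for "false" statements

X = np.vstack([acts_true, acts_false])
y = np.array([1] * 200 + [0] * 200)

probe = LogisticRegression(max_iter=1000).fit(X, y)  # a linear direction in activation space
print("probe accuracy:", probe.score(X, y))
```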
princeton-nlp/SWE-agent
[NeurIPS 2024] SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.
microsoft/LMOps
General technology for enabling AI capabilities with LLMs and MLLMs
noamgat/lm-format-enforcer
Enforce the output format (JSON Schema, regex, etc.) of a language model
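A minimal, self-contained sketch of the constrained-decoding idea behind such enforcers (this is illustrative only, not this library's API): at each step, only continuations that keep the partial output extendable to a valid completion are allowed.

```python
# Toy format enforcer: real implementations work over tokenizer vocabularies
# and track a JSON Schema / regex automaton instead of a fixed list.
ALLOWED = ['{"answer": "yes"}', '{"answer": "no"}']

def allowed_next_chars(prefix: str) -> set[str]:
    """Characters that keep `prefix` a prefix of some valid full output."""
    return {s[len(prefix)] for s in ALLOWED
            if s.startswith(prefix) and len(s) > len(prefix)}

def constrained_greedy_decode(score) -> str:
    """`score(prefix, ch)` stands in for the model's next-token preference."""
    out = ""
    while True:
        options = allowed_next_chars(out)
        if not options:            # no legal continuation -> output is complete
            return out
        out += max(options, key=lambda ch: score(out, ch))

# Even a "model" that always prefers 'y' is forced to emit valid JSON.
print(constrained_greedy_decode(lambda prefix, ch: 1.0 if ch == "y" else 0.0))
```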
SakanaAI/evolutionary-model-merge
Official repository of Evolutionary Optimization of Model Merging Recipes
lucidrains/soft-moe-pytorch
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
IST-DASLab/qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
sergree/matchering
🎚️ Open Source Audio Matching and Mastering
mlfoundations/task_vectors
Editing Models with Task Arithmetic
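The core idea is small enough to sketch: a task vector is the element-wise difference between fine-tuned and pre-trained weights, and tasks can be added (or negated to "forget") by arithmetic on those vectors. A minimal PyTorch sketch over state dicts; the tensors and the scaling coefficient are illustrative.

```python
import torch

def task_vector(pretrained: dict, finetuned: dict) -> dict:
    """tau = theta_finetuned - theta_pretrained, per parameter tensor."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def apply_task_vectors(pretrained: dict, vectors: list, scale: float = 1.0) -> dict:
    """theta_new = theta_pretrained + scale * sum(tau_i); negate a tau to remove a task."""
    merged = {k: v.clone() for k, v in pretrained.items()}
    for tau in vectors:
        for k in merged:
            merged[k] += scale * tau[k]
    return merged

# Tiny illustration with made-up tensors standing in for real state dicts.
base = {"w": torch.zeros(2, 2)}
ft_a = {"w": torch.ones(2, 2)}
print(apply_task_vectors(base, [task_vector(base, ft_a)], scale=0.5)["w"])
```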
arcee-ai/mergekit
Tools for merging pretrained large language models.
pjlab-sys4nlp/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
lucidrains/st-moe-pytorch
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
ilyalasy/moe-routing
Analysis of token routing for different implementations of Mixture of Experts
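For context, a minimal top-2 router of the kind these analyses inspect; shapes and names are illustrative, not any specific repo's code.

```python
import torch
import torch.nn.functional as F

def top2_route(x: torch.Tensor, w_gate: torch.Tensor):
    """x: (tokens, d_model), w_gate: (d_model, n_experts).
    Returns the chosen expert indices and normalized routing weights per token."""
    logits = x @ w_gate                         # (tokens, n_experts)
    weights, experts = logits.topk(2, dim=-1)   # keep the 2 highest-scoring experts per token
    weights = F.softmax(weights, dim=-1)        # renormalize over the chosen experts
    return experts, weights

x = torch.randn(8, 16)          # 8 tokens with 16-dim hidden states
w_gate = torch.randn(16, 4)     # 4 experts
experts, weights = top2_route(x, w_gate)
print(experts)                  # which experts each token is dispatched to
```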
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
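For intuition, a sketch of plain group-wise 4-bit round-to-nearest quantization. AWQ additionally rescales salient weight channels using activation statistics before quantizing; that step is omitted here, so this is only the baseline idea.

```python
import torch

def quantize_int4_groupwise(w: torch.Tensor, group_size: int = 128):
    """Symmetric round-to-nearest 4-bit quantization with one scale per group of columns.
    Returns integer codes in [-8, 7] plus the per-group scales needed to dequantize."""
    rows, cols = w.shape
    w = w.reshape(rows, cols // group_size, group_size)
    scales = (w.abs().amax(dim=-1, keepdim=True) / 7.0).clamp_min(1e-8)
    q = torch.clamp(torch.round(w / scales), -8, 7)
    return q.reshape(rows, cols), scales

def dequantize(q: torch.Tensor, scales: torch.Tensor, group_size: int = 128):
    rows, cols = q.shape
    q = q.reshape(rows, cols // group_size, group_size)
    return (q * scales).reshape(rows, cols)

w = torch.randn(4, 256)
q, s = quantize_int4_groupwise(w)
print((w - dequantize(q, s)).abs().mean())   # average quantization error
```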
koxudaxi/datamodel-code-generator
Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
karpathy/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
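The BPE training loop itself fits in a few lines; a minimal sketch in the spirit of that repo, operating on raw byte ids (helper names are illustrative).

```python
from collections import Counter

def train_bpe(ids: list, num_merges: int):
    """Repeatedly merge the most frequent adjacent pair of ids into a new token id."""
    merges = {}                                  # (id, id) -> new id
    next_id = 256                                # byte values occupy 0..255
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair = pairs.most_common(1)[0][0]        # most frequent adjacent pair
        merges[pair] = next_id
        # replace every occurrence of the pair with the new token id
        out, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
                out.append(next_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids = out
        next_id += 1
    return ids, merges

ids, merges = train_bpe(list("low lower lowest".encode("utf-8")), num_merges=5)
print(merges)
```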
XueFuzhao/OpenMoE
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
THU-KEG/KEPLER
Source code for TACL paper "KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation".
facebookresearch/LAMA
LAnguage Model Analysis
hadasah/btm
deepseek-ai/DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
OFA-Sys/Ditto
A self-alignment method and benchmark for role-play. Resources for "Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment".
dojoteef/storium-gpt2
Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Loop Story Generation"
fabrahman/char-centric-story
Codebase for character-centric story understanding
BerriAI/litellm
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
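Typical usage is a single OpenAI-style call routed to any backend; a minimal sketch (the model string and credentials are placeholders, and exact response fields can vary by provider and version).

```python
import os
from litellm import completion

os.environ["OPENAI_API_KEY"] = "sk-..."   # provider credentials are read from env vars

response = completion(
    model="gpt-4o-mini",                  # swap for another provider's model string
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)   # responses follow the OpenAI schema
```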
ianarawjo/ChainForge
An open-source visual programming environment for battle-testing prompts to LLMs.
c32168/dyntamic
Generate pydantic models from JSON Schema
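The underlying idea can be sketched with pydantic's own `create_model`; this is a generic illustration of runtime model generation from a flat JSON Schema, not dyntamic's actual interface.

```python
from typing import Optional
from pydantic import create_model

# Map JSON Schema primitive types to Python types (subset, for illustration).
TYPE_MAP = {"string": str, "integer": int, "number": float, "boolean": bool}

def model_from_schema(schema: dict):
    """Build a pydantic model class at runtime from a flat JSON Schema object."""
    required = set(schema.get("required", []))
    fields = {}
    for name, prop in schema.get("properties", {}).items():
        py_type = TYPE_MAP[prop["type"]]
        if name in required:
            fields[name] = (py_type, ...)            # Ellipsis marks a required field
        else:
            fields[name] = (Optional[py_type], None)
    return create_model(schema.get("title", "DynamicModel"), **fields)

User = model_from_schema({
    "title": "User",
    "required": ["name"],
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
})
print(User(name="Ada", age=36))
```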