ojaffe

Pinned Repositories

Deception
Language:Jupyter Notebook00
evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Language:Python0 0 00
llm-attacks
Universal and Transferable Attacks on Aligned Language Models
Language:Python0 0 00
sandbagging_probes
Language:Python0 1 00
tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
Language:Python01
evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Language:Python15.2k 263 2152.6k
mle-bench
MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering
Language:Python562 9 1262
SWE-bench
[ICLR 2024] SWE-bench: Can Language Models Resolve Real-world Github Issues?
Language:Python2.1k 26 166363
MLAgentBench
Language:Python257 6 635
aideml
AIDE: the state-of-the-art machine learning engineer agent, generating machine learning solution code from natural language descriptions.
Language:Python662 19 1174

ojaffe/Polyphonic-OMR
Automatically identifies notes within an image of a musical piece
Language:Python10
ojaffe/batch_export
MuseScore plugin to convert various input formats into various output formats
Language:QML00
ojaffe/Deception
Language:Jupyter Notebook00
ojaffe/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Language:Python0 0 00
ojaffe/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Language:Python00
ojaffe/llama2.py
Inference Llama 2 in one file of pure Python
Language:Python00
ojaffe/llm-attacks
Universal and Transferable Attacks on Aligned Language Models
Language:Python0 0 00
ojaffe/polyphonic-omr-baseline
Code used in research that led to the paper "An Empirical Evaluation of End-to-End Polyphonic Optical Music Recognition" (ISMIR 2021)
Language:QML00
ojaffe/Remove-First-Score
Plugin for MuseScore, removes files
Language:QML00
ojaffe/sandbagging_probes
Language:Python0 1 00
ojaffe/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
Language:Python01
ojaffe/tinystories_robust_probes
Language:Python
ojaffe/TruthfulQA-Finetuning
Efficient finetuning of huggingface GPT-2 models on TruthfulQA with a single GPU.
Language:Python