NicoNico6
Ph.D. student at the Hasso Plattner Institute, University of Potsdam, Germany. Mainly focusing on efficient deep neural network research.
Hasso Plattner Institute (HPI), Potsdam, Germany
Pinned Repositories
green-bit-llm
A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs.
low_bit_llama
Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs
BNext
Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
bitorch-engine
A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks.
Deep-Learning-Interview-Book
Deep learning interview handbook (covering mathematics, machine learning, deep learning, computer vision, natural language processing, SLAM, and more)
Hyper-BinaryNet
Code for the paper "Gradients Matters: Designing Binarized Neural Network via Enhanced Information Flow"
lq-lora
MMS
Pharmaceutical management information system
pytorch-tutorial
PyTorch Tutorial for Deep Learning Researchers
ShadowRemoval
An implementation of "Towards Ghost-free Shadow Removal" using PyTorch
NicoNico6's Repositories
NicoNico6/bitorch-engine
A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks.
NicoNico6/AgentK
An autoagentic AGI that is self-evolving and modular.
NicoNico6/AQLM
Official PyTorch repository for "Extreme Compression of Large Language Models via Additive Quantization" (https://arxiv.org/pdf/2401.06118.pdf)
NicoNico6/Awesome-LLM-Compression
Awesome LLM compression research papers and tools.
NicoNico6/BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
NicoNico6/buffer-of-thought-llm
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
NicoNico6/ChatMLX
ChatMLX is a real-time conversation app for large models, implemented using MLX. 🚧
NicoNico6/dotai
NicoNico6/ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents
NicoNico6/evolutionary-model-merge
Official repository of Evolutionary Optimization of Model Merging Recipes
NicoNico6/fast-hadamard-transform
Fast Hadamard transform in CUDA, with a PyTorch interface
NicoNico6/gallama
NicoNico6/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
NicoNico6/KIVI
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
NicoNico6/KVQuant
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
NicoNico6/langgraph
Build resilient language agents as graphs.
NicoNico6/LangGraph-Swift
🚀 LangGraph for Swift. A library for building stateful, multi-actor applications with LLMs, built to work jointly with langchain-swift
NicoNico6/llamafile
Distribute and run LLMs with a single file.
NicoNico6/lm-polygraph
NicoNico6/metal-flash-attention
FlashAttention (Metal Port)
NicoNico6/mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
NicoNico6/octopus-v4
AI for all: Build the large graph of the language models
NicoNico6/PainlessInferenceAcceleration
NicoNico6/private_llm
NicoNico6/PruneMe
NicoNico6/Pruner-Zero
Evolving Symbolic Pruning Metric from scratch
NicoNico6/QQQ
QQQ is an innovative and hardware-optimized W4A8 quantization solution.
NicoNico6/ShiftAddLLM
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
NicoNico6/SpinQuant
Code repo for the paper "SpinQuant: LLM Quantization with Learned Rotations"
NicoNico6/transformerlab-app
Experiment with Large Language Models