Real-bojack's Stars
geekan/MetaGPT
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
openai/swarm
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
langgenius/dify
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
aiwaves-cn/agents
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
SamuelSchmidgall/AgentClinic
Agent benchmark for medical diagnosis
OpenBMB/XAgent
An Autonomous LLM Agent for Complex Task Solving
gangiswag/llm-reranker
Marker-Inc-Korea/AutoRAG
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
BBuf/how-to-optim-algorithm-in-cuda
How to optimize various algorithms in CUDA.
Bruce-Lee-LY/cuda_hgemm
Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.
percent4/embedding_rerank_retrieval
This project evaluates recall techniques and their algorithmic effectiveness for the Retrieve stage of RAG. The main framework used is LlamaIndex.
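Since the experiments are built on LlamaIndex, the retrieve stage being evaluated roughly corresponds to a setup like the following minimal sketch (illustrative only, not code from this repo; assumes `pip install llama-index`, documents in a local `data/` directory, and a default embedding model configured, e.g. an OpenAI key):

```python
# Minimal LlamaIndex retrieval sketch (illustrative; not taken from this repo).
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()   # load raw documents
index = VectorStoreIndex.from_documents(documents)      # build a vector index (uses the configured embedding model)
retriever = index.as_retriever(similarity_top_k=5)      # the retrieve-stage object under evaluation

nodes = retriever.retrieve("What does reranking add to RAG?")
for node in nodes:
    print(node.score, node.node.get_content()[:80])
```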
AnswerDotAI/rerankers
A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
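A rough sketch of that unified API is shown below; the model name is just an example and the exact result fields should be checked against the repo's README:

```python
# Hedged sketch of the rerankers unified API.
from rerankers import Reranker

# Load a cross-encoder reranker; the model name here is only an example.
ranker = Reranker("cross-encoder/ms-marco-MiniLM-L-6-v2", model_type="cross-encoder")

results = ranker.rank(
    query="What is retrieval-augmented generation?",
    docs=[
        "RAG combines a retriever with a generator.",
        "CUDA is a parallel computing platform.",
    ],
)
for r in results.top_k(2):
    print(r.score, r.document.text)
```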
HeKun-NVIDIA/CUDA-Programming-Guide-in-Chinese
This is a Chinese translation of the CUDA programming guide
linkedin/Liger-Kernel
Efficient Triton Kernels for LLM Training
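A hedged sketch of how these kernels are typically swapped into a Hugging Face model (class name per the project's README; the model path is a placeholder and a supported architecture is assumed):

```python
# Drop-in replacement for AutoModelForCausalLM that patches supported layers
# (e.g. RMSNorm, RoPE, fused cross-entropy) with Liger's Triton kernels.
# Assumes `pip install liger-kernel transformers`; path is a placeholder.
from liger_kernel.transformers import AutoLigerKernelForCausalLM

model = AutoLigerKernelForCausalLM.from_pretrained("path/to/your/model")
```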
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
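A minimal embedding sketch with the BGE models this repo ships (model name is an example; embeddings are assumed to be normalized so a dot product acts as cosine similarity):

```python
# Sketch of dense retrieval scoring with FlagEmbedding; assumes `pip install FlagEmbedding`.
from FlagEmbedding import FlagModel

model = FlagModel("BAAI/bge-base-en-v1.5", use_fp16=True)
queries = ["what does a reranker do?"]
passages = ["A reranker rescores retrieved passages with a cross-encoder."]

q_emb = model.encode_queries(queries)   # query-side encoding
p_emb = model.encode(passages)          # passage-side encoding
scores = q_emb @ p_emb.T                # cosine-style similarity
print(scores)
```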
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
xgqdut2016/cuda_code
Easy-to-follow CUDA example code.
DefTruth/CUDA-Learn-Notes
📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
nvixnu/pmpp__programming_massively_parallel_processors
Examples and exercises from the book Programming Massively Parallel Processors: A Hands-on Approach by David B. Kirk and Wen-mei W. Hwu (Third Edition).
NVIDIA/cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit.
gpu-mode/lectures
Material for gpu-mode lectures
facebookresearch/mae
PyTorch implementation of MAE: https://arxiv.org/abs/2111.06377
abetlen/llama-cpp-python
Python bindings for llama.cpp
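A minimal usage sketch of these bindings (the GGUF path is a placeholder):

```python
# Minimal llama-cpp-python chat completion; model path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="./models/your-model.gguf", n_ctx=2048)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain KV caching in one sentence."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```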
kwai/Megatron-Kwai
[USENIX ATC '24] Accelerating the Training of Large Language Models using Efficient Activation Rematerialization and Optimal Hybrid Parallelism
microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including BERT & GPT-2.
microsoft/DeepSpeedExamples
Example models using DeepSpeed
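Most of these examples follow the standard DeepSpeed training pattern; a hedged minimal sketch (config values are illustrative, and the script would normally be launched with the `deepspeed` launcher):

```python
# Hedged minimal DeepSpeed setup (illustrative config, not from this repo).
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for a real transformer model
ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model with ZeRO partitioning, mixed precision, etc.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)
```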
infiniflow/ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
labring/FastGPT
FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you easily develop and deploy complex question-answering systems without extensive setup or configuration.
xorbitsai/inference
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
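The "single line" claim usually comes down to pointing an OpenAI-compatible client at a local Xinference endpoint; in the sketch below the host, port, and model name are assumptions:

```python
# Point the standard OpenAI client at a local Xinference server
# (default port assumed to be 9997; model must already be launched in Xinference).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="qwen2-instruct",  # example name of a model launched via Xinference
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```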
ollama/ollama
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
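A quick sketch using the official Ollama Python client (assumes `ollama serve` is running locally, the `ollama` pip package is installed, and the model has been pulled, e.g. `ollama pull llama3.3`):

```python
# Chat with a locally served model via the Ollama Python client.
import ollama

response = ollama.chat(
    model="llama3.3",
    messages=[{"role": "user", "content": "Summarize what a GGUF file is."}],
)
print(response["message"]["content"])
```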