sara4dev's Stars
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
papers-we-love/papers-we-love
Papers from the computer science community to read and discuss.
ollama/ollama
Get up and running with Llama 3, Mistral, Gemma 2, and other large language models.
ggerganov/llama.cpp
LLM inference in C/C++
meta-llama/llama
Inference code for Llama models
ray-project/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
microsoft/autogen
A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap
apache/skywalking
APM: Application Performance Monitoring System
karpathy/llm.c
LLM training in simple, raw C/CUDA
qdrant/qdrant
Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
chroma-core/chroma
the AI-native open-source embedding database
AI4Finance-Foundation/FinGPT
FinGPT: Open-Source Financial Large Language Models 🔥. We release the trained models on Hugging Face.
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Demo apps showcase Meta Llama 3 for WhatsApp & Messenger.
jdx/mise
dev tools, env vars, task runner
rapidsai/cudf
cuDF - GPU DataFrame Library
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inference solution.
Syllo/nvtop
GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
rancher-sandbox/rancher-desktop
Container Management and Kubernetes on the Desktop
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
ROCm/ROCm
AMD ROCm™ Software - GitHub Home
diggerhq/digger
Digger is an open source IaC orchestration tool. Digger allows you to run IaC in your existing CI pipeline ⚡️
Netflix/bpftop
bpftop provides a dynamic real-time view of running eBPF programs. It displays the average runtime, events per second, and estimated total CPU % for each program.
predibase/lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
ImplFerris/LearnRust
Rust Learning Resources
huggingface/optimum-nvidia
nixys/nxs-universal-chart
A Helm chart you can use to install any of your applications into Kubernetes/OpenShift
bazel-contrib/rules_oci
Bazel rules for building OCI containers
anyscale/ray-summit-2023-training