Pinned Repositories
Inference_Portal
Instagram-prediction
gpt-neox
An implementation of model-parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
django-vectordb
A fast, scalable app that adds vector search to your Django applications. It offers low-latency search, native Django integration, and automatic syncing between your models and the vector index. Incremental updates are supported out of the box.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs