OctoAI

Optimizing machine learning using machine learning

Seattle

Pinned Repositories

Apple-M1-BERT
3X speedup over Apple’s TensorFlow plugin by using Apache TVM on M1
Language: Python 138 51 211
deformable-attention-kernel
TVMScript kernel for deformable attention
Language: Python 25 42 04
macho-dyld
Custom dyld version inherited from original Apple dyld implementation
Language: C++ 17 44 01
octoai-textgen-cookbook
Simple getting-started code examples for LLM applications powered by OctoAI
Language: Python 46 29 121
octocloud-templates
Language: Python 3 27 03
octoml-llm-qa
A code sample that shows how to use 🦜️🔗langchain, 🦙llama_index and a hosted LLM endpoint to do a standard chat or Q&A about a pdf document
Language: Python 19 31 38
octoml-profile
Home for OctoML PyTorch Profiler
114 3 29
synr
A library for syntactically rewriting Python programs, pronounced (sinner).
Language: Python 68 41 411
triton-client-rs
A client library in Rust for Nvidia Triton.
Language: Rust 30 36 24
tvm2onnx
An open-source tool created by OctoML that converts TVM-optimized models to code runnable in ONNX Runtime.
Language: Python 17 33 11

OctoAI's Repositories

octoml/octoai-textgen-cookbook
Simple getting-started code examples for LLM applications powered by OctoAI
Language: Python 46 29 121
octoml/macho-dyld
Custom dyld version inherited from original Apple dyld implementation
Language: C++ 17 44 01
octoml/mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
Language: Python 5 1 168
octoml/octoai-apps
A collection of OctoAI-based demos.
Language: TypeScript 5 28 10
octoml/fern-config
Configuration for generating SDKs and Documentation.
Language: MDX 2 22 03
octoml/flashinfer
FlashInfer: Kernel Library for LLM Serving
Language: Cuda 2 0 0
octoml/pre-commit-kustomize
pre-commit hook which runs kustomize docker image (use with https://github.com/pre-commit/pre-commit)
Language: Dockerfile 2 1 0
octoml/.github
1 39 0
octoml/EAGLE
OctoML Implementation of EAGLE-1 and EAGLE-2
Language: Python 1 0 0
octoml/llama-recipes
Examples and recipes for Llama 2 model
Language: Jupyter Notebook 1 1 0
octoml/vllm-project
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 1 1 0
octoml/langchain
⚡ Building applications with LLMs through composability ⚡
Language: Python 0 1 00
octoml/demo-design-system
Language: TypeScript 2 0
octoml/docker_auth
Authentication server for Docker Registry 2
Language: Go 1 0
octoml/go-jose
An implementation of JOSE standards (JWE, JWS, JWT) in Go
Language: Go 1 0
octoml/go-oidc
A Go OpenID Connect client.
Language: Go 1 0
octoml/homebrew-tap
Homebrew Tap of OctoML products and tools.
Language: Ruby 4 0
octoml/lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
Language: Python 0 0
octoml/msi-fe
Language: Python 1 0
octoml/multicloud-asset-code-review-example
Multicloud Asset Code Review Public Repo example.
Language: Python 1 0
octoml/octo-bots-mirror
Language: Python 1 0
octoml/octoai-model-examples
A set of models you can build and deploy on OctoAI
Language: Python 31 01
octoml/octoai-solutions
A collection of reference solutions built on top of OctoAI SaaS
Language: Python 3 0
octoml/photobooth-bg-gen
Language: TypeScript 1 0
octoml/pinecone-rag-demo
Pinecone + Vercel RAG application, showcasing a comparison between chat with no context and using a Pinecone index for context
Language: HTML 1 0
octoml/RULER
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
Language: Python 0 0
octoml/sagemaker-examples
Language: Jupyter Notebook 22 0
octoml/TensorRT-LLM-release
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language: C++ 1 0
octoml/tflint-ruleset-google
TFLint ruleset for terraform-provider-google
Language: Go 1 0
octoml/unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Language: HTML 1 0