Pinned Repositories
ai-documents
docs
This repo contains documentation for the OPEA project
Gaudi-tutorials
Tutorials for running models on first-gen Gaudi and Gaudi2 for training and inference; these are the source files for the tutorials on https://developer.habana.ai/
GenAIComps
GenAI components at the microservice level, plus a GenAI service composer for creating mega-services
GenAIEval
Evaluation, benchmarking, and scorecards, targeting performance (throughput and latency), accuracy on popular evaluation harnesses, safety, and hallucination
GenAIExamples
Generative AI Examples is a collection of GenAI example applications, such as ChatQnA and Copilot, that illustrate the pipeline capabilities of the Open Platform for Enterprise AI (OPEA) project.
intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; apply SOTA compression techniques for LLMs; run LLMs efficiently on Intel platforms ⚡
model_server
A scalable inference server for models optimized with OpenVINO™
neural-speed
An innovative library for efficient LLM inference via low-bit quantization and sparsity
oneAPI-samples
Samples for Intel oneAPI toolkits
xiguiw's Repositories
xiguiw/ai-documents
xiguiw/docs
xiguiw/Gaudi-tutorials
xiguiw/GenAIComps
xiguiw/GenAIEval
xiguiw/GenAIExamples
xiguiw/intel-extension-for-transformers
xiguiw/model_server
xiguiw/neural-speed
xiguiw/oneAPI-samples
xiguiw/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs