zhiyuan8
Building @NexaAI, previously at @google and Amazon Lab126. Passionate AI developer, committed to lifelong learning.
Nexa AI Inc
Pinned Repositories
Awesome-LLMs-on-device
Awesome LLMs on Device: A Comprehensive Survey
nexa-sdk
Run the latest LLMs and VLMs across GPU, NPU, and CPU with PC (Python/C++) & mobile (Android & iOS) support, running quickly with OpenAI gpt-oss, Granite4, Qwen3VL, Gemma 3n and more.
octopus-v4
AI for all: Build the large graph of the language models
AutoCompressors
[EMNLP 2023] Adapting Language Models to Compress Long Contexts
Docker-Kubernetes-GKE
Kubernetes, Google Kubernetes Engine (GKE), Docker, Docker Compose, Docker Swarm
FastAPI-websocket-tutorial
Build dynamic, secure APIs with FastAPI: Features DB integration, real-time WebSocket, streaming, and efficient request handling with middleware, powered by Starlette and Pydantic.
LLM-Search-Recommendation-System
Search Algorithms and related LLM & agent frameworks
Python-concurrency
Python concurrency, web crawler, and Redis
qnn-mha2sha
speech_detection
A real-time analyzer to detect normal speech/abusive speech/noise
zhiyuan8's Repositories
zhiyuan8/FastAPI-websocket-tutorial
Build dynamic, secure APIs with FastAPI: Features DB integration, real-time WebSocket, streaming, and efficient request handling with middleware, powered by Starlette and Pydantic.
zhiyuan8/speech_detection
A real-time analyzer to detect normal speech/abusive speech/noise
zhiyuan8/LLM-Search-Recommendation-System
Search Algorithms and related LLM & agent frameworks
zhiyuan8/Python-concurrency
Python concurrency, web crawler, and Redis
zhiyuan8/qnn-mha2sha
zhiyuan8/AutoCompressors
[EMNLP 2023] Adapting Language Models to Compress Long Contexts
zhiyuan8/Docker-Kubernetes-GKE
Kubernetes, Google Kubernetes Engine (GKE), Docker, Docker Compose, Docker Swarm
zhiyuan8/rust-tutorial
zhiyuan8/twitter-Recommendation-Algorithm
Source code for Twitter's Recommendation Algorithm
zhiyuan8/zhiyuan8
My GitHub profile
zhiyuan8/homebrew-go
zhiyuan8/homebrew-go-release