intel-analytics/ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Python · Apache-2.0
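The description above mentions HuggingFace integration; as a quick illustration, here is a minimal sketch of loading a model through ipex-llm's `ipex_llm.transformers` wrapper with low-bit optimization, following the project's documented usage. The model id, prompt, and the Intel GPU (`"xpu"`) device are assumptions for illustration; on a CPU-only machine, drop the `.to("xpu")` calls.

```python
# Minimal sketch of ipex-llm's HuggingFace transformers integration.
# The model id and prompt below are placeholder assumptions.
import torch
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-chat-hf"  # placeholder model id

# load_in_4bit=True applies ipex-llm's low-bit (INT4) optimization at load time.
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)
# Move the model to an Intel GPU ("xpu"); omit this line to run on CPU.
model = model.to("xpu")

tokenizer = AutoTokenizer.from_pretrained(model_path)
input_ids = tokenizer.encode("What is AI?", return_tensors="pt").to("xpu")

with torch.inference_mode():
    output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The point of the design is that quantization happens once at load time via `load_in_4bit=True`, so everything after `from_pretrained` is ordinary `transformers` code.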
Stargazers
- bhatti · PlexObject Solutions, Inc.
- blazingsiyan · Shanghai
- codekk · embryonic
- denisfitz57
- dispalt · @goodcover
- erikreppel · @ourzora
- fairchild · Procore
- fggarcia
- ghosthamlet · The Rest Is Silence of Code
- ivanistheone · Minireference Co.
- jiangplus · Shenzhen, China
- joshcutler · @onda-ai
- jthelin · @microsoft
- laknath · Applied Artificial Intelligence Institute, Deakin University
- lizhizhou · Morgan Stanley
- MerlinDE · Hamburg, Germany
- montanaflynn
- nimitagr · Wise (@transferwise)
- nlothian · various, Apache
- nwt-patrick
- parano · San Francisco, CA
- piskvorky · @RARE-Technologies
- plokhotnyuk · Disney Streaming
- qbig · @us3r-network
- Rahul-Raviprasad · @ServiceNow
- ramiyer
- ravipinto · Oracle
- rmnoon · San Francisco, CA
- RX-01
- soroushmehr · Microsoft Research (prev. Maluuba and MILA-UdeM)
- stvhanna
- tensortalk · You're on TensorTalk.com!
- tsubame959
- wahtherewahhere
- yubozhao · San Francisco
- zsherman · @datadog