Pinned Repositories
ipex-llm
Accelerate local LLM inference and fine-tuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., a local PC with iGPU and NPU, or a discrete GPU such as Arc, Flex, and Max); seamlessly integrates with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc.
neural-compressor
State-of-the-art low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) and sparsity; leading model compression techniques for TensorFlow, PyTorch, and ONNX Runtime.
ai-documents
diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
intel-extension-for-tensorflow
Intel® Extension for TensorFlow*
neural-compressor
Intel® Neural Compressor (formerly Intel® Low Precision Optimization Tool) provides unified APIs for network compression techniques such as low-precision quantization, sparsity, pruning, and knowledge distillation across different deep learning frameworks, targeting optimal inference performance.
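To give a sense of what the low-precision quantization mentioned above involves, here is a minimal, self-contained sketch of symmetric per-tensor INT8 quantization in plain Python. This is an illustration of the general technique, not Neural Compressor's actual API; the function names are hypothetical.

```python
def quantize_int8(values):
    """Map floats to int8 codes using a single symmetric per-tensor scale.

    Hypothetical helper for illustration only; real toolkits (including
    Neural Compressor) add per-channel scales, calibration, and more.
    """
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / 127.0  # int8 representable range is [-128, 127]
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from int8 codes and the scale."""
    return [c * scale for c in codes]


weights = [0.5, -1.2, 3.3, 0.0]
codes, scale = quantize_int8(weights)
approx = dequantize_int8(codes, scale)
```

The round-trip error of each value is bounded by half the scale, which is why larger dynamic ranges (handled by per-channel scales or formats like NF4) matter in practice.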
oneapi-hackathon
ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Unnati-FakeNews-Detection
Unnati_workshop
nazneenn's Repositories
nazneenn/Unnati_workshop
nazneenn/oneapi-hackathon
nazneenn/Unnati-FakeNews-Detection
nazneenn/ai-documents
nazneenn/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
nazneenn/intel-extension-for-tensorflow
Intel® Extension for TensorFlow*
nazneenn/neural-compressor
Intel® Neural Compressor (formerly Intel® Low Precision Optimization Tool) provides unified APIs for network compression techniques such as low-precision quantization, sparsity, pruning, and knowledge distillation across different deep learning frameworks, targeting optimal inference performance.
nazneenn/ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.