LLMS-OSX


Introduction

This guide is tailored for users with Apple Silicon devices who want to experiment with Large Language Models (LLMs). It summarizes key repositories and resources, focusing on those optimized for Apple Silicon hardware, and covers Retrieval-Augmented Generation (RAG), training and fine-tuning, vision-language models (VLM), datasets, and user interfaces (UI). Each section links to repositories and tools to help you explore and run LLMs effectively on Apple Silicon.

Retrieval-Augmented Generation (RAG) Repositories:

ChatLLM.cpp

  • Repository: chatllm.cpp
  • Summary: Documentation and code for implementing Retrieval-Augmented Generation (RAG) with locally run LLMs, including guidelines and examples for integrating a retrieval step into the generation pipeline.

ModelScope RAG

  • Repository: ModelScope RAG
  • Summary: This Jupyter notebook is a tutorial on implementing RAG with reranking using the LlamaIndex library, showing how a reranking stage over the retrieved candidates improves retrieval-augmented generation; a minimal sketch of the same pattern follows below.
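
As a rough illustration of what the notebook walks through, a rerank step in LlamaIndex can be sketched in a few lines. This is not the notebook's code: it assumes a recent llama-index release with sentence-transformers installed, a local data/ directory, the BAAI/bge-reranker-base reranker, and whatever default LLM and embedding model you have configured (OpenAI by default, or a local model via Settings).

```python
# A compact RAG-with-rerank pipeline in LlamaIndex: retrieve a wide candidate set,
# then let a cross-encoder reranker pick the best passages before generation.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.postprocessor import SentenceTransformerRerank

documents = SimpleDirectoryReader("data").load_data()       # any local folder of files
index = VectorStoreIndex.from_documents(documents)          # embeds and indexes the chunks

reranker = SentenceTransformerRerank(model="BAAI/bge-reranker-base", top_n=3)
query_engine = index.as_query_engine(
    similarity_top_k=10,                # fetch 10 candidates by vector similarity...
    node_postprocessors=[reranker],     # ...then keep the 3 best after reranking
)

print(query_engine.query("What does the corpus say about Apple Silicon?"))
```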

Quad-AI RAG

  • Repository: Quad-AI RAG
  • Summary: This notebook demonstrates the implementation of RAG using llama.cpp for effective information retrieval and response generation. It includes step-by-step instructions and code snippets.
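
Independent of the notebook itself, the two llama.cpp ingredients a local RAG pipeline needs can be sketched with the llama-cpp-python bindings; the GGUF file names below are placeholders, not files shipped with the repository.

```python
# The two llama.cpp pieces a local RAG pipeline needs, via llama-cpp-python:
# a GGUF embedding model for retrieval and a GGUF instruct model for generation.
from llama_cpp import Llama

embedder = Llama(model_path="nomic-embed-text-v1.5.Q8_0.gguf", embedding=True, verbose=False)
vector = embedder.create_embedding("Apple Silicon unified memory")["data"][0]["embedding"]
print(len(vector))                      # dimensionality of the embedding

llm = Llama(model_path="llama-3-8b-instruct.Q4_K_M.gguf", n_ctx=4096, n_gpu_layers=-1)
out = llm(
    "Context:\n<retrieved passages go here>\n\nQuestion: How much RAM does the model need?\nAnswer:",
    max_tokens=256,
)
print(out["choices"][0]["text"])
```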

MLX RAG GGUF

  • Repository: MLX RAG GGUF
  • Summary: This repository contains code and documentation for implementing RAG with GGUF embeddings using the MLX framework. It focuses on integrating various embedding techniques for enhanced retrieval.

FlagOpen Embedding

  • Repository: FlagOpen Embedding
  • Summary: FlagOpen's FlagEmbedding project (home of the BGE models) provides tools and methods for generating embeddings and integrating them into retrieval-augmented generation systems; a minimal usage sketch follows below.
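
A minimal sketch of how the repository's BGE models are typically used in a retriever, assuming pip install FlagEmbedding; the model name and query instruction are the commonly documented defaults rather than anything specific to this list.

```python
# Encode queries and passages with a BGE model from FlagEmbedding, the usual
# building block for the dense-retrieval half of a RAG system.
from FlagEmbedding import FlagModel

model = FlagModel(
    "BAAI/bge-base-en-v1.5",
    query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
)

queries = ["How do I run LLMs on Apple Silicon?"]
passages = ["MLX is an array framework for machine learning research on Apple Silicon."]

q_emb = model.encode_queries(queries)   # the instruction is prepended to each query
p_emb = model.encode(passages)          # passages are encoded as-is
print(q_emb @ p_emb.T)                  # inner-product similarity scores
```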

LM Cocktail

  • Repository: LM Cocktail
  • Summary: This project focuses on combining different embedding techniques and models to create a robust retrieval system for augmented generation tasks. It includes various configurations and examples.

OSX Apple Silicon:

OSX Apple Silicon Gist

  • Repository: OSX Apple Silicon Gist
  • Summary: A gist providing detailed steps and configurations for running MLX models on Apple Silicon devices, including setup instructions and performance tips.
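
Before running any model, it is worth confirming that MLX itself is installed and targeting the Metal GPU. The snippet below is a generic sanity check (assuming pip install mlx), not code taken from the gist.

```python
# Quick sanity check that MLX is installed and using the Metal GPU.
import mlx.core as mx

print(mx.default_device())              # expect Device(gpu, 0) on Apple Silicon
a = mx.random.normal((1024, 1024))
b = mx.random.normal((1024, 1024))
c = a @ b                               # computation is lazy...
mx.eval(c)                              # ...so force it to run on the GPU
print(c.shape, c.dtype)
```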

Hugging Face MLX Models

  • Repository: Hugging Face MLX Models
  • Summary: A collection of models optimized for MLX, available on Hugging Face. This repository provides a variety of models suitable for different ML tasks, optimized for performance and efficiency.
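
Most mlx-community checkpoints can be run directly with the mlx-lm package. The sketch below assumes pip install mlx-lm and uses one 4-bit conversion as an example model id; any other MLX-format repository id should work the same way.

```python
# Download and run an MLX-format model from the mlx-community organization.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")
text = generate(
    model,
    tokenizer,
    prompt="Explain unified memory on Apple Silicon in one sentence.",
    max_tokens=100,
    verbose=True,            # stream tokens and print generation stats
)
```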

MLX Examples

  • Repository: MLX Examples
  • Summary: This repository contains example implementations and use cases for MLX models, including step-by-step guides and code snippets for various ML tasks.

Phi-3 Vision MLX

  • Repository: Phi-3 Vision MLX
  • Summary: A project showcasing the integration of vision models with the MLX framework, providing examples and implementation details for vision-related tasks.

LILM

  • Repository: LILM
  • Summary: This repository focuses on implementing and optimizing language models for low-resource settings, utilizing the MLX framework for enhanced performance on Apple Silicon devices.

ML Stable Diffusion

  • Repository: ML Stable Diffusion
  • Summary: A project dedicated to implementing and optimizing stable diffusion models for Apple Silicon, addressing performance issues and providing solutions for efficient execution.
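
For comparison with the repository's Core ML conversion path, the sketch below shows the other common way to run Stable Diffusion on Apple Silicon: Hugging Face diffusers on PyTorch's mps backend. It is not this repository's code, and the model id is just one example (assumes pip install torch diffusers transformers accelerate).

```python
# Not this repository's Core ML pipeline -- a diffusers-on-MPS sketch for comparison.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe = pipe.to("mps")                   # run on the Apple GPU via PyTorch's Metal backend
pipe.enable_attention_slicing()         # lowers peak memory on 8-16 GB machines

image = pipe("an astronaut riding a horse on Mars", num_inference_steps=30).images[0]
image.save("astronaut.png")
```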

MLX LoRA Fine-tuning

  • Repository: MLX LoRA Fine-tuning
  • Summary: Documentation and code for fine-tuning models with LoRA (low-rank adaptation) inside the MLX framework, with step-by-step instructions and examples.

MLX Discussions

  • Repository: MLX Discussions
  • Summary: A forum for discussing various aspects of the MLX framework, including usage tips, troubleshooting, and community contributions.

NanoGPT MLX

  • Repository: NanoGPT MLX
  • Summary: A minimal implementation of GPT models optimized for the MLX framework, providing code and documentation for quick setup and execution.

LM Evaluation Harness

  • Repository: LM Evaluation Harness
  • Summary: Tools and benchmarks for evaluating language models, including standardized tests and performance metrics for various tasks.
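
The harness can be driven from its lm_eval command line or from Python. Below is a hedged sketch of the Python entry point; the model, task, limit, and device values are illustrative, and mps support depends on the underlying PyTorch build.

```python
# A small, illustrative run of lm-evaluation-harness from Python; swap in your
# own model and tasks. The CLI (`lm_eval ...`) takes the same arguments.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                          # Hugging Face transformers backend
    model_args="pretrained=gpt2",
    tasks=["hellaswag"],
    num_fewshot=0,
    limit=50,                            # only score 50 examples while experimenting
    device="mps",                        # Apple Silicon GPU through PyTorch
)
print(results["results"]["hellaswag"])
```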

PikaGPT

  • Repository: PikaGPT
  • Summary: A lightweight implementation of GPT models optimized for efficiency and performance on Apple Silicon devices, providing examples and setup instructions.

Local LLM Training Apple Silicon

  • Repository: Local LLM Training Apple Silicon
  • Summary: A project focusing on training local LLMs on Apple Silicon devices, providing code and documentation for setup and execution.

MLX Notes

  • Repository: MLX Notes
  • Summary: A collection of notes and guides for using the MLX framework, covering various aspects from setup to advanced usage.

LoRA Gist

  • Repository: LoRA Gist
  • Summary: A gist with detailed instructions for fine-tuning models using LoRA, including code snippets and setup tips.

Llama3 Mac Silicon Example

  • Repository: Llama3 Mac Silicon Example
  • Summary: An example notebook demonstrating the use of Llama3 models on Mac Silicon, including setup, execution, and performance tips.

Deep Dive into AI with MLX

  • Repository: Deep Dive into AI with MLX
  • Summary: A comprehensive guide and tutorial for diving deep into AI using the MLX framework, covering various models and use cases.

Vision-Language Models (VLM):

MLX Llava Finetuning

  • Repository: MLX Llava Finetuning
  • Summary: A repository focused on fine-tuning vision-language models using the MLX framework, providing examples and detailed instructions.

Bunny

  • Repository: Bunny
  • Summary: A project dedicated to developing and optimizing vision-language models, providing code, documentation, and examples for various tasks.

LoRA/Training/Fine-Tuning:

InternLM Ecosystem

  • Repository: InternLM Ecosystem
  • Summary: Documentation and code for integrating and using InternLM models within various ecosystems, focusing on training and fine-tuning.

InternLM Fine-tuning

  • Repository: InternLM Fine-tuning
  • Summary: A repository providing detailed instructions and code for fine-tuning InternLM models, including setup and execution guidelines.

ModelScope Swift

  • Repository: ModelScope Swift
  • Summary: ModelScope's SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) toolkit for training, fine-tuning, and deploying LLMs and multimodal models, with examples and documentation for a range of tasks.

LitGPT

  • Repository: LitGPT
  • Summary: A repository focused on implementing and optimizing GPT models using the Lightning framework, providing examples and detailed instructions for training and fine-tuning.

Datasets:

The Cauldron

  • Repository: The Cauldron
  • Summary: A dataset repository on Hugging Face providing a collection of datasets for training and evaluating machine learning models, including detailed descriptions and usage guidelines.
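
Individual Cauldron subsets can be pulled straight from the Hub with the datasets library; the sketch below assumes pip install datasets and uses the ai2d subset as one example config (the dataset card lists all available configs).

```python
# Pull one subset of The Cauldron from the Hugging Face Hub.
from datasets import load_dataset

ds = load_dataset("HuggingFaceM4/the_cauldron", "ai2d", split="train")
print(ds)                # row count and column names
print(ds[0].keys())      # typically image data plus question/answer style text fields
```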

GUI:

LibreChat

  • Repository: LibreChat
  • Summary: LibreChat is an open-source project for building and managing chat interfaces for language models, providing customizable features and integration options for various platforms.

Docker etc

```sh
brew install llama.cpp
llama-cli --hf-repo reach-vb/Meta-Llama-3.1-8B-Instruct-Q6_K-GGUF \
  --hf-file meta-llama-3.1-8b-instruct-q6_k.gguf \
  -p "Sup?" --ctx-size 8192
```

The model uses roughly 7.0 GB of RAM.

Unsorted