skaiphd
Principal Data Scientist | Machine Learning | Deep Learning | NLP | Time Series | LLM | GenAI
USA
Pinned Repositories
-DS101X-Statistical-Thinking-for-Data-Science-and-Analytics
Columbia University edX Course DS101X Statistical Thinking for Data Science and Analytics
-Python
1d-tokenizer
This repo contains the code for our paper An Image is Worth 32 Tokens for Reconstruction and Generation
DeepLearning_Projects
GenerativeAI_LLM_Usecase
Machine-Learning-with-Python
Practice and tutorial-style notebooks covering a wide variety of machine learning techniques
MachineLearning_Projects
Natural_Language_Processing_Projects
Research_Papers_Analysis
TimeSeries_Projects
skaiphd's Repositories
skaiphd/bm42_eval
Evaluation of bm42 sparse indexing algorithm
skaiphd/Crowdstrike_BSOD_Fixer
skaiphd/data-migration-desktop-tool
skaiphd/DPO-ST
[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
skaiphd/generative-ai-for-beginners11
12 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
skaiphd/kan-polar
Kolmogorov-Arnold Networks in MATLAB
skaiphd/LightRAG
The "PyTorch" library for LLM applications.
skaiphd/llama-models
Utilities intended for use with Llama models.
skaiphd/LLM-Finetuning
LLM Finetuning with peft
skaiphd/LLMs-in-Finance
LLMs in Finance - Generative AI - AI Agents
skaiphd/mlflow
Open source platform for the machine learning lifecycle
skaiphd/pandas-ai
Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
skaiphd/playbook
The Digital Services Playbook
skaiphd/promptgarage
A workbench application to test out different prompts on a variety of AI models to see how they perform
skaiphd/txtai
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
skaiphd/vertex-ai-samples
Notebooks, code samples, sample apps, and other resources that demonstrate how to use, develop and manage machine learning and generative AI workflows using Google Cloud Vertex AI.
skaiphd/awesome-open-data-annotation
Open Source Data Annotation & Labeling Tools
skaiphd/battle-of-the-semantics
GraphRag vs Embeddings
skaiphd/claude-engineer
Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This tool combines the capabilities of a large language model with practical file system operations and web search functionality.
skaiphd/CoreNLP
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
skaiphd/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
skaiphd/denser-retriever
An enterprise-grade AI retriever designed to streamline AI integration into your applications, ensuring cutting-edge accuracy.
skaiphd/environeer
Control Hugging Face Spaces, datasets, and models as directed acyclic graphs
skaiphd/llama2.c
Inference Llama 2 in one file of pure C
skaiphd/llama31_synthetic_data
Synthetic data generator using Llama 3.1
skaiphd/local_llama
This repo is to showcase how you can run a model locally and offline, free of OpenAI dependencies.
skaiphd/python-bootcamp-duke
Official syllabus for AIPI 503 - Python Bootcamp for Data and AI
skaiphd/python-training
Python training for business analysts and traders
skaiphd/SpinQuant
Code repo for the paper "SpinQuant: LLM quantization with learned rotations"
skaiphd/surfkit
A toolkit for building AI agents that use devices