Personal Artificial Intelligence Resources List

Pre-Trained Models

  • audio-pretrained-model - A collection of Audio and Speech pre-trained models.
  • awesome-deeplearning - Pre-trained models from the awesome-deeplearning repository.
  • camelot - A Python library to extract tabular data from PDFs.
  • coreml-models - Largest list of models for Core ML (for iOS 11+).
  • cv-pretrained-model - A collection of computer vision pre-trained models.
  • efficientnet-pytorch - A PyTorch implementation of EfficientNet and EfficientNetV2.
  • huggingface - Browse the model hub to discover, experiment with, and contribute to new state-of-the-art models.
  • layout-parser - A unified toolkit for Deep Learning Based Document Image Analysis.
  • mmf - A modular framework for vision & language multimodal research from Facebook AI Research (FAIR).
  • modelzoo - Models and code that perform audio processing, speech synthesis, and other audio related tasks.
  • nlp-pretrained-model - A collection of Natural language processing pre-trained models.
  • nlp-recipes - Natural Language Processing Best Practices & Examples.
  • openvino - Pre-trained Deep Learning models and demos (high quality and extremely fast).
  • PaddlePaddle - Awesome pre-trained models toolkit based on PaddlePaddle.
  • PaddleOCR - Awesome multilingual OCR toolkits based on PaddlePaddle.
  • pyannote-audio - Neural building blocks for speaker and speech detection.
  • pytorch-image-models - PyTorch image models, scripts, and pretrained weights.
  • stylegan - A collection of pre-trained StyleGAN models to download.
  • tabula - Tabula is a tool for liberating data tables trapped inside PDF files.
  • tfhub - Search and discover hundreds of trained, ready-to-deploy machine learning models.
  • unilm - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities.

Deep Learning

  • amazon-dsstne - Deep Scalable Sparse Tensor Network Engine.
  • caffe - A fast open framework for deep learning.
  • chainer - A flexible framework of neural networks for deep learning.
  • cntk - An open source deep-learning toolkit.
  • deepdetect - It makes state of the art machine learning easy to work with and integrate into existing applications.
  • deeplearning4j - Open-source, distributed, scientific computing for the JVM.
  • fastai - The fast.ai deep learning library, lessons, and tutorials.
  • gym - A toolkit for developing and comparing reinforcement learning algorithms.
  • keras - Deep Learning for humans.
  • mxnet - A flexible and efficient library for deep learning.
  • neon - Intel® Nervana™ reference deep learning framework.
  • neupy - NeuPy is a Python library for Artificial Neural Networks and Deep Learning.
  • neural-enhance - Super Resolution for images using deep learning.
  • Paddle - PArallel Distributed Deep LEarning.
  • singa - Distributed deep learning system.
  • sonnet - TensorFlow-based neural network library.
  • swflow - Simplified interface for TensorFlow for Deep Learning.
  • tensorflow - Computation using data flow graphs for scalable machine learning.
  • tensorpack - A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility.
  • tflearn - Deep learning library featuring a higher-level API for TensorFlow.

General Purpose Machine Learning

  • aerosolve - A machine learning package built for humans.
  • AmpliGraph - Python library for Representation Learning on Knowledge Graphs.
  • catboost - An open-source gradient boosting library with categorical features support.
  • dmtk - Microsoft Distributed Machine Learning Toolkit.
  • fastFM - fastFM: A Library for Factorization Machines.
  • fklearn - Functional Machine Learning.
  • h2o - Open Source Fast Scalable Machine Learning Platform For Smarter Applications.
  • imbalanced-learn - A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning.
  • imodels - Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling.
  • JSAT - Java Statistical Analysis Tool, a Java library for Machine Learning.
  • libffm - A Library for Field-aware Factorization Machines.
  • libfm - Library for factorization machines.
  • LightGBM - A fast, distributed, high-performance gradient boosting framework based on decision tree algorithms.
  • madlib - It is an open-source library for scalable in-database analytics.
  • metric-learn - Metric learning algorithms in Python.
  • mlens - ML-Ensemble – high performance ensemble learning.
  • mllib - MLlib is Apache Spark's scalable machine learning library.
  • moa - It is an open source framework for Big Data stream mining.
  • orange3 - Interactive data analysis.
  • pycobra - Python library implementing ensemble methods and visualisation tools including Voronoi tessellations.
  • pyod - A Python Toolbox for Scalable Outlier Detection (Anomaly Detection).
  • rep - Machine Learning toolbox for Humans.
  • river - Online machine learning in Python.
  • scikit-learn - Machine learning in Python.
  • shogun - Unified and efficient Machine Learning since 1999.
  • weka - It is a collection of machine learning algorithms for data mining tasks.
  • xgboost - Scalable, Portable and Distributed Gradient Boosting Library.

Natural Language Processing

  • allennlp - An open-source NLP research library, built on PyTorch.
  • anago - A Python library for sequence labeling implemented in Keras.
  • CoreNLP - Stanford CoreNLP: A Java suite of core NLP tools.
  • dimsum16 - Detecting Minimal Semantic Units and their Meanings (DiMSUM).
  • finetune - Scikit-learn style model finetuning for NLP.
  • flair - A very simple framework for state-of-the-art NLP.
  • flashtext - Extract keywords from sentences or replace keywords in sentences.
  • fuzzywuzzy - Fuzzy String Matching in Python.
  • gensim - Topic Modelling for Humans.
  • gluon - A toolkit that enables easy text preprocessing to help you speed up your NLP research.
  • Kashgari - NLP Transfer learning framework for text-labeling and text-classification.
  • magnitude - A fast, efficient universal vector embedding utility package.
  • mallet - It is a Java-based package for machine learning applications to text.
  • nltk - Natural Language Toolkit.
  • pattern - Web mining module for Python, with tools for scraping, NLP, ML, network analysis and viz.
  • polyglot - Multilingual text (NLP) processing toolkit.
  • rasa - Open source machine learning framework to automate text- and voice-based conversations.
  • senpy - A sentiment and emotion analysis server in Python.
  • snips-nlu - Snips Python library to extract meaning from text.
  • spaCy - Industrial-strength Natural Language Processing (NLP) in Python.
  • textacy - A Python library for performing a variety of NLP tasks.
  • TextBlob - Simple, Pythonic, text processing.
  • textgenrnn - Easily train your own text-generating neural network on any text dataset.
  • word2vec - Python interface to Google word2vec.

Time Series Forecasting

  • auto-ts - Automatically build models on time series datasets with a single line of code.
  • darts - A Python library for easy manipulation and forecasting of time series.
  • pmdarima - Time series analysis (including auto arima) for Python.
  • prophet - A procedure for forecasting time series data based on an additive model.
  • pyflux - Open source time series library for Python.
  • pysts - A Python package for time series classification.
  • scikit-hts - Hierarchical time series forecasting for humans.
  • sktime-dl - A sktime companion package for deep learning based on TensorFlow.
  • sktime - A unified framework for machine learning with time series.
  • statsmodels.tsa - Time Series analysis from statsmodels package.
  • traces - A Python library for unevenly-spaced time series analysis.
  • tsai - Time series deep learning with PyTorch and fastai.
  • tsfresh - Automatic extraction of relevant features from time series.

Causal Inference

  • causallib - Modular causal inference analysis and model evaluations.
  • causalml - Uplift modeling and causal inference with machine learning algorithms.
  • causalnex - Helps data scientists to infer causation rather than observing correlation.
  • dowhy - A Python library for causal inference that supports explicit modeling and testing of causal assumptions.
  • EconML - Automated Learning and Intelligence for Causation and Economics.

Statistical and Probabilistic Modelling

  • BayesianOptimization - A Python implementation of global optimization with Gaussian processes.
  • edward - A probabilistic programming language in TensorFlow.
  • hmmlearn - Hidden Markov Models in Python, with scikit-learn like API.
  • lifelines - Survival analysis in Python.
  • lifetimes - Lifetime value in Python.
  • lightweight_mmm - Easy to use Bayesian Marketing Mix Modeling (MMM).
  • mord - Ordinal regression algorithms.
  • pomegranate - Fast, flexible and easy to use probabilistic modelling in Python.
  • pyglmnet - Python implementation of elastic-net regularized generalized linear models.
  • pymc3 - Probabilistic Programming in Python.
  • python-mle - A Python package for performing Maximum Likelihood Estimates.
  • RoBo - A Robust Bayesian Optimization framework.
  • statsmodels - Statistical modeling and econometrics in Python.
  • tea-lang - DSL for experimental design and statistical analysis.
  • pingouin - Statistical package in Python based on Pandas.

Auto Machine Learning

  • adanet - AdaNet is a lightweight TensorFlow-based framework for AutoML.
  • AlphaPy - Automated Machine Learning AutoML for Python.
  • auto-sklearn - Automated Machine Learning with scikit-learn.
  • auto_ml - Automated machine learning for analytics & production.
  • autogluon - AutoML for Text, Image, and Tabular Data.
  • autokeras - Accessible AutoML for deep learning.
  • automl-gs - AutoML tool that offers a zero code/model definition interface for getting an optimized model.
  • diaml - Semi-automated machine learning pipelines.
  • FLAML - A fast and lightweight AutoML library.
  • ludwig - Ludwig is a toolbox for training deep learning models without coding.
  • MLBox - It is a powerful Automated Machine Learning Python library.
  • onepanel-automl - Onepanel AutoML.
  • optuna - A hyperparameter optimization framework.
  • pycaret - An open-source, low-code machine learning library in Python.
  • SMAC3 - Sequential Model-based Algorithm Configuration.
  • TPOT - Tree-Based Pipeline Optimization Tool.
  • TransmogrifAI - Automated machine learning for structured data.
  • xcessiv - A web-based application for automated hyperparameter tuning and stacked ensembling in Python.

Feature Engineering

  • categorical-encoding - A library of sklearn compatible categorical variable encoders.
  • datacleaner - A Python tool that automatically cleans data sets and readies them for analysis.
  • feature-selector - Feature selector is a tool for dimensionality reduction of machine learning datasets.
  • featuretools - Automated feature engineering.
  • gokinjo - A feature extraction library based on k-nearest neighbor algorithm in Python.
  • hypertools - A Python toolbox for gaining geometric insights into high-dimensional data.
  • umap - A dimension reduction technique that can be used for visualisation.

Model Management

  • BentoML - Model serving made easy.
  • cog - Containers for machine learning.
  • cookiecutter-ds - Logical and flexible project structure for doing and sharing data science work.
  • ds-process-management - Resources for Data Science Process management.
  • dvc - Data & models versioning for ML projects, make them shareable and reproducible.
  • firefly - Function as a service.
  • hopsworks - Full-stack platform for scale-out data science.
  • kedro - A Python library for building robust production-ready data and analytics pipelines.
  • lore - A Python framework to make machine learning approachable.
  • marvin - The toolbox helps data scientists to develop, test, and run marvin engines.
  • metaflow - Build and manage real-life data science projects with ease.
  • mlflow - Open source platform for the machine learning lifecycle.

Diagnostic, Inspection or Interpretation

  • anchor - High-Precision Model-Agnostic Explanations.
  • ann-visualizer - A Python library for visualizing Artificial Neural Networks with Keras.
  • awesome-interpretable-machine-learning - Opinionated list of resources facilitating model interpretability.
  • eli5 - A library for debugging/inspecting machine learning classifiers and explaining their predictions.
  • explainerdashboard - Quickly build Explainable AI dashboards.
  • interpret - Fit interpretable models. Explain blackbox machine learning.
  • lime - Explaining the predictions of any machine learning classifier.
  • lucid - A collection of infrastructure and tools for research in neural network interpretability.
  • PDPbox - Python partial dependence plot toolbox.
  • SHAP - A unified approach to explain the output of any machine learning model.
  • what-if-tool - Easy-to-use interface for expanding understanding of a black-box classification/regression model.
  • yellowbrick - Visual analysis and diagnostic tools to facilitate machine learning model selection.

Data Visualization

  • altair - Declarative statistical visualization library for Python.
  • animatplot - A Python package for animating plots built on matplotlib.
  • bokeh - Interactive Web Plotting for Python.
  • chartify - Python library that makes it easy for data scientists to create charts.
  • dash - Interactive, Reactive Web Apps for Python.
  • folium - Python Data to Leaflet.js Maps.
  • ft-visual-vocabulary - The core of a newsroom-wide training session aimed at improving chart literacy.
  • holoviews - Stop plotting your data - annotate your data and let it visualize itself.
  • ipyvolume - 3d plotting for Python in the Jupyter notebook based on IPython widgets using WebGL.
  • matplotlib - Plotting with Python.
  • plotnine - A grammar of graphics for Python.
  • scattertext - Beautiful visualizations of how language differs among document types.
  • scikit-plot - An intuitive library to add plotting functionality to scikit-learn objects.
  • seaborn - Statistical data visualization.
  • speedml - Speedml is a Python package to speed start machine learning projects.
  • streamlit - The fastest way to build custom ML tools.
  • vega - A visualization grammar.
  • veles - Binary data analysis and visualization tool.
  • vispy - Interactive scientific visualization that is designed to be fast, scalable, and easy to use.
  • wordcloud - A little word cloud generator in Python.

Auto Data Visualization

  • AutoViz - Automatically visualize any dataset, any size with a single line of code.
  • dataprep - The easiest way to prepare data in Python.
  • dtale - Visualizer for pandas data structures.
  • PandasGUI - A GUI for Pandas DataFrames.
  • pandas-profiling - Create HTML profiling reports from pandas DataFrame objects.
  • sweetviz - Visualize and compare datasets, target values and associations, with one line of code.

DataFrame Libraries

  • cuDF - GPU DataFrame Library.
  • dask - Parallel computing with task scheduling.
  • datatables - A Python package for manipulating 2-dimensional tabular data structures.
  • modin - Speed up your Pandas workflows by changing a single line of code.
  • pandas - Fast, powerful, flexible and easy to use open source data analysis and manipulation tool.
  • pandas_flavor - The easy way to write your own flavor of Pandas.
  • sklearn-pandas - Pandas integration with sklearn.
  • terality - Serverless data processing engine.
  • vaex - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python.

Misc

  • deap - Distributed Evolutionary Algorithms in Python.
  • feather - Fast, interoperable binary data frame storage for Python and R.
  • gplearn - Genetic Programming in Python.
  • PyGAD - Python 3 library for building the genetic algorithm and training machine learning algorithms.
  • gtdata - Download and play with key datasets from Google Trends.
  • librosa - Python library for audio and music analysis.
  • m2cgen - Transform ML models into native code with zero dependencies.
  • mahout - It is a distributed linear algebra framework and mathematically expressive Scala DSL.
  • mlxtend - A library of extension and helper modules for Python's data analysis and machine learning libraries.
  • pythia - A modular framework for Visual Question Answering research from Facebook AI Research (FAIR).
  • snorkel - A system for quickly generating training data with weak supervision.

Tutorials and Examples

Lists