data-mining
There are 5,813 repositories under the data-mining topic.
eriklindernoren/ML-From-Scratch
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
JaidedAI/EasyOCR
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
academic/awesome-datascience
📝 An awesome Data Science repository to learn from and apply to real-world problems.
EthicalML/awesome-production-machine-learning
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
microsoft/LightGBM
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
piskvorky/gensim
Topic Modelling for Humans
rasbt/python-machine-learning-book
The "Python Machine Learning (1st edition)" book code repository and info resource
tangyudi/Ai-Learn
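An AI learning roadmap that collects nearly 200 hands-on case studies and projects, with free accompanying course materials. Suitable for complete beginners and geared toward job-ready practice. Covers popular areas including Python, mathematics, machine learning, data analysis, data mining, data science, deep learning, computer vision (CV), natural language processing (NLP), algorithms, PyTorch, TensorFlow, TensorFlow 2, Caffe, Keras, NumPy, pandas, Matplotlib, and seaborn.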
yzhao062/pyod
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
sktime/sktime
A unified framework for machine learning with time series
yzhao062/anomaly-detection-resources
Anomaly detection related books, papers, videos, and toolboxes
catboost/catboost
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
jivoi/awesome-ml-for-cybersecurity
Machine Learning for Cyber Security
microsoft/RD-Agent
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are committed to automating these high-value generic R&D processes through R&D-Agent, which lets AI drive data-driven AI. 🔗https://aka.ms/RD-Agent-Tech-Report
faridrashidi/kaggle-solutions
🏅 Collection of Kaggle Solutions and Ideas 🏅
MontFerret/ferret
Declarative web scraping
biolab/orange3
🍊 📊 💡 Orange: Interactive data analysis
rasbt/mlxtend
A library of extension and helper modules for Python's data analysis and machine learning libraries.
r0f1/datascience
Curated list of Python resources for data science.
deanmalmgren/textract
extract text from any document. no muss. no fuss.
alibaba/Alink
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
rob-med/awesome-TS-anomaly-detection
List of tools & datasets for anomaly detection on time-series data.
Kanaries/graphic-walker
An open-source alternative to Tableau. Embeddable visual analytics
automeris-io/WebPlotDigitizer
Computer vision assisted tool to extract numerical data from plot images.
tirthajyoti/Papers-Literature-ML-DL-RL-AI
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
dblalock/bolt
10x faster matrix and vector operations
WZBSocialScienceCenter/pdftabextract
A set of tools for extracting tables from PDF files, helping with data mining on (OCR-processed) scanned documents.
invoice-x/invoice2data
Extract structured data from PDF invoices
youngfish42/Awesome-FL
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)
WenjieDu/PyPOTS
A Python toolkit/library for reality-centric machine/deep learning and data mining on partially-observed time series, including SOTA neural network models for scientific analysis tasks of imputation/classification/clustering/forecasting/anomaly detection/cleaning on incomplete industrial (irregularly-sampled) multivariate TS with NaN missing values
PaddlePaddle/Research
Novel deep learning research work with PaddlePaddle
benedekrozemberczki/awesome-fraud-detection-papers
A curated list of data mining papers about fraud detection.
404notf0und/AI-for-Security-Learning
Security scenarios, AI-based security algorithms, and industry practice in security data analysis
safe-graph/graph-fraud-detection-papers
A curated list of graph-based fraud, anomaly, and outlier detection papers & resources
Yimeng-Zhang/feature-engineering-and-feature-selection
A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.
zslucky/awesome-AI-books
Some awesome AI-related books and PDFs for learning and downloading, plus some playground models for learning