data-analysis
There are 34,879 repositories under the data-analysis topic.
apache/superset
Apache Superset is a Data Visualization and Data Exploration Platform
scikit-learn/scikit-learn
scikit-learn: machine learning in Python
pandas-dev/pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
metabase/metabase
The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊
streamlit/streamlit
Streamlit — A faster way to build and share data apps.
gradio-app/gradio
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
gchq/CyberChef
The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis
microsoft/Data-Science-For-Beginners
10 Weeks, 20 Lessons, Data Science for All!
AMAI-GmbH/AI-Expert-Roadmap
Roadmap to becoming an Artificial Intelligence Expert in 2022
sinaptik-ai/pandas-ai
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
lukasmasuch/best-of-ml-python
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
dataease/dataease
🔥 An open-source BI tool that anyone can use, and a powerful data visualization tool. An open-source BI tool alternative to Tableau.
allinurl/goaccess
GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
airbytehq/airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Kanaries/pygwalker
PyGWalker: Turn your dataframe into an interactive UI for visual analysis
akfamily/akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
ydataai/ydata-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
tangyudi/Ai-Learn
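AI learning roadmap: nearly 200 hands-on cases and projects, with free accompanying course materials, taking complete beginners through to job-ready practice! Covers popular areas including Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch tensorflow machine-learning, deep-learning data-analysis data-mining mathematics data-science artificial-intelligence python tensorflow tensorflow2 caffe keras pytorch algorithm numpy pandas matplotlib seaborn nlp cv, and more.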
guipsamora/pandas_exercises
Practice your pandas skills!
OpenRefine/OpenRefine
OpenRefine is a free, open source power tool for working with messy data and improving it
statsmodels/statsmodels
Statsmodels: statistical modeling and econometrics in Python
Yorko/mlcourse.ai
Open Machine Learning Course
yzhao062/pyod
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
rapidsai/cudf
cuDF - GPU DataFrame Library
gonum/gonum
Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and more
jeecgboot/jimureport
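"Data visualization: reports, large-screen displays, data dashboards." JimuReport is a reporting tool and data visualization product with an Excel-like operating style and online drag-and-drop design. Features include report design, large-screen design, print design, chart reports, dashboard portal design, and more, completely free! Guided by the product philosophy of "simple, easy to use, professional", it greatly reduces the difficulty of report development, shortens development cycles, and solves all kinds of reporting problems.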
Alluxio/alluxio
Alluxio, data orchestration for analytics and machine learning in the cloud
scikit-learn-contrib/imbalanced-learn
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
growthbook/growthbook
Open Source Feature Flagging and A/B Testing Platform
flyteorg/flyte
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
rhiever/Data-Analysis-and-Machine-Learning-Projects
Repository of teaching materials, code, and data for my data analysis and machine learning projects.
qinwf/awesome-R
A curated list of awesome R packages, frameworks and software.
pachyderm/pachyderm
Data-Centric Pipelines and Data Versioning
cloudquery/cloudquery
The open source ELT framework powered by Apache Arrow
microsoft/TaskWeaver
A code-first agent framework for seamlessly planning and executing data analytics tasks.
airbnb/knowledge-repo
A next-generation curated knowledge sharing platform for data scientists and other technical professions.