data-analysis

There are 37,081 repositories under the data-analysis topic.

  • apache/superset

    Apache Superset is a Data Visualization and Data Exploration Platform

    Language: TypeScript (68.9k stars)
  • scikit-learn/scikit-learn

    scikit-learn: machine learning in Python

    Language: Python (64k stars)
  • pandas-dev/pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

    Language: Python (47.1k stars)
  • metabase/metabase

    The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊

    Language: Clojure (44.5k stars)
  • streamlit/streamlit

    Streamlit — A faster way to build and share data apps.

    Language: Python (42.1k stars)
  • gradio-app/gradio

    Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

    Language: Python (40.5k stars)
  • gchq/CyberChef

    The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis

    Language: JavaScript (33.1k stars)
  • microsoft/Data-Science-For-Beginners

    10 Weeks, 20 Lessons, Data Science for All!

    Language: Jupyter Notebook (31.3k stars)
  • AMAI-GmbH/AI-Expert-Roadmap

    Roadmap to becoming an Artificial Intelligence Expert in 2022

    Language: JavaScript (30.5k stars)
  • 666ghj/BettaFish

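    Weiyu (微舆): a multi-agent public-opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the full picture of public opinion, predicts future trends, and supports decision-making. Built from scratch, with no dependency on any framework.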

    Language: Python (23.3k stars)
  • lukasmasuch/best-of-ml-python

    🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

  • sinaptik-ai/pandas-ai

    Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

    Language: Python (22.5k stars)
  • dataease/dataease

    🔥 An open-source BI tool that anyone can use and a powerful data visualization tool; an open-source alternative to Tableau.

    Language: Java (22.3k stars)
  • airbytehq/airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

    Language: Python (20k stars)
  • allinurl/goaccess

    GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through your browser.

    Language: C (19.9k stars)
  • Kanaries/pygwalker

    PyGWalker: Turn your dataframe into an interactive UI for visual analysis

    Language: Python (15.4k stars)
  • akfamily/akshare

    AKShare is an elegant and simple open-source financial data interface library for Python, built for human beings!

    Language: Python (14.3k stars)
  • ydataai/ydata-profiling

    1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

    Language: Python (13.2k stars)
  • tangyudi/Ai-Learn

    An artificial intelligence learning roadmap that collects nearly 200 hands-on cases and projects, with free accompanying course materials. Suitable for complete beginners and oriented toward job-ready practice! Covers popular areas including Python, mathematics, machine learning, data analysis, data mining, data science, deep learning, computer vision, natural language processing, algorithms, PyTorch, TensorFlow, Caffe, Keras, NumPy, pandas, Matplotlib, and seaborn.

  • guipsamora/pandas_exercises

    Practice your pandas skills!

    Language: Jupyter Notebook (11.8k stars)
  • OpenRefine/OpenRefine

    OpenRefine is a free, open source power tool for working with messy data and improving it

    Language: Java (11.6k stars)
  • statsmodels/statsmodels

    Statsmodels: statistical modeling and econometrics in Python

    Language: Python (11.1k stars)
  • Yorko/mlcourse.ai

    Open Machine Learning Course

    Language: Python (10.3k stars)
  • yzhao062/pyod

    A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques

    Language: Python (9.6k stars)
  • rapidsai/cudf

    cuDF - GPU DataFrame Library

    Language: C++ (9.3k stars)
  • gonum/gonum

    Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and more

    Language: Go (8.2k stars)
  • jeecgboot/jimureport

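    Data visualization: reports, large-screen displays, and data dashboards. JimuReport (积木报表) is a reporting tool and data visualization product with an Excel-like interface and online drag-and-drop design. Its features cover report design, large-screen design, print design, chart reports, dashboard portal design, and more, and it is completely free! Guided by a product philosophy of "simple, easy to use, professional", it greatly reduces the difficulty of report development, shortens development cycles, and solves all kinds of reporting problems.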

    Language: Java (7.7k stars)
  • Alluxio/alluxio

    Alluxio, data orchestration for analytics and machine learning in the cloud

    Language: Java (7.1k stars)
  • growthbook/growthbook

    Open Source Feature Flagging and A/B Testing Platform

    Language: TypeScript (7.1k stars)
  • scikit-learn-contrib/imbalanced-learn

    A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

    Language: Python (7.1k stars)
  • flyteorg/flyte

    Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

    Language: Go (6.6k stars)
  • rhiever/Data-Analysis-and-Machine-Learning-Projects

    Repository of teaching materials, code, and data for my data analysis and machine learning projects.

    Language: Jupyter Notebook (6.5k stars)
  • qinwf/awesome-R

    A curated list of awesome R packages, frameworks and software.

    Language: R (6.3k stars)
  • pachyderm/pachyderm

    Data-Centric Pipelines and Data Versioning

    Language: Go (6.3k stars)
  • cloudquery/cloudquery

    Data pipelines for cloud config and security data. Build cloud asset inventory, CSPM, FinOps, and vulnerability management solutions. Extract from AWS, Azure, GCP, and 70+ cloud and SaaS sources.

    Language: Go (6.2k stars)
  • microsoft/TaskWeaver

    A code-first agent framework for seamlessly planning and executing data analytics tasks.

    Language: Python (6k stars)