big-data-analytics
There are 643 repositories under the big-data-analytics topic.
ydataai/ydata-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
ICT-BDA/EasyML
Easy Machine Learning is a general-purpose dataflow-based system for easing the process of applying machine learning algorithms to real world tasks.
dongsuo/vue-data-board
A Data Analysis Board in Vue.
mahmoudparsian/pyspark-tutorial
PySpark-Tutorial provides basic algorithms using PySpark
v6d-io/v6d
vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)
MrXujiang/v6.dooring.public
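A large-screen data visualization solution that provides a visual editing engine, helping individuals and businesses easily build their own customized large-screen dashboard applications.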
caioricciuti/ch-ui
CH-UI is a modern, feature-rich user interface for working with data in self-hosted ClickHouse databases. It offers an intuitive platform for running queries and visualizing metrics about your instance.
metatron-app/metatron-discovery
Powerful & Easy way for big data discovery
lithops-cloud/lithops
A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
rouyang2017/SISSO
A data-driven method combining symbolic regression and compressed sensing for accurate & interpretable models.
Ashish7129/Graph_Sampling
Graph Sampling is a Python package containing various approaches that sample the original graph according to different sample sizes.
archivesunleashed/aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Thomas-George-T/Movies-Analytics-in-Spark-and-Scala
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
FTiniNadhirah/Coursera-and-EdX-courses-answers
This repository covers courses taken on Coursera. All the answers given were written by myself.
RajdeepBiswas/Manufacturing-Quality-Inspection
I have built the computer vision models in 3 different ways addressing different personas, because not all companies will have a resolute data science team. Tags: quality-control, manufacturing, big-data-analytics, jupyter-notebook, cognitive services, industry solutions.
RajdeepBiswas/AI_Enabled_Image_Bucketization
Bucketize an image based on exhaust data and AI-generated data. Tags: industry-solutions, azure, azure machine learning services, computer-vision, big data, big data analytics, machine learning, image recognition, manufacturing, quality control, cognitive services.
panstacks/pandata
The Pandata scalable open-source analysis stack
drshahizan/BDM
Course covers big data fundamentals, processes, technologies, platform ecosystem, and management for practical application development.
maniram-yadav/Big_DataHadoop_Projects
Big data projects implemented by Maniram Yadav
trieu/leo-cdp-free-edition
The binary build of LEO CDP Free Edition for training purposes
u2i/egis
Egis - a handy Ruby interface for AWS Athena
ingef/conquery
Visual, interactive queries against big databases
tatsuiman/rpot2
Real-time Packet Observation Tool
GMAP/DSPBench
A suite of benchmark applications for distributed data stream processing systems
jackkolokasis/teraheap
TeraHeap: Reducing Memory Pressure in Managed Big Data Frameworks
Wittline/pyspark-on-aws-emr
The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on writing PySpark code.
arakat-community/arakat
ARAKAT - Big Data Analysis and Business Intelligence Application Development Platform
airflow-plugins/pandora-plugin
Plugin offering views, operators, sensors, and more developed at Pandora Media.
eskimo-sh/eskimo
Eskimo is a state-of-the-art Big Data Infrastructure and Management Web Console to build, manage and operate Big Data 2.0 Analytics clusters on Kubernetes. This is the git repository of Eskimo Community Edition.
suzumura/graph500
World championship code for Graph500
scalytics/SDE
Scalytics Connect development environment, pre-build
Dammonoit/Student-performance-analysis-using-Big-data
This project analyses student performance and correlates it with different attributes. Finally, it determines the most suitable algorithm from a set of candidates.
OwenOrcan/YiraBot-Crawler
YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.
jaanli/american-community-survey
American Community Survey data on people and households
XuanyouLiu/US-Real-Estate-Analysis
US Real Estate Rental Price Analysis