data-simulation

There are 51 repositories under the data-simulation topic.

  • leoliuf/MRiLab

    A Numerical Magnetic Resonance Imaging (MRI) Simulation Platform

    Language: MATLAB
  • qMRLab/qMRLab

    Quantitative MRI Made Easy with qMRLab: MRI software for data simulation, analysis, and visualization

    Language: MATLAB
  • fortiql/data-forge

    Data Forge — a modern data stack playground to practice flows and best practices, not just tools. Spark, Trino, Kafka, Iceberg, ClickHouse, Airflow, MinIO, Superset — all wired together locally with Docker Compose.

    Language: Jupyter Notebook
  • IBA-Group-IT/IoT-data-simulator

    Generic IoT data simulator. Can replay datasets or generate data on the fly. Supports various IoT platforms out of the box.

    Language: JavaScript
  • kgoldfeld/simstudy

    simstudy: Illuminating research methods through data generation

    Language: R
  • bigdata-ustc/Agent4Edu

    Agent4Edu: Generating Learner Response Data by LLM-based Agents for Intelligent Education Systems (AAAI 2025)

    Language: Python
  • BCG-X-Official/pytools

    Foundational tools for BCG X's data science packages.

    Language: Python
  • peirong26/Brain-ID

    [ECCV 2024] Brain-ID: Learning Contrast-agnostic Anatomical Representations for Brain Imaging

    Language: Python
  • rapiddweller/datamimic

    🧠 Model-driven and AI-driven test data orchestration platform that enables developers to create realistic, scalable, and privacy-compliant test data. Features model-driven data generation, GDPR compliance, and seamless Python integration.

    Language: Python
  • anjalisilva/MPLNClust

    R Package With Shiny App to Perform and Visualize Clustering of Count Data via Mixtures of Multivariate Poisson-log Normal Model

    Language: R
  • anjalisilva/mixMVPLN

    R Package to Perform Clustering of Three-way Count Data Using Mixtures of Matrix Variate Poisson-log Normal Model With Parameter Estimation via MCMC-EM, Variational Gaussian Approximations, or a Hybrid Approach Combining Both.

    Language: R
  • BCCDC-PHL/simulate-short-reads

    Generate simulated Illumina sequence reads from a reference sequence

    Language: Nextflow
  • volkanbicer/coinsentiment

    Cryptocurrency Reddit sentiment analysis application.

    Language: Python
  • BCCDC-PHL/simulate-long-reads

    Generate simulated Oxford Nanopore reads from a reference sequence

    Language: Nextflow
  • sukrutrao/crowdsourced-data-simulator

    A program that simulates answers given by a crowd to multiple-choice questions with either one or several correct answers, and writes them to a CSV file

    Language: Python
  • greenelab/ponyo

    Software to simulate compendium-wide gene expression data using a VAE.

    Language: Python
  • ronhandels/synthetic-correlated-data

    Generate synthetic longitudinal correlated data using distributions and correlations from real-world observational data

    Language: R
  • aschetti/R-data-sim

    Simulate data for simple experimental designs using the R package faux (https://github.com/debruine/faux/).

    Language: HTML
  • CS-LEE2022/Analyze_A-B_Test_Results

    Applying an A/B test to help determine whether the company should launch the new page

    Language: HTML
  • dfornika/simulate-genomes

    Simulate genomic variation, based on input reference genomes

    Language: Nextflow
  • greenfishbluefish/sims_loader

    A CLI for loading simulated data into any database using DBIx::Class plugins Schema::Loader and Sims

    Language: Perl
  • Jiian/seoulbike

    Optimising bike supply resources in Seoul's bike sharing system. (Analyses and Recommendations)

    Language: Jupyter Notebook
  • kibetbrian74/Simulating-and-Visualizing-Bird-Movements-Between-Randomly-Generated-Stations-Using-Geospatial-Data

    The code simulates the movement of birds between various stations, generating data points for each visit, including the station visited, exact (slightly perturbed) coordinates, and a timestamp. This data is stored in a GeoDataFrame that can be used for spatial analysis and visualization.

    Language: Jupyter Notebook
  • Lightbridge-KS/simWaves

    R 📦 for simulating waveform data

    Language: R
  • Ravikiran27/GenAidataset

    A Python-based tool for generating customizable synthetic datasets tailored for Generative AI applications. Built with simplicity and flexibility in mind, this project helps researchers and developers simulate realistic data for training, testing, and experimentation.

    Language: HTML
  • AdeLouis/JDST2023-Time-Series-Data-Augmented-Simulation

    Code for the JDST 2023 paper "Simulating Realistic Continuous Glucose Monitor Time Series By Data Augmentation" by L. Gomez, A. Toye, R. Hum, and S. Kleinberg

    Language: Python
  • Catalina2820/Arquitectura-Bigdata

    This repository contains projects and exercises I completed during my "Big Data Architecture" course. It reflects the concepts I’ve learned about data processing using Apache Spark and PySpark.

    Language: Python
  • sadevans/dataset_simulation

    Creating a synthetic dataset based on simulating SEM physics properties

    Language: Python
  • SamyZouggari/Statistical-Analysis-and-Simulations

    This project uses RStudio to analyze a Poisson distribution, football statistics, and experimental simulations. The results are documented in Markdown with R scripts.

  • vargovema/systemic-risk

    This repository contains the code and implementation for a master's thesis on using deep learning techniques to model systemic risk in financial systems with non-normal risk factors.

    Language: Jupyter Notebook
  • A-Samod/sri-lanka-railways-gps-data-generator

    A Node.js application that simulates GPS data for Sri Lanka Railways, transmitting real-time train location data to a backend API at one-minute intervals.

    Language: JavaScript
  • abikesa/gods

    Shakespeare & Intelligence

    Language: Jupyter Notebook
  • Profbla2020/Environmental-Statistics-Project

    Estimating common parameters of the distribution of a successive random dilution: a comparative study.

    Language: HTML
  • Shashwatpandey4/DataFlux

    High-performance, multi-stream data ingestion simulator. Built for testing real-time pipelines, PB-scale throughput, and stream processing systems like Kafka, Flink, FastAPI, and Iceberg.

    Language: Python