data-synthesis

There are 33 repositories under the data-synthesis topic.

  • AgaMiko/data-augmentation-review

    A list of useful data augmentation resources. Here you will find some uncommon techniques, libraries, links to GitHub repos, papers, and more.

  • swz30/CycleISP

    [CVPR 2020--Oral] CycleISP: Real Image Restoration via Improved Data Synthesis

    Language: Python
  • DIYer22/bpycv

    Computer vision utils for Blender (generate instance annotation, depth, and 6D pose with one line of code)

    Language: Python
  • Tebmer/Awesome-Knowledge-Distillation-of-LLMs

    This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & Vertical Distillation of LLMs.

  • MrGiovanni/SyntheticTumors

    [CVPR 2023] Label-Free Liver Tumor Segmentation

    Language: Python
  • Baukebrenninkmeijer/On-the-Generation-and-Evaluation-of-Synthetic-Tabular-Data-using-GANs

    Repository for the results of my master's thesis on the generation and evaluation of synthetic data using GANs

    Language: Jupyter Notebook
  • Xiaohao-Xu/SLAM-under-Perturbation

    Official code for Customizable Embodied Multi-modal Perturbations for SLAM Robustness Benchmarking

    Language: C++
  • Eleanor-H/MUSTARD

    Code & data for ICLR 2024 spotlight paper: 🍯MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data

    Language: C++
  • Gariscat/loopy

    A data framework for music information retrieval focusing on electronic music.

    Language: Python
  • MatthewCYM/GenSE

    Official implementation of the EMNLP 2022 paper "Generate, Discriminate, and Contrast: A Semi-Supervised Sentence Representation Learning Framework"

    Language: Python
  • vkit-x/vkit

    Boosting Document Intelligence

    Language: Python
  • laanlabs/FootRenderer

    A data synthesizer for creating datasets of feet from a first-person perspective.

    Language: Swift
  • phrocker/nifi-datasynthesizer

    Apache NiFi Data Synthesizer

    Language: Java
  • zealscott/LDPTrace

    Source code for LDPTrace: Locally Differentially Private Trajectory Synthesis. VLDB 2023.

    Language: Python
  • sushantdhumak/Trigger-Word-Detection

    Coursera - RNN Programming Assignment: In this project, we will construct a speech dataset and implement an algorithm for trigger word detection (sometimes also called keyword detection, or wake word detection).

    Language: Jupyter Notebook
  • ArenaGrenade/bpycv3d

    Blender Python package for extracting internal data from Blender scenes for 3D data generation.

    Language: Python
  • Smithsonian/CCN-Data-Library

    The Coastal Carbon Network Data Library: An open-source database featuring carbon data from tidal wetlands around the world

    Language: HTML
  • KelestZ/ICW-GANs

    FMRI data augmentation via synthesis, The IEEE International Symposium on Biomedical Imaging (ISBI'19)

    Language: Python
  • PD-Mera/object-detection-data-synthesis

    Synthesize data in YOLO format given background and object images

    Language: Python
  • xuguodong1999/pen-simulator

    Data synthesis for simulation of pen-based interaction

    Language: C++
  • hruffieux/echoseq

    echoseq R package - Synthetic-data generator: replication and simulation of molecular and clinical data

    Language: R
  • michelhilg/data-synthesis

    This GitHub repository showcases my bachelor's thesis, which explores and compares various deep generative models for synthetic image augmentation in the manufacturing domain.

    Language: Jupyter Notebook
  • pamudu123/BEE_counting

    Counting Bees

    Language: Jupyter Notebook
  • alexisfischer/bloom-baby-bloom

    Synthesize, analyze, and visualize biological oceanography data

    Language: MATLAB
  • alexisfischer/ifcb-data-science

    Build machine learning image classifiers and summarize large image datasets from the Imaging FlowCytobot (IFCB)

    Language: MATLAB
  • avishagnevo/VaccineMatchAnalysis

    Comprehensive reproduction of the paper "BNT162b2 mRNA Covid-19 Vaccine in a Nationwide Mass Vaccination Setting" by Noa Dagan, MD, et al., assisted by Professor Yair Goldberg. This statistical project explores vaccination's multifaceted impact on infection rates, employing synthetic data, advanced matching, and sophisticated statistical analysis.

    Language: Python
  • data-programming/fictitious-dimensions

    Since the times of d'Alembert, Lagrange and Euler humans like to add fictitious dimensions to their real-world physical and mathematical problems. This art was perfected in the XX-th century by Heisenberg, Pauli and Dirac in their 'matrix mechanics'. In the XXI-st century we can contribute to this proud tradition too, we have computers! :)

  • homayounsr/Sentiment-Classification-using-IMDB-reviews

    For this project, I aimed to perform sentiment analysis on IMDB movie reviews. My dataset consisted of over 36,000 reviews, each accompanied by movie ratings ranging from 0 to 10. The primary objective was to construct a machine learning model capable of categorizing reviews into three sentiment classes: negative, neutral, and positive.

    Language: Jupyter Notebook
  • lone17/DECAF

    Data Utility Improvement Experiment for DECAF

    Language: Python
  • JohannesWiesner/nisynth

    A repository for synthesizing and simulating MRI images

    Language: Python
  • ready4-dev/ready4web

    Website of the ready4 suite of tools for data synthesis and modelling in mental health

    Language: HTML
  • wbuchanan/sdpConvening2023

    Repository for the slide deck and code examples for a talk at SDP Convening 2023

    Language: HTML
  • ZhengtongYan/Daisy

    Language: Python