data-augmentation

There are 991 repositories under the data-augmentation topic.

  • snorkel-team/snorkel

    A system for quickly generating training data with weak supervision

    Language:Python5.7k168980861
  • NVIDIA/DALI

    A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

    Language:C++5k941.6k608
  • ZhaoJ9014/face.evoLVe

    🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥

    Language:Python3.4k110186759
  • QData/TextAttack

    TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/

    Language:Python2.8k38269375
  • webdataset/webdataset

    A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

    Language:Python2k21297154
  • fepegar/torchio

    Medical imaging toolkit for deep learning

    Language:Python2k18451230
  • iver56/audiomentations

    A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

    Language:Python1.7k20180182
  • 425776024/nlpcda

    One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda

    Language:Python1.7k932166
  • AgaMiko/data-augmentation-review

    A list of useful data augmentation resources, including some uncommon techniques, libraries, links to GitHub repos, papers, and more.

  • jasonwei20/eda_nlp

    Data augmentation for NLP, presented at EMNLP 2019

    Language:Python1.6k3638316
  • yongzhuo/nlp_xiaojiang

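    Natural language processing (NLP), Xiaojiang robot (chit-chat retrieval-based chatbot), BERT sentence embeddings and similarity (Sentence Similarity), XLNet sentence embeddings and similarity (text xlnet embedding), text classification, entity extraction (NER, BERT+BiLSTM+CRF), data augmentation (text augment, data enhance), paraphrase and synonym generation, sentence main-part extraction (mainpart), Chinese short-text similarity, text feature engineering, keras-http-service calls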

    Language:Python1.5k4115394
  • visual-layer/fastdup

    fastdup is a free tool for rapidly extracting insights from image and video datasets. It helps you improve the quality of your dataset's images and labels and reduce data operations costs at scale.

    Language:Python1.4k2122974
  • zhanlaoban/EDA_NLP_for_Chinese

    An implementation of the EDA paper for Chinese corpora. EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes.

    Language:Python1.3k1717241
  • LirongWu/awesome-graph-self-supervised-learning

    Code for TKDE paper "Self-supervised learning on graphs: Contrastive, generative, or predictive"

  • Paperspace/DataAugmentationForObjectDetection

    Data Augmentation For Object Detection

    Language:Jupyter Notebook1.1k2420314
  • asteroid-team/torch-audiomentations

    Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

    Language:Python8931110486
  • styfeng/DataAug4NLP

    Collection of papers and resources for data augmentation for NLP.

  • goru001/inltk

    Natural Language Toolkit for Indic Languages aims to provide out of the box support for various NLP tasks that an application developer might need

    Language:Python8133667164
  • zhunzhong07/Random-Erasing

    Random Erasing Data Augmentation. Experiments on CIFAR10, CIFAR100 and Fashion-MNIST

    Language:Python7161618156
  • DemisEom/SpecAugment

    An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

    Language:Python6341130136
  • textflint/textflint

    Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing

    Language:Python630183293
  • YuliangXiu/MobilePose

    Light-weight Single Person Pose Estimator

    Language:Jupyter Notebook6302046149
  • Westlake-AI/openmixup

    CAIRI Supervised, Semi- and Self-Supervised Visual Representation Learning Toolbox and Benchmark

    Language:Python583165359
  • firmai/deltapy

    DeltaPy - Tabular Data Augmentation (by @firmai)

    Language:Jupyter Notebook53317353
  • conradry/copy-paste-aug

    Copy-paste augmentation for segmentation and detection tasks

    Language:Jupyter Notebook52651873
  • quqxui/Awesome-LLM4IE-Papers

    Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)

  • arcelien/pba

    Efficient Learning of Augmentation Policy Schedules

    Language:Jupyter Notebook503203086
  • MTG/DeepConvSep

    Deep Convolutional Neural Networks for Musical Source Separation

    Language:Python4663420110
  • hongyi-zhang/mixup

    Implementation of the mixup training method

    Language:Python4597881
  • codebox/image_augmentor

    Data augmentation tool for images

    Language:Python4401011153
  • amanchadha/coursera-gan-specialization

    Programming assignments and quizzes from all courses within the GANs specialization offered by deeplearning.ai

    Language:Jupyter Notebook41062290
  • denisyarats/drq

    DrQ: Data regularized Q

    Language:Jupyter Notebook398132549
  • tigerlab-ai/tiger

    Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)

    Language:Jupyter Notebook38511926
  • bethgelab/imagecorruptions

    Python package to corrupt arbitrary images.

    Language:Python38491661
  • sshuair/torchsat

    🔥TorchSat 🌏 is an open-source deep learning framework for satellite imagery analysis based on PyTorch.

    Language:Python383201048
  • vanderschaarlab/synthcity

    A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.

    Language:Python3821212851