data-selection
There are 25 repositories under the data-selection topic.
princeton-nlp/LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
p-lambda/dsir
DSIR: a large-scale data selection framework for language model training
alon-albalak/data-selection-survey
A Survey on Data Selection for Language Models
georgian-io/Transformers-Domain-Adaptation
[DEPRECATED] Adapt Transformer-based language models to new text domains
waltonfuture/InstructionGPT-4
InstructionGPT-4
yueyu1030/Patron
[ACL 2023] The code for our ACL'23 paper "Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach"
Nokia-Bell-Labs/data-centric-federated-learning
Enhancing Efficiency in Multidevice Federated Learning through Data Selection
reds-lab/projektor
This is an official repository for "Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources" (NeurIPS 2023).
ArsamAryandoust/DataSelectionMaps
Enhanced spatio-temporal electric load forecasts with less data using active deep learning
JoyeBright/DataSelection-NMT
Repository for the experiments in my paper accepted to the CLIN Journal: "Selecting Parallel In-domain Sentences for Neural Machine Translation Using Monolingual Texts"
lvapeab/sentence-selectioNN
Keras sentence classification
surafelml/adapt-mnmt
Dynamic Transfer Learning for Low-Resource Neural Machine Translation
liziniu/ISWBC
Code for NeurIPS 2023 Paper (Imitation Learning from Imperfection: Theoretical Justifications and Algorithms)
zincware/ZnNL
A Python package for studying neural learning
allo-media/cynical-selection
Allo-media data selection tool
sterzhang/CORE
CORE: Mitigating Catastrophic Forgetting in Continual Learning through Cognitive Replay (CogSci 2024 Oral)
wyu-du/Self-Training-Dialogue-Generation
This repository contains the data and code for the paper "Self-training with Two-phase Self-augmentation for Few-shot Dialogue Generation" (EMNLP2022-Findings).
4AI/generative_deduplication
Code for Generative Deduplication For Social Media Data Selection (Findings of EMNLP 2024)
minsu716-kim/Quilt
Quilt: Robust Data Segment Selection against Concept Drifts (AAAI 2024)
abderrahman-bns/Data-Cleaning-and-Preprocessinng-with-Pandas
Introducing you to the fundamentals of the quintessential Python data analysis library, pandas, and its core data structures – the Series and DataFrame objects.
Bessouat40/pdf-region-picker
A project to select only part of a PDF file. It's useful when you want to extract information with a Python library like fitz.
koudounasalkis/CSI-MIT
This repo contains the code for "Privacy Preserving Data Selection for Bias Mitigation in Speech Models"
JLeigh101/belly-button-challenge
NU Bootcamp Module 14
SyncfusionExamples/cell-and-checkbox-selection-with-vue-grid-rows
A quick-start project that helps you perform different types of selection in Vue Grid and learn about the different selection modes: Row, Cell and Both. This project contains code snippets for cell, checkbox and toggle selection, and shows how to get the row index of selected cells using row selection events.