data-filtering
There are 115 repositories under the data-filtering topic.
weAIDB/awesome-data-llm
Official Repository of "LLM × DATA" Survey Paper
p-lambda/dsir
DSIR large-scale data selection framework for language model training
przemek83/volbx
Graphical tool for data manipulation written in C++/Qt.
GUNDAM-Labet/GUNDAM
GUNDAM is a data management system that prioritizes data using language models.
gookit/filter
⏳ Provides filtering, sanitizing, and conversion of Golang data.
heera/requent
A GraphQL like interface to map a request to eloquent query with data transformation for Laravel.
Victorwz/MLM_Filter
Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".
jonnieZG/EWMA
Exponentially Weighted Moving Average Filter
lpreterite/datagent
A tool for modular management of front-end requests.
zhuang-li/SCAR
[ACL 2025 main] SCAR: Data Selection via Style Consistency-Aware Response Ranking for Efficient Instruction-Tuning of Large Language Models
ai-forever/DataProcessingFramework
Framework for processing and filtering datasets
angelajw/QualtricsDataCleaning
R tutorial: useful R code for cleaning and filtering data from Qualtrics surveys and for creating new variables in the data frame, with step-by-step explanations.
kehvinbehvin/json-mcp-filter
JSON MCP server to filter only relevant data for your LLM
noronhadaniel/ACS_2023
This repository contains all (Python 3) code and libraries required for the 2022-2023 Notre Dame Rocketry Team (NDRT) Apogee Control System (ACS). It also contains sensor/actuator example code and flight data.
giupardeb/EpiMethEx
EpiMethEx (Epigenetic Methylation and Expression), an R package for large-scale integrated analysis using cyclic correlation analyses between methylation and gene expression data.
RahulGoel2000/SBLS-Smartphone-Bot-Localization-system
Data extraction from smartphones, and fusion of GPS and accelerometer data with a Kalman filter.
levitation-opensource/DataAnonymiser
Anonymises data inside text files and sheet files. It recognises and removes various sorts of personally identifiable information (PII). Each removed part is replaced with a suitable generic text, depending on the type of removed data. English and Russian are currently supported; Russian works with both Cyrillic and Latin characters.
ryandkuster/ngsComposer
Base-call error-filtering and read preprocessing pipeline for fastq libraries
ajnanmvr/CDC-Connect
CDC Connect is a cross-platform mobile application built in React Native using JavaScript. The app is designed for data collection with a focus on surveys.
azuregray/ExcelScope
A multi-parameter sequential search utility for filtering through an input Excel Datasheet.
chaleaoch/lumi-filter
A powerful and flexible data filtering library with unified interface for multiple data sources including Peewee ORM, Pydantic models, and Python iterables. Flask-friendly.
DevExpress-Examples/winforms-grid-make-auto-filter-row-insensitive-to-accents
Make the data grid's Auto Filter Row insensitive to accents.
emre-tarhan/sql-desc-limit
PHP | SQL - Fetching a desired number of records with DESC LIMIT
w2xim3/sqljson
A powerful tool that allows users to query JSON data using SQL-like syntax. Effortlessly search, filter, and manipulate your JSON data with familiar SQL queries.
averageencoreenjoer/processing-csv
CSV Processing Tool is a Python CLI utility for filtering and aggregating data from CSV files. It allows you to quickly process large amounts of tabular information using the command line, without the need to use Excel or databases.
axah710/DSA
This repository features Data Structures and Algorithms (DSA) practices in Dart, focusing on mastering fundamental programming concepts and problem-solving techniques.
emre-tarhan/sql-between-interval
PHP - SQL | Using the Between & Interval Expressions
kgniewek/FileReader-DataProcessorPractise
2021 Java practice project focused on file reading and data processing. It includes custom exception handling, conversion of data into objects, and basic filtering of records based on specific criteria. An exercise in Java fundamentals.
kvvsatyaravi/ismart-data-visualizer
demo version
Leg0shii/FileArchiver
FileArchiver is a robust tool designed to safely archive outdated data from very large datasets (Terabyte size) and efficiently filter geo-data for mapping purposes. Developed for Deutsche Bahn AG, it streamlines the management of extensive geographical data to optimize storage and enhance data processing efficiency.
mrhrifat/mw-react-test
Filter and fetch data dynamically
rachits999003/Data-Analysis-and-Analytic-tool
A powerful, interactive desktop dashboard built with PyQt5, Matplotlib, Seaborn, Plotly, and scikit-learn. Designed for data wrangling, visualization, and machine learning—all in one elegant dark-themed GUI.
RobCyberLab/Ngram-Similarity-Engine
🤖Ngram Similarity Engine📚
sethubolt7/CVE_CUSTOM_API
This repository contains a backend using Spring Boot, JPA, and H2 to manage and display over 10,000 CVE records. It fetches CVE data from a public source, stores it in H2, and provides custom endpoints with filtering by year, metric score, and last modified date. Built with MVC architecture for structured data handling and web page integration.
utmhikari/daggre
DAta-AGGREgator, a tool to handle data aggregation tasks
vimalnathnambiar/exfilms
A command-line interface tool to extract, filter, and standardise MS data.