data-centric

There are 34 repositories under the data-centric topic.

  • ludwig-ai/ludwig

    Low-code framework for building custom LLMs, neural networks, and other AI models

    Language: Python
  • lancedb/lance

    Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, and PyArrow, with more integrations coming.

    Language: Rust
  • daochenzha/data-centric-AI

    A curated, but incomplete, list of data-centric AI resources.

  • encord-team/encord-active

    The toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling.

    Language: Python
  • hkust-nlp/deita

    Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

    Language: Python
  • CLUEbenchmark/DataCLUE

    DataCLUE: A data-centric NLP benchmark and toolkit

    Language: Python
  • s2e-systems/dust-dds

    Rust implementation of the Data Distribution Service (DDS)

    Language: Rust
  • ChandlerBang/GTrans

    [ICLR'23] Implementation of "Empowering Graph Representation Learning with Test-Time Graph Transformation"

    Language: Python
  • astutic/Acharya

    A data-centric annotation tool for Named Entity Recognition (NER) projects

  • zhorton34/vuejs-form

    Vue form with Laravel-inspired validation and a simply enjoyable error messages API. (Form API, Validator API, Rules API, Error Messages API)

    Language: JavaScript
  • PrincetonUniversity/muchiSim

    Simulator framework for analyzing the performance, energy consumption, area, and cost of multi-node, multi-chiplet, tile-based manycore designs

    Language: C++
  • Maksims/mr-Observer

    An observer is a wrapper over JSON data that provides an interface for detecting when the data changes, with a focus on performance and memory efficiency.

    Language: JavaScript
  • kennethleungty/Data-Centric-AI-Competition

    Code for a top 5% finish in the Data-Centric AI Competition organized by Andrew Ng and DeepLearning.AI

    Language: Jupyter Notebook
  • openlayer-ai/examples-gallery

    Sample notebooks that use the Openlayer Python API

    Language: Jupyter Notebook
  • minnesotanlp/infoVerse

    Code for the ACL 2023 paper by Jaehyung Kim et al., "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information"

    Language: Python
  • stoney95/pypely

    From local functions to cloud deployed pipelines

    Language: Python
  • seedatnabeel/Data-IQ

    Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)

    Language: Jupyter Notebook
  • mdbloice/Labeller

    Quickly set up an image labelling web application for manually tagging images for machine learning tasks.

    Language: Python
  • seedatnabeel/Data-SUITE

    Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)

    Language: Jupyter Notebook
  • NileDB/com.niledb.core

    Open-source Data Backend written in Java and based on PostgreSQL & GraphQL.

    Language: JavaScript
  • rajive/doma

    Data-Oriented Microservices Architecture Framework using DDS

    Language: Shell
  • justincpresley/ndn-hydra

    ndn-hydra: A Python-coded NDN distributed repository with five focused attributes: resiliency, scalability, usability, efficiency, and security.

    Language: Python
  • NileDB/com.niledb.dataflow

    A series of NiFi processors that facilitate ingestion of data into the NileDB Core platform.

    Language: Java
  • openlayer-ai/openlayer-python

    The official Python library for Openlayer, the Continuous Model Improvement Platform for AI. 📈

    Language: Python
  • datacentricorg/datacentric-cpp

    Data-centric core services library in C++. For the version supporting multiple languages, see datacentric repo.

    Language: C++
  • cadmiumkitty/dcaf-2020-provo

    Demo code for my talk at Data-Centric Architecture Forum 2020 about data provenance and PROV ontology.

    Language: Java
  • datacentricorg/datacentric

    Data-centric, cross-platform, multi-language core services library for C++, C#, Python, and Java. This repository includes all languages. Each language also has its own repository, e.g. datacentric-cpp.

    Language: C#
  • datacentricorg/datacentric-cs

    Data-centric core services library in C#. For the version supporting multiple languages, see datacentric repo.

    Language: C#
  • datacentricorg/datacentric-py

    Data-centric core services library in Python. For the version supporting multiple languages, see datacentric repo.

    Language: Python
  • nikimacm/trailmixers-project3

    Python and Data Centric Development: A full-stack site that allows users to add, edit, delete, and search hiking trails in the Province of Andalucia, Spain. Users can also upload photos and maps showing their trails. Each route provides a title, the address of the trail, a difficulty level, a description, directions, photos, and maps.

    Language: HTML
  • rajive/doma-skel

    DOMA Skeleton: document and set up a DOMA repository. Clone me!

    Language: Lua
  • datacentricorg/datacentric-java

    Data-centric core services library in Java. For the version supporting multiple languages, see datacentric repo.

  • bryce-bowles/opioid-prescribing-rates

    Semester-long project with the Virginia Department of Social Services to assist in the data-centric reengineering of useful data into VA’s major FAACT database. Tableau dashboard analysis and presentation created using data from 2016 to 2019 on Medicare prescribing rates.

  • zenetio/Traffic-Car-Classifier

    Uses a CNN to classify traffic signs

    Language: Jupyter Notebook