data-anonymization

There are 70 repositories under the data-anonymization topic.

  • microsoft/presidio

    An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

    Language:Python5.6k75495741
  • securitybunker/databunker

    Secure Vault for Customer PII/PHI/PCI/KYC Records

    Language:Go1.3k341485
  • arx-deidentifier/arx

    ARX is a comprehensive open source data anonymization tool aiming to provide scalability and usability. It supports various anonymization techniques and methods for analyzing data quality and re-identification risks, as well as well-known privacy models such as k-anonymity, l-diversity, t-closeness, and differential privacy.

    Language:Java67233193228
  • ArtLabss/open-data-anonymizer

    Python Data Anonymization & Masking Library For Data Science Tasks

    Language:Python27281135
  • thoughtbot/top_secret

    Filter sensitive information from free text before sending it to external services or APIs, such as chatbots and LLMs.

    Language:Ruby2475
  • makinacorpus/DbToolsBundle

    A PHP library to back up, restore and anonymize databases

    Language:PHP219610716
  • Dado1513/HideDroid

    HideDroid is an Android app that allows the per-app anonymization of collected personal data according to a privacy level chosen by the user.

    Language:Java202101510
  • BMW-InnovationLab/BMW-Anonymization-API

    This repository allows you to anonymize sensitive information in images/videos. The solution is fully compatible with the DL-based training/inference solutions that we already published/will publish for Object Detection and Semantic Segmentation.

    Language:Python1874318
  • privateai/deid-examples

    Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask, and synthesize PII in text.

    Language:Jupyter Notebook84531
  • jftuga/deidentification

    Deidentify people's names and gender-specific pronouns

    Language:Python41103
  • IFCA-Advanced-Computing/anjana

    ANJANA is a Python library for anonymizing sensitive data

    Language:Python35113
  • bluewave-labs/maskwise

    Maskwise detects, redacts, masks, and anonymizes sensitive data across text, images, and structured data in training datasets for LLM systems. Powered by Microsoft Presidio

    Language:TypeScript28001
  • snaplet/docs

    Snaplet Documentation

    Language:HTML2811210
  • aliengiraffe/deidentify

    Simple yet powerful tool for identifying and anonymizing personal information in various formats.

    Language:Go25
  • thoughtworks-datakind/anonymizer

    Library for identification, anonymization and de-anonymization of PII data

    Language:Python222255
  • KI-AIM/Cinnamon

    Cinnamon is a modular application designed to offer robust functionalities for data anonymization, synthetization, and evaluation.

    Language:Java211
  • nikosgalanis/local-dp-protocols

    🎓🔒 Creating, Analyzing and Testing Differential Privacy Protocols, aimed at Data Protection and Anonymization.

    Language:Jupyter Notebook17101
  • stefanrmmr/differentially_private_synthetic_data

    Differentially Private Synthetic Data Generation [DP-SDG] - Experimental Setups & Knowledge Base - WORK IN PROGRESS

    Language:Jupyter Notebook12102
  • yevh/anonymizer

    Anonymize sensitive data in your datasets.

    Language:Python12101
  • fabriziosalmi/csv-anonymizer

    CSV fuzzer/anonymizer

    Language:JavaScript1010
  • fgmacedo/datanonymizer

    Anonymizer tool for datasets such as CSV files

    Language:Python9310
  • eriknovak/anonipy

    Data anonymization package, supporting different anonymization strategies

    Language:Python71153
  • geniuszly/GenFakeGenerator

    This script generates various types of fake data, such as names, addresses, phone numbers, coordinates, and more, using the Faker library. Users can select the data type and the quantity to generate. The generated data is saved to a JSON file.

    Language:Python710
  • OsgiliathEnterprise/data-migrator

    Generate anonymized test dataset from production data and configurable anonymization sequences. Execute base to base (vendor agnostic) export and import

    Language:Java7173
  • Aymane11/anonymize

    Data anonymization made easy

    Language:Python6102
  • ryokugyu/One-Pass-KMeans-Algorithms

    Implementation of An Efficient Clustering Method for k-Anonymization in Python 2.7

    Language:Python6202
  • Vishwamitra/data-anonymizer

    DataAnonymizer is an open-source personal data anonymization tool designed for GDPR compliance

    Language:TypeScript5131
  • data-protection-helpers/induction-anonymization

    Induction to anonymization of data

    Language:Jupyter Notebook4000
  • jaimedantas/data-anonymization-diabetes

    Impacts of data anonymization on model prediction for diabetes

    Language:MATLAB4203
  • sonbachmi/NgAnonymize

    Data anonymization using Angular 2+

    Language:TypeScript4200
  • sandrociceros/DataMasker

    A free data masking and/or anonymizer library

    Language:C#3201
  • athulck/Data-Anonymization-Tool

    M.Tech final year project to create a data anonymization tool.

    Language:Python2100
  • Club-Innovate/GenAI-SQL-CLI

    GenAI-SQL is a modular, extensible suite of AI-powered tools for automating SQL code improvement, documentation, and validation. Built for developers, analysts, and data engineers, it leverages Azure OpenAI (GPT-4o) to analyze, refactor, comment, explain, test, and audit SQL — all within a secure, asynchronous, and HIPAA-compliant framework.

    Language:Python2
  • henryhamon/iris-disguise

    Data Anonymization tool for InterSystems IRIS

    Language:ObjectScript2103
  • pavanad/beegen

    BeeGen is an intelligent command-line tool designed to assist developers with everyday tasks, leveraging the power of generative AI.

    Language:Python20
  • viclang/anonymacy

    anonymaCy is a spaCy extension for anonymizing PII using rule-based recognizers, context-aware processing, conflict resolution and customizable anonymization.

    Language:Python2