pii

There are 205 repositories under the pii topic.

  • microsoft/presidio

    An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

    Language: Python
  • CatchTheTornado/text-extract-api

    Document (PDF, Word, PPTX, ...) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.

    Language: Python
  • capitalone/DataProfiler

    What's in your data? Extract schema, statistics and entities from datasets

    Language: Python
  • securitybunker/databunker

    Secure Vault for Customer PII/PHI/PCI/KYC Records

    Language: Go
  • redhuntlabs/Octopii

    An AI-powered Personally Identifiable Information (PII) scanner.

    Language: Python
  • rohitcoder/hawk-eye

    A powerful scanner to scan your Filesystem, S3, MySQL, Redis, Google Cloud Storage and Firebase storage for PII and sensitive data.

    Language: Python
  • tokern/piicatcher

    Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub

    Language: Python
  • thoughtbot/top_secret

    Filter sensitive information from free text before sending it to external services or APIs, such as chatbots and LLMs.

    Language: Ruby
  • microsoft/presidio-research

    This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.

    Language: Jupyter Notebook
  • samber/slog-formatter

    🚨 slog: Attribute formatting

    Language: Go
  • GoogleCloudPlatform/dlp-dataflow-deidentification

    Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP

    Language: Java
  • EdyVision/pii-codex

    A research Python package for detecting, categorizing, and assessing the severity of personally identifiable information (PII)

    Language: Python
  • klouddb/klouddbshield

    KloudDB Shield is a comprehensive Postgres Security Tool: PII Scanner, CIS Benchmarks, SSL audit, 12+ features. Supports Postgres, RDS, Aurora, MySQL

    Language: Go
  • philterd/phileas

    The open source PII and PHI redaction and de-identification engine

    Language: Java
  • amanvirparhar/elara

    A simple tool to anonymize LLM prompts.

    Language: Svelte
  • rpgeeganage/pII-guard

    🛡️ PII Guard is an LLM-powered tool that detects and manages Personally Identifiable Information (PII) in logs, designed to support data privacy and GDPR compliance

    Language: TypeScript
  • cxumol/promptmask

    Never give AI companies your secrets! A local LLM-based privacy filter for LLM users. Seamless integration with your existing AI tools as a Python library / OpenAI SDK replacement / API Gateway / Web Server.

    Language: Python
  • polentino/redacted

    Scala library and compiler plugin that prevent inadvertent leakage of sensitive fields in `case classes` (such as credentials, personal data, and other confidential information)

    Language: Scala
  • open-privacy/opv

    Open Privacy Vault - Secure, Performant, Open Source PII as a Service.

    Language: Go
  • deliciousinsights/mongoose-pii

    A Mongoose plugin that lets you transparently cipher stored PII and use securely-hashed passwords

    Language: JavaScript
  • edwardcooper/piidetect

    A package to build an end-to-end pipeline for detecting personally identifiable information from text.

    Language: Python
  • PovertyAction/PII_detection

    Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets.

    Language: Python
  • apicrafter/metacrafter

    Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules

    Language: Python
  • jftuga/deidentification

    Deidentify people's names and gender-specific pronouns

    Language: Python
  • AgenticA5/A5-PII-Anonymizer

    Desktop App with Built-In LLM for Removing Personally Identifiable Information in Documents

    Language: JavaScript
  • mddunlap924/PII-Detection

    Personally Identifiable Information (PII) entity detection and performance enhancement with synthetic data generation

    Language: Python
  • Poogles/piiregex

    Search for PII in Python

    Language: Python
  • primait/veil

    Rust derive macro for redacting sensitive data in std::fmt::Debug

    Language: Rust
  • AidanSpeakss/streamer-mode-for-firefox

    Hides personal information from pages, similar to Discord's Streamer mode.

    Language: JavaScript
  • MLukman/Keycloak-PII-Data-Encryption-Provider

    A Keycloak provider that automatically encrypts user attributes containing PII when they are stored to the database and decrypts them when they are loaded from the database

    Language: Java
  • aliengiraffe/deidentify

    Simple yet powerful tool for identifying and anonymizing personal information in various formats.

    Language: Go
  • nightfallai/nightfall-python-sdk

    Python Data Loss Prevention (DLP) SDK - Nightfall Developer Platform

    Language: Python
  • seanpedrick-case/doc_redaction

    Redact PDF/image-based documents, or CSV/XLSX files, using a Gradio-based GUI

    Language: Python
  • kylemclaren/scrub

    A Python package to scrub PII

    Language: Python
  • Stuub/GitHush

    Detecting leaked secrets, API keys, credentials, and sensitive files from public repositories in near real-time using the GitHub Events API

    Language: Python
  • ipcrypt-std/ipcrypt2

    A tiny, portable implementation of the IPCrypt specification in C.

    Language: C