pii-detection
There are 83 repositories under the pii-detection topic.
microsoft/presidio
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
vllm-project/semantic-router
Intelligent Mixture-of-Models Router for Efficient LLM Inference
redhuntlabs/Octopii
An AI-powered Personally Identifiable Information (PII) scanner.
google/magritte
Mediapipe-based library to redact faces from videos and images
thoughtbot/top_secret
Filter sensitive information from free text before sending it to external services or APIs, such as chatbots and LLMs.
databrickslabs/discoverx
A Swiss-Army-knife for your Data Intelligence platform administration.
awslabs/sensitive-data-protection-on-aws
The Sensitive Data Protection on AWS solution allows enterprise customers to create data catalogs, discover, protect, and visualize sensitive data across multiple AWS accounts. The solution eliminates the need for manual tagging to track sensitive data such as Personal Identifiable Information (PII) and classified information.
EdyVision/pii-codex
A research Python package for detecting, categorizing, and assessing the severity of personally identifiable information (PII).
rpgeeganage/pII-guard
🛡️ PII Guard is an LLM-powered tool that detects and manages Personally Identifiable Information (PII) in logs — designed to support data privacy and GDPR compliance
edwardcooper/piidetect
A package to build an end-to-end pipeline for detecting personally identifiable information from text.
apicrafter/metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules.
arcjet/example-nextjs
An example Next.js application protected by Arcjet.
mddunlap924/PII-Detection
Personal Identifiable Information (PII) entity detection and performance enhancement with synthetic data generation
bluewave-labs/maskwise
Maskwise detects, redacts, masks, and anonymizes sensitive data across text, images, and structured data in training datasets for LLM systems. Powered by Microsoft Presidio
aliengiraffe/deidentify
Simple yet powerful tool for identifying and anonymizing personal information in various formats.
seanpedrick-case/doc_redaction
Redact PDF/image-based documents or CSV/XLSX files using a Gradio-based GUI.
akazah/prompt-anonymizer
Anonymize / mask personal information before sending prompts to chat AI (like ChatGPT provided by OpenAI)
Akshay7591/Web-Scanner
A web scanner written in Python that scans a given URL and returns its domain name, IP address, nmap scan results, and the contents of the URL's robots.txt.
fvaleye/metadata-guardian
Provide an easy way with Python to protect your data sources by searching its metadata. 🛡️
redhat-et/semantic_router
LLM Semantic Router: Intelligent Mixture-of-Models (MoM) System with Privacy Preservation and Prompt Guard. The semantic router intelligently directs OpenAI-compliant API requests to the most suitable backend models based on semantic understanding of request content.
apicrafter/metacrafter-registry
Registry of metadata identifier entities like UUID, GUID, person fullname, address and so on. Linked with other sources
dotfurther/OpenDiscoverSDK
.NET 8 API for document file format identification, text/metadata/attachment/embedded object/sensitive item (PII/PHI)/entity extraction.
edwardcooper/data-sentry
A project to build a machine learning pipeline to detect personal identifiable information (PII)
dotfurther/OpenDiscoverPlatformCaseStudy
Case study using dotfurther's Open Discover Platform with the RavenDB document store to rapidly create a full-text search/eDiscovery/information governance capable demonstration application.
DataFog/codexify
An open-source API that identifies, masks, and replaces Personally Identifying Information (PII).
HabaneroCake/pii-filter
A personally identifiable information (PII) filter.
gretelai/multi-table
Notebook and code to synthesize relational databases such as Postgres and MySQL.
oxytis/oxidize
Discover sensitive PII data. Find the most common personally identifiable information in your environment, such as financial information. Quickly determine exposure after a breach.
Lizhecheng02/Kaggle-PII_Data_Detection
Implement named entity recognition (NER) using regex and a fine-tuned LLM, with a total of 15 categories. The ultimate goal is to apply the model to detect personally identifiable information (PII) in student writing.
mns-llc/bitsnarf
Finds useful information in English/US strings using regex with a focus on PII.
bballamudi/data-sentry
A project to build a machine learning pipeline to detect personal identifiable information (PII)
montevive/go-name-detector
High-performance Go library for PII name detection. 10x faster than Python with embedded 727K names dataset. Production-ready out-of-the-box functionality for privacy compliance and data security.
arcjet/example-remix
An example Remix application protected by Arcjet.
jayeshthk/Aegis
Aegis is a lightweight Chrome extension that obfuscates sensitive data like emails, phone numbers, and credit card numbers in real time. It features an intuitive interface for masking and copying obfuscated text, ensuring your privacy while interacting with web forms and input fields.
ComputerAnything/spii_redactor
A Python-based tool for detecting and redacting PII from images using EasyOCR and spaCy.
viclang/anonymacy
anonymaCy is a spaCy extension for anonymizing PII using rule-based recognizers, context-aware processing, conflict resolution, and customizable anonymization.