extraction
There are 569 repositories under the extraction topic.
axa-group/Parsr
Transforms PDF, Documents and Images into Enriched Structured Data
Trusted-AI/adversarial-robustness-toolbox
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
google/mtail
extract internal monitoring data from application logs for collection in a timeseries database
aubio/aubio
a library for audio and music analysis
apache/tika
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
symfony/property-access
Provides functions to read and write from/to an object or array using a simple string notation
morkt/GARbro
Visual Novels resource browser
onekey-sec/unblob
Extract files from any kind of container formats
NanoNets/docext
An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)
dbashford/textract
node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf and more!
chrismattmann/tika-python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
yobix-ai/extractous
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
langchain-ai/langchain-extract
🦜⛏️ Did you say you like data?
rikyoz/bit7z
A C++ static library offering a clean and simple interface to the 7-zip shared libraries.
Lattyware/unrpa
A program to extract files from the RPA archive format.
philipperemy/stanford-openie-python
Stanford Open Information Extraction made simple!
BDBC-KG-NLP/IE-Survey
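A survey of the information extraction field by the natural language processing research team at Beihang University's Advanced Innovation Center for Big Data. It covers subtasks including entity recognition, relation extraction, and attribute extraction, and surveys both academia and industry for each subtask.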
carlospuenteg/File-Injector
File Injector is a script that allows you to store any file in an image using steganography
overtools/OWLib
DataTool is a program that lets you extract models, maps, and files from Overwatch.
rize/UriTemplate
PHP URI Template (RFC 6570) supports both URI expansion & extraction
nissl-lab/toxy
.NET text extraction & export framework
squid-box/SevenZipSharp
Fork of SevenZipSharp on CodePlex
puddly/android-otp-extractor
Extracts OTP tokens from rooted Android devices
nazuke/SEOMacroscope
SEO Macroscope is a website scanning tool, to check your website for broken links; including some technical SEO functionality, site scraping, Excel reporting, and more.
IceHacks/SurvivCheatInjector
An actual, updated, surviv.io cheat. Works great and we reply fast.
robinst/autolink-java
Java library to extract links (URLs, email addresses) from plain text; fast, small and smart
thrau/jarchivelib
A simple archiving and compression library for Java
nazywam/AutoIt-Ripper
Extract AutoIt scripts embedded in PE binaries
NeelShah18/emot
Open source Emoticons and Emoji detection library: emot
BobLd/tabula-sharp
Extract tables from PDF files (port of tabula-java)
DiegoCaraballo/Email-extractor
The main functionality is to extract all the email addresses from one or several URLs
Dither/full-text-rss
Full-Text RSS can transform partial feeds to deliver the full content stripped of clutter and ads
assafmo/xioc
Extract indicators of compromise from text, including "escaped" ones.
rossumai/docile
DocILE: Document Information Localization and Extraction Benchmark
evyatarmeged/stegextract
Detect hidden files and text in images
chrise96/3D_Ground_Segmentation
A ground segmentation algorithm for 3D point clouds based on the work described in “Fast segmentation of 3D point clouds: a paradigm on LIDAR data for Autonomous Vehicle Applications”, D. Zermas, I. Izzat and N. Papanikolopoulos, 2017. Distinguish between road and non-road points. Road surface extraction. Plane fit ground filter