html-to-markdown
There are 124 repositories under the html-to-markdown topic.
firecrawl/firecrawl
The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data 🔥
ScrapeGraphAI/Scrapegraph-ai
Python scraper based on AI
mixmark-io/turndown
🛏 An HTML to Markdown converter written in JavaScript
adbar/trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
JohannesKaufmann/html-to-markdown
⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.
vsch/flexmark-java
CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
any4ai/AnyCrawl
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
helloworld-Co/html2md
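A lightweight, powerful one-click HTML-to-Markdown tool, open-sourced by the helloworld developer community. It supports one-click conversion of articles from multiple platforms and saves them locally.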
philschmid/clipper.js
HTML to Markdown converter and crawler.
firecrawl/firecrawl-app-examples
🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.
breakdance/breakdance
It's time for your markup to get down! HTML to markdown converter. Breakdance is highly pluggable, flexible and easy to use.
devflowinc/firecrawl-simple
➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.
paulpierre/markdown-crawler
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
mrusme/reader
reader is for your command line what the “readability” view is for modern browsers: A lightweight tool offering better readability of web pages (and EML files!) on the CLI.
notlmn/copy-as-markdown
📋 Browser extension to copy text as Markdown (with GFM and MathML support)
inhumantsar/slurp
Slurps webpages and saves them as clean, uncluttered Markdown. Think Pocket, but better.
0x6b/copy-selection-as-markdown
Firefox add-on to copy selection as Markdown
gtmdh/medium-2-md
A CLI tool that converts exported Medium posts (html) to Jekyll/Hugo compatible markdown with front matter.
Spenhouet/confluence-markdown-exporter
Export Atlassian Confluence pages as markdown files.
bevacqua/domador
:smirk_cat: Dependency-free and lean DOM parser that outputs Markdown
oidlabs-com/Lexoid
Multimodal document parser for high quality data understanding and extraction
inaridiy/webforai
The best HTML to Markdown library: ESM-native, with useful utilities, and simple, lightweight, epic quality.
EvitanRelta/htmlarkdown
HTML-to-Markdown converter that adaptively preserves HTML when needed (eg. when center-aligning, or resizing images)
agarwalvishal/claude-chat-exporter
Claude Chat Exporter is a JavaScript tool that allows you to export your conversations with Claude AI into a well-formatted Markdown file.
tim-gromeyer/html2md
Transform your HTML into clean, easy-to-read markdown with html2md.
syfxlin/xkeditor
:pencil: XK-Editor | An editor that supports rich text and Markdown
ActuallyTaylor/SwiftHTMLToMarkdown
A simple Swift package that converts HTML into Markdown
lightfeed/extractor
Using LLMs and AI Browser Automation to Robustly Extract Web Data
Stardown-app/Stardown
Copy the web as markdown
iw4p/url-to-markdown
URL to Markdown API is a service that converts web content into clean, structured Markdown format through a simple HTTP GET request. It's built using FastAPI and the MarkItDown library, offering a straightforward way to convert various content types (web pages, YouTube videos, PDFs, documents) into Markdown that's optimized for Large Language Mod
kasvith/htmd
A fast HTML to Markdown converter for Elixir, powered by Rust
dedalozzo/converter
A set of classes to translate text from HTML to BBCode and from BBCode to Markdown.
ParryQiu/Generate-Cnblogs-Articles-To-Markdown
Export Cnblogs articles and store them as Markdown files
izyuumi/html2md-rs
HTML to Markdown converter written in Rust
opendocs-md/do-tutorials
DigitalOcean tutorials in Markdown format
spider-rs/web-crawling-guides
How-to guides on web crawling or scraping