#Web crawler# Firecrawl is an API service that crawls a URL and converts it into clean Markdown or structured data
A JavaScript library for converting HTML to Markdown
#Web crawler# Python & command-line tool to gather text and metadata on the Web: crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.
CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
A lightweight yet powerful one-click HTML-to-Markdown tool open-sourced by the helloworld developer community. It converts articles from multiple platforms in one click and saves the downloads locally.
It's time for your markup to get down! HTML to Markdown converter. Breakdance is highly pluggable, flexible and easy to use.
#Natural language processing# HTML to Markdown converter and crawler.
#Large language model# A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 Markdown file for each page, designed for LLM RAG
🖱 Browser extension to copy hyperlinks, images, and selected text as Markdown with GFM support
reader is for your command line what the “readability” view is for modern browsers: A lightweight tool offering better readability of web pages on the CLI.
Slurps webpages and saves them as clean, uncluttered Markdown. Think Pocket, but better.
Firefox add-on to copy selection as Markdown
A CLI tool that converts exported Medium posts (html) to Jekyll/Hugo compatible markdown with front matter.
#Large language model# 🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.
😼 Dependency-free and lean DOM parser that outputs Markdown
HTML-to-Markdown converter that adaptively preserves HTML when needed (e.g. when center-aligning or resizing images)
#Editor# 📝 XK-Editor | An editor that supports rich text and Markdown
#Web crawler# The best HTML to Markdown library: ESM-native, with useful utilities, and simple, lightweight and epic quality.
Transform your HTML into clean, easy-to-read markdown with html2md.