#Web crawler#Firecrawl is an API service that crawls a URL and converts it into clean Markdown or structured data
A JavaScript library for converting HTML to Markdown
#Web crawler#Python & command-line tool to gather text and metadata on the Web: crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.
CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
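A lightweight yet powerful one-click HTML-to-Markdown tool, open-sourced by the helloworld developer community. It converts articles from multiple platforms in one click and saves the downloaded result locally.
#Natural language processing#HTML to Markdown converter and crawler.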
It's time for your markup to get down! HTML to Markdown converter. Breakdance is highly pluggable, flexible, and easy to use.
#Large language model#A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 Markdown file for each page, designed for LLM RAG
🖱 Browser extension to copy hyperlinks, images, and selected text as Markdown with GFM support
reader is for your command line what the “readability” view is for modern browsers: A lightweight tool offering better readability of web pages on the CLI.
#Large language model#🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.
Slurps webpages and saves them as clean, uncluttered Markdown. Think Pocket, but better.
Firefox add-on to copy selection as Markdown
A CLI tool that converts exported Medium posts (HTML) to Jekyll/Hugo-compatible Markdown with front matter.
😼 Dependency-free and lean DOM parser that outputs Markdown
HTML-to-Markdown converter that adaptively preserves HTML when needed (e.g. when center-aligning or resizing images)
#Web crawler#The best HTML to Markdown library: ESM-native, with useful utilities, and simple, lightweight, epic quality.
#Editor# 📝 XK-Editor | An editor that supports both rich text and Markdown
Transform your HTML into clean, easy-to-read markdown with html2md.