# html-to-markdown

https://static.github-zh.com/github_avatars/firecrawl?size=40

#Web crawler# Firecrawl is an API service that crawls a URL and converts it into clean Markdown or structured data

TypeScript 57.46 k
33 minutes ago
https://static.github-zh.com/github_avatars/mixmark-io?size=40
HTML 10.26 k
1 month ago
https://static.github-zh.com/github_avatars/adbar?size=40
Python 4.67 k
3 days ago
JohannesKaufmann/html-to-markdown
https://static.github-zh.com/github_avatars/JohannesKaufmann?size=40

⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.

Go 3.08 k
22 days ago
https://static.github-zh.com/github_avatars/vsch?size=40

CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.

Java 2.49 k
5 months ago
https://static.github-zh.com/github_avatars/any4ai?size=40

#Web crawler# AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.

TypeScript 2.14 k
1 day ago
https://static.github-zh.com/github_avatars/helloworld-Co?size=40

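A lightweight yet powerful one-click HTML-to-Markdown tool, open-sourced by the helloworld developer community. It converts articles from multiple platforms in one click and saves the downloaded result locally.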

JavaScript 768
1 year ago
https://static.github-zh.com/github_avatars/firecrawl?size=40

#Large language model# 🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.

Jupyter Notebook 543
3 months ago
https://static.github-zh.com/github_avatars/breakdance?size=40

It's time for your markup to get down! HTML to markdown converter. Breakdance is highly pluggable, flexible and easy to use.

JavaScript 533
3 years ago
https://static.github-zh.com/github_avatars/paulpierre?size=40

#Large language model# A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG

Python 400
1 year ago
https://static.github-zh.com/github_avatars/mrusme?size=40

reader is for your command line what the “readability” view is for modern browsers: A lightweight tool offering better readability of web pages (and EML files!) on the CLI.

Go 374
2 months ago
https://static.github-zh.com/github_avatars/notlmn?size=40

📋 Browser extension to copy text as Markdown (with GFM and MathML support)

JavaScript 366
3 months ago
https://static.github-zh.com/github_avatars/inhumantsar?size=40

Slurps webpages and saves them as clean, uncluttered Markdown. Think Pocket, but better.

TypeScript 243
9 months ago
https://static.github-zh.com/github_avatars/0x6b?size=40
JavaScript 208
3 months ago
https://static.github-zh.com/github_avatars/gautamd8?size=40

A CLI tool that converts exported Medium posts (html) to Jekyll/Hugo compatible markdown with front matter.

JavaScript 148
1 year ago
https://static.github-zh.com/github_avatars/bevacqua?size=40

😼 Dependency-free and lean DOM parser that outputs Markdown

JavaScript 86
3 years ago