GitHub Chinese Community

#scraping

scrapy / scrapy

#Crawler framework# A popular, efficient Python crawling framework with a rich ecosystem

Python, scraping, crawling, framework, crawler, Hacktoberfest, web-scraping, web-scraping-python
Python 57.54 k
2 days ago
mendableai / firecrawl

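#Web crawler# Firecrawl is an API service that crawls a URL and converts it into clean markdown or structured data

artificial intelligence, crawler, data, Markdown, scraper, html-to-markdown, large language models, rag, scraping, web-crawler, ai-scraping, webscraping
TypeScript 42.81 k
12 hours ago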
feder-cr / Jobs_Applier_AI_Agent_AIHawk

#Web crawler# AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it lets users apply for multiple jobs in a tailored way.

automation, Bot, ChatGPT, gpt, job, jobsearch, jobseeker, opeai, Python, resume, scraper, scraping, application-resume, Selenium, Chrome, human-resources, jobs, agent, artificial intelligence
Python 28.44 k
1 month ago
gocolly / colly

#Crawler framework# A fast and elegant crawling framework for Golang

Go, scraper, framework, crawler, scraping, crawling, spider
Go 24.43 k
24 days ago
ScrapeGraphAI / Scrapegraph-ai

#Web crawler# Python scraper based on AI

scraping, scraping-python, automated-scraper, large language models, artificial intelligence, web-crawler, web-scraping, ai-scraping, crawler, html-to-markdown, Markdown, rag
Python 20.25 k
9 days ago
apify / crawlee

#Web crawler# Crawlee - a web scraping and browser automation library for Node.js

web-scraping, web-crawling, npm, headless-chrome, Puppeteer, automation, apify, scraping, crawling, crawler, headless, scraper, web-crawler, JavaScript, Node.js, Playwright, TypeScript
TypeScript 18.44 k
13 hours ago
soxoj / maigret

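#Web crawler# Maigret is an OSINT username checker. Given a target username, it collects information about that user from major social networking sites. Forked from the open-source sherlock project.

OSINT, social-network, identification, Parsing, socmint, username-checker, username-search, sherlock, username, investigation, namechecker, Python, Open Source, Cybersecurity, scraping, osint-python, redteam, blueteam, osint-framework
Python 15.56 k
5 days ago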
psf / requests-html

#Web crawler# Pythonic HTML Parsing for Humans™

HTML, scraping, Python, requests, HTTP, kennethreitz, lxml, pyquery, css-selectors, beautifulsoup
Python 13.83 k
1 year ago
code4craft / webmagic

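#Web crawler# webmagic is an open-source vertical crawler framework for Java. It aims to simplify crawler development so that developers can focus on their own logic. Its core is very simple yet covers the whole crawling workflow, which also makes it good material for learning crawler development.

crawler, Java, scraping, framework
Java 11.58 k
8 days ago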
ultrafunkamsterdam / undetected-chromedriver

#Web crawler# Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM)

chromedriver, Selenium, webdriver, Chrome, anti-detection, anti-bot, distil, browser, automation, scraping, Python, captcha, navigator, Testing, Cloudflare, cloudflare-bypass, bot-detection
Python 11.43 k
7 days ago
tabulapdf / tabula

#Web crawler# Tabula is a tool for liberating data tables trapped inside PDF files

pdf, CSV, excel, tables, scraping
CSS 7.11 k
4 months ago
lorien / awesome-web-scraping

#Web crawler# List of libraries, tools and APIs for web scraping and data processing.

web-scraping, captcha-recaptcha, crawling, crawling-python, scraping, scraping-framework, scraping-python, scraping-tool, webscraping, crawler, spider
Makefile 7.07 k
6 months ago
alirezamika / autoscraper

#Web crawler# A Smart, Automatic, Fast and Lightweight Web Scraper for Python

scraping, scraper, scrape, webscraping, crawler, web-scraping, artificial intelligence, Python, webautomation, automation, machine learning
Python 6.84 k
1 month ago
D4Vinci / Scrapling

#Web crawler# 🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

crawler, crawling, crawling-python, Hacktoberfest, Playwright, Python, scraping, selectors, stealth-game, web-scraper, web-scraping, web-scraping-python, webscraping, xpath, automation, artificial intelligence, ai-scraping, data, data-extraction
Python 6.29 k
1 day ago
MontFerret / ferret

#Web crawler# Declarative web scraping

Go, query-language, data-mining, scraping, scraping-websites, dsl, cdp, crawling, scraper, crawler, Chrome, command-line interface, tool, Library
Go 5.83 k
4 days ago
apify / crawlee-python

#Web crawler# Crawlee - A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works...

apify, automation, beautifulsoup, crawler, crawling, headless, headless-chrome, pip, Playwright, Python, scraper, scraping, web-crawler, web-crawling, web-scraping, Hacktoberfest
Python 5.81 k
16 hours ago
yujiosaka / headless-chrome-crawler

#Web crawler# Distributed crawler powered by Headless Chrome

headless-chrome, Puppeteer, jQuery, crawler, crawling, scraper, scraping, Chrome, Chromium, Promise
JavaScript 5.58 k
2 years ago
adbar / trafilatura

#Web crawler# Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

web-scraping, text-extraction, natural language processing, text-mining, crawler, text-preprocessing, article-extractor, readability, scraping, html-to-markdown, corpus-tools, rss-feed, news-aggregator, rag, large language models
Python 4.47 k
1 month ago
sparklemotion / mechanize

#Web crawler# Mechanize is a Ruby library that makes automated web interaction easy.

scraping, Web, Ruby
Ruby 4.42 k
1 month ago
CodeCutTech / Data-science

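#Web crawler# A collection of data science tools, software, and articles

data science, machine learning, natural language processing, Python, data visualization, data analysis, articles, artificial intelligence, time-series, scraping
Jupyter Notebook 4.11 k
1 month ago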