🤖/👨‍🦰 Detect bots/crawlers/spiders using the user agent string
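User-agent bot detection typically boils down to matching the UA string against a list of known bot keywords. A minimal sketch of the idea (the pattern below is an illustrative subset, not any particular library's actual list):

```python
import re

# Illustrative subset of common bot/crawler keywords; real detectors
# ship much larger, regularly updated pattern lists.
BOT_PATTERN = re.compile(
    r"bot|crawl|spider|slurp|curl|wget|facebookexternalhit",
    re.IGNORECASE,
)

def is_bot(user_agent: str) -> bool:
    """Return True if the user agent string looks like a bot/crawler."""
    return bool(BOT_PATTERN.search(user_agent or ""))
```

For example, `is_bot("Mozilla/5.0 (compatible; Googlebot/2.1)")` matches on "bot", while a plain desktop browser UA does not.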
#Web crawler#An R web crawler and scraper
#Web crawler#Open source SEO auditing tool.
#Search#Norconex Crawlers (or spiders) are flexible web and filesystem crawlers for collecting, parsing, and manipulating data from the web or filesystem into various data repositories such as search engines.
Astray is a Lua-based maze, room, and dungeon generation library for dungeon crawlers and roguelike video games
#Search#Simple robots.txt template. Keeps unwanted robots out (disallow) and allow-lists legitimate user agents (allow). Useful for all websites.
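A template along those lines might look like the following (the bot names and the blanket rule here are placeholders, not the template's actual contents):

```
# Allow legitimate crawlers (empty Disallow = allow everything)
User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow:

# Block all other robots
User-agent: *
Disallow: /
```

Note that robots.txt is advisory: well-behaved crawlers honor it, but it does not enforce anything at the server level.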
#Web crawler#Vietnamese text data crawler scripts for various sites (including YouTube, Facebook, 4rum, news, ...)
#Web crawler#hproxy - An asynchronous IP proxy pool for crawlers that aims to make getting a proxy as convenient as possible.
Block crawlers and high-traffic users on your site by IP using Redis
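The usual pattern behind this kind of tool is per-IP rate limiting with a counter that expires after a fixed window. A sketch of that logic follows; a real deployment would keep the counters in Redis via redis-py (`INCR` plus `EXPIRE`) so they are shared across workers, but an in-memory dict stands in here so the sketch runs without a server. The window and threshold values are illustrative:

```python
import time
from typing import Dict, Optional, Tuple

WINDOW_SECONDS = 60   # illustrative window length
MAX_REQUESTS = 100    # illustrative per-window threshold

# ip -> (request count, window start time); in production this state
# would live in Redis so all app servers see the same counters.
_counters: Dict[str, Tuple[int, float]] = {}

def is_blocked(ip: str, now: Optional[float] = None) -> bool:
    """Count one request from `ip`; return True once it exceeds the limit."""
    now = time.time() if now is None else now
    count, start = _counters.get(ip, (0, now))
    if now - start >= WINDOW_SECONDS:  # window elapsed: reset (Redis EXPIRE)
        count, start = 0, now
    count += 1                         # one request observed (Redis INCR)
    _counters[ip] = (count, start)
    return count > MAX_REQUESTS
```

The check would typically run in middleware, returning HTTP 429 (or dropping the connection) when `is_blocked` fires.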
#Web crawler#Raven is a powerful and customizable web crawler written in Go.
#Web crawler#Tiny PHP script to crawl information about a specific application on the Google Play Store.
#Web crawler#Sneakpeek is a framework that helps you quickly and conveniently develop scrapers. It's the best choice for scrapers with specific, complex scraping logic that needs to run on a constant basis.
#Web crawler#Serritor is an open source web crawler framework built on Selenium and written in Java. It can crawl dynamic web pages that require JavaScript to render data.
User agent database in JSON format of bots, crawlers, certain malware, automated software, scripts and uncommon ones.
#Web crawler#Python script to check whether an application's responses differ when the request comes from a search engine's crawler.
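This kind of cloaking check fetches the same URL twice, once with a crawler User-Agent and once with a browser one, then diffs the two bodies. A sketch of the comparison step, assuming the bodies have already been fetched (the UA strings and helper name below are illustrative, not the script's actual interface):

```python
import difflib

# Illustrative User-Agent strings for the two fetches.
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

def diff_responses(browser_body: str, crawler_body: str) -> list:
    """Return unified-diff lines exposing content served only to the
    crawler or only to the browser; empty list means no cloaking seen."""
    return list(difflib.unified_diff(
        browser_body.splitlines(),
        crawler_body.splitlines(),
        fromfile="browser", tofile="crawler", lineterm="",
    ))

# In a real check the two bodies would come from something like
# requests.get(url, headers={"User-Agent": GOOGLEBOT_UA}).text
# versus the same request with BROWSER_UA.
```

A non-empty diff is only a signal, not proof of cloaking: dynamic pages (timestamps, CSRF tokens, ads) differ between any two fetches, so real checks usually normalize or ignore such noise first.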