"Your own personal internet archive" (website archiving / crawler), a self-hosted time machine for websites
Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.
Desktop Electron app for the ArchiveBox internet archiver. (ALPHA: not ready for general use)
List of open workflows and resources for A/V archiving
Home of the official docker image for ArchiveBox
FileTrove indexes files and creates metadata from them.
IFIscripts is an open-source digital preservation tool which facilitates collection management workflows within the IFI and further afield. It is freely available from the GitHub repository and subjec...
Homebrew formula for the ArchiveBox self-hosted internet archiving solution.
Engine for analysis of Siegfried export files and DROID CSV. The tool has three purposes: break the export into its components and store them within a SQLite database; create additional columns to aug...
Official ArchiveBox MITM proxy: saves URLs of all requests passing through to an ArchiveBox server for archival.
Collection of resources, papers, blog posts, and other documentation around working on and with Archivematica.
DigestBox takes any webpage URL (news article, video link, comment thread, etc.) and gives you just the raw content. It's powered by ArchiveBox.io under the hood.
Home of the official apt/deb package for Ubuntu/Debian-based systems.
#web crawler# 🧩 Proposal to allow user scripts like "expand comments", "hide popups", "fill out this form", etc. to be reusable across pure browser environments, puppeteer, playwright, extensions, AI tools, and ma...
Official Python package for ArchiveBox, the self-hosted internet archiving solution.