Entity Resolution is the task of detecting different entity profiles that describe the same real-world object.
Deduplication Based Filesystem
#Data Warehouse# Fast Semantic Text Deduplication & Filtering
RabbitMQ Plugin for filtering message duplicates
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.
"1 + 1 = 1 or Record Deduplication with Python" Jupyter Notebook
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Streaming Deduplication Package for Go
Blocklist compilation and deduplication
#Security# Data deduplication engine, supporting optional compression and public key encryption.
Resources for tackling record linkage / deduplication / data matching problems
Enable deduplication with non-Synology SSDs and unsupported NAS models
A containerd snapshotter with data deduplication and lazy loading in P2P fashion
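Document deduplication addresses the problem of semantically duplicate documents in search engines, using a semantic fingerprint algorithm based on multiple hashes.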
#Security# Cross-platform backup tool for Windows, macOS & Linux with fast, incremental backups, client-side end-to-end encryption, compression and data deduplication. CLI and GUI included.