#Natural Language Processing#Industrial-strength Natural Language Processing (NLP) library for Python/CPython
#Large Language Models#Easy token price estimates for 400+ LLMs. TokenOps.
A suite of image and video neural tokenizers
LunaSec - Dependency Security Scanner that automatically notifies you about vulnerabilities like Log4Shell or node-ipc in your Pull Requests and Builds. Protect yourself in 30 seconds with the LunaTra...
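Translated description - LunaSec - Security and compliance SDK that prevents data leaks in software. With just a few lines of code, LunaSec adds a zero-trust architecture, unique per-record encryption, and protection against common security issues such as XSS, SQL injection, and RCE to your stack. Try it live here: https://app.lunasec.dev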
#Security#Secure Vault for Customer PII/PHI/PCI/KYC Records
Translated description - Secure storage of personal records in compliance with GDPR
#Blockchain#Ravencoin Core integration/staging tree
#Natural Language Processing#Unsupervised text tokenizer focused on computational efficiency
#Natural Language Processing#👑 spaCy building blocks and visualizers for Streamlit apps
#Natural Language Processing#All the slides, accompanying code, and exercises are stored in this repo. 🎈
#Natural Language Processing#Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
#Natural Language Processing#Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashta...
Ungreedy subword tokenizer and vocabulary trainer for Python, Go & JavaScript
Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing
#Natural Language Processing#PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language
ClangKit provides an Objective-C frontend to LibClang. Source tokenization, diagnostics and fix-its are currently implemented.
#Natural Language Processing#🎤 vibrato: Viterbi-based accelerated tokenizer
Sudachi in Rust 🦀 and new generation of SudachiPy
#Natural Language Processing#Fast and customizable text tokenization library with BPE and SentencePiece support
#Time-Series Database#The official code 👩‍💻 for TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis
[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.