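#Natural Language Processing#Implementations of common NLP tasks, including new word discovery and PyTorch-based word embeddings, Chinese text classification, named entity recognition, summary text generation, sentence similarity judgment, triple extraction, and pre-trained models.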
#Natural Language Processing#Spanish word embeddings computed with different methods and from different corpora
#Natural Language Processing#Tools for shrinking fastText models (in gensim format)
#Computer Science#Text to abstract art generation for the holidays!
A monolingual and cross-lingual meta-embedding generation and evaluation framework
#Natural Language Processing#Persian sentiment analysis ( آناکاوی سهش های فارسی | تحلیل احساسات فارسی )
#Natural Language Processing#PyTorch repository for text categorization and NER experiments in Turkish and English.
Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
An evaluation of word-embeddings for classification
Repository for the experiments described in the paper "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"
Language models for the legal domain in Spanish, developed at BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).
Machine Translation from Sanskrit to Hindi using Unsupervised and Supervised Learning
#Natural Language Processing#Ensemble of PhoBERT with FastText embeddings to improve performance on Vietnamese sentiment analysis tasks.
Improving Bilingual Lexicon Induction with Cross-Encoder Reranking (Findings of EMNLP 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.
#Natural Language Processing#Romanian word embeddings. Here you can find pre-trained word embeddings. Current methods: CBOW, Skip-Gram, FastText (from the Gensim library). The .vec and .model files are available for downl...
#Natural Language Processing#FastText-based offensive language detection in tweets using the OLID and SOLID datasets, with LIME visualizations for interpretability.
#Natural Language Processing#This project contains the code to use custom fastText embeddings with the flair framework.
Repository for the free online book Oddly Satisfying Deep Learning from Scratch (link below!)
Biomedical Word embeddings generated from Spanish Biomedical corpora.
Machine learning-based solution to the problem of duplicate reports in bug report repositories.