#Computer Science#DAMO-YOLO: a fast and accurate object detection method with several new techniques, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement.
DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.
DAMON user-space tool
ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab
Lightweight HTML5 Canvas Danmaku Engine
UITableView subclass that abstracts away the ugliness involved with creating static or modular UITableViews. Settings and menu pages are a snap to create with DAModularTableView.
A terminal UI (TUI) for HashiCorp Nomad
DAMOV is a benchmark suite and a methodical framework targeting the study of data movement bottlenecks in modern applications. It is intended to study new architectures, such as near-data processing. ...
cocos creator
Official implementation of "Composer: Creative and Controllable Image Synthesis with Composable Conditions"
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Self-Supervised Pre-Training for Transformer-Based Person Re-Identification
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
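Built on DAMO Academy's technical expertise in deep learning, computer vision, and geospatial analysis, and backed by Alibaba Cloud's powerful computing capacity, this service provides cloud-based analysis of multi-source Earth observation data, using data to perceive the Earth and letting AI support scientific research.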
Source code of ICML'22 paper: FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
Data and code for paper "M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models"
#Large Language Models#Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.
The official code for "One Fits All: Power General Time Series Analysis by Pretrained LM (NeurIPS 2023 Spotlight)"