Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, ...
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D, and audio).