🤗 Kimi-Audio-7B | 🤗 Kimi-Audio-7B-Instruct | 📑 Paper
We present Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation. This repository hosts the model checkpoints for Kimi-Audio-7B.
Kimi-Audio is designed as a universal audio foundation model capable of handling a wide variety of audio processing tasks within a single unified framework. Key features include:
For more details, please refer to our GitHub Repository and Technical Report.
Kimi-Audio-7B is a base model that has not been fine-tuned, so it cannot be used directly. It is flexible, however: you can fine-tune it for a wide range of downstream tasks.
If you are looking for an out-of-the-box model, please refer to Kimi-Audio-7B-Instruct.
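Before fine-tuning, the base checkpoint has to be fetched locally. The following is a minimal sketch, not an official recipe: it assumes the checkpoint is published on the Hugging Face Hub under the repo id `moonshotai/Kimi-Audio-7B` (not stated on this page) and that the `huggingface_hub` package is installed. The function name `download_checkpoint` and the target directory are hypothetical.

```python
from pathlib import Path

# Assumption: the Hub repo id for the base checkpoint; adjust if it differs.
REPO_ID = "moonshotai/Kimi-Audio-7B"


def download_checkpoint(
    repo_id: str = REPO_ID,
    local_dir: str = "checkpoints/Kimi-Audio-7B",  # hypothetical target directory
) -> Path:
    """Download every file of the model repo and return the local directory."""
    # Imported lazily so the module can be imported without the dependency.
    from huggingface_hub import snapshot_download

    path = snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return Path(path)


if __name__ == "__main__":
    # Fetches the full checkpoint, which requires network access and disk space.
    print(download_checkpoint())
```

The downloaded directory can then be passed to whichever fine-tuning pipeline you use; see the GitHub repository for the project's own loading code.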
If you find Kimi-Audio useful in your research or applications, please cite our technical report:
@misc{kimi_audio_2024,
  title={Kimi-Audio Technical Report},
  author={Kimi Team},
  year={2024},
  eprint={arXiv:placeholder},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
The model is based on and modified from Qwen2.5-7B. Code derived from Qwen2.5-7B is licensed under the Apache 2.0 License. Other parts of the code are licensed under the MIT License.