Haoran Wang
I am a senior undergraduate student in the ACM Honors Class at Shanghai Jiao Tong University, majoring in Computer Science.
I am currently a visiting student at WAVLab, affiliated with LTI/CMU, advised by Prof. Shinji Watanabe.
During my junior year, I was very fortunate to work with Prof. Kai Yu at X-LANCE Lab.
My research interests lie in generative audio and speech processing, especially in neural audio codecs and speech
language models. My long-term goal is to pioneer a universal audio foundation model capable of holistically
understanding and generating the full spectrum of sound, including speech, music, and complex acoustic scenes, while
making these models high-fidelity, controllable, and efficient.
Feel free to check out my CV and shoot me an email if you'd like to chat.
Email / CV / Google Scholar / GitHub
|
|
Education
 |
Carnegie Mellon University
Visiting Scholar • Aug. 2025 - Present
Advisor: Prof.
Shinji Watanabe
|
 |
Shanghai Jiao Tong University
B.Eng. in Computer Science • Sept. 2022 - June 2026 (Expected)
Academic Advisor: Prof.
Yong Yu
Research Advisor: Prof.
Kai Yu
|
Selected Publications
My research interests lie in speech and audio processing, especially neural codecs and speech language models for complex acoustic environments. I am interested in building general and high-fidelity speech representations that are useful for both reconstruction and downstream tasks.
Representative works are highlighted.
BSCodec: A Band-Split Neural Codec for High-Quality Universal Audio Reconstruction
Haoran Wang, Jiatong Shi, Jinchuan Tian, Bohan Li, Kai Yu, Shinji Watanabe
Under review.
arXiv / code / demo
|
Towards General Discrete Speech Codec for Complex Acoustic Environments: A Study of Reconstruction and Downstream Task Consistency
Haoran Wang, Guanyu Chen, Bohan Li, Hankun Wang, Yiwei Guo, Zhihan Li, Xie Chen, Kai Yu
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2025.
arXiv
|
PURE Codec: Progressive Unfolding of Residual Entropy for Speech Codec Learning
Jiatong Shi, Haoran Wang, William Chen, Chenda Li, Wangyou Zhang, Jinchuan Tian, Shinji Watanabe
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2025.
|
Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective
Hankun Wang, Haoran Wang, Yiwei Guo, Zhihan Li, Chenpeng Du, Xie Chen, Kai Yu
Under review.
arXiv
|
The design of this homepage is based on a previous version of Xingyang Li's website, which was itself inspired by the designs of Haoran Geng and Jon Barron. Last updated: Nov. 29, 2025
© 2025 Haoran Wang