Wei-Ning Hsu
Facebook AI Research (FAIR)
Verified email at csail.mit.edu
Title
Cited by
Year
HuBERT: Self-supervised speech representation learning by masked prediction of hidden units
WN Hsu, B Bolte, YHH Tsai, K Lakhotia, R Salakhutdinov, A Mohamed
IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3451-3460, 2021
Cited by: 2815; Year: 2021
Data2vec: A general framework for self-supervised learning in speech, vision and language
A Baevski, WN Hsu, Q Xu, A Babu, J Gu, M Auli
International Conference on Machine Learning, 1298-1312, 2022
Cited by: 858; Year: 2022
An unsupervised autoregressive model for speech representation learning
YA Chung, WN Hsu, H Tang, J Glass
INTERSPEECH, 2019
Cited by: 460; Year: 2019
Unsupervised learning of disentangled and interpretable representations from sequential data
WN Hsu, Y Zhang, J Glass
Thirty-first Conference on Neural Information Processing Systems (NeurIPS), 2017
Cited by: 430; Year: 2017
On generative spoken language modeling from raw audio
K Lakhotia, E Kharitonov, WN Hsu, Y Adi, A Polyak, B Bolte, TA Nguyen, ...
Transactions of the Association for Computational Linguistics 9, 1336-1354, 2021
Cited by: 328; Year: 2021
Unsupervised speech recognition
A Baevski, WN Hsu, A Conneau, M Auli
Advances in Neural Information Processing Systems 34, 27826-27839, 2021
Cited by: 312; Year: 2021
Hierarchical generative modeling for controllable speech synthesis
WN Hsu, Y Zhang, RJ Weiss, H Zen, Y Wu, Y Wang, Y Cao, Y Jia, Z Chen, ...
Seventh International Conference on Learning Representations (ICLR), 2019
Cited by: 311*; Year: 2019
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
A Polyak, Y Adi, J Copet, E Kharitonov, K Lakhotia, WN Hsu, A Mohamed, ...
INTERSPEECH, 2021
Cited by: 293; Year: 2021
Learning audio-visual speech representation by masked multimodal cluster prediction
B Shi, WN Hsu, K Lakhotia, A Mohamed
arXiv preprint arXiv:2201.02184, 2022
Cited by: 285; Year: 2022
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training
WN Hsu, A Sriram, A Baevski, T Likhomanenko, Q Xu, V Pratap, J Kahn, ...
INTERSPEECH, 2021
Cited by: 249; Year: 2021
Scaling speech technology to 1,000+ languages
V Pratap, A Tjandra, B Shi, P Tomasello, A Babu, S Kundu, A Elkahky, ...
Journal of Machine Learning Research 25 (97), 1-52, 2024
Cited by: 234; Year: 2024
Lingvo: a modular and scalable framework for sequence-to-sequence modeling
J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...
arXiv preprint arXiv:1902.08295, 2019
Cited by: 212; Year: 2019
Active learning by learning
WN Hsu, HT Lin
Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015
Cited by: 207; Year: 2015
Voicebox: Text-guided multilingual universal speech generation at scale
M Le, A Vyas, B Shi, B Karrer, L Sari, R Moritz, M Williamson, V Manohar, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by: 203; Year: 2024
Learning Latent Representations for Speech Generation and Transformation
WN Hsu, Y Zhang, J Glass
INTERSPEECH, 1273-1277, 2017
Cited by: 186; Year: 2017
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation
WN Hsu, Y Zhang, J Glass
2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 16-23, 2017
Cited by: 177; Year: 2017
Direct speech-to-speech translation with discrete units
A Lee, PJ Chen, C Wang, J Gu, S Popuri, X Ma, A Polyak, Y Adi, Q He, ...
arXiv preprint arXiv:2107.05604, 2021
Cited by: 158; Year: 2021
Semi-supervised training for improving data efficiency in end-to-end speech synthesis
YA Chung, Y Wang, WN Hsu, Y Zhang, RJ Skerry-Ryan
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 141; Year: 2019
Textless speech-to-speech translation on real data
A Lee, H Gong, PA Duquenne, H Schwenk, PJ Chen, C Wang, S Popuri, ...
arXiv preprint arXiv:2112.08352, 2021
Cited by: 137; Year: 2021
Disentangling correlated speaker and noise for speech synthesis via data augmentation and adversarial factorization
WN Hsu, Y Zhang, RJ Weiss, YA Chung, Y Wang, Y Wu, J Glass
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 130; Year: 2019