Detai Xin - Google Scholar


Cited by

	All	Since 2019
Citations	154	154
h-index	7	7
i10-index	5	5

Citations per year: 2021: 11, 2022: 31, 2023: 67, 2024: 45

Detai Xin


The University of Tokyo

Verified email at ipc.i.u-tokyo.ac.jp

Speech processing, Speech synthesis, Machine Learning


Title	Cited by	Year
UTMOS: UTokyo-SaruLab system for VoiceMOS Challenge 2022. T Saeki, D Xin, W Nakata, T Koriyama, S Takamichi, H Saruwatari. arXiv preprint arXiv:2204.02152, 2022	52	2022
Disentangled speaker and language representations using mutual information minimization and domain adaptation for cross-lingual TTS. D Xin, T Komatsu, S Takamichi, H Saruwatari. ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021	28	2021
Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space. D Xin, Y Saito, S Takamichi, T Koriyama, H Saruwatari. Interspeech, 2947-2951, 2020	20	2020
Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis. D Xin, Y Saito, S Takamichi, T Koriyama, H Saruwatari. Interspeech, 1614-1618, 2021	11	2021
Exploring the effectiveness of self-supervised learning and classifier chains in emotion recognition of nonverbal vocalizations. D Xin, S Takamichi, H Saruwatari. arXiv preprint arXiv:2206.10695, 2022	10	2022
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech. D Yang, T Koriyama, Y Saito, T Saeki, D Xin, H Saruwatari. ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	9	2023
Improving speech prosody of audiobook text-to-speech synthesis with acoustic and textual contexts. D Xin, S Adavanne, F Ang, A Kulkarni, S Takamichi, H Saruwatari. ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	7	2023
Laughter synthesis using pseudo phonetic tokens with a large-scale in-the-wild laughter corpus. D Xin, S Takamichi, A Morimatsu, H Saruwatari. arXiv preprint arXiv:2305.12442, 2023	5	2023
NaturalSpeech 3: Zero-shot speech synthesis with factorized codec and diffusion models. Z Ju, Y Wang, K Shen, X Tan, D Xin, D Yang, Y Liu, Y Leng, K Song, ... arXiv preprint arXiv:2403.03100, 2024	3	2024
Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control. A Watanabe, S Takamichi, Y Saito, W Nakata, D Xin, H Saruwatari. 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023	3	2023
Mid-attribute speaker generation using optimal-transport-based interpolation of Gaussian mixture models. A Watanabe, S Takamichi, Y Saito, D Xin, H Saruwatari. ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023	3	2023
How Generative Spoken Language Modeling Encodes Noisy Speech: Investigation from Phonetics to Syntactics. J Park, S Takamichi, T Nakamura, K Seki, D Xin, H Saruwatari. arXiv preprint arXiv:2306.00697, 2023	2	2023
JNV corpus: A corpus of Japanese nonverbal vocalizations with diverse phrases and emotions. D Xin, S Takamichi, H Saruwatari. Speech Communication 156, 103004, 2024	1	2024
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis. D Xin, X Tan, K Shen, Z Ju, D Yang, Y Wang, S Takamichi, H Saruwatari, ... arXiv preprint arXiv:2404.03204, 2024		2024
Building speech corpus with diverse voice characteristics for its prompt-based representation. A Watanabe, S Takamichi, Y Saito, W Nakata, D Xin, H Saruwatari. arXiv preprint arXiv:2403.13353, 2024		2024
JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions. D Xin, J Jiang, S Takamichi, Y Saito, A Aizawa, H Saruwatari. IEEE Access, 2024		2024
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation. D Xin, S Takamichi, T Okamoto, H Kawai, H Saruwatari. arXiv preprint arXiv:2204.10561, 2022		2022
Emotional Speech with Nonverbal Vocalizations: Corpus Design, Synthesis, and Detection. D Xin

