AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation Choi, Jeongsoo; Park, Se Jin; Kim, Minsu; Ro, Yong Man, IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, Computer Vision Foundation, IEEE Computer Society, 2024-06-19 |
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding Choi, Jeongsoo; Hong, Joanna; Ro, Yong Man, IEEE/CVF International Conference on Computer Vision (ICCV), Computer Vision Foundation, IEEE Computer Society, 2023-10-04 |
Intelligible Lip-to-speech Synthesis with Speech Units Choi, Jeongsoo; Kim, Minsu; Ro, Yong Man, 24th INTERSPEECH Conference (INTERSPEECH 2023), International Speech Communication Association (ISCA), 2023-08-22 |
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge Kim, Minsu; Yeo, Jeong Hun; Choi, Jeongsoo; Ro, Yong Man, IEEE/CVF International Conference on Computer Vision (ICCV), Computer Vision Foundation, IEEE Computer Society, 2023-10-04 |
SyncTalkFace: Talking Face Generation with Precise Lip-syncing via Audio-Lip Memory Park, Se Jin; Kim, Minsu; Hong, Joanna; Choi, Jeongsoo; Ro, Yong Man, 36th AAAI Conference on Artificial Intelligence (AAAI 22), pp.2062 - 2070, Association for the Advancement of Artificial Intelligence, 2022-02-25 |
Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models Choi, Jeongsoo; Kim, Minsu; Park, Se Jin; Ro, Yong Man, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024), IEEE Signal Processing Society, 2024-04-16 |
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens Kim, Minsu; Choi, Jeongsoo; Maiti, Soumi; Yeo, Jeong Hun; Watanabe, Shinji; Ro, Yong Man, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024), IEEE Signal Processing Society, 2024-04-16 |
Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring Hong, Joanna; Kim, Minsu; Choi, Jeongsoo; Ro, Yong Man, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Computer Vision Foundation, IEEE Computer Society, 2023-06-20 |