DSpace at KOASAS: Multi-modal depth estimation from misaligned thermal and RGB images


Multi-modal depth estimation from misaligned thermal and RGB images (Korean title: 비정렬 열 영상과 자연 영상으로부터의 다중 모달 깊이 추정)


DC Field	Value	Language
dc.contributor.advisor	Kim, Munchurl (김문철)	-
dc.contributor.author	Kwon, Byeongjun	-
dc.contributor.author	권병준	-
dc.date.accessioned	2024-07-30T19:31:27Z	-
dc.date.available	2024-07-30T19:31:27Z	-
dc.date.issued	2024	-
dc.identifier.uri	http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1097161&flag=dissertation	en_US
dc.identifier.uri	http://hdl.handle.net/10203/321589	-
dc.description	Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2024.2, [v, 45 p.]	-
dc.description.abstract	Depth estimation is the task of predicting, for each pixel of an input image, its depth in the corresponding 3D scene. It has applications in fields such as autonomous driving and virtual reality, and in recent years it has become a crucial topic in autonomous driving and robot vision. In this thesis, we propose an effective deep learning method that estimates depth by jointly using thermal and RGB images, a pairing that is actively studied for enhancing driver and pedestrian safety through automatic pedestrian detection in autonomous driving. The method predicts depth from misaligned thermal and RGB images in a complementary manner. Specifically, to exploit the consistent information that thermal images provide at nighttime and that RGB images provide during daytime, we propose: (i) feature extraction from misaligned thermal and RGB images together with a Cross-fusion module, (ii) a shared encoder and decoder structure for multi-modal image input, and (iii) a Multi-objective training strategy for simultaneous supervised training from multi-modal supervision. In particular, we use cross-attention to effectively extract features for depth prediction from corresponding positions in the misaligned thermal and RGB images. Across various experiments, the proposed method achieves performance improvements of 7 and 4 percentage points, respectively, compared to using only a single modal input (thermal or RGB images).	-
dc.language	eng	-
dc.publisher	Korea Advanced Institute of Science and Technology (한국과학기술원)	-
dc.subject	깊이 추정; 멀티 모달; 비정렬; 강인함 (Korean keywords: depth estimation; multi-modal; misalignment; robustness)	-
dc.subject	Depth estimation (DE); Multi-modal; Misalignment; Robustness	-
dc.title	Multi-modal depth estimation from misaligned thermal and RGB images	-
dc.title.alternative	비정렬 열 영상과 자연 영상으로부터의 다중 모달 깊이 추정	-
dc.type	Thesis(Master)	-
dc.identifier.CNRN	325007	-
dc.description.department	Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering	-
dc.contributor.alternativeauthor	Kim, Munchurl	-
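
The abstract states that cross-attention is used to extract features from corresponding positions in misaligned thermal and RGB images. The record gives no implementation details, so the following is only a minimal sketch of generic scaled dot-product cross-attention, not the thesis's Cross-fusion module. It assumes no learned projections, and the names `softmax`, `cross_attention`, `thermal`, and `rgb` are hypothetical, chosen for illustration. Each thermal feature acts as a query over all RGB positions, so a match can be found even when the two feature sequences are shifted relative to each other.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: N feature vectors from one modality (here: thermal).
    keys, values: M feature vectors from the other modality (here: RGB).
    Returns (outputs, weights): N fused vectors and the N x M attention weights.
    """
    d = len(queries[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key position, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors, one output channel at a time.
        fused = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        outputs.append(fused)
        weights.append(w)
    return outputs, weights


# Toy data (assumed, not from the thesis): the RGB features hold the same
# content as the thermal ones, but shifted by one position (misaligned).
thermal = [[4.0, 0.0], [0.0, 4.0], [-4.0, 0.0]]
rgb = [[-4.0, 0.0], [4.0, 0.0], [0.0, 4.0]]

fused, attn = cross_attention(thermal, rgb, rgb)

# For each thermal position, the RGB position receiving the largest weight.
best_match = [max(range(len(w)), key=w.__getitem__) for w in attn]
```

In this toy case each thermal position puts most of its weight on the shifted RGB position that carries the same content, which is the behavior that makes attention attractive when the two inputs are not pixel-aligned.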

Appears in Collection: EE-Theses_Master(석사논문)

Files in This Item: There are no files associated with this item.


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
