Sample Efficient Reinforcement Learning via Large Vision Language Model Distillation

Recent research highlights the potential of multimodal foundation models in tackling complex decision-making challenges. However, their large parameter counts make real-world deployment resource-intensive and often impractical for constrained systems. Reinforcement learning (RL) shows promise for training task-specific agents but suffers from high sample complexity, limiting practical applications. To address these challenges, we introduce LVLM to Policy (LVLM2P), a novel framework that distills knowledge from large vision-language models (LVLMs) into more efficient RL agents. Our approach leverages the LVLM as a teacher, providing instructional actions based on trajectories collected by the RL agent. This reduces unproductive exploration in the early stages of learning, significantly accelerating the agent's progress. Additionally, by leveraging the LVLM to suggest actions directly from visual observations, we eliminate the need for manual textual descriptions of the environment, enhancing applicability across diverse tasks. Experiments show that LVLM2P significantly improves the sample efficiency of baseline RL algorithms. The code is available at https://github.com/i22024/LVLM2P.
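The core idea of the abstract — using teacher-suggested actions as supervision for the student policy — can be sketched as a cross-entropy distillation update. This is a minimal illustrative sketch, not the paper's implementation: `lvlm_teacher_action` is a hypothetical stub standing in for the real LVLM query, and the actual method combines this distillation signal with the baseline RL objective.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over action logits.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def lvlm_teacher_action(obs):
    # Hypothetical stub for the LVLM teacher: in the paper, a
    # vision-language model suggests an action from the visual
    # observation. Here a toy rule stands in for that query.
    return int(obs.sum() > 0)

def distill_step(logits, obs, lr=0.5):
    # One distillation update: cross-entropy loss pushing the
    # student policy toward the teacher's suggested action.
    a_t = lvlm_teacher_action(obs)
    p = softmax(logits)
    grad = p.copy()
    grad[a_t] -= 1.0  # d(CE)/d(logits) for target action a_t
    return logits - lr * grad, a_t

logits = np.zeros(2)   # 2-action student policy, uniform at init
obs = np.ones(4)       # dummy "visual observation"
for _ in range(50):
    logits, a_t = distill_step(logits, obs)
# Probability mass shifts toward the teacher's suggested action.
print(softmax(logits))
```

In the full framework this supervised term would be added to the RL loss, so the teacher only shapes early exploration rather than replacing environment reward.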
Publisher
Institute of Electrical and Electronics Engineers Inc.
Issue Date
2025-04
Language
English
Citation

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

DOI
10.1109/ICASSP49660.2025.10888998
URI
http://hdl.handle.net/10203/336599
Appears in Collection
EE-Conference Papers (학술회의논문; Conference Papers)