Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces


	Li HF (李海芳)1; Yingce Xia2; Wensheng Zhang1
	2018-04
Conference date	July 13-19, 2018
Conference location	Stockholm, Sweden
Abstract	Policy evaluation with linear function approximation is an important problem in reinforcement learning. In high-dimensional feature spaces, the problem becomes extremely hard in terms of both computational efficiency and approximation quality. We propose a new algorithm, LSTD(λ)-RP, which leverages random projection techniques and takes eligibility traces into consideration to tackle these two challenges. We carry out a theoretical analysis of LSTD(λ)-RP and provide meaningful upper bounds on the estimation error, approximation error, and total generalization error. These results demonstrate that LSTD(λ)-RP can benefit from the random projection and eligibility trace strategies, and that it can achieve better performance than the prior LSTD-RP and LSTD(λ) algorithms.
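Proceedings publisher	Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)
Language	English
Content type	Conference paper
Source URL	[http://ir.ia.ac.cn/handle/173211/26084]
Collection	Research Center of Precision Sensing and Control_Artificial Intelligence and Machine Learning
Corresponding author	Li HF (李海芳)
Author affiliations	1. Institute of Automation, Chinese Academy of Sciences; 2. University of Science and Technology of China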
Recommended citation (GB/T 7714)	Li HF, Yingce Xia, Wensheng Zhang. Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces[C]. In:. Stockholm, Sweden. July 13-19 2018.
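
The abstract describes combining random projection with eligibility traces inside LSTD. This record does not give the algorithm itself, so the following is only a minimal sketch of that general idea, not the authors' implementation. It assumes a single trajectory of raw feature vectors, a Gaussian projection matrix with i.i.d. N(0, 1/d) entries, and the textbook LSTD(λ) accumulation of A and b. The names `lstd_lambda_rp` and `predict`, the parameter defaults, and the small ridge term are all hypothetical choices made for illustration.

```python
import numpy as np


def lstd_lambda_rp(features, rewards, d, gamma=0.9, lam=0.5, ridge=1e-8, seed=0):
    """Sketch of LSTD(lambda) run on randomly projected features.

    features: (T + 1, D) array, row t is the raw feature vector of state x_t.
    rewards:  (T,) array, rewards[t] is observed on the step x_t -> x_{t+1}.
    Returns (theta, H): weights in the d-dimensional space and the projection.
    """
    features = np.asarray(features, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    T, D = rewards.shape[0], features.shape[1]

    # Assumption: Gaussian random projection with i.i.d. N(0, 1/d) entries.
    rng = np.random.default_rng(seed)
    H = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, D))
    psi = features @ H.T  # (T + 1, d) projected features

    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)  # eligibility trace in the projected space
    for t in range(T):
        z = gamma * lam * z + psi[t]
        A += np.outer(z, psi[t] - gamma * psi[t + 1])
        b += z * rewards[t]

    # The ridge term is added here for numerical stability only (an assumption).
    theta = np.linalg.solve(A + ridge * np.eye(d), b)
    return theta, H


def predict(theta, H, phi):
    """Estimated value(s) for raw feature vector(s) phi of dimension D."""
    return np.asarray(phi, dtype=float) @ H.T @ theta
```

In this sketch the linear system is solved in d dimensions rather than D, which is where the computational saving of the projection would come from; setting `lam=0` reduces the trace update to plain LSTD on the projected features.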