Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces
Li HF(李海芳)1; Yingce Xia2; Wensheng Zhang1
2018-04
Conference date: July 13-19, 2018
Conference location: Stockholm, Sweden
Abstract

Policy evaluation with linear function approximation is an important problem in reinforcement learning. In high-dimensional feature spaces, this problem becomes extremely hard in terms of both computational efficiency and approximation quality. We propose a new algorithm, LSTD(λ)-RP, which leverages random projection techniques and takes eligibility traces into consideration to tackle these two challenges. We carry out a theoretical analysis of LSTD(λ)-RP and provide meaningful upper bounds on the estimation error, approximation error, and total generalization error. These results demonstrate that LSTD(λ)-RP can benefit from both random projection and eligibility traces, and that it can achieve better performance than the prior LSTD-RP and LSTD(λ) algorithms.
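
As an illustration only, the idea summarized in the abstract can be sketched as follows: project the D-dimensional features down to d dimensions with a random matrix, then run standard LSTD(λ) with an eligibility trace in the projected space. This is a minimal sketch and not the authors' implementation. The function name lstd_lambda_rp, its parameters, the Gaussian projection with N(0, 1/d) entries, and the use of a least-squares solve are all assumptions made for this example.

```python
import numpy as np

def lstd_lambda_rp(features, rewards, gamma, lam, d, seed=0):
    """Hypothetical sketch of LSTD(lambda) on randomly projected features.

    features: array of shape (n + 1, D), rows are phi(x_0), ..., phi(x_n)
    rewards:  array of shape (n,), rewards[t] is received on x_t -> x_{t+1}
    Returns the projection matrix H (d, D) and the weight vector theta (d,).
    """
    rng = np.random.default_rng(seed)
    n, D = len(rewards), features.shape[1]

    # Assumed projection: i.i.d. Gaussian entries with variance 1/d.
    H = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, D))
    psi = features @ H.T  # projected features, shape (n + 1, d)

    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)  # eligibility trace
    for t in range(n):
        z = gamma * lam * z + psi[t]
        A += np.outer(z, psi[t] - gamma * psi[t + 1])
        b += z * rewards[t]

    # Least-squares solve so that a singular A does not raise an error
    # (an implementation choice for this sketch).
    theta = np.linalg.lstsq(A / n, b / n, rcond=None)[0]
    return H, theta


# Usage on a deterministic two-state cycle with reward 1 in state 0.
states = [t % 2 for t in range(41)]
feats = np.eye(2)[states]
rews = np.array([1.0 if s == 0 else 0.0 for s in states[:-1]])
H, theta = lstd_lambda_rp(feats, rews, gamma=0.5, lam=0.5, d=2)
values = np.eye(2) @ H.T @ theta  # estimated V(0), V(1)
```

The toy usage sets d equal to D so that the value function stays exactly representable after projection; the setting the abstract targets is d much smaller than D, where the projection trades some approximation error for cheaper computation.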

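Proceedings publisher: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)
Language: English
Content type: Conference paper
Source URL: http://ir.ia.ac.cn/handle/173211/26084
Collection: Research Center of Precision Sensing and Control_Artificial Intelligence and Machine Learning
Corresponding author: Li HF(李海芳)
Author affiliations: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Science and Technology of China
Recommended citation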
GB/T 7714
Li HF, Yingce Xia, Wensheng Zhang. Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces[C]. In:. Stockholm, Sweden. July 13-19 2018.