Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs
Liang, Yun ; Wang, Shuo
Journal: JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY
2016
Keywords: register file racetrack memory GPU EFFICIENT COMPILER FRAMEWORK CACHE
DOI: 10.1007/s11390-016-1610-1
Abstract: The key to high performance for GPU architecture lies in its massive threading capability to drive a large number of cores and enable execution overlapping among threads. However, in reality, the number of threads that can execute simultaneously is often limited by the size of the register file on GPUs. The traditional SRAM-based register file takes up such a large amount of chip area that it cannot scale to meet the increasing demand of GPU applications. Racetrack memory (RM) is a promising technology for designing large-capacity register files on GPUs due to its high data storage density. However, without careful deployment of an RM-based register file, the lengthy shift operations of RM may hurt performance. In this paper, we explore RM for designing a high-performance register file for GPU architecture. High-storage-density RM helps to improve thread-level parallelism (TLP), but if the bits of the registers are not aligned to the ports, shift operations are required to move the bits to the access ports before they are accessed, and thus the read/write operations are delayed. We develop an optimization framework for RM-based register files on GPUs, which employs three different optimization techniques at the application, compilation, and architecture levels, respectively. Specifically, we optimize the TLP at the application level, design a register mapping algorithm at the compilation level, and design a preshifting mechanism at the architecture level. Collectively, these optimizations help to determine the TLP without causing cache and register file resource contention and reduce the shift operation overhead. Experimental results using a variety of representative workloads demonstrate that our optimization framework achieves up to 29% (21% on average) performance improvement.; supported by the National Natural Science Foundation of China; SCI(E); EI; Chinese Science and Technology Core Journals (ISTIC); Chinese Science Citation Database (CSCD); ARTICLE; ericlyun@pku.edu.cn; shvowang@pku.edu.cn; 1; 36-49; 31
Language: English
Content type: Journal article
Source URL: http://ir.pku.edu.cn/handle/20.500.11897/437947
Collection: School of Electronics Engineering and Computer Science
Recommended citation
GB/T 7714
Liang, Yun, Wang, Shuo. Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs[J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2016.
APA Liang, Yun, & Wang, Shuo. (2016). Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY.
MLA Liang, Yun, et al. "Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs". JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY (2016).