Coordinated static and dynamic cache bypassing for GPUs

doi:10.1109/HPCA.2015.7056023


	Coordinated static and dynamic cache bypassing for GPUs
	Xie, Xiaolong ; Liang, Yun ; Wang, Yu ; Sun, Guangyu ; Wang, Tao
	2015
Abstract	The massively parallel architecture of graphics processing units (GPUs) boosts performance for a wide range of applications. Initially, GPUs employed only scratchpad memory as on-chip memory. More recently, to broaden the scope of applications that GPUs can accelerate, vendors have added caches alongside scratchpad memory in newer GPU generations. Unfortunately, GPU caches face many performance challenges that arise from excessive thread contention for cache resources. Cache bypassing, in which memory requests selectively bypass the cache, is one way to mitigate this contention. In this paper, we propose coordinated static and dynamic cache bypassing to improve application performance. At compile time, we use profiling to identify the global loads that show a strong preference for caching or bypassing. For the remaining global loads, our dynamic cache bypassing has the flexibility to cache only a fraction of threads. In the CUDA programming model, threads are divided into work units called thread blocks. Our dynamic bypassing technique modulates the ratio of thread blocks that cache or bypass at run time. We modulate at the thread-block level to avoid memory divergence problems. Our approach combines compile-time analysis, which determines the cache or bypass preference of global loads, with run-time management, which adjusts the ratio of thread blocks that cache or bypass. Our coordinated static and dynamic cache bypassing technique achieves up to 2.28X (average 1.32X) performance speedup for a variety of GPU applications. © 2015 IEEE. Indexed by: EI. Pages: 76-88.
Language	English
Source	2015 21st IEEE International Symposium on High Performance Computer Architecture, HPCA 2015
DOI	10.1109/HPCA.2015.7056023
Content type	Other
Source URL	[http://ir.pku.edu.cn/handle/20.500.11897/423675]
Collection	School of Electronics Engineering and Computer Science (信息科学技术学院)
Recommended citation (GB/T 7714)	Xie, Xiaolong, Liang, Yun, Wang, Yu, et al. Coordinated static and dynamic cache bypassing for GPUs. 2015-01-01.
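
The two-level decision described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the round-robin `period`, and the `"cache"` / `"bypass"` labels are assumptions made only for illustration, and the abstract does not say how the run-time ratio is chosen. The sketch shows just two ideas from the abstract: a load with a static (compile-time) preference keeps it, and any other load follows a per-thread-block decision so that all threads in a block agree.

```python
from typing import Optional


def block_uses_cache(block_id: int, cache_ratio: float, period: int = 8) -> bool:
    """Decide, for a whole thread block, whether its undecided loads are cached.

    Hypothetical policy: out of every `period` consecutive block IDs, the first
    round(cache_ratio * period) blocks cache and the rest bypass. The decision
    depends only on the block ID, so every thread in a block gets the same answer.
    """
    if not 0.0 <= cache_ratio <= 1.0:
        raise ValueError("cache_ratio must be between 0 and 1")
    cached_per_period = round(cache_ratio * period)
    return (block_id % period) < cached_per_period


def load_policy(block_id: int, static_pref: Optional[str], cache_ratio: float) -> str:
    """Combine the static preference with the dynamic per-block decision.

    static_pref is "cache" or "bypass" when profiling found a strong preference
    for this load, and None when the load is left to run-time management.
    """
    if static_pref is not None:
        return static_pref
    return "cache" if block_uses_cache(block_id, cache_ratio) else "bypass"


if __name__ == "__main__":
    # With an assumed ratio of 0.25, a quarter of the thread blocks cache.
    decisions = [load_policy(b, None, 0.25) for b in range(16)]
    print(decisions.count("cache"), "of", len(decisions), "blocks cache")
```

Deciding per block rather than per thread mirrors the abstract's stated reason for modulating at thread-block granularity: threads that execute together never split between the cached and bypassed paths.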
