FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs
Li, Fanrong [4,5]; Li, Gang [3,5]; Mo, Zitao [3,5]; He, Xiangyu [3,5]; Cheng, Jian [1,2,5]
Journal: IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
Publication date: 2020-11-01
Volume: 39; Issue: 11; Pages: 3589-3600
Keywords: Accelerator architecture; convolutional neural networks (CNNs); sparsity
ISSN: 0278-0070
DOI: 10.1109/TCAD.2020.3012212
Corresponding author: Cheng, Jian (jcheng@nlpr.ia.ac.cn)
Abstract: Sparsity, as an intrinsic property of convolutional neural networks (CNNs), has been widely employed for hardware acceleration, and many customized accelerators tailored for sparse weights or activations have been proposed in recent years. However, the irregular sparse patterns introduced by both weights and activations are much more challenging for efficient computation. For example, due to the issues of access contention, workload imbalance, and tile fragmentation, the state-of-the-art sparse accelerator SCNN fails to fully leverage the benefits of sparsity, leading to nonoptimal results for both speedup and energy efficiency. In this article, we propose an efficient sparse CNN accelerator for both weights and activations, namely the fine-grained systolic accelerator (FSA), which jointly optimizes both the hardware dataflow and the software partitioning and scheduling strategy. Specifically, to deal with the access contention problem, we present a fine-grained systolic dataflow, in which the activations move rhythmically along the horizontal processing element array while the weights are fed into the array in a fine-grained order. We then propose a hybrid network partitioning strategy that sets different partitioning strategies for different layers to balance the workload and alleviate the fragmentation problem caused by both sparse weights and activations. Finally, we present a scheduling search strategy to find optimized schedules for neural networks, which can further improve energy efficiency. Extensive evaluations show that the proposed FSA consistently outperforms SCNN over AlexNet, VGGNet, GoogLeNet, and ResNet with an average speedup of 1.74x and up to 13.86x energy efficiency.
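The abstract's central observation is that a multiply-accumulate contributes nothing when either the weight or the activation operand is zero, which is the work a sparse accelerator like FSA avoids in hardware. The toy functional model below (not the paper's actual FSA dataflow or RTL; `sparse_mac_row` and its counters are hypothetical names for illustration) sketches one processing-element row computing a valid-mode 1-D correlation while counting how many MACs a zero operand would let the hardware skip:

```python
def sparse_mac_row(acts, wts):
    """Functional (not cycle-accurate) model of one PE row.

    Each of the len(wts) PEs conceptually holds one stationary weight while
    an activation window slides past one PE per step. Returns the valid-mode
    1-D correlation outputs plus counts of performed vs. skipped MACs.
    """
    n_pe = len(wts)
    outputs, macs_done, macs_skipped = [], 0, 0
    for start in range(len(acts) - n_pe + 1):
        psum = 0
        for k in range(n_pe):
            a, w = acts[start + k], wts[k]
            if a == 0 or w == 0:
                macs_skipped += 1   # zero operand: no useful work to do
            else:
                psum += w * a
                macs_done += 1
        outputs.append(psum)
    return outputs, macs_done, macs_skipped


# With sparse inputs, half of the candidate MACs here carry a zero operand:
out, done, skipped = sparse_mac_row([1, 0, 3, 4], [2, 0, 1])
print(out, done, skipped)  # [5, 4] 3 3
```

The ratio `skipped / (done + skipped)` is the idealized upper bound on compute savings from sparsity; the paper's contribution is a dataflow and partitioning scheme that approaches such savings despite access contention, workload imbalance, and tile fragmentation.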
Funding projects: National Natural Science Foundation of China [61972396]; National Natural Science Foundation of China [61876182]; National Natural Science Foundation of China [61906193]; Strategic Priority Research Program of Chinese Academy of Science [XDB32050200]; Advance Research Program [31511130301]
WOS keywords: TIME
WOS research areas: Computer Science; Engineering
Language: English
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS accession number: WOS:000587712700037
Content type: Journal article
Source URL: [http://ir.ia.ac.cn/handle/173211/41756]
Collection: Brain-Inspired Chips and Systems Research
Author affiliations:
1.Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing 100190, Peoples R China
2.Univ Chinese Acad Sci, Beijing 100049, Peoples R China
3.Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
4.Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China
5.Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Recommended citation:
GB/T 7714: Li, Fanrong, Li, Gang, Mo, Zitao, et al. FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39(11): 3589-3600.
APA: Li, Fanrong, Li, Gang, Mo, Zitao, He, Xiangyu, & Cheng, Jian. (2020). FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 39(11), 3589-3600.
MLA: Li, Fanrong, et al. "FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs." IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 39.11 (2020): 3589-3600.
