FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs
Authors | Li, Fanrong (4,5); Li, Gang (3,5); Mo, Zitao (3,5); He, Xiangyu (3,5); Cheng, Jian (1,2,5)
Journal | IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
Publication date | 2020-11-01
Volume | 39, Issue 11, Pages 3589-3600
Keywords | Accelerator architecture; convolutional neural networks (CNNs); sparsity
ISSN | 0278-0070
DOI | 10.1109/TCAD.2020.3012212
Corresponding author | Cheng, Jian (jcheng@nlpr.ia.ac.cn)
Abstract | Sparsity, as an intrinsic property of convolutional neural networks (CNNs), has been widely employed for hardware acceleration, and many customized accelerators tailored for sparse weights or activations have been proposed in recent years. However, the irregular sparse patterns introduced by both weights and activations are much more challenging for efficient computation. For example, due to the issues of access contention, workload imbalance, and tile fragmentation, the state-of-the-art sparse accelerator SCNN fails to fully leverage the benefits of sparsity, leading to nonoptimal results for both speedup and energy efficiency. In this article, we propose an efficient sparse CNN accelerator for both weights and activations, namely the fine-grained systolic accelerator (FSA), which jointly optimizes the hardware dataflow and the software partitioning and scheduling strategy. Specifically, to deal with the access contention problem, we present a fine-grained systolic dataflow, in which the activations move rhythmically along the horizontal processing element array while the weights are fed into the array in a fine-grained order. We then propose a hybrid network partitioning strategy that sets different partitioning strategies for different layers to balance the workload and alleviate the fragmentation problem caused by both sparse weights and activations. Finally, we present a scheduling search strategy to find optimized schedules for neural networks, which can further improve energy efficiency. Extensive evaluations show that the proposed FSA consistently outperforms SCNN over AlexNet, VGGNet, GoogLeNet, and ResNet with an average speedup of 1.74x and up to 13.86x energy efficiency.
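The abstract describes skipping ineffectual computations that arise when either a weight or an activation is zero. As a purely illustrative sketch (a toy lockstep model, not FSA's actual dataflow, partitioning, or scheduling), the following counts only the multiply-accumulates whose operands are both nonzero, showing where the speedup over dense execution comes from:

```python
# Hypothetical toy model: a wavefront-style matrix multiply in which
# every PE consumes one operand pair per step, skipping a MAC whenever
# either operand (weight or activation) is zero. Illustration only;
# this does NOT reproduce the paper's FSA architecture.

def systolic_matmul_sparse(A, B):
    """Multiply A (m x k) by B (k x n); count only effectual MACs."""
    m, k = len(A), len(A[0])
    n = len(B[0])
    C = [[0] * n for _ in range(m)]
    macs = 0
    # Step t models one systolic wavefront: operand index t is
    # consumed by every (i, j) position in lockstep.
    for t in range(k):
        for i in range(m):
            for j in range(n):
                a, b = A[i][t], B[t][j]
                if a != 0 and b != 0:  # sparsity: skip ineffectual MACs
                    C[i][j] += a * b
                    macs += 1
    return C, macs

A = [[1, 0, 2],
     [0, 0, 3]]
B = [[4, 0],
     [5, 6],
     [0, 7]]
C, macs = systolic_matmul_sparse(A, B)
# Only 3 of the 12 dense MACs have two nonzero operands here.
```

With irregular sparsity, the effectual MACs are scattered unevenly across the array, which is exactly what makes the access contention and workload-imbalance problems described above hard in hardware.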
Funding projects | National Natural Science Foundation of China [61972396]; National Natural Science Foundation of China [61876182]; National Natural Science Foundation of China [61906193]; Strategic Priority Research Program of Chinese Academy of Sciences [XDB32050200]; Advance Research Program [31511130301]
WOS keywords | TIME
WOS research areas | Computer Science; Engineering
Language | English
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS accession number | WOS:000587712700037
Funding agencies | National Natural Science Foundation of China; Strategic Priority Research Program of Chinese Academy of Sciences; Advance Research Program
Content type | Journal article
Source URL | [http://ir.ia.ac.cn/handle/173211/41756]
Collection | Brain-inspired Chips and Systems Research
Author affiliations | 1. Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing 100190, Peoples R China; 2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 3. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China; 4. Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China; 5. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Recommended citation (GB/T 7714) | Li, Fanrong, Li, Gang, Mo, Zitao, et al. FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs[J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, 39(11): 3589-3600.
APA | Li, Fanrong, Li, Gang, Mo, Zitao, He, Xiangyu, & Cheng, Jian. (2020). FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 39(11), 3589-3600.
MLA | Li, Fanrong, et al. "FSA: A Fine-Grained Systolic Accelerator for Sparse CNNs". IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS 39.11 (2020): 3589-3600.
Copyright | Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.