Sparse tensor core: Algorithm and hardware co-design for vector-wise sparse neural networks on modern GPUs M Zhu, T Zhang, Z Gu, Y Xie Proceedings of the 52nd Annual IEEE/ACM International Symposium on …, 2019 | 157 | 2019 |
Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation L Deng, G Wang, G Li, S Li, L Liang, M Zhu, Y Wu, Z Yang, Z Zou, J Pei, ... IEEE Journal of Solid-State Circuits 55 (8), 2228-2246, 2020 | 152 | 2020 |
Dynamic sparse graph for efficient deep learning L Liu, L Deng, X Hu, M Zhu, G Li, Y Ding, Y Xie arXiv preprint arXiv:1810.00859, 2018 | 62 | 2018 |
Performance evaluation and optimization of HBM-Enabled GPU for data-intensive applications M Zhu, Y Zhuo, C Wang, W Chen, Y Xie 2017 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017 | 49 | 2017 |
OpenCL Caffe: Accelerating and enabling a cross platform machine learning framework J Gu, Y Liu, Y Gao, M Zhu Proceedings of the 4th International Workshop on OpenCL, 1-5, 2016 | 41 | 2016 |
Evaluating the potential of graphics processors for high performance embedded computing S Mu, C Wang, M Liu, D Li, M Zhu, X Chen, X Xie, Y Deng 2011 Design, Automation & Test in Europe, 1-6, 2011 | 38 | 2011 |
Implementation and evaluation of deep neural networks (DNN) on mainstream heterogeneous systems J Gu, M Zhu, Z Zhou, F Zhang, Z Lin, Q Zhang, M Breternitz Proceedings of 5th Asia-Pacific Workshop on Systems, 1-7, 2014 | 37 | 2014 |
Structurally sparsified backward propagation for faster long short-term memory training M Zhu, J Clemons, J Pool, M Rhu, SW Keckler, Y Xie arXiv preprint arXiv:1806.00512, 2018 | 26* | 2018 |
CNNLab: a novel parallel framework for neural networks using GPU and FPGA-a practical study with trade-off analysis M Zhu, L Liu, C Wang, Y Xie arXiv preprint arXiv:1606.06234, 2016 | 26 | 2016 |
fuseGNN: Accelerating graph convolutional neural network training on GPGPU Z Chen, M Yan, M Zhu, L Deng, G Li, S Li, Y Xie Proceedings of the 39th International Conference on Computer-Aided Design, 1-9, 2020 | 23 | 2020 |
Taming unstructured sparsity on GPUs via latency-aware optimization M Zhu, Y Xie 2020 57th ACM/IEEE Design Automation Conference (DAC), 1-6, 2020 | 10 | 2020 |
A polyhedral modeling based source-to-source code optimization framework for GPGPU C Wang, K Kang, M Zhu, Y Deng 2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012 | 2 | 2012 |
Efficient Weighted Histogramming on GPUs with HASH M Zhu, N Xu, CZ Di Wu, Y Deng, Y Wang, FH Hsu INVITED SPEAKER: Analyzing the Performance of Top-K Retrieval Algorithms, 1995, 0 | | |