Kun Wu
Title
Cited by
Year
Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture
SW Min, K Wu, S Huang, M Hidayetoğlu, J Xiong, E Ebrahimi, D Chen, ...
VLDB 14 (11), 2087-2100, 2021
Cited by 64, 2021
Pylog: An algorithm-centric python-based FPGA programming and synthesis flow
S Huang, K Wu, H Jeong, C Wang, D Chen, WM Hwu
IEEE Transactions on Computers 70 (12), 2015-2028, 2021
Cited by 51, 2021
PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses
SW Min, K Wu, S Huang, M Hidayetoğlu, J Xiong, E Ebrahimi, D Chen, ...
Cited by 29, 2021
Graph Neural Network Training with Data Tiering
SW Min, K Wu, M Hidayetoğlu, J Xiong, X Song, W Hwu
2022 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2022
Cited by 16, 2022
Memory-bound proof-of-work acceleration for blockchain applications
K Wu, G Dai, X Hu, S Li, X Xie, Y Wang, Y Xie
Proceedings of the 56th Annual Design Automation Conference 2019, 1-6, 2019
Cited by 15, 2019
TEMPI: An interposed MPI library with a canonical representation of CUDA-aware datatypes
C Pearson, K Wu, IH Chung, J Xiong, WM Hwu
Proceedings of the 30th International Symposium on High-Performance Parallel …, 2021
Cited by 4, 2021
Hector: An Efficient Programming and Compilation Framework for Implementing Relational Graph Neural Networks in GPU Architectures
K Wu, M Hidayetoğlu, X Song, S Huang, D Zheng, I Nisa, W Hwu
29th ACM International Conference on Architectural Support for Programming …, 2024
Cited by 2*, 2024
TBA: Faster Large Language Model Training Using SSD-Based Activation Offloading
K Wu, JB Park, X Zhang, M Hidayetoğlu, VS Mailthody, S Huang, ...
arXiv preprint arXiv:2408.10013, 2024
Cited by 1, 2024
A Python-based High-Level Programming Flow for CPU-FPGA Heterogeneous Systems
S Huang, K Wu, SR Chalamalasetti, I El Hajj, C Xu, P Faraboschi, D Chen
2021 IEEE/ACM Programming Environments for Heterogeneous Computing (PEHC), 20-26, 2021
Cited by 1, 2021
Towards a unified framework of matrix derivatives
J Xu, G Li, C Wen, K Wu, L Deng
IEEE Access 6, 47922-47934, 2018
Cited by 1, 2018
LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme
JB Park, K Wu, VS Mailthody, Z Quresh, S Mahlke, W Hwu
arXiv preprint arXiv:2407.15264, 2024
2024
Object store offloading
D Korolija, K Wu, SR Chalamalasetti, LMK Evans, DS Milojicic
US Patent App. 17/958,189, 2024
2024