NUKADA Akira
- Articles
- High Performance 3D Convolution for Protein Docking on IBM Blue Gene
Nukada Akira; Hourai Yuichiro; Nishida Akira; Akiyama Yu...
Parallel and Distributed Processing and Applications. ISPA 2007. Lecture Notes in Computer Science/4742/pp.958-969, 2007-08
- Bandwidth Intensive 3-D FFT kernel for GPUs using CUDA
Nukada Akira; Ogata Yasuhiko; Endo Toshio; Matsuoka Satoshi
SC '08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, 2008-11
- Fast Conjugate Gradients with Multiple GPUs
Cevahir Ali; Nukada Akira; Matsuoka Satoshi
ICCS 2009: Computational Science – ICCS 2009/pp.893-903, 2009-05
- Auto-Tuning 3-D FFT Library for CUDA GPUs
Nukada Akira; Matsuoka Satoshi
SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2009-11
- Linpack Evaluation on a Supercomputer with Heterogeneous Accelerators
Endo Toshio; Nukada Akira; Matsuoka Satoshi; Maruyama Naoya
2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010-04
- A High-Performance Fault-Tolerant Software Framework for Memory on Commodity GPUs
Maruyama Naoya; Nukada Akira; Matsuoka Satoshi
2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010-04
- High Performance Conjugate Gradient Solver on Multi-GPU Clusters Using Hypergraph Partitioning
Cevahir Ali; Nukada Akira; Matsuoka Satoshi
Computer Science - Research and Development/25(1-2)/pp.83-91, 2010-05
- Statistical Power Modeling of GPU Kernels Using Performance Counters
Nagasaka Hitoshi; Maruyama Naoya; Nukada Akira; Endo Tos...
GREENCOMP '10: Proceedings of the International Conference on Green Computing/pp.115-122, 2010-08
- An 80-Fold Speedup, 15.0 TFlops, Full GPU Acceleration of Non-Hydrostatic Weather Model ASUCA Production Code
Shimokawabe Takashi; Aoki Takayuki; Muroi Chiashi; Ishida...
SC '10: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 2010-11
- Low-overhead diskless checkpoint for hybrid computing systems
Gomez Leonardo Bautista; Nukada Akira; Maruyama Naoya; Ca...
International Conference on High Performance Computing (HiPC 2010), 2010-12
- NVCR: A Transparent Checkpoint-Restart Library for NVIDIA CUDA
Nukada Akira; Takizawa Hiroyuki; Matsuoka Satoshi
20th Heterogeneity in Computing Workshop (HCW 2011)/pp.104-113, 2011-05
- Hamming Color Code for Dense and Robust One-shot 3D Scanning
Yamazaki Shuntaro; Nukada Akira; Mochimaru Masaaki
2011 British Machine Vision Conference/pp.96.1-96.9, 2011-08
- Peta-scale Phase-Field Simulation for Dendritic Solidification on the TSUBAME 2.0 Supercomputer
Shimokawabe Takashi; Aoki Takayuki; Takaki Tomohiro; Yama...
SC '11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, 2011-11
- High Performance 3-D FFT using multiple CUDA GPUs
Nukada Akira; Maruyama Yutaka; Matsuoka Satoshi
Fifth Workshop on General Purpose Processing using Graphics Processing Units (GPGPU-5)/pp.57-63, 2012-03
- Scalable Multi-GPU 3-D FFT for TSUBAME 2.0 Supercomputer
Nukada Akira; Sato Kento; Matsuoka Satoshi
SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, 2012-11
- Mixed Precision AMG method for Many Core Accelerators
Sumiyoshi Yuki; Fujii Akihiro; Nukada Akira; Tanaka Teruo
International Workshop on Enhancing Parallel Scientific Applications with Accelerated HPC (ESAA 2014)/pp.127-132, 2014-08
- TSUBAME-KFC: a Modern Liquid Submersion Cooling Prototype towards Exascale Becoming the Greenest Supercomputer in the World
Endo Toshio; Nukada Akira; Matsuoka Satoshi
20th IEEE International Conference on Parallel and Distributed Systems (ICPADS 2014)/pp.360-367, 2014-12
- Cache-aware Sparse Matrix Formats for Kepler GPU
Nagasaka Yusuke; Nukada Akira; Matsuoka Satoshi
20th IEEE International Conference on Parallel and Distributed Systems (ICPADS 2014)/pp.281-288, 2014-12
- Efficient Execution of Multiple CUDA Applications using Transparent Suspend, Resume and Migration
Suzuki Taichiro; Nukada Akira; Matsuoka Satoshi
Euro-Par 2015: Parallel Processing. Euro-Par 2015. Lecture Notes in Computer Science/9233/pp.687-699, 2015-08
- Adaptive Multi-level Blocking Optimization for Sparse Matrix Vector Multiplication on GPU
Nagasaka Yusuke; Nukada Akira; Matsuoka Satoshi
Procedia Computer Science/80/pp.131-142, 2016-06
- High-Performance and Memory-Saving Sparse General Matrix-Matrix Multiplication for NVIDIA Pascal GPU
Nagasaka Yusuke; Nukada Akira; Matsuoka Satoshi
46th International Conference on Parallel Processing (ICPP-2017)/pp.101-110, 2017-08
- Optimizations of Compute-bound Scientific Kernels on SW26010 Many-core Processor
Lin James; Xu Zhigeng; Nukada Akira; Maruyama Naoya; Mats...
46th International Conference on Parallel Processing (ICPP-2017)/pp.432-441, 2017-08
- Efficient Solving of Scan Primitive on Multi-GPU Systems
Dieguez Adrian Perez; Amor Margarita; Ramón Doallo; Nukad...
32nd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2018)/pp.794-803, 2018-05
- Optimizations of Preconditioned Conjugate Gradient on TaihuLight for OpenFOAM
Lin James; Wen Minhua; Meng Delong; Liu Xin; Nukada Akir...
18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2018)/pp.283-290, 2018-05
- MRG8 - Random Number Generation for the Exascale Era
Nagasaka Yusuke; Nukada Akira; Matsuoka Satoshi; Miura K...
PASC 2018: Platform for Advanced Scientific Computing Conference, 2018-07