List of Publications
Issue Date | Title | Journals
2024-01 | Gem5-AVX: Extension of the Gem5 Simulator to Support AVX Instruction Sets | IEEE Access
2023-12 | Enabling Fine-Grained Spatial Multitasking on Systolic-Array NPUs Using Dataflow Mirroring | IEEE Transactions on Computers
2023-10 | Virtual PIM: Resource-Aware Dynamic DPU Allocation and Workload Scheduling Framework for Multi-DPU PIM Architecture | Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
2023-07 | Occamy: Memory-efficient GPU Compiler for DNN Inference | Proceedings - Design Automation Conference
2023-06 | Design and Analysis of a Processing-in-DIMM Join Algorithm: A Case Study with UPMEM DIMMs | Proceedings of the ACM on Management of Data
2022-10 | Decoupling Schedule, Topology Layout, and Algorithm to Easily Enlarge the Tuning Space of GPU Graph Processing | Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
2021-12 | DANCE: Differentiable Accelerator/Network Co-Exploration | Proceedings - Design Automation Conference
2021-11 | Dataflow Mirroring: Architectural Support for Highly Efficient Fine-Grained Spatial Multitasking on Systolic-Array NPUs | Proceedings - Design Automation Conference
2021-07 | Making a Better Use of Caches for GCN Accelerators with Feature Slicing and Automatic Tile Morphing | IEEE Computer Architecture Letters
2021-06 | Making a Better Use of Caches for GCN Accelerators with Feature Slicing and Automatic Tile Morphing | IEEE Computer Architecture Letters
2021-03 | Thread-Aware Area-Efficient High-Level Synthesis Compiler for Embedded Devices | 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)
2020-06 | Real-Time Object Detection System with Multi-Path Neural Networks | Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS