Article Details

Algorithm/Architecture Codesign of Low Power and High Performance Linear Algebra Compute Fabrics

By : Ardavan Pedram
Contributor / Advisor : -
Collection Type : Electronic journal
Publisher : FMIPA - Mathematics
Faculty : Faculty of Mathematics and Natural Sciences (FMIPA)
Subject : Algebra
Keywords : Linear Algebra; Power efficient; Accelerator
Source : 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum, Year: 2013, Pages: 2214 - 2217
Input/Edit Staff : Dwi Ary Fuziastuti, Ena Sukmana
File : 1 file
Input Date : 2019-02-06 13:59:36

We show the design of specialized compute fabrics that maintain the efficiency of full custom hardware while providing enough flexibility to execute a whole class of coarse-grain linear algebra operations. The broad vision of this project is to develop integrated and specialized hardware/software solutions that are co-optimized and co-designed across all layers, ranging from the basic hardware foundations all the way to the application through standard linear algebra packages. We have designed a specialized linear algebra processor (LAP) that can perform level-3 BLAS and more complex LAPACK-level operations such as Cholesky, LU (with partial pivoting), and QR factorizations. We present a power-performance model that compares state-of-the-art CPUs and GPUs with our design. Our power model reveals sources of inefficiency in CPUs and GPUs, and our LAP design demonstrates how to overcome them. Compared to conventional architectures for linear algebra applications, LAP is orders of magnitude more power efficient. Based on our estimates, single- and double-precision efficiencies of up to 55 and 25 GFLOPS/W, respectively, are achievable on a single chip in standard 45nm technology.
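
The abstract groups Cholesky factorization with level-3 BLAS because the factorization can be cast almost entirely in terms of matrix-matrix operations. The sketch below is a generic textbook right-looking blocked Cholesky, not the LAP implementation or its mapping onto hardware. The function names (`chol_unblocked`, `trsm_panel`, `syrk_update`, `blocked_cholesky`) and the block size are hypothetical and chosen only for illustration.

```python
import math


def chol_unblocked(A, lo, hi):
    """Factor the diagonal block A[lo:hi, lo:hi] in place (lower triangle)."""
    for j in range(lo, hi):
        s = A[j][j] - sum(A[j][p] ** 2 for p in range(lo, j))
        A[j][j] = math.sqrt(s)
        for i in range(j + 1, hi):
            t = A[i][j] - sum(A[i][p] * A[j][p] for p in range(lo, j))
            A[i][j] = t / A[j][j]


def trsm_panel(A, lo, hi, n):
    """TRSM-like step: solve X * L11^T = A21 in place for rows hi..n-1."""
    for i in range(hi, n):
        for j in range(lo, hi):
            t = A[i][j] - sum(A[i][p] * A[j][p] for p in range(lo, j))
            A[i][j] = t / A[j][j]


def syrk_update(A, lo, hi, n):
    """SYRK-like step: A22 -= L21 * L21^T, lower triangle only."""
    for i in range(hi, n):
        for j in range(hi, i + 1):
            A[i][j] -= sum(A[i][p] * A[j][p] for p in range(lo, hi))


def blocked_cholesky(A, block=2):
    """Return lower-triangular L with L * L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[float(x) for x in row] for row in A]  # work on a copy
    for lo in range(0, n, block):
        hi = min(lo + block, n)
        chol_unblocked(L, lo, hi)   # small diagonal factorization
        trsm_panel(L, lo, hi, n)    # triangular solve on the panel below
        syrk_update(L, lo, hi, n)   # rank-k update of the trailing matrix
    for i in range(n):              # clear the unused upper triangle
        for j in range(i + 1, n):
            L[i][j] = 0.0
    return L


if __name__ == "__main__":
    A = [[4, 2, 2], [2, 5, 3], [2, 3, 6]]
    for row in blocked_cholesky(A):
        print(row)
```

For large matrices, most of the arithmetic falls in the trailing update, which is a matrix-matrix product. This is why hardware tuned for level-3 BLAS can plausibly also serve factorizations of this kind.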