GATLAS: GPU Automatically Tuned Linear Algebra Software
Chris Jang
Download source code:

An article I wrote about this work: CaseStudyGATLAS.htm

I apologize that GATLAS is not immediately useful. There is no direct way to use it in applications. There is almost no documentation either!

GATLAS is a learning project leading to a better future. The lesson is that traditional compiler optimization techniques for supercomputers also work for GPGPU. GATLAS kernels incorporate the following loop transformations: unrolling, fusion, interchange, strip mining, tiling and scalar expansion.
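To make two of these transformations concrete, here is a minimal sketch in plain Python (not OpenCL, and not taken from the GATLAS source). It shows a reference matrix multiply next to a version whose loops are tiled (strip mining plus interchange) and whose innermost loop is unrolled by four. The names matmul_naive, matmul_tiled and TILE are hypothetical, and the tile size is an arbitrary assumption.

```python
# Illustrative sketch only: function names and TILE are assumptions,
# not identifiers from GATLAS.

def matmul_naive(a, b, n):
    """Reference triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


TILE = 4  # assumed tile edge; the unrolled loop below is written for 4


def matmul_tiled(a, b, n):
    """Same product with i, j and k strip-mined into TILE-sized blocks,
    the block loops moved outward (tiling), and the k loop unrolled by 4."""
    if TILE != 4 or n % TILE != 0:
        raise ValueError("this sketch needs TILE == 4 and n divisible by 4")
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, TILE):
        for jj in range(0, n, TILE):
            for kk in range(0, n, TILE):
                for i in range(ii, ii + TILE):
                    a_row = a[i]
                    c_row = c[i]
                    for j in range(jj, jj + TILE):
                        acc = c_row[j]  # keep the running sum in a local
                        acc += a_row[kk] * b[kk][j]
                        acc += a_row[kk + 1] * b[kk + 1][j]
                        acc += a_row[kk + 2] * b[kk + 2][j]
                        acc += a_row[kk + 3] * b[kk + 3][j]
                        c_row[j] = acc
    return c
```

Both functions add the k terms in the same order, so they should agree exactly. On a GPU the point of tiling is data reuse in fast memory, which this Python version does not model.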

I am working on a new project.

PeakStream had the right approach. GPGPU should be a managed platform with a virtual machine and JIT compiler. High performance kernels should be synthesized as applications run. Pre-optimized math kernel libraries are not enough. So far, resource management has been easy. The compiler is more difficult and is taking longer. Data dependency analysis, code transformation and scheduling must be automated.

The goal is an open source platform for GPGPU similar to PeakStream. I am working on it!

The GATLAS project is currently:
an auto-tuning OpenCL benchmark
an automated and adaptive regression test framework
an ahead-of-time compiler without a front-end or middle-end
a back-end hardcoded to generate code for matrix multiply
a testbed for compiler optimized OpenCL kernels
optimized for ATI Evergreen GPUs but runs on NVIDIA Fermi GPUs too

What does GATLAS do now?
finds fast OpenCL GEMM, GEMV and SAXPY kernels
adapts to different GPU models, SDK and driver versions
searches kernel specializations efficiently using expectation maximization
tolerates OpenCL compiler crashes, corrupt kernel output and driver hangs by journalling
supports single and double precision in scalar and vector lengths
supports row and column major matrix data layouts
supports memory buffer and image kernel arguments
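
The sketch below shows one way an auto-tuning search could pair with a journal so that a bad kernel specialization is never tried twice. It is an assumption about the general idea, not the GATLAS implementation: the search here is a plain exhaustive loop rather than expectation maximization, and autotune, run_kernel, fake_run and the "tile" and "vector" parameters are all hypothetical names.

```python
import itertools
import json
import os
import tempfile


def read_journal(path):
    """Map each candidate key to its most recent journal record."""
    entries = {}
    if os.path.exists(path):
        with open(path) as f:
            for line in f:
                rec = json.loads(line)
                entries[rec["key"]] = rec
    return entries


def append_journal(path, rec):
    """Append one record and force it to disk before continuing."""
    with open(path, "a") as f:
        f.write(json.dumps(rec) + "\n")
        f.flush()
        os.fsync(f.fileno())


def autotune(candidates, run_kernel, journal_path):
    """Time each candidate once and return (params, seconds) for the fastest.
    run_kernel(params) is a hypothetical callback that returns elapsed
    seconds or raises on failure."""
    journal = read_journal(journal_path)
    best = None
    for params in candidates:
        key = json.dumps(params, sort_keys=True)
        prev = journal.get(key)
        if prev is not None:
            # A "started" record with no result means an earlier process
            # died during this candidate, so it is skipped like a failure.
            if prev["status"] == "ok" and (best is None or prev["seconds"] < best[1]):
                best = (params, prev["seconds"])
            continue
        append_journal(journal_path, {"key": key, "status": "started"})
        try:
            seconds = run_kernel(params)
        except Exception:
            append_journal(journal_path, {"key": key, "status": "failed"})
            continue
        append_journal(journal_path, {"key": key, "status": "ok", "seconds": seconds})
        if best is None or seconds < best[1]:
            best = (params, seconds)
    return best


# Usage with a stand-in for real kernel timing (purely illustrative).
def fake_run(params):
    if params["tile"] == 8:
        raise RuntimeError("simulated compiler crash")
    return 1.0 / (params["tile"] * params["vector"])


candidates = [{"tile": t, "vector": v}
              for t, v in itertools.product((4, 8, 16), (1, 2, 4))]
journal_path = os.path.join(tempfile.mkdtemp(), "journal.jsonl")
best = autotune(candidates, fake_run, journal_path)
```

Writing the "started" record before the run is what lets a restarted process recognize a candidate that took down the previous one, which an in-process exception handler alone cannot do.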

How fast are the kernels? (gigaFLOPS; x86_64; SDK/driver versions: ATI v2.2/10.7b, NVIDIA v3.1/256.40)
HD 5870: SGEMM 1418 DGEMM 366
HD 5770: SGEMM 716
HD 5670: SGEMM 329
HD 5440: SGEMM 65
GTX 480: DGEMM 83? (major issues with incorrect kernel output)
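
For reading figures like these, the usual convention counts 2*M*N*K floating point operations for a GEMM of an M x K matrix by a K x N matrix (one multiply and one add per inner product term). The page does not say exactly how GATLAS counts operations, so the sketch below assumes that convention; gemm_gflops is a hypothetical helper name.

```python
def gemm_gflops(m, n, k, seconds):
    """GFLOPS for an (m x k) by (k x n) matrix multiply that took `seconds`,
    assuming the common 2*m*n*k operation count."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return 2.0 * m * n * k / seconds / 1e9


# A 1000 x 1000 x 1000 multiply finishing in 2 seconds is 1 GFLOPS.
rate = gemm_gflops(1000, 1000, 1000, 2.0)
```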
Here is the old homepage.

San Francisco, CA, Oct 18 2010