Download source code: http://github.com/cjang/GATLAS
GATLAS GPU Automatically Tuned Linear Algebra Software
Chris Jang <firstname.lastname@example.org>
An article I wrote about this work: CaseStudyGATLAS.htm
I apologize that GATLAS is not immediately useful. There is no direct way to use it in applications. There is almost no documentation either!
GATLAS is a learning project leading to a better future. The lesson is that traditional compiler optimization techniques for supercomputers also work for GPGPU. GATLAS kernels incorporate the following loop transformations: unrolling, fusion, interchange, strip mining, tiling and scalar expansion.
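To make two of these transformations concrete, here is a minimal sketch of strip mining and tiling applied to a plain matrix multiply. It is written in Python for readability and is not GATLAS code; the function names and the `tile` parameter are made up for illustration. Each of the three loops is strip mined into a tile loop and an intra-tile loop, and the tile loops are then interchanged to the outside.

```python
def matmul_naive(a, b, n):
    """Reference triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, n, tile=2):
    """Same arithmetic with every loop strip mined by `tile` and the
    three tile loops moved outermost (loop tiling).

    `tile` is a hypothetical tuning parameter, not a GATLAS setting.
    """
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):              # tile loops
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                # intra-tile loops; min() handles n not divisible by tile
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, n)):
                        for k in range(k0, min(k0 + tile, n)):
                            c[i][j] += a[i][k] * b[k][j]
    return c
```

Both functions add the products for each output element in the same order of `k`, so they return identical results. On a GPU the point of the tiled form is that one tile of each operand can stay in fast local storage while it is reused.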
I am working on a new project.
PeakStream had the right approach. GPGPU should be a managed platform with a virtual machine and JIT compiler. High-performance kernels should be synthesized as applications run. Pre-optimized math kernel libraries are not enough. So far, resource management has been easy. The compiler is more difficult and is taking longer. Data dependency analysis, code transformation and scheduling must be automated.
The goal is an open source platform for GPGPU similar to PeakStream.
I am working on it!
Here is the old homepage.
The GATLAS project is currently:
- an auto-tuning OpenCL benchmark
- an automated and adaptive regression test framework
- an ahead-of-time compiler without a front-end or middle-end
- a back-end hardcoded to generate code for matrix multiply
- a testbed for compiler optimized OpenCL kernels
- optimized for ATI Evergreen GPUs but runs on NVIDIA Fermi GPUs too
What does GATLAS do now?
- finds fast OpenCL GEMM, GEMV and SAXPY kernels
- adapts to different GPU models, SDK and driver versions
- searches kernel specializations efficiently using expectation maximization
- uses journaling to tolerate OpenCL compiler crashes, corrupt kernel output and driver hangs
- supports single and double precision in both scalar and vector widths
- supports row and column major matrix data layouts
- supports memory buffer and image kernel arguments
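The general shape of such an auto-tuning loop can be sketched as follows. This is a simplified stand-in, not the GATLAS implementation: it uses a plain exhaustive search rather than expectation maximization, it runs on the CPU in Python, and every name in it (`autotune`, `build_saxpy`, the `chunk` parameter) is hypothetical. What it does show is the idea of timing each candidate, skipping candidates that fail to build or run, and rejecting candidates whose output does not match a reference.

```python
import time


def saxpy_reference(alpha, x, y):
    """Reference SAXPY: alpha * x + y, element by element."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def build_saxpy(chunk):
    """Hypothetical kernel generator: process `chunk` elements at a time."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")  # stands in for a compiler failure

    def kernel(alpha, x, y):
        out = []
        for start in range(0, len(x), chunk):
            xs = x[start:start + chunk]
            ys = y[start:start + chunk]
            out.extend(alpha * xi + yi for xi, yi in zip(xs, ys))
        return out

    return kernel


def autotune(candidates, build, reference, args, repeats=3):
    """Return (best_params, best_seconds) among candidates that run and validate."""
    expected = reference(*args)
    best_params, best_time = None, float("inf")
    for params in candidates:
        try:
            kernel = build(params)
            start = time.perf_counter()
            for _ in range(repeats):
                result = kernel(*args)
            elapsed = (time.perf_counter() - start) / repeats
        except Exception:
            continue  # build or run failure: skip this candidate, keep searching
        if result != expected:
            continue  # corrupt output: never accept it, however fast
        if elapsed < best_time:
            best_params, best_time = params, elapsed
    return best_params, best_time
```

Calling `autotune([0, 1, 4, 16], build_saxpy, saxpy_reference, (2.0, x, y))` skips the invalid candidate `0` and returns whichever remaining chunk size ran fastest on the machine at hand. A real tuner would also have to survive crashes and hangs that cannot be caught as exceptions, which is where journaling comes in.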
How fast are the kernels? (GFLOPS on x86_64; SDK/driver versions: ATI v2.2/10.7b, NVIDIA v3.1/256.40)
- HD 5870: SGEMM 1418 DGEMM 366
- HD 5770: SGEMM 716
- HD 5670: SGEMM 329
- HD 5440: SGEMM 65
- GTX 480: DGEMM 83 (uncertain; major issues with incorrect kernel output)
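For readers unfamiliar with the unit, GEMM throughput is usually computed as below. This assumes the common convention of counting one multiply and one add per inner-product term, so 2·M·N·K operations for an M×K by K×N product; the page does not say exactly how the figures above were counted, and the function names are illustrative only.

```python
def gemm_flops(m, n, k):
    """Floating point operations for an (m x k) by (k x n) matrix multiply,
    assuming one multiply and one add per inner-product term."""
    return 2 * m * n * k


def gemm_gflops(m, n, k, seconds):
    """Throughput in GFLOPS (1e9 floating point operations per second)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return gemm_flops(m, n, k) / seconds / 1e9
```

For example, a 1000×1000×1000 multiply that takes half a second works out to `gemm_gflops(1000, 1000, 1000, 0.5)`, which is 4.0 GFLOPS.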
San Francisco, CA, Oct 18 2010