In October 2015, AMD released ACL 1.0 Beta 2, the second version of the AMD Compute Libraries (ACL), which provided important improvements in the clBLAS, clFFT, and clSPARSE libraries relative to the Beta 1 release. Since then, the team has continued that work, leading up to the general availability (GA) release described in this post.
The GA release continues AMD’s goal of providing a unified repository for a variety of open-source math libraries that allow you to accelerate computations on AMD GPUs, APUs, and CPUs. All of the source code, readmes, and documentation are available at the respective GitHub links listed at the end of this post.
In the following sections, we’ll list the significant features in the GA, Beta 2, and Beta 1 releases.
The AutoGEMM functionality included in Beta 2 allowed users to automatically generate optimized kernels for various matrix sizes, but was restricted to “Hawaii”-based dGPUs. The GA version of clBLAS extends this feature to “Fiji”-based dGPUs, so you can now generate optimized kernels for them as well.
The GA version of clBLAS also introduces a fix for multi-GPU and multi-context support. Earlier releases supported only OpenCL™ contexts in which the dGPUs were identical. With the fix in the GA version, clBLAS runs well on systems with different dGPUs.
For context, see the features introduced in the clBLAS Beta 2 and Beta 1 versions at a glance.
The Beta 2 version of clFFT included a pre-callback feature that enables faster custom pre-processing of input data directly by the library via a user callback function. The GA version goes a step further: it introduces a post-callback feature that enables faster custom post-processing of output data directly by the library via a user callback function. For more information about the post-callback feature, see this blog.
The GA version of clFFT increases the range of sizes supported for 1D in-place transforms and enables very large 1D FFTs.
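To make the callback idea concrete, here is a minimal conceptual sketch in plain Python. It does not use the clFFT API: the function `dft_with_callbacks` and the callbacks `scale_in` and `magnitude` are hypothetical names chosen for illustration. A naive DFT applies an optional pre-callback as each input element is read and an optional post-callback as each output element is written, so no separate pass over the data is needed before or after the transform.

```python
import cmath


def dft_with_callbacks(samples, pre=None, post=None):
    """Naive O(n^2) DFT with optional per-element callbacks.

    pre(index, value)  is applied to each input element as it is loaded.
    post(index, value) is applied to each output element as it is stored.
    Both are hypothetical stand-ins for user callback functions.
    """
    n = len(samples)
    # Pre-callback: transform inputs on load instead of in a separate pass.
    loaded = [pre(i, x) if pre else x for i, x in enumerate(samples)]
    out = []
    for k in range(n):
        acc = sum(
            loaded[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n)
        )
        # Post-callback: transform outputs on store instead of in a separate pass.
        out.append(post(k, acc) if post else acc)
    return out


raw = [1, 2, 3, 4]  # illustrative integer samples


def scale_in(i, x):
    # Example pre-processing: convert and scale the raw sample.
    return x / 4.0


def magnitude(k, y):
    # Example post-processing: keep only the magnitude of each bin.
    return abs(y)


# Fused: pre- and post-processing happen inside the transform.
fused = dft_with_callbacks(raw, pre=scale_in, post=magnitude)

# Unfused: the same work done as separate passes around a plain transform.
separate = [abs(y) for y in dft_with_callbacks([x / 4.0 for x in raw])]
```

Both paths produce the same values. On a GPU, the expected benefit of the fused form is that it avoids extra kernel launches and extra round trips through global memory, which is the motivation the post gives for handling callbacks inside the library.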
For context, see the features introduced in the clFFT Beta 2 and Beta 1 versions at a glance.
The v0.10 version of clSPARSE introduces an abstraction for the bit width of indices. This release incorporates API changes to increase library usability and readability, so please refer to the project release notes for details.
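As a rough illustration of why abstracting the index width is useful, the sketch below stores the index arrays of a CSR matrix behind a single configurable type. It is plain Python and does not use clSPARSE types or functions; `INDEX_TYPECODES`, `make_index_array`, `fits`, and `csr_spmv` are hypothetical names, and the sketch assumes the C `int` behind the `array` typecode `"i"` is 4 bytes wide, which holds on common platforms.

```python
from array import array

# Hypothetical mapping from index width in bits to an array typecode.
# Assumes "i" (C int) is 4 bytes; "q" (C long long) is 8 bytes.
INDEX_TYPECODES = {32: "i", 64: "q"}


def make_index_array(values, index_bits=32):
    """Build an index array whose element width is chosen in one place."""
    return array(INDEX_TYPECODES[index_bits], values)


def fits(index_bits, value):
    """Return True if `value` is representable at the chosen index width."""
    try:
        make_index_array([value], index_bits)
        return True
    except OverflowError:
        return False


def csr_spmv(row_ptr, col_idx, vals, x):
    """Sparse matrix-vector product y = A * x for a CSR matrix."""
    y = []
    for r in range(len(row_ptr) - 1):
        start, end = row_ptr[r], row_ptr[r + 1]
        y.append(sum(vals[k] * x[col_idx[k]] for k in range(start, end)))
    return y


# The 2x3 matrix [[1, 0, 2], [0, 3, 0]] in CSR form.
vals = [1.0, 2.0, 3.0]
x = [1.0, 1.0, 1.0]

# The same algorithm runs unchanged with either index width.
y32 = csr_spmv(make_index_array([0, 2, 3], 32), make_index_array([0, 2, 1], 32), vals, x)
y64 = csr_spmv(make_index_array([0, 2, 3], 64), make_index_array([0, 2, 1], 64), vals, x)

# An offset of 2**31 no longer fits in a signed 32-bit index.
small_ok = fits(32, 2**31)
large_ok = fits(64, 2**31)
```

The point of the design is that code such as `csr_spmv` never names a concrete integer type, so switching between narrower indices (less memory traffic) and wider indices (larger matrices) is a single configuration change.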
For context, see the features introduced in the clSPARSE Beta 2 and Beta 1 versions at a glance.
The GA version of the clRNG library includes no changes from the Beta 1 version.
Beta 1 included the following features.
Have fun. Please provide your feedback at: https://community.amd.com/community/devgurus/amd-compute-libraries.
Karthik Dakshinamoorthy is the Program Manager for AMD Compute Libraries. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.
OpenCL is a trademark of Apple Inc. used by permission by Khronos.