Copyright (C) 2025, Advanced Micro Devices, Inc.

Copyright (C) 2014, The University of Texas at Austin

AOCL-BLAS - Release Notes - version 5.1.0
--------------------------------------------

AOCL-BLAS is a portable software framework for instantiating high-performance
BLAS-like dense linear algebra libraries. The framework was designed to isolate
essential kernels of computation that, when optimized, enable optimized
implementations of most of its commonly used and computationally intensive
operations. AMD has extensively optimized the implementation of BLIS, on which
AOCL-BLAS is based, for AMD processors.

Highlights of AOCL-BLAS 5.1.0
--------------------------------

- ZGEMM optimizations for tiny matrices
- DGEMM & DTRSM optimizations for Zen5
- Performance Optimizations
	* DGEMM, DGEMV, ZGEMM, DTRSV, DCOPYV on Zen4/5
	* DSCALV, DDOTV on Zen3
- Benchmark support for ASUMV
- LPGEMM
	* AOCL_ENABLE_INSTRUCTIONS support
	* Added batch_gemm APIs for all data types
	* New Output Datatype for Integer APIs
	* BF16 Support on AVX2 Platforms
	* WOQ with/without Group Quantization
	* Threading Framework Optimizations
	* Reference Kernels for all reorder APIs 
	* Performance Optimizations for all APIs 
- Testsuite Enhancements
- Minor Bug Fixes

Please refer to the AOCL User Guide for supported operating systems and compilers.

The package contains the AOCL-BLAS library binaries, which include optimizations
for the AMD EPYC and AMD Ryzen processor families, along with header files and
examples.

