COSMA is a parallel, high-performance, GPU-accelerated matrix-matrix multiplication algorithm and library. It is communication-optimal for all combinations of matrix dimensions, number of processors and memory sizes, and requires no parameter tuning. COSMA is written in C++11 and uses the MPI, OpenMP and CUDA/ROCm programming models. The library is open source, released under the BSD 3-clause licence, and freely available.
CoE: MaX
This website is created and maintained by the project FocusCoE. FocusCoE has received funding from the European Union’s Horizon 2020 research and innovation programme under the grant agreement Nº 823964.