# Talk:Matrix multiplication

From Rosetta Code

## Hints about optimization

The task looks easy, but the running time of two implementations of this basic function may differ by several orders of magnitude.
There are several ways to optimize a matrix product: improving cache usage through loop order and block products, transposing one operand, using SIMD processor instructions, parallelizing with OpenMP, and so on. Here is a lecture I like at MIT OpenCourseWare: *Matrix Multiply: A Case Study*. In real life, one would use an optimized BLAS library such as ATLAS or Intel MKL. Arbautjc (talk) 23:05, 14 August 2016 (UTC)
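
To make the loop-order and block-product hints concrete, here is a minimal sketch in plain Python. The function names (`matmul_ijk`, `matmul_ikj`, `matmul_blocked`) and the `block` parameter are made up for illustration and are not taken from the task page. The sketch only shows how the loops are restructured. The cache benefit itself is mostly visible in compiled languages with contiguous arrays, and the best tile size depends on the machine.

```python
def matmul_ijk(a, b):
    """Textbook triple loop: the inner loop walks down a column of b."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_ikj(a, b):
    """Same arithmetic with the two inner loops swapped: the inner loop
    now walks along a row of b and a row of c."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        row_c = c[i]
        for k in range(m):
            aik = a[i][k]
            row_b = b[k]
            for j in range(p):
                row_c[j] += aik * row_b[j]
    return c


def matmul_blocked(a, b, block=2):
    """Block (tiled) product: work on block x block tiles so that, in a
    compiled language, each tile can stay in cache while it is reused.
    The default tile size is an arbitrary illustrative value."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i0 in range(0, n, block):
        for k0 in range(0, m, block):
            for j0 in range(0, p, block):
                for i in range(i0, min(i0 + block, n)):
                    for k in range(k0, min(k0 + block, m)):
                        aik = a[i][k]
                        for j in range(j0, min(j0 + block, p)):
                            c[i][j] += aik * b[k][j]
    return c
```

All three functions take matrices as lists of rows and return the same product. Only the order in which the elements are visited changes, which is exactly what the cache-oriented optimizations play with.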