Talk:Matrix multiplication
Hints about optimization
The task looks easy, but the speed of two implementations of this basic function may differ by several orders of magnitude. There are several ways to optimize a matrix product: improving cache usage through loop ordering and blocked (tiled) products, transposing one operand, using SIMD processor instructions, parallelizing with OpenMP, and so on. Here is a lecture I like at MIT OpenCourseWare: Matrix Multiply: A Case Study. In real life, one would use an optimized BLAS library such as ATLAS or Intel MKL. Arbautjc (talk) 23:05, 14 August 2016 (UTC)
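
To make the loop-order and blocking hints concrete, here is a minimal sketch in plain Python. The function names (matmul_ijk, matmul_ikj, matmul_blocked) and the block size parameter bs are made up for illustration and do not come from the lecture or from any library. It only shows the loop structures: in pure Python the interpreter overhead largely hides cache effects, so the real gains would be expected in a compiled language, and a tuned BLAS will still be much faster.

```python
def matmul_ijk(a, b):
    """Textbook triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0
            for k in range(m):
                s += a[i][k] * b[k][j]  # innermost loop walks down a column of b
            c[i][j] = s
    return c


def matmul_ikj(a, b):
    """Same product with the j and k loops swapped.

    The innermost loop now walks along a row of b and a row of c,
    which is the contiguous direction in a row-major layout.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        ci = c[i]
        for k in range(m):
            aik = a[i][k]
            bk = b[k]
            for j in range(p):
                ci[j] += aik * bk[j]
    return c


def matmul_blocked(a, b, bs=2):
    """Blocked (tiled) product: work on bs-by-bs tiles.

    bs is a hypothetical tuning parameter; in a compiled implementation
    it would be chosen so that the tiles in use fit in cache.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i0 in range(0, n, bs):
        for k0 in range(0, m, bs):
            for j0 in range(0, p, bs):
                # Multiply tile (i0, k0) of a by tile (k0, j0) of b
                # and accumulate into tile (i0, j0) of c.
                for i in range(i0, min(i0 + bs, n)):
                    ci = c[i]
                    for k in range(k0, min(k0 + bs, m)):
                        aik = a[i][k]
                        bk = b[k]
                        for j in range(j0, min(j0 + bs, p)):
                            ci[j] += aik * bk[j]
    return c
```

All three functions compute the same product; only the order in which the elements are visited changes. The min(...) bounds let the blocked version handle dimensions that are not multiples of bs.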