https://github.com/flame/how-to-optimize-gemm/wiki
Copyright by Prof. Robert van de Geijn (rvdg@cs.utexas.edu).
Adapted to GitHub Markdown Wiki by Jianyu Huang (jianyu@cs.utexas.edu).
- The GotoBLAS/BLIS Approach to Optimizing Matrix-Matrix Multiplication - Step-by-Step
- NOTICE ON ACADEMIC HONESTY
- References
- Set Up
- Step-by-step optimizations
- Computing four elements of C at a time
- Computing a 4 x 4 block of C at a time
- Acknowledgement
This material was partially sponsored by grants from the National Science Foundation (Awards ACI-1148125/1340293 and ACI-1550493).
Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).