In session 2 we covered the paper FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness, and walked through my Triton kernel implementation.
Presenters: Ben Schneider, Daniel Vega-Myhre