Hi, thanks for the insightful blog! I wonder if you've produced any throughput results versus other implementations, such as:
- the FLA one you mentioned;
- the one in https://github.com/lucidrains/native-sparse-attention-pytorch