Commit effa57d

isuruf authored and inducer committed
Optimize M2L a bit
gives a 40% performance boost in CUDA
1 parent c3c0a63 commit effa57d

1 file changed: +2 -2 lines changed

sumpy/expansion/m2l.py

Lines changed: 2 additions & 2 deletions
@@ -795,8 +795,8 @@ def optimize_loopy_kernel(self, knl, tgt_expansion, src_expansion):
         knl = lp.unprivatize_temporaries_with_inames(knl,
             {"icoeff_tgt"}, {"tgt_expansion"})
 
-        knl = lp.split_iname(knl, "icoeff_tgt", 32, inner_iname="inner",
-            inner_tag="l.0")
+        knl = lp.split_iname(knl, "icoeff_tgt", 64, inner_iname="inner",
+            inner_tag="l.0", outer_tag="g.1")
         knl = lp.tag_inames(knl, {"itgt_box": "g.0"})
         return knl
 
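
The change above can be read as a different decomposition of the `icoeff_tgt` loop. The sketch below is a plain-Python model of that index arithmetic, not loopy itself: the helper `split_iname_model` and the coefficient count `ncoeffs = 200` are hypothetical and chosen only for illustration. It assumes loopy's usual convention that a split iname satisfies `icoeff_tgt = outer * inner_length + inner`, that an untagged outer iname runs as a sequential loop inside each work-item, and that a `g.1` tag spreads the outer iname over the second grid axis.

```python
import math


def split_iname_model(n, inner_length):
    """Model splitting a loop of length n into (outer, inner) pairs.

    Each original index i is written as i = outer * inner_length + inner.
    This is an illustrative stand-in, not a loopy API.
    """
    n_outer = math.ceil(n / inner_length)
    mapping = {}
    for outer in range(n_outer):
        for inner in range(inner_length):
            i = outer * inner_length + inner
            if i < n:  # the last chunk may be only partly filled
                mapping[i] = (outer, inner)
    return n_outer, mapping


# Hypothetical number of target coefficients (assumption for this sketch).
ncoeffs = 200

# Before the commit: chunks of 32, inner -> l.0, outer untagged,
# so each work-item would loop over the outer chunks sequentially.
outer_before, map_before = split_iname_model(ncoeffs, 32)

# After the commit: chunks of 64, inner -> l.0, outer -> g.1,
# so each outer chunk would get its own work-group along grid axis 1.
outer_after, map_after = split_iname_model(ncoeffs, 64)

print("outer chunks before:", outer_before)
print("outer chunks after:", outer_after)
print("index 130 after the split:", map_after[130])
```

Under these assumptions, each target box previously ran 32 work-items that stepped through all outer chunks in sequence, whereas after the commit the outer chunks are distributed across work-groups of 64 work-items. The commit message attributes the reported CUDA speedup to this change; the sketch only illustrates the indexing and says nothing about performance.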
