At some point in the past I simplified all of the loop bounds by requiring all field types to have the same dimensions and capturing their differences in the mask information. However, it seems that I was slack and only actually changed the implementation of the initialisation routines for the NE stagger (the one used by NemoLite2d). Those for the SW stagger still apply +/-1 adjustments to the bounds. That's an important oversight because they're used in Shallow, which @sergisiso has been trying to get working recently.