Removes back-to-back shuffle operations whose modes are reciprocal.
Example:
c = shuffle(a, b, interleave_T128_4x2_lo)
d = shuffle(a, b, interleave_T128_4x2_hi)
e = shuffle(c, d, interleave_T128_2x4_lo)
f = shuffle(c, d, interleave_T128_2x4_hi)
v16acc32 shuffle(v16acc32 a, v16acc32 b, unsigned int mode)
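
To illustrate why the second pair of shuffles in the example can undo the first, here is a minimal C++ sketch under a simplified model. It assumes, purely for illustration, that each vector holds four 128-bit lanes, that a 4x2 or 2x4 mode transposes the concatenation of its two inputs viewed as a matrix of that shape, and that the lo/hi variants return the lower and upper halves of the result. The names `Vec`, `transpose_pair`, and `round_trip_is_identity` are hypothetical, and this is not the hardware definition of the shuffle modes.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Toy model (assumption): a vector is four 128-bit lanes, each tagged by an int.
using Vec = std::array<int, 4>;

// Hypothetical helper: view concat(x, y) as a row-major rows x cols matrix of
// lanes, transpose it, and return the low and high halves of the result.
// Requires rows * cols == 8.
std::pair<Vec, Vec> transpose_pair(const Vec& x, const Vec& y,
                                   std::size_t rows, std::size_t cols) {
    std::array<int, 8> in{};
    std::array<int, 8> out{};
    for (std::size_t i = 0; i < 4; ++i) {
        in[i] = x[i];
        in[4 + i] = y[i];
    }
    // Element (r, c) of the input becomes element (c, r) of the output.
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            out[c * rows + r] = in[r * cols + c];
        }
    }
    Vec lo{};
    Vec hi{};
    for (std::size_t i = 0; i < 4; ++i) {
        lo[i] = out[i];
        hi[i] = out[4 + i];
    }
    return {lo, hi};
}

// Mirrors the example: (c, d) from the 4x2 step, then (e, f) from the 2x4 step.
// Returns true when e == a and f == b, i.e. the two steps cancel.
bool round_trip_is_identity(const Vec& a, const Vec& b) {
    auto [c, d] = transpose_pair(a, b, 4, 2);
    auto [e, f] = transpose_pair(c, d, 2, 4);
    return e == a && f == b;
}
```

In this model the 4x2 step followed by the 2x4 step returns the original inputs, so uses of e and f could be replaced by a and b and all four shuffles dropped. Whether the real modes behave this way depends on their actual definitions.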