Address computation
Original shape
(
8
,
8
)
↓ reshape
Tiled shape
(
4
,
2
,
2
,
4
)
Mapped indices
(
i//2
,
i%2
,
j//4
,
j%4
)
addr =
(i//2) × 16
+
(i%2) × 4
+
(j//4) × 8
+
(j%4) × 1
Example: addr(2, 3) = 1×16 + 0×4 + 0×8 + 3×1 = 19