There are a couple of things I should clarify regarding my previous reply
about the stack calculation.
1- On disabled optimization for debug builds: instead of "often
puts variables at a gross alignment", perhaps I should have said "often
puts storage for variables at a maximum 8-byte alignment".
2- Microsoft's compilers pretty much ignore the "register" qualifier.
They're smart enough to forbid taking the address of a register
variable, but that's as far as they go. Thus the results others get
for the match() stack requirement will often differ from what an
MSVC compiler would produce. I don't know whether gcc passes some of
the match() parameters in CPU registers, keeps its other "register"
variables in CPU registers, or puts them all on the stack.
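
For anyone who wants to compare compilers directly, here is a rough
sketch of one way to estimate the per-call stack cost of a recursive
function. It is not PCRE code: probe(), frame_addr and
frame_size_estimate() are hypothetical names, and probe() is only a
small stand-in for match(). It records the address of a local at two
recursion depths and reports the distance between them. The figure
depends on compiler, flags and ABI, and an optimizer that inlines the
recursion will shrink it, so treat it as an estimate only.

```c
#include <stddef.h>
#include <stdint.h>

/* Addresses of one local variable, recorded at recursion depths 1 and 0. */
static volatile uintptr_t frame_addr[2];

/* Hypothetical stand-in for a recursive matcher (NOT PCRE's match()).
 * The "register" qualifiers are only hints; a compiler may ignore them. */
static int probe(register int depth, register const char *subject)
{
    /* Not declared "register", so taking its address is legal.
     * volatile forces it to live in memory rather than a CPU register. */
    volatile char marker = subject[0];
    register int result;

    frame_addr[depth] = (uintptr_t)&marker;
    result = (depth > 0) ? probe(depth - 1, subject + 1) : 0;

    /* Reading marker after the call keeps this from being a tail call. */
    return result + marker;
}

/* Distance in bytes between the marker at depth 1 and at depth 0. */
size_t frame_size_estimate(void)
{
    uintptr_t outer, inner;

    probe(1, "ab");
    outer = frame_addr[1];
    inner = frame_addr[0];
    return (size_t)(outer > inner ? outer - inner : inner - outer);
}
```

Call frame_size_estimate() from a small main() and print the result,
then build the same file with each compiler and with optimization on
and off. That should show how much the "register" handling and the
debug alignment change the number.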