aarch64: Fix incorrect init/fini stack manipulation

The pre-index writeback operator ('!') was missing at the end of the stp
instruction. As a result, the stack pointer was not updated after the
64-bit register pair was stored, and the saved values were overwritten
when follow-on code used the stack for later stores.
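
The failure mode can be illustrated with a small C model of the stack.
This is only a sketch: the Stack type, the helper names, the slot count,
and the assumed operand form "[sp, #-16]" are hypothetical and are not
taken from the patched source. It shows that a store below sp without
writeback leaves the saved pair exposed to the next push, while the
pre-index form keeps it intact.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS 8 /* hypothetical model size, in 64-bit slots */

typedef struct {
    uint64_t mem[STACK_SLOTS];
    int sp; /* slot index of the stack top; the stack grows downward */
} Stack;

/* Models "stp a, b, [sp, #-16]" (no '!'): stores below sp, sp unchanged. */
void stp_offset(Stack *s, uint64_t a, uint64_t b)
{
    assert(s->sp >= 2);
    s->mem[s->sp - 2] = a;
    s->mem[s->sp - 1] = b;
}

/* Models "stp a, b, [sp, #-16]!" (pre-index): sp is moved down by one
 * pair first, the store uses the new sp, and sp keeps the new value. */
void stp_preindex(Stack *s, uint64_t a, uint64_t b)
{
    assert(s->sp >= 2);
    s->sp -= 2;
    s->mem[s->sp] = a;
    s->mem[s->sp + 1] = b;
}

/* Returns 1 if the saved pair is still intact after follow-on code
 * pushes another pair, 0 if it was overwritten. */
int pair_survives(int use_preindex)
{
    Stack s = { {0}, STACK_SLOTS };
    const uint64_t fp = 0x1111, lr = 0x2222; /* stand-in register values */
    int saved_at;

    if (use_preindex) {
        stp_preindex(&s, fp, lr);
        saved_at = s.sp;
    } else {
        stp_offset(&s, fp, lr);
        saved_at = s.sp - 2;
    }

    /* Follow-on code correctly pushes its own pair. */
    stp_preindex(&s, 0xAAAA, 0xBBBB);

    return s.mem[saved_at] == fp && s.mem[saved_at + 1] == lr;
}
```

In this model, pair_survives(0) reports the pair as overwritten because
the later push lands on the same slots, whereas pair_survives(1) reports
it as intact.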