
If you wish to see an example of code using both, a scheduler I wrote for the Cortex-M4F is available on Bitbucket; it's not documented, but it is fairly straightforward. Unstacking from the new stack pointer on exception return then resumes execution in the next task. In a multitasking system, if the scheduler caused the exception, it is inside the exception handler that you change the PSP to point to the next task's stack, and then return from the exception.
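To make the PSP swap concrete, here is a minimal host-side sketch in C. It is a simulation, not code for the target and not the scheduler mentioned above: `cpu_model`, `task`, `scheduler_switch` and the other names are hypothetical, the stacks are plain arrays, and only the PC slot of the eight-word frame is tracked. A real scheduler would also save and restore r4-r11 by hand, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model sizes; FRAME_PC is the PC slot of the 8-word frame
 * (r0, r1, r2, r3, r12, lr, pc, xPSR, lowest address first). */
enum { FRAME_WORDS = 8, FRAME_PC = 6, FRAME_XPSR = 7,
       STACK_WORDS = 32, NUM_TASKS = 2 };

typedef struct {
    uint32_t  stack[STACK_WORDS]; /* this task's private stack */
    uint32_t *saved_psp;          /* PSP value while the task is switched out */
} task;

typedef struct {
    uint32_t *psp;     /* process stack pointer */
    uint32_t  pc;      /* program counter of the running task */
    int       current; /* index of the running task */
} cpu_model;

/* Give a new task a ready-made frame so the first return "resumes" it. */
static void task_init(task *t, uint32_t entry_pc)
{
    t->saved_psp = t->stack + STACK_WORDS - FRAME_WORDS;
    for (int i = 0; i < FRAME_WORDS; i++) {
        t->saved_psp[i] = 0;
    }
    t->saved_psp[FRAME_PC] = entry_pc;
    t->saved_psp[FRAME_XPSR] = 0x01000000u; /* Thumb bit set */
}

/* Exception entry: push a frame onto the task's stack (full descending). */
static void exception_entry(cpu_model *cpu)
{
    cpu->psp -= FRAME_WORDS;
    for (int i = 0; i < FRAME_WORDS; i++) {
        cpu->psp[i] = 0;
    }
    cpu->psp[FRAME_PC] = cpu->pc;
}

/* Scheduler step inside the handler: remember the outgoing task's PSP,
 * pick the next task round-robin, and point the PSP at its stack. */
static void scheduler_switch(cpu_model *cpu, task tasks[NUM_TASKS])
{
    tasks[cpu->current].saved_psp = cpu->psp;
    cpu->current = (cpu->current + 1) % NUM_TASKS;
    cpu->psp = tasks[cpu->current].saved_psp;
}

/* Exception return: unstack from whatever the PSP points at now. */
static void exception_return(cpu_model *cpu)
{
    cpu->pc = cpu->psp[FRAME_PC];
    cpu->psp += FRAME_WORDS;
}

/* One scheduler exception from entry to return. */
static void scheduler_tick(cpu_model *cpu, task tasks[NUM_TASKS])
{
    exception_entry(cpu);
    scheduler_switch(cpu, tasks);
    exception_return(cpu);
}
```

Calling `scheduler_tick` repeatedly alternates between the two tasks, and each one picks up at the PC that was stacked when it was last interrupted, simply because the frame is popped from a different stack than the one it was pushed to.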

When an exception happens, a stack frame gets pushed onto the currently active stack, and the processor then switches to the MSP for the exception handler. The idea is that the PSP (process stack pointer) is used by the individual tasks, and the kernel uses the MSP (main stack pointer). The reason for having two is to let the user easily implement a multitasking 'operating system'.
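As a rough illustration of the stacking step, the following C sketch models exception entry on a host machine. Everything here is an assumption made for the example: `cpu_model` and `exception_entry` are hypothetical names, the stacks are ordinary arrays, and the frame is the basic eight-word one without floating-point state. The three `EXC_RETURN` values are the ones used on Cortex-M3/M4 when no FPU context is stacked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Basic frame: r0, r1, r2, r3, r12, lr, pc, xPSR (lowest address first). */
enum { FRAME_WORDS = 8 };

/* EXC_RETURN values placed in LR on exception entry (no FPU context). */
#define EXC_RETURN_HANDLER_MSP 0xFFFFFFF1u /* return to handler mode, MSP */
#define EXC_RETURN_THREAD_MSP  0xFFFFFFF9u /* return to thread mode, MSP  */
#define EXC_RETURN_THREAD_PSP  0xFFFFFFFDu /* return to thread mode, PSP  */

/* Hypothetical model of the two banked stack pointers. */
typedef struct {
    uint32_t *msp;          /* main stack pointer */
    uint32_t *psp;          /* process stack pointer */
    bool      handler_mode; /* true while a handler is running */
    bool      spsel;        /* thread mode uses the PSP when set */
} cpu_model;

/* Push the frame onto whichever stack was active, then continue on the MSP.
 * Returns the EXC_RETURN value the handler would see in LR.
 * Simplification: spsel is left untouched and handler_mode overrides it. */
static uint32_t exception_entry(cpu_model *cpu,
                                const uint32_t regs[FRAME_WORDS])
{
    bool on_psp = !cpu->handler_mode && cpu->spsel;
    uint32_t **sp = on_psp ? &cpu->psp : &cpu->msp;
    uint32_t exc_return = cpu->handler_mode ? EXC_RETURN_HANDLER_MSP
                        : on_psp            ? EXC_RETURN_THREAD_PSP
                                            : EXC_RETURN_THREAD_MSP;

    *sp -= FRAME_WORDS; /* full descending stack */
    for (int i = 0; i < FRAME_WORDS; i++) {
        (*sp)[i] = regs[i];
    }

    cpu->handler_mode = true; /* the handler itself always runs on the MSP */
    return exc_return;
}
```

With a task running on the PSP, the frame ends up on the task's own stack and the main stack is left untouched, which is what keeps kernel and task stacks separate.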

You are correct in a way: on the Cortex-M (which your STM32 is, though I can't say which variant unless you specify a part) there is one active stack pointer, r13, but it can be either the MSP or the PSP.
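The selection rule can be written down in a few lines of C. This is only a host-side model with hypothetical names (`cpu_model`, `cpu_r13`); on the real part the choice is made by the processor mode together with the SPSEL bit (bit 1) of the CONTROL register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the banked stack pointers behind r13. */
typedef struct {
    uint32_t msp;          /* main stack pointer */
    uint32_t psp;          /* process stack pointer */
    bool     handler_mode; /* true while an exception handler runs */
    bool     spsel;        /* models CONTROL.SPSEL */
} cpu_model;

/* The value code sees as r13: handler mode always uses the MSP,
 * thread mode uses the PSP only when SPSEL is set. */
static uint32_t cpu_r13(const cpu_model *cpu)
{
    if (cpu->handler_mode || !cpu->spsel) {
        return cpu->msp;
    }
    return cpu->psp;
}
```

Out of reset SPSEL is clear, so code that never touches the CONTROL register runs entirely on the MSP and never notices that a second stack pointer exists.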
