This is a complex part of the UML spec. In the simplest case, when you enter a state containing orthogonal regions, the initial pseudostate in each orthogonal region essentially starts a separate thread of control. There are lots of complicated rules about how events are consumed by these threads and how the threads join back together.
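To make the simplest case concrete, here is a minimal Python sketch. It is not how any particular UML tool implements this, and the names `Region`, `CompositeState`, and `make_keyboard` are made up for illustration. It assumes the simplest possible semantics: entering the composite state activates each region's initial state, and each event is offered to every region, which takes at most one transition. Guards, join/fork pseudostates, history, and exit from the composite state are all left out.

```python
class Region:
    """One orthogonal region: its own states and its own active state."""

    def __init__(self, initial, transitions):
        self.initial = initial
        self.transitions = transitions  # {(state, event): next_state}
        self.current = None             # inactive until the composite is entered

    def enter(self):
        # Corresponds to following the region's initial pseudostate.
        self.current = self.initial

    def dispatch(self, event):
        # Take the matching transition if there is one; otherwise stay put.
        self.current = self.transitions.get((self.current, event), self.current)


class CompositeState:
    """A state that contains several orthogonal regions."""

    def __init__(self, regions):
        self.regions = regions  # {region_name: Region}

    def enter(self):
        # Entering the composite state activates every region at once.
        for region in self.regions.values():
            region.enter()

    def dispatch(self, event):
        # The same event is offered to each region independently.
        for region in self.regions.values():
            region.dispatch(event)

    def configuration(self):
        # The active configuration holds one state per region.
        return {name: region.current for name, region in self.regions.items()}


def make_keyboard():
    # Hypothetical example: two independent toggles on a keyboard.
    caps = Region("caps_off", {("caps_off", "CAPS"): "caps_on",
                               ("caps_on", "CAPS"): "caps_off"})
    num = Region("num_off", {("num_off", "NUM"): "num_on",
                             ("num_on", "NUM"): "num_off"})
    return CompositeState({"caps": caps, "num": num})
```

After `enter()`, both regions are active at the same time, and a `CAPS` event changes only the `caps` region. Even this stripped-down version has to track one active state per region, which hints at why the full rules get complicated.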
But, according to a methodologist I highly recommend (H. S. Lahman), you really shouldn't use anything more than plain old Moore state machines. For more information on why one should use Moore state machines (which you can model perfectly well in UML) instead of Mealy or Harel state machines, please see this excerpt from Lahman's book. For more information on the difference between a Moore and a Mealy state machine, please see this StackExchange question.
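As a rough illustration of what "plain old Moore" means in practice, here is a minimal Python sketch. It is a generic Moore-style machine, not code taken from Lahman's book, and `MooreMachine` and `make_turnstile` are hypothetical names. The defining trait is that the action belongs to the state and runs on entry, so the transition table maps only (state, event) to the next state. In a Mealy machine the action would be attached to the transition instead.

```python
class MooreMachine:
    """Flat state machine whose actions are attached to states, not transitions."""

    def __init__(self, initial, transitions, entry_actions):
        self.transitions = transitions      # {(state, event): next_state}
        self.entry_actions = entry_actions  # {state: callable taking no arguments}
        self.state = initial

    def dispatch(self, event):
        key = (self.state, event)
        if key not in self.transitions:
            return None  # the event is ignored in the current state
        self.state = self.transitions[key]
        # The output depends only on the state just entered,
        # never on which transition led there.
        return self.entry_actions[self.state]()


def make_turnstile():
    # Hypothetical example: a coin-operated turnstile.
    transitions = {
        ("locked", "coin"): "unlocked",
        ("unlocked", "push"): "locked",
    }
    entry_actions = {
        "locked": lambda: "engage latch",
        "unlocked": lambda: "release latch",
    }
    return MooreMachine("locked", transitions, entry_actions)
```

Dispatching `coin` to a fresh turnstile moves it to `unlocked` and runs that state's entry action. A second `coin` is ignored because no transition is defined for it in `unlocked`. There are no nested states and no concurrent regions, so the whole machine is one transition table plus one action per state.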