Immersive virtual reality (VR) is a powerful tool for vocational training, especially in safety-critical vocations where real-world training is often too complicated, expensive, or risky. Despite its apparent utility in this context, there is not yet a coherent, systematic framework that captures the key features of user learning experiences in immersive VR environments. Using Structural Equation Modelling (SEM), this article presents a comprehensive framework that brings the different dimensions (assessment criteria) of that experience together. The framework is informed by the analysis of 45 VR training sessions conducted by Mines Rescue Pty Ltd, in which a total of 284 mines rescue brigades took part and received training on a specific mines rescue operation. The results show that actual and perceived learning were enhanced by trainees' engagement with the scenario, their perception of the scenario's fidelity, their sense of co-presence with other trainees, the perceived usability of the system, an overall positive attitude towards the technology, and the involvement of skilled trainers. These results suggest that: 1) there are multiple paths by which immersive VR training can have a positive impact on learning, and 2) immersive VR training will not replace the need for skilled trainers, but can instead serve as an effective vehicle for conveying their expertise.