Although accurate wind power prediction can improve the reliability, security, and economic operation of a power system, the prediction task is complex because wind speed is intermittent and depends strongly on weather conditions. This paper proposes a novel framework, consisting of hybrid deep learning models, an optimization algorithm, and a data decomposition technique, to improve the forecasting accuracy of ultra-short-term wind power generation. Wind power generation data collected from a real wind farm are preprocessed and decomposed using variational mode decomposition (VMD). A deep learning model, long short-term memory (LSTM) with dropout regularization, is designed to predict the decomposed modes. The hyper-parameters of the model are selected using the grey wolf optimization (GWO) algorithm. The combination of deep learning and the optimization algorithm plays a key role in achieving better prediction accuracy. The effectiveness of the proposed framework is evaluated on two data sets, and the framework is compared with other forecasting models, such as a hybrid deep learning model that combines the empirical wavelet transform (EWT) with LSTM. The experimental comparison shows that the proposed hybrid framework achieves the best prediction accuracy among the models tested in forecasting ultra-short-term wind power generation.
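The hyper-parameter search step can be illustrated with a minimal sketch of a basic grey wolf optimizer in plain Python. This is not the authors' implementation. The function names, the population size, the iteration count, and the two tuned quantities (dropout rate and log10 learning rate) are assumptions made for illustration, since the abstract does not state which hyper-parameters are tuned. The objective is a synthetic stand-in; in the proposed framework it would presumably be the validation error of the LSTM trained on the VMD modes.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=12, n_iters=60, seed=0):
    """Minimize `objective` inside the box `bounds` with a basic GWO.

    `bounds` is a list of (low, high) pairs; `n_wolves` must be at least 3
    because the three best wolves (alpha, beta, delta) steer the pack.
    """
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_score = None, float("inf")

    for it in range(n_iters):
        scores = [objective(w) for w in wolves]
        ranked = sorted(range(n_wolves), key=scores.__getitem__)
        # Remember the best position seen so far.
        if scores[ranked[0]] < best_score:
            best_score = scores[ranked[0]]
            best_pos = list(wolves[ranked[0]])
        leaders = [list(wolves[i]) for i in ranked[:3]]  # alpha, beta, delta
        a = 2.0 * (1.0 - it / n_iters)  # decreases linearly from 2 toward 0

        new_wolves = []
        for w in wolves:
            pos = []
            for d, (lo, hi) in enumerate(bounds):
                candidates = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    dist = abs(C * leader[d] - w[d])
                    candidates.append(leader[d] - A * dist)
                x = sum(candidates) / 3.0  # average of the three pulls
                pos.append(min(max(x, lo), hi))  # keep inside the bounds
            new_wolves.append(pos)
        wolves = new_wolves

    # Score the final pack as well.
    for w in wolves:
        s = objective(w)
        if s < best_score:
            best_score, best_pos = s, list(w)
    return best_pos, best_score


def toy_validation_error(params):
    """Hypothetical stand-in for LSTM validation error (not real data)."""
    dropout, log_lr = params
    return (dropout - 0.2) ** 2 + (log_lr + 3.0) ** 2


# Assumed search ranges: dropout in [0, 0.5], log10(learning rate) in [-5, -1].
SEARCH_BOUNDS = [(0.0, 0.5), (-5.0, -1.0)]

if __name__ == "__main__":
    best, err = grey_wolf_optimize(toy_validation_error, SEARCH_BOUNDS)
    print("best hyper-parameters:", best, "error:", err)
```

In practice each objective evaluation would train and validate a model, so the population size and iteration count trade search quality against training cost.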