Abstract:
This study aims to improve medium-to-long-term visibility forecasting by analysing the impact of pollution levels, circulation systems, and spatiotemporal distribution characteristics on low-visibility weather. A neural network approach is used to build forecasting models for more than 2500 stations nationwide, incorporating multi-year meteorological observations, pollution data, and reanalysis data. The model structure and parameterisation schemes are selected on the basis of performance evaluations that use empirical formulas and varying parameter values across different datasets. During the parameter training phase, cross-validation is employed to split the neural network datasets into training and validation sets. Models are trained on the training set under different parameterisation schemes, and their performance is assessed on the validation set. Comparing the models’ performance across schemes yields an optimal balance between fitting accuracy and generalisation capability. Using the established forecasting models, a visibility ensemble forecast product is created from 15-day PM2.5 forecasts of the CAMx-NCEP model, observed data, and ECMWF ensemble forecasts. The ensemble forecast product includes control forecast values, ensemble means, and 50th percentile values. A TS score evaluation for the winter of 2022, covering all forecast lead times including the medium-to-long term, shows that the control forecast values and ensemble means outperform the 50th percentile forecast values and ECMWF’s visibility products in the visibility ranges of below 1 km, 1-3 km, and 3-5 km. For the visibility ranges of 5-10 km and greater than 10 km, the TS scores of the control forecast values, ensemble means, 50th percentile forecast values, and ECMWF’s visibility products are relatively close.
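As an illustration of the evaluation metric, the following is a minimal sketch of a per-range TS (threat score) computation, assuming the standard definition TS = hits / (hits + misses + false alarms) applied to each visibility range. The function name `threat_scores`, the treatment of class boundaries (a value exactly on an edge falls in the higher range), and the sample values are assumptions for illustration, not details taken from the study.

```python
import numpy as np

# Visibility class edges in km, matching the ranges named in the abstract:
# below 1, 1-3, 3-5, 5-10, and greater than 10 km.
# Assumption: a value exactly on an edge is assigned to the higher class.
BIN_EDGES_KM = [1.0, 3.0, 5.0, 10.0]
BIN_LABELS = ["<1", "1-3", "3-5", "5-10", ">10"]


def threat_scores(forecast_km, observed_km):
    """Return {range label: TS}, with TS = hits / (hits + misses + false alarms)."""
    f_cls = np.digitize(np.asarray(forecast_km, dtype=float), BIN_EDGES_KM)
    o_cls = np.digitize(np.asarray(observed_km, dtype=float), BIN_EDGES_KM)
    scores = {}
    for k, label in enumerate(BIN_LABELS):
        hits = np.sum((f_cls == k) & (o_cls == k))
        misses = np.sum((f_cls != k) & (o_cls == k))
        false_alarms = np.sum((f_cls == k) & (o_cls != k))
        denom = hits + misses + false_alarms
        # TS is undefined when the class is neither forecast nor observed.
        scores[label] = float(hits / denom) if denom else float("nan")
    return scores


# Made-up sample values (km), purely for demonstration.
forecast = [0.5, 2.0, 4.0, 8.0, 12.0, 0.8]
observed = [0.7, 0.5, 4.5, 8.0, 15.0, 2.0]
scores = threat_scores(forecast, observed)
```

In this toy sample the below-1 km class has one hit, one miss, and one false alarm, so its TS is 1/3; scores for real data would be computed per lead time and averaged as the evaluation requires.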
Based on the visibility ensemble forecast product, three post-processing methods (probability matching, optimal percentiles, and neural networks) are developed to raise forecast TS scores above those of the ensemble forecast product. The average TS scores for visibility below 1 km are 0.126, 0.126, and 0.130 for the optimal percentiles, probability matching, and neural network methods, respectively. For visibility in the range of 1-3 km, the average TS scores are 0.168, 0.168, and 0.170, respectively. Compared with the ensemble forecast, these post-processing methods improve TS scores by around 10% for visibility below 1 km and by around 7% in the 1-3 km range. Analysis of the forecast model reveals that errors originate primarily from discrepancies between the ECMWF model’s input factors, such as 2 m humidity and wind fields, and the observed values. Because each post-processing method has advantages at different forecast lead times and in different visibility ranges, the methods are integrated using statistical methods to form an optimal ensemble forecast. In the low-visibility range, the TS scores of the post-processing optimal ensemble are overall similar to or slightly better than those of the individual methods. The minimum ensemble method slightly outperforms the mean and weighted ensemble products in TS scores in the 0-3 km range but performs worse above 3 km. To keep the forecast focus on low visibility, the minimum ensemble method is selected to generate the optimal ensemble forecast product, enhancing the forecast service capability for low-visibility weather during the extended period.
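The three ways of combining the post-processed forecasts (minimum, mean, and weighted) can be sketched as below. This is a minimal illustration under stated assumptions: the function name `combine_forecasts`, the array layout (one row per post-processing method), and the example weights and visibility values are hypothetical, and the study's actual weighting scheme is not specified here.

```python
import numpy as np


def combine_forecasts(method_forecasts_km, how="min", weights=None):
    """Combine per-method visibility forecasts of shape (n_methods, n_points)."""
    stacked = np.asarray(method_forecasts_km, dtype=float)
    if how == "min":
        # Minimum ensemble: take the lowest visibility proposed by any method,
        # which biases the product towards low-visibility events.
        return stacked.min(axis=0)
    if how == "mean":
        return stacked.mean(axis=0)
    if how == "weighted":
        if weights is None:
            raise ValueError("weights are required for how='weighted'")
        # np.average normalises by the sum of the weights.
        return np.average(stacked, axis=0, weights=np.asarray(weights, dtype=float))
    raise ValueError(f"unknown combination rule: {how!r}")


# Made-up forecasts (km) from three post-processing methods at two points.
forecasts = [
    [0.8, 4.0],  # probability matching (illustrative values)
    [1.2, 2.5],  # optimal percentiles (illustrative values)
    [1.0, 3.5],  # neural network (illustrative values)
]
minimum = combine_forecasts(forecasts, how="min")
mean = combine_forecasts(forecasts, how="mean")
weighted = combine_forecasts(forecasts, how="weighted", weights=[1.0, 1.0, 2.0])
```

Taking the minimum always yields a visibility no higher than the mean or any weighted average, which is consistent with it scoring better in the 0-3 km range and worse above 3 km.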