Retrieval of Boundary Layer Height and Its Influence on PM2.5 Concentration Based on Lidar Observation over Guangzhou

Wavelet analysis was applied to lidar observations to retrieve the planetary boundary layer height (PBLH) over Guangzhou from September 2013 to November 2014. The impact of the boundary effect and the wavelet scale factor on the accuracy of the retrieved PBLH was explored. In addition, the PBLH diurnal variations and the relationship between PM2.5 concentration and PBLH during polluted and clean episodes were studied. Results indicate that the most stable retrieved PBLH is obtained when the scale factor is chosen between 300 and 390 m. The maximum and minimum PBLH in the annual mean diurnal cycle were ~1100 m and ~650 m, respectively. The PBLH was significantly lower in the dry season than in the wet season, with the average highest PBLH in the dry season and the wet season being ~1050 m and ~1200 m, respectively. Compared to the wet season, the development of the PBLH in the dry season was delayed by at least one hour due to the seasonal cycle of solar radiation. Episode analysis indicated that the PBLH was ~50% higher during clean episodes than during haze episodes. The average highest PBLH in the haze episodes and clean episodes was ~800 m and ~1300 m, respectively. A significant negative correlation between PBLH and PM2.5 concentration (r = -0.55**) was found. Under the classification of China's "Ambient Air Quality Standard", the PBLH values in good and slightly polluted conditions were 1/6-1/3 lower than that in excellent conditions, while the corresponding PM2.5 concentrations were ~2-2.5 times as high.


INTRODUCTION
The planetary boundary layer (PBL) is directly affected by activity at the Earth's surface. Surface forcing mechanisms usually influence the PBL on time scales shorter than an hour (Stull [1]). Haze, which refers to atmospheric visibility of 10 km or less caused by aerosol systems composed of non-hydrated materials (Wu et al. [2]), occurs frequently in large cities in China (Chan and Yao [3]; Gao et al. [4]). Haze episodes have become increasingly frequent due to rapid industrialization and urbanization in China. Particulate matter (PM) pollution has become an important environmental issue in Guangzhou, a metropolis in the Pearl River Delta region (PRD) in southern China (Liu et al. [5]; Jung et al. [6]; Wu et al. [7][8]). Studies have shown that the interaction between the PBL and aerosols aggravates air pollution (Wang et al. [9]; Ding et al. [10]; Petäjä et al. [11]). Heavy PM pollution enhances the stability of the PBL (Ding et al. [12]), leading to further accumulation of PM, which in turn further stabilizes the boundary layer and forms a positive feedback between a stable boundary layer and heavy PM pollution (Yu et al. [13]). The development and changes of the PBL structure vary among regions due to local effects and climate system variations such as seasonal cycles, monsoon activities, land-sea breezes, topography effects, and heat island effects (Fan et al. [14]; Yang et al. [15]; Huang et al. [16]). The planetary boundary layer height (PBLH) directly determines the atmospheric environmental capacity and plays an important role in the formation and development of haze episodes (Chen observational data, the method to obtain lidar normalized backscatter signal, and the identification of PBLH by the wavelet transform.
Section 3 presents results and discussions on (1) verification of the PBLH retrieval method, including a sensitivity experiment on the effect of the wavelet transform scale factor on PBLH retrieval; (2) method verification and boundary layer characteristic analysis with case studies; (3) PBLH evolution over Guangzhou during the dry and wet seasons and during the haze and clean episodes; (4) relationships between PBLH and PM2.5 concentration. Conclusions are presented in Section 4.

Data
A micropulse Mie scattering lidar (Model 1014, Sigma Space Co., USA) was deployed at the Guangzhou Meteorological Bureau (GMB) from September 26, 2013 to November 7, 2014. The laser transmitter emits a green beam at a wavelength of 532 nm. The lidar has a vertical resolution of 15 m and a time resolution of 1 minute, with minimum and maximum detection heights of 255 m and 60 km, respectively. Observations obtained during precipitation are excluded from this study.
Meteorological data, including wind speed, temperature, relative humidity, visibility, and precipitation, were obtained from the GMB, and additional observational data, including PM2.5 concentration and relative humidity, were obtained from the Guangzhou Panyu Atmospheric Composition Station. The temperature sounding site is located at the Sanshui Meteorological Bureau. Haze episodes were defined as periods with daily mean visibility lower than 7 km, relative humidity lower than 90%, and a duration of at least 3 days. Clean episodes were defined as periods with daily mean visibility higher than 15 km and a duration of at least 3 days.
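The episode definitions above can be applied to daily means as in the following minimal sketch. The function names and the sample values are hypothetical and serve only as an illustration; the thresholds (7 km, 90%, 15 km, 3 days) are those stated in the text.

```python
from typing import List, Tuple


def find_episodes(flags: List[bool], min_days: int = 3) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) day indices of runs of True lasting >= min_days."""
    episodes = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_days:
                episodes.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the record.
    if start is not None and len(flags) - start >= min_days:
        episodes.append((start, len(flags) - 1))
    return episodes


def is_haze_day(visibility_km: float, rh_percent: float) -> bool:
    # Haze day: daily mean visibility < 7 km and relative humidity < 90%.
    return visibility_km < 7.0 and rh_percent < 90.0


def is_clean_day(visibility_km: float) -> bool:
    # Clean day: daily mean visibility > 15 km.
    return visibility_km > 15.0


# Hypothetical daily means, for illustration only.
visibility = [12.0, 6.5, 5.0, 4.2, 6.9, 9.0, 16.0, 18.5, 20.1, 14.0]
humidity = [70.0, 75.0, 80.0, 85.0, 78.0, 60.0, 55.0, 50.0, 52.0, 65.0]

haze_episodes = find_episodes([is_haze_day(v, h) for v, h in zip(visibility, humidity)])
clean_episodes = find_episodes([is_clean_day(v) for v in visibility])
print(haze_episodes, clean_episodes)
```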

Normalized backscattered signal acquisition method
Raw lidar data must be corrected before processing. Required corrections include afterpulse correction, background correction, deadtime correction, overlap correction, range correction, etc. (Campbell [41]). The relationship between the emitted laser energy and the energy of the backscattered echo signal with corrections is shown in Equations 1-3: where r is the height; P(r) is the energy of the backscattered echo signal; E is the laser emission energy; C is the lidar constant; O_olp is the overlap correction coefficient; n_b(r) is the background noise correction coefficient; n_ap(r) is the afterpulse correction coefficient; D(n(r)) is the deadtime correction coefficient; T is the atmospheric transmittance; σ_1(r) and σ_2(r), and β_1(r) and β_2(r), are the extinction coefficients (σ) and backscatter coefficients (β) of (1) aerosols and (2) air molecules, respectively. The normalized backscattered signal (NRB), X(r), after correction can be written as:

Using the wavelet transform to retrieve the PBLH
The NRB profile obtained from the corrected lidar data can be used to retrieve the boundary layer height. The basic principle of the retrieval is to use aerosol concentration to distinguish the PBL from the free atmosphere. Common PBLH retrieval methods include the gradient method (Hayden et al. [42]; Wulfmeyer [43]; Summa et al. [44]), the idealized backscatter method (Steyn et al. [45]), the cluster analysis method (Toledo et al. [46]), the wavelet analysis method (Davis et al. [47]; Liu et al. [5]), the signal variance method (Poltera et al. [48]), the manual identification method (Boers et al. [49]), etc. The aerosol concentration in the PBL is much higher than that in the free atmosphere, so in theory the backscattered signal is significantly stronger in the PBL. In reality, however, the difference in signal between the PBL and the free atmosphere is not necessarily obvious. Multilayer structures often interfere with the retrieval, and signals with a low signal-to-noise ratio pose a further challenge. The gradient method is prone to misjudgment when multiple aerosol layers are present and performs poorly on signals with a low signal-to-noise ratio. The idealized backscatter method and the cluster analysis method fit an idealized signal model to the shape of the profile and therefore have difficulty with profiles that deviate strongly from the model. The signal variance method suffers from boundary effects and from ambiguity among multiple candidate results. Manual identification, the earliest retrieval method, is highly reliable, but it cannot cope with large amounts of data because of its limited discrimination accuracy and processing speed. Wavelet analysis, a more commonly used automatic method, is mainly affected by the choice of scale factor. To support batch retrieval while keeping the results reasonable, this study used the wavelet analysis method to retrieve the PBLH.
The Haar wavelet, the simplest orthogonal wavelet, was used with the wavelet covariance transform (WCT) method in this study. The Haar wavelet function is defined as

$$h\left(\frac{r-b}{a}\right) = \begin{cases} +1, & b - \frac{a}{2} \le r \le b \\ -1, & b < r \le b + \frac{a}{2} \\ 0, & \text{elsewhere} \end{cases}$$

where r is the height, a is the scale factor, and b is the shifting factor of the Haar wavelet. The Haar WCT coefficient is defined as

$$(W_f)(a, b) = \frac{1}{a} \int_{r_b}^{r_t} X(r)\, h\left(\frac{r-b}{a}\right) dr$$

where X(r) is the NRB profile, and r_t and r_b are the upper and lower height limits for PBLH identification, respectively. The lower limit is set at 255 m, the top of the lidar blind zone, and the upper limit is set at 2000 m, which is higher than the PBLH in almost all cases. (W_f)(a, b) measures the similarity between the NRB profile and the Haar wavelet, with larger values indicating higher similarity. For each NRB profile, the coefficient is evaluated as b is shifted through the profile, and the PBLH is taken as the value of b at which (W_f)(a, b) reaches its maximum. The accuracy of the wavelet analysis retrieval depends largely on the choice of scale factor. Finding an appropriate scale factor a is therefore crucial to the automatic batch retrieval of PBLH from lidar observations.
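The retrieval described above can be sketched as follows. This is a minimal illustration, not the processing code used in this study: it assumes a uniformly spaced height grid, the commonly used 1/a normalization of the Haar covariance transform, and a synthetic NRB profile with a single transition at 1000 m. The function names `haar_wct` and `retrieve_pblh` are hypothetical.

```python
import numpy as np


def haar_wct(r, x, a):
    """Haar covariance transform W(a, b), evaluated with b at every grid height.

    r: uniformly spaced heights (m); x: NRB profile; a: scale factor (m).
    Heights where the wavelet would extend past the profile are left as NaN.
    """
    dr = r[1] - r[0]
    w = np.full(r.shape, np.nan)
    for i, b in enumerate(r):
        if b - a / 2 < r[0] or b + a / 2 > r[-1]:
            continue  # boundary region: no valid coefficient
        below = (r >= b - a / 2) & (r < b)   # wavelet value +1
        above = (r > b) & (r <= b + a / 2)   # wavelet value -1
        w[i] = (x[below].sum() - x[above].sum()) * dr / a
    return w


def retrieve_pblh(r, x, a=300.0):
    """Return the height at which the Haar WCT coefficient is largest."""
    w = haar_wct(r, x, a)
    return float(r[np.nanargmax(w)])


# Synthetic profile (assumption): strong signal below ~1000 m, weak signal above.
r = np.arange(255.0, 2000.0, 15.0)           # 15 m bins from the 255 m blind-zone top
x = 0.5 - 0.5 * np.tanh((r - 1000.0) / 60.0)

pblh = retrieve_pblh(r, x, a=300.0)
print(f"Retrieved PBLH: {pblh:.0f} m")
```

Because the retrieved height is restricted to grid points, the result can differ from the true transition height by up to one range bin (15 m here).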

Sensitivity analysis for the wavelet transform scale factor
BOUNDARY EFFECT OF HAAR WAVELET TRANSFORM
In theory, the wavelet transform is defined over an infinite domain, whereas the lidar signal is finite. The Haar wavelet is illustrated in Fig. 2(a), with an example signal profile given in Fig. 2(b). The wavelet transform can be treated as an integral calculation. The value of the Haar wavelet is 1 in the interval (b - a/2, b), with a corresponding positive integral (blue shaded area), and -1 in the interval (b, b + a/2), with a corresponding negative integral (red shaded area). Therefore, the wavelet coefficient at b is obtained by subtracting the area of the red shaded region from that of the blue shaded region, which is clearly related to the slope of the profile and the scale factor a. Suppose the lower boundary of the signal is c (Fig. 2(c)). The calculation starts with the shift factor b at the lower boundary of the lidar profile (Fig. 2(c)) and moves b step by step to the upper boundary of the profile (Fig. 2(d-f)), which yields the wavelet coefficients of the entire profile. The problem is that while b moves from b = c to b < c + a/2 (Fig. 2(c-d)), the integral over the interval (b - a/2, b) can only be partially calculated, so no effective wavelet coefficient can be obtained in the region c ≤ b < c + a/2. This is the boundary effect of the Haar wavelet transform. The wavelet transform coefficients can therefore only be effectively calculated when b ≥ c + a/2 (Fig. 2(f)). The same boundary effect exists at the upper boundary of the signal. In practice, the main consequence of the boundary effect is that it raises the lower limit of recognition.
Signal extension is generally used to address the boundary effect. Conventional extension methods include the symmetric, antisymmetric, periodic, zero padding, continuous, and smooth methods. To avoid significant singularities and signal deformation, linear extension is commonly used (Yuan and Song [51]). The extended signal is subjected to the wavelet transform, and the part of the wavelet transform coefficients affected by the boundary effect is then removed so that it does not influence feature extraction from the coefficients.
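As a rough illustration of the two points above, the sketch below computes the range of shift factors b that yields complete wavelet coefficients and pads a profile by a/2 at each end with a linear extrapolation. It is only one possible form of linear extension: fitting a straight line to the outermost `n_fit` points is an assumption made here, and the function names are hypothetical.

```python
import numpy as np


def valid_shift_range(c, top, a):
    """Range of b with a complete Haar window: c + a/2 <= b <= top - a/2."""
    return c + a / 2.0, top - a / 2.0


def extend_linear(r, x, a, n_fit=5):
    """Pad a profile by a/2 at both ends using straight lines fitted to the
    outermost n_fit points (n_fit is an illustrative choice)."""
    dr = r[1] - r[0]
    n_pad = int(np.ceil((a / 2.0) / dr))
    lo_slope, lo_icpt = np.polyfit(r[:n_fit], x[:n_fit], 1)
    hi_slope, hi_icpt = np.polyfit(r[-n_fit:], x[-n_fit:], 1)
    r_lo = r[0] - dr * np.arange(n_pad, 0, -1)    # heights below the profile
    r_hi = r[-1] + dr * np.arange(1, n_pad + 1)   # heights above the profile
    r_ext = np.concatenate([r_lo, r, r_hi])
    x_ext = np.concatenate([lo_slope * r_lo + lo_icpt, x, hi_slope * r_hi + hi_icpt])
    return r_ext, x_ext, n_pad


# With the limits used in this study (255 m and 2000 m) and a = 300 m,
# complete coefficients exist only for 405 m <= b <= 1850 m without extension.
print(valid_shift_range(255.0, 2000.0, 300.0))

r = np.arange(255.0, 2000.0, 15.0)
x = 0.5 - 0.5 * np.tanh((r - 1000.0) / 60.0)      # synthetic profile (assumption)
r_ext, x_ext, n_pad = extend_linear(r, x, a=300.0)
print(len(r), len(r_ext), n_pad)
```

After the transform is applied to the extended profile, the coefficients at the `n_pad` padded positions on each side would be discarded, so that only heights inside the original profile are considered.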

INFLUENCES OF SCALE FACTOR IN DIFFERENT PERIODS
The influence of wavelet analysis on the retrieval of PBLH from theoretical lidar signals was analyzed by Brooks [50]. In this study, the wavelet analysis was conducted using observational data. A typical haze episode occurred in the Guangzhou area from December 8, 2013 to December 12, 2013. A typical clean episode from November 29, 2013 to December 2, 2013 is used for comparison. Fig. 3 shows the sensitivity analysis for the wavelet transform scale factor a, where (a) and (b) are the time series of the hourly average PBLH identified by wavelet analysis during the typical clean episode and the typical haze episode, respectively. Cohn and Angevine [30] found that the identified PBLH increases as the scale increases, whereas this study found that the retrieved PBLH generally decreases as the scale increases. Fig. 3 shows that the scale factor affects the retrieval results differently in the two periods: (1) In the typical clean episode, the PBLH is only slightly affected by the selection of a, and the standard deviation of the retrieval results is 4% on average. The PBLH was slightly affected around noon, and usually decreased slightly with increasing a, depending on the slope of the signal profile. (2) In the typical haze episode, the PBLH is significantly affected by the selection of a, with the same trend and a standard deviation of the retrieval results of 8% on average. In the severe haze period (December 9, 1300 LT to December 11, 0700 LT), the PBLH increases with increasing a. At the beginning and end of the haze episode, multilayer aerosol structures coupled above the boundary layer in the troposphere cause the retrieval results to differ slightly among scale factors. The following section uses specific profiles as examples to illustrate these phenomena.

RESULTS OF WAVELET TRANSFORM OF TYPICAL PROFILES
In this part, typical profiles are displayed to illustrate the differences between the results of the wavelet transform. Scale factor a signal mutation of corresponding height has occurred at the corresponding scale factor. As the scale factor increases, several maxima of the wavelet transform coefficients gradually disappear, indicating that small mutations are easier to identify at small scale factors, but that noise is also easily introduced. Random noise is more common in the upper layers of the signal than in the lower layers. The scale of such noise is usually 100-200 m, so the scale factor should be larger than this. For instance, several extrema of the wavelet transform coefficient (A, B, C, D, F, and G) exist at a = 150 m, which makes the real PBLH difficult to identify. Referring to Fig. 4(c), only the extrema A, B, F and G are retained when a = 300 m, and the peak F is much higher than the others, indicating that F is the maximum; the height corresponding to F, 1000 m, is identified as the PBLH. As the scale factor gradually increases, the maximum value line A of the wavelet transform gradually merges into line B, and the corresponding signal mutation at A is gradually distorted. According to Fig. 5(c), the extrema of wavelet transform coefficient B-E are maintained at a = 300 m, wherein the peaks B and C are higher. At this scale, B < C, so the height corresponding to C, 1050 m, is identified as the PBLH. In contrast, the values of extrema A and B are very small. When the scale factor a is larger than 450 m, the corresponding signal mutation at C is gradually distorted. At this scale, B > C, and the height of 400 m corresponding to B could be identified as the PBLH, which may cause misjudgment of the PBLH.
In addition, for 150 m ≤ a ≤ 570 m, the height of the maximum B decreases as the scale increases; for a > 570 m, the height increases with the scale, which is related to the effective recognition height of the wavelet analysis. The height of the maximum C increases with increasing scale. Even after the influence of the boundary effect is removed, the identified PBLH changes differently with increasing scale for profiles with different slope changes. This is an intrinsic characteristic of the wavelet analysis algorithm, so the key to accurate recognition is choosing an appropriate scale factor according to three principles. (1) The scale factor cannot be too small (for example, less than 200 m); otherwise it is prone to introduce noise into the calculation and cause misidentification, to produce false identification of the PBLH due to the aerosol layer coupled above the boundary layer, or to make the wavelet change in the frequency domain (in this study, the size of the wavelet coefficient) not obvious. (2) The scale factor cannot be too large (for example, more than 400 m); otherwise it blurs the wavelet time domain (height in this study). At the same time, it amplifies the boundary effect and raises the lower limit of recognition. (3) The selection of the scale factor should also facilitate the subsequent feature extraction, such as using the threshold method to extract the peaks and troughs of the wavelet coefficients. It may even be necessary to extract the second largest peak and trough when studying multilayer aerosol structures. After testing a large number of cases, it was found that scale factors between 300 and 390 m meet the above principles, with no significant differences among the results retrieved in this range. To facilitate the calculation of large batches of data, the scale factor a is set to 300 m in this study.
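A scale-factor sensitivity test of the kind described above can be sketched as follows. The profile is synthetic (a main transition at 1000 m plus a weak elevated aerosol layer between 1400 m and 1600 m), the 1/a normalization and the function name `pblh_for_scale` are assumptions, and the output is not meant to reproduce Fig. 3.

```python
import numpy as np


def pblh_for_scale(r, x, a):
    """Height of the largest Haar covariance coefficient for scale factor a."""
    dr = r[1] - r[0]
    best_b, best_w = None, -np.inf
    for b in r:
        if b - a / 2 < r[0] or b + a / 2 > r[-1]:
            continue  # skip heights affected by the boundary effect
        below = x[(r >= b - a / 2) & (r < b)].sum()
        above = x[(r > b) & (r <= b + a / 2)].sum()
        w = (below - above) * dr / a
        if w > best_w:
            best_b, best_w = float(b), w
    return best_b


# Synthetic NRB profile (assumption): PBL top near 1000 m and a weak layer aloft.
r = np.arange(255.0, 2000.0, 15.0)
main = 0.5 - 0.5 * np.tanh((r - 1000.0) / 60.0)
layer = 0.1 * (np.tanh((r - 1400.0) / 30.0) - np.tanh((r - 1600.0) / 30.0))
x = main + layer

scales = np.arange(150.0, 601.0, 30.0)
heights = np.array([pblh_for_scale(r, x, a) for a in scales])

# Relative spread of the retrieved PBLH within the 300-390 m range of scale factors.
selected = (scales >= 300.0) & (scales <= 390.0)
spread = heights[selected].std() / heights[selected].mean()

for a, h in zip(scales, heights):
    print(f"a = {a:5.0f} m -> PBLH = {h:6.0f} m")
print(f"Relative spread for 300-390 m: {spread:.3f}")
```

For real profiles with noise and multilayer structures, the spread across scale factors would be expected to be larger than for this idealized case.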

VERIFICATION OF RETRIEVAL RESULTS
To verify the boundary layer height retrieved from the lidar, temperature sounding data were selected for comparison. Radiosonde data and lidar data were both available from December 19 to 20, 2013. We selected the radiosonde observation at 16:00 LT because pollutants are well mixed in the boundary layer at this time and the vertical change at the top of the boundary layer is significant. Potential temperature is used instead of virtual potential temperature in the retrieval from the sounding data. Fig. 6 shows that the boundary layer heights retrieved from the sounding data and the lidar signal at 16:00 LT on December 19 were about 1000 m and 1065 m, respectively, and those retrieved at 16:00 LT on December 20 were about 900 m and 915 m, respectively. The results of the two methods are close, indicating that both are suitable for retrieving the boundary layer height.

Case analysis of boundary layer feature
Figures 7(a) and 7(b) show the PBLH retrieval results and spatial-temporal NRB distributions obtained for the typical clean and haze episodes, respectively. The lidar echo signal is extremely strong between 1300 m and 1800 m during the period from 1200 LT on The data shows clearly that the clouds are completely separated from the high-concentration aerosols, which implies that these clouds are above the PBL. Therefore, the impact of these clouds must be removed from the wavelet analysis.
In general, the PBLH shows diurnal variations during both the clean and haze episodes, driven mainly by solar radiation. The PBLH was 753 ± 315 m, 582 ± 212 m, and 473 ± 70 m during the typical clean episode, the typical haze episode, and the severe haze period, respectively. The clean episode PBLH was on average 29.4% and 59.2% higher than those during the typical haze episode and the severe haze period, respectively. In the afternoon (1300-1600 LT), the PBLH was 1143 ± 147 m, 647 ± 114 m, and 562 ± 85 m during the typical clean episode, the typical haze episode, and the severe haze period, respectively. In the afternoon, the PBLH in the typical clean episode was on average 76.7% and 103.4% higher than those during the typical haze episode and the severe haze period, respectively. The daytime PBLH during the typical clean episode was generally 750-1500 m, while the daytime PBLH during the severe haze period was generally less than 500 m. The diurnal variations of PBLH were pronounced during the typical clean episode and at the beginning and end of the haze episode, whereas the diurnal change of the PBLH during the severe haze period was weak.
Fig. 8 shows time series of PBLH, the surface ventilation index (SVI), and basic surface meteorological parameters during the typical haze episode. The SVI is equal to the PBLH multiplied by the surface wind speed (Deng et al. [25]). Fig. 8(a) shows the surface wind vector and PM2.5 concentration time series during the typical haze episode. During this episode, the PM2.5 concentration was at its lowest (~70 μg m⁻³) when the wind direction was NNE. The PM2.5 concentration was higher under N, NNW, and NW winds, and exceeded 100 μg m⁻³ when the wind direction was NW. During the haze episode, the area upwind in the NNW direction (the direction of the main urban area of Guangzhou) contributed most to the pollution, while near-surface wind speeds below 2.5 m s⁻¹ did little to remove the pollutants.
Fig. 8(b) shows a negative correlation between PBLH and PM2.5 concentration over the entire typical haze episode, with a correlation coefficient of r = -0.77**. During the severe haze period, the PBLH was generally lower than ~500 m and the temperature change was significantly smaller than in other periods. The variations in PM2.5 concentration and visibility were small during the severe haze period, and the peak values were observed during the morning and evening traffic hours. The PM2.5 concentration was predominantly below 80 μg m⁻³, and visibility was usually higher than 5 km, when the PBLH was higher than 700 m. When the PBLH was between 500 m and 700 m, the variation in PM2.5 concentration was large. When the PBLH was less than 500 m, the PM2.5 concentration typically exceeded 70 μg m⁻³ and the visibility was less than 4 km. During the typical haze episode, the visibility in Fig. 8(c) was generally below 7 km. The lidar observations from the afternoon of December 12 were excluded from the analysis because precipitation occurred at the observation site. It is worth noting that the PM2.5 concentration did not decrease immediately after the rain because (1) light rain is not enough to remove pollutants, (2) some fine particles can bypass the raindrops and remain suspended in the atmosphere, and (3) raindrops may break up after falling to the ground, and the fragments may evaporate and resuspend the pollutants they contain. The PM2.5 concentration reached an extreme value of 130 μg m⁻³ in the morning on December 13, 2013 and then slowly declined. Visibility continued to decrease gradually after the precipitation event, partly due to aerosol hygroscopic growth in the humid post-rain environment. During the haze episode, wind speeds below 2 m s⁻¹ were associated with PM2.5 concentrations higher than 80 μg m⁻³. When the wind speed dropped below 1.5 m s⁻¹, the PM2.5 concentration exceeded 100 μg m⁻³. As discussed earlier, reduced visibility and poor air quality are in general associated with low PBLH, calm winds, temperature inversion, and high relative humidity.
Fig. 8(d) shows the time series of the surface ventilation index (SVI) during the typical haze episode. There is a significant negative correlation between SVI and PM2.5 concentration (r = -0.73**), suggesting that the SVI can be used to characterize the vertical diffusion and horizontal transport of pollutants in the boundary layer and can reasonably represent its ventilation capacity. The SVI was 1238 ± 738.43 m² s⁻¹ in the typical haze episode. When the SVI reached its maximum of 4000 m² s⁻¹, the minimum PM2.5 concentration of the haze episode was observed because the transport and diffusion capacity of the boundary layer was at its peak. The SVI calculated in this experiment is similar to the boundary layer ventilation index (VI) calculated using PBLH values retrieved from the stratified vertical wind field and temperature profiles in the PRD region by Wu et al. [8]. Overall, SVI values obtained by combining the lidar PBLH retrieval with surface wind speeds can be used to estimate the atmospheric ventilation capacity at the bottom of the PBL.
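The SVI calculation and its correlation with PM2.5 concentration can be sketched as follows. The hourly values below are hypothetical and chosen only to show the computation; the definition of the SVI as PBLH multiplied by surface wind speed is taken from the text.

```python
import numpy as np


def surface_ventilation_index(pblh_m, wind_speed_ms):
    """SVI (m^2 s^-1) = PBLH (m) multiplied by surface wind speed (m s^-1)."""
    return np.asarray(pblh_m, dtype=float) * np.asarray(wind_speed_ms, dtype=float)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    return float(np.corrcoef(x, y)[0, 1])


# Hypothetical hourly values, for illustration only.
pblh = [450.0, 500.0, 650.0, 900.0, 1100.0]   # m
wind = [1.0, 1.5, 2.0, 2.5, 3.0]              # m s^-1
pm25 = [120.0, 105.0, 85.0, 70.0, 60.0]       # ug m^-3

svi = surface_ventilation_index(pblh, wind)
r_svi_pm25 = pearson_r(svi, pm25)
print(svi, round(r_svi_pm25, 2))
```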

PBLH characteristics over Guangzhou
The structural changes in the PBL are significantly affected by solar radiation and exhibit a clear diurnal cycle. The surface absorbs net radiative heat after sunrise. As heat is transported upwards, thermal convective turbulence begins to develop in the PBL and the PBLH begins to increase significantly, reaching a maximum in the afternoon. The PBLH then decreases as the surface temperature drops after sunset and the nocturnal stable boundary layer begins to develop. These processes form the typical diurnal cycle of the PBL structure. The following sections discuss the statistics of daytime PBLH on an annual scale, in the dry and wet seasons, and during haze and clean episodes.

DIURNAL VARIATIONS IN PBLH IN ANNUAL CYCLE AND IN THE WET AND DRY SEASONS
In this study, the dry season is defined as October to March and the wet season as April to September. Fig. 9 shows PBLH diurnal variations averaged over the whole year, the dry season, and the wet season, respectively. The highest and lowest values in the annually averaged PBLH appeared at 14:00-15:00 LT and 7:00 LT, and the range of variation was between 600 and 1100 m, with a significant increase from 10:00 LT. The highest and lowest values of PBLH in the dry season appeared at 15:00 LT and 9:00 LT, and the range of variation was between 600 and 1050 m, with a significant increase from 11:00 LT. The highest and lowest values of PBLH in the wet season appeared at 14:00 LT and 7:00 LT, respectively, and the range was 600-1200 m. These results show that both the time of the highest PBLH and the onset of the rapid increase in boundary layer height were delayed by at least 1 h in the dry season compared with the wet season. Since the change in the height of the boundary layer is closely related to solar radiation, this may be related to the seasonal cycle of the local maximum solar elevation angle. The PBLH in the dry season was significantly lower than that in the wet season. The maximum value of PBLH in the dry season was about 150 m lower than that in the wet season, and the diurnal variation of the boundary layer structure was weaker than that in the wet season. The reason for these patterns may be that solar radiation is significantly higher in the wet season than in the dry season due to the higher maximum solar elevation angle. As a consequence, the surface heat flux is larger, which leads to stronger convective turbulent activity in the PBL and a more developed PBL in the wet season. The PBLH over Guangzhou identified in this study is slightly higher than the PBLH values found in studies in Beijing (Tang et al. [52]), Tianjin (Zhu et al. [53]), and Nanjing (Qu et al. [35]).
In addition, the dry season contains more cases with daytime PBLH values lower than 600 m, while the wet season includes more cases with daytime PBLH higher than 1200 m, indicating that PBL development is more limited in the dry season. Therefore, pollutants are less likely to be effectively diffused during the dry season, enhancing the likelihood of atmospheric pollution episodes.
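The seasonal diurnal averages discussed above can be computed as in the following sketch, which groups hourly PBLH values by season (dry: October to March; wet: April to September) and by hour of day. The records and the function names are hypothetical and only illustrate the grouping.

```python
from collections import defaultdict
from datetime import datetime


def season_of(month):
    """Dry season: October to March; wet season: April to September."""
    return "dry" if month >= 10 or month <= 3 else "wet"


def diurnal_composite(records):
    """Average PBLH for each (season, hour) pair.

    records: iterable of (datetime, pblh_m) pairs in local time.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for when, pblh in records:
        key = (season_of(when.month), when.hour)
        sums[key][0] += pblh
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical hourly PBLH values (m), for illustration only.
records = [
    (datetime(2013, 12, 1, 14), 1000.0),
    (datetime(2013, 12, 2, 14), 1100.0),
    (datetime(2014, 7, 1, 14), 1250.0),
    (datetime(2014, 7, 2, 14), 1150.0),
    (datetime(2014, 7, 1, 7), 600.0),
]

composite = diurnal_composite(records)
print(composite[("dry", 14)], composite[("wet", 14)])
```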

DIURNAL VARIATIONS IN THE PBLH DURING HAZE AND CLEAN EPISODES
Eight haze episodes and nine clean episodes occurred in Guangzhou during the observation period of this study. Summary statistics of the weather patterns for these episodes are presented in Table 1. Four haze episodes were affected by a warm zone ahead of a cold front, two were controlled by a warm low pressure zone, and two were influenced by peripheral high pressure areas caused by typhoon Haiyan. Three clean episodes were affected by a shear line, three by belt-shaped low pressure areas, and the remaining three by a continental high, a tropical depression, and typhoon Kalmaegi, respectively. Fig. 10 shows the statistical results of the PBLH during the haze episodes and clean episodes. The highest PBLH in both the haze and clean episodes appeared at 14:00 LT, at about 800 m and 1300 m, respectively; the lowest values appeared at 8:00 LT and 7:00 LT, at 550 m and 650 m, respectively. The diurnal variation in the PBLH was minimal during haze episodes but pronounced during clean episodes. In general, the PBLH was ~50% higher during clean episodes than during haze episodes. In addition, the PBLH remained very low during the haze episodes. In many polluted episodes, the boundary layer structure did not develop sufficiently, which favored the accumulation of pollutants. The boundary layer in the clean episodes was well developed, creating good conditions for the vertical diffusion of pollutants.

Correlations between PBLH and PM 2.5 concentration
The relationship between PBLH, wind speed and PM 2.5 concentration in the haze episodes has been discussed briefly in the case analysis of section 3.2. The corresponding relationship between wind speed and the hourly average value of PM 2.5 concentration is shown in Fig. 11. It can be seen that when the PBLH is less than 500 m, the vertical diffusion capacity of pollutants is poor. At this time, the mean PM 2.5 concentration is 64 μg m -3 . There is no obvious relationship between wind speed and PM 2.5 . When the PBLH is in the range of 500-1000 m, the mean PM 2.5 concentration is 48 μg m -3 , and PM 2.5 concentration generally decreases with increasing wind speed; the correlation coefficient between wind speed and PM 2.5 concentration is -0.25. When PBLH is between 1000 m and 1500 m, the vertical diffusion capacity of pollutants in the boundary layer gradually increases. When the PBLH is greater than 1500 m, the conditions for vertical diffusion of pollutants in the boundary layer are much stronger, with the average concentration of PM 2.5 at 31μg m -3 , which is a relative low level. No obvious relationship between wind speed and PM 2.5 concentration can be obtained. The analysis above reveals a close relationship between PBLH and PM 2.5 . In order to quantitatively study the relationship between them, the following will conduct a correlation analysis between PBLH and PM 2.5 by assuming that the total amount of pollution emissions remains constant in     the short-term and by excluding the influence of other meteorological factors. According to the China "Ambient Air Quality Standard"on the 24-hour average PM 2.5 concentration limit, the episodes when the 24-hour average concentration of PM 2.5 is lower than the firstlevel concentration limit (35μg m -3 ) is defined as the "excellent" condition. 
The 24-hour average PM 2.5 concentration higher than the primary concentration limit (35μg m -3 ) and below the secondary concentration limit (75μg m -3 ) is defined as a"good"condition, and 24-hour average PM 2.5 concentration higher than the secondary concentration limit (75μg m -3 ) is defined as "slightly polluted"condition. The diurnal variations in PBLH and PM 2.5 concentration were calculated for the above defined air quality condition. As shown in Fig. 12, PBLH has obvious diurnal changes in the three conditions, but the magnitude of the change is different. In excellent, good and slightly polluted conditions, the average PBLH was 1064 m, 833 m, and 716 m, respectively. The peak height was 1252 m, 1059 m, and 881 m, respectively. The PBLH in the excellent condition was about 1.2 times higher than that in the good condition, and was about 1.4 times higher than that in slightly polluted condition.
The diurnal variation in PM2.5 concentration is relatively small under excellent conditions, with a mean daytime PM2.5 concentration of ~26 μg m⁻³. Under good and slightly polluted conditions, the PM2.5 concentration exhibits significant diurnal variation, with high concentrations in the morning and evening and low concentrations in the afternoon (mean daytime concentrations of ~54 μg m⁻³ and 63 μg m⁻³, respectively). There is a significant negative correlation between PBLH and PM2.5. The PBLH under good and slightly polluted conditions is 1/6-1/3 lower than that under excellent conditions, while the corresponding PM2.5 concentrations are ~2-2.5 times higher. Fig. 13 shows the distribution of the daily average PBLH and PM2.5 concentration values, which have a correlation coefficient of r = -0.55**; this coefficient supports the significant negative correlation between PBLH and PM2.5 concentration discussed above.
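For readers who wish to reproduce this kind of analysis, the following Python sketch shows one way to compute a Pearson correlation coefficient and to summarize PM2.5 by PBLH range, as in the comparison of wind speed and PM2.5 above. The function names and the sample records are hypothetical and are not observations from this study; the assignment of values that fall exactly on a bin edge to the higher bin is also an assumption.

```python
import math
from collections import defaultdict

# PBLH ranges used in the text (m); edge values go to the higher bin (assumption).
PBLH_BIN_EDGES = (500.0, 1000.0, 1500.0)
BIN_LABELS = ("<500 m", "500-1000 m", "1000-1500 m", ">=1500 m")


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def pblh_bin(pblh_m):
    """Return the label of the PBLH range that contains pblh_m."""
    for edge, label in zip(PBLH_BIN_EDGES, BIN_LABELS):
        if pblh_m < edge:
            return label
    return BIN_LABELS[-1]


def summarize_by_pblh(records):
    """Group (pblh, wind speed, pm25) records by PBLH range.

    Returns {label: (count, mean PM2.5, r between wind speed and PM2.5)}.
    """
    groups = defaultdict(list)
    for pblh_m, wind_ms, pm25 in records:
        groups[pblh_bin(pblh_m)].append((wind_ms, pm25))
    summary = {}
    for label, pairs in groups.items():
        winds = [w for w, _ in pairs]
        pms = [p for _, p in pairs]
        r = pearson_r(winds, pms) if len(pairs) >= 2 else float("nan")
        summary[label] = (len(pairs), sum(pms) / len(pms), r)
    return summary


# Hypothetical hourly records: (PBLH in m, wind speed in m/s, PM2.5 in ug m^-3).
records = [
    (450.0, 1.0, 66.0), (480.0, 2.0, 62.0),
    (700.0, 1.5, 55.0), (800.0, 2.5, 47.0), (900.0, 3.0, 44.0),
    (1600.0, 2.0, 30.0), (1700.0, 3.5, 32.0),
]
summary = summarize_by_pblh(records)
for label in BIN_LABELS:
    if label in summary:
        count, mean_pm25, r = summary[label]
        print(f"{label}: n={count}, mean PM2.5={mean_pm25:.1f}, r={r:.2f}")
```

The same `pearson_r` function can be applied to paired daily averages of PBLH and PM2.5; assessing the significance of the coefficient additionally requires the sample size.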

CONCLUSIONS
This study used wavelet analysis to retrieve the planetary boundary layer height (PBLH) from lidar observations and examined the effect of the wavelet scale factor a on the accuracy of the retrieved PBLH. With an optimal scale factor a selected, the diurnal variations of PBLH over Guangzhou were studied for the wet and dry seasons and for haze and clean episodes. The relationship between PM2.5 concentration and PBLH was further studied during different episodes.
The causes of the boundary effect in the wavelet analysis were examined. The effect of the wavelet transform scale factor a on the PBLH retrieval was more pronounced during haze episodes than during clean episodes. Large scale factor values could not capture the real planetary boundary layer structure, while small scale factor values caused instability and errors in the results. In this study, the PBLH retrieval results were the most stable and accurate when the scale factor was between 300 and 390 m.
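To illustrate how the scale factor a enters such a retrieval, the sketch below applies a Haar wavelet covariance transform to a synthetic, step-shaped backscatter profile and takes the height of the transform maximum as the PBLH. The choice of the Haar wavelet, the 15 m range resolution, the synthetic profile, and the function names are all assumptions made for this example; they are not the configuration or data of this study. The transform is evaluated only where the full wavelet window fits inside the profile, which is one simple way to sidestep the boundary effect.

```python
def haar_wct(signal, heights, a):
    """Haar wavelet covariance transform of a uniformly sampled profile.

    signal  : normalized backscatter at each range gate
    heights : gate heights (m), uniformly spaced
    a       : scale factor (m), i.e. the full width of the Haar wavelet
    Returns (centers, w): wavelet centre heights and transform values.
    """
    if len(signal) != len(heights) or len(signal) < 2:
        raise ValueError("signal and heights must have equal length >= 2")
    dz = heights[1] - heights[0]
    half = int(round(a / (2.0 * dz)))  # gates in each half of the wavelet
    if half < 1 or 2 * half > len(signal):
        raise ValueError("scale factor is incompatible with the profile")
    centers, w = [], []
    # Evaluate only where the whole window lies inside the profile.
    for i in range(half, len(signal) - half + 1):
        below = sum(signal[i - half:i])   # wavelet value +1 below the centre
        above = sum(signal[i:i + half])   # wavelet value -1 above the centre
        centers.append(heights[i] - dz / 2.0)
        w.append((below - above) * dz / a)
    return centers, w


def retrieve_pblh(signal, heights, a):
    """Height of the transform maximum, i.e. the sharpest signal decrease."""
    centers, w = haar_wct(signal, heights, a)
    best = max(range(len(w)), key=w.__getitem__)
    return centers[best]


# Synthetic profile (assumption): strong signal below 900 m, weak above.
DZ = 15.0
heights = [DZ * k for k in range(200)]
profile = [1.0 if z < 900.0 else 0.2 for z in heights]

for a in (300.0, 390.0):
    print(f"a = {a:.0f} m -> PBLH = {retrieve_pblh(profile, heights, a):.1f} m")
```

For a noise-free step profile the retrieved height does not depend on a; with real, noisy profiles the choice of a controls the trade-off between stability and resolution described above.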
The highest annual average PBLH, about 1100 m, appeared between 14:00 and 15:00, and the lowest value, about 650 m, appeared at 07:00. The PBLH began to increase significantly at 10:00. In the dry season, the highest and lowest values of the average diurnal PBLH appeared at 15:00 and 09:00, respectively, and the PBLH ranged between 600 and 1050 m. In the wet season, the highest and lowest values of the average diurnal PBLH appeared at 14:00 and 07:00, respectively, and the PBLH ranged between 600 and 1200 m. Both the time of the highest PBLH and the onset of the significant increase of the boundary layer were delayed by at least 1 h in the dry season relative to the wet season. In addition, the PBLH in the dry season was significantly lower than that in the wet season, and the maximum PBLH in the dry season was about 150 m lower than that in the wet season. The diurnal variation of the boundary layer structure was weaker in the dry season than in the wet season. Moreover, the dry season had more daytime cases below 600 m than the wet season, while the wet season had significantly more daytime cases above 1200 m than the dry season.
In haze episodes and clean episodes, the average PBLH reached its highest values (~800 m and ~1300 m, respectively) at 14:00 LT and its lowest values (~550 m and ~650 m) at 08:00 LT and 07:00 LT, respectively. In general, the PBLH was ~50% higher during clean episodes than during haze episodes. During the typical haze episode, the surface ventilation coefficient (SVI), with a value of 1238 ± 738.43 m² s⁻¹, reasonably represented the boundary layer ventilation capacity. There was a significant negative correlation between SVI and PM2.5 concentration (r = -0.73**).
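As a rough illustration of how a ventilation statistic of this kind can be computed, the sketch below assumes that the ventilation coefficient is the product of PBLH and surface wind speed, a common definition that is consistent with the units m² s⁻¹ but is not confirmed by this section. The function name and the hourly values are hypothetical and are not data from this study.

```python
import statistics


def ventilation_coefficient(pblh_m, wind_speed_ms):
    """Element-wise product of PBLH (m) and wind speed (m/s), in m^2 s^-1.

    Assumption: the coefficient is defined as PBLH times surface wind speed.
    """
    if len(pblh_m) != len(wind_speed_ms):
        raise ValueError("PBLH and wind speed series must have equal length")
    return [h * u for h, u in zip(pblh_m, wind_speed_ms)]


# Hypothetical hourly values during an episode.
pblh_m = [550.0, 600.0, 700.0, 800.0]
wind_speed_ms = [1.0, 1.5, 2.0, 2.5]

svi = ventilation_coefficient(pblh_m, wind_speed_ms)
mean_svi = statistics.mean(svi)
std_svi = statistics.stdev(svi)  # sample standard deviation
print(f"SVI = {mean_svi:.0f} +/- {std_svi:.0f} m^2 s^-1")
```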
The PBLH showed distinct diurnal variations under different pollution conditions. Under excellent, good, and slightly polluted conditions, the mean PBLH values were 1064 m, 833 m, and 716 m, respectively, and the peak PBLH values were 1252 m, 1059 m, and 881 m, respectively. The diurnal variation of PM2.5 concentration was relatively small under excellent conditions, with a mean daytime PM2.5 concentration of ~26 μg m⁻³. Under good and slightly polluted conditions, the PM2.5 concentration showed obvious diurnal variations, with peaks in the morning and evening and a minimum in the afternoon. There was a significant negative correlation between the PBLH and PM2.5 concentration (r = -0.55**). The PBLH values under good and slightly polluted conditions were 1/6-1/3 lower than the PBLH under excellent conditions, while the corresponding PM2.5 concentrations were ~2-2.5 times higher.