A new weighted quantile regression

The objective of this study is to use quantile regression to estimate extreme value events. Modeling extreme value events requires heavy-tailed distributions that fit the data well, and one needs to estimate high conditional quantiles of a random variable. Quantile regression can estimate these high conditional quantiles, which the alternative mean regression method cannot provide. To improve this approach further, a weighted quantile regression method is introduced and compared in detail with the unweighted method. Monte Carlo simulations show that the proposed weighted method is more efficient than regular quantile regression at high quantiles. Comparisons of the proposed method and existing methods are given. The paper also investigates two real-world applications to extreme events using the proposed weighted method.

Subjects: Science; Mathematics & Statistics; Statistics & Probability; Statistics; Statistical Computing; Statistical Theory & Methods; Statistics for Business, Finance & Economics

Research Article


PUBLIC INTEREST STATEMENT
It is important to study extreme events in various disciplines, including finance, earth sciences, and biological sciences. For example, equity risks, wildfires, extreme floods, pollution, and tumor growth are potentially damaging to both humans and the environment. Successfully modeling such events makes it possible to predict the risk of their occurring in the future more accurately. Estimating the high conditional quantiles of extreme values of related variables is therefore a very important task, and the traditional mean regression model is not sufficient for it. This paper proposes a new weighted quantile regression model. The asymptotic properties and the Monte Carlo simulations show that the proposed method performs better than regular quantile regression. We also study two real-world examples, CO₂ Emission and Apple Stock Prices, showing that the proposed weighted quantile regression method gives more reasonable predictions of high quantile values of related variables.

Objective and motivation
Extreme value theory (de Haan & Ferreira, 2006) deals with extreme events in various worldwide disciplines including finance, earth sciences, and biological sciences. This is an important field in probability and statistics for studying extreme events including equity risks, wildfires, extreme floods, and tumor growth which are potentially damaging to both humans and the environment. The outcome of successfully modeling such events leads to accurately predicting the risk of those extreme events occurring in the future. Estimating the high conditional quantiles of extreme values of related variables is thus a very important task.
The well-known mean regression model assumes that the conditional mean of $y$ given $\mathbf{x}$ is $E(y \mid \mathbf{x}) = \mathbf{x}^T \boldsymbol{\beta}$, where $\mathbf{x} = (1, x_1, x_2, \ldots, x_k)^T$ contains the given variables and $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_k)^T \in R^p$ ($p = k + 1$). By minimizing the squared $L_2$ distance, one obtains a least squares estimator of $\boldsymbol{\beta}$ from a random sample $(y_i, x_{i1}, x_{i2}, \ldots, x_{ik})$, $i = 1, 2, \ldots, n$, namely

$$\hat{\boldsymbol{\beta}}_{LS} = \arg\min_{\boldsymbol{\beta} \in R^p} \sum_{i=1}^{n} (y_i - \mathbf{x}_i^T \boldsymbol{\beta})^2. \quad (1)$$

Mean linear regression provides the mean relationship between a response variable and explanatory variables, and conditional mean models have limitations. When analyzing extreme value events, where the response variable $y$ has a heavy-tailed distribution, mean linear regression cannot be extended to non-central locations. Quantile regression estimates conditional quantiles, and it will be used to estimate values of extreme events (Hao & Naiman, 2007; Yu, Lu, & Stander, 2003). We will study two real-world examples in the following sections.
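As a minimal sketch of the least squares estimator in (1), the following pure-Python function solves the normal equations $(X^T X)\boldsymbol{\beta} = X^T y$. The function name `least_squares` and the toy data are illustrative assumptions, not part of the paper, and a numerical library would normally be preferred for real data.

```python
def least_squares(rows, y):
    """Return [b0, b1, ..., bk] minimizing sum_i (y_i - x_i^T b)^2.

    rows holds the regressors (x_i1, ..., x_ik); the leading 1 for the
    intercept is added here. Solves the normal equations (X^T X) b = X^T y
    by Gauss-Jordan elimination with partial pivoting.
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(xi[j] * xi[k] for xi in X) for k in range(p)]
        + [sum(xi[j] * yi for xi, yi in zip(X, y))]
        for j in range(p)
    ]
    for col in range(p):
        # Pick the largest available pivot in this column.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Eliminate this column from every other row.
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[j][p] / A[j][j] for j in range(p)]


# Hypothetical data lying exactly on y = 1 + 2x.
coefficients = least_squares([[0.0], [1.0], [2.0], [3.0]], [1.0, 3.0, 5.0, 7.0])
print(coefficients)
```

A polynomial fit uses the same function by passing the rows `[x, x * x]`, since the model stays linear in the coefficients.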

CO₂ emission example
A greenhouse gas is a gas in an atmosphere that absorbs and emits radiation within the thermal infrared range. This process is the primary cause of the greenhouse effect, by which radiation from a planet's atmosphere warms the planet's surface to a temperature higher than it would be in the absence of that atmosphere. The main greenhouse gases in the Earth's atmosphere are water vapor, carbon dioxide, methane, nitrous oxide, and ozone. Without greenhouse gases, the average temperature of the Earth's surface would be about −18 °C rather than the current average of 15 °C. Carbon dioxide is the primary greenhouse gas contributing to climate change. If global CO₂ emission continues its steady increase over time, as shown in Figure 1, the repercussions could be deadly (BP, 2014).
Attribution of recent climate change is the work of discovering the mechanisms responsible for recent changes observed in the Earth's climate, more commonly known as global warming. The overriding mechanisms to which recent climate change has been attributed arise from human activity, including the increase in atmospheric concentrations of greenhouse gases. Impacts of climate change, including glacier retreat, changes in the timing of seasonal events, and changes in agricultural productivity, have already been detected.
The 2012 CO₂ emissions from fossil fuel use and industrial processes in 181 countries have been collected from the Emission Database for Global Atmospheric Research (2017) (http://edgar.jrc.ec.europa.eu/). An annual emission of over 40,000 kilotons of CO₂ is considered to be noteworthy (Tracker, 2015). The selected threshold is therefore 40,000 kilotons of CO₂, which leaves 67 records in the data set. In Figure 2, the x-axis is the country (in alphabetical order), the y-axis is CO₂ emission (in kilotons), and the red line indicates the threshold of 40,000 kilotons of CO₂.
The top 10 emission data are in Table 1.
One can see quadratic behavior in Figure 3 (i.e. as oil consumption increases, a high CO₂ emission becomes much more probable), along with both a least squares curve and a least squares surface. This is not surprising since the burning of oil is a main human source of CO₂ emissions. We use a traditional polynomial regression model $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$, where $x$ is (oil consumption (1000 BPD))$^{0.25}$ and $y$ is (CO₂ emission (kt))$^{0.25}$. We use the sample of size $n = 67$ to compute $\hat{\boldsymbol{\beta}}_{LS}$ in (1). The $\hat{\boldsymbol{\beta}}_{LS}$ curve in Figure 3(a) merely estimates the average (CO₂ emission (kt))$^{0.25}$ for a given level of (oil consumption (1000 BPD))$^{0.25}$. Since one is interested in high CO₂ emissions, which can inflict damage, the quantile regression method will be used to estimate the desired high (i.e. 95%) conditional quantiles of CO₂ emission. We will continue to study this example in detail in Subsection 5.1.

Apple stock prices example
The closing stock price (CSP) is the final price at which a security is traded on a given trading day. It represents the most up-to-date valuation of a security until trading begins again on the following trading day. Investors are interested in predicting when particular stock prices will be high in order to maximize their rate of return. The daily Apple closing stock prices from 1 January 1990 until 31 December 2015 have been collected from Yahoo Canada Finance (2017) (https://ca.finance.yahoo.com/), with every 30th data point being retained. The selected threshold is a closing stock price of 30 US dollars, since a stock with a closing price above 30 US dollars is considered to have much potential to grow (Chaturvedi, 2009). This leaves 163 of the original 218 records in the data set. The top 10 Apple closing stock price (CSP-Apple) data are in Table 2. In Figure 4, the x-axis is the date, the y-axis is the CSP-Apple (in US dollars), and the red line indicates the threshold of 30 US dollars.
One can see linear behavior in Figure 5, along with both the least squares lines and the least squares surface. Since Apple Inc., IBM Corporation, and EMC Corporation are all popular technology companies with excellent reputations, their closing stock prices are well known to be very predictive of one another. We use the traditional linear regression model with two regressors ($x_1$ and $x_2$), namely $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$, where $x_1$ is (CSP-IBM (USD))$^{0.5}$, $x_2$ is (CSP-EMC (USD))$^{0.5}$, and $y$ is (CSP-Apple (USD))$^{0.5}$. We use the sample of size $n = 163$ to compute $\hat{\boldsymbol{\beta}}_{LS}$ in (1). However, the $\hat{\boldsymbol{\beta}}_{LS}$ surface in Figure 5 merely estimates the average (CSP-Apple (USD))$^{0.5}$ for given levels of the two regressors. Since one is interested in high CSP-Apple, which can lead to noteworthy return rates, the quantile regression method will be used to estimate the desired high (i.e. 95%) conditional quantiles of CSP-Apple. We will continue to study this example in greater detail in Subsection 5.2.

Main methods and results
In this paper, we propose a new weighted quantile regression method in order to improve the regular quantile regression method. We carry out three studies:
(1) A weight as a function of the local conditional density is proposed. An estimate of this weight is also given.
(2) Monte Carlo simulations are performed to show the efficiency of the new weighted quantile regression estimator relative to the regular quantile regression estimator.
(3) The new proposed method is applied to real-world examples of extreme events and compared to mean regression and regular quantile regression.
In Section 2, we review some notation. In Section 3, we propose a weighted quantile regression method and give its asymptotic properties for any uniformly bounded positive weight that is independent of the response variable $y$, including the weight based on the conditional density. In Section 4, Monte Carlo simulations generated from the bivariate Pareto distribution Type II show that the proposed weighted method has high efficiency relative to existing methods. In Section 5, three regression methods (mean regression, regular quantile regression, and the proposed weighted quantile regression) are applied to the real-life examples: CO₂ Emission (Subsection 1.1) and Apple Stock Prices (Subsection 1.2). Three goodness-of-fit tests are used to assess the distributions of the data. The examples illustrate that the proposed weighted quantile regression model fits the data better than the existing quantile regression method.

Pickands (1975) first introduced the Generalized Pareto Distribution (GPD).

Definition 1
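The cumulative distribution function (c.d.f.) of the two-parameter GPD($\gamma$, $\sigma$) with the shape parameter $\gamma > 0$ and scale parameter $\sigma > 0$ of a random variable $Y$ is given by

$$F(y) = 1 - \left(1 + \frac{\gamma y}{\sigma}\right)^{-1/\gamma}, \quad y > 0. \quad (4)$$

The $\tau$th linear conditional quantile of a continuous random variable $y$ with the c.d.f. $F(y)$ for given $\mathbf{x}$ is defined as

$$Q_y(\tau \mid \mathbf{x}) = \mathbf{x}^T \boldsymbol{\beta}(\tau), \quad 0 < \tau < 1.$$

Koenker and Bassett (1978) proposed an $L_1$-loss function to obtain the estimator $\hat{\boldsymbol{\beta}}(\tau)$ by solving

$$\hat{\boldsymbol{\beta}}(\tau) = \arg\min_{\boldsymbol{\beta} \in R^p} \sum_{i=1}^{n} \rho_\tau(y_i - \mathbf{x}_i^T \boldsymbol{\beta}), \quad (5)$$

where $\rho_\tau$ is a loss function, namely

$$\rho_\tau(u) = u\,(\tau - I(u < 0)).$$

Huang, Xu, and Tashnev (2015) proposed a weighted quantile regression method

$$\hat{\boldsymbol{\beta}}_w(\tau) = \arg\min_{\boldsymbol{\beta} \in R^p} \sum_{i=1}^{n} w_i(\mathbf{x}_i, \tau)\, \rho_\tau(y_i - \mathbf{x}_i^T \boldsymbol{\beta}), \quad (6)$$

where $w_i(\mathbf{x}_i, \tau)$ is any uniformly bounded positive weight function independent of $y_i$, $i = 1, \ldots, n$.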
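The following sketch evaluates the standard Koenker and Bassett check loss $\rho_\tau(u) = u(\tau - I(u < 0))$ and the weighted objective that a weighted quantile regression estimator minimizes. The function names and the toy data are illustrative assumptions. The helper `inverse_norm_weight` only mirrors the form $(x_{i1}^{-2} + x_{i2}^{-2})^4$ that the paper reports for its examples, and it is not claimed to be the general definition of the proposed weight.

```python
def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_objective(beta, X, y, tau, weights=None):
    """Sum of w_i * rho_tau(y_i - x_i^T beta); weights default to 1."""
    if weights is None:
        weights = [1.0] * len(y)
    total = 0.0
    for xi, yi, wi in zip(X, y, weights):
        fitted = sum(b * v for b, v in zip(beta, xi))
        total += wi * check_loss(yi - fitted, tau)
    return total


def inverse_norm_weight(x_row, power):
    """Illustrative weight (sum_j x_ij^(-2))^power for nonzero regressors."""
    return sum(v ** -2 for v in x_row) ** power


# Intercept-only illustration with hypothetical data: the constant that
# minimizes the unweighted objective is a sample quantile of y.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
X = [[1.0] for _ in y]
median_fit = min(y, key=lambda c: weighted_objective([c], X, y, 0.5))
high_fit = min(y, key=lambda c: weighted_objective([c], X, y, 0.9))
print(median_fit, high_fit)
```

The search over the observed values is only for illustration; with regressors, the minimization is usually solved as a linear program.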

Proposed weighted quantile regression
In this paper, we propose two specified weights $w_i(\mathbf{x}_i, \tau)$ in (6):
(1) The power function of the $L_2$ inverse norm of $\mathbf{x}$ in (7), where $M$ is a finite real number. In the examples of Section 5 this weight takes the form $w_i(\mathbf{x}_i, \tau) = (x_{i1}^{-2} + x_{i2}^{-2})^{4}$.
(2) The local conditional density of $y$ for given $\mathbf{x}$ in (8), where $n$ is the sample size, $f_i(\xi_i(\tau \mid \mathbf{x}_i))$ is uniformly bounded at the quantile points $\xi_i(\tau \mid \mathbf{x}_i)$, and $M$ is a finite real number. In the CO₂ example of Section 5 this weight takes the form $w_i(\mathbf{x}_i, \tau) = (f_i(\xi_i(\tau \mid \mathbf{x}_i)))^{3}$.
The reasons for using weights (7) and (8) are as follows. The loss $\rho_\tau(y_i - \mathbf{x}_i^T \boldsymbol{\beta}(\tau))$ is a measure of the absolute error from $y_i$ to the true $\tau$th conditional quantile $\mathbf{x}_i^T \boldsymbol{\beta}(\tau)$, $i = 1, 2, \ldots, n$. The weight (7) is a function of the inverse norm of $\mathbf{x}$, and applying this weight to the errors gives a more balanced measure of the total error. The weight (8) gives the relative likelihood of $y$ taking values in a small neighborhood of the point $y = \xi_i(\tau \mid \mathbf{x}_i)$. As Koenker (2005) suggested, when the conditional densities of the response are heterogeneous, it is natural to consider whether weighted quantile regression might lead to efficiency improvements. In Section 5, we discuss how to select the power parameters of weights (7) and (8) based on the data set. In this paper, we look for improvement in efficiency using weight (8) in the Section 4 simulations and using weights (7) and (8) in Section 5 for the CO₂ Emission and Apple Stock Prices examples.
Next, we discuss properties of the proposed weighted estimators. Huang et al. (2015) derived the asymptotic distribution of $\hat{\boldsymbol{\beta}}_{n(w)}(\tau)$ in (6) as Theorem 1 under the following regularity conditions.

Condition 1 (C1).
The $F_i$'s are absolutely continuous, with continuous densities $f_i(\xi)$ uniformly bounded away from 0 and $\infty$ at the quantile points $\xi_i(\tau \mid \mathbf{x}_i)$, $i = 1, 2, \ldots$.

Comparison of quantile regression models
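In order to compare the regular and weighted quantile regression estimators in (5) and (6), we extend the goodness-of-fit idea of Koenker and Machado (1999) and suggest using a relative $R(\tau)$,

$$R(\tau) = 1 - \frac{V_{weighted}(\tau)}{V_{regular}(\tau)}, \quad (9)$$

so that $R(\tau) > 0$ exactly when $V_{weighted}(\tau) < V_{regular}(\tau)$. Here $V_{regular}(\tau)$ is computed from $\hat{\boldsymbol{\beta}}(\tau)$ given by (5), and $V_{weighted}(\tau)$ is computed from $w_i = w_i(\mathbf{x}_i, \tau)$ and $\hat{\boldsymbol{\beta}}_w(\tau)$ given by (6).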
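A small sketch of how a relative goodness-of-fit measure of this kind could be computed. The definitions of the two criteria are not shown in this text, so the code assumes, in the spirit of Koenker and Machado (1999), that $V_{regular}(\tau)$ is the sum of check losses at the regular fit, that $V_{weighted}(\tau)$ is the weighted sum of check losses at the weighted fit, and that $R(\tau) = 1 - V_{weighted}(\tau)/V_{regular}(\tau)$. All names and numbers are hypothetical.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def relative_r(y, fitted_regular, fitted_weighted, weights, tau):
    """Assumed form R(tau) = 1 - V_weighted(tau) / V_regular(tau).

    fitted_regular / fitted_weighted are the fitted tau-th conditional
    quantiles from the regular and weighted estimators, respectively.
    """
    v_regular = sum(check_loss(yi - q, tau)
                    for yi, q in zip(y, fitted_regular))
    v_weighted = sum(wi * check_loss(yi - q, tau)
                     for yi, q, wi in zip(y, fitted_weighted, weights))
    return 1.0 - v_weighted / v_regular


# Hypothetical values only, to show the call signature.
r_value = relative_r(
    y=[1.0, 2.0, 3.0],
    fitted_regular=[2.0, 2.0, 2.0],
    fitted_weighted=[2.0, 2.0, 2.0],
    weights=[0.5, 0.5, 0.5],
    tau=0.5,
)
print(r_value)
```

Under this assumed form, a positive value means the weighted criterion is the smaller of the two.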

Simulations
In this section, Monte Carlo simulations are performed. We generate $m$ random samples of size $n$ each from the bivariate Pareto distribution Type II (Arnold, 2015) for the random vector $(X, Y)$ with the joint c.d.f. $F(x, y)$ in (10). The conditional quantile function of $y$ given $x$, the conditional density of $y$ for given $x$, and the conditional density of $y$ for given $x$ at the $\tau$th quantile are derived from the c.d.f. in (10). Assume that the true conditional quantile is $Q_y(\tau \mid x) = \beta_0(\tau) + \beta_1(\tau) x$. We use two quantile regression methods:
(1) The regular quantile regression $Q_R(\tau \mid x)$ estimation based on (5), namely $Q_R(\tau \mid x) = \hat{\beta}_0(\tau) + \hat{\beta}_1(\tau) x$; (12)
(2) The weighted quantile regression $Q_W(\tau \mid x)$ estimation based on (6), namely $Q_W(\tau \mid x) = \hat{\beta}_{w0}(\tau) + \hat{\beta}_{w1}(\tau) x$. (13)
For each method, we generate $m = 1{,}000$ samples of size $n = 300$. When the local conditional density weight in (8) is used, the weights are those given in (14). The simulation mean squared errors (SMSE) of the estimators (12) and (13) are given in (15) and (16), where the true $\tau$th conditional quantile $Q_y(\tau \mid x)$ is defined in (11) and $N$ is a finite $x$ value such that the c.d.f. in (10) satisfies $F(N, N) \approx 1$. We let $N = 1{,}000$, and the simulation efficiencies (SEFF) are given by

$$SEFF(Q_W(\tau)) = \frac{SMSE(Q_R(\tau))}{SMSE(Q_W(\tau))},$$

where $SMSE(Q_R(\tau))$ and $SMSE(Q_W(\tau))$ are defined in (15) and (16), respectively. Table 3 displays $SEFF(Q_W(\tau))$ for varying parameter values using the weight in (14). It shows that all of the $SEFF(Q_W(\tau))$ are larger than 1 when $\tau = 0.95, \ldots, 0.99$. Figure 6 compares $SMSE(Q_R(\tau))$ with $SMSE(Q_W(\tau))$ for $\tau = 0.95, \ldots, 0.99$. It demonstrates that all $SMSE(Q_W(\tau))$ for our proposed weight in (14) are smaller than $SMSE(Q_R(\tau))$.
Furthermore, Figure 7 shows the box plots for estimating the true $\beta_0$ and $\beta_1$ when $\alpha = 3$ using $Q_R(\tau \mid x)$ and $Q_W(\tau \mid x)$ for $\tau = 0.95$ and $0.97$, respectively. The box plots suggest that the proposed $Q_W(\tau \mid x)$ is approximately unbiased and produces more accurate estimators $\hat{\beta}_{w0}(\tau)$ and $\hat{\beta}_{w1}(\tau)$ of the true $\beta_0$ and $\beta_1$ for $\tau = 0.95$ and $0.97$. Also, the variances of $Q_W(\tau \mid x)$ are smaller than those of $Q_R(\tau \mid x)$ for $\tau = 0.95$ and $0.97$.
From the simulation results in Table 3 and Figures 6 and 7, we conclude that for $\tau = 0.95, \ldots, 0.99$ the proposed weighted quantile regression $Q_W(\tau \mid x)$ with the weight in (14) is more efficient than the regular quantile regression $Q_R(\tau \mid x)$.
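The simulation setup can be sketched as follows. Since the joint c.d.f. (10) is not shown in this text, the code assumes one common parameterization of the bivariate Pareto Type II distribution, with joint survival function $(1 + x/\sigma_1 + y/\sigma_2)^{-\alpha}$ for $x, y > 0$. Under that assumption $X$ is Pareto II with shape $\alpha$, and $Y$ given $X = x$ is Pareto II with shape $\alpha + 1$ and scale $\sigma_2(1 + x/\sigma_1)$, so the true conditional quantile is linear in $x$. The grid-average `smse` stands in for the paper's SMSE in (15) and (16) and is also an assumption, as are all function names.

```python
import random


def sample_bivariate_pareto2(n, alpha, sigma1=1.0, sigma2=1.0, rng=random):
    """Draw n pairs (x, y) by sampling X, then Y given X (assumed model)."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1]
        v = 1.0 - rng.random()
        x = sigma1 * (u ** (-1.0 / alpha) - 1.0)
        scale = sigma2 * (1.0 + x / sigma1)
        y = scale * (v ** (-1.0 / (alpha + 1.0)) - 1.0)
        pairs.append((x, y))
    return pairs


def true_conditional_quantile(tau, x, alpha, sigma1=1.0, sigma2=1.0):
    """tau-th quantile of Y given X = x under the assumed model."""
    factor = (1.0 - tau) ** (-1.0 / (alpha + 1.0)) - 1.0
    return sigma2 * (1.0 + x / sigma1) * factor


def smse(estimates, tau, alpha, grid):
    """Average squared error of fitted lines b0 + b1 * x over an x grid.

    estimates holds one (b0, b1) pair per simulated sample.
    """
    total = 0.0
    for b0, b1 in estimates:
        errors = [(b0 + b1 * x - true_conditional_quantile(tau, x, alpha)) ** 2
                  for x in grid]
        total += sum(errors) / len(grid)
    return total / len(estimates)


def seff(smse_regular, smse_weighted):
    """Efficiency of the weighted method; values above 1 favor it."""
    return smse_regular / smse_weighted


sample = sample_bivariate_pareto2(300, alpha=3.0, rng=random.Random(2017))
print(len(sample), true_conditional_quantile(0.95, 1.0, alpha=3.0))
```

In a full study, each simulated sample would be fitted by both estimators, and the two lists of coefficient pairs would be passed to `smse` and then `seff`.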

Real examples of applications
In this section, we apply the following three regression models to the CO₂ Emission (Subsection 1.1) and Apple Stock Prices (Subsection 1.2) examples introduced in Section 1:
(1) The traditional mean linear regression (LS) estimator $\hat{\boldsymbol{\beta}}_{LS}$ in (1);
(2) The regular quantile regression $Q_R$ estimator $\hat{\boldsymbol{\beta}}(\tau)$ in (5);
(3) The proposed weighted quantile regression $Q_W$ estimator $\hat{\boldsymbol{\beta}}_w(\tau)$ in (6) with weights $w_i(\mathbf{x}_i, \tau)$ in (7) and (8).
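Remark 1 To estimate the proposed local conditional density in the weight $w_i(\mathbf{x}_i, \tau) = f_i(\xi_i(\tau \mid \mathbf{x}_i))$ in (8), we use kernel density estimation (Scott, 1992; Silverman, 1986),

$$\hat{f}(y \mid \mathbf{x}) = \frac{\hat{f}(y, \mathbf{x})}{\hat{\mu}(\mathbf{x})},$$

where $\hat{f}(y, \mathbf{x})$ is an estimator of the joint density of $y$ and $\mathbf{x}$, and $\hat{\mu}(\mathbf{x})$ is an estimator of the marginal density of $\mathbf{x}$. We estimate the conditional quantile function $\xi(\tau \mid \mathbf{x})$ of $y$ given $\mathbf{x}$ by inverting an estimated conditional c.d.f. (Li & Racine, 2007),

$$\hat{\xi}(\tau \mid \mathbf{x}) = \inf\{y : \hat{F}(y \mid \mathbf{x}) \ge \tau\} = \hat{F}^{-1}(\tau \mid \mathbf{x}),$$

where $\hat{F}(y \mid \mathbf{x})$ is the estimated conditional c.d.f. $F(y \mid \mathbf{x})$.
Note that a $d$-dimensional multivariate kernel density estimator from a random sample $\mathbf{X}_1, \ldots, \mathbf{X}_n$ is

$$\hat{f}(\mathbf{x}) = \frac{1}{n h^d} \sum_{i=1}^{n} K\left\{\frac{1}{h}(\mathbf{x} - \mathbf{X}_i)\right\},$$

where $h$ is the window width and the kernel function $K(\mathbf{x})$ is a function defined for $d$-dimensional $\mathbf{x}$ which satisfies $\int_{R^d} K(\mathbf{x})\,d\mathbf{x} = 1$. Fukunaga (1972) suggested using

$$\hat{f}(\mathbf{x}) = \frac{(\det \mathbf{S})^{-1/2}}{n h^d} \sum_{i=1}^{n} k\left\{h^{-2} (\mathbf{x} - \mathbf{X}_i)^T \mathbf{S}^{-1} (\mathbf{x} - \mathbf{X}_i)\right\},$$

where $\mathbf{S}$ is the sample covariance matrix of the data, $K$ is the normal kernel, and the function $k$ is given by

$$k(u) = (2\pi)^{-d/2} \exp(-u/2).$$

An estimator for the optimal window width is

$$h_{opt} = A(K)\, n^{-1/(d+4)},$$

where $A(K) = \{4/(d + 2)\}^{1/(d+4)}$ is the constant for a multivariate normal kernel.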
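To make the window-width rule concrete, here is a minimal one-dimensional ($d = 1$) sketch of the Fukunaga-type normal kernel estimator with $h_{opt} = A(K)\, n^{-1/(d+4)}$. The function names and the toy data are assumptions for illustration. The paper's weight needs a joint density of $(y, \mathbf{x})$ divided by a marginal density, which this univariate sketch does not attempt.

```python
import math


def optimal_width(n, d=1):
    """h_opt = A(K) * n^(-1/(d+4)) with A(K) = (4 / (d + 2))^(1/(d+4))."""
    a_k = (4.0 / (d + 2.0)) ** (1.0 / (d + 4.0))
    return a_k * n ** (-1.0 / (d + 4.0))


def kde_1d(data):
    """Return a density function for d = 1 with a normal kernel.

    The sample variance S plays the role of the covariance matrix, so the
    effective bandwidth on the original scale is h * sqrt(S).
    """
    n = len(data)
    mean = sum(data) / n
    variance = sum((v - mean) ** 2 for v in data) / (n - 1)
    h = optimal_width(n, d=1)

    def density(x):
        # k(u) = (2 * pi)^(-1/2) * exp(-u / 2), u = (x - X_i)^2 / (h^2 * S).
        kernel_sum = sum(
            math.exp(-0.5 * (x - v) ** 2 / (h * h * variance)) for v in data
        )
        return kernel_sum / (n * h * math.sqrt(2.0 * math.pi * variance))

    return density


# Hypothetical observations.
observations = [1.0, 2.0, 2.5, 3.0, 4.5, 5.0]
density = kde_1d(observations)
print(optimal_width(len(observations)), density(3.0))
```

Scaling by the sample variance is what lets a single window width $h$ serve data measured on any scale.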

Remark 2
The weighted quantile regression models based on $\hat{\boldsymbol{\beta}}_w(\tau)$ in (6) with the two weights in (7) and (8) are applied to each example as follows:
(1) For the CO₂ Emission example, we use power 4 in weight (7) and power 3 in weight (8).
(2) For the Apple Stock Prices example, we use power 4 in weight (7) and power 1 in weight (8).
Figure 8 shows the reason for choosing these powers according to the empirical data and the weight changes, which helps justify the use of these weights. One can see in Figure 8 that the weights in (8) decrease for the two examples. While the CO₂ emission example is more meaningful when relatively small weights are used, the Apple stock prices example is more meaningful when relatively large weights are used. It is worthwhile to study these ideas further and discover the reasoning behind them in order to help find an optimal weight.
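
CO₂ emission example
In Subsection 1.1, the mean regression model in (1) is applied, along with the regular quantile regression model with $\hat{\boldsymbol{\beta}}(\tau)$ in (5) and the weighted quantile regression model with $\hat{\boldsymbol{\beta}}_w(\tau)$ in (6) with two proposed weights, namely $w_i(\mathbf{x}_i, \tau) = (x_{i1}^{-2} + x_{i2}^{-2})^{4}$ in (7) and $w_i(\mathbf{x}_i, \tau) = (f_i(\xi_i(\tau \mid \mathbf{x}_i)))^{3}$ in (8). The scatter plot of the data set is seen in Figure 3. The mean regression $\hat{\boldsymbol{\beta}}_{LS}$ curve in Figure 3(a) only estimates the average CO₂ emission. Here, we use the quantile regression model in (19) to estimate high conditional quantiles of the corresponding extreme annual CO₂ emission.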
Firstly, the data $y_1, y_2, \ldots, y_n$ are transformed to $y_1^*, y_2^*, \ldots, y_n^*$ via $y_i^* = y_i - \mu$ with the intention of fitting the data to the GPD model in (4), with threshold $\mu = 14$ and maximum likelihood estimates $\hat{\sigma}_{MLE} = 6.8814$ (scale) and $\hat{\gamma}_{MLE} = 0.0557$ (shape). The log-log plot in Figure 9(a) indicates that the data fit the distribution well.
Additionally, the Kolmogorov-Smirnov (Kolmogorov, 1933), the Anderson-Darling, and the Cramér-von Mises (Anderson & Darling, 1952) goodness-of-fit tests were performed. Table 4 indicates that the Kolmogorov-Smirnov (K-S) test does not reject the GPD model for the transformed data (p-value 0.8460), and the Anderson-Darling (A-D) and Cramér-von Mises (C-v-M) tests lead to the same conclusion with p-values of 0.9001 and 0.9003, respectively. Figure 9(b) shows that the transformed data fit the GPD model well. The model in (19) is therefore used. It can be seen that, when the quantile is high, $Q_W$, in general, has a higher CO₂ emission than $Q_R$. This is further demonstrated in Table 5. In general, these observations are more apparent in Figure 10(a) and (b) than in Figure 10(c) and (d).
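The GPD fit check can be illustrated with a short sketch of the GPD c.d.f., its quantile function, and the Kolmogorov-Smirnov distance between the empirical c.d.f. of the exceedances and the fitted GPD. The code assumes the common form $F(y) = 1 - (1 + \gamma y/\sigma)^{-1/\gamma}$ and reads 6.8814 as the scale and 0.0557 as the shape, which is an interpretation of the reported estimates. The function names and the toy exceedances are hypothetical, and the p-value computation is not shown.

```python
def gpd_cdf(y, shape, scale):
    """GPD c.d.f. F(y) = 1 - (1 + shape * y / scale)^(-1/shape), shape > 0."""
    if y <= 0.0:
        return 0.0
    return 1.0 - (1.0 + shape * y / scale) ** (-1.0 / shape)


def gpd_quantile(p, shape, scale):
    """Inverse of gpd_cdf for 0 <= p < 1."""
    return scale / shape * ((1.0 - p) ** (-shape) - 1.0)


def ks_statistic(exceedances, shape, scale):
    """Largest gap between the empirical c.d.f. and the fitted GPD c.d.f."""
    ordered = sorted(exceedances)
    n = len(ordered)
    distance = 0.0
    for i, y in enumerate(ordered, start=1):
        fitted = gpd_cdf(y, shape, scale)
        # Check the empirical step just after and just before y.
        distance = max(distance, i / n - fitted, fitted - (i - 1) / n)
    return distance


# Assumed reading of the reported estimates for the CO2 example.
SHAPE, SCALE, THRESHOLD = 0.0557, 6.8814, 14.0

# Hypothetical exceedances y* = y - threshold on the transformed scale.
toy_exceedances = [0.4, 1.1, 2.3, 3.0, 4.8, 6.5, 9.2, 13.7, 20.1, 33.0]
print(ks_statistic(toy_exceedances, SHAPE, SCALE))

# 95% quantile of the fitted GPD, shifted back by the threshold.
print(THRESHOLD + gpd_quantile(0.95, SHAPE, SCALE))
```

Because quantiles are preserved under increasing transformations, a quantile on the fourth-root scale can be raised to the fourth power to return to kilotons.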
To compare the regular quantile regression with the weighted quantile regression, we use the relative $R(\tau)$ defined in (9). Figure 11(a) and (d) and Table 6 show that $R(\tau) > 0$ when $0.90 \le \tau \le 0.94$ in Figure 11(a) and when $\tau \ge 0.95$ in Figure 11(d), which means that $V_{weighted}(\tau) < V_{regular}(\tau)$. We also note that the $Q_W$ curves fit the data better than the $Q_R$ curve in these cases. Also, Figure 11(b), (c), (e), and (f) as well as Table 7 show that the values of $\hat{\beta}_1(\tau)$ and $\hat{\beta}_2(\tau)$ are consistent with both Figure 10 and Table 5.
Based on Figure 11(b) and (c), where $w_i(\mathbf{x}_i, \tau) = (x_{i1}^{-2} + x_{i2}^{-2})^{4}$, the estimated high conditional quantiles of CO₂ emission under the weighted quantile regression method increase more with the transformed oil consumption, and increase less with the squared transformed oil consumption, than under the regular quantile regression method. These differences are certainly noteworthy when, for example, $\tau = 0.90$. Based on Figure 11(e) and (f), where $w_i(\mathbf{x}_i, \tau) = (f_i(\xi_i(\tau \mid \mathbf{x}_i)))^{3}$, the same pattern holds, but the differences are only noteworthy when $\tau = 0.95$.

Table 5. Estimated high quantiles of the CO₂ emission (kt) example

The quantile regression model of the transformed CO₂ emission, and moreover the weighted quantile regression models of the transformed CO₂ emission, can serve as a guide and an assistive tool for environmental preparation and decision-making in both environmental analysis and environmental implementation, since these models appear to fit the data better than the alternative mean regression model, as shown by several visual representations and numerical computations.

Apple stock prices example
In Subsection 1.2, the mean regression model in (1) is applied, along with the regular quantile regression model with $\hat{\boldsymbol{\beta}}(\tau)$ in (5) and the weighted quantile regression model with $\hat{\boldsymbol{\beta}}_w(\tau)$ in (6) with the two proposed weights in (7) and (8). We use the quantile regression model in (20).
The data $y_1, y_2, \ldots, y_n$ are transformed to $y_i^* = y_i - \mu$ to fit the GPD model (4), with threshold $\mu = 30^{0.5}$ and maximum likelihood estimates $\hat{\sigma}_{MLE} = 5.5388$ (scale) and $\hat{\gamma}_{MLE} = 0.0178$ (shape). The log-log plot in Figure 12(a) indicates that the data fit the distribution well. Table 8 indicates that the Kolmogorov-Smirnov test does not reject the GPD model for the transformed data (p-value 0.2150), and the Anderson-Darling and Cramér-von Mises tests lead to the same conclusion with p-values of 0.1135 and 0.1545, respectively. Figure 12(b) shows that the transformed data fit the GPD model well.
Figure 13 further indicates that there is a linear relationship between (Apple closing stock price (USD))$^{0.5}$ ($y$) and both (IBM closing stock price (USD))$^{0.5}$ ($x_1$) and (EMC closing stock price (USD))$^{0.5}$ ($x_2$). The 2-regressor (i.e. $x_1$ and $x_2$) model in (20) was therefore used. It can be seen that, when the quantile is high, $Q_W$, in general, has a higher Apple closing stock price than $Q_R$. This is further demonstrated in Table 9. In general, these observations are more apparent in Figure 13(a) and (b) than in Figure 13(c) and (d).
Figure 14(a) shows that $R(\tau) > 0$ when $\tau \ge 0.95$, which means that $V_{weighted}(\tau) < V_{regular}(\tau)$, further allowing one to conclude that the $Q_W$ curves fit the data better than the $Q_R$ curve in these cases. Also, Figure 14(b), (c), (e), and (f) as well as Table 11 show that the values of $\hat{\beta}_1(\tau)$ and $\hat{\beta}_2(\tau)$ are consistent with both Figure 13 and Table 9. Figure 14 also indicates that both methods' dependence on the transformed IBM closing stock price is similar.
These observations suggest that the weight $w_i(\mathbf{x}_i, \tau) = (x_{i1}^{-2} + x_{i2}^{-2})^{4}$ is a better choice for this data set.
It is interesting to see the difference in the relationship between the transformed Apple closing stock price and the transformed EMC closing stock price when using the mean regression method in comparison to the quantile regression methods. The mean regression method suggests a negative relationship between the two, whereas the quantile regression methods suggest a positive relationship. The mean regression method appears to put more weight on the low transformed Apple closing stock prices, whereas the quantile regression method appears to put more weight on the high transformed Apple closing stock prices. The stock market is very complicated, and this is therefore worth investigating further to find out which relationship is more plausible in the real world.
The quantile regression model of the transformed closing stock price, and moreover the weighted quantile regression models of the transformed closing stock price, can serve as a guide and an assistive tool for stock price forecasting, and hence for purchasing decisions by those interested in investing in Apple stocks, since these models appear to fit the data better than the alternative mean regression model, as shown by several visual representations and numerical computations.

Overall conclusions and suggestions
In this paper, we proposed a new weighted quantile regression method. The main contributions are:
(1) Quantile regression provides an efficient way to estimate high conditional quantiles with an $L_1$-loss function, which overcomes the limitation of the traditional mean regression, particularly in the analysis of extreme events.
(2) Based on Huang et al. (2015), the proposed weighted quantile regression estimator has good asymptotic properties and a good convergence rate, which are illustrated in this paper.
(3) The proposed weighted quantile regression method performed better than the regular quantile regression method in the computational simulations. The simulation results show the higher efficiencies of the proposed weighted quantile regression estimator relative to the regular quantile regression estimator.
(4) The proposed weighted quantile regression method showed better goodness of fit than the regular quantile regression in the two examples of extreme events discussed. Additionally, the high quantile values of the response variable related to the explanatory variable(s) can be predicted. The proposed method gives an alternative way of studying extreme events.
(5) The results of the proposed weighted quantile regression method motivate the search for an optimal weight, which appears to be worthwhile.