Application of Sampling Strategies for Hot-Mix Asphalt Infrastructure: Quality Control-Quality Assurance Sampling; Specification for Performance Test Requirements



Introduction
Without a rational, effective, and systematic quality control-quality assurance (QC/QA) methodology, construction that does not conform to design requirements, especially for civil engineering infrastructure systems, can increase public-works expenditures over time. Development of a rational QC/QA methodology to ensure that construction quality complies with the design requirements should therefore have a high priority. A sample size limited by cost and time may lead to the misjudgement that the construction quality does not meet the design requirements. This chapter presents the effects of sampling size, sampling strategies, and acceptance/rejection criteria for QC/QA projects using statistically based decision making in hot-mix asphalt (HMA) construction. Interest has also grown recently in ensuring that the HMA as placed will meet certain performance requirements by measuring actual performance parameters on test specimens prepared from in situ samples rather than relying on surrogate values such as asphalt content and aggregate gradation. Examples include direct measures of mix permanent deformation and fatigue characteristics, mix stiffness, and degree of compaction as measured by air-void content. Determination of sample size is primarily based on an acceptable error level for a performance parameter specified by the agency. Many agencies base quality assurance on three samples. Using the t distributions, the chapter discusses why this number of samples is not appropriate for quality assurance: with only three samples in a large project, the agency will have insufficient power to reject the null hypothesis when it is false, unless the project quality delivered by the contractor is so poor that the agency is confident enough to reject the project.
In addition to providing a general introduction to fundamental statistics and hypothesis testing, two case studies are used to clarify the relationships among sampling size, sampling strategies, and performance specifications (or acceptance/rejection criteria). These include the following:
(1) A QC/QA case study is used to illustrate a methodology for determining a sampling scheme and selecting the sample size for QC/QA in HMA construction, so that the acceptable level of a mix parameter is obtained with the same risk to the contractor and the agency. A sampling scheme and sampling size based on statistical simulation of a fixed length of a one-lane-width placement of HMA are discussed. Sample size is based on the combination of the sample size of the contractor and that of the agency to balance the risk to both organizations, which will result in a mix that meets the minimum performance requirement. An example is presented for the placement of 15,000 tons of HMA according to the California Department of Transportation (Caltrans) QC/QA requirements. For this total tonnage, the contractor and agency are assumed to perform a specific number of performance tests using the California stabilometer methodology for QC and QA.
(2) A QA case study is used to illustrate the use of uniform design (UD) as a sampling strategy to ensure that the most representative sampling scheme can be achieved with a specified sample size. A sampling scheme using uniform design, and a sampling size obtained through statistical simulation of a fixed length of a two-lane-width placement of HMA with several segregation data patterns, are discussed. Based on the simulation, a QA guideline is proposed that combines the UD table with criteria on the accuracy of the sample mean and the precision of the sample standard deviation. The guideline is verified on two full-scale pavement sections using measured air-void contents (a measure of degree of compaction).

Case I: quality control-quality assurance sampling strategies for hot-mix asphalt construction
The effects of sampling strategies and size on statistically based decision making in hot-mix asphalt (HMA) construction are presented. For sample sizes agreed upon by the agency and the contractor, an acceptable level for an HMA mix parameter is determined with risk balanced between the two organizations. With increased emphasis on specific performance requirements, the use of performance tests on HMA specimens prepared from in situ samples is developing. Examples include direct measures of mix stiffness and permanent deformation characteristics. A measure of rutting resistance, the stabilometer S-value, is used by the California Department of Transportation (Caltrans) for quality control-quality assurance (QC/QA) projects. Although the S-value was used for this simulation because extensive tests were available, this approach is applicable to any performance measures already in use, such as HMA thickness or compacted air-void content. A sampling scheme and sampling size through statistical simulation of a fixed length of a one-lane-width placement of HMA are discussed. Sample size is based on the combination of the sample size of the contractor and that of the agency to balance the risk to both organizations and results in a mix that meets the minimum performance requirement.

Hypothesis testing of inequality
The acceptance or rejection of the null hypothesis, $H_0$, is referred to as a decision. A correct decision is made when (1) $H_0$ is accepted and $H_0$ is true, or (2) $H_0$ is rejected and $H_0$ is not true. For a decision based on a sample, when the null hypothesis is valid, the probability of erroneously rejecting it is designated as the Type I error (or seller's risk), i.e., $\alpha = P\{\text{Type I error}\} = P\{\text{reject } H_0 \mid H_0 \text{ is true}\}$; when the null hypothesis is not true, the probability of erroneously accepting it is named the Type II error (or buyer's risk), i.e., $\beta = P\{\text{Type II error}\} = P\{\text{fail to reject } H_0 \mid H_0 \text{ is false}\}$.

The power is defined as the probability $1 - \beta$ of correctly rejecting $H_0$ if $H_0$ is not true, i.e., $1 - \beta = P\{\text{reject } H_0 \mid H_0 \text{ is false}\}$. Hence, from the viewpoint of the agency (the buyer), the power should be as high as possible; likewise, from the perspective of the contractor (the seller), the Type I error should be as small as possible. Figure 2 plots the upper and lower bounds at various power levels of the agency and the minimum requirement of the contractor, for the null hypothesis on $\mu$ with the specification limit of 37, as functions of the sample size, $n_2$. The minimum requirements of the contractor in Figure 2 are plotted based on both the t-distribution and the standard normal distribution; the two curves coincide once $n_2$ exceeds about 10. From Table 1 and Figure 2, two observations can be made:

Testing inequality $\mu \ge C_s$ and size of test $\alpha$
1. The minimum requirement of the contractor matches the lower bound of the agency at a power of 0.5.
2. The distance enclosed by the upper and lower bounds at a specified power level decreases with smaller $S_P$, larger $\alpha$ and $\beta$, larger $k$ (a fraction between 0 and 1), and, more importantly, larger sample size.
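The interplay of $\alpha$, power, and sample size can be sketched numerically. The following is a minimal illustration only, assuming a one-sided z-test with known standard deviation (a simplification of the t-based bounds plotted in Figure 2). The function name `power_one_sided` and the normalized offset `d` (the distance of the true mean beyond the hypothesized limit, in standard deviations) are introduced here for illustration and are not taken from the chapter.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided(d, n, alpha=0.05):
    """Power of a one-sided z-test (sigma assumed known).

    d     : true mean minus hypothesized limit, in standard deviations
    n     : sample size
    alpha : Type I error (size of the test)
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # rejection threshold
    # Under the alternative the test statistic is N(d * sqrt(n), 1).
    return 1.0 - NormalDist().cdf(z_crit - d * sqrt(n))


if __name__ == "__main__":
    for n in (3, 10, 20, 50):
        print(n, round(power_one_sided(0.5, n), 3))
```

In this simplified setting the power equals $\alpha$ when `d` is zero and equals 0.5 exactly when `d * sqrt(n)` equals the critical value, which may help in reading the 0.5-power bound mentioned in the first observation.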

QC/QA demonstration example
In this demonstration example, 15,000 tons of HMA will be placed in 20 sublots (750 tons per sublot). The contractor is required to conduct 20 tests ($n_2$), i.e., one test per sublot. The number of tests conducted by the agency ($n_1 = k n_2$) will include the minimum required by the agency according to Caltrans specifications, i.e., k = 0.1 (2 tests in this case); in addition, determinations will be made for four tests (k = 0.2), six tests (k = 0.3), and eight tests (k = 0.4). The minimum stabilometer S-value has been set at 37 (Type A HMA) (California Department of Transportation [CALTRANS], 2007), and a standard deviation $S_P$ of 6.6 is used for the S-value for tests between two laboratories (Paul Benson, private communication transmitting analyses of stabilometer test results for periods 1967-1970 and 1995-1999). The demonstration example covers sampling consistency between QC and QA, sampling stabilization of $S_P$, and minimum requirements for both the agency and the contractor.
To conduct the sampling size simulation, several assumptions were made:
1. Lane width: 12 ft (3.66 m).
2. Unit weight of HMA: 145 lb/ft³ (2,323 kg/m³).
3. HMA layer thickness: 8 in. (20 cm).
4. One stability sample is represented by a 4 × 4-in. (10 × 10-cm) square, with each square assigned a normalized stability value.
Under these assumptions, the 15,000 tons of HMA will produce a section about 26,000 ft (7,925 m) long and 12 ft (3.66 m) wide. This results in an N(0,1) stability population of 12 × 3 × 26,000 × 3 = 2,808,000 data points, which was used to generate three types of data patterns, shown schematically in Figure 3: (1) random pattern, (2) transverse strip pattern with 40 vertical strips, and (3) longitudinal strip pattern with 6 horizontal strips. The N(0,1) distribution is separated at quantile points into several intervals, e.g., 6 intervals for the transverse strip pattern or 4 intervals for the longitudinal strip pattern as shown in Figure 3. These intervals are then permuted to vary randomly across the x-direction or the y-direction of a lane of HMA paving. The points within each interval are also randomly distributed over the transverse strip or the longitudinal strip. The sampling scheme is illustrated in Figure 4 for several contractor sample sizes $n_2$ (including 20, 30, 40, 50, 100, 200, and 500): one random QC sample is taken from each cell, and one random QA sample is taken from one random cell of each of $n_1 = k n_2$ random transverse strips. A total of 8 cases were simulated over three data patterns. Each case, per data pattern, was simulated 200 times.
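The section length and population size quoted above follow from simple arithmetic, sketched below. The constant and function names are hypothetical, and the conversion of 2,000 lb per ton (US short ton) is an assumption made here; the chapter does not state it explicitly.

```python
TONS = 15_000
LB_PER_TON = 2_000        # assumption: US short tons
LANE_WIDTH_FT = 12
THICKNESS_FT = 8 / 12     # 8-in. layer
UNIT_WEIGHT_PCF = 145     # lb/ft^3
CELLS_PER_FT = 3          # 4-in. squares, three per foot


def section_length_ft(tons=TONS):
    """Length of a one-lane placement that uses the given tonnage."""
    volume_ft3 = tons * LB_PER_TON / UNIT_WEIGHT_PCF
    return volume_ft3 / (LANE_WIDTH_FT * THICKNESS_FT)


def population_size(length_ft, width_ft=LANE_WIDTH_FT):
    """Number of 4 x 4-in. squares covering the paved area."""
    return (length_ft * CELLS_PER_FT) * (width_ft * CELLS_PER_FT)


if __name__ == "__main__":
    print(round(section_length_ft()))   # close to the rounded 26,000 ft
    print(population_size(26_000))
```

The computed length is slightly under 26,000 ft, which is consistent with the rounded value used in the text to obtain 2,808,000 data points.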
Verifying the minimum sampling size for an HMA paving strip requires showing (1) no apparent difference in sampling consistency between the contractor (QC) and the agency (QA) and (2) stabilization of the pooled sample estimate of the standard deviation of the stability value, $S_P$ (Tsai & Monismith, 2009). In each sampling simulation, the normalized stability values form a distribution with a mean and a standard deviation; hence, when the simulation is repeated 200 times, the standard deviations form another distribution. For each case, the standard deviation of the standard deviation distribution (SDSD) was calculated for QC and QA respectively. The difference in SDSD between QA and QC was used as an index of the sampling consistency between the agency (QA) and the contractor (QC). Likewise, for each simulation, $S_P$ was calculated based on the equation in Table 1; when repeated 200 times, the standard deviation of the $S_P$ distribution is used to inspect its stability over the M × N domain. Figure 5a illustrates the simulation results for sampling consistency between QC and QA at various k values in terms of a global smoothed line over the three data patterns. As would be expected, the sampling consistency between QC and QA increases as the k value increases. Figure 5b indicates that sampling stabilization of $S_P$ depends only on the contractor's sampling size, $n_2$, rather than on the k value. From a series of operating-characteristic curves for the four k values and two $\alpha$ values (5% and 10%), the values in Table 2 were determined for the required minimum value of S, termed $\mu_{min}$. With Figure 6a as an example, under the condition that $\alpha = 5\%$, $n_2 = 20$, k = 0.2, and power = 0.95, d has to be greater than 0.902 to satisfy the agency's power requirement; that is, $\mu$ has to be greater than 42.95 so that the agency has power 0.95 to clearly accept the contractor's mix. Figure 6b shows that a smaller d (0.803) is obtained when the $\alpha$ value is increased to 10%.
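The SDSD index described above can be sketched with a small simulation. This is a simplified illustration, not the chapter's procedure: it draws independent N(0,1) values and ignores the spatial data patterns and the cell layout of Figures 3 and 4. The function names `sdsd` and `consistency_index`, the fixed seed, and the floor of two QA samples (needed to compute a standard deviation) are assumptions introduced here.

```python
import random
import statistics


def sdsd(sample_size, repeats=200, rng=None):
    """Standard deviation of the sample standard deviations over repeats."""
    rng = rng or random.Random(0)
    sds = []
    for _ in range(repeats):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        sds.append(statistics.stdev(sample))
    return statistics.stdev(sds)


def consistency_index(n2, k, repeats=200, seed=0):
    """SDSD(QA) minus SDSD(QC) for a QA sample size n1 = k * n2."""
    rng = random.Random(seed)
    n1 = max(2, round(k * n2))   # at least two values are needed for a stdev
    return sdsd(n1, repeats, rng) - sdsd(n2, repeats, rng)


if __name__ == "__main__":
    for k in (0.1, 0.2, 0.3, 0.4):
        print(k, round(consistency_index(20, k), 3))
```

A smaller index means the QA and QC standard deviations scatter to a similar degree, so the index is expected to shrink as k grows, in line with the trend reported for Figure 5a.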
Figure 5c also shows that a higher $\mu_{min}$ criterion is needed if both the agency and the contractor require a high power level and a low $\alpha$ level, whereas if both require a low power level and a high $\alpha$ level, the $\mu_{min}$ criterion can be much smaller.
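The conversion from the normalized offset d to a required mean S-value can be checked with a few lines of arithmetic. The relation used below, required mean = 37 + d × $S_P$, is an assumption inferred from the quoted numbers (d = 0.902 corresponding to 42.95 with $S_P$ = 6.6); the function name `mu_min` is hypothetical.

```python
S_MIN = 37.0   # minimum stabilometer S-value (Type A HMA)
S_P = 6.6      # between-laboratory standard deviation of the S-value


def mu_min(d, limit=S_MIN, s_p=S_P):
    """Required mean S-value for a normalized offset d (assumed relation)."""
    return limit + d * s_p


if __name__ == "__main__":
    for d in (0.902, 0.803):
        print(d, round(mu_min(d), 2))
```

Under this assumed relation, the smaller d obtained at the 10% level corresponds to a required mean of about 42.3, lower than the 42.95 needed at the 5% level.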

Case II: HMA sampling strategies using uniform experimental design for quality assurance
Using uniform design (UD) as a sampling strategy for quality assurance (QA) ensures that the most representative and unbiased sampling scheme can be achieved with a sample size based on an acceptable error level of a hot-mix asphalt (HMA) parameter specified by the agency. Through statistical simulations and a demonstration using air-void measurements of two field pavement sections, a QA guideline combined with the UD sampling scheme was developed to judge construction quality using sample mean and sample standard deviation criteria. This approach can also be applied to any performance measure already in use.

Uniform experimental design
Statisticians have developed a variety of experimental design methods for different purposes, with the expectation that use of these methods will result in increased yields from experiments, quality improvements, and reduced development time or overall costs. Popular experimental design methods include full factorial designs, fractional factorial designs, block designs, orthogonal arrays, Latin squares, and supersaturated designs. One relatively new design method is called Uniform Design (UD). Since it was proposed by Fang and Wang in the 1980s (Fang, 1980; Fang et al., 2000; Wang & Fang, 1981), UD has been successfully used in various fields, such as chemistry and chemical engineering, quality and system engineering, computer sciences, survey design, pharmaceuticals, and natural sciences. Generally speaking, uniform design is a space-filling experimental design that allocates experimental points uniformly scattered in the domain. The fundamental concept of UD is to choose a set of experimental points with the smallest discrepancy among all the possible designs for a given number of factors and experimental runs. Many different measures of uniformity have been defined. However, the centered $L_2$-discrepancy ($CD_2$) is regarded as one of the most commonly used measures in constructing UD tables. The reason is that $CD_2$ considers not only the uniformity of the design $P$ over the unit cube $C^s$ but also the uniformity of all its projections $P_u$ over $C^u$, where $C^u$ is the u-dimensional unit cube involving the coordinates in $u$ and $P_u$ is the projection of $P$ on $C^u$. Hickernell gave an analytical expression of $CD_2$ (Fang & Lin, 2003).
One of the most noteworthy advantages of the uniform design is that it allows an experiment strategy to be conducted in a relatively small number of runs. It is very useful when the number of factor levels is large, especially when the number of runs is strictly limited and factorial designs and orthogonal arrays cannot be realized in practice. The strength of uniform design is that it provides a series of uniformly scattered experimental points over the domain; with two factors, this homogeneity becomes the spatial uniformity of sampling from a pavement section in the x and y directions. The application of uniform design resulted in a sampling scheme defined by a UD table consisting of pairs of (x, y) coordinates.
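As a rough illustration of how a candidate sampling layout could be scored, the sketch below computes the squared centered $L_2$-discrepancy using the closed form commonly attributed to Hickernell, $CD_2(P)^2 = (13/12)^s - \frac{2}{n}\sum_{k}\prod_{j}\left(1 + \tfrac{1}{2}|x_{kj}-\tfrac{1}{2}| - \tfrac{1}{2}|x_{kj}-\tfrac{1}{2}|^2\right) + \frac{1}{n^2}\sum_{k}\sum_{l}\prod_{j}\left(1 + \tfrac{1}{2}|x_{kj}-\tfrac{1}{2}| + \tfrac{1}{2}|x_{lj}-\tfrac{1}{2}| - \tfrac{1}{2}|x_{kj}-x_{lj}|\right)$. This expression is given here from general knowledge and should be checked against Fang & Lin (2003); the function name and the two example layouts are hypothetical, and points are assumed to be scaled to the unit cube.

```python
from math import prod


def centered_l2_discrepancy_sq(points):
    """Squared centered L2-discrepancy of points in the unit cube [0, 1]^s."""
    n = len(points)
    s = len(points[0])
    term1 = (13.0 / 12.0) ** s
    term2 = (2.0 / n) * sum(
        prod(1 + 0.5 * abs(x - 0.5) - 0.5 * (x - 0.5) ** 2 for x in p)
        for p in points
    )
    term3 = (1.0 / n ** 2) * sum(
        prod(
            1 + 0.5 * abs(a - 0.5) + 0.5 * abs(b - 0.5) - 0.5 * abs(a - b)
            for a, b in zip(p, q)
        )
        for p in points
        for q in points
    )
    return term1 - term2 + term3


if __name__ == "__main__":
    spread = [(0.125, 0.625), (0.375, 0.125), (0.625, 0.875), (0.875, 0.375)]
    clustered = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12), (0.12, 0.12)]
    print(centered_l2_discrepancy_sq(spread))
    print(centered_l2_discrepancy_sq(clustered))
```

A layout scattered over the section yields a much smaller value than one clustered in a corner, which is the sense in which a UD table with minimum $CD_2$ is the most uniform choice among candidates.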

Fundamental statistics
If $\bar{x}$ is the sample mean of a random sample of size n from a normal population, it can be assumed that the error $E = |\bar{x} - \mu|$ is equivalent to $z_{\alpha/2}\,\sigma/\sqrt{n}$ (Figure 7b). Table 3 summarizes the 95% confidence interval of the sample mean and sample standard deviation at various error levels and sample sizes. Note that the sample sizes listed in Table 3 were rounded up to the next integer. Figure 8a plots the sample size versus the specified error $E = |\bar{x} - \mu|$, expressed in units of the standard deviation $\sigma$, at the 95% confidence level. The two-sided 95% confidence interval on the sample mean and the one-sided 95% upper confidence bound on the sample standard deviation of an N(0, 1) distribution, as functions of sample size, are illustrated in Figures 8b and 8c, respectively.
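The relation between the specified error and the sample size can be worked through numerically. The sketch below assumes the normal-theory bound $E = z_{\alpha/2}\,\sigma/\sqrt{n}$ with the error expressed in units of $\sigma$; the function names are hypothetical and the rounding up mirrors the ceiling rule noted for Table 3.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_sample_size(error_in_sigma, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return ceil((z / error_in_sigma) ** 2)


def error_bound(n, confidence=0.95):
    """Error bound, in units of sigma, achieved with n samples."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z / sqrt(n)


if __name__ == "__main__":
    print(required_sample_size(0.44))   # sample size for an error of 0.44 sigma
    print(round(error_bound(20), 2))    # error bound with 20 samples
    print(round(error_bound(3), 2))     # error bound with only 3 samples
```

With 20 samples the bound is about 0.44σ, whereas with three samples it exceeds one standard deviation, which gives a sense of how coarse a three-sample estimate of the mean is.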

Sampling scheme and size simulation
In this approach, it was assumed that the air-void contents on a project can be represented by a standard normal N(0, 1) distribution. The data from the N(0, 1) distribution were used to generate five data patterns: random pattern, central segregation pattern, bilateral segregation pattern, central-bilateral segregation pattern, and block segregation pattern (Figure 9). The reasons for selecting these pattern types are as follows:
1. Random pattern: non-segregation, with ideal construction quality.
2. Central segregation pattern: the gap between two augers of an asphalt paver makes coarse aggregate concentrated near the center of the paved area.
3. Bilateral segregation pattern: the gap between the auger and the lateral board of the asphalt paver makes coarse aggregate concentrated near the bilateral regions of the paved area, or provides less compaction of the side area.
4. Central-bilateral segregation pattern: a combined situation of patterns 2 and 3.
5. Block segregation pattern: as demonstrated in gradation segregation, temperature segregation, uneven compaction, etc.
The horizontal segregation strips shown in Figures 9b, 9c, and 9d were randomly generated using the data in the shaded area of the N(0, 1) distribution, which represent higher air-void contents. In the block segregation pattern (Figure 9e), the N(0, 1) distribution was divided into 6 intervals and the data of each interval were randomly distributed into blocks of pavement sections.

The prospective road section was divided into n(X) (x-direction) × n(Y) (y-direction) cells, where n(X) and n(Y) are the numbers of intervals in the x- and y-directions. N points were then assigned to these n(X) × n(Y) cells. Hence, a sampling scheme was defined by n(X), n(Y), and N. For instance, x30y6n30 represents 30 runs that were assigned to 30 cells of the 30 × 6 cells based on the UD table. The sampling schemes considered in this study were combinations of n(X) = 3, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60 and n(Y) = 1, 2, 3, 4, 6, with N = n(X); the cases with n(Y) > n(X) were excluded, resulting in a total of 62 cases. Each case was assigned a UD table with minimum $CD_2$ value. Figures 10a through 10c respectively illustrate the example sampling schemes (i.e., UD tables) x10y6n10, x30y6n30, and x60y6n60 from the uniform design. These sampling schemes are drawn to the same scale for a 900 ft × 24 ft (274 m × 7.32 m) pavement section. Each black rectangular cell represents the area within which one measurement should be sampled randomly. For this sampling simulation, a total of 2700 × 72 points with a standard normal distribution of air-void contents were used to generate the five data patterns, with the following assumptions:
1. Lane width: 24 ft (7.32 m).
2. Time frame of construction: 1 hour with 900 ft (274 m) of HMA placed, i.e., paver speed = 15 ft/min (4.57 m/min).
3. One air-void sample is represented by a 4 × 4-in. (10 × 10-cm) square, with each square assigned a normalized air-void value.
Each type of sampling scheme per data pattern was simulated 200 times. For each simulation, the sample mean and sample standard deviation were calculated. The data of each simulation were randomly drawn, with replacement, from the cells specified in the UD table. Consequently, the distributions of the sample mean and standard deviation were generated after 200 simulations.
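The count of sampling schemes and the size of the simulated grid can be verified by enumeration. In the sketch below, the value n(X) = 50 is included on the assumption that it belongs to the list, since 13 values of n(X) are needed to reproduce both the 62 cases and the 13 panels per row of the Trellis graphs; the names `sampling_schemes` and `grid_shape` are hypothetical.

```python
NX_VALUES = (3, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60)  # 50 assumed
NY_VALUES = (1, 2, 3, 4, 6)


def sampling_schemes():
    """Scheme labels xAyBnN with N = n(X), excluding n(Y) > n(X)."""
    return [
        f"x{nx}y{ny}n{nx}"
        for nx in NX_VALUES
        for ny in NY_VALUES
        if ny <= nx
    ]


def grid_shape(length_ft=900, width_ft=24, cells_per_ft=3):
    """Number of 4 x 4-in. squares along and across the section."""
    return length_ft * cells_per_ft, width_ft * cells_per_ft


if __name__ == "__main__":
    schemes = sampling_schemes()
    print(len(schemes), schemes[:3])
    print(grid_shape())
```

Only three combinations are excluded (n(Y) = 4 and 6 for n(X) = 3, and n(Y) = 6 for n(X) = 5), which leaves 62 of the 65 possible pairs.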
The boxplot was then used to characterize the location and dispersion of the sample means and standard deviations. The boxplot shows a measure of location (the median, drawn as a solid black dot or white strip), a measure of dispersion (the interquartile range, IQR, with the lower quartile at the left or bottom edge of the box and the upper quartile at the right or top edge), and the possible outliers (data points drawn as a light circle or horizontal line beyond 1.5 IQR from the edges of the box; the most extreme data points within 1.5 IQR are marked with square brackets). It also gives an indication of the symmetry or skewness of the distribution. The Trellis graph introduced by Cleveland (Cleveland, 1993) is a graphical way of examining high-dimensional data structure by means of conditional one-, two-, and three-dimensional graphs. As an example, we would like to determine how the sample mean distribution depends on n(X), n(Y), and the data pattern. To inspect this graphically, the simulation results can be split into groups and plotted separately, as opposed to blurring the effects in a single graph. The Trellis graphs of boxplots presented in Figures 11 and 12 were arranged so that each panel contains all the n(Y) = 1, 2, 3, 4, 6 cases (i.e., 5 boxplots in each panel), each row contains all the N = n(X) = 3, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60 cases (i.e., 13 panels in a row) with the same data pattern, and each column has 5 panels (i.e., 5 data patterns) with the same n(X). Thus, for each individual column, the effects of data pattern and n(Y) can be examined at the specified n(X); for each individual row, the effects of n(X) and n(Y) can be inspected at the specified data pattern. Figures 11 and 12 summarize the simulation results for the sample means and sample standard deviations, respectively.
Several observations can be made from the Trellis graphs:
1. As n(X) increases, i.e., as N increases, the variations of the sample mean and standard deviation decrease exponentially regardless of data pattern.
2. For the segregation data patterns 2, 3, and 4, increasing n(Y) reduces the variation for each n(X) and each data pattern. However, no apparent decrease of variation was perceived for the random or block segregation patterns. This implies that the UD table provides a uniform sampling strategy. In practice, it is suggested that n(Y) be as large as possible to cover all the possible data patterns.
3. The distributions of the sample standard deviation at small n(X) are asymmetric and skewed due to the intrinsic properties of [...]

UD demonstration example using two field sections
In this demonstration example, the percent air-void content data of two field pavement sections, each 164 ft (50 m) long and 36 ft (11 m) wide, were acquired with the Pavement Quality Indicator (PQI), a non-nuclear density measurement device calibrated with core samples. The percent air-void content was measured on 3.3 × 3.3-ft (1 × 1-m) squares. These two pavement sections served as the "testing sections" on which the paving operation, compaction pattern/effort, and other construction details were verified and corrected (if necessary) by the contractor. Several performance tests were then conducted comprehensively by the agency to guarantee that the pavement quality of the whole project met the specifications. The material properties of the two pavement sections, AC-13 and AC-20, are as follows.

Pavement section: AC-13; AC-20
Design binder content: 5.6% (AC-13); 4.8% (AC-20)
Target air-void content (both sections): N(μ, σ²) = N(5, 1), i.e., mean 5% and standard deviation 1%
Acceptable air-void content range (both sections): 5 ± 2%, i.e., P(3 ≤ AV ≤ 7) = 0.95 of an N(5, 1) distribution

The measured percent air-void contents are illustrated in Figures 13a and 13b for the AC-13 and AC-20 pavement sections, respectively. As can be seen from the figures, the AC-13 section presents high air-void content on the section edges and seems to have a wide variation of air-void content. The AC-20 section appears to have a more uniform distribution of air-void content.
To illustrate the proposed QA approach, it was decided that 20 points (20 runs) would be sampled to ensure that the agency is 95% confident that the error $|\bar{x} - \mu|$ will not exceed 0.44σ, i.e., 0.44 percent (Table 3). Two UD tables (Figures 13c and 13d) were generated for the two sections, each subdivided into 10 (x-direction) by 11 (y-direction) cells, i.e., x10y11n20. In this case study, the sampling for each UD table was conducted only once. Figures 13e and 13f summarize the sampled, measured, and specified distributions of air-void content. Several findings follow:
1. The sampled distribution based on the UD table matches the measured distribution reasonably well: AC-13 sampled $N(6.29, 1.40^2)$ versus AC-13 measured $N(6.18, 1.43^2)$; AC-20 sampled $N(5.41, 1.22^2)$ versus AC-20 measured $N(5.12, 1.24^2)$.
2. The sample mean of the AC-13 section, 6.29, is outside the 95% CI (4.56, 5.44) (Table 3); therefore, it is identified as an "inaccurate" distribution. The sample standard deviation, 1.40, exceeds the 95% one-sided upper bound of 1.26 (Table 3); thus, it is designated as an "imprecise" distribution. As a result, the construction quality of the AC-13 section is not acceptable because of its "inaccurate" and "imprecise" distribution.
3. In contrast, the construction quality of the AC-20 section is not rejected because of its "accurate" and "precise" distribution: the sample mean 5.41 lies within the 95% CI, although on the high side, and the sample standard deviation 1.22 is slightly less than the 95% one-sided upper bound of 1.26.
[Figures 14a and 14b: the situations (a) δ > 0 and (b) δ < 0, respectively, in which $T_0$ is noncentral.]
In sum, by taking only three samples out of a project, the agency will have insufficient power to reject $H_0: \mu = \mu_0$ given that $H_0$ is false, unless the quality of the project delivered by the contractor is so poor that the agency is confident enough to reject the project.
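The accuracy and precision criteria applied to the two sections can be expressed as a short check. The sketch below uses the limits quoted from Table 3 for 20 samples, a 95% CI of (4.56, 5.44) on the mean and a one-sided upper bound of 1.26 on the standard deviation; the function name `qa_decision` and the returned dictionary layout are hypothetical.

```python
def qa_decision(sample_mean, sample_sd, ci=(4.56, 5.44), sd_upper=1.26):
    """Classify a sampled distribution against the mean and SD criteria."""
    accurate = ci[0] <= sample_mean <= ci[1]   # mean inside the 95% CI
    precise = sample_sd < sd_upper             # SD below the upper bound
    return {
        "accurate": accurate,
        "precise": precise,
        "accept": accurate and precise,        # both criteria must hold
    }


if __name__ == "__main__":
    print("AC-13", qa_decision(6.29, 1.40))
    print("AC-20", qa_decision(5.41, 1.22))
```

Applied to the sampled values above, the AC-13 section fails both criteria and the AC-20 section passes both, matching findings 2 and 3.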

Findings and conclusions
For the Case I study, an attempt has been made to illustrate an approach, and the extent of testing required, using a performance test to ensure reasonable quality in as-placed HMA. Stabilometer S-value test results were used in this example since extensive data were available. It should be emphasized that the same approach could be applied using other test parameters to control the quality of the as-constructed mix. Based on the stabilometer test results, the brief discussion of hypothesis testing, and the simulation results for sampling scheme and size, the following observations and suggestions are offered:
1. Cooperation between the agency and the contractor is essential. The testing process, test equipment, test results, and specimen preparation should be as consistent as possible between the two organizations.
2. The sampling simulation of the Case I demonstration example suggests that the sample size required to stabilize the sampling consistency and sampling stabilization is around 50 to 70 for the placement of 15,000 tons of HMA.
3. Sampling at the level noted in (2) is perhaps impractical. However, increasing the sample size is beneficial for both the agency and the contractor since it reduces the potential for dispute and guarantees the quality of the constructed mix. By extension, it is advisable for the agency to provide incentives that encourage the contractor to increase sampling size and testing.
4. To ensure the success of the proposed QC/QA guidelines, the contractor's minimum value of the testing null hypothesis must exceed that required by the agency.
5. From the Caltrans case study, the $\mu_{min}$ criterion depended not only on the contractor's $\alpha$ value and the agency's power level, as expected, but also on the k value that the agency would select for use. The $\mu_{min}$ criterion can be smaller if both the agency and the contractor require a low power level and a high $\alpha$ level and/or the agency increases the k value.
A concluding general observation relates to the current concern for developing longer-lasting pavements, given the increased costs of pavement materials and the increased traffic that must be accommodated. The added costs of testing by both the contractor and the agency are a very small proportion of the total costs associated with long-lasting pavements. Accordingly, an "attitude adjustment" by both parties relative to QC and QA testing would enhance long-term pavement performance.
From the above discussion of Case II for determining sample size, the simulation results for sampling size and sampling scheme using UD tables, and the demonstration example, the following observations and suggestions are offered:
1. It is important to recognize that the agency can be [...]. The variations of sample mean and sample standard deviation for the 900 ft HMA paving simulation (Figures 8, 11, and 12) suggest that the minimum sample size required to stabilize the variation is around 20 to 30.
2. The UD table not only provides the most representative sampling scheme with the sample size for an error level specified by the agency but also minimizes the possible effect of the underlying data pattern. Moreover, the UD table gives the agency a more unbiased "random" sampling scheme that can be followed in the quality assurance process.
3. The sample mean and sample standard deviation criteria proposed in the QA guideline express the accurate/inaccurate and precise/imprecise concept of sampling outcomes. If the sample mean lies within the $100(1-\alpha)\%$ confidence interval, it is accurate. Precision describes the degree of data dispersion; if the sample standard deviation is less than the $100(1-\alpha)\%$ one-sided upper bound, it is precise. The case study presents a clear example of an inaccurate/imprecise case (the AC-13 field section) and an accurate/precise case (the AC-20 field section). The quality of a project can be accepted if and only if both criteria are fulfilled simultaneously.
4. The proposed QA guideline with the introduction of the UD table is relatively simple, practical, and robust. The sample mean and sample standard deviation criteria are rational enough for both the agency and the contractor to agree upon.
5. It should be emphasized that the proposed QA approach could be applied with other performance measurement parameters to control the quality of the as-constructed mix, such as thickness, stabilometer testing as used in California, and performance testing of fatigue and rutting. Moreover, decision making based on this QA approach can also be a basis for pay factor determination.
6. By taking only three samples out of a project, the agency will have insufficient power to reject $H_0: \mu = \mu_0$ given that $H_0$ is false, unless the quality of the project delivered by the contractor is so poor that the agency is confident enough to reject the project. However, by increasing the sample size from three to five, that is, by adding only two samples, the agency can detect a smaller mean difference, from 2.30S down to 1.37S.
7. It is likely that the proposed sampling size is impractical. In this regard, the alternative is to establish a "testing section" similar to those in the case study and follow the proposed QA approach with the minimum sampling size (at least greater than 20) to ensure that the compaction pattern/effort, paving operation, and other construction details are appropriate to guarantee that the pavement quality meets the specifications.
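The low power of a three-sample test noted in observation 6 can be illustrated with a small Monte Carlo sketch. It assumes a two-sided one-sample t-test of $H_0: \mu = \mu_0$ at the 5% level with $\mu_0 = 0$ and unit standard deviation; the power levels behind the 2.30S and 1.37S figures are not stated in the text, so those values are not reproduced here. The function name, seed, and trial count are hypothetical; the critical values are the standard two-sided 5% t quantiles for 2 and 4 degrees of freedom.

```python
import random
from math import sqrt
from statistics import mean, stdev

# Two-sided 5% critical values of the t distribution (df = n - 1).
T_CRIT = {3: 4.303, 5: 2.776}


def simulated_power(n, delta, trials=5_000, seed=1):
    """Fraction of simulated samples that reject H0: mu = 0.

    delta is the true mean shift in units of the standard deviation.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(delta, 1.0) for _ in range(n)]
        t0 = mean(sample) * sqrt(n) / stdev(sample)   # one-sample t statistic
        if abs(t0) > T_CRIT[n]:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    for n in (3, 5):
        print(n, simulated_power(n, delta=1.0))
```

For the same shift in the mean, the rejection rate with five samples is noticeably higher than with three, which is the qualitative point of observation 6.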