Compaction and clay content control mudrock porosity

Small-angle (SANS) and very small-angle neutron scattering (VSANS) experiments were conducted on 13 diverse mudrock sets, characterised by differences in mineralogy, stratigraphy, maturity, and depositional environment. We performed multivariate statistics to systematically characterise the pore structure in 71 samples across a 5 μm – 2 nm pore size range. Our results indicate that a multivariate approach captures the complex controls on porosity more effectively than single parameters. Compaction and clay content emerge as the key primary and secondary controls on mudrock porosity, respectively, on which we base a new porosity classification. Our complementary experimental-statistical assessment involving SANS-derived multiscale porosity sheds new light on the influence of structural controls on storage or production capacity in mudrocks.


Introduction
Mudrocks are low-permeability sedimentary rocks with multiple applications in the energy transition. They form seals for carbon dioxide and energy (e.g., hydrogen and methane) storage, repositories for radioactive waste disposal, and serve as reservoirs for large quantities of unconventional oil and natural gas [1]. To assess the feasibility of mudrocks as seals or reservoirs, the pore structure needs to be characterised in detail because it controls fluid flow properties and transport phenomena [2,3]. In our study, organic rich mudrocks (ORM) are characterised by total organic carbon (TOC) contents of >2 % [4] and are source rocks and potentially also unconventional reservoirs for hydrocarbons. Organic lean mudrocks (OLM) with TOC <2 % are not regarded as source rocks or unconventional reservoirs. These fine-grained sediments are mainly composed of clay minerals, quartz, carbonates, and organic matter [5]. The pore structure of mudrocks consists of matrix inter- and intra-particle void space and organic matter intraparticle pores [6]. According to the International Union of Pure and Applied Chemistry (IUPAC) pore size classification, macropores are >50 nm, mesopores 50 nm-2 nm, and micropores <2 nm in diameter [7]. Pore sizes in mudrocks generally range from a few micrometres to sub-nanometre and can be captured using a combination of small-angle (SANS) and very small-angle neutron scattering (VSANS) [8]. Whereas destructive methods are limited to the pore space accessible to the fluid used or by the imaging resolution [9,10], (V)SANS as a non-destructive technique investigates total porosity, pore connectivity, and the development of porosity in mudrocks [11–13].
Even though our understanding of the pore structure of mudrocks has greatly improved recently [14–16], there are still significant gaps in linking mineralogy, diagenesis, or compaction with pore size distribution (PSD), connectivity, and fluid flow. Pore structures vary substantially between organic lean and organic rich mudrocks [17]. Different depositional environments, the presence of organic matter and the hydrocarbons generated from it, and variations in the dominant geological controls during and after sedimentation all contribute to variations in PSD [18,19]. Key structural features of mudrocks include clay aggregates, particularly illite + smectite groups (I + S), and organic matter, which accommodate anisotropic and heterogeneous pore structures within mudrocks and thus control PSD at the meso- and microscale [20,21].
Clay minerals play a pivotal role in the porosity distribution within mudrocks [17]. A significant portion of mudrock porosity, ranging from 20 to 40 %, is directly associated with clay content and is predominantly linked to meso- to micropores in the matrix [22]. Specific clay minerals such as chlorite and kaolinite contribute largely to microporosity, whereas illite predominantly constitutes mesopore volumes [23]. This clay-hosted pore network is instrumental in defining the texture of mudrocks, especially at micro- and mesopore sizes, and has a direct impact on the porosity of thermally mature samples [24]. Furthermore, the overall porosity is positively correlated with the presence of clay minerals. Notably, clay minerals like illite exhibit significant mesoporosity, leading to a consistent, unimodal pore size distribution [25].
Compaction during sediment burial is pivotal for a holistic understanding of porosity and its alteration with fluctuating clay content and organic matter in mudrocks. Total porosity decreases from the oil window to the gas window. This decrease is attributed to compaction, resulting in the loss of both organic and inorganic matter pore volume [17]. The rigid carbonate and/or silicate grain framework preserves interparticle macroporosity against compaction, while compaction reduces intraparticle porosity in the organic matter [26,27]. Mudrock compaction comprises mechanical and chemical processes [28]. Mechanical compaction, which takes place shortly after deposition, is caused by the rise in overburden pressure due to accumulating sediments. Here, porosity reduction relates to the realignment and denser packing of grains, predominantly clay minerals [6,28]. As burial progresses, chemical compaction takes precedence. Chemical reactions, such as cementation and dissolution, lead to mineral precipitation within the pore space, further diminishing porosity [28,29]. The transition between mechanical and chemical compaction is marked by a critical burial depth of 2.5 to 3.0 km or temperatures between 80 and 100 °C [28–30]. While mechanical compaction of uncemented sediments at relatively shallow burial depths adheres to the principles of soil mechanics (e.g., effective stress), chemical compaction becomes dominant beyond the critical burial depth [28]. This is contingent upon the sediment's temperature trajectory and its mineralogical and textural constitution [28]. Integrating these controls in the analysis of the pore structure of mudrocks should facilitate the understanding of transport phenomena.
In this study, we employed SANS and VSANS experiments and included data from Rezaeyan et al. [31] to quantitatively capture the multiscale pore structure of 13 sets of mudrocks between 5 μm and 2 nm, e.g., porosity, specific surface area (SSA), and fractal dimensions. Our study encompasses a diverse range of mudrocks, taking into account a variety of geological aspects such as depositional environments, geological ages, and lithofacial distinctions. By employing SANS, we assess porosity across a spectrum of pore sizes, providing not just one single porosity value but a multifaceted understanding of the porosity control mechanisms within mudrocks. While previous studies have often linked porosity to single controls (e.g., clay content, TOC), we recognise that porosity is influenced by a combination of factors. We performed bivariate and multivariate statistical analyses, including principal component analysis (PCA) and multilinear regression (MLR), to identify and quantify the interrelationship between pore characteristics, mineralogical properties, and geological-chemical-mechanical controls on the overall pore sizes as well as individual porosity fractions. Our work emphasises that a holistic consideration of all controls is crucial when determining petrophysical properties. From our characterisations, we develop a predictive porosity model for mudrocks based on the most significant controls, providing a refined estimation of porosity where measurements are limited. We also introduce a new categorical porosity model, which emerges from the principal controls on porosity, presenting a nuanced perspective on mudrock porosity influenced by common lithologies. Considered together, our analyses provide novel insights into the influence of mineralogical-petrophysical-geomechanical-geochemical controls on porosity in mudrocks at different scales, enhancing our understanding of the factors controlling flow and transport in mudrocks, which is important for evaluating sealing integrity for gas storage, mudrocks as a repository for radioactive waste, or mudrocks as an unconventional reservoir.

Samples
Experiments to characterise pore structure were carried out on two types of mudrocks, comprising 40 organic lean and 31 organic rich mudrock samples and covering a wide range of lithology, age, depositional environment, and maximum burial depth (Table 1).

Mineralogical and geochemical analyses
Bulk mineralogical compositions were derived from X-ray diffraction patterns of randomly oriented powder preparations of Opalinus, Carmel, Big Hole, Entrada, Posidonia, Bossier, Haynesville, Eagle Ford, Newark, and Jordan samples, recorded on a Bruker D8 diffractometer using CuKα radiation produced at 40 kV and 40 mA. Mineralogical information for Carboniferous samples is taken from Rezaeyan et al. [31] and for Boom Clay samples from Jacops et al. [32]. Våle shale samples and mineralogical information were provided by Norske Shell, Norway. TOC contents were measured on powdered samples with a LECO RC-412 Multiphase Carbon/Hydrogen/Moisture Determinator. TOC quantifies the amount of organic matter present in mudrocks, indicating their potential for hydrocarbon generation. Vitrinite reflectance (VR_r) was determined to obtain the maturity levels reached by organic rich samples during burial diagenesis. Details of the mineralogical and geochemical properties of the mudrocks as well as analytical techniques are provided in SI (S2.1 and S2.2).

Small-angle and very small-angle neutron scattering
SANS scattering curves for mudrocks contain statistical information on the pore structures. SANS experiments at ambient pressure and temperature conditions were conducted at the Heinz Maier-Leibnitz Zentrum (MLZ) in Garching, Germany. Air-dried samples were cut parallel to bedding, fixed on quartz glass carriers, and polished to a thickness of 0.2 mm. We used the KWS-3 instrument operated by the Jülich Centre for Neutron Science (JCNS) at MLZ to obtain VSANS data of samples, covering pore sizes of 5 μm-250 nm. Data at KWS-3 were collected at λ = 12.8 Å (with a wavelength distribution of the velocity selector Δλ/λ = 0.2) and a sample-to-detector distance of 9.5 m, covering a Q-range from 0.0024 to 0.00016 Å⁻¹ [33]. The KWS-1 instrument, operated by JCNS at MLZ, provided SANS data of samples at pore sizes of 250 nm-1 nm. SANS data at KWS-1 were collected at a λ of 6 Å (Δλ/λ = 0.1; full width at half maximum). Measurements were performed at sample-to-detector distances of 1.2, 7.7, and 19.7 m, covering a Q-range of 0.002-0.35 Å⁻¹ [34]. Data reduction was carried out using the QtiKWS software. The data processing and analysis were carried out using our MATSAS software [35], which provides a comprehensive suite of characteristic factors associated with pore space, including porosity, specific surface area (SSA), and fractal dimensions. Porosities of mudrocks obtained from SANS measurements were not determined under loading and unloading stress conditions. Full experimental and analytical information is provided in SI (S2.3).
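As a rough cross-check of the quoted pore-size windows, a scattering vector Q can be mapped to a real-space length with the Bragg-type relation d ≈ 2π/Q. The following is a minimal sketch under that assumption; the relation and the function name `q_to_size_nm` are illustrative and are not taken from the MATSAS workflow, which may use a different conversion.

```python
import math


def q_to_size_nm(q_inv_angstrom: float) -> float:
    """Approximate real-space length d = 2*pi/Q, converted from Å to nm.

    The 2*pi/Q relation is an assumed rule of thumb, not the paper's method.
    """
    if q_inv_angstrom <= 0:
        raise ValueError("Q must be positive")
    return 2.0 * math.pi / q_inv_angstrom / 10.0  # 10 Å = 1 nm


# Q-limits quoted in the text for KWS-3 (VSANS) and KWS-1 (SANS)
for name, q_min, q_max in [("KWS-3", 0.00016, 0.0024), ("KWS-1", 0.002, 0.35)]:
    smallest = q_to_size_nm(q_max)  # high Q probes small length scales
    largest = q_to_size_nm(q_min)   # low Q probes large length scales
    print(f"{name}: about {smallest:.1f} nm to {largest:.0f} nm")
```

Under this assumed relation, the limits fall in the same order of magnitude as the pore-size windows stated for the two instruments, which is all such a rule of thumb can be expected to show.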

Geomechanical analysis
Mudrock porosity is influenced by compaction (S_v) at maximum burial depth (D_max). S_v is represented by the effective vertical stress (σ_v) at D_max. To obtain information on S_v, we estimated D_max in three different ways (SI, S2.4 and Table S3): 1) from basin modelling studies; 2) from the maximum burial temperature, following Forrest et al. [36]:
$$D_{max} = \frac{T_{max} - T_{surf}}{\Delta T / \Delta D},$$
where T_max and T_surf are the maximum burial and surface temperatures [°C] and ΔT/ΔD is the geothermal gradient, for which an average of 25 °C/km was assumed; 3) from the same relation with T_max derived from VR_r, where VR_r is the mean random vitrinite reflectance [%], following Barker and Pawlewicz [37].
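The temperature-based estimate of D_max can be illustrated with a short sketch. The function name `burial_depth_km` and the example temperatures are hypothetical; only the relation itself and the 25 °C/km average gradient come from the text.

```python
def burial_depth_km(t_max_c: float, t_surf_c: float,
                    gradient_c_per_km: float = 25.0) -> float:
    """Maximum burial depth from D_max = (T_max - T_surf) / (dT/dD).

    Temperatures are in °C and the geothermal gradient is in °C/km,
    so the result is in km.
    """
    if gradient_c_per_km <= 0:
        raise ValueError("geothermal gradient must be positive")
    return (t_max_c - t_surf_c) / gradient_c_per_km


# Hypothetical example: 100 °C maximum burial temperature, 10 °C at the surface
print(f"D_max = {burial_depth_km(100.0, 10.0):.1f} km")
```

For the vitrinite-reflectance route, T_max would first be derived from VR_r with the calibration of Barker and Pawlewicz [37] and then passed to the same function.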

Statistical analysis
We assembled a large data set of mineralogical-petrophysical-geomechanical-geochemical variables from the analyses summarised above. The controls on mudrock porosity include compaction (S_v), clay (x_Clay), quartz (x_Quartz), carbonate (x_Carbonate), and total organic carbon contents (x_TOC), fractal dimension (D_f), cementation exponent (m_c), specific surface area (SSA), as well as three porosity fractions: φ_1 representing pores from 10 nm-2 nm, φ_2 from 50 nm-10 nm, and φ_3 from 5 μm-50 nm. m_c represents the degree of cementation between load-carrying grains [38] and was obtained from fractal models described in Ref. [39].
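The three porosity fractions amount to a simple binning of pore diameter. The sketch below shows that binning; the function name and the handling of the shared boundaries (10 nm and 50 nm assigned to the larger class) are assumptions, since the text does not state which side is inclusive.

```python
from typing import Optional


def porosity_fraction(diameter_nm: float) -> Optional[str]:
    """Assign a pore diameter (nm) to the fraction phi_1, phi_2, or phi_3.

    phi_1: 2-10 nm, phi_2: 10-50 nm, phi_3: 50 nm-5 um (5000 nm).
    Boundary values go to the larger class (an assumption).
    Returns None outside the 2 nm-5 um window covered by (V)SANS.
    """
    if 2.0 <= diameter_nm < 10.0:
        return "phi_1"
    if 10.0 <= diameter_nm < 50.0:
        return "phi_2"
    if 50.0 <= diameter_nm <= 5000.0:
        return "phi_3"
    return None


for d in (5.0, 25.0, 800.0, 1.0):
    print(d, "nm ->", porosity_fraction(d))
```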
Pre-processing: Pre-processing of the data set was done to guarantee consistency and to avoid spurious correlations in the statistical analysis. Porosity fractions were transformed using the standard centred log ratio transform [40], and the mineralogical and TOC compositions as well as D_f and m_c using the centred log ratio with Box-Cox transformation [41], following procedures published previously [42]. In addition, variables such as S_v and SSA showed a non-linear relationship with porosity and therefore required a logarithmic transformation (SI, S2.5, Table S4).
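The standard centred log ratio (clr) transform divides each part of a composition by the geometric mean of all parts and takes the logarithm. The following is a minimal sketch of that standard definition; the example composition is hypothetical, and the Box-Cox variant used for the mineralogical data is not reproduced here.

```python
import math
from typing import List, Sequence


def clr(parts: Sequence[float]) -> List[float]:
    """Centred log ratio: clr_i = ln(x_i) - mean(ln(x)).

    Equivalent to ln(x_i / g(x)), with g(x) the geometric mean.
    All parts must be strictly positive.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical split of total porosity into three fractions
fractions = [0.6, 0.25, 0.15]
print([round(v, 3) for v in clr(fractions)])
```

The transformed values sum to zero and do not depend on the overall scale of the composition, which is why the transform helps to avoid spurious correlations between parts of a closed sum.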
Bivariate Analysis: a bivariate Pearson's r correlation analysis was performed to explore the relations between all dependent and independent variables.
Principal component analysis (PCA): PCA was applied to the database to create an orthogonal transformation of the high dimensionality of porosity to lower dimensions along ordered lines of maximum variance while projecting all variables into a 2D space. PCA investigates a vector v of p variables. PCA starts by identifying a linear function α′_1 v that maximises variance:
$$\alpha_1' v = \alpha_{11} v_1 + \alpha_{12} v_2 + \dots + \alpha_{1p} v_p = \sum_{j=1}^{p} \alpha_{1j} v_j,$$
where α′_1 is a transposed vector of p constants α_11, α_12, …, and α_1p. Following this, additional linear functions α′_2 v, α′_3 v, and so on, are identified. These linear functions, known as principal components (PCs), maximise variance while being uncorrelated with each other. PCs condense most of the variation in v into far fewer components than p. A comprehensive study on the application of PCA is provided in Jolliffe [44]. To assess the controlling factors of the pore structure, all eleven log-normalised variables obtained were analysed for all mudrocks. The scores were standardised to have a mean of 0 and a standard deviation of 1. Eigenvalues of the correlation matrix, percentages of variance, and cumulative variance of principal components are provided in SI (Table S5). The scree plot showed four PCs with eigenvalues ≥1 (SI, Fig. S3). With four PCs retained in the PCA, a cumulative variance of 91.73 % of the data set was attained. The bivariate analysis shows that these four PCs are independent, which implies that the PCA results are not dominated by one or two controls (SI, Fig. S4). PCA only allows identification of the controlling parameters for mudrock porosity; it does not help to quantify them.
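Retaining PCs with eigenvalues ≥1 and reporting their cumulative variance can be sketched in a few lines. The eigenvalues below are invented purely for illustration (they sum to 11, as they must for a correlation matrix of eleven standardised variables) and are not the values in SI Table S5; the function name is also hypothetical.

```python
from typing import Sequence, Tuple


def retain_components(eigenvalues: Sequence[float],
                      threshold: float = 1.0) -> Tuple[int, float]:
    """Count PCs with eigenvalue >= threshold and their share of total variance."""
    total = sum(eigenvalues)
    kept = [ev for ev in eigenvalues if ev >= threshold]
    return len(kept), sum(kept) / total


# Hypothetical eigenvalues of an 11 x 11 correlation matrix (NOT the paper's data)
example = [4.0, 3.0, 1.5, 1.1, 0.6, 0.3, 0.2, 0.1, 0.1, 0.06, 0.04]
n_kept, share = retain_components(example)
print(f"{n_kept} PCs retained, explaining {100 * share:.1f} % of the variance")
```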
Multilinear Regression Analysis (MLR): we employed MLR to quantify interrelations among individual controls with a focus on the regression coefficient (RC). MLR specifies a multilinear relationship between a dependent variable (Y) and a set of independent variables (X) such as:
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n + e,$$
where b_0 is the intercept, b_1 to b_n are the regression coefficients, and e denotes the residual error [45]. We performed MLR on porosity fractions (φ_1, φ_2, and φ_3) as dependent variables using the most dominant independent variables, including S_v, x_Clay, x_Carbonate, x_TOC, and SSA, for all mudrocks. The selection of the dominant independent variables was based on maximum loading on the first two PCs. Furthermore, we performed MLR on total porosity (φ) using the same set of independent variables. We aimed at predicting porosity as accurately as possible while keeping the predictive porosity models statistically significant [45]. For simplicity, in the following we denote our MLR models as φ-models. Each φ-model is based on assumptions that justify the use of least squares estimation [45]. We checked the four φ-models to avoid any violation of the assumptions and to ensure that the model presented is the singular model that could be fitted. This φ-model check is provided in SI (S2.5). Dependent variables are continuous; independent variables are continuous except for S_v, which was mostly linear after transformation. The requirement of no collinearity was met, as evidenced by variance inflation factors (VIF) close to 1, which indicates that the independent variables are uncorrelated (SI, Table S6). The studentised residuals (i.e., residuals divided by an estimate of their standard deviation) appeared normally distributed, met linearity, and satisfied homoscedasticity (i.e., the variance of the dependent variable is the same for all the data; SI, Figs. S5 and S6). Leverage points were removed, and hypothesis tests were performed (SI, Table S7). The F-test indicated high statistical significance of the overall relationship for the individual φ-models, with tail probability values (P values) of zero for the φ_1-, φ_2-, and φ-models and 10⁻¹³ for the φ_3-model. The t-tests showed that each independent variable provided additional predictive power, with P values < 0.05 for the individual φ-models. R² and adjusted-R² values are statistically significant for the individual φ-models.
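The collinearity check can be illustrated for the simplest case of two predictors, where the variance inflation factor reduces to VIF = 1/(1 − r²) with r being Pearson's correlation between them. This is a sketch of that textbook special case with invented numbers; with more than two predictors, r² is replaced by the R² obtained from regressing each predictor on all the others.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1: Sequence[float], x2: Sequence[float]) -> float:
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical, uncorrelated predictor values (not data from this study)
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, -1.0, 1.0]
print("r =", pearson_r(x1, x2), "VIF =", vif_two_predictors(x1, x2))
```

A VIF of 1 corresponds to uncorrelated predictors, and values grow without bound as the predictors become collinear.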

Porosity, specific surface area, and fractal dimension
Porosity and SSA for pore sizes between 5 μm-2 nm are illustrated in Fig. 1-A, subdivided into macropore and mesopore sizes according to IUPAC [7] as well as pores from 10 nm-2 nm, which are referred to as nanopores in the following. Nanopores are considered separately to observe their contribution to porosity. Mudrocks show a log-normal distribution for both porosity (total and subgroups) and SSA. For all mudrocks except Carmel, the macroporosity is larger than the corresponding mesoporosity. In general, macropores make up only ~1 % of the total SSA. The majority of SSA and total porosity resides in nanopores, which make up ~98 % and ~60 %, respectively; see supporting information (SI, S3.1), Table S8. Nanopores are well-connected and oriented along bedding, resulting in low and anisotropic permeability [14] and influencing gas flow, which ranges from transitional to diffusional regimes [31].
In this study, organic lean mudrocks (OLM) have maximum burial depths of <3 km. Their porosity and SSA vary significantly (Fig. 1-A): while Opalinus, Boom, and Våle have high porosities in the range of 20-38 %, Carmel, Big Hole, and Entrada have low porosities with values between ~3 and 8 %. Våle Shale has the highest SSA (~52 m²/g) and Big Hole and Entrada the lowest (~10 m²/g). Opalinus, Boom, Carmel, and Entrada have intermediate SSAs of ~17-39 m²/g. Organic rich mudrocks (ORM) are difficult to differentiate (Fig. 1-A). They have maximum burial depths of >3 km. Posidonia Shale samples show very similar porosity (6.1-6.5 %) and SSA values (6.9-8.6 m²/g) even though they have quite different maturity levels. Carboniferous Shales feature similar maturity but a wide porosity range of 2.7-10.7 % and SSA of 4.3-21.6 m²/g. Bossier and Haynesville samples differ in φ and SSA, with values between 2.7 and 14 % and 2.8-21.5 m²/g for Bossier and 3.7-19.8 % and 4.4-23.8 m²/g for Haynesville. Eagle Ford samples possess similar porosities (~5.5 %) and SSAs (~4.5 m²/g). Jordan and Newark shales have porosities and SSAs of 7.7-10.3 % and 4.1-6.1 m²/g and 3.3 % and 5 m²/g, respectively. While high porosity is indicative of gas storage capacity in macropores, high SSA is critical for sorption-based gas storage in nanopores. This highlights the significance of pores smaller than 10 nm in sorption processes and flow dynamics. However, a high porosity does not inherently improve fluid conductivity, since permeability is primarily determined by the dimensions of pore throats [46].
The fractal dimensions of mudrocks determined by SANS are presented in Fig. 1-B. The scattering intensity I decays with Q^(−m) with variable power-law exponents m; m is related to the dimensionality of the pore network based on the concept of fractality [47]. D_f represents the fractal dimension across all pore scales, D_s is the surface fractal defined at the mesoscale, and D_p the pore fractal at the macroscale [35]. For a pore fractal scatterer D_p = m, with values 1 < D_p < 3, and for a surface fractal D_s = 6 − m, with values 2 ≤ D_s ≤ 3 [48], and D_s = D_f [35]. The fractal values tend to a minimum for smooth planar homogeneous systems and towards the maximum for rough heterogeneous systems [8].
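The mapping from the power-law exponent m to a fractal dimension follows directly from the two ranges given above: 1 < m < 3 gives a pore fractal with D_p = m, and 2 ≤ D_s ≤ 3 with D_s = 6 − m implies 3 ≤ m ≤ 4 for a surface fractal. The sketch below encodes only these relations; the function name and return labels are illustrative.

```python
from typing import Optional, Tuple


def fractal_from_exponent(m: float) -> Tuple[str, Optional[float]]:
    """Classify a SANS power-law exponent m (I ~ Q^-m) as pore or surface fractal.

    Pore fractal:    1 < m < 3,   D_p = m
    Surface fractal: 3 <= m <= 4, D_s = 6 - m (so that 2 <= D_s <= 3)
    Anything else falls outside the fractal ranges.
    """
    if 1.0 < m < 3.0:
        return "pore", m
    if 3.0 <= m <= 4.0:
        return "surface", 6.0 - m
    return "outside", None


for exponent in (2.5, 3.2, 4.5):
    print(exponent, "->", fractal_from_exponent(exponent))
```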
Fractal dimensions are distributed differently between OLM and ORM samples (Fig. 1-B). D_f values are comparable in both mudrock groups: ~2.8 for OLM and ~2.85 for ORM. D_s of OLM is higher than that of ORM (~2.82 compared to ~2.67, respectively). In contrast, ORM samples have higher D_p values (~2.7) than OLM (~2.55). The fractal dimensions do not always stay within the theoretical ranges (SI, Table S8), which is in accordance with previous studies [49,50]. This is because mudrocks are not self-similar across the entire pore size range due to combined fine- and coarse-grained organic and inorganic aggregates. The application of a fractal model is, however, a very useful approximation. Fractal dimensions provide structural information on the surface roughness of pore networks that can be used to predict fluid flow and migration properties at micro, meso, and macro length scales [51]. The full characterisations obtained from (V)SANS analysis are provided in SI, Table S8.

Clay content control on porosity
The pore orientation is largely controlled by the relative orientation of clayey particles (typically random to normal to overburden stress), resulting in a direct control on matrix permeability. Permeability can decrease to nano-Darcy ranges due to the preferential orientation of small pores where chaotic microfabrics turn into lamination [24]. Clay sheets accommodate significant porosity and SSA, with direct implications for production and/or storage. Although previous studies have shown a positive correlation between porosity and clay content [23,52,53], this finding is not confirmed by our study for either the entire dataset or sets of samples from the same location (Fig. 2). Instead, low porosity mudrocks are associated with high clay contents, as observed for Carmel, Bossier, and Haynesville. We further find that neither SSA and fractal dimensions (SI, Table S8) nor nanoporosity and nanoSSA (SI, Fig. S7) can be explained by clay content. This might be due to the wide range of influencing factors, such as clay composition, organic matter content, compaction, or diagenesis. Although numerous previous studies aimed for a single parameter control on the pore structure of mudrocks (e.g., mineralogy, organic matter content, maturity) [16,18,22,23,53–60], they failed to account for these inconsistencies.
Therefore, a multivariable approach is needed to qualitatively or quantitatively capture the complex controls on porosity.

Statistical multivariate analyses of porosity
The results for the PCA models are provided in SI, Table S9. The PC functions are available in SI, S3.2. PCA suggests that mineralogy, compaction, and SSA are key controls on pore structure (Fig. 3). The first principal component (PC1) shows that SSA and clay content influence the pore structure at the nanoscale (10 nm-2 nm: φ_1). Clay content significantly contributes to the formation of nanopores within mudrocks, which leads to an increase in SSA [24]. While the intraparticle pore nature of the clayey matrix can impede directional fluid flow, it simultaneously enhances the capillary sealing capacity. PC1 also describes the texture and fabric for φ_3 (5 μm-50 nm), which can be related to compaction and carbonate and quartz contents. The second principal component (PC2) suggests that organic matter controls the textural characteristics of mudrocks at pore sizes of 50 nm-10 nm (φ_2), while cementation influences the texture of mudrocks mainly for pores between 50 nm-10 nm (φ_2), followed by pores between 5 μm-50 nm (φ_3). The meso- and macropores control the flow characteristics in mudrocks [61], which are influenced by compaction as well as carbonate and quartz contents. PCA shows two separate clusters for OLM and one cluster for ORM (Fig. 3), which might indicate differences between mudrocks. Although Opalinus, Boom, and Våle originate from different depositional settings with different mineralogical compositions, they show comparable pore structures. In contrast, Bossier and Haynesville were deposited in the same area with similar textural features and maximum burial depths [27,62]; however, these formations are scattered within the ORM cluster, indicating strong heterogeneity, possibly due to post-depositional processes. Furthermore, with their eigenvectors remaining in the ORM cluster, coarse-grained minerals (quartz and carbonates) as well as compaction are two key characteristics of macroporosity (φ_3) in ORM. The clustered samples related to these eigenvectors show preservation of macroporosity due to compaction.
The multilinear regression equations obtained for the three φ-models are available in SI, S3.3. The correlation coefficient (CC), calculated using bivariate Pearson's r correlation analysis, represents the strength and direction of the linear relationships between pairs of controls (e.g., S_v vs x_Clay), which are crucial for identifying patterns and making predictions about mudrock porosity. The CC results indicate that compaction changes porosity mainly by influencing organic matter hosted porosity rather than clay and/or carbonate hosted porosity, as indicated by CC values of 0.64, −0.4, and 0.3 for the interrelationship of S_v with x_TOC, x_Clay, and x_Carbonate, respectively (Table 2). A coarse-grained matrix provides resistance against pore compaction, indicating that carbonate content is not the sole determinant of porosity preservation. Additionally, porosity within organic matter shows an increase with maturity, highlighting compaction processes that effectively develop porosity in organic-rich zones [6,22,27]. As shown by a positive CC value (0.6) between x_Clay and SSA, clay sheets contribute to SSA in mudrocks, whereas organic matter and carbonate do not, as indicated by negative CCs (Table 2). The regression coefficient (RC) denotes the expected variation in mudrock porosity in response to a one-unit change in a porosity control, assuming the other controls remain constant. RCs help to develop predictive models and elucidate the specific contribution of each porosity control to variations in mudrock porosity. The RC results show that SSA is a statistically significant predictor of mesoporosity (pores between 50 nm-2 nm), with an RC of 0.76 for the φ_1-model and 0.63 for the φ_2-model, and that compaction is a significant predictor of macroporosity (pores between 5 μm-50 nm), with an RC of −0.43 for the φ_3-model (Table 2). Rather unexpectedly, clay and carbonate contents only have a secondary control on porosity due to their lower RCs for the three φ-models compared to SSA and compaction. Among all predictors, carbonate with an RC of 0.23 appears to contribute to macroporosity (φ_3-model), followed by TOC with an RC of 0.1, while clay content has an insignificant contribution. Porosity at nanopore sizes (φ_1-model) is unlikely to be controlled by organic matter but rather by clay minerals, as indicated by RCs of 0.04 and 0.17 for x_TOC and x_Clay, respectively (Table 2). In general, a porosity decrease can be attributed to compaction across the entire pore size range, although its statistical influence on porosity is reduced by 25 % from macropores to nanopores. Mechanical compaction only effectively controls porosity at maximum burial depths < 2-3 km [28], in which platy clay sheets generate a pore network of regular geometry, resulting in lower tortuosity [39]. This is not the case for organic rich mudrocks, since they develop higher tortuosities and lower total porosities [39] due to chemical compaction at maximum burial depths > 2-3 km [28].
Partial regression plots indicate that the relationship between multiscale porosity and each independent variable is linear in each of the φ-models, when considering the other independent variables (Fig. 4). The Pearson correlation coefficient for each model confirms multiple moderate to strong correlations between multiscale porosity and the independent variables. Therefore, we restrict the independent variables to those suggested by PCA for each model only (Fig. 4).
PCA and MLR reveal that TOC has a minor control on pore structure. This suggests that organic matter hosted porosity contributes insignificantly to the effective pore system, which is in accordance with previous studies on Marcellus [63], Woodford [54], and Kimmeridge Clay [64]. Nevertheless, the most influential contribution of TOC is found at meso- and macropore sizes. Organic matter porosity might generally enhance the reservoir quality of ORM due to an increased probability of larger pores being connected by small pores in organic matter [21,65]. In comparison, high organic matter maturities and large amounts of pores located at the interface between inorganic and organic matter can enhance mudrock permeability [19,27].
According to the φ-model, the predicted porosity φ [%] is obtained from a single multilinear equation fitted to all mudrocks (R² = 0.93). Mudrock porosity is mainly controlled by compaction and clay content with RCs of −17.69 and 11.89, respectively, followed by carbonate content and TOC with RCs of 8.67 and 4.91, respectively. The predictive φ-model is trained with the samples analysed in this study. Testing the model with literature data shows that the mudrocks studied are reasonably representative of the global distribution of mudrocks (SI, S3.4, Table S10). If a mudrock has independent variables with values outside the range of this study, a revised predictive model is recommended, since porosity might be over- or under-estimated. The application of the φ-models provides the basis for a reliable method to understand the influence of a variable on the pore structure as well as to estimate total porosity based on coupled mineralogical-geochemical-petrophysical-geomechanical data. The statistical workflow effectively characterises a set of samples, which allows the influential controls to be integrated with the relevant scales, such as pore size dependent flow and transport properties [31] and scale dependent permeability [66].

A new classification of mudrock porosity
Previous studies have defined a variety of pore classifications to characterise mudrock porosity in terms of pore size, shape, and type [6,61,67]. These classifications provide a useful perspective for relating fluid flow to pores in mudrocks; however, the major constituents of mudrock porosity as well as the mudrock itself and their influence on mudrock applications have received less attention in previous classifications. […] (re-)precipitation, e.g., Bossier, Haynesville, and postmature Posidonia [19,28]. This classification results from the samples analysed in this study; however, it works equally well for other mudrocks such as Barnett, Condor, COx, and Sichuan [69]; Haynesville, Woodford, Marcellus, Barnett, Doig Phosphate, and Doig Siltstone [21]; New Albany Shale [55]; Longmaxi Shale [70]; Newark [71]; and Silurian Shale [72].

Conclusions
We characterise the pore structure of 13 sets of worldwide mudrocks using a combination of small-angle (SANS) and very small-angle neutron scattering (VSANS) to assess pore sizes that range from 5 μm to 2 nm. Mudrock porosities vary significantly, which cannot be explained by one single parameter like clay content. We identify and quantify mineralogical-petrophysical-geomechanical-geochemical controls on porosity for cumulative and size resolved classes using principal component (PCA) and multilinear regression (MLR) analyses. PCA and MLR relate the porosity of mudrocks to their influencing controls. PCA suggests that compaction, clay content, carbonate minerals, TOC, and SSA influence porosity. Building upon the controls suggested by PCA, MLR indicates that compaction plays a primary role in the formation of porosity, while clay content has a secondary role in the pore structure. Among these controls, TOC has a minor role in porosity; however, its most influential contribution is found at meso- and macropore sizes. Informed by our statistical approach, we introduce a new mudrock porosity classification that is based on compaction and clay content and verified against other mudrocks. The statistical workflow effectively characterises a set of samples, which allows the influential controls to be integrated with the relevant scales, such as pore size dependent flow and transport properties and scale dependent permeability.

Funding
None.

Nomenclature
x_Clay [g g⁻¹]: Clay content
x_Quartz [g g⁻¹]: Quartz content
x_TOC [wt.%]: Total organic carbon content

Fig. 1. Porosity, SSA, and fractal dimensions of mudrock samples at macro-, meso-, and nano-scales. (A) Cumulative and size resolved porosities and SSAs of all samples, divided into macro-, meso-, and nanoporosity. The distribution curves are obtained by fitting kernel functions to the data frequency in each bin; the number of bins is 50 and the bin width is 1.2. (B) Fractal dimensions obtained by SANS for organic lean and organic rich mudrocks. The distribution curves are obtained by fitting kernel functions to the data frequency in each bin; the number of bins is 13 and the bin width is 0.1.

Fig. 2. Porosity versus clay content. The mudrocks show normal and lognormal distributions for clay content and a lognormal distribution for porosity, displayed in the distribution lines in blue for organic lean mudrocks, red for organic rich mudrocks, and green for both mudrocks, placed on the right and top of the figure, respectively. XY error bars are 5 % of the individual porosity-clay content data. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 4. Partial regression plots for the individual φ-models. Rows A through D display partial regression plots for the φ-model, φ_3-model, φ_2-model, and φ_1-model, respectively. Columns 1 through 5 show the partial regression plots of a φ-model for S_v, x_Carbonate, x_Clay, x_TOC, and SSA, respectively. For example, subplot B3 exhibits the partial regression plot of φ_3 data vs x_Clay.

Fig. 5. Classification of porosity as a function of clay content and compaction. Compaction data are normalised to [0, 1]. Further explanation of the figure is provided in SI, S3.4.

Table 1
Overview of the sample sets used in this study; full details are provided in SI, S1.

Table 2
Correlation coefficients and regression coefficients indicating controls of variables on total porosities and porosity fractions used in the MLR analysis. P values are <0.05 for all independent variables of the φ-models (SI, S2.5, Table S7).