1 Introduction

In the context of sequence stratigraphy, facies relationships and stratal architecture are studied within a chronological framework [1, 2]. The goal of sequence stratigraphy is understanding depositional conditions and spatial distribution of the facies [3,4,5,6]. Low-resolution (basin scale) sequence stratigraphy has widely been used to study petroleum systems and to understand the distribution of source, reservoir and cap rocks [2, 7]. In high-resolution (reservoir scale) sequence stratigraphic analysis, outcrops, drill cores and logging data are integrated in order to determine configuration and characteristics of the reservoir facies [8,9,10].

In stratigraphic analysis, drill cores are the most reliable source of data. Due to high cost of coring, it is only done within reservoir intervals of exploratory wells. Instead, well log data provide more continuous information from much larger depth intervals within each well. The volumetric mechanism of well log acquisition makes the well-logs less precise than the drill cores [11]. So, in the absence of core data, precise identification of key sequence stratigraphic surfaces is limited. This limitation is exacerbated in the carbonate reservoirs as they are more heterogeneous compared to the clastic reservoirs. The heterogeneity is because of diagenetic processes [12] and fracturing [13].

In the past decades, several logging data (especially gamma-ray and magnetic susceptibility) have been studied to interpret stratigraphic sequences [14, 15]. Gamma-ray logs were mostly used for identifying lithofacies [16], paleoclimate interpretations [17, 18] and chemostratigraphic correlations [19]. Some studies are based on qualitative descriptions of gamma-ray changes in the key sequence stratigraphic surfaces [20,21,22,23], and some focus on quantitative methods for automatic identification of key stratigraphic surfaces, e.g., integrated prediction error filter analysis (INPEFA) based on gamma-ray well-log [24,25,26].

In the current paper, a clustering method is used for automatic identification of key surfaces and stratigraphic interpretation based on the gamma-ray log. The method provides an algorithm for reconstructing curve of the water depth change in well-scale (sampling rate of ~ 15 cm). The water depth (or relative sea-level) change is a key element for sequence stratigraphy analysis, which is fundamental for studying petroleum systems.

2 Geological setting

This study is applied to the Sarvak Formation data from two wells, drilled on Dehluran oilfield, northwest of the Dezful Embayment, Ilam Province, Iran. The carbonate Sarvak Formation, deposited during the Late Albian, Cenomanian and Turonian, is equivalent to the Mishrif, Khatyah, Mauddud and Nahr Umr Formations in the south Persian Gulf and the neighboring countries (Fig. 1). These successions exist in numerous oilfields in the Zagros Basin [27,28,29].

Fig. 1
figure 1

Modified after [50,51,52]

Comparison of lithostratigraphy of the Cretaceous successions in SW Iran (the Abadan Plain and Khuzestan) and the south Persian Gulf.

During the Early Cretaceous, an epeiric carbonate platform developed throughout the Zagros Basin [30]. Development of the carbonate platform to intra-shelf basin along with reduction in siliciclastic contents resulted in deposition of carbonate sediments of the Sarvak Formation. In some provinces, e.g., Ilam and Fars, a Cenomanian unconformity separates the Sarvak Formation from the overlaying Ilam Formation (Fig. 1). This hiatus resulted in diagenetic processes, carbonate leaching and bauxite mineralization [31], which has improved porosity and storage capacity of the upper Sarvak because of an abundance of vugs and karsts, c.f. supplementary material.

The Sarvak Formation in the Dehluran oilfield is divided into an upper and a lower part. The upper Sarvak Formation (USF) consists of thick argillaceous limestones, containing pelagic foraminiferal wackestones and packstones, which overlie the shallow water deposits of the lower Sarvak Formation (LSF). This study focuses on the USF (mid-Cenomanian to early Turonian) that is correlated to the Rumaila and Mishrif Formations in Iraq [32].

Within the Abadan Plain, the Sarvak Formation was subdivided into 37 microfacies, three main lithofacies, four carbonate rock types and six environmental facies. The Sarvak environmental facies show a wide range of a shallow carbonate ramp system: tidal flat, restricted and unrestricted platform interior, Rudist mound, above and below the sea wave base (SWB) open platform [33].

The Sarvak Formation is divided into four sequences of the third order (duration of 1.5–3 Ma and thickness of 50–150 m). In addition, several sequences of the fourth and fifth orders are also distinguished within this formation [34, 35].

By applying the classification of [36] to the core observations within the Dehluran oilfield, the USF is divided into three sequences of the third order. USF1 is the earliest sequence, then USF2, and finally USF3 is deposited. Each sequence contains both transgressive and regressive systems tracts. In the USF1, the transgressive pelagic deposits dominate the lower part of the sequence and pass up into the regressive deposits, consisting shoal/reef and lagoonal facies. The middle sequence USF2 consists of alternating open-marine and shoal deposits and is capped by reef-related facies. The latest sequence USF3 consists open-marine deposits at the base, passing up into shoal and reefal facies and finally a thick interval of lagoonal deposits [37].

3 Assumptions and methodology

3.1 Gamma-ray well-log acquisition

The natural gamma radiation of rocks originates from formation radioactivity. This radiation depends on the presence of three elements: uranium, thorium and potassium (K) [38]. In carbonates, gamma radiations are mainly related to the concentration of siliciclastic components (clay minerals, micas and K-feldspars) and organic materials [15]. Logging instruments are able in distinguishing the source radionuclide according to energy of the emitted gamma-ray. Following receiving each gamma-ray, the detector needs a temporal delay for processing it prior to detecting another gamma-ray, i.e., it misses the gamma-rays during this delay. To compensate this stochastic effect, several detectors are designed in a logging instrument, and the gamma-ray intensity is averaged and then interpolated on the intervals of 15 cm. The averaging also serves as filtering the white noise from the acquired data. Prior to using the gamma-ray log for interpretations, it is also needed to correct it according to the drilling mud weight because it consists of potassium-bearing shale [39]. The gamma-ray log in the current study is limited to about 330 m of the Upper Sarvak Formation, within two oil wells, drilled on the Dehluran oilfield.

3.2 Gamma-ray interpretation

Since the gamma-ray intensity shows cyclical changes in the depositional successions [40, 41], it is useful in stratigraphic studies [19, 42]. In carbonate platforms, the sea water transgressions and regressions control the shale-to-carbonate ratio, resulting in the accumulation of “dirtier” or “purer” limestones, respectively. Transgression increases the deposition depth, so shale-to-carbonate ratio increases in the deposits, resulting dirtier limestone. In contrast, regression decreases the deposition depth, therefore shale-to-carbonate ratio decreases, resulting in purer limestones. The difference in the number of terrigenous compounds in these two modes is reflected in the gamma-ray records and its pattern.

Conventionally, sonic, density and gamma-ray logs are used for facies studies [43]. Sonic and density logs are strongly affected from diagenetic alterations, e.g., dissolution, vuggy porosity and fracturing increase acoustic transfer time and decrease density [44, 45]. But gamma-ray log is less affected by the majority of the post-sedimentary processes, i.e., dissolution, cementation, fracturing and compaction. Instead, the diagenetic clay deposits are effective on gamma-ray recordings. However, formation of diagenetic clay minerals depends on the specific physicochemical conditions, and such minerals have limited distribution in the carbonate reservoirs [46]. Thus, the gamma-ray log is here selected for carbonate sequence stratigraphic analysis.

3.3 k-means clustering algorithm

Clustering is a non-supervised pattern recognition approach that regroups unlabeled data points [47]. The term “cluster” refers to a set of samples that have the highest similarity (minimum distance) with each other and yet the least similarity (maximum distance) with samples of other clusters [48]. The k-means clustering algorithm regroups n data points with d dimensions into k clusters in a way that the within-cluster distance is minimized. The algorithm contains four steps:

  1. (i)

    Placing randomly k prototypes in the space of data points (t = 0; t is iteration counter). These prototypes represent initial group centroids.

  2. (ii)

    Associating data points to the closest centroid.

  3. (iii)

    Recalculating the positions of the centroids by Eq. 1. It is a weighted average of data points, driven from descent-based optimization of performance index (Eq. 2), i.e., \(\frac{\partial J}{\partial v} = 0\) [49]:

    $$v_{i}^{\left( t \right)} = \frac{{\mathop \sum \nolimits_{j = 1}^{n} \left[ {A_{i}^{\left( t \right)} \left( {x_{j} } \right)} \right]^{m} .x_{j} }}{{\mathop \sum \nolimits_{j = 1}^{n} \left[ {A_{i}^{\left( t \right)} \left( {x_{j} } \right)} \right]^{m} }}$$
    (1)
    $$J_{m} \left( {P^{\left( t \right)} } \right) = \mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{i = 1}^{k} \left[ {A_{i}^{\left( t \right)} \left( {x_{j} } \right)} \right]^{m} .\left\| {x_{j} - v_{i}^{\left( t \right)} } \right\|^{2}$$
    (2)

    where t is iteration counter and m is fuzzifier (here m = 1). \(A_{i}^{\left( t \right)}\) is a membership function of the jth data point \(\left( {x_{j} } \right)\). In the k-means algorithm \(A_{i}^{\left( t \right)}\) either takes 0 (not member) or 1 (member) values, but in the fuzzy c-means algorithm it takes a value from the interval of [0, 1], since the latter algorithm is a generalization of the former.

  4. (iv)

    Checking the stopping condition: if \(\left| {P^{\left( t \right)} - P^{{\left( {t + 1} \right)}} } \right| \le \varepsilon\), then stopping iteration; otherwise: t = t + 1 and returning to step (ii).

    $$\left| {P^{\left( t \right)} - P^{{\left( {t + 1} \right)}} } \right| = \mathop \sum \limits_{{i \in N_{c} ,k \in N_{n} }} \left| {A_{i}^{{\left( {t + 1} \right)}} \left( {x_{k} } \right) - A_{i}^{\left( t \right)} \left( {x_{k} } \right)} \right|$$
    (3)

Equation 3 compares the performances of two consecutive iterations t and t + 1. If the sum of membership function variations falls below the threshold \(\varepsilon\), iteration will be stopped.

3.4 Gamma-clustering sequence stratigraphy

The k-means algorithm is applied to the gamma-ray well-log in order to divide it into two clusters. The cluster with higher average of gamma value is interpreted as being due to deeper sedimentary environments, and the other cluster with lower average is due to shallower sedimentary environments or shelf deposits. At each sampling depth, a positive label is assigned to the cluster with higher, and a negative label to the cluster with lower average. A cumulative curve of the cluster labels is calculated at each sampling depth. Due to the interpretation of the clusters, the cumulative curve is a reconstruction of water depth change by gamma-ray log (Fig. 2).

Fig. 2
figure 2

The procedure of gamma-clustering sequence stratigraphy

4 Results

The procedure of gamma-clustering sequence stratigraphy was applied to the Sarvak Formation at two wells of the Dehluran oilfield in order to reconstruct the curves of water depth change. The reconstructed curves were plotted, then sequence boundaries (SB) and maximum flooding surfaces (mfs) were interpreted and correlated in both the wells (the rightmost tracks of Fig. 3).

Fig. 3
figure 3

Lithofacies groups, sequence stratigraphy and reconstructed curve of water depth change of the upper Sarvak Formation (USF), the exploratory wells of Dehluran oilfield, modified after [12]. For the abbreviations, refer to the nomenclature. Water depth change is unitless, i.e., the relative change is plotted

The sequence USF1 is located at the lowest part of the studied succession. It is expanded from deep outer shelf environment with pelagic and benthic Foraminifera to shallow lagoon deposits, containing Rudist, Peloid and Orbitolina. Its lower boundary is erosional, and pelagic sediments are overlaying reef facies. The mfs.1 is marked within the pelagic sediments. The TST of USF1 is composed of outer- and intra-shelf pelagic sediments; besides, water depth is rising upward. Whereas in the RST of the USF1, the deep sediments are replaced by mainly shoal and reef facies, i.e., facies progradation; meantime the general trend of water depth curve is decreasing upward.

The middle sequence USF2 is deposited on the USF1. The Echinoderm-bearing USF2, starts from shoal open-marine environment and ends to reefal environment. Below the mfs2, the core study shows mostly open-marine; besides, the water depth is rising. Whereas above the mfs2, the open-marine facies are replaced by shoal and reef facies; in parallel, the water depth is reducing. In the well#2, the mfs2 is not identifiable by the reconstructed water depth change.

The uppermost Rudist-bearing sequence USF3 have open-marine and shoal facies at the lower parts; besides, till mfs3.2, the water depth is increasing and the facies are shallowing upward. Above the mfs3.2, there are interlayers of shallow facies reef and lagoon. In parallel, the water depth starts declining. Unfortunately, no core is recovered from the USF3 in the well#1, so unable to compare the water depth change with core descriptions.

5 Discussion

In order to validate the gamma-clustering sequence stratigraphy, the identified sequences were compared to those previously interpreted by integrating core descriptions, thin sections (texture, facies and depositional environments) and well-log data, studied by [12]. All the sequence boundaries of the third order (SB1, SB2, SB3 and SB4) are well-identified. Some of the maximum flooding surfaces (mfs) of the third-order sequences and some key surfaces of the fourth-order sequences are also identified by the reconstructed curve of water depth change. There is a depth mismatch between the gamma-clustering sequence stratigraphy and core-based interpretations, presented in Table 1. The depth mismatch is because of depth displacement of core plugs. Therefore, an application of the gamma-clustering algorithm is in depth correction of core-based stratigraphic surfaces.

Table 1 Depth mismatch of core-based stratigraphic surfaces and gamma-clustering sequence stratigraphy

Within the sequence USF1, the key surfaces SB1, mfs1, mfs1.3 and SB2 are interpreted by both core studies and the gamma-clustering algorithm. The gamma-clustering algorithm identified SB1.2 while it is not interpreted by core studies. In the sequence USF2, both the methods succeeded in identifying SB2, SB3 and mfs2.2, in both the wells; whereas the gamma-clustering algorithm did not identify mfs2 in well#2 (Table 1). The reason is that sequence stratigraphy by cores is partly subjective, and the interpretations of two sedimentologists might not be exactly the same. The complementary information, micropaleontology and seismic data, are always necessary for confirming or modifying the interpretations. Hence, another advantage of the gamma-clustering algorithm is providing a quick-look sequence stratigraphy based on the gamma-ray well-log. Meanwhile, the output of the gamma-clustering sequence stratigraphy must be checked by core descriptions and other sedimentologic evidences.

Comparing the sequence surfaces, based on the gamma-clustering sequence stratigraphy and core study, illustrated that the algorithm is suitable for studying third-order sequences. There are some high-frequency cycles on the curve of water depth change in well#1 of Fig. 3. They could be interpreted equivalent to two complete sequences and a half of the fourth order. In addition, the mfs2.2 is identified by both the procedures in both the wells. In fact, the gamma-clustering algorithm is not powerful enough to study the fourth-order sequences. This limitation is due to resolution limitation of the input data, i.e., gamma-ray well-log. The recorded gamma-ray is a volumetric response of the formation. The dimension of volume of investigation of gamma tool is ~ 60 cm. In addition, sampling rate is about 15 cm. Therefore, the resolution of gamma-ray log is ~ 75 cm (sum of the mentioned values). For further explanations, refer to the volumetric Nyquist frequency introduced in [11]. Hence, the sequence stratigraphy based on gamma-ray log is less precise than 75 cm, i.e., not applicable in the thin-bed condition.

6 Conclusion

The gamma-clustering sequence stratigraphy reconstructs the water depth change, which is here used for identifying sequence surfaces. It is illustrated that the water depth curve is compatible with the facies descriptions, also the identified sequence surfaces are in compliance with core-based sequence stratigraphy in the carbonate Sarvak Formation, Dehluran oilfield, SW Iran. Then, it is argued that the gamma-clustering sequence stratigraphy could be used as a quick-look stratigraphic analysis of the third-order sequences. But its application is limited to the medium- to thick-bed strata and is not recommended in thin beds. In addition, the algorithm could be used for depth correction of the sequence surfaces, interpreted by core descriptions. Finally, gamma-clustering sequence stratigraphy should be coupled with the expert interpretations in order to provide a thorough sequence stratigraphy of a petroleum reservoir. The algorithm is especially important in the carbonate reservoirs, since expert-based sequence stratigraphy is more uncertain and subjective in the carbonates compared to the clastic reservoirs.