Perceived Omnichannel Customer Experience (OCX): Concept, measurement, and impact

Efforts to measure customer experiences (CX) in multifaceted, omnichannel, retail contexts are crucial but lacking research guidance. Prior service quality literature has established methods for measuring CX in traditional, single-channel contexts but not adapted such measures to omnichannel contexts. With a mixed method research design and studies in eight phases, the authors propose a comprehensive measurement instrument that incorporates a schema- and categorization-based theoretical conceptualization of how customers assess omnichannel retail experiences; they also integrate means–end chain theory to explain perceived omnichannel customer experience (OCX) as a construct. This construct captures multiple omnichannel evaluation dimensions: social communications, value, personalization, customer service, consistency of both product availability and prices across channels, information safety, delivery, product returns, and loyalty programs. Multiple applications of the measurement model empirically confirm the suitability of this instrument in consumer goods omnichannel retail settings. Its 36 items reflect nine first-order quality dimensions that combine to form the overall, second-order OCX construct. The measurement instrument offers sound psychometric properties, as confirmed by several reliability and validity tests, and predicts customer behavior reliably across studies. Thus, the OCX measurement instrument offers utility for theory, management practice, and further research.

channel instruments. Yet no parallel insights reveal how customers evaluate service performance across the various attributes that appear in omnichannel retail settings or which standards they apply to gauge their CX (Ostrom et al. 2015; Verhoef et al. 2015). Although some conceptual CX frameworks exist (Becker and Jaakkola 2020; Lemon and Verhoef 2016), we lack a rigorous, empirically validated measurement instrument that comprehensively measures CX according to the relevant attributes and encounters that matter across all retail channels, as sources of a single, seamless omnichannel CX.
Therefore, this research seeks to establish a theoretical basis for a new measure of perceived omnichannel customer experience (OCX). Perceived omnichannel customer experience can be defined as customer evaluations of their seamless experiences across all the retailer's channels, as they move through the various customer journey stages, and according to several relevant dimensions. We propose measuring it as a second-order formative construct that comprises nine first-order dimensions: social communications, value, personalization, customer service, consistency of both product availability and prices across channels, information safety, delivery, product returns, and loyalty programs. We test a measurement model for OCX in consumer goods retail settings (Fig. 1). In establishing the OCX construct, we also identify attributes that are most critical, from a customer perspective, for creating a seamless, satisfying, omnichannel retail CX.
Relying on schema theory (Bartlett 1932; Rumelhart 1978) and categorization theory (Mervis and Rosch 1981), we conceptualize how customers assess their omnichannel retail experiences. Then by drawing on means-end chain (MEC) theory (Gutman 1982), we explain the nature of our proposed OCX construct and specify a model. With data from an exploratory laddering study (Study 1, Web Appendix W1), we develop a hierarchical value map (HVM) that illustrates the relationships of salient attributes of OCX with their (functional and psychosocial) consequences. The patterns offer empirical support for the prediction that customers form an overall, omnichannel, retail customer experience perception according to their evaluations of the first-order dimensions.
In addition to establishing the measurement model, we apply MacKenzie et al.'s (2011) procedures to assess its nomological validity (Studies 2-6) (Netemeyer, Bearden, and Sharma 2003). The resulting empirical evidence affirms that the nine OCX dimensions are unique, relative to dimensions that reflect single-channel assessments, and that OCX better predicts managerially crucial outcomes such as word of mouth (WOM), loyalty, and trust in omnichannel retail settings. In addition to validating a condensed 9-item OCX scale with multiple data sets, in Study 6 we empirically confirm that the proposed measurement model can assess omnichannel CX in multiple settings, regardless of the level of OCX, and that its predictive performance is superior to that achieved by the well-established SERVQUAL instrument (Parasuraman, Zeithaml, and Berry 1988).
The resulting findings establish four main contributions to marketing. First, from a customer perspective, the empirical evidence clarifies how customers assess the overall experience of omnichannel retailing in consumer goods categories. No definitive measurement has been established previously, so we carefully develop and test a comprehensive conceptualization of OCX. Second, by running empirical tests of the OCX model that include customer performance outcomes, we provide a coherent, evidence-based framework for successful omnichannel retailing. Third, the OCX model reveals which elements really matter to omnichannel customers, so retail managers can allocate their investments effectively to enhance critical experience features, then gauge the performance of their initiatives. Fourth, to encourage similar efforts in various sectors, we detail a viable methodology that researchers can continue to apply, to extend the OCX measurement model to other omnichannel sectors.
In the next section, we synthesize relevant literature to establish why a new model is required to measure OCX specifically. After we establish a theoretical basis for conceptualizing this construct and specifying the model, we provide the empirical results of our measurement model development efforts. Finally, this article concludes with some implications for theory and retail practice and directions for further research.

Literature review
Prior SQ literature identifies a variety of relevant quality attributes and perceptual dimensions in single channels, using multiple models, such as Parasuraman et al.'s (1988) SERVQUAL model, the dominant method used to assess the SQ of pure service providers. With the growth of online shopping, models for measuring website SQ also emerged (e.g., Blut 2016; Holloway and Beatty 2008; Parasuraman, Zeithaml, and Malhotra 2005; Wolfinbarger and Gilly 2003), followed by adapted versions of SQ models that have sought to address multichannel retail settings (Lin and Hsieh 2011; Montoya-Weiss, Voss, and Grewal 2003; Sousa and Voss 2006). Table W1.1 in Web Appendix W1 offers a fuller list of relevant studies in this domain, which provide four key insights.
First, perceived quality reflects a customer's judgment of an entity's (e.g., retailer's) overall excellence. Omnichannel customers are channel-agnostic; they use a myriad of touchpoints interchangeably to complete shopping tasks (Inman and Nikolova 2017; Verhoef et al. 2015). Therefore, any assessment of perceived omnichannel retailing quality must capture customer-retailer interactions in every channel that the omnichannel customer uses in the purchase journey. Instead, extant SQ models mostly capture perceptions of the performance of a single channel. The underlying logic is that customers typically shop using one channel, and they use distinct perceptual processes to assess their CX in each channel (Balasubramanian, Raghunathan, and Mahajan 2005; Parasuraman et al. 2005). The simultaneous influence of many channels on cognitive quality assessments or in determining which attributes evoke positive omnichannel retail experiences remains unclear though (Ostrom et al. 2015).
Second, current SQ models do not capture the influence of all experiences during omnichannel retail shopping scenarios. Inconsistent service across different channels and touchpoints can stimulate negative experiences though, including frustration and confusion (Broniarczyk and Griffin 2014). Studies of the influence of social media on consumer behavior also suggest that the availability of peer reviews helps customers validate their purchase decisions and affects their decision confidence (Broniarczyk and Griffin 2014). Some existing SQ dimensions, such as assurance (Parasuraman et al. 1988) or website design (Wolfinbarger and Gilly 2003), capture the quality of information provided through a retailer's physical shops or website, but they do not offer evidence of the consistency of the information offered across the channels, nor do they integrate the likely influence of peers.
Third, we note a tension regarding how SQ measurement models should be specified. Parasuraman et al. (1988) and Dabholkar, Thorpe, and Rentz (1996) operationalize measures as reflective, but Wolfinbarger and Gilly (2003) and Blut (2016) regard similar measures as formative. In assessing these specifications, Collier and Bienstock (2006) and Blut (2016) raise concerns about the reflective operationalization; for example, Parasuraman et al. (2005) use MEC as a theoretical basis, which would predict hierarchical, formative relationships of each dimension with the overall quality construct. A latent construct, such as overall quality, does not inherently take a reflective or formative structure though, so researchers must adopt the conceptualization that is most congruent with the conceptual definition of the construct they study (MacKenzie, Podsakoff, and Podsakoff 2011). To address this concern, we rely on robust theorizing to avoid model misspecification and biased measurements (MacKenzie et al. 2011).
Fourth, multichannel studies suggest that the benefits of different channels for specific stages in the purchase journey influence customers' attribute-based decision-making (Melis et al. 2015; Verhoef, Neslin, and Vroomen 2007). But in an omnichannel retail environment, this distinction across channels may diminish (Lemon and Verhoef 2016). For example, customers assess the intrinsic benefits of esthetic attributes (Holbrook and Hirschman 1982; Mathwick, Malhotra, and Rigdon 2001) in both offline (Bitner 1992; Parasuraman et al. 1988) and online (Eroglu, Machleit, and Davis 2001; Kahn 2017) settings. We need to clarify which benefits/attributes really matter in a complex mix of channels (Grewal et al. 2021; Verhoef et al. 2015), as well as the influence on customers' perceptions of SQ attributes (Lemon and Verhoef 2016). We establish a theoretical foundation for OCX and then, to gain a customer-driven conceptualization of OCX dimensions and generate measurement items, we conduct a comprehensive, exploratory, laddering study (Web Appendix W1). Table 1 provides an overview of our multistage measurement model development process. The methodological details for each phase and step are described in the next sections.

Phase 1: conceptualization of the measurement model

Theoretical foundation of OCX
Omnichannel retailing empowers customers with expanded information and decision-making tools (Broniarczyk and Griffin 2014). Customers want to make the most accurate purchase decision with minimum cognitive effort (Payne, Bettman, and Johnson 1993) and mental resource allocations (Alba and Hutchinson 1987). Categorization theory, based in cognitive psychology, suggests that a perceptual categorization process allows people to maximize their information processing by minimizing their cognitive effort (Cohen and Basu 1987). To evaluate experiences with less cognitive effort, customers likely recall relevant information stored in predefined perceptual categories and seek similarities between prior and current information (Rosch and Mervis 1975). Perceptual categories are flexible; they can take different forms and store contextual information in multiple layers (Barsalou 1983). During the categorization process, new belief structures can be created (Petty and Cacioppo 1986), existing categories get updated, and subcategories might evolve (Sujan and Bettman 1989). Over time, consumers develop mental sets of features or attributes for different categories (Mervis and Rosch 1981).
When categorizing consumption-related information, consumers might develop a network of nodes in memory, storing information in hierarchical order (Brewer and Nakamura 1984; Halkias 2015; Meyers-Levy and Tybout 1989), such that attributes related to finer information (e.g., operating hours) get stored at a subordinate level (Meyers-Levy and Tybout 1989). Broader information, such as overall quality judgments, instead is installed at a superordinate or higher level (Meyers-Levy and Tybout 1989). According to schema theory, customers use category-driven, top-down processes to evaluate experiences quickly and with little cognitive effort (Alba and Hutchinson 1987; Fiske and Pavelchak 1986; Meyers-Levy and Tybout 1989), but they engage in slower, effortful, attribute-by-attribute, bottom-up information processing if existing category knowledge is inadequate to understand the information provided by their current experiences (Fiske and Pavelchak 1986; Sujan and Bettman 1989).
The degree of incongruence between existing and new information triggers either assimilation or accommodation processes associated with category and schema development (Rumelhart and Norman 1976). Assimilation occurs if the new information is just moderately incongruent, such that it can be integrated into an existing category (Sujan and Bettman 1989). For example, a new online shopping feature likely can be integrated with just minor adjustments into the electronic SQ dimensions customers already hold in their schema. When inconsistencies are more substantial though, an accommodation process starts and creates new subcategories (Sujan and Bettman 1989), which share features with multiple categories.
In multichannel literature, we find arguments that at any point of a customer purchase journey, the channel might invoke unique schema, which increases the cognitive load (Balasubramanian et al. 2005). Because customers seek to minimize their cognitive load, they are unlikely to change the channel they use at that point of their multichannel purchase journey (Balasubramanian et al. 2005; Parasuraman et al. 2005). In an omnichannel retail environment though, the variety of information and choice options already increases customers' cognitive load (Broniarczyk and Griffin 2014). Moreover, omnichannel customers seemingly jump across prepurchase, purchase, and postpurchase stages of the customer journey, using myriad touchpoints in multiple channels interchangeably, to make the best purchase decision quickly (Broniarczyk and Griffin 2014; Grewal and Roggeveen 2020), so their overall cognitive load could increase substantially if they frequently change schemas.
The combination of these theoretical insights suggests that customers might develop distinct schematic cognitive processes, specific to omnichannel shopping. They recognize clear differences in the characteristics of omnichannel, multichannel, and single-channel retailing (Valentini, Neslin, and Montaguti 2020). Therefore, an accommodation process likely guides their development of schema to facilitate their omnichannel shopping. Through this process, customers develop new perceptual subcategories, or omnichannel quality dimensions, that fall within the broader retail shopping category. These new dimensions may share some attributes with existing categories, such as those in traditional SQ. Yet the unique schema enable customers to process information from the channels efficiently, without needing to invoke different schema. As a result, customers can complete their omnichannel retail purchase journey with minimum cognitive effort.
To clarify the nature of a schema, it is critical to understand the complex associations of different levels of abstraction in a category (Meyers-Levy and Tybout 1989). Concrete cues, as might appear in a service environment, influence the cognitive attribute development process (Parasuraman et al. 2005), through which customers attach value to each attribute. Then they use their preferred method to add or average all information related to each attribute and form perceptual dimensions. According to MEC theory (Gutman 1982), each perceptual quality attribute is associated with a quality dimension; each dimension then is associated with an overall, abstract, higher-order summative construct, such as overall perceived quality (Zeithaml 1988). In addition, MEC theory suggests that customers' preferences for certain attributes are influenced by the consequences of those attributes, such as functional or psychological benefits (Reynolds and Gutman 1988). When they make consumption decisions, customers seek to maximize positive outcomes and avoid negative ones; they assess an attribute positively if it enables them to attain universal life goals, such as their personal values (Reynolds and Gutman 1988; Schwartz 1992). Thus, attributes, consequences, and values form interrelated and hierarchical structures (i.e., means-to-ends chains) in customers' minds.

Table 1
Process for Developing the OCX Measurement Model.

Phase 1: Conceptualization of the measurement model
Step 1: Theoretical foundations of OCX
Step 2: Laddering study to explore attributes and conceptualize dimensions of OCX
- Laddering questionnaire development, review by expert judges and pretests
- Qualtrics online survey development and pretests
- Data collection using online laddering: U.S. omnichannel customers, n = 79 (Study 1)
- Content analysis using NVivo (v. 12) and category theme development
- Literature review and synthesis
- Subject matter expert review: conceptualization of 14 OCX attributes
- Development of a hierarchical value map (HVM) using LadderUX, 232 ladders
- Conceptualization of a measurement model with 14 first-order dimensions for the second-order overall construct OCX.

Phase 2: Development of measures
Step 1: Item generation and qualitative content and face validation
- Generate initial items using laddering data and literature
- Content validity assessment by three marketing academics
- Content validity assessment by two marketing practitioners
- Face validity assessment by three omnichannel customers
- Finalization of an initial pool of 129 items
Step 2: Quantitative content and face validation of developed items
- Content validity assessment by six marketing academics by rating all items in an item rating matrix using a Qualtrics online survey
- Face validity assessment by seven omnichannel U.S. customers, rating all items in a Qualtrics online survey
- Finalization of a pool of 87 content and face valid items.

Phase 3: Measurement model refinement
- Survey instrument development using Qualtrics online tool
- Six pretests of the survey instrument
- Data collection: U.S. respondents, n = 359, MTurk panel members with master's qualification (Study 2)
- Exploratory factor analysis using SPSS (v. 25), 45 items in 9 factors
- Confirmatory factor analysis using AMOS (v. 25), 36 items in 9 factors
- Empirical assessment of the formative measurement model specification using confirmatory tetrad analysis (CTA-PLS)
- Common method bias tests (CFA marker variable technique, VIF)
- Nomological validity assessment of the newly developed OCX model
- Predictive relevance (Q²) tests using blindfolding and PLSPredict procedures and SAT, LOY, and WOM outcomes.
- HTMT-based discriminant validity test to assess whether all OCX dimensions are distinct from one another and from SERVQUAL dimensions
- OCX is superior to SERVQUAL for measuring CX in omnichannel consumer goods retailing contexts.

Phase 7: Validation of condensed OCX model
- Validation of a 9-item OCX measurement model, with one item for each OCX dimension (data sets from Studies 3, 4, and 5).
- Comparison of predictive ability; the 9-item condensed OCX is superior to the short-form SERVQUAL (data set from Study 6).

Phase 8: Predictive validity tests with an experiment
- Pretest assesses the effectiveness and realism of two vignettes (n = 33).
- Data collection: U.S. respondents, n = 214 (107 responses in both OCX_High and OCX_Low groups), MTurk panel members (Study 6)
- PLS-MGA test to confirm the OCX_High and OCX_Low data groups do not show significant differences in group-specific parameter estimates.
- Comparison of means with an independent samples t-test indicates a significant between-group difference for the OCX construct.
- Confirm OCX's predictive validity in an experimental setting for both high and low conditions.
Underpinned by MEC theory and prior SQ studies, we suggest that perceived omnichannel customer experiences (OCX) arise from seamless consumption experiences, involving customer-retailer and customer-customer interactions, in disparate integrated channels across their customer journey. First, customers may use omnichannel attributes to assess their experience at a dimension level. Second, they aggregate dimension-level quality assessments to form an overall OCX judgment. Third, they use their OCX evaluations to determine, for example, their satisfaction (Fornell et al. 1996), repurchase intentions (Zeithaml, Berry, and Parasuraman 1996), WOM (Blut 2016), share of wallet (Wulf, Odekerken-Schröder, and Iacobucci 2001), or trust in the retailer. The development of omnichannel retail customer experience attributes and dimensions that matter also may be influenced by the increased benefits that customers perceive from an omnichannel retailer. For example, more interaction channels might make it easier to return merchandise, so they provide functional benefits; increased options for providing information enhance psychological benefits, because consumers enjoy easy access to accurate information (Broniarczyk and Griffin 2014).

Conceptualization of OCX dimensionality
Data collection. We explore which attributes relate to OCX, conceptualize its dimensions, and specify the nature of the OCX construct. Without substantial prior evidence on OCX-related aspects, we turn to qualitative data from customers and expert opinions to develop a reliable measurement model (MacKenzie et al. 2011). A laddering technique is suitable for collecting qualitative data when a construct conceptualization is anchored in MEC theory (Reynolds and Gutman 1988). Therefore, we adopt online hard laddering (OHL) to elicit, directly from customers, perceptions that matter to them, using a sequence of direct questions. We adapt an existing OHL questionnaire (Henneberg et al. 2009), using a series of pilot studies and expert reviews, then administer the resulting questionnaire as a Qualtrics online survey.
The output of a laddering study is an estimate of respondents' cognitive structures in relation to the concept being studied (Reynolds and Olson 2001). The estimates improve if the laddering data come from homogeneous respondents with well-developed perceptual categories for the pertinent concept (Grunert and Grunert 1995; Reynolds and Olson 2001). Considering the vast volume of omnichannel retail shopping activity in U.S. consumer goods sectors (PwC 2017), as well as evidence provided by large-scale market research (Nielsen 2020) and recent retailing literature (Valentini et al. 2020), these omnichannel customers likely have relatively mature perceptions, derived through their prior omnichannel experiences. Therefore, we screened and recruited U.S. participants from Amazon Mechanical Turk (MTurk) and applied recommended best practices to ensure the data were of high quality (Kees, Berry, Burton, and Sheehan 2017).
Data analysis. Following Reynolds and Olson (2001), we analyzed the laddering data (n = 79) in three steps. First, after exporting the data from Qualtrics into NVivo software, we content-analyzed the data to develop an overall sense of the types of elements elicited. Second, we defined meaningful categories of attributes (A), consequences (C), and values (V). Third, in an implications matrix, we display direct and indirect links across A → C → V elements, then develop an aggregated HVM (Reynolds and Olson 2001) that graphically depicts the dominant perceptual patterns in the data and provides an estimate of the cognitive structure. Gengler and Reynolds (1995) recommend limiting laddering analyses to fewer than 50 categories. Relying on a series of reviews by two expert judges, we identified 39 categories (see Web Appendix W1): 14 attributes (Table W1.4), 21 consequences (Table W1.5), and 4 values (Table W1.6).
Although we include multiple channels, the breadth of important attributes is similar to that of SQ studies involving single channels (e.g., 11 attributes in Parasuraman et al. 2005; 16 attributes in Blut 2016). The breadth of consequences reflects the complex nature of an omnichannel retail environment. Using category definitions and a LadderUX online laddering tool (Vanden Abeele, Hauters, and Zaman 2012), we develop the implications matrix (Table W1.7), individual ladders, and HVM (Figure W1.1). The HVM relationships (A → C → V) and their strengths indicate how the different attributes relate to consequences (Gutman 1982; Reynolds and Gutman 1988), and the C → V link reveals why customers might prefer a particular attribute.
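How an implications matrix tallies direct and indirect links, and how a cutoff then selects the links drawn in an HVM, can be illustrated with a minimal Python sketch. It is not the LadderUX procedure. The toy ladders, the value labels (`value_x`, `value_y`), the function names, and the cutoff of 2 are all hypothetical, and counting each link once per ladder occurrence is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations


def implication_matrix(ladders):
    """Tally direct (adjacent) and indirect (non-adjacent) links per ladder.

    Each ladder is an ordered list of elements, e.g. attribute ->
    consequence -> value. Returns two Counters keyed by (lower, higher).
    """
    direct, indirect = Counter(), Counter()
    for ladder in ladders:
        for i, j in combinations(range(len(ladder)), 2):
            pair = (ladder[i], ladder[j])
            if j - i == 1:
                direct[pair] += 1      # elements mentioned back to back
            else:
                indirect[pair] += 1    # linked through an intermediate element
    return direct, indirect


def hvm_edges(direct, indirect, cutoff):
    """Keep links whose combined count reaches the (hypothetical) cutoff."""
    totals = direct + indirect
    return {pair: n for pair, n in totals.items() if n >= cutoff}


# Hypothetical toy ladders; attribute and consequence labels echo the text,
# value labels are placeholders.
example_ladders = [
    ["delivery", "convenience", "value_x"],
    ["product returns", "convenience", "value_x"],
    ["price level", "low choice difficulty", "value_y"],
    ["delivery", "convenience", "value_y"],
]

direct_links, indirect_links = implication_matrix(example_ladders)
print(hvm_edges(direct_links, indirect_links, cutoff=2))
```

In this toy data only the delivery → convenience and convenience → value_x links recur often enough to survive the cutoff, which mirrors how an aggregated map retains dominant patterns and drops idiosyncratic ones.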
Conceptual measurement model. The changing uses of technology by retailers and customers over time influence customers' preferences for concrete cues, as might be available from wearable technology or social commerce, especially as they gain more experience (Parasuraman et al. 2005). If we were to assess OCX at a concrete cue level, the measurement model would not be scalable or capable of considering changes to omnichannel retailing and customers' perceptions over time. Therefore, we use perceptual-level attributes as items to measure OCX. Building on MEC theory and the quality conceptualization offered by Parasuraman et al. (2005), we argue that customers' evaluations of their experience with an omnichannel retailer, according to perceptual attributes, coalesce into evaluations along more abstract dimensions, which produce the higher-order assessment of OCX. Informed by the empirical findings in the laddering study, we initially specify OCX as a second-order construct with 14 first-order dimensions that matter to customers: social communication, product selection, price level, personalization, customer service, information content, search efficiency, purchase process, tangibles, security, privacy, delivery, product returns, and loyalty programs (operational definitions are in Web Appendix W1, Table W1.4). Next, we integrate MEC theory, empirical studies of customers' perceived retailer quality (Blut 2016; Collier and Bienstock 2006; Wolfinbarger and Gilly 2003; Zeithaml 1988), and recommendations from Parasuraman et al. (2005) to treat first-order quality dimensions as formative indicators of the second-order latent construct and thus conceptualize a reflective-formative Type II model (Becker, Klein, and Wetzels 2012).
To identify the formative relationships between the first-order dimensions and the second-order assessment (OCX construct), we apply model specification criteria proposed by MacKenzie et al. (2011) and Hair, Hult, Ringle, and Sarstedt (2016). First, we confirm that the dimensions are not manifestations of the overall judgment. Instead, each salient OCX dimension represents a specific quality characteristic that matters to an omnichannel customer. A change in any of the 14 salient dimensions alters the meaning of the global OCX construct; it also might depend on the extent to which a customer experiences each dimension. For example, a customer may develop a positive quality assessment of the product selection and proceed to complete a transaction but still rate the OCX poorly if the retailer fails to provide excellent delivery or postpurchase customer service. Second, the combined dimensions capture the quality of customers' interactions with an omnichannel retailer at different stages of the purchase journey. The social communication dimension refers to customers' assessments in a prepurchase stage; a poor score might explain why customers refuse to purchase from a retailer, despite engagement at other touchpoints. The delivery dimension instead refers to postpurchase interactions. Omitting any dimension would significantly reduce the model's ability to assess the OCX of an omnichannel retailer accurately (MacKenzie et al. 2011). Third, the laddering data and HVM indicate that the dimensions do not have the same antecedents or outcomes. Information content, price level, product selection, and social communication all relate closely to low choice difficulty, but product returns, customer service, and delivery have stronger relationships with convenience. Therefore, the effects of these various dimensions on customers' behaviors and attitudes differ, as might the nomological net encompassing the items (Jarvis, MacKenzie, and Podsakoff 2003).
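As a compact restatement, a reflective-formative specification of this kind is often written in generic notation as follows. The symbols ($x_{ik}$, $\lambda_{ik}$, $\eta_k$, $\gamma_k$, $\varepsilon_{ik}$, $\zeta$) are notational assumptions introduced here for illustration and do not appear in the source; $K = 14$ corresponds to the initial specification.

```latex
% First-order level (reflective): item i of dimension k reflects that dimension
x_{ik} = \lambda_{ik}\,\eta_k + \varepsilon_{ik}, \qquad k = 1,\dots,K

% Second-order level (formative): the K dimensions jointly form overall OCX
\mathrm{OCX} = \sum_{k=1}^{K} \gamma_k\,\eta_k + \zeta
```

Because each dimension $\eta_k$ enters the second equation with its own weight $\gamma_k$, omitting a dimension changes the composition of OCX itself, which is consistent with the first and second criteria discussed above.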

Item generation
Following the conceptualization of the initial OCX dimensions and their operational definitions, we use the qualitative data collected in the Phase 1 laddering study and relevant prior measurement models to generate an initial pool of OCX measurement items. Netemeyer et al. (2003) and MacKenzie et al. (2011) recommend adopting multiple qualitative procedures, such as asking expert judges to assess content validity and population judges to gauge face validity. With a previously established procedure (Karpen, Bove, Lukas, and Zyphur 2015; Lichtenstein, Netemeyer, and Burton 1990; Lin and Hsieh 2011), we assess both item content and face validity.

Qualitative content and face validation
The response format represents an essential consideration for developing new items. In line with prior studies of customers' perceptions of retailer quality (Blut 2016; Dabholkar et al. 1996; Wolfinbarger and Gilly 2003), we rely on agreement-based Likert response scales for all items, such that the item wording prompts respondents to indicate their level of agreement with a declarative statement (Netemeyer et al. 2003). When appropriate, we adopt existing measures of attributes or dimensions as a basis for creating the OCX items (Dabholkar et al. 1996; Lin and Hsieh 2011; Parasuraman et al. 1988, 2005; Verhoef et al. 2007; Wolfinbarger and Gilly 2003).
Some measurement models purposefully include negatively worded or reverse coded items to keep respondents alert, yet negative terms also can confuse respondents, offering less reliability than positively worded items (Netemeyer et al. 2003; Weijters and Baumgartner 2012), and create method bias. Existing measures of customers' perception of quality, such as eTailQ (Wolfinbarger and Gilly 2003) and SSTQUAL (Lin and Hsieh 2011), also do not incorporate negatively worded or reverse coded items, and negatively worded items in the original SERVQUAL scale were adjusted to adopt a positive format in a refined version (Parasuraman et al. 1991). Therefore, rather than negatively worded or reverse coded items, our online survey relies on other procedural remedies to improve engagement.
The items were scrutinized and edited several times in an iterative process by three marketing academics, who have measurement model development experience. They checked for double-barreled statements, item clarity, item length, and item simplicity. Following this review, we shared the pool of items with two practitioners, who are native English speakers, have experience supervising marketing activities of an omnichannel retailer, and have completed tertiary business studies. They received a study overview and operational definitions of the dimensions. In a Microsoft Word document, with the track changes tool enabled, they provided feedback about the wording of some items and a few additions to improve overall coverage. Next, we shared the list of items with three omnichannel customers who were native English speakers. Using their feedback, we enhanced the clarity of a few items.
The resulting pool of items was checked and approved again by the three marketing academics who originally checked them. The initial pool contained 129 newly generated items that capture essential aspects of OCX. Although there are no specific rules for the size of an initial item pool, a larger pool may be preferable, according to measurement model development studies in a similar domain (Netemeyer et al. 2003). The large size of the initial item pool in this study also is in line with other scale development papers (e.g., the E-S-QUAL measure used 121 items for the 11 initial dimensions; Parasuraman et al. 2005).
Finally, we measured four theoretically related outcome variables and tested for nomological validity by addressing the relationship between OCX and its outcomes. These items were checked by experts, who gauged their appropriateness, clarity, and simplicity. The items for satisfaction come from Burnham, Frels, and Mahajan (2003), Fornell et al. (1996), and Blut (2016). The word-of-mouth (WOM) and loyalty items are adapted from Zeithaml et al. (1996) and Blut (2016). For share of wallet (SOW), we use items from Wulf et al. (2001).

Quantitative content and face validation
Multicollinearity results from strong intercorrelations among formative indicators of a construct, which makes it difficult to separate the distinct influence of each formative indicator on the overall construct (Diamantopoulos and Winklhofer 2001). To ensure an item is representative of one conceptual dimension only, and to achieve discriminant validity among first-order dimensions of a second-order formative construct, we draw on quantitative pretests (MacKenzie et al. 2011; Netemeyer et al. 2003). These pretests provided suggestions for improved item wording, categorizations of the items according to the conceptual definitions of the dimensions, deletions of inconsistent and redundant items, and indications of the most appropriate item when several items are similar (MacKenzie et al. 2011). We used Qualtrics surveys to collect data from academic experts (content validity) and omnichannel customers (face validity).
Following recommendations from MacKenzie et al. (2011), for the content validity assessments, we developed an online survey with the initial OCX measurement items in the first column and the conceptual dimensions in 14 additional columns. The operational definitions of the dimensions appear at the top of each column. The online survey was configured to display five items at a time for ratings, and all ratings were forced. In addition, a textbox for each item allowed the academic experts to provide comments and justify their ratings; this element was optional. Netemeyer et al. (2003) suggest using a three-point categorization rating by at least five expert academics, so the survey was configured to permit academics to rate each item as "clearly representative," "somewhat representative," or "not representative" of each conceptual dimension. Recognizing the complexity and length of the survey, we configured it to allow the experts to pause and complete it later.
The research team also shared the survey details with a few marketing academics, who had not participated in any of the previous item development steps, and explained the task associated with rating the large item matrix. Six marketing academics (from the United States, the Netherlands, New Zealand, and Australia) agreed to complete this task. After starting it, most of them took about a week to complete the full content validity survey; they generally rated one or a few sections daily. After assessing their responses, we dropped items rated as not representative by a majority of the six academics. For a few poorly performing items, we undertook modifications based on the feedback. The net result was a refined pool of 87 items with at least 6 measurement items for most dimensions and no fewer than 4 for any dimension.
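The majority rule used to prune the item pool can be sketched in a few lines of Python. This is only an illustration: the item identifiers and ratings are hypothetical, and the sketch assumes each item is judged against its intended dimension only, whereas the survey described above collected ratings against every dimension.

```python
from collections import Counter


def items_to_drop(ratings, label="not representative"):
    """Return ids of items that a strict majority of experts rated as `label`."""
    dropped = []
    for item_id, expert_ratings in ratings.items():
        counts = Counter(expert_ratings)
        # Strict majority: more than half of the experts chose `label`.
        if counts[label] > len(expert_ratings) / 2:
            dropped.append(item_id)
    return dropped


# Hypothetical ratings from six experts for three illustrative items.
example_ratings = {
    "item_a": ["clearly representative"] * 5 + ["somewhat representative"],
    "item_b": ["not representative"] * 4 + ["somewhat representative"] * 2,
    "item_c": ["not representative"] * 3 + ["clearly representative"] * 3,
}

print(items_to_drop(example_ratings))
```

In this toy input, only the item with four of six "not representative" ratings is flagged; a three-to-three split does not count as a majority.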
From this item pool, we developed an online survey using the Qualtrics tool to conduct the face validity tests. Here, Netemeyer et al. (2003) recommend three-point categorization ratings from at least five target population respondents, so the online survey again was configured to display the definition of each dimension and request a rating of each item as "clearly representative," "somewhat representative," or "not representative" of the dimension. We gathered responses from omnichannel U.S. customers on MTurk but excluded any respondents who had participated in Phase 1. The data collected from seven omnichannel customers did not prompt any changes to the initial pool of items.

Phase 3: measurement model refinement (Study 2)
Data for this phase were collected using the online survey we had developed in Qualtrics. With screening questions and an IP check, we recruited 365 omnichannel customers on MTurk who had earned master's qualifications. No respondents who had participated in previous stages could participate. As suggested by Netemeyer et al. (2003) and MacKenzie et al. (2011) , we carefully sought the most appropriate survey configuration (e.g., response scale type, control for common method bias, motivate respondents). Web Appendix W2 outlines the survey design and demographics; we removed 6 responses due to extremely low standard deviations or short survey completion times, resulting in 359 respondents.
In addition to the initial pool of OCX measurement items, we added items for WOM, satisfaction, and loyalty outcomes, as endogenous variables, to provide an initial test of nomological validity with the model refinement data set, as well as to enable a comparison of the refinement (this phase) and validation (Phase 4) data sets (MacKenzie et al. 2011). To check for social desirability bias, we included a social desirability response scale with six items (Donavan, Brown, and Mowen 2004; Karpen et al. 2015) in the survey too.

Exploratory factor analysis
The exploratory factor analysis (EFA) of the Study 2 data, using SPSS, involved a series of iterative analyses (Web Appendix W3) to establish a parsimonious and reliable factor structure for the OCX measurement model. First, we conducted a test of normality to select an appropriate factor extraction method for the subsequent analysis ( Fabrigar et al. 1999 ). Both Shapiro-Wilk and Kolmogorov-Smirnov significance values are less than 0.05 for all items, so the data are not normally distributed, and principal axis factoring extraction appears appropriate ( Fabrigar et al. 1999 ). Second, the first-order dimensions are conceptualized (Phase 1) and content validated (Phase 2) as distinct. Therefore, an EFA using an orthogonal rotation, such as Varimax, is appropriate to explore these mostly uncorrelated factors with measurement items that exhibit high loadings on each factor ( Fabrigar et al. 1999 ;Hair et al. 2014 ).
Through several EFAs, using principal axis factoring and Varimax rotation, we decided to drop one item after each analysis, due to item loadings ( λ) less than 0.4, a cross-loading greater than 0.3 across three or more factors, or a significant cross-loading on two factors ( Hair et al. 2014 ;Hinkin 1998 ;Netemeyer et al. 2003 ). After removing each item in each EFA iteration, we checked the Kaiser-Meyer-Olkin measure of sampling adequacy, the total variance explained by the remaining set of items, and reliability (Cronbach's alpha) for potential improvement ( Hair et al. 2014 ). This iterative process resulted in a parsimonious factor structure with 45 items, contained within 9 factors. To increase confidence in the extracted factor structure ( Conway and Huffcutt 2003 ), we also applied an EFA with maximum likelihood extraction and the number of factors to extract fixed to 9; it produced the same factor structure.
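The first two deletion criteria can be expressed as a small screening function. The sketch below is an assumption-laden illustration, not the SPSS procedure itself: the loading values are hypothetical, the "three or more factors" rule is read as counting every factor on which the absolute loading exceeds 0.3, and the third criterion (a significant cross-loading on two factors) is left out because the text gives no numeric threshold for it.

```python
def screen_item(loadings, primary_min=0.4, cross_min=0.3, cross_count=3):
    """Flag one item from its rotated loadings across all factors.

    Returns "low_loading", "cross_loading", or None if the item passes.
    """
    magnitudes = [abs(value) for value in loadings]
    # Rule 1: no loading reaches the minimum primary loading.
    if max(magnitudes) < primary_min:
        return "low_loading"
    # Rule 2 (our reading): loads above cross_min on cross_count or more factors.
    if sum(m > cross_min for m in magnitudes) >= cross_count:
        return "cross_loading"
    return None


# Hypothetical rotated loadings for three items on three factors.
example_loadings = {
    "item_1": [0.72, 0.10, 0.05],
    "item_2": [0.35, 0.22, 0.18],
    "item_3": [0.55, 0.34, 0.31],
}

flags = {name: screen_item(row) for name, row in example_loadings.items()}
print(flags)
```

Because the procedure described above removes one item per iteration and then re-estimates the EFA, a function like this would only nominate candidates; the loadings change after every deletion.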
The nine factors extracted are consistent with dimensions that occur across various stages of the customer journey. First, the items that we had conceptualized and content validated for the "social communications" dimension loaded onto the corresponding factor. This construct captures a customer's evaluative judgment of peer user endorsements of the retailer across any of its channels, which can be highly influential in customers' decision processes in the prepurchase stage of the customer journey. Second, a few items from the initial product selection and price level dimensions load onto a single factor, labeled "value." In the ladder map from Phase 1, both these dimensions relate to "value for money" at the consequence level. Therefore, the value construct, with four purified items, is a parsimonious way to measure customers' perceptions of the value they gain from the omnichannel retailer's product assortment and pricing, across all channels. Third, the "personalization" dimension captures a customer's evaluative judgment of the retailer's ability to tailor services, products, and the transactional environment across its channels. Fourth, "customer service" captures a customer's evaluative judgment of the support services the retailer provides at any stage of the customer journey across any channels.
Fifth, consistency items (i.e., product, price, and information aspects linked to variety, certainty, and low choice difficulty consequences in the ladder map) all load on a single factor, which we label "consistency." This construct, with four purified items, is parsimonious and can measure customers' perceptions of consistent product availability and pricing information across the retailer's channels. Sixth, information safety-related items within the initial safety and privacy dimensions loaded onto a single factor. The formation of this single factor, as a further refinement of the preliminary dimensions we uncovered in the ladder map in Phase 1, is logical; conceptually, safety and privacy both relate to "avoid exploitation" and "purchase confidence" dimensions at a consequence level (Web Appendix W1, Figure W1.1). Accordingly, we use "information safety," with the four purified items, as a parsimonious way to measure customers' perceptions of the omnichannel retailer's efforts to protect them from exploitation or information misuse at any stage of the customer journey (e.g., protection of credit card information for purchases in any channels, secure storage of that information in postpurchase stage).
Seventh, the "delivery" dimension captures a customer's evaluative judgment of the retailer's delivery and pick-up services across any channels. Eighth, "product returns" refers to a customer's evaluative judgment of the retailer's handling of product returns and exchanges across channels. Ninth and finally, the "loyalty programs" dimension captures a customer's evaluative judgment of the retailer's loyalty program across channels. These latter three quality dimensions are highly influential in postpurchase stages. In summary, we derive an OCX model that comprises nine quality dimensions that capture the comprehensive nature of the omnichannel retail experience.

Confirmatory factor analysis
To assess the psychometric measurement properties of the OCX model, we employed AMOS (version 25, with the model estimate plugin and Excel tool; Gaskin and Kim 2019) and conducted an iterative confirmatory factor analysis (CFA) to achieve item refinement (Arnold and Reynolds 2003). If an item had a high modification index or a large, standardized residual (> 2.58) or its removal improved model fit (Hu and Bentler 1999), we considered it for deletion. Then we inspected each candidate for deletion for its domain representativeness and deleted it if the remaining items associated with the same factor exhibited similar aspects (Nunnally and Bernstein 1994). In Table 2, the final confirmatory model, containing 36 items across 9 factors (4 items per factor), offers superior model fit according to all commonly reported indices. The total variance explained by the 36 items is 66.5%, and each factor explains more than 5% of the total variance (Hair et al. 2014). The final set of items, their loadings, labels for each factor or dimension, and the definitions are in Table 3.
Using the results of this CFA (Web Appendix W4), we also checked for validity and reliability (Fornell and Larcker 1981). The composite reliabilities of the dimensions exceeded 0.70, which provides evidence of reliability (Hair et al. 2014). The results confirm convergent validity, in that the average variance extracted (AVE) for each dimension is greater than 0.50. In support of discriminant validity, the AVE for each dimension exceeds the squared correlation between that dimension and any other dimension in the measurement model.
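These three checks follow from simple formulas on standardized loadings. The sketch below uses the commonly cited composite reliability and AVE expressions associated with Fornell and Larcker (1981); the loadings and the correlation are hypothetical values chosen for illustration, not estimates from the OCX studies.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings)
    error = sum(1 - value ** 2 for value in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(value ** 2 for value in loadings) / len(loadings)


def fornell_larcker_ok(ave_a, ave_b, correlation):
    """Discriminant validity holds if both AVEs exceed the squared correlation."""
    return min(ave_a, ave_b) > correlation ** 2


# Hypothetical standardized loadings for two four-item dimensions.
dimension_a = [0.80, 0.80, 0.80, 0.80]
dimension_b = [0.75, 0.72, 0.78, 0.70]

cr_a = composite_reliability(dimension_a)
ave_a = average_variance_extracted(dimension_a)
ave_b = average_variance_extracted(dimension_b)

print(round(cr_a, 3), round(ave_a, 3), round(ave_b, 3))
print(fornell_larcker_ok(ave_a, ave_b, correlation=0.60))
```

With these illustrative inputs, both dimensions clear the 0.70 and 0.50 thresholds named above, and a correlation of 0.60 (squared: 0.36) stays below both AVEs.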

Empirical assessment of the formative measurement model using CTA-PLS
Table 2. Illustrative OCX model fit comparisons (AMOS). Only one row survives in this excerpt (9 factors, 45 items), with the values 1699.73, 909, 1.87, .93, .05, .05, 2041.7, 2092.1, .92, and .62, each followed by two n/a entries; the column headers and thresholds are not recoverable here.

Any misspecification of a newly developed measurement model is a threat to the validity of subsequent structural equation modeling (SEM) results (Jarvis et al. 2003); empirical tests such as confirmatory tetrad analysis (CTA) (Bollen and Ting 2000) can confirm the appropriateness of a formative measurement model specification (Hair et al. 2017). Therefore, we undertook a CTA-partial least squares (PLS) analysis using SmartPLS (Gudergan et al. 2008) for data set 1 (Study 2), to enhance confidence in the model specification. The latent variable scores of the reflectively measured first-order dimensions provide the formative indicators of the second-order OCX construct, the bootstrapping subsampling is set to 5000, and the significance level is set to 0.1. The results (Table W4.3, Web Appendix W4) indicate that the bias-corrected and Bonferroni-adjusted confidence interval (CI) does not include 0, across multiple rows (tetrads). For example, the CI of OCX's tetrad 9 indicates a lower boundary of 0.04 and an upper boundary of 0.20 (p = .001); that for OCX's tetrad 161 has a lower boundary of −0.24 and an upper boundary of −0.05 (p = .001). These results confirm that the measurement model is not reflective.
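The logic behind the tetrad test can be illustrated with a point calculation. A tetrad for four indicators g, h, i, j is the difference cov(g,h)·cov(i,j) − cov(g,i)·cov(h,j); under a single-factor reflective model every such tetrad is expected to vanish. The sketch below only computes tetrad values for hypothetical covariance matrices. It does not reproduce the bootstrapped, bias-corrected, Bonferroni-adjusted intervals that SmartPLS reports.

```python
def implied_covariances(loadings):
    """Covariances implied by one common factor with standardized loadings."""
    size = len(loadings)
    return [
        [1.0 if r == c else loadings[r] * loadings[c] for c in range(size)]
        for r in range(size)
    ]


def tetrad(cov, g, h, i, j):
    """tau_ghij = cov[g][h] * cov[i][j] - cov[g][i] * cov[h][j]."""
    return cov[g][h] * cov[i][j] - cov[g][i] * cov[h][j]


# Hypothetical reflective case: four indicators of one common factor.
reflective_cov = implied_covariances([0.9, 0.8, 0.7, 0.6])

# Hypothetical case in which the indicators do not share one common factor.
other_cov = [
    [1.0, 0.6, 0.1, 0.2],
    [0.6, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.5],
    [0.2, 0.1, 0.5, 1.0],
]

print(tetrad(reflective_cov, 0, 1, 2, 3))
print(tetrad(other_cov, 0, 1, 2, 3))
```

The first tetrad is zero up to floating-point error, and the second is clearly nonzero. A CI that excludes 0 for some tetrads, as reported above, is the sample-based counterpart of the second case.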

Common method bias tests
We conducted several statistical tests to ensure the study was not contaminated by common method bias (CMB). First, a principal component analysis in SPSS with rotation set to none and factors to extract set to 1 reveals that the maximum variance explained by a single factor is 31.49%, so CMB is unlikely (< 50%), according to Harman's single-factor test. Second, we apply a marker variable technique (MacKenzie and Podsakoff 2012) and test for correlations between the dimensions of OCX and a conceptually unrelated, social desirability response scale using an AMOS-based CFA (i.e., the marker variable should not correlate with a variable in the survey if there is no theoretical relationship between them). As expected, the analysis does not show any association between the OCX dimensions and the marker variable (Web Appendix W5). Third, PLS-SEM can detect CMB, using a full collinearity assessment (Kock 2015). In the SmartPLS-based model assessment, the variance inflation factor (VIF) provides collinearity statistics, and inner VIF values lower than 5 indicate the model is unlikely to be affected by CMB (Hair et al. 2016). In Table 4, the inner VIF values are lower than 5 across all dimensions. Thus, the combined tests suggest that CMB is unlikely to be a concern for this study.

Nomological validity assessment of the newly developed OCX model
In Phase 3, we tested the association of the newly developed OCX model with satisfaction, loyalty, and WOM, using Study 2 data. In Table 2, all goodness-of-fit (GoF) indices are above conventional cut-off values (Hu and Bentler 1999), and the OCX model with 36 items contained in 9 factors offers the best model fit. The comparison of commonly used GoF indices implies that the data fit the proposed model reasonably well. In Table 3, the uniformly high and significant EFA, CFA, and PLS item-construct loadings suggest that the first-order dimensions of OCX are reflected well by the corresponding measurement items. In Table 4, the ratios are less than 0.9 across all dimensions, which confirms discriminant validity (Henseler, Ringle, and Sarstedt 2015). The VIF also is well below 5, confirming a lack of multicollinearity (Hair et al. 2016). All first-order dimensions have large (> 0.35) effect sizes (Henseler, Ringle, and Sinkovics 2009), such that they contribute substantially to the formation of OCX.

Table 3 notes: Study 2 (n = 359), Study 3 (n = 447), and Study 4 (n = 371). Items for each dimension are sorted by data set 2 PLS weight (high to low). The placeholder XYZ can be replaced with the name of any omnichannel retailer. Respondents rated items on a seven-point Likert response scale; each point was clearly labeled from "strongly agree" to "strongly disagree," with "neither agree nor disagree" as the midpoint. The PLS-SEM values are from SmartPLS 3, with weighting scheme = factor, iteration = 1000, complete bootstrapping with 5000 subsamples, and test type = two-tailed. All values are significant at p = .001. Items marked with C are included in the condensed OCX measurement model; see the Phase 7 discussion.
Using PLS-SEM also can support the construction of complex, reflective-formative models with many items (Wetzels, Odekerken-Schröder, and Van Oppen 2009). We assess the relationships of OCX with its outcomes using SmartPLS 3.2.8 (Hair et al. 2016). The second-order OCX in SmartPLS relies on repeated uses of the indicators of the first-order dimensions (Wetzels et al. 2009). Table 5 contains the standardized estimates (Hair et al. 2016), which affirm that OCX is positively associated with satisfaction (β = 0.76; t = 28.37), loyalty (β = 0.59; t = 15.04), and WOM (β = 0.66; t = 20.86). The assessments support the nomological validity of our OCX measurement model.

Phase 4: measurement model validation and finalization (Study 3)
With another data set (Study 3), representing responses from the relevant population, we validate the newly developed measurement model. It is appropriate to include at least one data set with responses from non-MTurk panel respondents in any study involving multiple stages and data sets (Hulland and Miller 2018). Therefore, we collected responses from April to May 2019 from Qualtrics panel members in the United States. In this study, in addition to satisfaction, loyalty, and WOM items, we include three content-valid items to assess share of wallet (SOW). The validation data set (data set 2) contains 447 responses (see Web Appendix W6).

² A blindfolding test predicts original values by reusing the sample after a systematic pattern of data point elimination, based on an omission distance (D). For example, an omission distance of D = 6 implies that every sixth data point is omitted in each blindfolding round, after which the test predicts every data point of the indicators used in the measurement model for a selected latent variable (Hair et al. 2020). With Q², we assess out-of-sample predictive ability by estimating the model on a training sample, then using the result to predict the outcomes for data in holdout samples (Hair et al. 2020). Q² values greater than 0 indicate meaningful relevance; values greater than .15 and .35 further indicate that the measurement model has medium or large predictive power, respectively, in relation to the focal endogenous constructs (Hair et al. 2016).

Table 5 (excerpt; each group of three values is listed in the order Study 2, Study 3, Study 4)
Trust in the Omnichannel Retailer (TRUST): n/a, n/a, .74 | n/a, n/a, 26.46 | n/a, n/a, .55
TRUST1. XYZ reminds me of someone who's competent and knows what he/she is doing: n/a, n/a, .91 | n/a, n/a, 78.11
TRUST2. XYZ has a name you can trust: n/a, n/a, .92 | n/a, n/a, 80.86
TRUST3. XYZ's product and service claims are believable: n/a, n/a, .90 | n/a, n/a, 66.03
TRUST4. Over time, my experiences with XYZ have led me to expect it to keep its promises, no more and no less: n/a, n/a, .87 | n/a, n/a, 46.78

Notes: Study 2 (n = 359), Study 3 (n = 447), and Study 4 (n = 371). The construct validity measures of the outcome variables are available in Web Appendixes W4 and W6. An HTMT estimate less than 0.90 supports the discriminant validity of the tested subconstructs (Henseler et al. 2015). The PLS-SEM values are from SmartPLS, with weighting scheme = path, iteration = 1000, complete bootstrapping with 5000 subsamples, test type = two-tailed, and repeated indicators for the second-order OCX. All values are significant at p = .001. Respondents rated one random SAT, WOM, LOY, and TRUST item at a time on a seven-point Likert response scale, and each point was clearly labeled, from "strongly agree" to "strongly disagree," with "neither agree nor disagree" as the midpoint. Each SOW item was measured using a different five-point response scale. The n/a (not administered) cells indicate that the SOW items were not administered in Study 2; the TRUST items, adopted from Erdem, Swait, and Valenzuela (2006), were not administered in Studies 2 and 3.

Measurement model validity assessments
To confirm the validity of the newly developed OCX measurement model, we employed AMOS and SmartPLS, using data set 2. In Table 2, all AMOS-based GoF indices under the Study 3 columns exceed conventional cut-off values (Hu and Bentler 1999). Thus, the data fit the proposed model reasonably well, and the OCX model with 36 items contained in 9 factors is the best option. In Table 3, the Study 3 columns also reveal uniformly high, significant CFA and PLS item-construct loadings, which confirm that the first-order dimensions of OCX are well reflected by the corresponding measurement items. In Table 4, Study 3 columns, the ratios of less than 0.9 across all dimensions confirm discriminant validity (Henseler et al. 2015). The VIF also is well below the cut-off value of 5, confirming a lack of multicollinearity (Hair et al. 2016) and reducing CMB concerns for this study. All first-order dimensions have large (> 0.35) effect sizes (Henseler et al. 2009), indicating their substantial contribution to the formation of OCX. The CTA-PLS results (Web Appendix W6) indicate that the CI does not include 0, across multiple rows (tetrads). For example, the CI for OCX's tetrad 57 has a lower boundary of −0.22 and an upper boundary of −0.03 (p = .001). The OCX measurement model is not reflective.

Nomological validity assessments
In Table 5, data set 2 columns, the uniformly high and significant PLS item-construct loadings confirm that the satisfaction, loyalty, WOM, and SOW constructs are well reflected by their corresponding measurement items. The PLS-based measures that validate the constructs are available in Web Appendix W6. In Table W6.2, a heterotrait-monotrait (HTMT) estimate of less than 0.90 supports the discriminant validity of the subconstructs (Henseler et al. 2015). The results in Table W6.3 also confirm that satisfaction (α = 0.89), loyalty (α = 0.80), WOM (α = 0.92), and SOW (α = 0.86) are reliable constructs. The standardized estimates in Table 5, data set 2 column, confirm that OCX is positively associated with satisfaction (β = 0.81), loyalty (β = 0.69), WOM (β = 0.75), and SOW (β = 0.42). In addition, three blindfolding tests with omission distances of 6, 9, and 12 confirm (Q² > 0) the predictive relevance of OCX for all four outcomes. The positive Q²_predict values (fold set to 10), ranging from 0.12 to 0.62, establish its superior predictive performance (Shmueli et al. 2016). Overall, the Phase 4 assessments thus confirm the sound psychometric properties and nomological validity of our proposed OCX measurement model.
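The omission pattern and the Q² statistic used in these blindfolding tests can be sketched as follows. This is a simplified illustration under stated assumptions: the data are hypothetical, the benchmark sum of squares is taken around the overall mean of the observed values, and the PLS re-estimation that SmartPLS performs in each round is not reproduced.

```python
def omission_rounds(n_points, distance):
    """Return one list of omitted indices per round.

    Round r omits every `distance`-th data point, starting at index r,
    so each data point is omitted exactly once across all rounds.
    """
    return [list(range(start, n_points, distance)) for start in range(distance)]


def q_squared(observed, predicted):
    """Q^2 = 1 - SSE / SSO, with SSO measured around the observed mean."""
    mean = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sso = sum((o - mean) ** 2 for o in observed)
    return 1 - sse / sso


# With 12 data points and omission distance D = 6 there are six rounds.
rounds = omission_rounds(12, 6)
print(rounds[0], rounds[5])

# Hypothetical observed values and two sets of predictions.
observed = [1.0, 2.0, 3.0, 4.0]
print(q_squared(observed, [1.0, 2.0, 3.0, 4.0]))  # perfect prediction
print(q_squared(observed, [2.5, 2.5, 2.5, 2.5]))  # mean-only prediction
```

Predictions that do no better than the mean give Q² = 0, which is why values above 0 are read as evidence of predictive relevance.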

Robustness check using FIMIX-PLS and PLS-MGA
To confirm the robustness of the structural model, we check for unobserved heterogeneity, to ascertain if an analysis of the entire data set is reasonable ( Hair et al. 2017 ). Unobserved heterogeneity occurs when subgroups of data exist (i.e., more than one significant segment is present in the observations) that produce substantially different structural model estimates. Therefore, we apply a finite mixture partial least squares (FIMIX-PLS) technique in SmartPLS (v. 3.2.8), using data set 2; the results are available in Web Appendix W7.
To determine the number of segments to retain, we check the fit indices for solutions with one to five segments (Table  W7.1). The indices do not yield conclusive evidence of the existence of more than one segment. According to the relative segment sizes across different FIMIX-PLS solutions (Table  W7.2), any solution with three or more segments should be discarded. A three-segment solution would involve a segment with only 81 (18% of 447) observations, less than the minimum sample size required for the OCX model assessment.
To assess the feasibility of a two-segment solution, we gather segment-specific R² values and weighted average R² values (Table W7.3). The R² values in segment 1 are slightly higher than in the full data set; those in segment 2 are slightly lower for loyalty and WOM and slightly higher for satisfaction and SOW. The weighted average R² values of the FIMIX-PLS two-segment solution are only slightly higher than those of the original sample, revealing negligible differences (from 0.002 to 0.04). According to the FIMIX-PLS segment-specific path coefficients (Table W7.4), the strengths of the standardized path coefficients from OCX to the endogenous variables do not differ substantially (< 0.07) across segments. The combined results empirically show that unobserved heterogeneity is not prevalent and confirm the robustness of the PLS-SEM analysis for the full data set.³ These assessments of the newly developed OCX measurement model confirm the model is valid and stable for different populations. Furthermore, the FIMIX-PLS and PLS-MGA (multi-group analysis) analyses indicate that the structural model is not affected by unobservable or observable heterogeneity, which helps confirm the validity of the OCX measurement model. Therefore, we finalize the items and dimensions, as reported in Table 3.⁴

Phase 5: OCX measurement model revalidation with another data set (Study 4)
The data for Phase 5 came from Qualtrics panel members in the United States, and we worked to avoid including any respondents from any of the previous studies. For example, we requested that Qualtrics refrain from sending invitations to the panel members who had responded to a previous study. Cross-checks of the respondents' Qualtrics account history and IPs helped confirm the uniqueness of this data set. At the beginning of the survey, potential respondents underwent screening, as in previous studies. The validation data set (i.e., Study 4) contains 371 responses, as detailed in Web Appendix W10.
In addition to satisfaction, loyalty, WOM, and SOW, we include items to measure trust in the omnichannel retailer, which might be generated by customers' assessments of quality or experience with the firm's offering ( Carù and Cova 2003 ). Because the OCX reflects customers' assessment of their overall experience with an omnichannel consumer goods retailer across all channels, we expect it relates to such forms of trust. As our laddering study showed empirically (Web Appendix W1, Table W1.5 and Figure W1.1), trust and confidence are key psychosocial consequences of customers' experiences with omnichannel retailers in the consumer goods sector. Leveraging an existing definition of trust in a firm ( Anderson and Weitz 1989 ), we define trust in the omnichannel retailer as a feeling of confidence in the retailer, such that customers believe the retailer is willing and able to deliver on its promises across all channels.

Revalidation of OCX measurement model
To revalidate the OCX measurement model, we employ several assessments with the new data set. First, EFA with maximum likelihood extraction and the number of factors fixed to 9 produced a factor structure identical to that in Phase 4; the SPSS-based EFA loadings are available in Table 3 (Study 4 columns). Second, the CFA in AMOS produces GoF indices that exceed conventional cut-off values, so the data fit the OCX model that features 36 items classified into 9 factors well (Table 2, Study 4 columns). Third, the uniformly high, significant CFA and PLS item-construct loadings confirm that the first-order dimensions of OCX are well reflected by the corresponding measurement items (Table 3, Study 4 columns). In Web Appendix W10, Table W10.2, the ratios are less than 0.9 across all dimensions; that is, the constructs in the OCX model are distinct (Henseler et al. 2015). In Table W10.3, the VIF values less than 5 revalidate the lack of multicollinearity (Hair et al. 2016) and confirm that CMB is not a concern. All the first-order dimensions have large (> 0.35) effect sizes (Henseler et al. 2009), confirming their substantial contributions to OCX. Fourth, the CTA-PLS results (Table W10.4) indicate that the CIs do not include 0 (e.g., tetrad 151 [0.05, 0.28]; p < .001). Thus, we reconfirm that the OCX measurement model is not reflective.

Phase 6: comparison of OCX model against potential alternative (Study 5)
We gathered 209 valid responses from omnichannel U.S. customers on MTurk (further details are available in Web Appendix W11), in an effort to demonstrate that the identified OCX dimensions are unique, relative to dimensions that have been constructed to measure single-channel phenomena, as in the well-established SERVQUAL measurement model. We also test the relative performance of OCX for predicting key outcomes.

Discriminant validity of OCX versus SERVQUAL dimensions
To compare the discriminant validity of the dimensions in OCX versus SERVQUAL, we developed a path model in SmartPLS that includes the nine first-order OCX formative dimensions and the five first-order SERVQUAL reflective dimensions. We used WOM ( α = 0.82; construct reliability = 0.89; AVE = 0.73) as the outcome variable. According to the HTMT-based discriminant validity test (Table W11.7), all OCX dimensions are distinct (HTMT < 0.9) from one another, as well as from all the SERVQUAL dimensions. The correlation matrix in Table W11.8 reveals low correlations ( < 0.7) between OCX and SERVQUAL dimensions. These empirical findings offer support for the argument that the nine OCX dimensions are unique relative to the dimensions of other SQ scales, as exemplified by SERVQUAL.
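The HTMT criterion applied here can be computed directly from an item correlation matrix: the mean correlation between items of two different constructs is divided by the geometric mean of the average within-construct item correlations. The sketch below follows that general definition from Henseler et al. (2015); the four-item correlation matrix is hypothetical and serves only to show the arithmetic, and the SmartPLS output used in the study is not reproduced.

```python
from itertools import combinations, product
from math import sqrt


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def htmt(corr, items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs.

    corr: square item correlation matrix (list of lists).
    items_a, items_b: row/column indices of each construct's items.
    """
    hetero = _mean(corr[i][j] for i, j in product(items_a, items_b))
    mono_a = _mean(corr[i][j] for i, j in combinations(items_a, 2))
    mono_b = _mean(corr[i][j] for i, j in combinations(items_b, 2))
    return hetero / sqrt(mono_a * mono_b)


# Hypothetical correlations: items 0-1 measure construct A, items 2-3 construct B.
corr = [
    [1.0, 0.8, 0.4, 0.4],
    [0.8, 1.0, 0.4, 0.4],
    [0.4, 0.4, 1.0, 0.5],
    [0.4, 0.4, 0.5, 1.0],
]

ratio = htmt(corr, items_a=[0, 1], items_b=[2, 3])
print(round(ratio, 3), ratio < 0.9)
```

In this toy matrix the between-construct correlations are well below the within-construct ones, so the ratio falls under the 0.9 cut-off used in the text.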

Relative predictive ability of OCX versus SERVQUAL dimensions
We actually do not recommend applying SERVQUAL to an omnichannel context, but to extend our analysis, we consider the ability of OCX, relative to SERVQUAL, to predict satisfaction, loyalty, WOM, SOW, and trust outcomes. For this analysis, we developed multiple path models to estimate (1) the impact of OCX on the outcome variables, (2) the impact of SERVQUAL on the outcome variables, and (3) the impacts of both OCX and SERVQUAL on individual outcome variables. As we show in Table 6 , OCX performs better than SERVQUAL in influencing satisfaction, loyalty, WOM, and SOW, whereas SERVQUAL outperforms OCX on trust. In the path models with both OCX and SERVQUAL, we find that SERVQUAL does not influence satisfaction, loyalty, or SOW significantly ( t -values < 1.96). Thus, OCX offers superior predictive ability with regard to these managerially critical outcomes.

Phase 7: validation of a condensed OCX model
We recommend administering all 36 items (Table 3) to assess the different dimensions of an omnichannel retailer's performance related to OCX, because this systematic approach can fully capture the nine salient dimensions and reveal where to invest resources to enhance overall OCX. Yet some managers and researchers may require a more parsimonious approach, for example, when they assess OCX in a supporting role rather than as a key construct (Netemeyer et al., 2003). Therefore, we rely on the three data sets from Studies 3-5 to validate a condensed 9-item OCX measurement model, with one item for each dimension. The correlations of the 9- and 36-item OCX measurement models are 0.96, 0.96, and 0.95 for Studies 3, 4, and 5, respectively.
With regard to the predictive ability of the condensed OCX measurement model, the adjusted R² values in Studies 3, 4, and 5, respectively, are as follows: satisfaction (0.60, 0.57, 0.61), loyalty (0.44, 0.40, 0.51), WOM (0.51, 0.50, 0.56), SOW (0.15, 0.09, 0.19), and trust (n/a, 0.53, 0.58). Although the estimates are statistically significant (p = .001), they are weaker than those obtained with the full model in Table 5. Then we compared the performance of the condensed 9-item OCX model with a shorter 5-item SERVQUAL model, using the Study 6 data set, and the results (Table W11.9) show that the condensed OCX model performs better. Collectively, these findings empirically validate both the full and condensed OCX scales.
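For readers who wish to relate adjusted R² to an unadjusted value, the following minimal sketch applies the standard textbook adjustment for the number of predictors. The function name and the inputs are hypothetical, and neither the sample sizes nor the unadjusted R² values of Studies 3-5 are taken from the study.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors estimated on n observations."""
    if n - k - 1 <= 0:
        raise ValueError("adjusted R^2 requires n > k + 1")
    # Penalize the unexplained share by the ratio of total to residual
    # degrees of freedom.
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical inputs: R^2 = 0.61, one predictor (OCX), 300 respondents.
print(round(adjusted_r2(0.61, n=300, k=1), 3))
```

With a single predictor and several hundred respondents, the adjustment is small, so adjusted and unadjusted values would differ only marginally.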

Phase 8: predictive validity tests using an experiment (Study 6)
To test OCX's predictive validity further, we conducted an experiment to manipulate omnichannel CX (high/low) and measure the predicted outcomes. Two vignettes provide varying information about a fictional retailer, one that describes high OCX across all nine dimensions and another that indicates low OCX for all nine dimensions.⁵ With a between-subject design, the survey experiment, developed using Qualtrics, randomly displayed one of the two vignettes to respondents. After reading the vignette, they completed the 9-item condensed OCX model, 5-item condensed SERVQUAL model, and the outcome measures, as well as a manipulation check (see Web Appendix W12). A comparison of means (Table 7) between the two OCX groups, according to an independent samples t-test, indicates a significant difference of 2.79 (t = 19.21, p < .001, r = 0.64). The t-test also confirms significant differences between groups for several outcomes (SQ, satisfaction, loyalty, WOM, trust; Table 6). Furthermore, OCX consistently outperforms SERVQUAL if we assess their effects in the same path model (Table 5). Finally, an importance performance map analysis (IPMA; see details in Web Appendix W9) confirms that OCX is a reliable diagnostic tool that retail managers can use to identify attributes that demand particular attention.

⁵ We conducted a pretest to check the effectiveness and realism of the vignettes among 40 MTurk respondents, according to a within-subject design. Thus, all respondents received both vignettes and responded to two questions for each scenario: (1) whether they believe the fictional retailer is an excellent omnichannel retailer and (2) whether the scenario is realistic. A paired sample t-test confirms significant differences across vignettes (M_OCX_High = 6.35, SD = .84; M_OCX_Low = 4.03, SD = 1.94; t = 7.16, p < .001); a one-sample t-test relative to the scale midpoint (4) affirms that both vignettes are realistic (M_OCX_High = 5.78, SD = 1.37, t = 8.21, p < .001; M_OCX_Low = 5.18, SD = 1.52, t = 4.89, p < .001).

Table 7 Comparison of means for OCX (High) and OCX (Low) groups (Study 6; independent t-test). Column headings: Constructs; Mean; Mean Difference.
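As a worked check on the pretest statistics, the following minimal sketch recomputes the one-sample t-values for vignette realism from the reported means, standard deviations, sample size (40), and scale midpoint (4). It also includes a helper for the common conversion r = sqrt(t² / (t² + df)); that this conversion underlies the reported r = 0.64 is an assumption, and because the degrees of freedom for Study 6 are not stated in this section, no value is computed for it. Both function names are hypothetical.

```python
from math import sqrt


def one_sample_t(mean, sd, n, mu0):
    """t statistic for H0: population mean equals mu0, from summary statistics."""
    return (mean - mu0) / (sd / sqrt(n))


def r_from_t(t, df):
    """Effect size r implied by a t statistic with df degrees of freedom."""
    return sqrt(t * t / (t * t + df))


# Realism ratings from the pretest (n = 40, scale midpoint = 4).
t_high = one_sample_t(5.78, 1.37, 40, 4)
t_low = one_sample_t(5.18, 1.52, 40, 4)
print(round(t_high, 2), round(t_low, 2))  # prints 8.22 4.91
```

These values are close to the reported 8.21 and 4.89; the small gaps are presumably due to rounding in the reported means and standard deviations.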

General discussion
This study achieves two broad, important objectives. First, it provides a robust theoretical basis for explaining perceived omnichannel customer experience (OCX) and establishing an OCX model specification for consumer goods retail settings. Second, this study responds to calls to develop a new model to measure CX that addresses relevant cues and encounters across all channels, which determine the overall customer experience in omnichannel retailing settings (Lemon and Verhoef 2016). With rigorous construct validation procedures, we establish a hierarchical conceptualization and measurement of a comprehensive OCX construct. In multiple studies, we also empirically demonstrate its suitability; the measurement instrument contains 36 items, related to 9 first-order quality dimensions that form the second-order OCX construct. The OCX scale (and its 9-item short form) offers precise, actionable measures for retailers to gain insights into customers' perceptions of their omnichannel retail experience.

Theoretical contributions
The OCX measurement model contributes to retail and services literature. First, we address a key question related to the appropriate understanding and measurement of CX in an omnichannel retailing era (Inman and Nikolova 2017; Lemon and Verhoef 2016; Ostrom et al. 2015; Verhoef et al. 2015). By theorizing OCX, underpinned by schema and categorization theory, we predict how customers' need to reduce the cognitive load induced by omnichannel retail shopping triggers a cognitive process that seeks an overall assessment. Furthermore, we explain how accommodation during the categorization process can guide the development of schema (i.e., OCX) that facilitate omnichannel retail shopping. This theoretical contribution can also inform researchers' efforts to conceptualize other constructs in omnichannel contexts.
Second, omnichannel retailing is complex, and many attributes contribute to the formation of OCX. Our proposed model establishes nine important dimensions. The value dimension depends on customers' evaluative judgments of the appropriateness of an omnichannel retailer's products and pricing. With respect to the personalization dimension, an excellent omnichannel retailer offers tailored services, products, and transactional environments to meet the needs of individual customers in any of its channels. Regarding the customer service dimension, an omnichannel retailer should provide excellent customer support services across all its channels at all stages of the customer journey. The consistency dimension pertains to product assortment and pricing consistency across channels. The delivery dimension implies superior delivery and pick-up services provided by the retailer in any channel. With respect to the product returns dimension, customers' experience with the retailer's handling of product returns and exchanges informs OCX. The social communication dimension reflects the influence of reputation mechanisms across channels on customers' assessments of the omnichannel retailer's quality. At any stage of the customer journey, an excellent omnichannel retailer ensures strong safety measures to protect customers against payment fraud or loss of personal information, as measured by the information safety dimension. Finally, the loyalty programs dimension affirms that the excellence of the omnichannel retailer's loyalty program across channels is a key component of OCX. This study provides insights into the relevance of the criteria that omnichannel customers use.
Third, with empirical findings across multiple stages, we confirm that all nine first-order dimensions of OCX are distinct and cannot be merged or deleted (e.g., in an effort to reduce the number of items required to measure customers' perceptions), without changing the meaning of the OCX construct. Thus, we propose a hierarchical model of OCX: a second-order factor model that links omnichannel retail quality perceptions to distinct, actionable dimensions. The empirical results align with our MEC theoretical underpinning; the relationship between the first-order dimensions of OCX and the second-order construct OCX is formative rather than reflective in nature. Continued omnichannel studies would benefit from adopting hierarchical models.
Fourth, the rigorous measurement development procedure we have undertaken can serve as a guide for continued efforts to develop perceived quality/experience measurement models in other domains. As we demonstrate, to develop a model based in MEC theory, researchers should use a laddering method to explore attributes, conceptualize distinct and relevant quality dimensions with a series of quantitative and qualitative tests, and use a formative mode to link the dimensions to the overall construct. Additional model development studies might follow the steps that we outline herein and apply the methodological approach we detail for each phase.
Fifth, we empirically validate positive relationships of OCX with satisfaction with the retailer, omnichannel customers' loyalty intentions, word-of-mouth intentions, share of wallet, and trust in the omnichannel retailer in a consumer goods setting. As such, this study extends the theoretical relationship between an experiential construct and its outcomes into an omnichannel domain. Studies that seek to test the nomological validity of their theoretical frameworks can use these rigorously validated adapted outcome measures.

Implications for practitioners
This study provides new insights that might enhance the effective design of omnichannel CX in practice. Recent technology and increased customer-retailer and customer-customer interactions across channels and touchpoints allow retailers to capture vast volumes of data. Confronted with all these data, managers of consumer goods retailers can benefit from knowing which metrics to consider in their omnichannel environment to support their efforts to fuel growth. By gauging the performance of the nine first-order dimensions of OCX, omnichannel retail managers can identify which investments are likely to result in improved CX across channels. By assessing OCX at the first-order dimension level, the retailers also can assess each quality dimension in depth, then differentiate well versus poorly performing dimensions. With such information, managers can focus on specific features to improve, then evaluate the impacts of their strategic actions and investments on the overall CX.
Retailers also need robust measures of customer satisfaction and loyalty in omnichannel retail settings (Kumar et al. 2017b). This study offers a parsimonious, easy-to-administer measurement tool (long- or short-form) that can predict customers' satisfaction with omnichannel retailers. With importance performance map analyses (see Web Appendix W9), we find that the customer service, value, and social communication dimensions are the most critical for customer satisfaction and loyalty. Omnichannel retailers should invest in optimizing these dimensions if they seek higher profitability. Retail managers also might administer OCX, perhaps in the short form, in periodic customer surveys to monitor customers' changing preferences and responses to CX improvement initiatives (De Keyser et al. 2020).
In addition to helping managers in omnichannel consumer goods markets, these insights can help providers, such as Qualtrics, Forrester, and McKinsey, develop new tools they can offer to their retailer clients; they even might establish the OCX measurement model as a key offering. If they assess different omnichannel retailers, these service providers could compare competitors' performance in a market, then benchmark a focal retailer's performance (De Keyser et al. 2020).

Directions for further research
This study addresses OCX in the consumer goods retail sector, and the OCX measurement model offers an effective measure of customers' overall evaluations of their experiences with omnichannel retailers. Schema theory postulates that schema such as OCX also might spread through customers' perceptual category structures to facilitate information processing in other, similar contexts (Lajos et al. 2009). Such a process might reduce customers' cognitive load further, in that they could use the existing schema as a template rather than developing entirely new schema (Lajos et al. 2009). The robust theoretical basis that we establish for conceptualizing customers' evaluations of their omnichannel experience in turn provides a foundation for researchers to advance omnichannel literature further, in domains beyond consumer goods (e.g., tourism, education, finance, public agencies) (Kumar, Anand, and Song 2017a). Customers' assessments of their experiences also may differ across industries, contexts, and cultures (Lemon and Verhoef 2016), so we call for studies of other omnichannel markets.
Marketing literature has moved beyond purchase-related considerations to focus on customer engagement (comprising four dimensions: customers' lifetime value, referral value, influence value, and knowledge value; Kumar and Pansari 2016) and efforts to help retailers allocate their resources more efficiently to drive long-term profitability. Recent conceptual studies suggest CX in omnichannel settings affects customer engagement (Kumar et al. 2017a), so we hope empirical studies might explore the relationships of OCX with each engagement dimension. Furthermore, omnichannel retailing creates unstructured, multimodal data, gathered through digital, social media, and mobile technologies (Wedel and Kannan 2016). The OCX framework we propose might inform further research into multimodal communication, including text and picture mining or machine learning model development (Liu, Burns, and Hou 2017; Ordenes et al. 2018). We hope researchers continue to make use of the OCX dimensions in dynamic fashion.

Executive Summary
Efforts to measure customer experiences (CX) in multifaceted, omnichannel, retail contexts are crucial but lacking research guidance. Prior service quality literature has effectively established how to measure CX in traditional, single-channel contexts but has not adapted such measures to the omnichannel context.
With a mixed method research design and studies in eight phases, the authors propose a comprehensive measurement instrument that incorporates a schema- and categorization-based theoretical conceptualization of how customers assess omnichannel retail experiences, together with means-end chain theory, to explain perceived omnichannel customer experience (OCX) as a construct.
OCX captures the following omnichannel evaluation dimensions: social communications, value, personalization, customer service, consistency of product availability and prices across channels, information safety, delivery, product returns, and loyalty programs. Multiple applications of the measurement model empirically confirm the suitability of this instrument in consumer goods omnichannel retail settings; its 36 items reflect nine first-order quality dimensions that combine to form the overall, second-order OCX construct. A 9-item short-form is also provided.
The measurement instrument offers sound psychometric properties, as confirmed by several reliability and validity tests, and predicts customer behavior reliably across studies. Thus, the OCX measurement instrument offers utility for theory, management practice, and further research.

Supplementary materials
Supplementary material associated with this article can be found, in the online version, at doi: 10.1016/j.jretai.2022.03.003.