A Risk and Benefits Behavioral Model to Assess Intentions to Adopt Big Data

Every day, a constant stream of data is generated as a result of social interactions, the Internet of Things, e-commerce and other business processes. This vast amount of data must be collected, stored, transformed, monitored and analyzed in a relatively brief period of time, because it may hold the answers behind business insights and new ideas that foster competitiveness and innovation. Big Data technologies and methodologies have emerged as the solution to this need. However, because this is a relatively new trend, much remains unknown. This study, based on a risk and benefits perspective, uses the theory of planned behavior to develop a model that predicts the intention to adopt Big Data technologies.


Introduction
Understanding the adoption of information technology (IT) innovations continues to be a challenge for information systems (IS) researchers (Venkatesh, 2006). Every aspect of society, including business and culture, is currently in the midst of a technology-based phenomenon. Advances in digital sensors, communications, mobile networks, storage, processing and cloud computing have given rise to huge collections of data, capturing information that is valuable to business, science, governments, and society (Bryant et al. 2008, Firestone 2010). By 2020, more than 2.7 zettabytes of data will be created annually, reaching 35 zettabytes (IDC 2011), which will call into question the ability of firms to analyze information. Traditional decision-making systems are incapable of adequately resolving this problem. Therefore, companies are starting to roll out their own Big Data initiatives and to build massive database systems to drive significant new growth in their business operations (Manyika et al., 2011).
Although the concept of Big Data has existed since 2001, when the META Group analyst Doug Laney (Laney 2001) defined data growth challenges and opportunities as being three-dimensional, i.e. increasing volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources), only in the last two years has Big Data become one of the IT industry's hottest topics. In the press literature, Big Data is characterized as the new generation of technologies and architectures designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery and/or analysis (Woo et al. 2011).
The Big Data market is expanding rapidly, as many firms are spending significant resources on related projects or are planning to do so. According to IDC (2012), this market is expected to grow from $3.2 billion in 2010 to $16.9 billion in 2015, based on the premise that these technologies will improve operational efficiency and drive innovation.
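The growth implied by these two IDC figures can be expressed as a compound annual growth rate. The following is a minimal sketch, assuming five annual compounding periods between 2010 and 2015; the function name is ours and is not taken from the IDC report.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Constant yearly growth rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of US dollars, as quoted from IDC (2012).
# Assumption: growth compounds once per year over 2015 - 2010 = 5 periods.
rate = compound_annual_growth_rate(3.2, 16.9, 2015 - 2010)
print(f"Implied compound annual growth rate: {rate:.1%}")
```

Under these assumptions the implied rate is about 39.5% per year.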
Software vendors such as IBM, Oracle, Microsoft, EMC and SAP are already providing Big Data services as a source of competitive advantage for their customers.
Big Data systems are being implemented in multiple domains, including commerce, science, and society (Bryant et al. 2008), but many companies are still not interested in this new trend. A Big Data survey conducted in June 2012 by IDC found that 47% of 502 companies across different industries think that they do not need Big Data technologies and 25.8% of them do not see the value it can generate for their companies. Simon (2010) provides a sobering statistic: three out of five Big Data projects do not meet expectations in terms of cost and performance. The major implementation costs are incurred during the integration of Big Data into the existing IT framework. Also, given the high level of sophistication required for Big Data projects (McKinsey 2011), fears related to implementation work against adoption.
Altogether, these facts lead to the conclusion that the market is at an early stage of adoption and that only early adopters are betting on these new technologies.
Overall, Big Data represents a disruption in decision-making by enabling business processes to be effectively based on information. Nonetheless, the main challenge at this point is not the deployment of the technology, but rather the transformation of the culture, processes, and people within organizations.
The overall purpose of this study is to explore the impact of the perceived risks and benefits of Big Data technologies on the intention to adopt them. Since behavioral intention may not be reflected in actual use, this paper also examines the relationship between intended and actual use.

Theoretical background
The academic literature on Big Data is still scarce. Recently published articles focus more on the software, algorithms and hardware needed for Big Data, especially on technologies such as Hadoop, while adoption decision issues remain unaddressed.
The initial definition of Big Data was composed of three dimensions (known as the 3Vs model): volume, variety and velocity. Volume refers to the need for intensive and complex processing of data subsets that actually contain information of value for an organization. Variety refers to the combination of different types of data from different sources. The attribute of variety therefore alludes to the fact that data can come from inside or outside the organization, and may be structured, semi-structured, or unstructured. Finally, velocity reflects the fact that not all of the data in an organization has the same urgency of analysis. There is a full range of velocities: from data that can be batch processed (as in the case of data warehousing) to data that must be processed in real time (when continuous data streams need to be analyzed). The key to understanding velocity in Big Data is to clearly identify the informational requirements of the processes and business users.
In 2012, Gartner updated its definition as follows: "Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization." (Laney 2012).

Perceived benefits of big data
There is a fourth characteristic of Big Data: value. In the context of Big Data, value refers to (1) the cost of the technology, which has dropped enough to allow more companies to undertake this type of project, and (2) the benefits generated by the use of Big Data (cost reduction, operational efficiency, business improvements and new revenue streams).
Like any other new technology, Big Data comes with benefits and drawbacks. Table 1 presents a list of several key benefits and risks developed by McKinsey Global Institute (2011).

Table 1. Key benefits and risks of Big Data (McKinsey Global Institute 2011)
Benefits: Creating transparency by making data accessible to relevant stakeholders in a timely manner; Improve operational efficiency (cost, revenue and risk); Use data and experiments to expose variability and raise performance; Segment populations to customize the way your systems treat people; Use automated algorithms to replace and support human decision making; Innovate with new business models, products, and services; Sector-specific business value creation.
Risks: Data quality; Talent scarcity (lack of data scientists); Privacy and security concerns; Big Data integration capabilities; Decision-making; Organizational maturity level.

The theory of planned behavior (TPB) was proposed by Ajzen (1988, 1991). TPB encompasses three constructs, namely the attitude toward the behavior, subjective norm, and perceived behavioral control, that when combined form behavioral intention. Intention is then assumed to be the immediate antecedent of behavior (Ajzen 2002). Table 2 presents brief descriptions of the constructs used in TPB.

Antecedents of big data adoption intention
Based on the decomposed theory of planned behavior (DTPB; Taylor and Todd 1995), Big Data adoption intention in our research model is jointly determined by the individual's attitude towards Big Data, subjective norms, and perceived behavioral control. Thus we hypothesize:

Table 2. Constructs used in TPB
Construct: Definition
Behavioral Intention: Refers to an individual's intention to perform a behavior and is a function of Attitude, Subjective Norm and Perceived Behavioral Control.
Attitude: Refers to an individual's positive or negative evaluation of the behavior (Ajzen, 1988).
Subjective Norm: Refers to an individual's "perception of social pressure to perform or not to perform the behavior" (Ajzen, 1988, p.132).
Perceived Behavioral Control: Refers to the "perceived ease or difficulty of performing the behavior and reflects past experience as well as anticipated impediments and obstacles" (Ajzen, 1988, p.132).

Taylor and Todd (1995) also specified that, based on the diffusion of innovation theory, the attitudinal belief has three salient characteristics that influence adoption: relative advantage, complexity and compatibility (Rogers, 1983). Relative advantage refers to the degree to which an innovation provides benefits superseding those of its precursor. This may incorporate factors such as economic benefits, image enhancement, convenience and satisfaction (Rogers 1983). Complexity represents the degree to which an innovation is perceived to be difficult to understand, learn or operate (Rogers, 1983). The complexity construct is very similar to "perceived ease of use" in the technology acceptance model (Davis 1989), although it is conceived in the opposite direction. Innovative technologies that are perceived to be easier to use and less complex are more likely to be accepted and used by potential users. Thus, complexity would be expected to have a negative relationship to attitude. Complexity (and its corollary, ease of use) has been found to be an important factor in the technology adoption decision (Davis et al. 1989).

Theoretical model and research hypotheses
Synthesizing the theoretical background, we propose the following model (see figure 1) based on DTPB for understanding factors influencing Big Data adoption.

Antecedents of Big Data Adoption
Based on DTPB, the adoption of Big Data will be determined by the intention to adopt Big Data and by perceived behavioral control. As a consequence, we hypothesize:

Antecedents of attitude
Big Data requires technologies that process and analyze large amounts of heterogeneous data within the right time frame. These technologies include A/B testing, association rule learning, classification, cluster analysis, crowdsourcing, data fusion and integration, ensemble learning, genetic algorithms, machine learning, natural language processing, neural networks, pattern recognition, predictive modeling, regression, sentiment analysis, signal processing, supervised and unsupervised learning, simulation, time series analysis and visualization, Massively Parallel-Processing (MPP) databases, search-based applications, data-mining grids, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems. Depending on its degree of knowledge of these technologies, an organization may consider Big Data more or less easy to use.
It is reasonable to infer that perceived ease of use positively influences the company's perceived usefulness and intention to adopt Big Data. Therefore, we hypothesize that:
H7. Perceived ease of use has a positive effect on attitude towards Big Data.
Perceived Usefulness is defined as the degree to which a person believes that adopting Big Data would enhance his or her job performance (Davis 1989). Therefore, we hypothesize that:

H6. Perceived usefulness has a positive effect on attitude towards Big Data.
Also, as previously discussed, there are three main reasons for Big Data adoption, namely volume, variety and velocity. Thus we hypothesize:
H1. Volume has a positive effect on perceived usefulness towards Big Data.
H2. Variety has a positive effect on perceived usefulness towards Big Data.
H3. Velocity has a positive effect on perceived usefulness towards Big Data.
As discussed above in the section on perceived benefits, Big Data generates many potential benefits for companies, such as cost control, revenue generation, risk control and improved decision-making. Therefore, it is reasonable to infer that the perceived benefits of Big Data technologies positively influence the company's attitude and intention to adopt Big Data.

H5. Perceived benefits have a positive effect on attitude towards Big Data.
Similarly, it is reasonable to infer that the perceived risks of Big Data negatively influence the company's attitude and intention to adopt Big Data. These risks include talent scarcity, organizational maturity, internal Big Data capabilities and data quality.

H4. Perceived risk has a negative effect on attitude towards Big Data.
Compatibility is the degree to which the innovation fits with the potential adopter's existing values, previous experience and current needs (Rogers, 1983). Tornatzky and Klein (1982) found that an innovation is more likely to be adopted when it is compatible with the job responsibilities and value system of the individual. Therefore, it may be expected that compatibility has a positive influence on Big Data adoption. The existence of information systems such as e-commerce platforms, Enterprise Resource Planning (ERP), Business Intelligence (BI), Customer Relationship Management (CRM) or Product Lifecycle Management (PLM), the availability of external sources of information and the need to make decisions in near real time are factors that generate Big Data situations. It is reasonable to infer that compatibility has a positive influence on attitude towards Big Data. Hence, we hypothesize:
H8. Compatibility has a positive effect on attitude towards Big Data.

Antecedents of perceived behavioral control
According to Ajzen (1988), Perceived Behavioral Control reflects beliefs regarding access to the resources and opportunities needed to perform a behavior or, alternatively, regarding the internal and external factors that may impede performance of the behavior. This notion encompasses the components of "facilitating conditions" (Triandis 1980) and self-efficacy (Bandura 1982). In this research, we define Perceived Behavioral Control as the degree to which external and internal factors influence the adoption of Big Data. Thus, we hypothesize:
H9. Self-efficacy has a positive effect on Perceived Behavioral Control to adopt Big Data.
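The numbered hypotheses stated so far can be summarized as a list of directed paths, which is also the form in which a structural model is typically specified for path analysis. The following is an illustrative sketch only: it covers H1 to H9 as worded above, omits the paths into adoption intention and actual adoption because they are not stated as numbered hypotheses in this text, and uses names (`HYPOTHESES`, `predictors_of`) that are ours.

```python
# Each tuple: (hypothesis label, predictor, outcome, expected sign of the effect).
HYPOTHESES = [
    ("H1", "Volume", "Perceived usefulness", "+"),
    ("H2", "Variety", "Perceived usefulness", "+"),
    ("H3", "Velocity", "Perceived usefulness", "+"),
    ("H4", "Perceived risk", "Attitude", "-"),
    ("H5", "Perceived benefits", "Attitude", "+"),
    ("H6", "Perceived usefulness", "Attitude", "+"),
    ("H7", "Perceived ease of use", "Attitude", "+"),
    ("H8", "Compatibility", "Attitude", "+"),
    ("H9", "Self-efficacy", "Perceived behavioral control", "+"),
]


def predictors_of(outcome):
    """Return (label, predictor, sign) for every hypothesized path into `outcome`."""
    return [(label, src, sign) for label, src, dst, sign in HYPOTHESES if dst == outcome]


for outcome in ("Perceived usefulness", "Attitude", "Perceived behavioral control"):
    for label, predictor, sign in predictors_of(outcome):
        print(f"{label}: {predictor} -> {outcome} ({sign})")
```

Listing the paths this way makes it easy to check that every construct in the questionnaire appears in at least one hypothesis.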

Research methodology
Data for this study were collected using an online survey questionnaire. The participants in the survey were managers involved in Big Data adoption decisions and usage, such as CIOs, marketing directors, and business analytics managers.
Based on the list of the top 100 Spanish companies, we contacted the participants through email and/or LinkedIn. The questionnaire has two parts. The first covers demographic information and control variables such as the job role of the participant, the size of the company, and the existence of a data mining data center. The second part covers the theoretical model. The measurement items in the questionnaire were developed for the decision variables of attitude, perceived behavioral control, intention to adopt, and actual adoption by adapting the measures proposed and validated by Ajzen (2002) to fit the Big Data context. A total of 53 responses were received.
A structural equation modeling (SEM) technique was used to examine the relationships among the constructs. The Partial Least Squares (PLS) approach was chosen for its capability to accommodate small samples (Chin 1998). Further, PLS recognizes two components of a causal model: the measurement model and the structural model. Additionally, PLS is especially suitable for exploratory research focusing on explaining variance. Given the above, PLS seemed particularly relevant for this exploratory study, which is limited by sample size.
Table 4 shows the factor loadings, Cronbach's alphas (A), average variance extracted (AVE), and R² values. All Cronbach's alphas exceeded the recommended minimum value of 0.7, with the exception of the perceived risks variable, and all of the observed construct reliabilities (C.R.) were higher than 0.8 (Fornell and Larcker 1981), again with the exception of the perceived risks variable. All construct loadings were found to be significant at the recommended 0.05 level (Gefen and Straub 2005) and typically exceeded the recommended threshold value of 0.707 (Barclay et al. 1995), with the exception of some items of the perceived risk, perceived benefits and behavioral intention constructs, which loaded below this threshold.
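For readers unfamiliar with the two reliability statistics reported here, the following sketch shows how they are commonly computed. It is not the procedure or software used in this study: the item scores and loadings are hypothetical, the function names are ours, and the composite reliability formula assumes standardized loadings with uncorrelated measurement errors.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one construct.

    items: one list of respondent scores per questionnaire item,
    all lists in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    sum_item_variances = sum(variance(scores) for scores in items)
    totals = [sum(answers) for answers in zip(*items)]  # per-respondent sum score
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


def composite_reliability(loadings):
    """Composite reliability from the standardized loadings of one construct."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + error_variance)


# Hypothetical 5-point Likert answers of six respondents to three attitude items.
attitude_items = [
    [5, 4, 4, 2, 3, 5],
    [4, 4, 5, 2, 3, 4],
    [5, 3, 4, 1, 3, 5],
]
alpha = cronbach_alpha(attitude_items)
cr = composite_reliability([0.82, 0.78, 0.85])  # hypothetical loadings

print(f"Cronbach's alpha: {alpha:.3f} (recommended minimum 0.7)")
print(f"Composite reliability: {cr:.3f} (threshold used above: 0.8)")
```

With these made-up numbers both statistics clear the thresholds cited above; a construct such as perceived risks would fail the same checks if its items correlated weakly with one another.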
The average variance extracted (AVE) was found to account for a minimum of 50 percent of the variance in each construct, and the square root of the AVE for each construct was much larger than the construct's correlation with every other construct (Barclay et al. 1995; Gefen and Straub 2005). Measurement items loaded on their respective constructs at a value at least 0.1 greater than their loading on other constructs (Barclay et al. 1995; Gefen and Straub 2005), and all items loaded higher on their intended construct than on any other construct. Hence, it was concluded that the construct measurement items were consistent and exhibited a substantial degree of convergent and discriminant validity.
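The AVE threshold and the comparison of the square root of the AVE with inter-construct correlations can be illustrated as follows. This is a minimal sketch with hypothetical loadings and correlations, not the study's data; the function names are ours, and the AVE is computed from standardized loadings.

```python
from math import sqrt


def average_variance_extracted(loadings):
    """Mean squared standardized loading of one construct's items."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


def passes_sqrt_ave_check(ave_by_construct, correlations):
    """Discriminant validity check described in the text.

    correlations maps (construct_a, construct_b) pairs to their correlation.
    Each correlation must be smaller in magnitude than the square root of
    the AVE of both constructs involved.
    """
    for (a, b), r in correlations.items():
        smaller_root = min(sqrt(ave_by_construct[a]), sqrt(ave_by_construct[b]))
        if abs(r) >= smaller_root:
            return False
    return True


# Hypothetical standardized loadings for two constructs.
ave = {
    "attitude": average_variance_extracted([0.82, 0.78, 0.85]),
    "intention": average_variance_extracted([0.75, 0.80, 0.72]),
}

for name, value in ave.items():
    print(f"{name}: AVE = {value:.3f}, convergent validity ok = {value >= 0.5}")

# Hypothetical inter-construct correlation.
print(passes_sqrt_ave_check(ave, {("attitude", "intention"): 0.55}))
```

A correlation of 0.80 between the same two hypothetical constructs would fail the check, because it exceeds the square root of the AVE of the intention construct.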

Discussion
Adding to previous literature on Big Data, the first contribution of this study is the recognition that volume and velocity are the key aspects of Big Data adoption and have a significant impact on the intention to adopt these technologies. Although variety does not yet seem to have such an effect, it is expected to become an important factor in determining adoption. The logic is that the more heterogeneous and unstructured the data is, the higher the barriers to capturing and analyzing it. What is clear is that, because corporate systems are built on Database Management Systems (DBMS), companies perceive volume and velocity as more urgent matters than variety. Also, companies have traditionally focused on numerical and structured data rather than on working with different types of data. However, with the increasing diversity of data, being able to manage that aspect will play a key part in companies' data strategy.
Even though the traditional definition of perceived usefulness does not have an impact on the attitude toward Big Data, our model shows that perceived benefits have a significant impact on behavior. Thus, in the subsequent/confirmatory study we plan to use perceived benefits as the construct that replaces perceived usefulness.
Regarding perceived risks, the exploratory results suggest that the perceived risks variable and its measurements need to be redefined. The construct loadings are not statistically relevant, so the definition of the construct needs to be adjusted. Hence, the definition of the potential Big Data risks needs to be reviewed and perhaps extended with additional risks. However, the results suggest that perceived risks might have a moderate effect on the attitude towards Big Data adoption. Finally, our results suggest that media and press news about Big Data have a stronger impact on the decision to adopt Big Data than social influences (suggestions from friends and/or colleagues to adopt Big Data).
Therefore, the results indicate that specific opportunities as well as challenges exist in the adoption of Big Data technologies.

Considerations and future work
This research-in-progress contributes to the existing body of knowledge on Big Data by developing a theoretical model to explore and predict the intention to adopt Big Data technology. By extending the theory of planned behavior with the concepts of perceived benefits, perceived risks and perceived usefulness of Big Data, we seek to understand the adoption of Big Data. Overall, our exploratory results suggest that the proposed model is a fruitful first step toward a theoretical model that predicts Big Data adoption.
Also, our exploratory model provides useful evidence for further research and analysis, especially in terms of perceived risks and of the variables that affect the attitude towards adopting Big Data, such as velocity and volume.
As future work, we will review the literature on Big Data risks, redesign the perceived risks construct, and then conduct a confirmatory study with a larger sample.