Bigdata Driven Cloud Security: A Survey

Cloud Computing (CC) is a fast-growing technology for performing massive-scale and complex computing. It eliminates the need to maintain expensive computing hardware, dedicated space, and software. Recently, massive growth has been observed in the scale of data, or big data, generated through cloud computing. CC consists of a front end, which includes the users' computers and the software required to access the cloud network, and a back end, which consists of the various computers, servers, and database systems that create the cloud. The leading traditional cloud ecosystem delivers its services as SaaS (Software as a Service: end users utilize outsourced software), PaaS (Platform as a Service: a platform is provided), IaaS (Infrastructure as a Service: the physical environment is outsourced), and DBaaS (Database as a Service: data can be housed within a cloud), and it has become a powerful and popular architecture. Many challenges and issues remain, and security threats are the most vital barrier for the cloud computing environment. The main barrier to the adoption of CC in health care relates to data security. When data is placed on and transmitted over public networks, cyber attacks in any form are to be anticipated in CC. Hence, cloud service users need to understand the risk of data breaches and choose the service delivery model carefully during deployment. This survey covers CC security issues in depth (including data security in health care) so that researchers can develop robust security application models using Big Data (BD) on CC (which can be created and deployed easily). BD evaluation is driven by fast-growing cloud-based applications developed using virtualized technologies. In this context, MapReduce [12] is a good example of big data processing in a cloud environment and a model for cloud providers.

layer (SSL) and Transport Layer Security (TLS). With multi-tenancy, the data of multiple users are stored at the same location. Intrusion can be prevented through VM placement.

IaaS: It deals with the delivery of computing resources such as servers, storage, and network in the form of virtualized systems that can be accessed through the Internet. It offers infrastructure as a service on which appropriate applications can be executed, and its building blocks can be combined or layered to derive a customized environment. Popular IaaS clouds include Amazon Web Services (AWS) and Rackspace. Security issues: Providers must address the associated issues to minimize the threats that arise from creation, communication, monitoring, modification, and mobility. With virtualization, users can create, copy, share, migrate, and roll back Virtual Machines (VMs), which allows a variety of applications to run. The Virtual Machine Monitor (VMM) is responsible for VM isolation (VM security).

PaaS: It abstracts the infrastructure and supports a set of application program interfaces for cloud applications. The CSP is responsible for providing a risk-free and robust environment for software product development. Popular examples of PaaS are Google App Engine, Acquia.com, and Force.com. Security issues: PaaS depends on a secure and reliable network and a secure web browser. PaaS security comprises two types: (1) security of the PaaS platform itself and (2) security of customer applications deployed on a PaaS platform.

DBaaS: It has been gaining popularity recently and shifts data management responsibilities to the provider. It offers a shared, self-service model, the elasticity to scale database resources out and back, and charge-back based on database usage. The data-related operational burdens it takes over include upgrades, provisioning, failover, management, configuration, seamless scaling, performance tuning, backup, privacy, and access control. In this service model, full access to a complex database can be achieved through very simple service calls. Providers include Amazon RDS and Microsoft SQL Azure.

Types of cloud / Cloud Deployment models
The following four types of cloud deployment models are classified based on the scope of accessibility and spatial location.

Public cloud (Off-site infrastructure over the Internet):
It is open to end users (the general public) or a cluster of organizations with Internet access. The system is hosted, managed, and owned by an organization selling cloud services [11]. It offers high efficiency and shared resources at low cost. The analytics services, data management, and QoS (e.g., privacy, security, and availability) are handled by the provider. Organizations can leverage these clouds to carry out analytics at reduced cost or to share insights from public analytics results. The Cloud Security Alliance [8] has summarized five essential characteristics.

On-demand self-service
The CSP provides computing capabilities to the consumer automatically, as and when needed.

Broad Network access
Services can be accessed through standard mechanisms by heterogeneous thin or thick client platforms like mobile phones, tablets, laptops, and workstations.

Resource Pooling
Computing resources (different physical and virtual resources) are pooled to serve multiple consumers using a multi-tenant model. Examples of resources include storage, processing, memory, network bandwidth, and virtual machines.

Rapid Elasticity
Capabilities can be elastically provisioned and released automatically, to scale rapidly outward and inward based on demand.

Measured Service
The service purchased by customers can be quantified and measured. Resource usage will be monitored, controlled, metered and reported.
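
As an illustration of rapid elasticity and measured service, the following minimal Python sketch sizes a pool of instances to the observed demand in each interval and meters the resulting usage. The function names, the per-instance capacity of 100 requests per second, and the instance bounds are assumptions made for this example; they do not correspond to any provider's API.

```python
import math


def desired_instances(demand, per_instance_capacity,
                      min_instances=1, max_instances=10):
    """Return the number of instances needed to serve `demand`.

    The result is clamped to [min_instances, max_instances]
    (hypothetical bounds chosen for illustration).
    """
    needed = math.ceil(demand / per_instance_capacity)
    return max(min_instances, min(max_instances, needed))


def meter(demand_per_interval, per_instance_capacity):
    """Scale outward/inward per interval and meter the usage.

    Returns the allocation for each interval and the total number of
    instance-intervals, which is the quantity a provider could bill.
    """
    allocation = [desired_instances(d, per_instance_capacity)
                  for d in demand_per_interval]
    return allocation, sum(allocation)


# Demand (requests per second) observed in four consecutive intervals.
allocation, billed_units = meter([50, 220, 480, 90],
                                 per_instance_capacity=100)
print(allocation)    # [1, 3, 5, 1]
print(billed_units)  # 10
```

The pool grows from one to five instances as demand rises and shrinks back afterwards, and the consumer is charged only for the ten instance-intervals actually used.
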
Confidentiality means that only authorized parties have access to protected data. Customers' data and computational tasks are to be kept confidential from both the cloud provider and other customers [3]. Data integrity means that data can be modified only by authorized parties or in authorized ways. It protects data from unauthorized deletion, modification, or manipulation; that is, any loss, alteration, or compromise of data should be detected. Data should also be stored honestly on cloud servers, and any violations must be detected. Potential security threats and mitigation strategies in cloud security are listed in Table 2, and data security solutions are explained in Table 3. It is a fundamental service enabled within the cloud paradigm. Fully Homomorphic Encryption (FHE), Searchable Encryption

Building blocks for secure systems
In security, the basic building blocks for secure systems are confidentiality, integrity, and availability. CC security is an emerging area of computer security that refers to the set of policies, controls, and encryption primitives used to protect online data, system applications, and infrastructure for cloud computing. Security issues in CC include application-level security, network-level security, information security, and data privacy. Cloud security has to implement many different types of controls, such as deterrent, preventive, detective, and corrective controls, to safeguard its security architecture. Cloud computing comes with numerous possibilities and challenges simultaneously, and security is considered a critical barrier on its path to success. Both security and privacy are concerns in CC because of the nature of this computing approach. The security issues for CC are not limited to technical and direct security breaches; a number of social inconsistencies might also result even without any "hard breach" having taken place. The security issues in CC are therefore sensitive and crucial from both sociological and technological viewpoints: a technological inconsistency that results in a security breach in CC might lead to significant sociological impacts. Epistemological factors are also to be considered, as they carry equal importance in security issues. Table 4 details the nature of each issue and the reason behind it.
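
To make the integrity building block concrete, the sketch below shows one common way a data owner could detect unauthorized modification of a record held by a cloud server: a keyed hash (HMAC-SHA256) is computed before upload and checked after download. This is an illustrative detective control and not a scheme taken from the surveyed works; the record contents and function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the data with the owner's key."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, data: bytes, tag: bytes) -> bool:
    """Return True only if the data still matches the tag."""
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(make_tag(key, data), tag)


# The key stays with the data owner and is never uploaded.
key = secrets.token_bytes(32)

# Hypothetical health record stored in the cloud together with its tag.
record = b"patient-id=123;blood-type=O+"
tag = make_tag(key, record)

# Later, the owner downloads the record and checks it.
stored_tampered = b"patient-id=123;blood-type=A+"
print(verify(key, record, tag))           # True
print(verify(key, stored_tampered, tag))  # False
```

Because the key never leaves the owner, neither the provider nor another tenant can alter the record and produce a matching tag. The sketch addresses integrity only; confidentiality would additionally require encryption.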

Methods supported by Cloud
CC has leveraged a collection of existing techniques, such as Data Center Networking (DCN), virtualization, distributed storage, MapReduce, and Web applications and services.
1. Modern data center:
2. Virtualization:
3. MapReduce: It is a programming framework that supports distributed computing on massive data sets. A large data set is broken into small blocks that are distributed to cloud servers for parallel computing [12], [11]. It speeds up batch processing on massive data sets.
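
The MapReduce flow described above can be sketched with the classic word-count task. The following single-process Python example is only an illustration of the programming model: the blocks are processed sequentially here, whereas a real framework would send each block to a different cloud server, and all function names are chosen for this example.

```python
from collections import defaultdict


def map_phase(block):
    """Map: emit a (word, 1) pair for every word in one block."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(blocks):
    pairs = []
    for block in blocks:  # on a cluster, each block goes to its own server
        pairs.extend(map_phase(block))
    grouped = shuffle(pairs)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# A "large" data set already broken into small blocks.
blocks = ["cloud security big data", "big data on cloud", "cloud"]
counts = word_count(blocks)
print(counts)
# {'cloud': 3, 'security': 1, 'big': 2, 'data': 2, 'on': 1}
```

Since each call to `map_phase` depends only on its own block, the map step can run on many servers at once, which is where the speed-up for batch processing comes from.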

Table 4: Type of security / threats and its causes

| Type of Security / Threats | Reason behind Security / Threat |
|---|---|
| Confidentiality, Availability and Integrity [13] | Generalized categories where security concerns falls in cloud environment |
| Security threats in cloud environment [14] | Database, virtual servers, network to O.S, Load Balancing, Memory Management, Concurrency Control |
| Threats for a cloud infrastructure [13] | Applicable to both data and infrastructure |
| Potential & unavoidable sec. threats in cloud users [15] | Data segregation and session hijacking |
| Level of abstraction & dynamism in scalability [15] | Results in poorly defined security or infrastructural boundary |
| Privacy and its underlying concept [15] | May lead to security breach - cloud services in specific contexts |
| Data loss and various botnets [16], [17] | It breach security of cloud servers |
| Security in the data-centers of cloud providers [18] | Lead to security issues as a single physical server would hold many client's data |
| Storage security at the CSPs data centers [19] | Directly linked with the security of the cloud services |
| Traditional security risks [13] | Added degree of potency in cloud infrastructure - a quite challenge in success of CC |
| Data Location [20] | Crucial factor in CC security |
| Location Transparency [20] | It is security threat and prominent flexibility in CC |
| Cloud Users' personal Data Security (DS) [21], [22] | Crucial concern in CC |
| Customers' personal or business DS [23] | Strategic policies of the cloud providers are of highest significance as the technical security solely not adequate |
| Trust (Trust establishment) [24] | It is not a technical issue but security concern directly related to credibility and authenticity of CSP. Some of the influential soft factors like automation management, human factors, processes and policies are driven by security issues. |
| All kinds of attacks - applies to cloud based services [25] | Man-in-the-middle attack, phishing, eavesdropping, sniffing and other similar attacks. It will define the integrity and level of security of a cloud environment. |
| Accounting & Authentication [27] | Part of security concerns in CC |
| Using specific type of O.S [17] | It may pose security threat or security risk |
| Allocation of responsibilities among the parties involved in CC infrastructure [17] | Result in experiencing inconsistency lead to security vulnerabilities |
| Insider-attack in network scenario [17] | Valid threat for CC |
| Security tools or other kinds of software used in cloud environment [28], [29] | Lead to security loopholes and pose security risks to cloud infrastructure itself |
| API and Spammers - with third party vendors [28] | Threats to cloud environment |
| Huge amount of data transfer [30] | Adapted communication technology becomes a security concern |
| Broadcast nature of some communication technology [31] | Core concern of security issue |
| Physical and virtual resources in cloud environment [32] | Pose different level of security issues - having no sophisticated authentication mechanism to address security threat |
| Virtualized resources [33] | Intrusion related security concerns |
| Cloud portability [34] | May bring severe degree of API based security threats |
| Using cloud products or services [35] | Lead to security concerns for the consumers |

3.1 Health Care (HC) related Cloud Services
The following cloud services relate specifically to the health care domain.

Cloud-Assisted Big Data Models
It provides efficient and highly accurate solutions to complex big data problems. Some of the available models are Systems (HIS), CDSS, and Medical Body Area Networks (MBANs). They offer cloud computing infrastructures for designing and developing BD analytics [37]. Microsoft Health Vault, Dossia, and Mphrx are some public health management systems based on CC.

Data security in CC -HC applications
The main barrier to the adoption of CC in health care relates to data security [40]. The data security risks in the use of IT include hacker attacks, network breaks, natural disasters, public management interfaces, poor encryption key management, and privilege abuse [38]. However, current cloud providers are better equipped to provide security than on-premises deployments [39].

Task of Cloud Computing for Big Data
The CC and BD paradigms have emerged to address data-oriented challenges, and the two are conjoined. BD provides the ability to use commodity computing to process distributed queries across multiple datasets. CC provides the underlying engine through the use of Hadoop, a class of distributed data-processing platforms. BD utilizes distributed storage technology based on cloud computing rather than local storage attached to a computer or electronic device. Big data evaluation is driven by fast-growing cloud-based applications developed using virtualized technologies. Therefore, CC not only provides facilities for the computation and processing of BD but also serves as a service model. Table 5 shows the alliance of big data cloud providers. MapReduce [12] is a good example of big data processing in a cloud environment and a popular CC framework. It allows large datasets stored in the cluster to be processed in parallel, and it is the preferred computation model of cloud providers [41]. The paper [42] cites many research issues that can be resolved by using BD in a CC environment.

Conclusion
Security issues could severely affect cloud infrastructures, and security is a constraint that cannot be compromised. Robust, consistent, and integrated security models for BD on cloud computing could be the right direction for ongoing cloud research. Research on "robust security models" for CC scenarios (especially those that meet the security needs of health care applications) is the highest-priority factor for successful cloud-based infrastructure development and deployment. Because the cloud is able to provide at-rest analytics (i.e., retrospective analysis) for stored data, this domain offers a large area of exploration to researchers who are interested in health care [43] [44].