Smart cities with big data: Reference models, challenges, and considerations

Cities worldwide are attempting to transform themselves into smart cities. Recent cases and studies show that a key factor in this transformation is the use of urban big data from stakeholders and physical objects in cities. However, the knowledge and framework for data use for smart cities remain relatively unknown. This paper reports findings from an analysis of various use cases of big data in cities worldwide and the authors' four projects with government organizations toward developing smart cities. Specifically, this paper classifies the urban data use cases into four reference models and identifies six challenges in transforming data into information for smart cities. Furthermore, building upon the relevant literature, this paper proposes five considerations for addressing the challenges in implementing the reference models in real-world applications. The reference models, challenges, and considerations collectively form a framework for data use for smart cities. This paper will contribute to urban planning and policy development in the modern data-rich economy.
[...] value smart system technologies to develop state-of-the-art healthcare systems. The conceptual framework of a big data-enabled smart healthcare system was constructed to aid in the theoretical representation of intra- and inter-organizational business models in the healthcare context. Recommendations for the effective application of the models to the healthcare industry were proposed.
[...] how IoT-enabled sensor-based big data applications can affect the environmental sustainability, as well as related data processing platforms and computing models, of future smart sustainable cities. An analytical framework for IoT-enabled data-centric applications, which can improve the environmental sustainability of smart sustainable cities, was developed. The key challenges of using IoT and big data analytics were identified, and some associated open issues were discussed.


Introduction
A smart city is composed of and monitored by pervasive information and communication technology (ICT) (Neirotti, De Marco, Cagliano, Mangano, & Scorrano, 2014). "In the last two decades, the concept of smart city has become more and more popular in scientific literature and international policies" (Albino, Berardi, & Dangelico, 2015). This popularity could be attributed to the traction that the smart city concept has gained as a vision for improving the economy, mobility, environment, people, living standards, and governance of cities (Abella, Ortiz-de-Urbina-Criado, & De-Pablos-Heredero, 2017; Angelidou, 2015; Caragliu, Del Bo, & Nijkamp, 2011; Vanolo, 2014). IBM completed more than 100 smart city projects worldwide in 2010-2017, with themes including administration, citizen engagement, economic development, education and workforce, the environment, public safety, social services, transportation, and urban planning (IBM Smarter Cities Challenge, 2017).
The recent proliferation of big data has contributed to smart city transformation (Barns, 2016; Bibri, 2018b; Hashem et al., 2016; Kitchin, 2014; Rabari & Storper, 2015). "Big data" generally refers to large and complex sets of data that represent digital traces of human activities and may be defined in terms of scale or volume, analysis methods (Chen, Chiang, & Storey, 2012), or effect on organizations (McAfee & Brynjolfsson, 2012). Cities around the world collect massive quantities of data related to urban living from objects (e.g., energy infrastructure) and stakeholders (e.g., energy-using residents). Use of these data contributes to the creation of useful content for various stakeholders, including citizens, visitors, local government, and companies. For instance, the Seoul government collects data related to public health, transportation, and residence and has made them available for data scientists to produce meaningful knowledge for the city and its citizens. As a result, the Seoul government identified the patterns and demands of the usage of the city bus at midnight and subsequently improved midnight public bus services (NIA, 2013). Similarly, the San Francisco government analyzed crime records to improve public security services (Lee, 2013), and the Rio de Janeiro government used data from cameras and sensors to address various concerns of the city such as weather, energy, and safety (Kitchin, 2014). Other cases include Santander in Spain (Díaz-Díaz, Muñoz, & Pérez-González, 2017) and Cosenza in Italy (Cicirelli, Guerrieri, Spezzano, & Vinci, 2017).
Despite the emergence of such cases, the understanding of data use for smart cities remains limited in the literature. Several studies have investigated the use of big data in smart cities and identified relevant issues in specific areas of application, such as transportation, public safety, and sustainability (e.g., Ang & Seng, 2016; Clarke & Steele, 2011; Díaz-Díaz et al., 2017; Perera, Zaslavsky, Christen, & Georgakopoulos, 2014; Steenbruggen, Tranos, & Nijkamp, 2015; Su, Li, & Fu, 2011). However, few studies have emphasized the generic knowledge of using big data for smart cities independent of the application area (Hashem et al., 2016; Mora & Bolici, 2017). In particular, research offering design knowledge for actually realizing the benefits of big data in smart cities remains limited, though such work is key for developing real-world applications for smart cities in this data-rich economy. How can we create value using big data from various sources in cities? Knowledge useful for this task has not been developed, nor is it deeply understood.
This work attempts to fill the research gap by developing reference models from existing cases as well as by identifying challenges and considerations from studying government projects (Fig. 1). In this paper, we first classify various use cases of big data in cities worldwide into four categories by utilizing a 2 × 2 classification matrix, showing the big picture of data use in smart cities. Each category in the matrix suggests a reference model for data use for smart cities independent of application. Second, we empirically identify six challenges in data use for smart cities based on real-world lessons from our action research (Avison, Lau, Myers, & Nielsen, 1999) on four projects with government organizations. The four projects are highly relevant to the four reference models. Finally, we propose five items that should be considered in a smart city project with big data; these items were identified by integrating the previous findings and the relevant literature (e.g., Al Nuaimi, Al Neyadi, Mohamed, & Al-Jaroodi, 2015; Bibri, 2018b; Hashem et al., 2016). Fig. 1 summarizes the context and contribution of our research. Based on an analysis of existing cases and our empirical studies, this work addresses the need for research on value creation with big data from various sources in smart cities. From the viewpoint of theoretical contribution, this work develops an integrated framework of models, challenges, and considerations of using big data for smart cities. Although a few studies provide such knowledge independently (e.g., Bibri, 2018b; Díaz-Díaz et al., 2017; Hashem et al., 2016), there is no empirical work that connects dispersed knowledge in an integrated framework, despite the fact that a data use project for smart cities requires all such knowledge. The findings of our work are grounded in and meaningfully integrate existing work from both within and outside the smart cities literature.
From a practical contribution viewpoint, this work provides actionable information for smart cities based on learning from R&D projects with government organizations. The current academic debate, as exemplified in recent studies (e.g., Bertot & Choi, 2013; Cao, Giyyarpuram, Farahbakhsh, & Crespi, 2017; Rathore, Ahmad, Paul, & Rho, 2016), aims to develop actionable information pertaining to the use of big data for developing smart cities. Our study therefore provides, to the best of our knowledge, the very first action research on data use for smart cities to empirically identify real-world implications. Our findings can contribute to the development of smart cities in this data-rich economy and provide a basis for the use of big data from an urban planning and policy development perspective.
In the succeeding parts of this paper, we first review literature related to smart cities; second, we describe the research method; third, we discuss the four reference models, identify the six challenges, and propose the five considerations; fourth, we discuss implications of our findings to policy development and ICTs for the application of our findings; and finally, we conclude this paper by suggesting issues deserving future research.

Diverse aspects of smart cities
Researchers have studied diverse aspects of smart cities. Fig. 2 shows the top 100 words representing the smart city literature based on a total of 2856 articles, in which the degree of representation is measured with the Latent Dirichlet Allocation (LDA) method, a widely used and powerful technique for understanding topics of a corpus (Blei, Ng, & Jordan, 2003). The word cloud was created using the text of titles, abstracts, and keywords of journal articles, reviews, and conference papers on smart cities in 1991-2017. As of April 27, 2017, these data are the full population (i.e., not a sample) of the smart city literature identified from the Web of Science Core Collection databases of the Science Citation Index Expanded (SCIE, 1945-), the Conference Proceedings Citation Index - Science (1990-), the Social Sciences Citation Index (SSCI, 1987-), the Arts & Humanities Citation Index (1987), and the Emerging Sources Citation Index (2015-) using two queries: {TOPIC: ("smart cities")} and {TOPIC: ("smart city")}. Data sources included Cities, Societies, Sustainability, Energy, IEEE Transactions on Intelligent Transportation Systems, IEEE Internet of Things Journal, IEEE Access, and Sensors. Cities published 22 smart-city-related articles, ranking first in the SSCI database in terms of the number of publications. Sensors, in the SCIE database, was determined to be the journal that has published the largest number of smart-city-related articles, 73.
[Fig. 1. Context and contribution of the research. Recoverable panel text: big data from various sources in cities; "How can we fill this gap with some design knowledge?"; four action research projects with government and analysis of urban data use cases, building upon the relevant literature; reference models, associated challenges, and considerations; value creation from big data use in smart cities.]

C. Lim et al. Cities xxx (xxxx) xxx-xxx

The font size of a word in Fig. 2 is proportional to its degree of "topic representation" in the word distribution in the corpus, which was measured using the LDA method (Blei et al., 2003) for a single topic of the entire corpus (in this case, "smart city"). The score represents the probability of the word in describing the topic; thus, a high value indicates that the word more probably represents the topic. This visualization method is more scientific than the use of simple frequency to describe the keywords and main themes of a corpus. Before analysis, we cleaned the data by deleting a datum if the abstract or title information was missing or duplicated, eliminating stop words (e.g., "it," "for") and non-alphabetics, converting the text to lowercase (e.g., from "Smart" to "smart"), lemmatizing all words (e.g., from "processes" to "process"), and applying other customized rules (e.g., do not lemmatize "glasses" to "glass"). We also did not include some words with high LDA scores in Fig. 2, such as "paper," "proposed," and "existing," because their high scores are not attributable to their relevance to the topic but to the high frequency of their appearance in scientific documents.

Fig. 2 indicates that smart city keywords include "data," "system," "service," "network," "urban," "technology," "sensor," "environment," "citizen," "public," "social," "mobility," "sustainable," "life," "open," "knowledge," "policy," "integration," "decision," and "local." In addition to creating Fig. 2 (quantitative analysis), a detailed qualitative review of the literature finds that the smart city concept involves diverse aspects (Angelidou, 2014) such as urban services (Belanche, Casaló, & Orús, 2016), ICT, sustainable economic growth, high quality of life (Caragliu et al., 2011), high-tech intensiveness, connection (Bakıcı, Almirall, & Wareham, 2013), intelligence, integration (Barrionuevo, Berrone, & Ricart, 2012), aware citizens (Giffinger et al., 2007), job growth (Eger, 2009), preventive maintenance, security monitoring (Harrison et al., 2010), learning, creativity (Komninos, 2011), knowledge intensiveness, high productivity (Kourtit, Nijkamp, & Arribas, 2012), smart economy, smart people, smart governance, smart mobility, smart environment, and smart living (Lombardi, Giordano, Farouh, & Yousef, 2012).
These are just a few examples of the multiple aspects of the smart city concept that indicate the transdisciplinary nature of smart city research. The smart city concept lies in an intersection of city administration, citizen value creation, local business, ICT development and application, urban big data, economics, and sociology, among others.
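The data-cleaning steps described for Fig. 2 (dropping records with missing or duplicated title or abstract information, removing stop words and non-alphabetic characters, lowercasing, and lemmatizing with custom exceptions) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the stop-word list, the lemma table, the `Record` type, and the order in which the steps are applied are all hypothetical, and a real analysis would rely on full linguistic resources.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately tiny resources (assumptions for illustration only).
STOP_WORDS = {"it", "for", "the", "a", "of", "and", "in", "to", "is", "are"}
LEMMA_RULES = {"processes": "process", "cities": "city", "sensors": "sensor"}
KEEP_AS_IS = {"glasses"}  # custom rule from the text: do not lemmatize "glasses"


@dataclass(frozen=True)
class Record:
    """One bibliographic record (hypothetical structure)."""
    title: str
    abstract: str


def drop_missing_and_duplicates(records):
    """Remove records with an empty title or abstract, and repeated records."""
    seen = set()
    kept = []
    for rec in records:
        if not rec.title.strip() or not rec.abstract.strip():
            continue  # missing title or abstract information
        key = (rec.title.strip().lower(), rec.abstract.strip().lower())
        if key in seen:
            continue  # duplicated record
        seen.add(key)
        kept.append(rec)
    return kept


def lemmatize(token):
    """Table-based lemmatization that honors the custom exceptions."""
    if token in KEEP_AS_IS:
        return token
    return LEMMA_RULES.get(token, token)


def clean_text(text):
    """Lowercase, keep alphabetic runs only, drop stop words, lemmatize."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [lemmatize(t) for t in tokens if t not in STOP_WORDS]


def clean_corpus(records):
    """Return one token list per surviving record."""
    return [
        clean_text(f"{rec.title} {rec.abstract}")
        for rec in drop_missing_and_duplicates(records)
    ]
```

The resulting token lists are the kind of input a topic model such as LDA would consume; the modeling step itself is not shown here.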
The smart city concept also involves various application areas. Lim and Maglio (2018) identified 12 application areas related to smart cities from a textual analysis of 1234 news articles; these are "smart device," "smart environment," "smart home," "smart energy," "smart building," "smart transportation," "smart logistics," "smart farming," "smart security," "smart health," "smart hospitality," and "smart education." These areas form a hierarchical structure of smart cities (Fig. 3). In smart cities, local resources, government, companies, citizens, and visitors are connected by smart devices and smart environments, key resources that facilitate the collection of data from the resources and stakeholders and the delivery of various smart services to the stakeholders, such as smart energy, transportation, and health services. The stakeholders interact with each other and co-create value through the services. Smart cities incorporate all these elements at the top of the hierarchy.

Development of smart city applications through the use of ICT and big data
Among the diverse aspects of smart cities, the ICT aspect is key for the development of smart city applications. Researchers have investigated ICT-driven initiatives and approaches toward developing a smart city (e.g., Albino et al., 2015; Angelidou, 2015; Bibri & Krogstie, 2017c; Stratigea, Papadopoulou, & Panagiotopoulou, 2015). Table 1 summarizes some of the studies on the ICT-rich nature of smart cities. As shown, existing studies have discussed the utility of various ICTs for the development of smart city applications, such as IoT (Alam, Mehmood, Katib, Albogami, & Albeshri, 2017; Cicirelli et al., 2017), smartcards (Belanche-Gracia et al., 2015), sensors (Ang & Seng, 2016), big data analytics (Bibri & Krogstie, 2017c), security management systems (Li & Shahidehpour, 2017), geographic information systems (Aina, 2017), and virtual reality (Jamei et al., 2017). According to a textual analysis of 5378 papers, ICTs for smart cities can be categorized into four technology factors, namely, "4Cs": Connection between things and people, Collection of data for context awareness, Computation in the cloud, and Communication by wireless means (Lim & Maglio, 2018).
Although all these technologies should be considered in a smart city application development project, the framework for ICT-enabled planning of smart cities (Stratigea et al., 2015) shows the utility of the data-driven approach for smart cities. Other recent studies (e.g., Al Nuaimi et al., 2015; Batty, 2013; Bertot & Choi, 2013; Cao et al., 2017; Hashem et al., 2016; Perera et al., 2014; Rathore et al., 2016; Vilajosana et al., 2013) also have shown the strong potential of urban big data in planning and policy development for smart cities. Table 2 summarizes some of the studies that have focused on using big data for smart cities. For example, Hashem et al. (2016) investigated the role of big data in smart cities, and Al Nuaimi et al. (2015) reviewed applications of big data to smart cities. Abella et al. (2017) developed a model for the analysis of data-driven innovation and value generation in smart cities. Bibri and Krogstie (2017c) studied big data and context-aware augmented typologies and design concepts for smart cities. Cao et al. (2017) developed a model for trustworthy data sharing in smart cities, and Ben Sta (2017) discussed data quality in smart cities. Bibri (2018b) developed an analytical framework for sensor-based big data applications for environmental sustainability. All these studies can be used as a basis for the development of smart city applications through the use of ICT and big data since they provide relevant knowledge, such as application examples (Chen et al., 2017; Cicirelli et al., 2017), benefits and opportunities (Al Nuaimi et al., 2015; Solanas et al., 2014), challenges and requirements (Batty et al., 2012; Hashem et al., 2016), and models and frameworks (Bibri, 2018b; Díaz-Díaz et al., 2017; Pramanik et al., 2017).
However, none of these studies provides an integrated framework that connects dispersed knowledge such as reference models, associated challenges, and considerations for urban-data-based value creation for city stakeholders. Such a framework would be practical for planning and evaluating projects for smart city application development since the models, challenges, and considerations should be used at the same time. Moreover, most of these studies are conceptual works; only a few provide empirically identified and tested knowledge. There is a surprising lack of studies providing practical knowledge from real projects to support a data-based smart city transformation, despite its significance in future project development for smart cities with big data. Our work aims to fill the research gap, building upon the current body of literature: First, we classify various use cases of big data in smart cities into four reference models. Second, from our four smart-city-related projects with government, we identify challenges and considerations in using big data for smart cities. Finally, we integrate these empirically identified knowledge pieces and the relevant literature into a single framework for future research and development projects on smart cities with big data.

Research method
In this work, we employed two methods, use case analysis and action research, to develop design knowledge of big data use for smart cities independent of application. For a smart city application design project, project planners and participants should be aware of reference models to which they can refer for the project, as well as challenges and considerations related to the success of the project. In our experience, we found that they should possess all such knowledge. The case analysis method is useful for identifying the reference models, whereas the practice-driven action research method is useful for empirically understanding the challenges and considerations. The two methods complemented each other to develop an integrated framework of reference models, challenges, and considerations. We were able to tightly integrate our findings as our four action research projects are highly relevant to the four reference models.
First, we used a multi-case analysis to develop reference models of the data use applications. A multi-case analysis is an effective approach for determining the general mechanisms of complex phenomena or systems (Eisenhardt, 1989; Maglio & Lim, 2016). Through this approach, researchers can gain improved understanding of theoretical constructs of new phenomena or systems in question (Ketokivi & Choi, 2014; Koskela-Huotari, Edvardsson, Jonas, Sörhammar, & Witell, 2016). We collected various cases of big data use in smart cities from journal articles, books, technical reports, news articles, and blogs (e.g., Angelidou, 2014; Kitchin, 2014; Perera et al., 2014; Purohit & Bothale, 2011) and organized the case information into a database that included data used, data sources, information derived from data, information beneficiaries, and information delivery channel for each case. We then conducted a cross-case analysis to achieve a sense of generality (Ketokivi & Choi, 2014) by categorizing the cases to identify similarities and differences. We repeated the categorization process several times, finally arriving at the 2 × 2 classification matrix (Section 4) summarizing big data use in smart cities.
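The case database and the cross-case categorization described above can be illustrated with a small Python sketch. It assumes one record per case with the five fields listed in the text, plus two coded attributes (main data source and direct beneficiary) that correspond to the axes of the 2 × 2 matrix introduced in Section 4. The class names, field names, and the idea of coding each axis as a two-valued enum are assumptions made for illustration; the authors' actual database schema and coding procedure are not specified in the text.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Party(Enum):
    """The two levels used on both axes of the 2 x 2 matrix."""
    SERVICE_PROVIDER = "local service provider"  # local government and companies
    INDIVIDUAL = "individual customer"


@dataclass(frozen=True)
class UseCase:
    """One row of a hypothetical case database."""
    name: str
    data_used: str
    data_sources: str          # free-text description of where the data come from
    main_source: Party         # coded x-axis value (assumed coding step)
    information_derived: str
    beneficiary: Party         # coded y-axis value (assumed coding step)
    delivery_channel: str


def classify(cases):
    """Group case names by (main source, direct beneficiary) cell."""
    matrix = defaultdict(list)
    for case in cases:
        matrix[(case.main_source, case.beneficiary)].append(case.name)
    return dict(matrix)
```

Calling `classify` on a list of coded cases returns at most four cells, one per combination of source and beneficiary. Repeating the coding and regrouping, as the text describes, would amount to revising the `main_source` and `beneficiary` values and calling `classify` again.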
Second, we used the action research method to identify challenges and considerations of the use of big data for smart cities. Action research is "an orientation to knowledge creation that arises in a context of practice and requires researchers to work with practitioners" (Huang, 2010). In this work, "action" refers to the actions involved in the data use process for smart city applications; these actions include data collection, data analytics, information creation, and information delivery design. This particular research method "is unique in the way it associates research and practice, so research informs practice and practice informs research synergistically" and has contributed to the development of empirical insights in operations management and information systems (Avison et al., 1999; Coughlan & Coghlan, 2002). This research method was appropriate for achieving our objective because (1) action research is concerned with bringing about change in organizations (Shani & Pasmore, 1985), and our study is concerned with change (i.e., city transformation) with urban data; (2) action research aims at developing holistic understanding (Coughlan & Coghlan, 2002), and our study aims to understand data use for smart cities broadly; and (3) foremost, the research topic of using urban data to transform cities originally emerged from rapidly evolving practice, and our study aims to scrutinize and help improve practice by offering specific actionable knowledge to practitioners. We conducted action research on multiple projects with government organizations to design public services wherein the use of urban big data greatly contributes to value creation. The findings of this paper are based mainly on the four projects in Table 3, guided by the following research question: "What are the challenges and considerations in the use of big data to develop applications for smart cities?"
In Projects 1 and 2, we designed services for smart transportation in cities (Kim, Lim, Lee, Kim, Park, & Choi, 2018; Lim, Kim, Heo, & Kim, 2018a), and in Projects 3 and 4, we designed services for healthy cities. These services are highly relevant to smart cities (Lim, Kim, Kim, Kim, & Maglio, 2018b; Lim, Kim, Kim, Kim, & Maglio, 2018c). The motivation, approach, and outcomes of each project can be utilized as references for developing similar initiatives for future smart city projects.
Table 1. Studies on the ICT-rich nature of smart cities.

Source | Description
Paroutis, Bennett, and Heracleous (2014) | Studied how ICT organizations can incorporate smart city technology into the strategic options of recession environments, in which the case study on IBM Smarter Cities was used as a basis. This study focused on the stakeholder and actor perspective instead of highlighting the particular role of cities.
Chen, Ardila-Gomez, and Frame (2017) | Studied how intelligent transportation systems can contribute to the energy saving of smart cities. This study identified the four main steps of smart mobility solutions as a means to achieve energy savings, and then discussed the institutional, technical, and physical conditions required by each step.
Aina (2017) | Studied how geospatial information and communication technologies (GeoICT) are leveraged by smart cities on the basis of the Saudi Arabian experience. This study identified the policy implications for Saudi Arabia and the lessons in using GeoICTs in smart city development.
Urbieta, González-Beltrán, Ben Mokhtar, Anwar Hossain, and Capra (2017) | Studied the modeling of IoT-based services for smart cities, which can allow mobile users to dynamically perform daily tasks. This study proposed an adaptive service composition framework supporting the dynamic reasoning of user tasks and service behavior.
Cicirelli et al. (2017) | Developed an IoT-based platform for a generic cyber-physical system to facilitate the design and implementation of smart city services and applications. This study adopted the design and implementation platforms of a real smart street in the city of Cosenza.
Díaz-Díaz et al. (2017) | Studied the business models of IoT and other technologies in smart cities by focusing on eight urban services provided in the city of Santander. This study established how IoT can contribute to reduced costs and energy consumption, how data can be managed, and how citizens can be incentivized in the services.
Afzalan, Sanchez, and Evans-Cowley (2017) | Studied the factors that affect how city and planning organizations select and decide online tools for public engagement in aid of smart city development. The result of the study showed that planning organizations should choose a participation tool on the basis of the following: capacities of organizations, characteristics of communities who will use the tool, user-community norms and rules, and the tool's capabilities.
Trilles et al. (2017) | Studied how to embed open-sensorized platforms into smart city hardware and software. A network of sensorized platforms was embedded inside a university campus to monitor the environmental phenomena.
Li and Shahidehpour (2017) | Developed a framework that can act as a general firewall and function interactively with several critical infrastructures to protect smart city operations from cyber threats. This study focused on smart city traffic management in various conditions.
Belanche-Gracia, Casaló-Ariño, and Pérez-Rueda (2015) | Studied the determinants of a successful multi-service smartcard application in aid of smart city development. The study findings showed that privacy and security were regarded as the key drivers of citizen intent to continue using smartcards. The empirical study focused on smartcard use in Zaragoza.
Walravens (2015) | Studied quantitatively and qualitatively the trends and challenges of the app economies and mobile service landscapes of different cities and regions in Brussels. Policy recommendations were derived from city-level mobile service ecosystems.
Bibri and Krogstie (2017a) | Studied the nature and practice of ICTs and their impact on the new types of urban sustainability computing in smart sustainable cities. This study identified the applications and services needed by smart sustainable cities, related urban domains and systems, and enabling ICTs for the new types of information-flow computing.
Batty et al. (2012) | Defined the elements that constitute a smart city, which they defined as a city in which ICT is merged with traditional infrastructures and one that is coordinated and integrated using new digital technologies. Research challenges and smart city scenarios were identified, and project areas for smart cities were proposed.
Bibri and Krogstie (2017b) | Conducted comprehensive research on smart and sustainable cities by focusing on their underlying foundations and assumptions, state-of-the-art research and development, research opportunities and horizons, emerging scientific and technological trends, and future planning practices. Big data analytics and context-aware computing were identified as the disruptive technologies required in the design, development, and deployment of data-centric and smart applications of smart sustainable cities.
Khan, Pervez, and Abbasi (2017) | Identified a comprehensive list of stakeholders and modeled their involvement in smart cities using the onion model approach. The framework of the stakeholder involvement model was used to establish the end-to-end security and privacy features of trustable data acquisition, transmission, processing, and legitimate service provisioning.
Jamei, Mortimer, Seyedmahmoudian, Horan, and Stojcevski (2017) | Studied the capacity of virtual reality (VR) to address the current challenges of creating, modeling, and visualizing smart cities through material modeling and light simulation in VR environments. This study focused on three aspects, namely, pedestrian thermal comfort visualization, smart transportation visualization, and cognitive behavior of urban city dwellers.
Bibri (2018a) | Reviewed the fundamental theories and academic disciplines of smart sustainable cities, a scientific area considered complex and dense. This study proposed the core dimensions of a foundational framework for smart sustainable city development. This in-depth interdisciplinary and transdisciplinary study highlighted the relevance of the development phase of smart sustainable cities.
Lim and Maglio (2018) | Explored the various service components of smart cities by data-mining 5378 papers and 1234 news articles. The findings identified four technology factors that are likely to affect smart cities: connection, collection, computation, and communications. The smart city applications were categorized into 12 areas, including smart energy, smart transportation, and smart health.

The four projects involved the analysis of real-world big data related to the smart city context, design of information content for customers, analysis of existing data-based public service cases in multiple cities, interviews with experts and practitioners who have extensive experience related to data-based public services in modern cities, and design and evaluation of new data-based public service concepts. These tasks were organized and conducted coherently to design attractive and workable data-based public services. We then identified challenges related to the transformation of data to information based on the four projects. Furthermore, we identified items that should be considered in data use for smart cities. After several iterations, we arrived at the six challenges in Section 5 and the five considerations in Section 6 by incorporating the relevant literature.
We now describe our findings in turn in Sections 4-6.

Reference models for data use in smart cities
The use of big data in smart cities is characterized by a diverse set of dimensions, including data, data collection method, and value created with data. Classifying existing cases of big data use in smart cities could help make sense of this diversity by identifying categories that share a number of similar attributes. Furthermore, it may suggest reference models for data-based smart city transformation. In a business context, data may come from companies (e.g., business transaction, human resource, and financial data) or customers (e.g., demographic, behavioral, and purchase history data). For example, data analysis may rely on business transaction data for business process management, or automobile manufacturers may rely on customers' driving data. Big data use in smart cities can also be classified from these same perspectives, as indicated by the smart-city-related literature that discusses data sources and beneficiaries (e.g., Barns, 2016;Batty, 2013;Kitchin, 2014;Vanolo, 2016). Fig. 4 shows our classification of big data use cases in smart cities into four categories. The two axes represent the direct beneficiaries of big data use (y-axis) and the main sources of big data in smart cities (xaxis), and each is split into two levels, namely, local service providers (i.e., local government and companies) and individual customers (i.e., Table 2 Studies on using big data for smart cities.

Source: Definition or description
Abella et al. (2017): Studied how big data from smart city portals can be used by private and public entities to create new services and deploy big data businesses. The study proposed the adoption of a value creation model for reusing smart city data.
Aguilera, Peña, Belmonte, and López-de-Ipiña (2017): Studied how smart cities can be achieved by combining available resources, such as government data and sensor networks deployed in cities, with city knowledge in the form of the citizens' active contribution of data by means of their smartphones. This study developed a platform to ease the generation of citizen-centric services by exploiting urban data in different domains.
Bibri and Krogstie (2017c): Studied the application potential of big data and context-aware technologies of smart and sustainable cities. Models of smart sustainable cities and their technologies, as well as their application potential, were initially identified. Subsequently, the models for the urban forms of these sustainable cities and their design concepts and typologies were explored.
Ruijer, Grimmelikhuijsen, and Meijer (2017): Developed the democratic activity model of open data use. Then, the model was represented by an exploratory qualitative multiple case study of three democratic processes. The findings suggested that a context-sensitive open data design can facilitate the transformation of raw data into meaningful information. The data explored in this study were constructed collectively by public administrators and citizens.
Cao et al. (2017): Studied how smart city data can be shared and managed in a trustworthy, transparent, and traceable manner. This study proposed the data usage control model to capture the diversity of obligations and constraints that data providers impose on the use of their data.
Ben Sta (2017): Developed a theoretical framework that is generic enough to allow for the representation, propagation, and combination of several types of imperfect information. Packets of information are normally collected from different smart city sources. An experimentation method was used to model imperfect healthcare data.
Fernández-Ares et al. (2017): Developed a novel mobility monitoring system that can track the movement of people and vehicles in radioelectric space. The system was designed to track WiFi and Bluetooth signals emitted by personal (smartphones) or on-board (hands-free) devices. This study illustrated the application potential of a system to handle four city-based traffic and mobility scenarios.
Bibri and Krogstie (2017d): Reviewed the core enabling technologies of big data analytics and context-aware computing of smart sustainable cities. This study was able to establish the basis for the development of conceptual frameworks that can be used or tested in future urban research.
Hashem et al. (2016): Studied the state-of-the-art communication technologies and smart applications of smart cities and subsequently proposed a structure for big data. The study focused on the challenges of combining IoT and big data and their effect on achieving the goal of future smart cities.
Ang and Seng (2016): Reviewed the networked sensor systems of representative urban environments, including those used for air pollution monitoring, assistive living, disaster management, and intelligent transportation. This study explored how value can be extracted from the big data of networked sensor systems.
Al Nuaimi et al. (2015): Reviewed how big data can support the sustainability of smart cities. The opportunities, challenges, and benefits of incorporating big data applications into smart cities were explored. This study also identified the implementation requirements of big data applications for smart city service provision.
Bertot and Choi (2013): Studied the policy implications of the use of big data to improve digital government services. Major themes, such as government-private-businesses sector interaction, access and dissemination, asset management, archiving and preservation, privacy, and security, were highlighted. This study, which focused on the U.S. policy context, offered recommendations to facilitate big data initiatives.
Solanas et al. (2014): Studied the new concept of smart health, a context-aware complementation of mobile health in smart cities. This study identified the main knowledge fields involved in the development of this concept and the corresponding challenges and opportunities, which may provide a common ground for researchers in their future work.
Rathore et al. (2016): Developed a four-tier architecture for smart city development and urban planning using IoT and data analytics. The bottom tier of the architecture is allocated for the IoT sources and data generation and collection. The intermediate tier is for sensor, relay, base station, and Internet communication, in which data management and processing are guided by the Hadoop framework. The top tier is for the application and usage of data analytics and similar methods.
Pramanik, Lau, Demirkan, and Azad (2017): Reviewed various big data and smart system technologies to develop state-of-the-art healthcare systems. The conceptual framework of a big data-enabled smart healthcare system was constructed to aid in the theoretical representation of intra- and inter-organizational business models in the healthcare context. Recommendations for the effective application of the models to the healthcare industry were proposed.
Bibri (2018b): Explored how IoT-enabled sensor-based big data applications can affect the environmental sustainability, as well as related data processing platforms and computing models, of future smart sustainable cities. An analytical framework for IoT-enabled data-centric applications, which can improve the environmental sustainability of smart sustainable cities, was developed. The key challenges of using IoT and big data analytics were identified, and some associated open issues were discussed.

Table 3
Projects related to the use of data in smart cities.

Projects: Description
Project 1: This project designed a driving safety enhancement service based on analyses of driving data of commercial vehicles (278 city buses, 46 taxis, and 931 trucks) and accident data of commercial vehicle drivers (4289 city bus, 1550 taxi, and 490 truck drivers) with the Transportation Safety Authority. The institute was concerned with driving safety of commercial vehicle drivers (i.e., bus, taxi, and truck drivers) for citizens and had collected vehicle operations data from vehicles through the use of digital tachograph (DTG) devices. The institute wanted to develop services to manage the drivers and transportation companies. The designed service monitors the safety of local driving of city buses in certain cities and provides interventions to drivers and transportation companies. This project is highly relevant to ICT-enabled smart transportation in cities, particularly to the use of sensors throughout a city for data collection. The designed service can be classified into the local operations management category in Section 4.
Project 2: This project designed eco-driving support services based on the analyses of data on driving and fuel consumption of 33 bus drivers with the same institute as that of Project 1. The fuel efficiency of buses is often lower than that of other vehicles because buses are heavier (including the weight of the vehicle and passengers) than other vehicles. Thus, the Transportation Safety Authority was also concerned with improving the fuel efficiency of buses by modifying the driving behavior of bus drivers. The service designed in Project 2 monitors the fuel efficiency of buses and provides educational content to drivers about the necessity of eco-driving. For example, the service system provides drivers with feedback on their previous driving behavior whenever they check their driving schedules before departure. If drivers decelerate rapidly while driving, an alarm of rapid deceleration may be sent through an in-vehicle device. At the end of the day, a report about driving behavior during the day is relayed to drivers through their smartphones. This project is highly relevant to ICT-enabled smart transportation in cities, particularly to the use of sensors throughout a city for data collection. The designed service can be classified into the local operations management category in Section 4.
Project 3: This project designed health-related data-based services for health-related stakeholders with the National Health Insurance Service (NHIS), based on interviews with 34 experts such as doctors, public health scientists, managers and executives in the industry, and government employees. NHIS had collected various types of health-related data, including insurance, diagnosis, treatment, and medical examination data, and aimed to design service concepts that would serve as bases for the innovation of data-based services in the health industry. This project is highly relevant to ICT-enabled smart health care and management in cities, particularly to data integration and security across organizational connections. One of the designed services, called "cloud family doctor," supports both citizens and local family doctors in reviewing the personal health records of citizens and caring for their health using local healthcare resources. This service can be classified into the preventive local administration and local network development categories in Section 4. Another designed service, called "local health-nostics," provides diagnostic and prognostic health information, such as disease maps and local health statistics, to local governments. This service can be classified into the preventive local administration category in Section 4. Another designed service, called "hospital QA," assesses the service quality of hospitals and provides this information to patients and their families to assist them in choosing an adequate local hospital. This service can be classified into the local information diffusion category in Section 4.
Project 4: This project designed a hypertension patient management service with a government organization. The government was alarmed regarding the cost needed to treat hypertension and wanted to provide data-based hypertension management to citizens using the data from the same public health insurance system used in Project 3. The service design process is as follows. First, a database for hypertension prediction algorithm development was created based on the cohort database of public health insurance. The database contained insurance data, diagnosis history data, treatment history data, medical examination data, and hypertension onset data from 2008 to 2010 of citizens who did not have hypertension before 2008. Second, the database was analyzed to develop a hypertension prediction model. Third, we designed a data-based hypertension management service through brainstorming, interviews with 17 experts, and analysis of the current service of public health centers. The final service design identified high-risk hypertension patients based on the hypertension prediction model, and provided a warning and a guide for users to visit public health centers in cities and get a checkup. This project is highly relevant to ICT-enabled smart health care and management in cities, particularly to the use of artificial intelligence for data processing. The designed service can be classified into the preventive local administration and local network development categories in Section 4.

citizens and visitors). Each quadrant in the figure includes a name, representative cases, and a conceptual depiction of the category. Whereas the previous classification of applications related to smart cities (Fig. 3) is based mainly on the differences among industry areas, each of the four categories in Fig. 4 focuses on a specific area of data and information interactions and suggests a generic reference model for data-based smart cities that is independent of industry area.
The local operations management category (Fig. 4, upper right) relies on data from local service providers, such as city infrastructure, environmental, and company resource data. As the conceptual depiction of the category indicates, numerous cases in this category sought to improve the operational processes of local governments and companies based on enhanced communication within processes (between resources, infrastructure, and citizens or visitors) using data. Examples include the following: community energy management in Yokohama, for which data were collected from home energy management, building energy management, and electric vehicles to optimize energy consumption in a local community (KLID, 2014); waste management in Delhi, for which data were gathered from trash bins using RFID tags and trash collection locations and times were scheduled (Purohit & Bothale, 2011); and transportation management in Singapore, for which data were collected from roads and taxis to anticipate future traffic and control traffic lights (Lee, 2013).
The preventive local administration category (Fig. 4, upper left) relies on data from individual customers, including civil complaints, transportation card usage, mobile phone call location records, and social network service (SNS) messages, for the benefit of local service providers. Numerous cases in this category aimed to understand the sources of citizens' problems in urban living through data and to anticipate solutions for potential problems, which increases the effectiveness of administration and prevents waste of resources. Examples in this category include civil complaint prevention in Busan, for which 10 years of civil complaint records in a local district were analyzed and strategies were identified to manage illegal parking, dust scattering from construction, streetlights, and other sources of citizen complaints (KLID, 2014); festival congestion prevention in Amsterdam, for which telephone traffic data during festival periods were analyzed to help provide a plan for the extension of transportation service time and the dispatch of police for upcoming festivals (Lee, 2013); and crime prevention in San Francisco, for which crime records were used to predict future crime locations, patrol identified locations, and prevent potential crimes (Lee, 2013).
The local network development category (Fig. 4, lower left) relies on data from customers to help customers. Numerous cases in this category sought to enhance the accessibility of local customer resource networks based on an enhanced understanding of them. Examples include midnight bus service routing and scheduling, for which the mobile call records and taxi-use data generated in Seoul were analyzed to identify the locations of people and their movements in the city during late-night hours, enabling optimization of bus routes and schedules based on late-night demand (NIA, 2013); optimization of Wi-Fi hotspots in Seoul, for which mobile call records were used to find optimal Wi-Fi hotspot locations; and welfare facility location planning in Seoul, for which data on senior citizens' living locations and their welfare facility use patterns were analyzed, ultimately identifying a supply/demand imbalance, knowledge that helped in planning a new welfare facility development (KLID, 2014).
The local information diffusion category (Fig. 4, lower right) relies on data from local service providers to help customers. Numerous cases in this category aimed to efficiently and effectively identify and spread useful information about cities to citizens and visitors. Examples in this category include air pollution monitoring in London, for which data from pollution sources across the city along with European weather forecasts were analyzed in order to create a pollution map; precipitation monitoring in Rio de Janeiro, for which a flood prediction model was developed based on land survey information, precipitation statistics, and radar data (Kitchin, 2014); and intelligent navigation in Milan, for which data on factors affecting traffic flows were analyzed, including real-time traffic situations, accidents, weather, construction, and events, from sensors placed throughout the city (Lee, 2013). In all these examples, the information identified (e.g., flood warnings and fast routes) was spread to citizens and visitors.
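The two-axis classification described above can be expressed as a simple lookup, which may help when sorting new use cases into the four reference models. The following is a minimal sketch, not the authors' tooling: the `UseCase` class, the `classify` function, and the axis labels are hypothetical names introduced for illustration, and the assignments of the example cases follow the descriptions in the text.

```python
from dataclasses import dataclass

# Labels for the two levels on each axis of the classification matrix.
PROVIDERS = "local service providers"  # local government and companies
CUSTOMERS = "individual customers"     # citizens and visitors

# (main data source, direct beneficiary) -> reference model named in the text.
REFERENCE_MODELS = {
    (PROVIDERS, PROVIDERS): "local operations management",
    (CUSTOMERS, PROVIDERS): "preventive local administration",
    (CUSTOMERS, CUSTOMERS): "local network development",
    (PROVIDERS, CUSTOMERS): "local information diffusion",
}


@dataclass(frozen=True)
class UseCase:
    """Hypothetical record describing one big data use case."""
    name: str
    data_source: str
    beneficiary: str


def classify(case: UseCase) -> str:
    """Return the reference model for a use case from its two axis values."""
    return REFERENCE_MODELS[(case.data_source, case.beneficiary)]


cases = [
    UseCase("Community energy management (Yokohama)", PROVIDERS, PROVIDERS),
    UseCase("Civil complaint prevention (Busan)", CUSTOMERS, PROVIDERS),
    UseCase("Midnight bus routing (Seoul)", CUSTOMERS, CUSTOMERS),
    UseCase("Air pollution monitoring (London)", PROVIDERS, CUSTOMERS),
]

for case in cases:
    print(f"{case.name}: {classify(case)}")
```

A real use case may draw on both kinds of sources or serve both kinds of beneficiaries, as several services in Table 3 do; a fuller version would therefore allow a case to map to more than one category.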
A framework that describes the variety of big data use cases in smart cities is necessary to promote the development of new cases with emerging and upcoming urban big data. Our proposed classification matrix is a framework for understanding similarities and differences between such cases, helping us view big data use in cities with beneficiary-centered and application-oriented thinking. The framework suggests four reference models for data-based smart city transformation that can help in conceptualizing and planning smart city projects based on projects previously carried out. The classification can also help in assigning benchmarks for each category. The four reference models can be applied to help improve any operational area of a city, such as education, transportation, and health care.
People and organizations pay for goods and services to perform jobs, whether farming, driving, dating, or other businesses (Ulwick, 2005). Innovation is achieved by enabling jobs to be performed better than before (Bettencourt, 2010; Lim, Kim, Hong, & Park, 2012). Our case analysis shows that big data use for smart cities contributes mainly to the creation of useful information so that citizens, visitors, local government, and companies can do their jobs better. In this light, exploration and exploitation of big data for smart cities should be driven by the stakeholders rather than by technology. This is a finding consistent with the people-centered view of smart cities (Hollands, 2015; Stratigea et al., 2015; Vanolo, 2016). For example, Stratigea et al. (2015) state that "smart-city solutions must start with the 'city' not with the 'smart,' shifting from a technology-pushed to an application-pulled smart-city planning approach, matching different types of 'smartness' (technologies, tools, and applications) with different types of urban functions and contexts."

Challenges in using data for smart cities
The previous sections show that the types and quantities of data in modern cities have increased considerably over time. The proliferation of big data provides numerous opportunities for data-based smart city transformation. However, as with any large-scale initiative for change, the move to data-based smart city transformation is not easy. The challenges in executing smart city projects that depend on using urban big data need to be determined. In this section, we focus on challenges that arise during the transformation of data to information for smart cities. Fig. 5 illustrates the process from data collection to information delivery in smart cities and the six challenges we identified. Challenges 1-3 relate more to data than to information, whereas Challenges 4-6 relate more to information than to data.
Challenge 1 pertains to data quality management. A prerequisite for identifying trustworthy smart city information is the quality of urban data. In Project 1, some of the vehicle driving data collection devices generated incorrect data, including strange or missing values. Moreover, vehicles used data collection devices from different device manufacturers, generating an unnecessarily large variance in the collected data. The quality of driving schedule data and accident history data was also inferior because the importance of data quality was not well understood by the local service providers. This project highlights the importance of local transportation-related data quality. Similarly, in Project 4, some of the data obtained from the national health insurance database were non-standardized or inaccurate. Given this data problem, developing a hypertension prediction algorithm for service was a challenging task. The quality of available data should be considered, and ways to enhance quality should be identified in data-based smart city development projects for project success.
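The kind of quality screening that Challenge 1 calls for can be sketched in a few lines. The example below is illustrative only and does not reproduce the procedure used in Project 1: the field names (`speed_kmh`, `rpm`, `manufacturer`), the plausible-value ranges, and the sample records are all assumptions. It flags missing or out-of-range values and reports the share of flagged records per device manufacturer, which is one way to surface manufacturer-related variance.

```python
from collections import defaultdict

# Hypothetical plausible ranges; real limits would come from domain experts.
VALID_RANGES = {"speed_kmh": (0.0, 200.0), "rpm": (0.0, 8000.0)}


def check_record(record):
    """Return a list of quality issues found in one driving record."""
    issues = []
    for field, (low, high) in VALID_RANGES.items():
        value = record.get(field)
        if value is None:
            issues.append(f"{field}: missing")
        elif not low <= value <= high:
            issues.append(f"{field}: out of range ({value})")
    return issues


def issue_rate_by_manufacturer(records):
    """Return the share of records with at least one issue, per manufacturer."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for record in records:
        maker = record.get("manufacturer", "unknown")
        totals[maker] += 1
        if check_record(record):
            flagged[maker] += 1
    return {maker: flagged[maker] / totals[maker] for maker in totals}


# Made-up sample records for illustration.
records = [
    {"manufacturer": "A", "speed_kmh": 62.0, "rpm": 1800.0},
    {"manufacturer": "A", "speed_kmh": None, "rpm": 1750.0},
    {"manufacturer": "B", "speed_kmh": 480.0, "rpm": 2100.0},
    {"manufacturer": "B", "speed_kmh": 55.0, "rpm": 1900.0},
]

print(issue_rate_by_manufacturer(records))
```

Range checks of this kind catch only obvious errors; inconsistencies between sources, such as schedule data that disagree with recorded trips, would need separate rules.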
Challenge 2 refers to the integration of data from different sources. As shown in Fig. 5, various types of data are collected from different sources in modern cities. The key is to connect different types of data to produce a high level of knowledge and high-quality information for citizens and city officials. However, connecting data from different sources is difficult because different organizations use different data structures. In Project 1, the integration of driver, driving, and accident data posed a challenge because these data were archived in databases of transportation companies, vehicles, and the National Police Agency (NPA), respectively. In Project 4, data from different sources were integrated to prepare a cohort database for the development of a hypertension prediction algorithm. Such tasks are time consuming and require a range of expertise in diagnosis and data mining. Many experts interviewed in Project 3 mentioned that data integration and standardization are among the most urgent issues in big data use in the health sector for public as well as industrial purposes. Obtaining useful data for a smart city and planning for data integration should be conducted in data-based smart city development projects to understand the project scope and potential for its expansion.
Challenge 3 concerns data privacy. In Project 3, 612 citizens were surveyed to evaluate the cloud family doctor, hospital QA, and other services that target citizens quantitatively and qualitatively. The respondents were users of public health promotion centers in multiple cities. They were asked about their intention to use the service, their requirements for the service, and the priority of the service. Many participants were concerned about their privacy. Although the citizens were receptive of the new value that new services can create, some stated explicitly that a prerequisite of service implementation should be a guarantee of their privacy. Investigating the privacy issues and addressing these concerns are essential in data-based smart city development projects to create valid and sustainable value for citizens and visitors.
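Challenges 2 and 3 often meet in practice: records held by different organizations have to be linked through a common identifier, and that identifier is usually personal. The sketch below shows one possible way to combine the two concerns, under stated assumptions. It is not the method used in the projects; the record layouts, the field `driver_id`, and the functions `pseudonymize` and `integrate` are hypothetical. Each source's driver identifier is replaced with a keyed hash (HMAC-SHA256 from the Python standard library) before the three sources are merged, so that the merged dataset carries no raw identifiers.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be held by a trusted party
# and never stored alongside the data.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(driver_id: str) -> str:
    """Map a raw identifier to a stable keyed hash (HMAC-SHA256)."""
    return hmac.new(SECRET_KEY, driver_id.encode("utf-8"), hashlib.sha256).hexdigest()


def integrate(drivers, trips, accidents):
    """Merge three sources on the pseudonymized driver identifier.

    Trip and accident rows whose driver is unknown are dropped here;
    a real pipeline would log them for follow-up instead.
    """
    merged = {}
    for row in drivers:
        key = pseudonymize(row["driver_id"])
        merged[key] = {"company": row["company"], "trips": [], "accidents": []}
    for source_name, rows in (("trips", trips), ("accidents", accidents)):
        for row in rows:
            key = pseudonymize(row["driver_id"])
            if key in merged:
                payload = {k: v for k, v in row.items() if k != "driver_id"}
                merged[key][source_name].append(payload)
    return merged


# Made-up sample records standing in for the three separate databases.
drivers = [
    {"driver_id": "D1", "company": "Bus Co"},
    {"driver_id": "D2", "company": "Taxi Co"},
]
trips = [
    {"driver_id": "D1", "distance_km": 42.0},
    {"driver_id": "D2", "distance_km": 15.5},
    {"driver_id": "D9", "distance_km": 7.0},
]
accidents = [{"driver_id": "D1", "year": 2016}]

merged = integrate(drivers, trips, accidents)
for key, record in merged.items():
    print(key[:8], record)
```

Pseudonymization of this kind reduces exposure of identifiers but does not by itself guarantee privacy, since combinations of the remaining attributes can still identify a person; legal and regulatory review remains necessary.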
Challenge 4 refers to understanding the needs of citizens, visitors, and employees. An expert interviewed in Project 3 mentioned that big data used for public purposes should be citizen-driven rather than ICT-driven. A prerequisite of big data use is identifying the right information for citizens; the requirements of employees are also crucial. The beneficiaries of big data in smart cities include city officials and employees of local companies. In Project 3, employees of the National Health Insurance Service (NHIS) were interviewed to determine their needs concerning the use of its organizational data. These employees aimed to use data for the selection and implementation of concentration strategies for their respective jobs and the identification of employees who required further training or special care. This result indicates that employees of local service providers benefit significantly from urban big data. Determining useful information for citizens, visitors, and employees is crucial in data-based smart city development projects because identifying the information to deliver to customers is directly connected to the value and appeal of a service.
Challenge 5 pertains to enhancement of geographic information delivery methods. Many big data use cases aim to analyze the data and deliver identified information according to a geographic unit (e.g., district and building). In Project 2, analyzing the data delivery by region was challenging because no geographic information system (GIS) was available. For example, a visual comparison of efficiency of different bus routes was difficult to create. Several experts evaluated the local health-nostic service in Project 3 and discussed the importance of the geographical visualization of local health information. These studies indicate that the key success factor of big data use in smart cities is the effectiveness of information visualization and delivery using GIS. In data-based smart city development projects, visualizations of information content should be clear and substantial to enhance information acceptance by citizens, visitors, and employees.
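Before any map can be drawn, the measurements have to be summarized per geographic unit. The following minimal sketch shows that aggregation step for the bus route comparison mentioned above; it assumes hypothetical field names (`route`, `km_per_liter`) and made-up values, and it stops short of the GIS visualization itself.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_unit(observations, unit_field, value_field):
    """Average a measured value for each geographic unit (route, district, ...)."""
    grouped = defaultdict(list)
    for obs in observations:
        grouped[obs[unit_field]].append(obs[value_field])
    return {unit: mean(values) for unit, values in grouped.items()}


# Made-up fuel efficiency observations for illustration.
observations = [
    {"route": "Route 1", "km_per_liter": 2.0},
    {"route": "Route 1", "km_per_liter": 3.0},
    {"route": "Route 2", "km_per_liter": 4.0},
]

summary = summarize_by_unit(observations, "route", "km_per_liter")

# List routes from least to most efficient as input for a map or chart.
for route, value in sorted(summary.items(), key=lambda item: item[1]):
    print(f"{route}: {value:.2f} km/L")
```

The resulting per-unit table is what a GIS layer or chart would consume; joining it to route or district geometries is the part that requires the GIS support the project lacked.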
Challenge 6 pertains to the design of smart city services that deliver information from urban big data. As described previously, various kinds of information are available from urban big data. However, we found through Projects 1-4 that a large portion of information often remains in reports and hard disks after the data analysis projects and is not used for actual value creation. If such information can be eventually delivered to citizens and visitors through a service, the relationship between provider and customer can be strengthened. This approach enhances the efficiency of continuous value co-creation. As such, the development of a service that delivers smart city information (such as the kinds shown in Fig. 3) can be useful to citizens and local governments. Designing a data-based smart city service is crucial because this task integrates all of the outcomes from the data analytics, ideation, and information content design for a smart city.
As shown in Table 4, these six challenges are consistent with those discussed in the literature, such as managing data quality (e.g., Al Nuaimi et al., 2015; Ben Sta, 2017), integrating data from different sources (e.g., Clarke & Steele, 2011; Ruijer et al., 2017), protecting personal data privacy (e.g., Bibri, 2018b; Kitchin, 2014), understanding the needs of people (i.e., citizens, visitors, and employees) (e.g., Afzalan et al., 2017; Díaz-Díaz et al., 2017), enhancing geographic information delivery methods (e.g., Aina, 2017; Hashem et al., 2016), and designing attractive yet feasible smart city services (e.g., Cicirelli et al., 2017; Urbieta et al., 2017). This shows that our action research into practice successfully confirms and augments existing knowledge and arguments in the literature with further empirical evidence.
These challenges are interrelated because data collection, information creation, and information delivery in data-based smart cities are interdependent activities. For example, understanding the needs of citizens, visitors, and employees (Challenge 4) is a prerequisite to proper data integration (Challenge 2), and designing attractive and workable smart city services (Challenge 6) requires consideration of the other five challenges. The six challenges indicate that big data use for smart cities requires various types of expertise, including knowledge of citizens, city administration, data management, data analytics, law, and service design. For example, addressing Challenge 1 requires expertise in data management, whereas Challenge 3 requires expertise in law and regulations. Marketing expertise would contribute to addressing Challenge 4. Meanwhile, Challenge 6 can be addressed only through the integration of a diverse range of expertise.

Considerations in using data for smart cities
How can cities implement one or more of the four reference models in Section 4 and address the six challenges in Section 5? We propose five items that should be considered in transforming cities into smart ones using big data by integrating the findings and relevant literature discussed previously: (i) the service orientation in seeing and using data; (ii) the customer experience regarding data and information; (iii) the data orientation in designing service applications; (iv) the synergies and conflicts between data-related stakeholders; and (v) the integration of different perspectives on data and applications. Table 5 lists related studies and supports our findings with relevant literature. Fig. 6 shows the five considerations and their relationships with the challenges and reference models discussed previously, suggesting an integrated framework. The five considerations are inherently correlated with each other in that they overlap in their challenges; our main purpose is to integrate our findings into a comprehensive set of considerations for using big data for smart cities rather than to define mutually exclusive ones. This framework can be used in executing smart city projects that rely on the use of urban data. We describe each of the five considerations in turn.
The first consideration is to adhere to a service-oriented perspective for value creation in collecting and analyzing urban data. Various directions can be taken in using a given dataset, and all of the distinct pieces of data collection, integration, management, and analysis activities will be organized according to the direction(s) taken. Thus, a prerequisite of the use of big data for a smart city is to define the directions or themes of the applications that will be developed and to identify the requirements for data, the data analysis system, and other elements that will facilitate the applications in question. The effectiveness and efficiency of using urban big data can increase considerably when data-related activities are coherently oriented toward ultimate service functionality (i.e., ultimate value from data). To adhere to a service-oriented perspective in seeing and using the data in the four projects, we defined specific service themes that were highly relevant at the first stage of each project (e.g., safety enhancement and hypertension prevention themes in Projects 1 and 4, respectively), and we developed service design templates to efficiently design services. Such themes were useful in considering how the available data can be used effectively for services. Furthermore, the templates were useful for integrating the data analysis results, the service ideation, and the information content design into several services for actual value creation.
Although each consideration is related to all of the challenges to some extent, in our projects we observed the utility of each consideration for specific challenges. The service orientation consideration was useful in addressing Challenge 1, assessing data quality in terms of its usefulness to actual applications; Challenge 2, identifying and integrating valuable data sources for the service in question; and Challenge 4, designing serviceable information content that is needed by city stakeholders. Furthermore, in our projects, the service orientation in seeing data effectively identified issues in implementing the services for local communities (Challenge 6).
The second consideration is to pay attention to the experience of people (e.g., citizens) in collecting and using personal data and delivering information to them. Aside from utilitarian functions, creating experiences and emotionally appealing moments matter to customers of any type of service. Attention to personal experience (e.g., citizen experience) is also important in a smart city context. In particular, it is important not to provoke citizens regarding privacy issues (e.g., on data for places they have driven) nor burden them (e.g., with time requirements for recording data). Not only should the information provided be useful to the customers, but the data collection method should provide a natural user interface for data collection, the use of data should consider privacy, the visualization of information content should be clear and rich, and the service delivery process should generally please customers. In Project 3, the cloud family doctor service assesses the levels of people's health and provides simple suggestions for enhancing their health through daily exercise and diet goals. The hypertension management service designed in Project 4 first provides a short message to potential patients and then gently warns them about the potential onset of their hypertension and encourages them to visit a local public health center. We observed in our projects that the personal experience consideration in a data-based smart city application can contribute to addressing Challenge 3, in attending to privacy issues in advance as well as the regulation issues that go against or complement their experience; Challenge 4, in devising and delivering the essential information to city stakeholders; and Challenge 5, in creating user-friendly geographic delivery methods to enhance information acceptance.

Table 4
Existing studies related to the six challenges.

Challenge: Related studies
Managing the data quality: Perera et al. (2014), Al Nuaimi et al. (2015), Ben Sta (2017) Ang and Seng (2016)
The third consideration is to employ a data-oriented perspective in designing services. Smart city projects should devise valid and workable services for public purposes and should consider the technological aspects that enable the services. Thus, understanding the data collection, management, and analysis mechanisms is crucial for designing valid and workable services. This requirement is a key difference distinguishing the design of a data-based public service from other types of public service. A data-oriented perspective, such as a critique of the validity of data and feasibility of data analysis, should be applied to obtain practical services because the data in question are the core resource enabling service design and value creation for local communities. For example, in Project 3, experts evaluated 138 service ideas, with one of the evaluation criteria being the feasibility of the idea. Experts pointed out that multiple ideas were infeasible and difficult to realize because of the limited availability of required (quality) data and data analysis systems. Whereas the first and second considerations mainly concern the potential use of the available urban data from a service-oriented perspective, the focus of this consideration is the viability and realization of the service in question from a data-oriented perspective.
We observed in our projects that data orientation is useful in addressing Challenge 1, assessing data quality to evaluate the feasibility of the service; Challenge 2, efficiently integrating different data without damaging the information that available data contain; and Challenge 3, designing a reliable data collection method, which is essential for realizing the service in question considering the regulation and privacy issues concerning the data to be used.
The fourth consideration is to create synergies and minimize conflicts between data-related stakeholders. The notion of stakeholders involved in service has been emphasized in the literature to design sustainable and workable services that create value for multiple stakeholders. As shown in Figs. 3 and 4, value creation from the use of big data in smart cities involves various types of stakeholders. We believe the art of using big data for smart cities lies in effective matchmaking among the concerns of different stakeholders. For example, in Project 3, we constructed an implementation plan for each designed service to mediate the conflicts pointed out by the experts interviewed. In one case, an expert mentioned that the cloud family doctor service should start by simply "reviewing" the integrated data of personal health records of a customer (e.g., showing descriptive statistics of the data) without providing any healthcare information, which might be inaccurate or controversial (e.g., predictive or prescriptive information about a disease), and without invading the territory of doctors (e.g., by showing a diagnostic message). The expert suggested emphasizing the value of the service as a complement to existing healthcare services and evolving the service step by step once it was accepted by multiple stakeholders. Similarly, Projects 1 and 2 considered the different perspectives of bus drivers, citizens, transportation companies, and governments, whereas Project 4 analyzed the different stakes of citizens, public health centers, and central government.

Table 5 Existing studies related to the five considerations.
We observed in our projects that creating synergies and minimizing conflicts between data-related stakeholders are required to address Challenge 2, collecting and integrating rich data from different stakeholders; Challenge 3, minimizing regulatory conflicts between the stakeholders; Challenge 4, winning stakeholders over to the service by addressing their needs; Challenge 5, enhancing information acceptance based on the authority or contribution of GIS management organizations; and Challenge 6, designing a service delivery process that is based on an appropriate partnership.
The fifth consideration is to form a cross-functional team for the use of big data. Whereas Considerations i-iv concern the methodological challenges in big data use for smart cities, this consideration addresses organizational and cultural challenges. By nature, any city improvement project involves "soft" tasks that require multidisciplinary human activities. More specifically, we observed in our four projects that the use of big data to advance public service requires organizing a cross-functional team with members from various functional units, including planning, design, engineering, IT, statistics, and administration, all of which should involve various types of experts. We performed Projects 1 and 2 with transportation and mechanical experts; Project 3 with doctors, public health scientists, data scientists, IT specialists, business experts, and government employees; and Project 4 with chronic disease experts and statisticians. In this respect, another artistic aspect of the use of big data for smart cities is to integrate the expertise of different professionals into a set of knowledge for data analytics and service design. Forming a cross-functional and transdisciplinary team is required for addressing all the challenges and considerations previously mentioned. By integrating the expertise of individuals from multiple backgrounds, ideas to address the challenges and considerations can be identified. We believe our proposals can be used to gain a multidisciplinary perspective and form a cross-functional team.
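The correspondence between considerations and challenges reported above can be summarized as a small lookup table. The following Python sketch is illustrative only: the challenge numbers and the labels i-v follow the text, whereas the short names in parentheses and the helper function `considerations_for` are shorthand introduced here and are not part of the original framework.

```python
# Challenges 1-6 and Considerations i-v as numbered in the text.
ALL_CHALLENGES = {1, 2, 3, 4, 5, 6}

# Consideration -> challenges it was observed to help address in the four projects.
# The parenthesized short names are illustrative shorthand, not the authors' terms.
CONSIDERATION_TO_CHALLENGES = {
    "i (service orientation)": {1, 2, 4, 6},
    "ii (personal experience)": {3, 4, 5},
    "iii (data orientation)": {1, 2, 3},
    "iv (stakeholder synergies)": {2, 3, 4, 5, 6},
    "v (cross-functional team)": set(ALL_CHALLENGES),  # stated as required for all
}


def considerations_for(challenge):
    """Return the considerations reported as useful for a given challenge number."""
    return sorted(
        name
        for name, challenges in CONSIDERATION_TO_CHALLENGES.items()
        if challenge in challenges
    )


if __name__ == "__main__":
    for challenge in sorted(ALL_CHALLENGES):
        print(challenge, considerations_for(challenge))
```

Inverting the table in this way makes it easy to check, for a given challenge, which considerations a project team may want to revisit first.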

Policy development toward developing smart cities with big data
The four projects used to derive the above challenges and considerations were all governmental projects; Projects 1 and 2 were with the Transportation Safety Authority (TS), and Projects 3 and 4 analyzed NHIS data (Kim et al., 2018; Lim et al., 2018a; 2018b; 2018c). As such, the findings of our work should be considered in the development of policies for smart cities with big data. For example, through the four projects we found that a policy to address Challenge 2 (integrating different data) and Consideration iv (creating synergies and minimizing conflicts between data-related stakeholders) will be needed for effectively using big data for smart cities.
In Project 1, we compared risky driving behaviors of truck drivers with and without accident records (34 and 289 drivers, respectively). The driving data were collected November 1-30, 2013. We found that the accident group committed regulated risky driving behaviors, such as rapid deceleration and lane changing, significantly more often than the non-accident group (p value < 0.01). This result indicates the necessity of developing a policy for managing risky driving behaviors of those truck drivers having an accident record. A local operations management service (Fig. 4, upper left), such as the driving safety enhancement service designed in Project 1, can be used for the policy implementation. This service requires the integration of data from different organizations, such as driving data from TS and accident records from the NPA. Therefore, we suggested to the TS government employees that they develop a policy for regularly importing accident records of commercial vehicle drivers to create synergy with regard to national driving safety management.
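The kind of analysis described for Project 1 can be sketched in two steps: joining driving data with accident records on a shared driver identifier, and then comparing the two resulting groups. The Python sketch below is a minimal illustration under stated assumptions. The text does not specify which statistical test was used, so a one-sided permutation test on the group means stands in for it here; the driver IDs, violation counts, and function names are hypothetical and synthetic, not data or code from the project.

```python
import random
from statistics import mean


def split_by_accident(violations_by_driver, accident_driver_ids):
    """Join driving data with accident records on a shared driver ID.

    Both inputs are hypothetical stand-ins for data held by two organizations.
    """
    accident, no_accident = [], []
    for driver_id, count in violations_by_driver.items():
        if driver_id in accident_driver_ids:
            accident.append(count)
        else:
            no_accident.append(count)
    return accident, no_accident


def permutation_p_value(group_a, group_b, n_resamples=2000, seed=0):
    """One-sided permutation test: is mean(group_a) larger than mean(group_b)?"""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random reassignment of drivers to the two groups
        if mean(pooled[:n_a]) - mean(pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate from being exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Synthetic example: monthly counts of risky driving events per driver.
violations = {"d1": 12, "d2": 15, "d3": 14, "d4": 3, "d5": 2, "d6": 4, "d7": 1, "d8": 5}
accident_ids = {"d1", "d2", "d3"}

accident, no_accident = split_by_accident(violations, accident_ids)
p_value = permutation_p_value(accident, no_accident)

if __name__ == "__main__":
    print(f"accident mean={mean(accident):.1f}, "
          f"non-accident mean={mean(no_accident):.1f}, p={p_value:.3f}")
```

The join step is where the policy question arises: the comparison is only possible if accident records can be matched to driving data through an identifier that both organizations share.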
Likewise, in Project 3, we discussed the direction of policy development with the government employees in the NHIS to facilitate local information diffusion services for citizens' health (Fig. 4, lower right). As for the development of such services, they agreed with us that the challenge of integrating data from different organizations (Challenge 2) is crucial because different organizations managed different health-related data; for example, the NHIS managed diagnosis and treatment data of insured diseases only, while another organization focused on diagnosis and treatment data of uninsured diseases. Thus, we suggested to the government employees in the NHIS that they work with the Ministry of Health and Welfare to develop a policy to create synergy between the two governmental organizations by sharing the data on insured and uninsured diseases (Consideration iv).
Policies to address other challenges and considerations should be developed as well. An example is a policy to address Challenge 3 (addressing privacy issues) and Consideration ii (paying attention to the experience of people in collecting and using their data and delivering information to them). As discussed in Section 5, Project 3 involved a survey of 612 individuals about the services designed for citizens and visitors (i.e., local information diffusion and local network development), and some of the respondents were worried that new services would invade their privacy; therefore, a policy to enforce the protection of individuals' privacy by organizations, or at least to address Consideration ii, should be developed. Although such implications for new policy development merit more discussion, this is a future research issue that should be tackled by a separate study. Thus, we encourage future studies to address the challenges and considerations identified in this work.

ICT application toward developing smart cities with big data
Using big data for smart cities inherently involves the ICT-rich nature of smart cities, such as the use of sensors throughout a city for data collection, IoT for real-time data transmission to a central server (Atzori, Iera, & Morabito, 2010), AI and cloud computing for data processing (Iyoob, Zarifoglu, & Dieker, 2013), blockchain for data security and organizational connection (Risius & Spohrer, 2017), geographic information visualization for analysis result delivery (Aina, 2017), and design of smart services for data collection and information delivery (Maglio & Lim, 2016). All four governmental projects aimed to apply such ICT to improve the previous systems and services. Projects 1 and 2 concerned the application of sensor devices to commercial vehicles in cities, development of an AI model to measure and improve driving safety and fuel efficiency, and design of smart services for drivers and transportation companies. Projects 3 and 4 concerned such ICT application in the context of citizens' health care and management.
The four proposed reference models are also highly relevant to the ICT-rich nature of smart cities, particularly to the connectivity among elements in cities. Unlike the connectivity among elements in human-designed systems, such as mechanical systems, connectivity in cities is not complete or fully tangible because cities are sociotechnical systems. Accordingly, one difficulty in operating complex city systems is the lack of capability to monitor and manage complexity in an automatic or semi-automatic manner. However, recent ICT advances have contributed to enhancing the connections among system elements (e.g., tangible goods directly used by citizens as well as dedicated infrastructures generally required by citizens and local organizations), data collection from elements (e.g., traces of engineering systems, event logs of business processes, health and behavioral records of people, and biosignals of animals), and computation (e.g., context-aware and real-time computation) and communications (e.g., machine-to-machine actuation and machine-to-human guidance) for efficient decision making and control within modern cities. As shown in Fig. 4, these advancements contribute to monitoring and managing the complexity of cities for value creation. The four proposed reference models are examples, and we encourage future studies to extend the models identified in this work.
Ironically, however, enhanced connectivity requires new equipment (e.g., data and servers) and organizations (e.g., data management and analytics companies) and increases the complexity of human jobs. In this regard, another difficulty in operating modern complex cities is the necessity of knowledge in managing the problems arising from connectivity. For example, the security and privacy of interactions within systems need to be managed by intermediary organizations or technologies. The proposed six challenges and five considerations were identified to address this difficulty and facilitate such ICT application to smart cities. Nonetheless, the findings are mainly based on only four projects with government organizations. Thus, we encourage future studies to identify more challenges and considerations for ICT application toward developing smart cities with big data.

Concluding remarks
The fundamental aspect of recent and expected data-based smart city innovations is not ICT, data, or intelligent infrastructure, but the new applications for value creation for stakeholders (e.g., citizens). The use of urban big data contributes to the creation of information for stakeholders to perform their processes better and create value. The main contribution of this paper is the development of knowledge and frameworks for data use for smart cities drawing from this application-oriented perspective. The proposed classification scheme suggests four reference models to create value for citizens, visitors, local government, and companies using the data obtained from them. There are at least six challenges in transforming urban data into information for smart cities. The proposed five considerations for the collection, management, and analysis of urban data will help address those challenges in implementing the reference models.
Our work is unique in that it used an empirical approach to identify such findings by analyzing existing use cases of big data for smart cities and by conducting four action research projects with government organizations that create new use cases. Some existing studies provide knowledge about data use for smart cities. However, through our projects we found that such knowledge is scattered across different fields and domains. One of our challenges in each project was to integrate and customize existing knowledge of different experts for the context of using big data for smart cities. This context is a truly interdisciplinary research topic. The findings of our work specialized in this context are grounded in and meaningfully integrate existing work from both within and outside the smart cities literature. In summary, our findings will help in conceptualizing, planning, and executing smart city projects that rely on the use of urban big data. We hope this paper will stimulate further application-oriented discussions on the use of urban big data.
Future research could address several issues to further develop our findings. First, although Section 7 briefly discusses policy development and ICT application toward developing smart cities with big data, more studies on these issues are required. Second, given the widespread applications of smart cities with big data, more reference models, challenges, and considerations should be identified by different researchers based on different projects and analyses. Third, development of review papers in addition to those existing (e.g., Bibri, 2018b) would be valuable for the comprehensive integration of existing studies into a single framework. Fourth, additional empirical studies on our findings are required to determine the relative impacts of the challenges and considerations in different contexts. Finally, the main scope of this study was the set of reference models, challenges, and considerations itself; the design methodology is outside this scope and thus is not directly addressed here. Nonetheless, the reference models, challenges, and considerations could be used and tested further in future studies to ensure the development of a solid design methodology for smart city applications with big data.