Google Trends data and transfer function models to predict tourism demand in Italy

Giovanni De Luca (Department of Management and Quantitative Studies, University of Naples Parthenope, Napoli, Italy)
Monica Rosciano (Department of Management and Quantitative Studies, University of Naples Parthenope, Napoli, Italy)

Journal of Tourism Futures

ISSN: 2055-5911

Article publication date: 21 March 2024

Abstract

Purpose

The tourist industry has to adopt a big data-driven foresight approach to enhance decision-making in a post-COVID international landscape still marked by significant uncertainty and in which some megatrends have the potential to reshape society in the next decades. This paper, considering the opportunity offered by the application of the quantitative analysis on internet new data sources, proposes a prediction method using Google Trends data based on an estimated transfer function model.

Design/methodology/approach

The paper uses time-series methods to model and predict Google Trends data. A transfer function model is then used to transform the predictions of Google Trends data into predictions of tourist arrivals. The method is applied to predict United States tourism demand in Italy.

Findings

The results highlight the potential of a big data-driven foresight approach. Applying a transfer function model to internet search data yields timely forecasts of tourism flows. The two scenarios that emerged can be used in tourism stakeholders’ decision-making processes. In future, the methodological path could be applied to other tourism origin markets, to other internet search engines or to other socioeconomic and environmental contexts.

Originality/value

Firstly, the study raises awareness of foresight literacy in the tourism sector. Secondly, it complements the research on tourism demand forecasting by evaluating the performance of quantitative forecasting techniques on new data sources. Thirdly, it is the first paper that predicts United States arrivals in Italy. Finally, the findings provide immediately valuable information that tourism stakeholders could use to make decisions.

Citation

De Luca, G. and Rosciano, M. (2024), "Google Trends data and transfer function models to predict tourism demand in Italy", Journal of Tourism Futures, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/JTF-01-2023-0018

Publisher

Emerald Publishing Limited

Copyright © 2024, Giovanni De Luca and Monica Rosciano

License

Published in Journal of Tourism Futures. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode


1. Introduction

As a result of the COVID-19 epidemic, which has caused a worldwide health, social and economic emergency, travel and tourism are among the most adversely affected industries, with planes grounded, hotels shuttered and travel restrictions implemented in almost every nation on earth. The number of foreign visitors in 2021 compared to 2019 is drastically down: 73% from Africa, 60% from the Americas, 94% from Asia and the Pacific, 62% from Europe and 75% from the Middle East (Unwto, 2022). The epidemic has highlighted the weaknesses of the current capabilities in the tourism industry when it comes to dealing with the fallout from a crisis of this magnitude and reviving the industry.

Tourist destinations need to seek a new normal and find new balances. In adapting to periods of uncertainty, looking for alternative sources of financing, and repositioning and diversifying markets, products and services, it becomes essential not only to restart but also to do so in a more inclusive, innovative and sustainable way (Unwto, 2021). In this context, a notable contribution comes from the analysis of megatrends such as evolving visitor demand, sustainable tourism growth, enabling technologies and travel mobility. Tourism needs to adapt to an increasingly fast-moving, inter-connected and dynamic environment by following a strategic foresight approach (OECD, 2018). It is highly desirable to establish a new governance model between institutions, strengthening organizational, financial and human resources and promoting a data-driven decision process. Good policy and planning decisions are made with authoritative, reliable and accessible data (WTTC, 2022). Recent tourism data indicate that these transformations in the tourism sector must take place in a timely manner, not only to address negative shocks adequately but also to tackle the unsustainable over-tourism phenomena already detected in recent years in some locations such as Croatia, Montenegro and Italy (Knezevic et al., 2021). The foresight approach helps tourism to take proactive measures to reduce uncertainty, turbulence and ambiguity. Managerial interest in accurate forecasts of future events and changes in tourism demand is related to the extent to which actions can be implemented to influence demand, monitor fluctuations for planning or resource capacities and manage the (for example, socioeconomic) effects of demand. Hence, the success of tourism businesses largely depends on the ability to predict tourism demand accurately (Hopken et al., 2019) and in a timely manner.

Additionally, given that search queries are increasingly seen as a contemporaneous snapshot of collective consciousness due to the increased usage of the internet for news and information in recent years, what people search for is a good indicator of what they will do in the future (Goel et al., 2010). As a result, information from internet searches is recognized as a significant source of linked contemporaneous data (Jackman and Naitram, 2015).

Besides, emerging tech-savvy generations will become increasingly dominant consumers in the tourism market and recent studies also indicate that post-COVID travelers would have an even more digital profile and explore vacation alternatives on internet platforms. Digital solutions that deliver customized and real-time information will be more appealing to tourists (Knezevic et al., 2021).

The above discussion drives our motivation to answer the following research question: “How can future scenarios be obtained using real-time information provision?” The pandemic strengthens the need for more integrated, timely, comparable and granular data. Improving the evidence base, looking beyond traditional survey approaches and using alternative and complementary data sources to inform policy and business decisions is a key policy priority (OECD, 2022).

Therefore, this paper, following a big data-driven foresight approach (Saritas et al., 2022), predicts international tourism demand using a transfer function model on Google Trends data. Google Trends has become a very popular data source among researchers of a wide variety of fields over the last decade, despite some critical issues in quality (Cebriàn and Domenech, 2023).

The work focused on international tourism considering that although domestic travel has been a key factor in the recovery to date and remains essential (Aldao et al., 2022), the full recovery of the global economy will require the full return of international travel (WTTC, 2022).

In detail, it focuses on US tourism demand in Italy, a country in which the total contribution of travel and tourism is 13% of GDP and 15% of employment, and foreign visitor spending is 8% of total exports (WTTC, 2022). International tourism accounts for 49% of arrivals, and the United States of America is the main extra-European market. Americans represent 9% of international arrivals and 30% of extra-European arrivals (Istat, 2019). The US tourist expenditure represents 13% (Bank of Italy, 2020).

Additionally, a rise in interest in national tourism attractions was seen in Italy between January and April 2021, with the proportion of searches for these locations increasing from 55% to 79% (Knezevic et al., 2021). For the public and private operators of the Italian tourist destination, the availability of reliable and accurate forecasts of the US tourism source market is of crucial importance, also in light of recent surveys showing that the war in Ukraine could hurt travel sentiment toward Europe, in addition to COVID-19 health and safety concerns (European Travel Commission, 2022).

With regard to future tourism research, we believe that our study contributes to the theme “tourism and technology” (Sharma et al., 2021). Firstly, it answers the research question of how e-tourism can be used for better market forecasts and raises awareness of foresight literacy in the tourism sector. Secondly, it complements the research on tourism demand forecasting by evaluating the performance of quantitative forecasting techniques on new data sources. The pandemic has reinforced the urgency of having new tourism data measures and complementary data sources to react quickly under uncertainty (OECD, 2022).

Thirdly, it is the first paper (to the best of our knowledge) that predicts United States arrivals in Italy. Despite its significance for destination and attraction management, the ability of current approaches to forecast visitor arrivals at particular attractions precisely is still lacking (Volchek et al., 2019). Many works stop at the evaluation of the models, and few proceed to forecast tourism demand (Huarng and Yu, 2019; Li and Law, 2020; Antolini and Grassini, 2019). Finally, the findings of this study provide immediately valuable scenarios that private and public stakeholders in the tourism origin and destination markets could use to make decisions. The projected scenarios, derived with increased forecasting abilities, can help tourism managers choose which management methods to consider and implement (Clark et al., 2019).

Bergman et al. (2010) distinguish different typologies of future studies according to whether they make truth claims and/or indicate mechanisms: prediction, prognosis, science fiction and utopia. However, a claim that is said to be true may be more or less conclusive. A forecast that assumes many outcomes, such as scenarios, depending on which mechanisms are active is an example of this. Our work falls into this typology because we think that, in the context of economic forecasting, it makes more sense to present a range of potential outcomes than a single-point forecast. The paper is structured as follows: Section 2 provides a brief background; Section 3 presents the Google Trends data collection and the statistical analysis; finally, in Section 4 we discuss the results and draw some concluding remarks.

2. Background

Significant losses in service activity were brought on by COVID-19, particularly in the hospitality, food, entertainment and retail sectors, and not just in the nations that rely strongly on tourists (Kanchana and Shyama, 2021). The pandemic period highlighted that tourism needs to be better measured, including its economic, environmental and social performance, and that it needs accurate, reliable and timely predictions. Achieving all this is certainly not easy due to the lack of historical data, seasonal fluctuations, the influence of unexpected events like hurricanes and eclipses that affect visitation but are unrelated to the previous year’s visitation, the variety of input factors and the complexity of visitors’ travel decision-making process (Song et al., 2010; Clark et al., 2019). These results can certainly be achieved more easily by favoring the digital transformation of the tourism industry (Akhtar et al., 2021), which is considered information intensive (for instance, with the implementation of tourism intelligence systems), and by facilitating data access; both are fundamental elements of a big data-driven foresight approach. Foresight is defined as the process of looking towards the future to try to find out what could be. This process features prominently in future studies, where it is carried out by applying research methods oriented toward the future (Aldulaini et al., 2022).

As a result, demand modeling and prediction continue to receive a lot of attention from academics and industry professionals and are among the most important areas of tourism research (Hopken et al., 2019).

Several methods have been suggested to increase the forecasting precision of tourist demand: time-series analysis, econometric models, hybrid techniques, nonlinear modeling and artificial intelligence algorithms (Tian et al., 2021; Li and Law, 2020; Jun et al., 2018; Li et al., 2005). Among the quantitative methods applied to tourism demand forecasting, transfer function models (describing the relationship between an output variable $Y_t$ at time $t$ and $k$ input variables $X_{1t}, \dots, X_{kt}$) have also attracted the attention of scholars (Dinis et al., 2017; Gnanapragasam and Cooray, 2016; Kulendran and Witt, 2003). But traditional studies based on historical statistics have a certain hysteresis quality, which restricts the accuracy of prediction over the short term. Additionally, forecasting based on conventional historical data is frequently affected by unpredictable events like natural catastrophes, which lowers the accuracy of prediction (Feng et al., 2019). Furthermore, the official statistical systems rarely generate tourism data that can be used to drive policy decisions. The current tourist statistics directive suggests gathering monthly data on inbound and outbound travel, including regional (NUTS2) specificity. More detailed information (i.e. at provincial or municipal level) is published annually (monthly only in some countries) with a time delay that, in the more recent Italian official survey, ranges from 11 to 23 months (Emili et al., 2020). In order to obtain more precise predictions of visitor arrivals at the destination level, new data sources are being used increasingly frequently. The amount of information that individuals can access online has greatly expanded thanks to the internet’s quick development and the proliferation of connected devices like laptops and mobile phones, which has also caused a revolution in the travel and tourism sector.
The relationship between travelers and travel agencies has evolved as a result of new technology and online services. For instance, consumers may now compare thousands of hotels, arrange their entrance to a museum, or reserve a table at a restaurant using the internet. Tourism intelligence systems or destinations with dashboards, which provide real-time aggregate data on tourist arrivals, overnights, transport and spending, are important for smart and data-driven decision-making and are a vivid example of data utilization for both business and destinations. The tourist interactions with the internet, such as search engine data, web traffic, social media and mobile data, have generated structured and unstructured online data that can be captured, saved and analyzed and have gained popularity in predicting tourism demand (Li et al., 2021; Vu et al., 2018; Yang et al., 2015). Traces of access can be recorded when a tourist searches or makes a reservation online. Because travelers typically make plans in advance, aggregated internet behavioral data can be used to gauge demand for travel (Pan et al., 2012).

Tourism demand often has a natural link with new web-based data sources like search engine traffic, website traffic or customer reviews on online review sites. These huge data sources have become more widely available and, as a result, have also been used to forecast tourist demand (Hopken et al., 2019). The literature has recently concentrated on taking advantage of the useful information on tourist behavior that search query data provide. The number of travelers who use the internet to prepare for and book their business and leisure travels has dramatically increased over the past few years (Camacho and Pacce, 2018). Even for official statistics, big data are a vital source of information. International and national statistical offices are currently conducting pilot studies on various big data sources to see how well they would be able to provide or integrate official statistical data. Such data are primarily of importance because they can offer more time-saving and cost-effective information, especially at the local level (Antolini and Grassini, 2019; Naccarato et al., 2018). Online data provide specific advantages over traditional data (e.g. survey-based data), such as higher frequency (i.e. daily, weekly and monthly) and the capacity to reflect people’s behavior from different angles (Pan and Yang, 2017). By examining the themes that emerge from the online discourse and researching the dynamics of user interaction, big data from online social networks can be used to predict rapid changes in tourist preferences and popularity trends for destinations and local attractions (Franzetti Colladon et al., 2019). As a result, academics have added online data to traditional data, extracted business value from it and used it in predictive models (Andariesta and Wasesa, 2022). The ability to track and reliably predict visitor behavior has been transformed by the availability of real-time, high-volume and high-frequency data (Yang et al., 2015).
Search data for tourist information has been extensively exploited to improve the forecasting capability of the current models (Park et al., 2017). A range of methods to apply search queries to predict tourist arrivals at a destination-level has been proposed (Antolini and Grassini, 2019; Artola et al., 2015; Bangwayo-Skeete and Skeete, 2015; Choi and Varian, 2009; Dergiades et al., 2018; Hopken et al., 2019; Li et al., 2018; Onder, 2017; Gunter and Onder, 2016; Park et al., 2017; Yang et al., 2015). The majority of existing studies demonstrate that online search engine data can improve the forecasting performance of tourism demand (Li and Law, 2020). Since search query data reveals information about the interests, intentions and opinions of travelers, it is useful for predicting travel demand. Tourists utilize search engines to learn more about their vacation destinations, routes, must-see attractions and other travelers' reviews and opinions of those locations. To predict the number of visitors who will actually visit a location, data from search queries may be employed.

Since ICTs and mobile computing devices have developed, the internet has become the primary source of travel information (Claude, 2020). Search engines such as Google, which dominates the worldwide internet with over a billion users, and Baidu and Yandex, which represent the fastest-growing emerging markets (i.e. China and Russia), shape tourist perceptions of attraction image and encourage travel decisions. In this paper, Google’s data were analyzed, considering that most Americans have long used this search engine to plan travel (Pan et al., 2011). Google began to release information on its users’ search queries through the publicly available Google Insights interface, now rebranded as Google Trends.

Since it began operating in 2004, Google Trends has provided search frequency of search keywords, which displays the comparison between the volume of searches for a given keyword and all other keywords over a given period of time. It then further normalizes search frequency into the range [0, 100], preventing changes in the volume of searches caused by an increase in users. The software displays the frequency of a specific keyword being used as a search term for the entire number of searches across multiple nations and languages (Choi and Varian, 2009). As a result, recent research has remarkably shown that Google Trends data reflect important elements of visitors' keyword-based queries, which in turn opens up a wealth of options to research and predict travelers' planned behavior (Dinis et al., 2019; Hopken et al., 2019; Yang et al., 2015; Kim et al., 2016; Hopken et al., 2021).
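As a rough illustration of the normalization described above, the following Python sketch rescales a keyword's share of all searches so that its peak equals 100. The exact procedure used by Google is not given in the text, so the function name `trends_index` and the counts are hypothetical assumptions made only for illustration.

```python
import numpy as np

def trends_index(keyword_counts, total_counts):
    """Scale a keyword's share of all searches so that its peak equals 100.

    Assumption: the index is the relative share divided by its maximum;
    this mirrors the description in the text, not Google's actual code.
    """
    share = np.asarray(keyword_counts, dtype=float) / np.asarray(total_counts, dtype=float)
    return 100.0 * share / share.max()

# Hypothetical monthly counts: total search volume doubles between the first
# and second month, but the keyword's share of searches stays the same.
keyword = np.array([50, 100, 150, 300])
total = np.array([1000, 2000, 2000, 4000])
index = trends_index(keyword, total)
print(index.round(1))
```

Because the index is built from shares rather than raw counts, the first two months receive the same value even though the raw number of searches doubled, which is the property the normalization is meant to provide.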

3. Google Trends data collection and analysis

3.1 Preliminary analysis

Considering the aim of predicting international tourism demand in Italy using Google Trends (henceforth GT) data, we first identified potentially useful keywords. Following suggestions in the literature (Camacho and Pacce, 2018; Havranek and Zeynalov, 2018; Hopken et al., 2019; Bangwayo-Skeete and Skeete, 2015; Claude, 2020), the keywords Holiday in Italy, Visit Italy, Travel to Italy, Hotels to Italy and Flights to Italy were initially selected. These expressions were entered in GT to obtain, for each keyword, monthly search frequency data from January 2015 to December 2021 for searches made by Americans in the category Travel. We then computed the Pearson correlation coefficient between the GT keyword data and the target variable. Many different indicators have been used as the target variable in forecasting and modeling tourism demand; however, a commonly used measure is tourist arrivals. In this paper, the target variable is the monthly arrivals of Americans in Italy from January 2015 to December 2020, retrieved from the ISTAT website.

Correlation has been computed excluding the year 2020, to avoid a possible bias due to the COVID-19 pandemic; however, contrary to what one might have expected, the correlations did not change dramatically when 2020 was included.

The highest correlation coefficients have been obtained for GT keywords Visit Italy (henceforth Visit) and Hotels to Italy (henceforth Hotel).
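The keyword screening step can be sketched in Python as follows. The series below are synthetic stand-ins, not the ISTAT or GT data, and the placebo keyword `Noise` is a hypothetical addition used only to show how a weak candidate would be ranked last.

```python
import numpy as np

rng = np.random.default_rng(0)
months = 60  # stand-in for January 2015 to December 2019
season = np.sin(2 * np.pi * np.arange(months) / 12)

# Synthetic target variable and candidate keyword series (assumptions).
arrivals = 400_000 + 200_000 * season + rng.normal(0, 20_000, months)
candidates = {
    "Visit": 50 + 30 * season + rng.normal(0, 5, months),
    "Hotel": 50 + 30 * season + rng.normal(0, 3, months),
    "Noise": rng.normal(50, 10, months),
}

# Pearson correlation between each candidate and the target variable.
correlations = {
    name: float(np.corrcoef(series, arrivals)[0, 1])
    for name, series in candidates.items()
}
ranked = sorted(correlations, key=correlations.get, reverse=True)
print(ranked)
```

In practice the same loop would be run twice, once with and once without the 2020 observations, to check how sensitive the ranking is to the pandemic period.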

The choice between the two candidate GT keywords has been made according to an intuitive criterion: we have chosen the GT keyword whose temporal dynamics mimic as closely as possible those of the original series, using monthly data from 2015 to 2020. In practice, denoting by $X_{it}$ the $i$-th GT keyword at time $t$, we have compared the autocorrelation function (ACF) of $X_{it}$ with the ACF of the time series Arrivals, denoted by $Y_t$. The ACF of the original time series Arrivals is reported in Figure 1, while Figures 2 and 3 show the ACFs of the two candidates.

The comparison is based on the quantity $\Delta\rho_M$, the average distance between the first $M$ ACF values of the series $X_{it}$ and $Y_t$,

$$\Delta\rho_M = \frac{\sum_{m=1}^{M} \left| \rho_m^X - \rho_m^Y \right|}{M}$$
where
$$\rho_m^Z = \mathrm{Corr}(Z_t, Z_{t-m})$$
with
$$Z \in \{X, Y\}.$$

With $M \in \{13, 25, 37\}$, the lowest value of $\Delta\rho_M$ (see Table 1) is found for the GT keyword Hotel. The temporal dynamics of the time series Hotel are therefore the closest to those of the original series Arrivals.
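The average ACF distance used for this comparison can be sketched in Python as below. The helper names `acf` and `acf_distance` and the synthetic series are illustrative assumptions; the sample ACF uses the usual biased estimator, which may differ slightly from the software used for the published tables.

```python
import numpy as np

def acf(series, max_lag):
    """Sample autocorrelations at lags 1, ..., max_lag."""
    z = np.asarray(series, dtype=float)
    z = z - z.mean()
    denom = np.dot(z, z)
    return np.array([np.dot(z[m:], z[:-m]) / denom for m in range(1, max_lag + 1)])

def acf_distance(x, y, max_lag):
    """Mean absolute difference between the first max_lag ACF values of x and y."""
    return float(np.mean(np.abs(acf(x, max_lag) - acf(y, max_lag))))

# Synthetic monthly series (assumptions): a seasonal target, a keyword with
# similar seasonality and a keyword with no seasonal structure.
rng = np.random.default_rng(1)
t = np.arange(72)
arrivals = 400_000 + 200_000 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 20_000, 72)
seasonal_kw = 50 + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 3, 72)
flat_kw = rng.normal(50, 10, 72)

for max_lag in (13, 25, 37):
    print(max_lag,
          round(acf_distance(seasonal_kw, arrivals, max_lag), 3),
          round(acf_distance(flat_kw, arrivals, max_lag), 3))
```

Since autocorrelations are unaffected by shifting or rescaling a series, the criterion compares only the shape of the dynamics, which is what allows a 0 to 100 index to be compared with arrival counts.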

Figure 4 displays the time series of monthly arrivals up to December 2020. In Figures 5 and 6, the selected GT keyword is shown until 2020 and until 2021, respectively.

3.2 Transfer function

The use of GT data provides an insightful view of the phenomenon when original data are not available. However, practitioners and policymakers are not interested in forecasts of GT data; the prediction of actual data is the primary goal. Denoting by $Y_t$ the time series of interest (Arrivals) and by $X_t$ the observed GT keyword time series, we need to identify a transfer function (TF) from variable $X_t$ to variable $Y_t$,

$$\mathrm{TF}: X_t \rightarrow Y_t$$
A general formulation is given by
$$\alpha(L) Y_t = \beta(L) X_t + v_t$$
where
$$\alpha(L) = 1 - \alpha_1 L - \dots - \alpha_p L^p$$
$$\beta(L) = 1 - \beta_1 L - \dots - \beta_q L^q$$

Given the need to identify a transfer function without using the lagged values of $Y_t$, we impose $\alpha(L) = 1$, so we focus on the expression:

$$Y_t = k + \beta(L) X_t + v_t \tag{1}$$

We identify the order $q$ that maximizes the adjusted coefficient of determination, $\bar{R}^2$. The time period used in this step has to be covered by both series, so it is determined by the shorter one (Arrivals), that is, January 2015–December 2020.

Table 2 reports $\bar{R}^2$ for Model 1 for $q = 0, 1, \dots, 5$. The highest value is reached for $q = 4$ and the estimated parameters of the TF model are given in Table 3.
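A minimal sketch of the order selection step follows, assuming that the TF with $\alpha(L) = 1$ is estimated by ordinary least squares on the current and lagged values of the input. The estimation method is not stated in the text, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def lag_matrix(x, q):
    """Columns X_t, X_{t-1}, ..., X_{t-q} for t = q, ..., n-1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.column_stack([x[q - j : n - j] for j in range(q + 1)])

def fit_distributed_lag(y, x, q):
    """OLS fit of Y_t on a constant and X_t, ..., X_{t-q}; returns (coef, adjusted R^2)."""
    design = np.column_stack([np.ones(len(x) - q), lag_matrix(x, q)])
    target = np.asarray(y, dtype=float)[q:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    n_obs, n_par = design.shape
    r2 = 1 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
    adj_r2 = 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_par)
    return coef, adj_r2

# Synthetic data (assumption): the true relationship uses lags 0, 1 and 2.
rng = np.random.default_rng(2)
n = 72
x = rng.normal(50, 10, n)
y = 10 + 2.0 * x + rng.normal(0, 1, n)
y[1:] += 3.0 * x[:-1]
y[2:] += 1.5 * x[:-2]

scores = {q: fit_distributed_lag(y, x, q)[1] for q in range(6)}
best_q = max(scores, key=scores.get)
print(best_q, {q: round(s, 4) for q, s in scores.items()})
```

Note that each value of `q` drops the first `q` observations, so the candidate models are fitted on slightly different samples; whether the published comparison used a common sample is not stated in the text.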

The TF equation is then

$$Y_t = 429950.9 + 810.7 X_t + 2927.2 X_{t-1} + 4071.9 X_{t-2} + 1170.4 X_{t-3} + 3195.8 X_{t-4} + v_t \tag{2}$$

3.3 Statistical model for GT keyword time series

After estimating the TF, we have identified a model for GT keyword time series Hotel. The data now end at December 2021.

We have considered different models:

  • 1. The first model is an ARIMA model without any intervention. The model naturally adjusts to the innovations. The resulting model is an ARIMA$(0,1,0) \times (2,0,0)_{12}$ model,

$$(1 - \Phi_1 L^{12} - \Phi_2 L^{24}) \Delta X_t = \varepsilon_t$$
and the estimates are reported in Table 4.

Figure 7 reports the observed GT data and the fitted data from the estimated model.
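To make the dynamics of the first model easier to read, the lag polynomial can be expanded using $\Delta X_t = X_t - X_{t-1}$. The following derivation only rearranges the model stated above; the one-step-ahead forecast notation $\hat{X}_{t+1|t}$ is introduced here for illustration and is not taken from the paper.

```latex
\begin{align*}
(1 - \Phi_1 L^{12} - \Phi_2 L^{24})(X_t - X_{t-1}) &= \varepsilon_t \\
X_t &= X_{t-1} + \Phi_1 (X_{t-12} - X_{t-13}) + \Phi_2 (X_{t-24} - X_{t-25}) + \varepsilon_t \\
\hat{X}_{t+1|t} &= X_t + \Phi_1 (X_{t-11} - X_{t-12}) + \Phi_2 (X_{t-23} - X_{t-24})
\end{align*}
```

In words, the forecast for next month is the current value plus a weighted combination of the month-on-month changes observed one and two years earlier.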

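  • 2. The second model is an ARIMA model with an automatic procedure for the detection of possible transient level changes with a decay factor $|\delta| < 1$. The framework is now

$$X_t = X_t^* + \sum_{i=1}^{k} f_i(t)$$
where $X_t$ is the observed series (contaminated by the transient changes), $X_t^*$ is the series we would observe in the absence of the transient changes, and
$$f_i(t) = \begin{cases} 0 & \text{if } t < \tau_i \\ \omega_i & \text{if } t = \tau_i \\ \omega_i \delta_i & \text{if } t = \tau_i + 1 \\ \omega_i \delta_i^2 & \text{if } t = \tau_i + 2 \\ \vdots & \end{cases}$$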

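With $\delta_i \in \{0.70, 0.80, 0.90\}$, assuming $\delta_i = \delta$ for all $i$, the model with the lowest RMSE is given by the ARIMA$(1,0,0) \times (2,1,0)_{12}$ with two transient changes (M = 1) at the 63rd and 73rd times (i.e. March 2020 and January 2021) and $\delta = 0.9$,

$$X_t = X_t^* + f_1(t) + f_2(t)$$
where
$$(1 - \phi_1 L)(1 - \Phi_1 L^{12} - \Phi_2 L^{24}) \Delta_{12} X_t = \varepsilon_t$$
$$f_1(t) = \begin{cases} 0 & \text{if } t < 63 \\ 63.430 & \text{if } t = 63 \\ 0.90\,(63.430) & \text{if } t = 64 \\ 0.90^2\,(63.430) & \text{if } t = 65 \\ \vdots & \end{cases}$$
$$f_2(t) = \begin{cases} 0 & \text{if } t < 73 \\ 36.378 & \text{if } t = 73 \\ 0.90\,(36.378) & \text{if } t = 74 \\ 0.90^2\,(36.378) & \text{if } t = 75 \\ \vdots & \end{cases}$$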

The parameter estimates of the model are reported in Table 5.
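The transient change component can be evaluated with a few lines of Python. This is only a sketch of the decay pattern: the function name `transient_change` is hypothetical, and the magnitudes are copied as they are printed in the text, without any assumption about their sign.

```python
import numpy as np

def transient_change(t, tau, omega, delta):
    """Effect at times t of a transient change of size omega that starts at tau
    and decays geometrically with factor delta."""
    t = np.asarray(t)
    steps = np.maximum(t - tau, 0)  # months elapsed since the change
    return np.where(t < tau, 0.0, omega * delta ** steps)

# t = 1 corresponds to January 2015, so t = 63 is March 2020 and t = 73 is January 2021.
t = np.arange(1, 85)
f1 = transient_change(t, tau=63, omega=63.430, delta=0.90)
f2 = transient_change(t, tau=73, omega=36.378, delta=0.90)
total_effect = f1 + f2
print(f1[61:66].round(3))
```

With a decay factor of 0.9, about 28% of the initial effect still remains after twelve months (0.9 to the power of 12 is roughly 0.28), so the first change is still present in the series when the second one occurs.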

Figure 8 shows the original (observed) series and the series after removing the estimated transient changes. In Figure 9 the observed GT series is plotted together with the fitted series from the estimated model.

The higher flexibility of the model allowing for transient changes yields a better fit, as confirmed by the RMSE and MAE, which are 11.515 and 7.801 for the former model and 6.630 and 4.927 for the latter.
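For reference, the two fit measures quoted above can be computed as in the following sketch. The observed and fitted values are made-up numbers, used only to show the calculation.

```python
import numpy as np

def rmse(actual, fitted):
    """Root mean squared error."""
    err = np.asarray(actual, dtype=float) - np.asarray(fitted, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def mae(actual, fitted):
    """Mean absolute error."""
    err = np.asarray(actual, dtype=float) - np.asarray(fitted, dtype=float)
    return float(np.mean(np.abs(err)))

# Hypothetical observed and fitted GT values.
observed = [10, 20, 30, 40]
fitted = [12, 18, 33, 40]
print(round(rmse(observed, fitted), 4), mae(observed, fitted))
```

The RMSE is never smaller than the MAE, and the gap between them widens when a few errors are much larger than the rest, as happens around a shock that the model does not capture.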

3.4 Predictions

Using the TF estimated in subsection 3.2, we can predict the original series Arrivals using a two-step procedure. First, we predict the GT keyword Hotel for 2022. The point forecasts as well as the interval forecasts with a confidence level of 95% are shown in Figure 10.

Applying equation (2), we obtain for the year 2022 the mean prediction when the input is given by the point forecast (Figure 11). This is the expected neutral scenario in the absence of new relevant occurrences. However, the possible rebound of tourist activity after two difficult years suggests considering an optimistic scenario as well. In this case, the input of the TF is represented by the upper limit of the interval forecast at the 95% confidence level, that is, the 97.5th percentile of the forecast distribution. The higher-level predictions are presented in Figure 12.
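The second step of the procedure can be sketched as follows. The intercept and lag coefficients are copied as printed in equation (2); the GT history, point forecasts and upper limits are hypothetical numbers, and the function name `predict_arrivals` is an assumption made for illustration.

```python
import numpy as np

# Coefficients as printed in equation (2): constant, then X_t, X_{t-1}, ..., X_{t-4}.
INTERCEPT = 429950.9
BETAS = np.array([810.7, 2927.2, 4071.9, 1170.4, 3195.8])

def predict_arrivals(history, forecast):
    """Apply the lag-4 transfer function to each forecast month.

    history  : observed GT values, at least the last four months before the forecast
    forecast : predicted GT values (point forecasts or upper limits)
    """
    path = np.concatenate([np.asarray(history, dtype=float)[-4:],
                           np.asarray(forecast, dtype=float)])
    # path[i : i + 5] holds X_{t-4}, ..., X_t; reverse it to match BETAS.
    return np.array([INTERCEPT + BETAS @ path[i : i + 5][::-1]
                     for i in range(len(forecast))])

# Hypothetical GT values: last four observed months and a three-month forecast.
gt_history = [40.0, 42.0, 45.0, 47.0]
gt_point = [50.0, 55.0, 60.0]   # neutral scenario input
gt_upper = [60.0, 68.0, 76.0]   # optimistic scenario input (upper interval limit)

neutral = predict_arrivals(gt_history, gt_point)
optimistic = predict_arrivals(gt_history, gt_upper)
print(neutral.round(1), optimistic.round(1))
```

Because the first forecast months still depend on observed GT values through the lags, the two scenarios start close together and diverge as more of the lagged inputs come from the forecast path.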

4. Conclusions

This work adopts a big data-driven foresight approach to predict international visitor arrivals in Italy using a transfer function on US Google Trends search data. Our analysis, bearing in mind the emerging tech-savvy generations within the evolving visitor demand megatrend, demonstrates Google Trends’ excellent potential for forecasting tourism demand in a timely manner, which is particularly useful in a situation where events like the COVID-19 outbreak or the Russian–Ukrainian conflict have increased uncertainty in travel planning. Airline tickets are being booked increasingly close to departure, with a shorter interval between purchase and actual travel: for example, the average has fallen from 103 to 80 days for Americans and from 76 to 32 days for Japanese travelers. Keeping in mind that the theory of forecasting is based on the premise that current and past knowledge can be used to make predictions about the future (Petropoulos et al., 2022), the results obtained from the complementarity between theoretical developments and applications of the transfer function to Google Trends data are in line with the opportunity to implement a foresight approach based on new instruments in the tourism industry.

A digital transformation based on digital infrastructure, data strategy and augmented/virtual reality would allow the tourism industry to benefit from timely forecasts for smarter and better decision-making.

In this paper, two different models using monthly data have been evaluated. The first model is an ARIMA$(0,1,0) \times (2,0,0)_{12}$ model without any intervention. The second model allows transient level changes (in particular, a transient level change is expected in March 2020). An automatic procedure for the detection of possible transient level changes has been applied. Two scenarios, characterized by different specific conditions, emerged from the analysis. The first scenario could be defined as prudent: the effects of perceived potential risk on travel and tourism decision-making after the COVID-19 crisis are still present (Matiza, 2022) and forecasts are still struggling to reach pre-pandemic levels. The second could be defined as optimistic: the desire to return to travel means that the average forecasts are in fact exceeded by reality. The second scenario is reflected both in the data for the first half of 2022, which show a return of tourist flows to Italy, including international ones and, among these, above all Americans (also thanks to a favorable euro/dollar exchange rate), and in the greater confidence in the future on the part of tour operators. While the prudent scenario is compatible with niche phenomena such as zero-kilometer and slow tourism (Monaco, 2021), with a resilient approach to adaptation and transformation (Kwok and Koh, 2022) and with the promotion of domestic tourism to fill the gaps left by international tourism, the optimistic one requires an evaluation of possible over-tourism phenomena, of safety in tourism (Monaco, 2021) and, in general, of tourism well-being, above all from a sustainability perspective (Larry, 2020).

While we are aware that factors such as subjectivity in the choice of keywords, Google Trends data quality and the limitation of transfer functions to linear systems (Lai, 1979) could have influenced the results obtained, we believe that the path identified can make a significant contribution to taking proactive measures to reduce uncertainty and to preparing effectively for megatrends that will significantly impact tourism. Looking ahead, it could be interesting to observe the scenarios generated by applying the methodology to other origin markets of international tourism in Italy or to data collected from other search engines. Furthermore, external variables could be included in the analysis, considering a transfer function in which some external variables are added to the Google Trends values. In addition, obtaining forecasts through the use of transfer function models on Google Trends data could be a valid methodological tool for other socioeconomic or environmental phenomena as well. For example, stakeholders in newly emerged forms of tourism (i.e. wellness, health, volunteering and bleisure) could benefit from the forecasts obtained with transfer function models from Google searches. Moreover, it could be useful to mitigate the vulnerability of the tourism sector due to climate change (Leal Filho, 2022).

In conclusion, the availability of a plurality of scenarios allows stakeholders to evaluate more coherently the implications of policy choices, which can range from investing to restore consumer confidence, seizing the marketing opportunities offered by virtual reality (Mohanty et al., 2020; Monaco and Sacchi, 2023) and improving digital skills and infrastructure (Knezevic et al., 2021) so as to produce quick responses and effective decision-making in future crises, to using targeted and personalized marketing channels and improving quality.

Figures

ACF of the monthly series Arrivals, 2015–2020

Figure 1

ACF of the monthly series Arrivals, 2015–2020

Figure 2

ACF of the monthly Google Trends keyword series Visit, 2015–2021

Figure 3

ACF of the monthly Google Trends keyword series Hotel, 2015–2021

Figure 4

Time plot of the monthly series Arrivals, 2015–2020

Figure 5

Time plot of the monthly Google Trends keyword series Hotel, 2015–2020

Figure 6

Time plot of the monthly Google Trends keyword series Hotel, 2015–2021

Figure 7

Time plot of the monthly Google Trends keyword series Hotel (black) and fitted values from the ARIMA(0,1,0) × (2,0,0)12 model (red), 2015–2021

Figure 8

Time plot of the observed monthly series Arrivals Xt (black) and the series cleaned of the estimated transient changes Xt∗ (red), 2015–2021

Figure 9

Time plot of the monthly Google Trends keyword series Hotel (black) and fitted values from the ARIMA(1,0,0) × (2,1,0)12 model with two transient

Figure 10

Predictions of Google Trends for 2022, real data in black, point forecast in blue and interval forecast in gray

Figure 11

Predictions of arrivals for 2022 based on the mean of the predictor: real data in black and predictions in red

Figure 12

Optimistic predictions of arrivals for 2022 based on the 90th percentile of the predictor: real data in black and predictions in red

ρM for different values of M

| GT keyword | M = 13 | M = 25 | M = 37 |
|---|---|---|---|
| Visit | 0.269 | 0.235 | 0.209 |
| Hotel | 0.176 | 0.152 | 0.137 |

Source(s): Authors

R̄² for the TF model for q = 0, 1, …, 5

| q | R̄² |
|---|---|
| 1 | 0.4764 |
| 2 | 0.6115 |
| 3 | 0.6538 |
| 4 | 0.6763 |
| 5 | 0.6741 |

Source(s): Authors
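The choice of lag order can be read off the tabulated values directly. The following is a minimal sketch, assuming that the selection rule is simply "largest R̄²" and that R̄² denotes the usual adjusted R²; the names `R2_BAR_BY_Q`, `best_lag_order` and `adjusted_r2` are illustrative and do not come from the paper.

```python
# Values of R-bar-squared by lag order q, copied from the table above.
R2_BAR_BY_Q = {1: 0.4764, 2: 0.6115, 3: 0.6538, 4: 0.6763, 5: 0.6741}


def best_lag_order(r2_bar_by_q):
    """Return the lag order q with the largest R-bar-squared."""
    return max(r2_bar_by_q, key=r2_bar_by_q.get)


def adjusted_r2(r2, n_obs, n_regressors):
    """Textbook adjusted R^2 for a regression with an intercept.

    ASSUMPTION: this is the definition behind R-bar-squared in the table.
    """
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


print(best_lag_order(R2_BAR_BY_Q))
```

Under these assumptions the rule picks q = 4, which matches the lag order used for the parameter estimates reported for the TF model.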

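Parameter estimates of the TF model with q = 4

| Parameter | Estimate | Std. Error | t-value | p-value |
|---|---|---|---|---|
| κ | −429950.9 | 73086.3 | −5.883 | 0.000 |
| β0 | 810.7 | 1396.2 | 0.581 | 0.564 |
| β1 | 2927.2 | 1956.0 | 1.497 | 0.140 |
| β2 | 4071.9 | 1981.7 | 2.055 | 0.044 |
| β3 | 1170.4 | 1937.7 | 0.604 | 0.548 |
| β4 | 3195.8 | 1428.1 | 2.238 | 0.029 |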

Source(s): Authors
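To illustrate how estimates of this kind turn Google Trends values into an arrivals prediction, the sketch below applies the tabulated coefficients. It assumes that the TF model is a finite distributed lag of the form X_t = κ + β0·G_t + β1·G_{t−1} + … + β4·G_{t−4}, where G is the Google Trends index; that functional form, the function name `predict_arrivals` and the example input values are assumptions made for illustration, not material taken from the paper.

```python
# Point estimates copied from the table above (TF model with q = 4).
KAPPA = -429950.9
BETAS = [810.7, 2927.2, 4071.9, 1170.4, 3195.8]  # beta_0 ... beta_4


def predict_arrivals(gt_values, kappa=KAPPA, betas=BETAS):
    """Predict arrivals at time t from the most recent Google Trends values.

    ASSUMPTION: the model is X_t = kappa + sum_j beta_j * G_{t-j}.
    gt_values is ordered oldest to newest and must hold at least
    len(betas) observations; its last element is G_t.
    """
    q = len(betas) - 1
    if len(gt_values) < q + 1:
        raise ValueError(f"need at least {q + 1} Google Trends values")
    # Reverse the tail so that lagged[j] corresponds to G_{t-j}.
    lagged = list(reversed(gt_values[-(q + 1):]))
    return kappa + sum(b * g for b, g in zip(betas, lagged))


# Hypothetical index values, NOT data from the paper.
example_gt = [50.0, 52.0, 55.0, 60.0, 58.0]
print(round(predict_arrivals(example_gt), 1))
```

Feeding such a function the mean forecast path of the predictor, and then its 90th percentile path, would give a baseline and an optimistic scenario of the kind shown in Figures 11 and 12.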

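Estimate of the ARIMA(0,1,0) × (2,0,0)12 model

| Parameter | Estimate | Std. Error | t-stat | p-value |
|---|---|---|---|---|
| Φ1 | 0.269 | 0.115 | 2.339 | 0.022 |
| Φ2 | 0.348 | 0.142 | 2.451 | 0.016 |
| σ² | 137.50 | | | |
| log-lik | −323.76 | | | |
| AIC | 653.53 | | | |
| RMSE | 11.515 | | | |
| MAE | 7.801 | | | |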

Source(s): Authors

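Estimate of the ARIMA(1,0,0) × (2,1,0)12 model including two transient changes at the 63rd and 73rd time points (that is, March 2020 and January 2021)

| Parameter | Estimate | Std. Error | t-stat | p-value |
|---|---|---|---|---|
| ϕ1 | 0.741 | 0.096 | 7.719 | 0.000 |
| Φ1 | −0.491 | 0.142 | −3.458 | 0.001 |
| Φ2 | −0.386 | 0.127 | −3.039 | 0.003 |
| ω1 | −63.430 | 7.179 | −8.835 | 0.000 |
| ω2 | −36.37 | 7.429 | −4.897 | 0.000 |
| σ² | 55.11 | | | |
| log-lik | −247.03 | | | |
| AIC | 506.06 | | | |
| RMSE | 6.630 | | | |
| MAE | 4.927 | | | |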

Source(s): Authors
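As a consistency check on the two ARIMA tables, the reported AIC values can be recomputed from the reported log-likelihoods. This is a sketch under two assumptions: the standard definition AIC = −2·log-lik + 2k, and a parameter count k that includes the listed coefficients plus the innovation variance σ² (3 for the first model, 6 for the second). The function name `aic` and the dictionary `MODELS` are illustrative.

```python
def aic(log_lik, n_params):
    """Akaike information criterion: -2 * log-likelihood + 2 * k."""
    return -2.0 * log_lik + 2.0 * n_params


# (reported log-likelihood, ASSUMED parameter count k, reported AIC)
MODELS = {
    "ARIMA(0,1,0)x(2,0,0)12": (-323.76, 3, 653.53),
    "ARIMA(1,0,0)x(2,1,0)12 with two transient changes": (-247.03, 6, 506.06),
}

for name, (log_lik, k, reported) in MODELS.items():
    computed = aic(log_lik, k)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

Both recomputed values agree with the tables up to rounding of the log-likelihood. Because the two models apply different differencing, their likelihoods are based on different effective samples, so the comparison of the two AIC values is best read as indicative.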

References

Akhtar, N., Khan, N., Mahroof Khan, M., Ashraf, S., Hashmi, M.S., Khan, M.M. and Hishan, S.S. (2021), “Post-COVID 19 tourism: will digital tourism replace mass tourism?”, Sustainability, Vol. 13 No. 10, p. 5352, doi: 10.3390/su13105352.

Aldao, C., Blasco, D. and Poch Espallargas, M. (2022), “Lessons from COVID-19 for the future: destination crisis management, tourist behaviour and tourism industry trends”, Journal of Tourism Futures, doi: 10.1108/jtf-02-2022-0059.

Aldulaimi, S.H., Ali, B.J., Yas, Q.M., Abdeldayem, M.M., Aswad, A.R. and Hammad, A.M. (2022), “Application of big data analysis to foresight the future: a review of opportunities, approaches, and new research directions”, ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS), IEEE.

Andariesta, D.T. and Wasesa, M. (2022), “Machine learning models for predicting international tourist arrivals in Indonesia during the COVID-19 pandemic: a multisource Internet data approach”, Journal of Tourism Futures, doi: 10.1108/jtf-10-2021-0239.

Antolini, F. and Grassini, L. (2019), “Foreign arrivals nowcasting in Italy with Google Trends data”, Quality and Quantity, Vol. 53 No. 5, pp. 2385-2401, doi: 10.1007/s11135-018-0748-z.

Artola, C., Pinto, F. and Pedraza, P.D. (2015), “Can internet searches forecast tourism inflows?”, International Journal of Manpower, Vol. 36 No. 1, pp. 103-116, doi: 10.1108/ijm-12-2014-0259.

Bangwayo-Skeete, P.F. and Skeete, R.W. (2015), “Can Google data improve the forecasting performance of tourist arrivals? Mixed-data sampling approach”, Tourism Management, Vol. 46, pp. 454-464, doi: 10.1016/j.tourman.2014.07.014.

Bank of Italy (2020), “Survey on international tourism”, Banca d’Italia.

Bergman, A., Karlsson, J.C. and Axelsson, J. (2010), “Truth claims and explanatory claims—an ontological typology of futures studies”, Futures, Vol. 42 No. 8, pp. 857-865, doi: 10.1016/j.futures.2010.02.003.

Camacho, M. and Pacce, M.J. (2018), “Forecasting travellers in Spain with Google’s search volume indices”, Tourism Economics, Vol. 24 No. 4, pp. 434-448, doi: 10.1177/1354816617737227.

Cebrián, E. and Domenech, J. (2023), “Is Google Trends a quality data source?”, Applied Economics Letters, Vol. 30 No. 6, pp. 811-815, doi: 10.1080/13504851.2021.2023088.

Choi, H. and Varian, H. (2009), “Predicting the present with google trends”, available at: http://cran.r-project.org

Clark, M., Wilkins, E.J., Dagan, D.T., Powell, R., Sharp, R.L. and Hillis, V. (2019), “Bringing forecasting into the future: using Google to predict visitation in U.S. national parks”, Journal of Environmental Management, Vol. 243, pp. 88-94, doi: 10.1016/j.jenvman.2019.05.006.

Claude, U. (2020), “Predicting tourism demands by google trends: a hidden Markov models based study”, Journal of System and Management Sciences, Vol. 10 No. 1, pp. 106-120.

Dergiades, T., Mavragani, E. and Pan, B. (2018), “Google Trends and tourists arrivals: emerging biases and proposed corrections”, Tourism Management, Vol. 66, pp. 108-120, doi: 10.1016/j.tourman.2017.10.014.

Dinis, G., Costa, C. and Pacheco, O. (2017), “Forecasting British tourist inflows to Portugal using google trends data”, in Katsoni, V., Upadhya, A. and Stratigea, A. (Eds), Tourism, Culture and Heritage in a Smart Economy, Springer Proceedings in Business and Economics, Springer, Cham, pp. 483-496.

Dinis, G., Breda, Z., Costa, C. and Pacheco, O. (2019), “Google Trends in tourism and hospitality research: a systematic literature review”, Journal of Hospitality and Tourism Technology, Vol. 10 No. 4, pp. 747-763, doi: 10.1108/jhtt-08-2018-0086.

Emili, S., Figini, P. and Guizzardi, A. (2020), “Modelling international monthly tourism demand at the micro destination level with climate indicators and web-traffic data”, Tourism Economics, Vol. 26 No. 7, pp. 1129-1151, doi: 10.1177/1354816619867804.

European Travel Commission (2022), “Market intelligence report”, available at: www.etc-corporate.org

Feng, Y., Li, G., Sun, X. and Li, J. (2019), “Forecasting the number of inbound tourists with Google Trends”, Procedia Computer Science, Vol. 162, pp. 628-633, doi: 10.1016/j.procs.2019.12.032.

Fronzetti Colladon, A., Guardabascio, B. and Innarella, R. (2019), “Using social network and semantic analysis to analyze online travel forums and forecast tourism demand”, Decision Support Systems, Vol. 123, 113075, doi: 10.1016/j.dss.2019.113075.

Gnanapragasam, S.R. and Cooray, T.M.J.A. (2016), “Forecasting post-war tourist arrivals to Sri Lanka using dynamic transfer function method”, International Journal of Multidisciplinary Studies, Vol. 3 No. 2, p. 111, doi: 10.4038/ijms.v3i2.13.

Goel, S., Hofman, J.M., Lahaie, S., Pennock, D.M. and Watts, D.J. (2010), “Predicting consumer behavior with web search”, Proceedings of the National Academy of Sciences of the United States of America, Vol. 107 No. 41, pp. 17486-17490, doi: 10.1073/pnas.1005962107.

Gunter, U. and Onder, I. (2016), “Forecasting city arrivals with google analytics”, Annals of Tourism Research, Vol. 61, pp. 199-212, doi: 10.1016/j.annals.2016.10.007.

Havranek, T. and Zeynalov, A. (2018), Forecasting Tourist Arrivals with Google Trends and Mixed Frequency Data, ZBW, Leibniz Information Centre for Economics, Kiel, Hamburg.

Hopken, W., Eberle, T., Fuchs, M. and Lexhagen, M. (2019), “Google Trends data for analyzing tourists’ online search behavior and improving demand forecasting: the case of Åre, Sweden”, Information Technology and Tourism, Vol. 21 No. 1, pp. 45-62, doi: 10.1007/s40558-018-0129-4.

Hopken, W., Eberle, T., Fuchs, M. and Lexhagen, M. (2021), “Improving tourist arrival prediction: a big data and artificial neural network approach”, Journal of Travel Research, Vol. 60 No. 5, doi: 10.1177/0047287520921244.

Huarng, K.H. and Yu, T.H.K. (2019), “Application of Google trends to forecast tourism demand”, Journal of Internet Technology, Vol. 20 No. 4, pp. 1273-1280.

Istat (2019), Occupancy in Collective Accommodation Establishments, Italian National Institute of Statistics.

Jackman, M. and Naitram, S. (2015), “Research note: nowcasting tourist arrivals in Barbados-just Google it”, Tourism Economics, Vol. 21 No. 6, pp. 1309-1313, doi: 10.5367/te.2014.0402.

Jun, W., Yuyan, L., Lingyu, T. and Peng, G. (2018), “Modeling a combined forecast algorithm based on sequence patterns and near characteristics: an application for tourism demand forecasting”, Chaos, Solitons and Fractals, Vol. 108, pp. 136-147, doi: 10.1016/j.chaos.2018.01.028.

Kanchana, W. and Shyama, R. (2021), “The role of disaggregated search data in improving tourism forecasts: evidence from Sri Lanka”, Current Issues in Tourism, Vol. 24 No. 19, pp. 2740-2754, doi: 10.1080/13683500.2020.1849049.

Kim, E., Lee, J.A., Sung, Y. and Choi, S.M. (2016), “Predicting selfie-posting behavior on social networking sites: an extension of theory of planned behavior”, Computers in Human Behavior, Vol. 62, pp. 116-123, doi: 10.1016/j.chb.2016.03.078.

Knezevic, C., Velbar, L., Rodrigues, M., Teoh, T., Antonucci, B., Cutrufo, N. and Marongiu, L. (2021), Relaunching Transport and Tourism in the EU after COVID-19, Part IV: Tourism Sector, Policy Department for Structural and Cohesion Policies, Directorate-General for Internal Policies, European Parliament.

Kulendran, N. and Witt, S.F. (2003), “Leading indicator tourism forecasts”, Tourism Management, Vol. 24 No. 5, pp. 503-510, doi: 10.1016/s0261-5177(03)00010-4.

Kwok, A.O. and Koh, S.G. (2022), “COVID-19 implications for tourism: shifts, trends, challenges, and opportunities”, in COVID-19 and the Evolving Business Environment in Asia: the Hidden Impact on the Economy, Business and Society, pp. 27-51.

Lai, P.W. (1979), “Transfer function modelling: relationship between time series variables”, Concepts and Techniques in Modern Geography (CATMOG). London School of Economics, No. 22.

Larry, D. (2020), “Tourism development and sustainable well-being: a beyond GDP perspective”, Journal of Sustainable Tourism, Vol. 31 No. 10, pp. 2399-2416, doi: 10.1080/09669582.2020.1825457.

Leal Filho, W. (2022), “Will climate change disrupt the tourism sector?”, International Journal of Climate Change Strategies and Management, Vol. 14 No. 2, pp. 212-217, doi: 10.1108/IJCCSM-08-2021-0088.

Li, X. and Law, R. (2020), “Forecasting tourism demand with decomposed search cycles”, Journal of Travel Research, Vol. 59 No. 1, pp. 52-68, doi: 10.1177/0047287518824158.

Li, X., Law, R., Xie, G. and Wang, S. (2021), “Review of tourism forecasting research with internet data”, Tourism Management, Vol. 83, doi: 10.1016/j.tourman.2020.104245.

Li, G., Song, H. and Witt, S.F. (2005), “Recent developments in econometric modeling and forecasting”, Journal of Travel Research, Vol. 44 No. 1, pp. 82-99, doi: 10.1177/0047287505276594.

Li, J., Xu, L., Tang, L., Wang, S. and Li, L. (2018), “Big data in tourism research: a literature review”, Tourism Management, Vol. 68, pp. 301-323, doi: 10.1016/j.tourman.2018.03.009.

Matiza, T. (2022), “Post-COVID-19 crisis travel behaviour: towards mitigating the effects of perceived risk”, Journal of Tourism Futures, Vol. 8 No. 1, pp. 99-108, doi: 10.1108/JTF-04-2020-0063.

Mohanty, P., Hassan, A. and Ekis, E. (2020), “Augmented reality for relaunching tourism post-COVID-19: socially distant, virtually connected”, Worldwide Hospitality and Tourism Themes, Vol. 12 No. 6, pp. 753-760, doi: 10.1108/whatt-07-2020-0073.

Monaco, S. (2021), Tourism, Safety and COVID-19: Security, Digitization and Tourist Behaviour, Routledge, New York.

Monaco, S. and Sacchi, G. (2023), “Travelling the metaverse: potential benefits and main challenges for tourism sectors and research applications”, Sustainability, Vol. 15 No. 4, p. 3348, doi: 10.3390/su15043348.

Naccarato, A., Falorsi, S., Loriga, S. and Pierini, A. (2018), “Combining official and Google Trends data to forecast the Italian youth unemployment rate”, Technological Forecasting and Social Change, Vol. 130, pp. 114-122, doi: 10.1016/j.techfore.2017.11.022.

OECD (2018), “Analysing megatrends to better shape the future of tourism”, OECD Tourism Papers (2). OECD Publishing Paris, doi: 10.1787/d465eb68-en.

OECD (2022), OECD Tourism Trends and Policies 2022, OECD Publishing Paris, doi: 10.1787/a8dd3019-en.

Onder, I. (2017), “Forecasting tourism demand with Google trends: accuracy comparison of countries versus cities”, International Journal of Tourism Research, Vol. 19 No. 6, pp. 648-660, doi: 10.1002/jtr.2137.

Pan, B. and Yang, Y. (2017), “Forecasting destination weekly hotel occupancy with big data”, Journal of Travel Research, Vol. 56 No. 7, pp. 957-970, doi: 10.1177/0047287516669050.

Pan, B., Xiang, Z., Law, R. and Fesenmaier, D.R. (2011), “The dynamics of search engine marketing for tourist destinations”, Journal of Travel Research, Vol. 50 No. 4, pp. 365-377, doi: 10.1177/0047287510369558.

Pan, B., Wu, D.C. and Song, H. (2012), “Forecasting hotel room demand using search engine data”, Journal of Hospitality and Tourism Technology, Vol. 3 No. 3, pp. 196-210, doi: 10.1108/17579881211264486.

Park, S., Lee, J. and Song, W. (2017), “Short-term forecasting of Japanese tourist inflow to South Korea using Google Trends data”, Journal of Travel and Tourism Marketing, Vol. 34 No. 3, pp. 357-368, doi: 10.1080/10548408.2016.1170651.

Petropoulos, F., Apiletti, D., Assimakopoulos, V., Zied Babai, M., Barrow, D.K., and Ziel, F. (2022), “Forecasting: theory and practice”, International Journal of Forecasting, Vol. 38 No. 3, pp. 705-871, ISSN 0169-2070, doi: 10.1016/j.ijforecast.2021.11.001.

Saritas, O., Burmaoglu, S. and Ozdemir, D. (2022), “The evolution of Foresight: what evidence is there in scientific publications?”, Futures, Vol. 137, 102916, doi: 10.1016/j.futures.2022.102916.

Sharma, G.D., Thomas, A. and Paul, J. (2021), “Reviving tourism industry post-COVID-19: a resilience-based framework”, Tourism Management Perspectives, Vol. 37, 100786, doi: 10.1016/j.tmp.2020.100786.

Song, H., Li, G. and Witt, S.F. (2010), “Tourism demand modelling and forecasting: How should demand be measured?”, Tourism Economics, Vol. 16 No. 1, doi: 10.5367/00000001079087221.

Tian, F., Yang, Y., Mao, E. and Tang, W. (2021), “Forecasting daily attraction demand using big data from search engines and social media”, International Journal of Contemporary Hospitality Management, Vol. 33 No. 6, pp. 1950-1976, doi: 10.1108/ijchm-06-2020-0631.

Volchek, K., Liu, A., Song, H. and Buhalis, D. (2019), “Forecasting tourist arrivals at attractions: search engine empowered methodologies”, Tourism Economics, Vol. 25 No. 3, pp. 425-447, doi: 10.1177/1354816618811558.

Vu, H.Q., Li, G., Law, R. and Zhang, Y. (2018), “Tourist activity analysis by leveraging mobile social media data”, Journal of Travel Research, Vol. 57 No. 7, pp. 883-898, doi: 10.1177/0047287517722232.

World Tourism Organization (2021), “A year in review”, UN Tourism, available at: https://www.unwto.org/2021-a-year-in-review

World Tourism Organization (2022), “UNWTO Global Tourism Dashboard”.

World Travel and Tourism Council (2021), Economic Impact Reports.

World Travel and Tourism Council (2022), Economic Impact 2022, Global Trends, available at: https://wttc.org/Portals/0/Documents/Reports/2022/EIR2022-Global%20Trends.pdf

Yang, X., Pan, B., Evans, J.A. and Lv, B. (2015), “Forecasting Chinese tourist volume with search engine data”, Tourism Management, Vol. 46, pp. 386-397, doi: 10.1016/j.tourman.2014.07.019.

Corresponding author

Monica Rosciano is the corresponding author and can be contacted at: monica.rosciano@uniparthenope.it

About the authors

Giovanni De Luca is Professor of Economic Statistics at the Department of Management and Quantitative Studies, University of Napoli “Parthenope”. He teaches time-series analysis. His research interests include time-series analysis, forecasting and multivariate analysis.

Monica Rosciano is Assistant Professor of Economic Statistics at the Department of Management and Quantitative Studies, University of Napoli “Parthenope”. She teaches statistical models for data analysis. Her research concerns the issues of financial well-being, tourism competitiveness and sustainability.
