PICTUREE—Aedes: A Web Application for Dengue Data Visualization and Case Prediction

Dengue fever remains a significant public health concern in many tropical and subtropical countries, and there is still a need for a system that can effectively combine global risk assessment with timely incidence forecasting. This research describes an integrated application called PICTUREE—Aedes, which can collect and analyze dengue-related data, display simulation results, and forecast outbreak incidence. PICTUREE—Aedes automatically updates global temperature and precipitation data and contains historical records of dengue incidence (1960–2012) and Aedes mosquito occurrences (1960–2014) in its database. The application utilizes a mosquito population model to estimate mosquito abundance, dengue reproduction number, and dengue risk. To predict future dengue outbreak incidence, PICTUREE—Aedes applies various forecasting techniques, including the ensemble Kalman filter, recurrent neural network, particle filter, and super ensemble forecast, which are all based on user-entered case data. The PICTUREE—Aedes’ risk estimation identifies favorable conditions for potential dengue outbreaks, and its forecasting accuracy is validated by available outbreak data from Cambodia.


Introduction
The PICTUREE-Aedes web application is an innovative system that integrates tools for displaying dengue-related data, visualizing simulated results, and forecasting future cases for an ongoing outbreak. This application is a product of the Predicting Insect Contact and Transmission Using histoRical Entomological and Environmental data (PICTUREE) project, which was initiated in 2019.
Dengue fever is an Aedes mosquito-borne viral infection that threatens the health of over 3 billion people in numerous tropical and subtropical regions [1]. Aedes aegypti mosquito is the primary carrier of dengue viruses, with the ability to bite multiple individuals in a brief period, and its immature phase can be found in water containers near human dwellings [2]. The proliferation, biting rates, infective rate, and survivability rate of Aedes mosquitoes are substantially affected by climatic conditions such as rainfall, temperature, and relative humidity, which, in turn, have a significant impact on the spread of the dengue virus [3][4][5]. The breeding time and maturity period of Aedes mosquitoes are shortened due to global warming, resulting in the accelerated growth of the vector population [6]. Furthermore, increasing human mobility and high population density have facilitated the geographical spread of dengue viruses over long distances [7]. Therefore, an integrated Table 1 presents a comprehensive list of the data sources. The daily global temperature and precipitation data are obtained from the National Oceanic and Atmospheric Administration (NOAA) weather stations [24]. The ERA5 hourly data provide hourly estimates of atmospheric variables, such as temperature, pressure, and humidity, which are used for simulation purposes [25]. The terrestrial ecoregions data are utilized to select areas with similar climate conditions [26][27][28]. The Aedes mosquito occurrence dataset and dengue occurrence dataset are geo-positioned occurrence records spanning from 1960 to 2012 [29,30]. Country-wise Google trend data for the keywords "dengue" and "fever" are also collected. The aforementioned data are either directly displayed on the map or are used for the simulations and forecasts [31].

Simulation Tools
The PICTUREE-Aedes simulations consist of two primary components: (1) the estimations of dengue risk-related measures and (2) the forecasts of case number for an ongoing outbreak. The tools graphically represent current and future risk of dengue transmission for users based on the life history of the mosquito vector.

Estimating Dengue Risk-Related Measures
We developed a dengue transmission risk model by training a neural network classifier over historical dengue outbreak data. The model uses the estimated vector population and the Dengue transmission reproductive number as predictive variables. In order to estimate the vector population and reproductive number, we developed a mechanistic model to simulate the Ae. aegypti population. The mechanistic model incorporates the mosquito's temperature-dependent parameters in a system of integral equations to calculate the mosquito population at various stages of development, including immature and adult stages. A diagram of the mechanistic model is shown in Figure 1. The set of integral equations precisely considers the temperature-dependent development time required for mosquitoes to transform from an egg into adult form. Given that the development period tends to be longer in seasons with cooler temperatures, such precise calculation is especially crucial in estimating the mosquito population in areas that experience significant temperature fluctuations throughout the year.

Vector Population
To accurately simulate the mosquito population fluctuations, the model accounts for the time spent in each life history stage during the aquatic or immature portion of the mosquito life cycle, as shown in Figure 1. The development rates of these stages are temperature-dependent, with warmer temperatures resulting in faster development. The temperature-dependent rates were determined from experimental data [5,8,10]. In addition to temperature, the effect of precipitation on the mosquito population is increased through the carrying capacity of the location: more rainfall equates to greater availability of habitats or more containers filled with water. Alternatively, less rainfall in some urban areas results in more irrigation or water storage, leading to an increase in filled containers, but this is location dependent. The equations that describe the life cycle of the mosquitoes are non-Markovian, meaning that transition times to new stages in the cycle are not exponential random variables. The transition to a new stage happens when the development of a current stage is complete. Moreover, the development rates are strongly temperature-dependent, and temperature changes throughout the year. Therefore, we use integral equations to describe the life cycle of the mosquitoes accurately.
In order to estimate the impact of precipitation on carrying capacity, the life cycle equations are extended to account for virus transmission and to subsequently fit the model to various dengue outbreak data from around the world using particle filter smoothing. This enables us to indirectly estimate the carrying capacity and its variation with precipitation. For this process, we used 67 dengue epidemic curves from Brazil, Mexico [32], Malaysia [33], Pakistan [34], Philippines [35], Singapore [36], Taiwan [16], Sri Lanka [37], Costa Rica [38] and Vietnam [39]. Particle filtering is an estimation technique known as sequential Monte Carlo, which involves iteratively updating the estimated parameters using a sequence of observation. We used a number of new cases as observations, and the carrying capacity is the unknown parameter that is sequentially estimated.
The local mosquito populations are calculated using the life cycle equations, daily average temperature, and precipitation for each location around the globe with longitude and latitude resolutions of 0.25 degrees. The result is the time-dependent profile of the mosquito population. The calculated values are not the exact populations, rather they reflect the variation of the mosquito population in that location through time.

Dengue Transmission Reproductive Number
The value of dengue's basic reproductive number R 0 depends on the ratio of mosquito to human population V/N H , mosquito biting rate α, human-to-vector and vector-to-human transmission probabilities per infectious bite φ H→V , φ V→H , extrinsic incubation period in the mosquito γ V , mosquito mortality rate µ V , and infectious period of human σ H . These parameters are temperature-dependent and can be found in [40], except for V/N H . The R 0 is derived assuming constant temperature and using Markovian equations. When R 0 < 1, the viral transmission will result in fewer new cases than past cases, and the infection will burn out if maintained for sufficient time.
Conversely, interpretation that the infection is going to die out when R 0 < 1 is only relevant if this condition is satisfied for a long period. However, if the temperature is almost constant for a period longer than the relevant periods in the transmission model, the R 0 can be used to quantify the severity of virus transmission. In other words, when R 0 > 1, an infected individual is expected to generate more infection in the population. For this reason, we utilize the value of R 0 to gauge the severity of virus transmission. The mosquito-to-human population ratio V/N H is time-dependent and is affected by the carrying capacity of the environment. To estimate the carrying capacity per human, we fit the transmission model to epidemic dengue curves from different locations around the world with a given human population. For each location, we obtain a constant component for the carrying capacity and a parameter that determines the variation of the carrying capacity with precipitation. We use the distribution of these parameters to find a statistical model for the carrying capacity per human. Although this model may not give the exact value for the carrying capacity, we use it as an approximate input in the calculation of R 0 . Our calculation shows that R 0 is almost proportional to the carrying capacity; therefore, the calculated R 0 provides a good approximation for the time-dependent profile of the exact reproductive number. Higher R 0 values indicate greater dengue virus transmission in an area; therefore, users can quickly understand where control measures need to be initiated and evaluate if they are working by observing changes in R 0 .

Dengue Transmission Risk
To estimate dengue transmission risk, we use the dengue transmission reproductive number and vector population calculated using the mechanistic model as predictive variables. For each location, we calculate the one-week moving average of the variables and apply them in a trained neural network classifier to obtain a time-dependent transmission risk. Assuming the historical presence of the mosquitoes and the virus, the risk factor is a value between zero and one corresponding to the least and the most favorable environmental condition, respectively, for dengue transmission. To train the model, we used dengue epidemic data from Brazil, Mexico, Malaysia, Pakistan, Philippines, Singapore, Taiwan, Sri Lanka, Costa Rica and Vietnam. For each one of the epidemic curves, we manually tagged periods of high and low transmission. If the number of new cases is small and comparable to the baseline for the location, we tag that period as zero, otherwise as one. In addition, we calculate the corresponding predictive variable as described above for each location and period. Using the tag values as the target variable and the corresponding predictive variables, we train the neural network classifier for risk evaluation.
In Figure 2, we show the data used for training as green and yellow dots, indicating low (0) and high (1) transmission, respectively. The neural network classifier is a nonlinear model that assigns a risk value between zero and one to points on the surface defined by the predictive variables V/N h and R 0 . Within the depicted image, regions displaying risk values exceeding 0.5 are indicated by the color red, while those with values below 0.5 are represented by the color blue.

Forecasting Dengue Case Number
In this section, we outline the methodology we have developed for predicting the future incidence of ongoing dengue outbreaks. Our approach entails analyzing userprovided data, including reported incidence, location, and date, and extracting pertinent environmental information from our extensive database. These data are then utilized to generate forecasts, which are presented by the statistics including upper quantile, median, and lower quantile values of the forecast distribution. Case forecasts help PICTUREE-Aedes users understand how the dengue outbreak will change with time. Forecasts of increasing cases suggest that allocating addition resources to an area is needed due to continued transmission, whereas decreasing case forecasts suggest that resources can be reallocated and that the outbreak is getting smaller.

SEIR-SEI-EnKF Forecast
The SEIR-SEI-EnKF forecasts are generated by employing the disease transition model SEIR-SEI combined with the EnKF data-assimilation technique. The SEIR-SEI-EnKF model is a new approach for estimating and forecasting dengue outbreak dynamics, which has been detailed in [16]. The SEIR-SEI model is a compartmental model that describes the transmission of the dengue virus between the host and vector population groups. This model considers the compartments for the human population and vector population, such as susceptible (S H andS V ), exposed (E H andE V ), infectious (I H andI V ), and removed (R H ). Transitions between compartments are described by the following difference equations At each time step, the EnKF first uses the model simulations to predict the prior, which is the expected state of the system where g(Φ t ) is the difference equation for the parameters, and q t is the model noise. Therefore, the ensemble covariance for the prior is given by whereĀ − t is the mean of ensemble members. To update the estimated state and parameters of the system, the EnKF assimilates new observations y t into the ensembleŶ t = y t + t , where t is the perturbation vector. The assimilation step involves comparing the predicted states of the ensemble with the observed data and adjusting the estimated state and parameters of the system to better fit the observations where K t = σ pri t H t (H t σ pri t H t + σ obs t ) −1 is the Kalman gain, and H t is the measurement index matrix, σ obs t = t t N−1 . Finally, the EnKF generates a new ensemble of model simulations by perturbing the updated estimated state and parameters of the system, and the process repeats for the next time step. In summary, the EnKF utilizes a data assimilation approach that combines real-time observations with mathematical models to accurately forecast dengue epidemics, helping to inform proactive measures for disease control and prevention.

Neural Network Forecast
The second approach to predict weekly cases uses a recurrent neural network model, as shown in Figure 3. Assuming that the index of the current week is represented by t, the model uses a sequence of weekly new cases up to t, c t−4 , c t−3 , c t−2 , c t−1 , c t , as input to predict new cases in the next three weeks. In addition to the sequence of new cases, the model uses the vector population and reproductive number sequences, calculated independently using the model discussed in the previous section.
The vector population V t+h−4 , V t+h−3 , V t+h−2 , V t+h−1 , V t+h and reproductive number {R t+h−4 , R t+h−3 , R t+h−2 , R t+h−1 , R t+h } include estimated vector population and reproductive number in the weeks ahead. Since the calculation of these sequences requires temperature as input, we use the average of temperature for the corresponding period in the past years. In addition to the sequences we discussed, we feed the error of the latest two predictions to the neural network model. We train this neural network model using historical dengue epidemic curves from different locations around the world. Although it is possible to train a neural network model that generates one, two, and three weeks of predictions, we choose to train independent models for each one of these prediction horizons. Here, the obtained output is a single value for each prediction time interval.

Particle Filter Forecast
As a third approach, we apply particle filter prediction to the transmission model, as depicted in Figure 4. In the transmission model, we divide the adult female mosquitoes into three compartments based on the virus development in the mosquitoes. We assume that all the newly developed adult mosquitoes are virus-free, (Adult-S). During the gonotrophic period, whose length is temperature-dependent, the mosquitoes may bite an infectious human, acquire the virus, and remain in the adult-exposed (Adult-E) state until they become infectious and are able to transmit the virus. The sojourn time for the Adult-E state is temperature-dependent. After becoming infectious, the mosquitoes infect susceptible humans during blood meals in their gonotrophic cycle until they die. We assume that the number of infecting bites per cycle is constant, and the bites occur uniformly in each cycle period. The human infection rate is A I B/(CN), where A I is the population of infectious mosquitoes, B is the number of infecting bites per gonotrophic cycle, C is the cycle length, and N is the human population. Similarly, the mosquitoes' infection rate is H I B/(CN), where H I is the infectious human population. In the particle filter estimation, we assume two unknown parameters, which are carrying capacity and the number of infecting bites per gonotrophic cycle. Like the mosquito life cycle model as shown in Figure 1, we account for non-Markovian transitions in the virus transmission model. Particle filtering is a sequential Monte Carlo approach that employs a group of samples to represent the posterior distribution of the unknown parameters. In each step of the process, a weight is assigned to each sample based on the observed data. During the resampling phase, particles with low weights are substituted by new particles in close proximity to the particles with higher weights. An advantage of the particle filter is its ability to handle observations from any probability distribution. In our particle filter estimation, we provide the algorithm with a sequence of weekly new human cases and estimate posterior distributions of new human cases for the one-week up to three-week horizon.

Super Ensemble Forecast
Using the competing forecasts described above, a super-ensemble forecast can be created by applying a weighted average of the individual forecasts. The weights of the three individual forecasting methods is determined based on the root-mean-square error (RMSE) of those forecasts over prior predictions. Specifically, the RMSE of the forecasts made for time t − 3, t − 2, and t − 1 can be used to generate the super-ensemble forecast for time T. The RMSE of each individual forecast e i (i = 1, 2, 3) can be calculated as follows: where f i t is the median forecast of method i at time t, and O t is the observation at time t. The weight of method i can be obtained by The super-ensemble forecast is given by

System Architecture
The web application consists of several components as shown in Figure 5, which we broadly categorize into (1) the frontend client application and (2) the server backend.

Client Application
The web-based client is built on top of HTML and is accessible using any web browser. The application's graphical interface consists of an interactive map viewer and tools for data selection, filtering, parameter inputs, and submission of on-demand analyses. The map is displayed using the open-source JavaScript library leaflet.js [41]. The application communicates directly with the server backend to load maps, results, and data.

Server Backend
• HTTP server: The backend system exposes multiple API (application programming interface) endpoints to the client via an HTTP server. This HTTP server runs in the Node.js environment [42]. • Database: The database service consists of PostgreSQL and PostGIS [43] extensions to handle spatial queries. The database contains multiple tables to store data collected from various sources and simulation results. • Map tile service: When the client requests the home page of the map tool, the HTTP server loads map tiles from a separate map tile service called MapTiler [44]. This service helps with memory management by loading only the parts and zoom levels required by the client. The service takes input from large tile files collected from OpenStreetMap [45]. • Simulation tools: The system can deploy and run executable code to perform ondemand analysis and simulations on selected data. The HTTP service directly communicates with the executables to pass arguments and to collect results. • Scheduled updates: The system contains scripts that run regularly to update the database tables with new weather data. The script connects to the NOAA-NCEI [24] servers for daily weather observation data.  The PICTUREE-Aedes enables users to identify weather stations and their recorded data based on their preferences. Figure 7a illustrates the distribution of weather stations in Central Indochina (Ecoregion). Users can select a station and plot its precipitation and temperature data. Figure 7b demonstrates the daily temperature and precipitation data of station UBON RATCHATHANI throughout 2022.  Figure 8 presents the occurrence records of dengue, Aedes aegypti, and Aedes albopictus, marked, respectively, with red, blue, and purple dots in Central Indochina. Clicking on a dot will generate a popup that displays the country name and the year of occurrence. Additionally, the PICTUREE-Aedes supports the display of information on Google Trends. Figure 9 illustrates the trend of "dengue" and "fever" in Cambodia during January 2018. Figure 9. Google search trends of "dengue" and "fever".

Simulation Results
The PICTUREE-Aedes offers four types of mosquito simulation results, including daily average temperature, vector host ratio, reproduction number, and dengue risk, which can be visualized spatially and temporally. To view the data spatially, users must select a specific day, and a heat map of the selected area will display the selected data value. Figure 10 displays the average temperature (a), dengue transmission reproduction number (b), vectorhost ratio (c), and dengue transmission risk (d) of the Indochina peninsula on 15 September 2022. Visualizing data beyond the next six months from the current time is limited.
To visualize data temporally, users must select a location and a time range. Figure 11 displays mosquito simulation data from 1 January to 11 June 2022 at the user-specified location. Please note that the query time frame is limited to a six-month horizon from the current date.   The PICTUREE-Aedes provides three types of predictive tools and their ensemble forecast. Figure 12 illustrates the predicting results of the dengue outbreak in Cambodia during 2022. The first 13 data points (before 27 April) represent weekly dengue cases, while the remaining points (including/after 27 April) denote the predictions. Figure 12a displays the distribution of SEIR-SEI-EnKF predictions for the next three weeks, including quantiles and median. Figure 12b illustrates the neural network prediction for the next three weeks. Figure 12c shows the results of the particle filter, presenting the distribution for the next four weeks. Figure 12d displays the result of the super-ensemble, showcasing the median prediction from each member, the average, and weighted average of the members.   Table 2 displays the outcome of the prediction as well as the reported cases. All of the reported cases in the following weeks are situated within the prediction range, which confirms the precision of the prediction tools.

Discussion
Dengue fever is a viral mosquito-borne disease transmitted predominately by the Aedes mosquitoes. It infects an estimated 400 million people annually, with nearly half the world's population at risk of infection [46]. Dengue cases are influenced by complex interactions of ecology, environment, meteorological factors, and virus factors (serotype), among many others [47].
The PICTUREE-Aedes is an innovative and unique system that serves as a valuable resource for both researchers and health officials in their efforts to advance scientific research and to effectively manage dengue outbreaks. By collecting and aggregating a comprehensive set of critical factors related to dengue transmission, such as reanalyzed temperature, precipitation, occurrences of Aedes mosquitoes, and dengue cases, the PICTUREE-Aedes provides researchers with a rich dataset for their investigations. This data-driven approach allows researchers to gain deeper insights into the complex dynamics of dengue transmission and aids in the development of more effective prevention and control strategies. In addition to its research-oriented features, the PICTUREE-Aedes offers essential information to health officials, enabling them to proactively prepare and adapt to the changing situation during dengue outbreaks. The system employs sophisticated algorithms to estimate Aedes mosquito abundance, dengue transmission reproduction number, and dengue risk, providing crucial insights into the intensity and geographical spread of the disease. Armed with this knowledge, health officials can make informed decisions about when and where to implement targeted vector control measures, effectively minimizing the risk of dengue transmission and safeguarding public health. Moreover, the PICTUREE-Aedes goes beyond monitoring and estimation by offering valuable and accurate forecasting capabilities. By utilizing its advanced modeling techniques, the system can predict future dengue case counts, empowering health officials and healthcare facilities to deploy medical personnel and to allocate treatment resources more efficiently. This forecasting functionality is instrumental in anticipating the potential impact of dengue outbreaks, enabling proactive measures to be taken in terms of resource allocation, capacity planning, and public health awareness campaigns.
The selection of a 6-month forecast horizon was made without a specific rationale. The forecast horizon relies on the availability of dependable environmental variables for the future. Presently, we employ the average temperatures from previous years for the corresponding timeframe in the next six months. However, the current values of the calculated variables are adjusted based on the most recent available environmental data. The current levels will influence the future levels of the calculated variable. However, for the dengue case number forecast horizon, we chose 3 weeks because human intervention can change the underlying physical transmission process in the long term.
Concerning possible limitations of the proposed system, the temperatures displayed are collected by the weather stations; thus, they do not always reflect temperature in areas close to the weather station. For example, shaded and barren areas near the same weather station will be considered at the same daily temperature. In contrast, the actual temperature of those places can range significantly. There are many water sources beyond precipitation, and Aedes mosquitoes are very efficient at finding these areas to lay eggs, such as domestic or industrial wastewater, sprinkler systems, air conditioning units, etc. [48]. Therefore, the accuracy of the local temperature and precipitation and the estimation results obtained considering only temperature and precipitation can be erroneous in some areas, and local data are needed to adjust the models. Furthermore, human behaviors cannot be accounted for at this time [49]. Water collection during the dry seasons, litter accumulating water, and irrigating lawns have all been demonstrated to impact Ae. aegytpi numbers, and this global model cannot account for local behaviors at this time [50]. In the future, local models can be generated and placed within the global model. Additionally,there is an increasing significance of female mosquito vertical transmission through the dry quiescent egg population. These infected eggs play a potential role in triggering dengue incidence following periods of low mosquito activity, such as the dry season [51]. In our future work, we plan to explore the dynamics of female mosquito vertical transmission in greater depth. This will involve investigating the factors influencing the rate of infected eggs, studying the survival and viability of infected offspring, and analyzing the potential amplification of dengue transmission through this mode. For the predictions, four different approaches provide reasonably accurate forecasts. However, some manual calibrations of the tools are needed to make the predictions more accurate at a local scale.
Future improvements include increasing the accuracy of the population abundance estimation in areas where no data are available, exploring the dynamics of female mosquito vertical transmission, optimizing the layout of the webpage, extrapolating the mosquito population dynamics from surveyed areas to unknown regions, and extending the tool to Culex mosquito genera.

Conflicts of Interest:
The authors declare no conflict of interest.