Introduction

Wayfinding and orientation in built environments are essential aspects of people’s daily lives. Many of these environments are unfamiliar and thus require wayfinding assistance. There are various types of “knowledge in the world” (i.e., external knowledge) [1] that can help individuals find their way, such as signs, maps, landmarks, and fixed geometric aspects of the environment [2,3,4,5]. Signs can be particularly easy to understand because they require less abstraction than you-are-here maps, easy to adapt because they can accommodate changes to the environment, and easy to quantify because they allow for the measurement of relevant information [6]. Indeed, recent research has validated the importance of the perception, interpretation, and usage of the directional information provided by signage systems for improving evacuations [7]. In order for agent-signage interaction to be effective, signs should communicate meaningful navigation information amid a plethora of noise, and the agent must be able to perceive and interpret such information. A proper signage system can reduce the perceived complexity of the environment and improve wayfinding performance and user experience [8]. However, existing measures [9, 10] are insufficient by themselves for assisting designers in the reduction of complexity/uncertainty (i.e., information) using signage.

The most important features of a biological, cognitive system are the abilities to perceive, act, and learn. Many researchers have built computational models of intelligent cognitive systems [11,12,13] that can cooperate with other agents in a shared workspace towards a common goal [14], perform decision-making during path-planning and avoid obstacles with flying drones [15], and perceive information from an unfamiliar dynamic environment [12]. These abilities can also be characterized in terms of Shannon information theory [16, 17]. Indeed, Fuster and colleagues [18] defined the perception-action cycle as the circular exchange of information from the environment to sensory and motor structures and back to the environment via goal-directed behavior.

Information theory is a branch of mathematics used to describe the manner in which uncertainty can be quantified, manipulated, and represented [19, 20]. This uncertainty can be used to characterize the transmission of information in various communication systems [21], including those from computer science [22], philosophy [23], physics [24], and cognitive science [25]. Information may be transferred within and between internal and external representations of space [26, 27] and requires a source, channel, and receiver. For example, we define the source as the wayfinding information afforded by a signage system, the channel as the agent’s sensory and perceptual apparatuses used to interpret the information, and the receiver as the knowledge of a sign’s directional information acquired by an agent. During the process of information flow from the source to the receiver, there is a loss of information that can occur for various reasons. Here, we do not investigate these reasons but instead focus on the quantity of information that is lost. Information theory quantifies such loss during transmission as uncertainty caused by the noisy channel and can be used as the foundational principle for quantifying wayfinding information in the environment.

With a similar focus on environmental constraints, Wang and colleagues [28] suggest that humans use a model-based predictive approach to anticipate the physical dynamics of the environment and plan a movement. Researchers have also generated smooth movement trajectories for agents using optimization algorithms such as particle swarm [29]. With the present approach, we employ information theory in order to provide realistically smooth trajectories along uncertainty gradients for agent path planning. For an artificial agent to demonstrate realistic and purposeful behavior, this agent should possess the abilities to explore, identify, and internalize information from its immediate surroundings. Recently, Marghi and colleagues [30] argued in favor of developing information processing frameworks that focus more on learning and the formation of internal models than direct geometric processing of spatial information for reasoning.

In the present work, we develop and test a biologically inspired computational model of human-signage interaction based on information theory. Towards this end, we conducted two crowd-sourced online experiments and one VR lab-based experiment to refine and validate our proposed cognitive model. There are four main contributions of this paper:

  • An information-theoretic approach to quantify the information provided by a signage system and to facilitate wayfinding in an environment.

  • Two crowd-sourced experiments to compute the parameters of an agent-signage interaction model.

  • An information gain-based approach for spatial decision-making from signs.

  • A VR experiment to refine and validate the proposed agent-signage interaction model.

To anticipate, our refined information-theoretic model substantially improved the extent to which we can predict human wayfinding trajectories in virtual reality. Because of its foundation in information theory, the model affords greater flexibility with respect to different sources of information and noise and can be used in the future to predict wayfinding behavior in more complex environments.

Background and Prior Work

According to Gibson [31], an agent’s perception depends on the pick up of invariant information from the environment. In his theory, affordances are the invariants that are relevant for the interaction between an agent and an object [32]. Heft [33] has applied this theory to the influence of environmental properties on navigation behavior, and Norman [34] has expanded the concept of affordances to include design thinking. These environmental properties may include the design and placement of signage.

In order for signs to provide ecologically relevant information, they need to be visible and interpretable [35]. An appropriate signage system can facilitate the communication of wayfinding information from the world to the individual agent [36]. Previous research has demonstrated that signage has distinct advantages over maps for navigating the built environment [37, 38]. Indeed, O’Neill [38] found that signs with text led to a reduction in incorrect turns and an overall increase in wayfinding efficiency compared with graphic signage (i.e., an arrow; see also [39]). Studies that have employed simulations and user experiments have suggested that the focus of visual attention can be improved with signage redesigns [9, 35]. Also, signage can elucidate the relationship between crowding and well-being [39], reduce simulated casualties [40], and improve simulated evacuation times [41].

For the most part, agent circulation and evacuation models have neglected the agents’ interaction with the signage system [42]. In such models, the underlying assumption is that agents have full knowledge of the world and can compute the route beforehand. This oversimplification leads to inaccurate simulation results because these models do not incorporate signage detection errors. Filippidis and colleagues [42] presented the first evacuation model (BuildingEXODUS) to include agent-signage interaction using the concept of Visual Catchment Area (VCA). The VCA of a sign is the region in which an agent can physically perceive wayfinding information from a sign and can be approximated with a circle [43]. Indeed, analyses based on the VCA can be used to improve the wayfinding design process by visualizing the VCA of a sign together with path circulation and by optimizing the sign’s location and orientation to maximize its visibility along nearby circulation paths [44]. The VCA of a sign is calculated using the location of the sign, the height of the agent, the viewing angle, and the size of the lettering on the sign. Xie and colleagues [41, 45] experimentally computed the detection rate and compliance rate (i.e., accuracy) of a sign in terms of familiarity, relative orientation, and level of directional information conveyed by the sign. They found that the environment in which the signs were positioned influenced human-signage interaction. However, these authors assumed that participants in their experiment apprehended information from the sign and would have acted consistently once they detected the sign.

Other researchers have proposed different methods for designing signage systems and modeling agent-signage interactions for evacuations and wayfinding scenarios. For example, Tseng and colleagues [46] used Building Information Modeling (BIM) technology for designing the signage system of a public building that was more effective than traditional procedures. They also suggested that the architectural plan and signage layout can be processed simultaneously in a BIM collaborative environment. Similarly, Motamedi and colleagues [47] proposed a system for optimizing the arrangement of directional and identification signage in BIM-enabled environments. Their system estimated optimal signage arrangement based on signage visibility and legibility for a 3D pedestrian model. However, signage detection and interpretation was not considered [47], and the agents’ paths were not affected by the navigation information provided by signage. Consequently, the empirical evaluation of cascading decisions along a route containing several signs (i.e., wayfinding) was infeasible.

Different models of wayfinding behavior have assessed agent-signage interaction with various signage parameters. For example, Hajibabai and colleagues [48] proposed a wayfinding simulation in a 2D environment model using directional signage for emergency evacuation during a fire. The agents in this simulation could decide their routes of movement based on perceived signage and fire propagation. However, the visibility and legibility of the signs were estimated using simple heuristics, and signage detection was not considered. Recently, signage-based wayfinding simulation has advanced by incorporating different signage parameters. Chen and colleagues [49] proposed and tested a wayfinding simulation algorithm based on 3D structural information, including doorway width and height, the contrast and intensity of the signage, and room illumination. Like the study by Morrow and colleagues [50], this study [49] is difficult to apply to the evaluation of a signage system with respect to wayfinding because the proposed model did not incorporate agent-signage interaction.

Simulations of human-signage interaction also require the optimization of sign placement in order to improve wayfinding. For example, Lin and colleagues [51] proposed a cellular automata model for optimizing directional signs in terms of congestion to facilitate occupants’ wayfinding in an airport terminal. Other researchers have optimized similar simulations in terms of visibility [52, 53]. For example, Tam and colleagues [52] optimized a binary linear model for sign placement in terms of the ratio of the number of available sight lines and the total number of sight lines that exist throughout the terminal. Similarly, Zhang and colleagues [53] proposed another model based on cellular automata for human-signage interaction during a simulated evacuation based on the coverage area of individual signs and the overall coverage of the signage system.

Some researchers [35, 54] have proposed schemes for wayfinding simulation based on agents’ perception of directional and identification signage in a 3D environment model. In these simulations, each successive walking direction of each agent was determined autonomously based on navigation information from the perceived signage. Signage perception was determined using estimations of signage visibility and legibility that were based on the visual perception of the pedestrian model. However, the detection of the perceived sign and the interpretation of information provided by the sign were not considered in these simulations.

Information theory may provide the flexibility required to incorporate various physiological (e.g., visual acuity), physical (e.g., VCA), and psychological (e.g., interpretation) factors into agent-based wayfinding models. Previously, information theory has been employed to measure scene complexity in computer graphics [55] and to automatically compute ideal viewing positions for polygonal scenes based on viewpoint entropy measures [56]. Turkay and colleagues [57] proposed an information theory-based framework that automatically controls the movement of the virtual camera in a crowded environment. They have extended their framework to control the behavior of an agent in crowd simulations to include variability in and realism of movement [58]. While Turkay and colleagues [58] focused on individual differences among agents in a crowd, the present work emphasizes the cognitive capacity of an individual agent given knowledge in the world.

Preliminaries

In this paper, we examine dominant physical and psychological factors such as a sign’s visibility, the agent’s height, and spatial decision-making.

Signage and Environment Model

In the proposed model, a signage system consists of a set of signs. Individual signs are represented by their legibility attributes along with a list of goal locations. These attributes include saliency, text legibility distance, sign visibility distance, sign type, comprehension time, and content. Each sign is considered an asset, and the sign’s property is assigned during its creation as a property-set in BIM [59]. Table 1 in Appendix A lists the sign’s attributes and describes each attribute in detail.
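For illustration, a sign’s legibility attributes and goal information could be represented as in the following minimal Python sketch; the field names and example values are illustrative stand-ins for the BIM property-set described in Appendix A rather than the framework’s actual identifiers.

```python
from dataclasses import dataclass, field

@dataclass
class Sign:
    """Illustrative container for a sign's legibility attributes (cf. Appendix A)."""
    sign_id: str
    position: tuple                   # (x, y, z) location in the environment
    saliency: float                   # relative visual prominence, e.g., in [0, 1]
    text_legibility_distance: float   # distance (m) at which the lettering is readable
    visibility_distance: float        # distance (m) at which the sign itself can be seen
    sign_type: str                    # e.g., "directional" or "identification"
    comprehension_time: float         # seconds required to interpret the sign
    content: dict = field(default_factory=dict)  # goal location -> indicated direction

# Hypothetical example: a directional sign pointing left towards "Gate C2"
sign = Sign("S1", (10.0, 3.0, 5.0), 0.8, 20.0, 40.0,
            "directional", 1.5, {"Gate C2": "left"})
```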

Grid Maps

In the proposed framework, the navigable surface of a 3D environment is divided into an array of rectangular grid cells. This array is a virtual grid map of the 2D floor plan that is used as a reference point for an agent’s location. The size of a grid cell is set to 1 m by 1 m because it approximates the average step length and size of an adult. The selection of this particular grid cell size allows us to match agents’ step size to the step size of human participants in later VR studies. In addition, this grid cell size provides a balance between computing time and the similarity of simulated trajectories. Each grid cell has several parameters that store essential wayfinding information. These parameters can include whether the grid cell is walkable or obstructed, a binary list of the sign’s visibility value, and a list of the entropy values of each sign from that grid cell. For a comprehensive list of grid cell parameters, see Appendix B.
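A grid cell’s parameters (see Appendix B) could be stored as in the sketch below; again, the field names are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class GridCell:
    """One 1 m x 1 m cell of the navigable-surface grid map."""
    row: int
    col: int
    walkable: bool = True  # False if the cell is obstructed
    sign_visible: Dict[str, bool] = field(default_factory=dict)   # sign id -> visibility
    sign_entropy: Dict[str, float] = field(default_factory=dict)  # sign id -> entropy (bits)

# A grid map covering a 50 m x 50 m floor plan at 1 m resolution
grid = [[GridCell(r, c) for c in range(50)] for r in range(50)]
```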

Agent Model

We embed two physical aspects of humans (i.e., visual acuity and height) into the agent framework. In the proposed model, the agent’s visual acuity is considered normal (i.e., 20/20). An average eye height of 1.72 m is considered for the dynamic sign visibility check.

Visual Perception Model

In wayfinding research, visual cues in the built environment are a human’s primary source of distal information [60]. Indeed, several researchers have developed vision-based techniques for enabling mobile robots and virtual agents to detect and react to information in their environment [61]. In order to realistically model the interaction between agents and their environment, a human-like visual perception model should focus on the first-person perception of signage while considering dynamic occlusions [62]. In our model, both the horizontal and vertical fields of view (FOV) are modeled to realistically simulate the agents’ visual perception. The effective horizontal FOV is 120 degrees in order to account for human neck rotation [63] during search behavior (i.e., before the sign is detected). The effective vertical FOV is 60 degrees once a sign is detected to simulate focused visual attention.
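As a minimal sketch of the horizontal FOV test (assuming 2D positions and a heading angle, which are simplifications of the full 3D model), a sign lies within the agent’s field of view when the angle between the agent’s heading and the direction to the sign is at most half of the FOV:

```python
import math

def within_fov(agent_pos, heading_deg, sign_pos, fov_deg=120.0):
    """Return True if sign_pos lies within the agent's horizontal FOV.

    agent_pos, sign_pos: (x, y) tuples; heading_deg: facing direction in degrees.
    fov_deg defaults to the 120-degree search FOV used in the model.
    """
    dx, dy = sign_pos[0] - agent_pos[0], sign_pos[1] - agent_pos[1]
    angle_to_sign = math.degrees(math.atan2(dy, dx))
    # Smallest signed angular difference between heading and sign direction
    diff = (angle_to_sign - heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov_deg / 2.0

# Example: an agent at the origin facing east sees a sign at 45 degrees
print(within_fov((0, 0), 0.0, (10, 10)))  # True
```

An analogous test with a 60-degree threshold can be applied in the vertical plane once a sign is detected.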

Visual Catchment Area

The VCA of a sign is the region in which an agent can physically perceive wayfinding information from a sign. The calculation of a sign’s VCA is described below:

$$ \left (\frac{b}{\sin(o )} \right )^{2} = x^{2} + \left (y - \frac{b}{\tan(o)}\right )^{2} $$
(1)

Here, o is the angular separation of the sign and the agent, b is half of the size of the sign’s surface, and P(x,y) represents the agent’s location. The center of the VCA is at location \((0, \frac{b}{\tan(o)})\) with a radius of \(\frac{b}{\sin(o)}\).

In our proposed signage visibility model, we simplify the calculation of the sign’s VCA as suggested by [43]. The VCA of a sign can be reliably simplified to an approximate circle with its radius equal to half of the viewing distance for the sign’s lettering height. For all of the experiments conducted for this paper, we have used a lettering height of 152 mm and have considered the radius of the VCA circle to be 20 m.
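The following sketch contrasts the exact VCA radius from Eq. 1 with the simplified circle; the 40-m legible viewing distance is implied by the stated 20-m radius (i.e., half of the viewing distance).

```python
import math

def vca_radius_exact(b, o):
    """Radius of the exact VCA circle from Eq. 1: b / sin(o), where b is half
    the size of the sign's surface and o is the visual angle in radians."""
    return b / math.sin(o)

def vca_radius_simplified(letter_viewing_distance_m):
    """Simplified VCA (after [43]): a circle whose radius is half of the
    legible viewing distance for the sign's lettering height."""
    return letter_viewing_distance_m / 2.0

# The paper's setting: 152 mm lettering with an implied 40 m viewing distance
print(vca_radius_simplified(40.0))  # 20.0 m
```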

Dynamic Sign Visibility Check

The dynamic sign visibility check is a runtime sign visibility test for dynamic occlusion. When an agent enters the VCA of a sign, a dynamic visibility check is performed to check for any occlusion (from another agent or physical barriers) in the visual field of the agent. Five rays are cast from the eye position of the agent to a sign (see Fig. 1a). Here, the default value is 1.72 m (i.e., the approximate average height of an observer’s eye above the floor). These five rays are cast towards the center and four corners of the sign. If three out of five rays hit the sign unobstructed, the sign is considered visible. Figure 1 b illustrates the manner in which obstacles affect the VCA of a sign.
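The five-ray test could be implemented as in the sketch below; ray_hits_sign is a hypothetical stand-in for the engine’s ray cast (e.g., a physics ray cast in Unity 3D) that returns True when a ray from the eye position reaches the target point unobstructed.

```python
def sign_visible(eye_pos, sign_center, sign_corners, ray_hits_sign):
    """Dynamic sign visibility check: cast rays from the agent's eye position
    (default height 1.72 m) to the sign's center and four corners; the sign
    counts as visible if at least three of the five rays are unobstructed.

    ray_hits_sign(origin, target) -> bool is a stand-in for the engine ray cast.
    """
    targets = [sign_center] + list(sign_corners)  # five target points
    unobstructed = sum(ray_hits_sign(eye_pos, t) for t in targets)
    return unobstructed >= 3
```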

Fig. 1
figure 1

Visualization of the dynamic sign visibility check. a Five rays are cast from the eye position of two agents towards two different signs. b Revised VCA after the dynamic sign visibility check with occlusion. When there is an obstacle between the agent and the sign, the grid cells contain no information (visualized here in white)

Quantifying Wayfinding Information in a Sign

We apply the principles of information theory to quantify the information provided by signage in a virtual environment. While many physical and psychological factors influence the effectiveness of signage systems (e.g., color, contrast, interpretability, and attentiveness), we focus exclusively on signage visibility and decision-making confidence.

According to information theory, entropy quantifies the amount of uncertainty in a random variable described by a probability distribution. The Shannon entropy of a discrete random variable X that can take the possible values x1,...,xn is

$$ H(X) = E(I(X)) = -\sum\limits_{i = 1}^{n} p(x_{i}) \log_{2} p(x_{i}) $$
(2)

We model the entropy of a sign’s visible information P(l,s) as a measure of the navigation-relevant information that is available to an agent at location l from sign s. Let X(l,sa) be a random variable that represents a particular piece of information at a location l and sign sa. The probability of a particular value for the random variable X(l,sa) will depend on the distance of sign sa from the location l and the relative angle between location l and sign sa. The probability distribution is generated by sampling information X from sign sa at l for 1000 iterations. Based on our experiments, we found 1000 samples to provide a reasonable trade-off between the granularity of calculations and computing time. Further investigations are needed to determine the sensitivity of our calculations to this parameter.
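The sampling procedure can be sketched as follows, assuming a hypothetical sample_reading function that returns one (possibly corrupted) observation of the sign’s content from location l; the empirical distribution over 1000 samples is then plugged into Eq. 2.

```python
import math
import random
from collections import Counter

def sampled_entropy(sample_reading, n_samples=1000):
    """Estimate the Shannon entropy (Eq. 2) of the information X(l, s_a) by
    repeated sampling. sample_reading() is a hypothetical stand-in returning
    one noisy observation of the sign's content from location l."""
    counts = Counter(sample_reading() for _ in range(n_samples))
    return -sum((c / n_samples) * math.log2(c / n_samples) for c in counts.values())

# Toy reading model: the correct direction is read with probability p;
# otherwise one of two incorrect readings is returned.
def toy_reading(p=0.9):
    return "left" if random.random() < p else random.choice(["right", "none"])

print(round(sampled_entropy(toy_reading), 2))  # roughly 0.57 bits
```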

The uncertainty function U(l,sa) represents the likelihood of viewing information from a sign sa at location l as a function F:

$$ U(l,s_{a}) = F(\mu,\sigma) $$
(3)

μ is proportional to the distance and relative angle between sign sa and location l. Larger distances and relative angles between sign sa and location l result in higher values for μ (i.e., closer to 1), and σ represents the decision-making confidence at location l conditioned on μ.

These two relationships form the basis of I(sa) (i.e., the actual information contained in sign sa) and can be combined with Eq. 3 to calculate P(l,sa):

$$ P(l,s_{a}) = Noise(I(s_{a}), U(l,s_{a})) $$
(4)

P(l,sa) is the entropy of the sign sa for an agent at location l. We then substitute P(l,sa) from Eq. 4 for p(xi) in Shannon’s entropy equation (Eq. 2) to obtain a measure of entropy for a sign from the observer’s location.

The work conducted by Filippidis and colleagues [42] assumes a relationship between the relative direction of the agent to the sign and the probability of the sign being visible. In this paper, we empirically compute the essential measures of sign visibility confidence (μ) inside the sign’s VCA and the sign’s decision-making confidence (σ) after the sign’s detection. Sign visibility confidence is vital for assessing the accuracy of sign legibility from various locations. Because spatial decision-making after sign detection depends on the sign’s legibility, we compute decision-making confidence conditioned on the sign’s visibility. To understand the relationship between the two measures and to generate two distributions, we conducted two online crowd-sourcing experiments using Amazon Mechanical Turk (AMT) [64]. AMT is an online crowd-sourcing service in which anonymous workers complete web-based tasks. The advantages of an online study include lower cost, faster data collection, and greater experimental control compared with real-world studies. While real-world studies can provide better ecological validity, they are often infeasible in this context.

The first experiment investigated sign visibility as a function of observation angle and distance. In the second experiment, we assessed decision-making confidence as a function of a sign’s visibility. Together, these experiments were used to compute the relationship between decision-making confidence and sign visibility.

Experiments

Experiment 1: Sign Visibility as a Function of Observation Angle and Distance

The purpose of this study was to determine the relationship between the visibility of a sign from various viewing distances and observation angles within the VCA of a sign. Specifically, the experiment tested the hypothesis that the visibility of a sign is a continuous measure rather than a binary value inside the VCA (cf. [43]).

Design

We used the Unity 3D game engine (https://unity3d.com/) in order to generate the layout of a simple environment, including a basic wall and floor with gray textures (see Fig. 2a). For each trial, the sign was placed on a wall at a height of 3 m from the floor. For an example of a sign, see Fig. 2b. For the text on each sign, five characters were randomly chosen from a pool including “A–Z” and “0–9.” We excluded characters such as “S” and “5,” “0” and “o,” and “1” and “l” because they are difficult to distinguish from one another. We also excluded special characters because they are not commonly used on signs. We used alphabetic letters and numbers because this arrangement is most common (e.g., “GATE A24”). The generated characters were then written on a sign with a green background and white font. The text height was selected to be 152 mm.
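The character sampling can be reproduced with a short sketch; the excluded confusable characters below follow the text, and the exact exclusion set is otherwise an assumption.

```python
import random
import string

# Exclude easily confused characters (e.g., "S"/"5", "0"/"o", "1"/"l"; per the text)
CONFUSABLE = set("S50Oo1l")
POOL = [c for c in string.ascii_uppercase + string.digits if c not in CONFUSABLE]

def random_sign_text(length=5):
    """Generate the random five-character string displayed on a sign."""
    return "".join(random.choice(POOL) for _ in range(length))

print(random_sign_text())  # e.g., "XPGT7"
```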

Fig. 2
figure 2

Example stimuli from Experiment 1. a A screenshot of a sign in the testing environment. b A sign with five characters that were randomly generated

The VCA of each sign was calculated with a radius of 20 m as described in “Visual Catchment Area.” The VCA was then divided into grid cells that were 0.5 m². To reduce the number of grid cells, we used a semi-circle VCA instead of the entire VCA. Every other grid cell was then selected from the remaining grid cells. In total, 108 grid cells at various distances and angles from the sign were selected as the final locations. A First-Person Character (FPC) was created in Unity 3D with a height of 175 cm, and the camera was placed at an approximate height of 172 cm in order to approximate eye position. Screenshots were then taken inside of Unity 3D by rotating the FPC to face the sign directly from each of the selected grid cells.

We conducted a pilot experiment in order to choose the resolution of these screenshots. A high-resolution photograph of an “EXIT” sign was taken in a real environment at a distance of 30 m. The photograph was then scaled at three different resolutions (i.e., 2600 × 1462, 2700 × 1288, and 3000 × 1688) and printed on high-quality paper. Fifteen participants (7 women and 8 men) were taken to the same “EXIT” sign in the real environment and asked to stand at the same place from which the photograph was taken. Three different resolution photographs were then shown to the participants, and they were asked to select which of these three photographs was visually closer to the real-world view of the sign. Eight participants selected the photograph with the 2700 × 1288 resolution, five participants selected the photograph with the 2600 × 1462 resolution, and only two participants selected the photograph with the 3000 × 1688 resolution. Based on these results, we chose a resolution of 2700 × 1288 to be used in all experiments. In order to fit the image on various desktop screens on a web-browser, these screenshots were then cropped to a resolution of 600 × 400 by keeping the sign at the center as shown in Fig. 2a.

In order to maintain participants’ focus and attention, the experiment was divided into two smaller sub-experiments, each with 54 randomly selected signs. Each sub-experiment was created using a popular online survey platform named Qualtrics (https://www.qualtrics.com/uk/). A link to each sub-experiment was then added to the AMT platform with the assistance of the ETH Decision Science Laboratory (DeSciL). The DeSciL is a multidisciplinary experimental laboratory dedicated to the study of human decision-making behavior [65]. In total, 45 min were required for 108 participants to complete the online experiment.

Procedure

Ethics approval was acquired for all experiments from the ethics commission at ETH Zurich (EK 2016-N-73). Participants were asked to complete a consent form before starting the experiment. After informed consent, they were given an online eye test in order to select those with 20/20 visual acuity. After the eye test, participants were given three practice trials in order for them to become familiar with the questionnaire. They were explicitly instructed not to perform any zoom operations in order to avoid a penalty. Participants were then shown the 54 different sign images and asked to type the text from each sign into the textbox provided just below the image. Participants were told beforehand that the text was not case-sensitive. After the presentation of the 54 images, they were asked to enter demographic details, including their gender, year of birth, number of HITs submitted on AMT, and visual acuity. Finally, participants could provide feedback on the experiment in another textbox.

Participants

Participants on AMT are mostly located in the USA and India. Apart from residence in one of these countries, there were no other eligibility criteria. Of the AMT participants who reported their gender, 55.37% selected male, 44.62% selected female, and 0.006% selected other. The age of participants ranged from 18 to 54 years. Data were collected from 307 participants for this experiment.

Results

A total of 307 participants (154 for one sub-experiment and 153 for the second sub-experiment) completed the experiment. The data from two participants were discarded because they performed poorly on the visual acuity test. Participants who performed a zoom operation and participants who entered random text (e.g., “AAAAA”) were also discarded. After eliminating these data, 120 responses from the first sub-experiment and 134 responses from the second sub-experiment were analyzed.

Levenshtein distance (LD) was used to analyze responses [66]. LD measures the similarity between two strings by calculating the minimum number of edits required to convert the incorrect sequence into the correct sequence (i.e., edit distance). For example, the LD between the two strings “XPGT1” and “XPCT1” is 1 because only one edit is required to change “C” in the latter string to match “G” in the former string. The mean LD (mLD) for each image was used to calculate the visibility error in percentage.
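LD can be computed with a standard dynamic-programming routine; the sketch below reproduces the “XPGT1” vs. “XPCT1” example, and the normalization of mLD by the string length into a percentage is our assumption about how the visibility error was derived.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions required to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("XPGT1", "XPCT1"))  # 1

def visibility_error(responses, truth):
    """Visibility error (%) for one image: mean LD across responses,
    normalized by the length of the correct string (an assumption)."""
    mld = sum(levenshtein(r, truth) for r in responses) / len(responses)
    return 100.0 * mld / len(truth)
```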

A K-Nearest Neighbor (KNN) [67] search (number of neighbors = 1, NSMethod = “kdtree”) was conducted to estimate the visibility error for the grid cells that were excluded from the experiment by referencing the visibility error computed for the 108 tested grid cells (see Fig. 3). A visibility error range of 0 to 33% was computed from this experiment. Consistent with [43], the results revealed a gradual decrease in visibility towards the perimeter of the VCA.
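The nearest-neighbor fill can be reproduced with a k-d tree, as in the following SciPy sketch (the original analysis used a kdtree-based search with one neighbor, as noted above).

```python
import numpy as np
from scipy.spatial import cKDTree

def fill_visibility_error(measured_xy, measured_err, query_xy):
    """Assign each untested grid cell the visibility error of its nearest
    tested grid cell (1-nearest-neighbor search via a k-d tree).

    measured_xy: (108, 2) array of tested cell centers; measured_err: their
    visibility errors; query_xy: (N, 2) array of remaining cell centers."""
    tree = cKDTree(measured_xy)         # build on the tested cell centers
    _, idx = tree.query(query_xy, k=1)  # nearest tested cell for each query
    return np.asarray(measured_err)[idx]
```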

Fig. 3
figure 3

The K-Nearest Neighbor (KNN) search algorithm was used to estimate visibility error for the grid cells that were not considered in Experiment 1. Red indicates higher error (%), and blue indicates lower error. Each dot represents one grid cell from Experiment 1. a Visibility error plot for the 108 grid cells tested in Experiment 1. b Visibility error plot for every grid cell within a sign’s VCA after applying KNN

Experiment 2: Relationship Between Decision-Making Confidence and the Visibility of a Sign

The goal of Experiment 2 was to assess the relationship between decision-making confidence and the visibility of a sign from various locations.

Design

Again, we used AMT to collect data for Experiment 2. To generate a set of screenshots, we first computed the z-score values for the visibility error (in %) from Experiment 1 for each grid cell inside the VCA. Grid cells with z-scores above 0 (i.e., visibility error above the mean) were further considered. In order to reduce the number of grid locations for the experiment, one side of the circular VCA was discarded because the VCA is symmetrical. Grid cells with non-unique z-scores were also discarded to remove redundancy, and locations were randomly selected from the remaining grid cells with unique z-scores. To balance the two halves of the VCA, half of the grid cells were randomly selected and replaced with the corresponding grid cells from the other half of the circle. The final selected grid cells are visualized in Fig. 4a. Screenshots were taken from these grid cells using the same resolution as in Experiment 1. An example sign for this experiment is shown in Fig. 4b. The height of the lettering was also maintained at 152 mm.

Fig. 4
figure 4

Stimuli from Experiment 2. a The grid cell locations (red dots) selected to generate screenshots. b Example of a sign

Procedure

Participants were asked to complete a consent form at the beginning of Experiment 2. After consent, participants were given three practice examples to become familiar with the experiment. Similar to Experiment 1, participants were instructed not to perform zoom operations in order to avoid a monetary penalty. Participants were asked to select the correct direction to reach a particular gate for each of 57 different screenshots with randomly selected goal locations. Possible direction choices included left, right, and none of the above. Participants were informed before beginning the trial that the goal location may not be mentioned on some of the signs. The none of the above option was added to prevent guessing based on a subset of the characters displayed on the sign. Finally, participants were asked to enter their basic demographics, including gender, year of birth, number of HITs submitted on AMT, and whether they wear corrective lenses, and to provide feedback on the experiment.

Participants

Of the AMT participants who reported their gender, 54.31% chose male, and 45.69% chose female. The age of participants ranged from 18 to 64 years old. A total of 171 participants’ data were collected for Experiment 2. The data from 20 participants were discarded because of guessing, premature abandonment of the experiment, or the performance of zoom operations.

Results

The relationship between decision-making confidence and visibility confidence is plotted in Fig. 5a. At 100% sign visibility, the maximum decision-making confidence was 88%. Using the data from both experiments, the conditional entropy of decision-making given sign visibility was computed for each grid cell (see the “Quantifying Wayfinding Information in a Sign” section).

Fig. 5
figure 5

Results of Experiment 2. a Scatterplot of the significant relationship between sign visibility and decision-making confidence. Even with a 100% visible sign, some participants decided incorrectly. b Density map visualization of wayfinding decision entropy from each grid cell within a sign’s VCA from Experiment 2. Grid cells in a warmer color (red) represent high entropy/uncertainty and grid cells with a colder color (blue) represent low entropy/uncertainty

Simple linear regression in R was used to predict decision-making confidence given the visibility confidence of a sign. Here, R2 represents the fit of the linear regression model, yint represents the value at which the line intersects the y-axis, Bvis represents the slope of the line for the relationship between decision-making confidence and sign visibility, pB represents the probability of obtaining this coefficient or anything more extreme by chance, and the 95% CIB represents an estimated range of values that is likely to contain the true slope. This regression revealed a significant relationship between sign visibility and decision-making confidence, R2 = 0.161, yint = 40.74, Bvis = 0.47, pB = 0.00112, 95% CIB = [0.195, 0.741].

We computed wayfinding decision entropy conditioned on sign visibility using Eq. 4 for each grid cell inside a sign’s VCA. In Fig. 5b, we visualize decision entropy as a density map inside a sign’s VCA.
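To make this pipeline concrete, the sketch below chains the fitted regression to a decision entropy; it assumes a binary correct/incorrect decision outcome for Eq. 2, which is our simplification of the model’s noise term in Eq. 4. Under that assumption, the regression’s predicted confidence at 100% visibility (approximately 88%) yields roughly the 0.54-bit threshold used in the simulations reported later.

```python
import math

def decision_confidence(visibility_pct):
    """Predicted decision-making confidence (%) from the fitted regression:
    confidence = 40.74 + 0.47 * visibility."""
    return 40.74 + 0.47 * visibility_pct

def decision_entropy(confidence_pct):
    """Shannon entropy (bits) of the decision, assuming a binary
    correct/incorrect outcome (our simplifying assumption); requires
    0 < confidence < 100."""
    p = confidence_pct / 100.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

print(round(decision_entropy(decision_confidence(100.0)), 2))  # ~0.54 bits
```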

Information-Theoretic Model of Agent Wayfinding

In this section, we propose an agent-signage interaction model that is grounded in spatial decision-making with a signage system. Experiment 1 highlighted the continuous relationship between distance/angle and a sign’s visibility. Experiment 2 revealed a linear relationship between decision-making confidence and sign visibility. The proposed model was informed by data from these two experiments.

During a wayfinding task, agents search for cues that can help them navigate their surroundings. In the presence of a relevant sign, agents can pick up the wayfinding information it affords. The agent’s interaction with a sign can be divided into a series of phases, including searching for a sign, detecting a sign, approaching a sign, gaining information from a sign, decision-making, and acting on the decision. Figure 6 presents the steps involved in the proposed signage-based agent wayfinding system. Below, we describe these phases in detail.

Fig. 6
figure 6

Overview of the agent-signage interaction model. Each box represents a particular state of the agent, and the arrows represent transitions between states. Emphasis is placed on the information gain phase during which the agent moves down an uncertainty gradient towards grid cells with higher decision-making confidence
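The phase structure in Fig. 6 can be summarized as a finite-state machine; the sketch below encodes the transitions described in the following subsections, with the state and condition labels paraphrased from the text.

```python
from enum import Enum, auto

class Phase(Enum):
    EXPLORATION = auto()
    DECISION_NODE = auto()
    SIGNAGE_DISCOVERY = auto()
    INFORMATION_GAIN = auto()
    EXECUTE_SIGNAGE = auto()
    DISORIENTATION = auto()
    FAIL_SAFE = auto()

# Transitions described in the subsections below (condition -> next phase)
TRANSITIONS = {
    Phase.EXPLORATION: [("at a decision point without a relevant sign", Phase.DECISION_NODE),
                        ("entered one or more signs' VCAs", Phase.SIGNAGE_DISCOVERY),
                        ("walked > 2x the inter-sign distance", Phase.DISORIENTATION)],
    Phase.DECISION_NODE: [("random direction chosen", Phase.EXPLORATION)],
    Phase.SIGNAGE_DISCOVERY: [("correct sign found with some confidence", Phase.INFORMATION_GAIN),
                              ("no sign lists the goal", Phase.EXPLORATION)],
    Phase.INFORMATION_GAIN: [("entropy <= threshold TE", Phase.EXECUTE_SIGNAGE)],
    Phase.EXECUTE_SIGNAGE: [("final goal visible within its VCA", None),  # task complete
                            ("otherwise", Phase.EXPLORATION)],
    Phase.DISORIENTATION: [("walked > 4x the inter-sign distance", Phase.FAIL_SAFE)],
}
```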

Exploration Phase

The interaction of an agent with a sign starts when the agent first notices its presence. An agent may be outside of the sign’s VCA and unable to perceive what is written on the sign. However, the agent can still see and walk towards the signboard in the absence of any intermediate signs.

Decision Node Phase

While exploring, if an agent approaches a decision point (e.g., a location with a potential change in direction) and has not found the relevant sign, the exploration phase changes to the decision node phase. In the decision node phase, a random directional decision is made based on the number of directional choices possible at that intersection. The agent then moves in the chosen direction, and the agent switches back to the exploration phase.

Signage Discovery Phase

When an agent finds a sign or a set of signs and is within one or more signs’ VCAs, the agent state is changed from the exploration phase to the signage discovery phase. This is shown as location B in Fig. 7a. Detected signs are added to a list, and the nearest sign is selected to be approached first. The list is iterated from nearest to farthest sign, and the iteration stops when either the agent has found a correct sign with some confidence or none of the signs has the correct goal information. In the former scenario, the signage discovery phase is changed to the information gain phase. In the latter scenario, the agent state returns to the exploration phase.

Fig. 7
figure 7

Detailed diagram of the information gain phase. a Overview of an agent’s interaction with a sign. b Zoomed-in view of nine grid cells within a sign’s VCA. The agent can move to any of the neighboring grid cells indicated by the arrows. X is the entropy of perceiving the sign from that grid cell, and D is the distance (in meters) between the grid cell and the sign. Movement towards the next grid cell is determined by searching the neighboring grid cells (1–8) from the current agent location at grid 5. c Zoomed-out view of a larger section of a sign’s VCA. Grid cells with darker shades of green indicate higher entropy/uncertainty. An example agent’s path is shown by the curve marked with orange triangles 1 to 7

Information Gain Phase

During a wayfinding task, an agent has to choose a particular route from several possible route options. Before making a decision, humans intuitively reduce uncertainty as much as possible. The information gain phase captures this process of reducing uncertainty in a computational model. The process of information gain begins when an agent is within the VCA of a sign. In Fig. 5b, we present the decision-making entropy at individual grid cells. Notably, there is a gradual reduction in decision-making entropy with the concurrent reduction of distance and relative angle from a sign.

In Fig. 7, we demonstrate this process in detail. This phase begins when an agent enters the VCA of a sign. Once inside the VCA, the agent moves to the adjacent grid cell with the lowest entropy value among all neighboring cells. If all the neighboring cells have the same entropy value, then the agent moves to the grid cell nearest to the sign. This information gain process continues until the agent crosses a low entropy threshold. Once an agent reaches a grid cell at or below this threshold, the agent acts on this information by switching to the execute signage phase.
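A minimal sketch of this greedy descent follows, assuming entropy, distance, and neighbor lookups over the grid map (these accessors are illustrative, and the visited-set guard is our addition to guarantee termination):

```python
def information_gain_path(start, entropy, dist_to_sign, neighbors, te=0.54):
    """Greedy descent along the entropy gradient inside a sign's VCA.

    start: (row, col) grid cell; entropy(cell) -> bits; dist_to_sign(cell) -> m;
    neighbors(cell) -> adjacent walkable cells; te: threshold entropy (bits).
    Returns the visited cells, ending where the agent acts on the sign.
    """
    path, visited, cell = [start], {start}, start
    while entropy(cell) > te:
        # Lowest-entropy neighbor; ties broken by proximity to the sign
        nxt = min(neighbors(cell), key=lambda c: (entropy(c), dist_to_sign(c)))
        if entropy(nxt) > entropy(cell) or nxt in visited:
            break  # no improving, unvisited neighbor; act on current information
        path.append(nxt)
        visited.add(nxt)
        cell = nxt
    return path
```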

Two important parameters for the information gain phase are the Lookahead Distance (LD) and the Threshold Entropy (TE). LD is the number of grid cells between an agent’s current location and the grid cells being considered during the information gain phase. A smaller LD would result in constant jitter in the approach towards the sign, and a larger LD would result in a smoother approach towards the sign. Both extremes are unrealistic, so we conducted a virtual reality (VR) experiment in order to determine the ideal value (see the “VR-based Experimental Optimization and Evaluation of Agent Wayfinding Model” section). TE is the value below which the agent decides to act on the information gained from a sign, thus marking the end of the information gain phase. This parameter was held at 0.54 bits for all reported simulations.

Execute Signage Phase

In this phase, the agent acts on the decision by walking in the direction indicated by the sign (see location E in Fig. 7a). This phase ends once an agent sees the final goal and is within its VCA. Otherwise, the agent state changes back to the exploration phase after the execute signage phase, and the wayfinding process continues.

Disorientation Phase

The disorientation phase begins when the agent cannot find a sign that indicates the correct goal. Specifically, this phase starts when an agent has walked more than two times the inter-sign distance (e.g., 2 × 40 m for the simulations below). This parameter may be adjusted to match different wayfinding scenarios.

Fail-Safe Phase

This phase is reached when an agent attempts to explore the environment and cannot locate a valid sign. The disorientation phase changes to the fail-safe phase after the agent has walked more than four times the inter-sign distance (e.g., 4 × 40 m for the simulations below). Here, the agent abandons the search for its goal and continues with the next task on its task list. When there are no further tasks to be performed, the agent exits the building via the nearest available known exit.

VR-based Experimental Optimization and Evaluation of Agent Wayfinding Model

The primary aims of the VR experiment were to inform the parametric design of an agent-signage interaction model and to understand the human-signage interaction process. We conducted this VR experiment with 40 participants (tested individually). Participants were asked to locate a goal using a sequence of two signs. For each of nine trials, walking trajectories were collected from the participants during the wayfinding task. The resulting dataset was randomly divided into two groups of 30 and 10 participants. We used the larger group (30 participants) to compute the two identified parameters as discussed in the “Information-Theoretic Model of Agent Wayfinding” section. The second group of 10 participants was used to validate the revised agent-signage interaction framework.

Design

A Manhattan-style 3D grid network (see Fig. 8) was created using Autodesk Revit [68] and then imported into Unity 3D. The 3D environment included two directional signs, each at one of three possible locations. This resulted in nine different combinations of signs. The three possible sign locations were chosen systematically and placed at the top-center (S1/S4/S9/S12), center-center (S2/S5/S8/S11), and bottom-center (S3/S6/S7/S10) of the four intersections as shown in Fig. 8. For each trial, only a pair of directional signs was shown. The 3D environment also had a few foil signs (i.e., signs that indicated the direction of non-targets) and a destination sign at each possible destination. The designs of these signs were based on the signs used in Experiment 2. Participants were tested with a desktop computer with a mouse-and-keyboard control interface. The computer used for the experiment was a custom-built Lenovo PC running Windows 10 with a 3.4-GHz Intel Core i7 processor and 16 GB of RAM. The computer was connected to a 28-inch Samsung monitor with a resolution of 3840 × 2160 pixels. A standard Bluetooth-enabled Logitech keyboard and mouse were used as input peripherals.

Fig. 8
figure 8

3D environment used for the VR experiment. S1 to S12 indicate possible sign locations that were used for the experiment. D1 to D6 indicate possible target goal locations for the participant’s wayfinding task

Procedure

Before starting the experiment, participants were briefed about the VR setup, their rights as participants, and the experimental tasks. Participants sat in a chair positioned at the center of the screen at a distance of 1.5 m. With an interactive video, participants were then trained on the usage of the mouse-and-keyboard control interface for navigating in the virtual environment. For every trial, participants were randomly assigned a start location from the nine starting points (see Fig. 8). Participants were then shown instructions for the wayfinding task in which they searched for a particular destination (e.g., “Gate C2”). Each participant completed 9 trials, each containing two directional signs. The text on the directional signs and the sign locations were randomly varied for every trial. The trial ended when either the participant successfully reached the vicinity of the destination or made an incorrect turn. Participants were informed beforehand that the task might end abruptly and that this was unrelated to their performance. We recorded participants’ trajectories and the time required to complete each trial.

Participants

A total of 40 people (15 men and 25 women) participated in the VR experiment. All participants were students from the National University of Singapore (NUS). No other eligibility criteria were set. Participants’ age ranged from 19 to 31 years old (Mean = 23.3, Standard Deviation = 2.37). The self-reported rating of their wayfinding ability on a 100-point rating scale ranged from 60 to 100 units (Mean = 84.75, Standard Deviation = 12.40).

Results

We collected 360 trajectories in total, 15 of which were towards an incorrect destination (see Fig. 9). We selected all of the trajectories from 30 randomly selected participants and compared them with 30 simulated agents. The trajectories for the simulated agents were based on the agent-signage interaction model as discussed in the “Information-Theoretic Model of Agent Wayfinding” section. The default value for LD was 2, and the default value for TE was 0.54 bits (i.e., entropy value computed from the 88.2% decision-making confidence obtained in Experiment 2). A small delta value of 0.01 was introduced to generate variation in the agents’ trajectories. Incorrect trajectories were not included in the computation of these parameters. These errors may be attributable to the front directional arrows shown on both trials 5 and 8 instead of left/right directional arrows. These front directional arrows may have been confusing because they were less common (2 of 9 trials).

Fig. 9
figure 9

a–i Trajectory visualization for 30 participants (black) and 30 simulated agents (blue) over 9 trials with different combinations of sign pairs. The look ahead distance was set to 2 grid cells (2 m). Notably, we observe incorrect decision-making by some participants in trials 4, 5, and 8

Participants’ and agents’ trajectories differed during the information gain phase, but the grid cells at which the decisions were executed were similar for both participants and agents. Most participants appeared to walk in a straight line along approximately the shortest unobstructed path in the direction of the sign, assuming the sign was visible, in order to minimize travel time. The agents’ trajectories were straight outside the sign’s VCA. Once an agent entered a sign’s VCA, its trajectory changed as the agent walked towards the immediately adjacent grid location. This lookahead information gain continued until the agent was in direct alignment with the sign under consideration. Afterwards, the trajectory was mostly straight until the agent decided to act on the information apprehended from the sign. This location was determined based on the TE value of 0.54 bits. On average, the grid cells at which the decisions were executed fell within the 80–100% sign visibility range.

Participants’ trajectories demonstrated that, once a sign was detected and visible, people tended not to make an immediate decision and navigate towards the next sub-goal location. Instead, they tended to approach the sign in order to gain more wayfinding information and improve decision-making confidence inside the sign’s VCA before acting on it. In the absence of distractions such as other virtual agents, elaborate textures on the wall, shop fronts, or landmarks, decision-making error was negligible (2.8%, or 20 out of 720 decisions across 9 trials).

Parameter Computation

To improve the proposed agent-signage interaction model, we parametrically computed the value of LD and TE.

Lookahead Distance

Variation in an agent’s trajectories with different values for LD (ranging from 2 to 32) for one example trial is shown in Fig. 10a. In order to compare the differences between participants’ and agents’ trajectories, we used dynamic time warping (DTW) [69]. DTW is an algorithm for calculating the similarity between two trajectory sequences that may vary in time or speed. For example, similarities in walking patterns can be detected using DTW. Figure 10b plots the DTW distances between the mean of 30 participants’ trajectories and the mean of 30 simulated agents’ trajectories for the same trial. As LD increases, the mean agent trajectory straightens and approaches the mean participant trajectory (see Fig. 10a). This is captured by the reduction in DTW distance between the two mean trajectories, indicating greater similarity between the two mean paths. At an LD of 28 and above, an agent’s trajectory stops changing, and the DTW distance becomes constant (see Fig. 10b).
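A standard DTW distance between two trajectories can be computed with the dynamic-programming sketch below, using the Euclidean distance between points as the local cost (the exact DTW variant used in the analysis is not specified here).

```python
import math

def dtw_distance(traj_a, traj_b):
    """Dynamic time warping distance between two trajectories, each a list
    of (x, y) points, with Euclidean point-to-point cost."""
    n, m = len(traj_a), len(traj_b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(traj_a[i - 1], traj_b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

# Example: compare a participant's mean trajectory with an agent's
print(dtw_distance([(0, 0), (1, 0), (2, 1)], [(0, 0), (1, 1), (2, 1)]))
```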

Fig. 10
figure 10

Visualizations of the relationship between lookahead distance and the similarity between participants’ and agents’ trajectories. a Visualization of an agent’s trajectory data (in black) with different values for lookahead distance ranging from 2 (right-most curve) to 32 (left-most curve) in increments of two for one example trial. The red trajectory is the mean of 30 participants’ trajectories. b A graph depicting the relationship between lookahead step size and DTW distance between participants’ mean trajectories for trial 1 and the corresponding trajectory from an agent

These results suggest that participants conducted a lookahead search of 28 m along an unobstructed path towards a sign and approached the sign in a straight line to minimize walking distance. Notably, a zig-zag pattern would have led to a location with higher sign visibility.

Threshold Entropy

Threshold entropy is a critical parameter in the agent-signage interaction model because grid cells with this value (or lower) cause agents to execute the sign’s instructions. From participants’ trajectories, we observed that this occurs at grid cells with sign visibility in the range of 80 to 100%. A histogram of TE values (extracted from 30 participants’ data) is plotted in Fig. 11. The majority of participants’ decision-making entropy values fall into two bins (associated with the 90–95% and 95–100% sign visibility ranges). Instead of computing the mean of all the decision-making entropy values and using this mean as one default value, the agent’s decision entropy was sampled from the distribution provided by the histogram for each agent-signage interaction during the wayfinding process.

Fig. 11
figure 11

Histogram of the decision-making entropy values computed from participants’ observed trajectories
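Sampling the threshold entropy from the empirical histogram, rather than fixing it to a single mean, can be sketched as follows; the bin values and counts below are placeholders for the distribution in Fig. 11.

```python
import random

def sample_te(bin_values, bin_counts):
    """Draw one threshold-entropy value from the empirical histogram of
    participants' decision-making entropies (Fig. 11). bin_values are
    representative entropy values per bin; bin_counts are their frequencies."""
    return random.choices(bin_values, weights=bin_counts, k=1)[0]

# Placeholder bins: most of the mass lies in the two bins associated with
# the 90-95% and 95-100% sign visibility ranges, per the observed histogram
te = sample_te([0.40, 0.47, 0.54, 0.61], [3, 5, 12, 10])
```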

Simulation Results and Validation

We updated the agent-signage interaction model with the newly computed LD value of 28 and with TE values sampled from the distribution visualized in Fig. 11. This refined agent-signage interaction model was used to generate 10 new trajectories. The untouched test dataset from 10 participants was used for comparison and validation. The mean trajectory data for 10 participants and 10 simulated agents after refinement of the interaction model for nine trials are shown in Fig. 12. Visually, the trajectories generated by the refined agent model exhibited variation similar to the participants’ data with respect to both the trajectories and the decision points. The decision points from agents’ trajectories mimic the variability in and closeness to the decision points extracted from the participants’ trajectories. Moreover, we observed a reduction of 38.76% in DTW distance between the means of both trajectories before and after refinement over nine trials (see Table 1). According to a (nonparametric) Wilcoxon signed-rank test, the difference between DTW distances before and after refinement was significant, Z = 2.67, p = .008.

Fig. 12
figure 12

a–i Mean trajectories for 10 participants (black) and 10 simulated agents (blue) over 9 trials with different combinations of sign pairs after informing the agent-signage interaction model with the refined parameters

Table 1 DTW distance (in meters) between the mean trajectories for 9 trials before and after parameter refinement

Conclusion and Future Work

In the present paper, we have proposed an information-theoretic approach to modeling agent-signage interaction, conducted two crowd-sourcing experiments that informed the computation of a sign’s visibility and an agent’s decision-making confidence, and conducted a VR experiment in order to refine and validate our proposed model. Our biologically inspired agent-signage interaction model allows for greater flexibility by adding different types of noise with respect to the environment (e.g., layout complexity, crowds, and other distractions), signage (e.g., multiple information clusters, visual salience), and agents (e.g., attention, reasoning, memory) because of the model’s foundation in information theory. In general, this model capitalizes on the advantages of information theory for representing uncertainty in biological, cognitive systems. Our model was motivated by the concept of VCA from Xie, Filippidis, and colleagues [41,42,43, 45] but elaborates on the relationship between a sign’s visibility and the relative angle and distance of the agent from the sign. Specifically, we empirically computed two distributions that highlight the continuous relationship between distance/angle and the sign’s visibility.

These empirical distributions were generated using two online crowd-sourcing experiments in which participants judged the visibility of a sign from different distances/angles (Experiment 1) and chose a direction (left, right, or none of the above) given signs at different distances/angles (Experiment 2). Together, these experiments revealed that decision-making confidence was linearly related to the visibility of a sign but that a visible sign did not always lead to a correct decision. Using these data, conditional entropy (i.e., decision-making confidence conditioned on the sign’s visibility) was then used to inform an initial configuration for the agent-signage interaction model, which was later refined using an experiment in VR.

The primary purpose of the VR experiment was to inform two critical parameters (i.e., LD and TE) that improved the realism of the agents’ wayfinding behavior. Specifically, these parameters determined the extent to which agents directly approached a visible sign and the amount of information required for the agent to make a decision. Refining these parameters led to more realistic agent-signage interactions.

Information-theoretic approaches to problems involving one or two variables are well understood and widely used [70], but the investigation of any complex system would be insufficient if we restricted ourselves to only one or two variables. Quantifying the information shared among more than two variables remains largely unsolved. Several multivariate information measures have been introduced to analyze the relationships and interactions between two or more variables [71]. However, the results generated by different multivariate information measures often differ significantly. Future work should extend our framework to incorporate three or more random variables.

This work in relatively simple virtual environments may also be extended to complex and/or real environments by investigating the influence of other quantitative variables (e.g., spatial layout, crowd dynamics, spatial memory retrieval) on agents’ decision-making confidence. These other variables should first be studied in isolation and then combined to understand their synergy and redundancy. These information-theoretic concepts can then be used to create an uncertain data fusion prediction model [72] for agent wayfinding.