FlexiSee: flexible configuration, customization, and control of mediated and augmented vision for users of smart eyewear devices

Smart eyewear and Augmented Reality technology have been examined closely in the scientific community to provide vision rehabilitation to people with visual impairments as well as augmented vision to people with and without visual impairments for various application scenarios and contexts of use. However, current systems lack flexibility in the configuration and customization of the features and functionalities they present to their users, as we show in this paper by means of a thorough literature review and categorization of prior work on augmented and mediated vision for smart eyewear devices. To address the flexibility aspect that has been missing in prior work, we introduce FlexiSee, an application for smart eyewear devices, such as see-through Augmented Reality glasses and Head-Mounted Mixed Reality Displays, specifically designed to enable flexible configuration, customization, and control of both augmented and mediated vision. FlexiSee achieves these desiderata by implementing visual filters (e.g., color correction, edge highlighting, contrast adjustment, and others) coupled with a web-based interface, readily accessible from smartphones, tablets, smartwatches, and other devices with web browsers, where authorized users can specify and apply custom parameters for the visual filters implemented by FlexiSee. We also introduce FlexiSee-DS, a three-dimensional design space for FlexiSee-like applications, which includes the mediation & augmentation, user categories, and control design dimensions to specify a variety of FlexiSee-like systems. We show how the dimensions of FlexiSee-DS were applied to inform the design of our FlexiSee system, and we highlight the distinction between primary users and vision monitors and assistants, where the latter two categories represent new types of users for augmented and mediated vision that have various degrees of control, from their remote locations, over the visual reality delivered to and perceived by the primary users of smart eyewear devices. We conducted a user study to understand the perception of vision monitors and assistants regarding our new FlexiSee concept and system, and we report empirical results about usability aspects (e.g., we found an average SUS score of 75.3 and high ratings for the perceived usefulness of FlexiSee) as well as user feedback and suggestions to inform further developments of FlexiSee-like systems and applications.


Introduction
Smart eyewear devices, in the form of Augmented Reality (AR) smartglasses and Head-Mounted Displays (HMDs) for Mixed Reality (MR) and Extended Reality (XR) applications, represent a convenient category of wearable devices for implementing vision assistance and vision rehabilitation for people with visual impairments. Examples of applications from the scientific literature include contrast enhancement, edge highlighting, magnification, and text and sign reading [28,67,70,78,79], while commercial eyewear devices, such as eSight [17] and OrCam [50] among others, implement a variety of vision mediation features to assist people with various visual disorders, including glaucoma and age-related macular degeneration, as well as people who are blind. Recently, the XR Access initiative [74] has been consolidating a community of researchers and practitioners around the common goal of making Virtual, Augmented, and Mixed Reality (all referred to under the term XR [38]) more accessible to people with various disabilities, including people with visual impairments [6,71,75,76]. However, possible applications of smart eyewear devices go beyond vision rehabilitation to address users without visual impairments by delivering more accurate visual perception in low ambient light conditions [47], an increased field of view [19,57], better peripheral vision [14], or new perceptions of phenomena occurring in regions of the electromagnetic spectrum beyond visible light [1-3,37]. Commercial AR and MR eyewear devices, such as Vuzix Blade [72], MagicLeap [34], or HoloLens [41], enable vision augmentation with computer-generated graphical content.
Research and development targeting assisted vision and vision rehabilitation have attracted considerable interest and effort from the scientific and practitioner communities. However, existing systems are not always configurable or customizable and, where they are, customization is usually limited to a few features. For example, Zhao et al. [79] reported that participants trying out their CueSee system expressed interest in customizing the color used to highlight objects of interest, while the ChromaGlasses prototype of Langlotz et al. [31] enabled users to select a custom shift in the RGB space to compensate for various types of color blindness. Other works discussed customization as a desirable feature of their systems or as a conclusion of their investigation that was left for future work [40,68]. In the recent context of making AR, MR, and VR devices and applications more accessible and inclusive [74], enabling flexibility in controlling mediated and augmented vision becomes a relevant feature.
The flexibility aspect of augmented and mediated vision applications running on smart eyewear devices is especially relevant for designing vision rehabilitation devices and systems that are suited to and/or can adapt to the specific visual abilities of their users, i.e., ability-based design [73], since visual needs and preferences for vision mediation and augmentation vary among users [58,62,63]. For example, the configuration of visual filters and the customization of their parameters, such as the amount of edge highlighting applied to accentuate the edges of nearby objects, performed by the primary user of an eyewear device can be complemented by the specialized intervention of a medical professional who has access to the primary user's field of mediated vision and, thus, can tune the parameters of the various visual filters from their remote location toward more effective vision rehabilitation. In this work, we address flexibility in presenting mediated and augmented vision to smart eyewear users on several dimensions: (1) flexibility in controlling augmentation and mediation, (2) flexibility in terms of the categories of users involved in the mediation and augmentation process, and (3) flexibility in terms of the input device and modality to control augmented and mediated vision. Our multi-faceted approach to designing for flexibility in augmented and mediated vision systems and applications makes both theoretical (i.e., a design space to inform such systems) and practical contributions (e.g., a working system with open-source code). The contributions of our work are as follows:

1. We introduce FlexiSee-DS, a three-dimensional design space for flexible configuration, customization, and control of mediated and augmented vision for smart eyewear users, including users with visual impairments. We distinguish among primary users, vision monitors, and vision assistants as three distinct user roles for FlexiSee-like applications, and between various types of visual augmentation and mediation, such as predefined, customizable, adaptive, and configurable features.

2. To demonstrate the FlexiSee-DS design space, we present FlexiSee, a highly customizable and configurable vision mediation and augmentation application designed for the Microsoft HoloLens HMD [41]; see Fig. 1. We present the engineering details and technical implementation of FlexiSee and, in order to encourage further exploration of our concepts toward more development of assistive technology for users with visual impairments, we make the source code of our HoloLens FlexiSee application freely available to the research community.

Related work
We discuss in this section prior work on Mediated, Augmented, and Mixed Reality applications for smart eyewear devices, focusing on prototypes designed for users with visual impairments. We start our discussion by highlighting the distinction between mediated and augmented vision, the two concepts that we address in the FlexiSee-DS design space and implement in our FlexiSee application. Throughout this paper, we use the term smart eyewear to refer to devices worn at eye level that incorporate a video camera and feature see-through lenses and Wi-Fi connectivity.

Mediated vs. augmented vision
Following previous work on Augmented, Mixed, and Mediated Reality [7,8,10,35,37,42-44,65], we distinguish between augmented and mediated vision. By augmented vision, we understand the use of AR and MR technology to render digital content on top of the visual reality for users of smart eyewear devices. For example, highlighting human faces in live video by means of a red rectangle and displaying next to each detected face the name and other information about the identified person represents a case of augmented vision.
Fig. 1 Top: Screen capture of the FlexiSee application for HoloLens presenting the user with a custom mediation of visual reality, represented in this example by color correction and highlighted edges. Bottom: this specific type of mediated vision is specified via a web user interface accessible from any web browser, such as web browsers running on smartphones (left) or smartwatches (middle), and can be controlled either by the primary user (top right) or by a vision assistant from a remote location (bottom right) on behalf of the primary user. FlexiSee features flexibility in terms of (i) configuration and customization of visual mediation and augmentation filters, (ii) user roles that are involved in specifying and controlling the visual filters, and (iii) input modalities, e.g., using the eyewear or an external device

By mediated vision, we understand any modification of the visual reality obtained by applying computer vision algorithms to the video frames captured by the video camera embedded in the smart eyewear device. For example, adjusting the contrast or highlighting the contours of the objects detected in the video delivered through HMDs represents an instance of mediated vision. The distinction between augmented and mediated vision is important since augmented vision brings new information into the user's field of view, while mediated vision enhances the information already present in the visual reality. Moreover, mediated vision can be used to filter out selected, unwanted information, creating an antonymy with respect to AR, such as in the form of Diminished Reality. For example, according to Mann [35], "Mediated Reality ... differs from virtual reality (or augmented reality) in the sense that it allows us to filter out things we do not wish to have thrust upon us against our will" and, respectively, "Mediated Reality goes a step further [with respect to VR/AR/MR] by mixing/blending and also modifying reality" [37] (p. 1). It is also worth mentioning that augmentation and mediation of the visual reality can occur independently and simultaneously, such as when several visual effects are superimposed over the contrast-enhanced video capture of the visual reality to highlight the presence and location of specific objects of interest, facilitating visual search tasks [79] while improving overall visual perception of the surrounding physical world [35,78]. In this case, the result can be referred to as Augmented Mediated Reality or, for short, Augmediated Reality [29,37,64].

Fig. 2 A visual diagram of mediated and augmented vision and the corresponding concepts of multiplexing [52] and multiplicities [83], illustrated as instances of Augmented and Mediated Reality, which are subsets of Mixed [43] and Multimediated Reality [37], respectively
Other concepts from the scientific literature are equally relevant for our discussion regarding the mediation and augmentation of human vision using smart eyewear devices. For example, Zolyomi et al. [83] defined "multiplicities of vision" as technology-mediated sight that is a form of skilled vision, neither fully human nor fully digital, but rather "continuously assembled through a combination of social and technical affordances" (p. 220). Furthermore, Peli et al. [54] proposed "vision multiplexing" for people with visual impairments, representing the superpositioning of contour images over the natural view of a scene "to avoid or reduce [...] limitations [of other approaches] by combining both the wide field-of-view and the high-resolution capabilities in devices in ways that permit these functionalities to be both separable and useful" (p. 366). Figure 2 presents a visual illustration of these concepts. Following the multiplicity perspective of Zolyomi et al. [83] and the multiplexing concept of Peli et al. [54], we discuss and implement in this work sets of visual filters that, when applied in a specified order, progressively mediate and augment the visual perception of users of smart eyewear devices.

Smart eyewear applications for users with visual impairments
Prior work has proposed and evaluated a variety of applications for smart eyewear devices, such as video camera glasses, AR smartglasses, and MR HMDs. According to Coughlan and Miele [15], AR applications for users with visual impairments, abbreviated AR4VI, can be divided into two categories: global applications that augment the physical world in the user's proximity, and local applications that augment physical objects that the user can touch and explore. In this section, we overview such applications and focus on the type of vision augmentation and mediation that they implement. Before that, we briefly discuss studies that examined the needs of users with visual impairments for assistive technology.
An important preliminary stage in the process of designing assistive technology that is relevant and useful is understanding the challenges experienced by people with visual impairments in using technology in general, as well as their needs for vision augmentation and mediation with AR, MR, and VR devices in particular. By adopting interviews as the methodology of choice to understand potential users' needs for vision augmentation via smartglasses, Sandnes [60] reported that face and text recognition were the most important features that people with visual impairments, participants in their study, sought in smartglasses applications. Brady et al. [12] documented visual challenges experienced in everyday life by people who are blind by conducting a large-scale study with over 5,000 participants and 40,000 questions regarding the content of photographs captured by blind people. The authors created a taxonomy of questions and highlighted various categories, such as "what color is this shirt?" or "what does this say?," for which a social community (the VizWiz Social) could provide answers. Szpiro et al. [69] reported that the needs of people with low vision with regard to assistive technology are different from the needs of people who are blind, and highlighted the importance of designing technology for vision enhancement [68,69,78,79]. Rusu et al. [58] reported results from a lead-in study with five participants with low vision, in which they correlated psychological well-being evaluations, self-perceived efficiency in performing daily activities, and reported needs for eyewear technology to assist and augment visual abilities. The authors also suggested the use of models of human vision, e.g., from Marr et al. [39], to inform the design of mediated and augmented vision.
Next, we discuss systems and applications that were designed to provide vision augmentation and mediation to users with visual impairments. We organize the rest of this section according to the specific features, e.g., magnification, color correction, contour highlighting, etc., that these systems implemented. We also report on the degree of customization of the vision augmentation and mediation functionality featured in prior work.

Magnification, reading text and signs
Harper et al. [23] discussed head-mounted video magnification devices for the vision rehabilitation of people in need of assistance with reading, watching television, and independent travel. Huang et al. [26] proposed a sign-reading assistant implemented with the HoloLens HMD [41] that featured magnification and high-contrast fonts. Their system allowed users to point at a nearby text sign, such as "Staff Only" or "Rooms 327-330," after which the application displayed the text and read it out loud. Reading was also addressed by Stearns et al. [66], who employed magnification to assist people in reading printed text by means of a finger-worn camera, with results presented via the HoloLens HMD [41]. In a follow-up work, Stearns et al. [67] proposed an AR magnification tool in which the user captured a video frame with their smartphone, after which the image was magnified and displayed via HoloLens. The VizLens system of Guo et al. [21] is another example that relied on a mobile application to assist blind people in employing nearly any real-world interface via screen reading.

Independent mobility and wayfinding
Several systems and applications have been designed to assist with mobility and navigation. For example, Everingham et al. [18] proposed a mobility aid consisting of a video camera and a display unit that used a neural network classifier to identify objects in video. The video feed presented to the user was modified so that distinct colors would depict and highlight different objects. Hicks et al. [24] proposed a system to convey the distance to nearby objects by using brightness intensity values, e.g., objects closer to the user were shown in brighter colors. Mobility was also addressed by Zhao et al. [77], who implemented AR visualizations, delivered via the HoloLens HMD [41], to highlight stairs with color on the premise that stair navigation in unfamiliar environments can be challenging for people with low vision. Szpiro et al. [68] observed how their study participants with low vision performed wayfinding and shopping tasks in unfamiliar environments and reported that, although low vision aids were commercially available at the time of their study, participants mostly used their smartphones. However, while smartphones were found useful outdoors for wayfinding, they were also a source of frustration during shopping tasks. The authors concluded with the need for assistive technology to enhance visual information for users with low vision, rather than converting that information into other output modalities, such as audio or tactile feedback [68].

Contours and contrast enhancement
Specific systems have been designed for specific vision conditions and disorders. For example, magnification and edge enhancement have been considered for the vision rehabilitation of people with peripheral or central vision loss [53,54] to assist with visual search and collision detection tasks. People with tunnel vision (i.e., loss of peripheral vision with retention of central vision) experience collisions, may stumble, and encounter challenges in visual search tasks, for which edge enhancement represents a useful assistance tool [33]. Hwang and Peli [27] found that people with age-related macular degeneration, juvenile macular degeneration, glaucoma, and myopic degeneration preferred mild to moderate degrees of contour enhancement when watching TV or viewing images. Edge enhancement was also found to improve performance in visual search tasks on computer screens.

Color correction
Color vision deficiency or color blindness has been addressed by changing colors in the live video feed delivered to users, e.g., Melillo et al. [40] implemented changes in three color intervals corresponding to deuteranopia (red-green color blindness), protanopia (inability to perceive red light), and tritanopia (blue-yellow color blindness). Fuller and Sadovnik [20] developed a Google Glass application that classified colors for people with color blindness, and Tanuwidjaja et al. [70] proposed Chroma, a wearable Google Glass application that manipulated colors in AR to assist with color deficiencies. Experiments conducted by Zhao et al. [78] with ForeSee, a customizable head-mounted vision enhancement system for people with low vision, reported that the color red was challenging to distinguish for their study participants with low vision, while white and yellow performed the best. Moreover, most of the study participants reported that the color blue attracted their attention more when looking at objects due to its higher contrast, but they also found blue text difficult to read. ChromaGlasses [31] is another example of a system designed to replace critical colors in the video feed presented to colorblind users with more easily distinguished alternative colors.

Face recognition
Another challenge experienced in everyday life by people with low vision is recognizing other people [81], which negatively impacts their involvement in social activities. To address this aspect, Zhao et al. [81] reported findings on the use of a face recognition application (the Accessibility Bot, a research prototype built on top of Facebook Messenger that helped identify friends from Facebook pictures) outside laboratory conditions. While the application was appreciated as helpful, user experience was negatively affected by the low perceived accuracy and by difficulties in aiming the camera to capture the face of the nearby person.

Complex smart eyewear applications featuring multiple visual enhancement functions
A few applications have implemented several types of visual enhancements, which makes them flexible enough to be employed in a variety of use case scenarios and by a variety of users. For example, Zhao et al.'s [78] ForeSee system implemented several image processing algorithms (referred to as visual filters in this work), such as contrast enhancement, text extraction, black/white reversal, edge enhancement, and magnification. CueSee [79] was designed to assist with recognizing specific products by delivering corresponding cues to users, such as flashes, spotlights, movements, or sunrays, to identify those products more easily. After a formative study to understand the challenges that people with low vision experience with VR technology, Zhao et al. [75] designed SeeingVR, a set of low vision tools implementing several visual filters, such as magnification, brightness, contrast, edge enhancement, text augmentation, text-to-speech, depth measurement, object recognition, highlighting, and recoloring for VR applications addressing users with low vision.
Several recently available commercial eyewear devices implement many of the techniques discussed in this section for vision assistance. For instance, OrCam MyEye 2 [50] is a system designed for people who are blind or have visual impairments that implements computer vision algorithms for detecting, recognizing, and reading text, face recognition, general object recognition (e.g., banknotes), bar code reading, and color detection. Another example is eSight [17], an eyewear device designed to improve functional vision by means of contrast and brightness adjustment and magnification. During an investigation of eSight 2 with thirteen participants with visual impairments, Zolyomi et al. [83] documented the social and emotional impacts associated with assistive eyewear technology. Their results showed that assisted vision was not perceived as a cure or a replacement of fully functional sight, and was also not appropriate for all situations, but rather "a new type of sight that provided [participants] with an experience of the visual, if not the singular notion of vision that they might have held when they first heard about the device" (p. 226).

Video camera glasses, lifelogging, and abstracting life
Some smart eyewear devices were not designed to deliver augmentation or mediation of vision in real time, but rather to record and collect visual data that users could consult at a later time as retrospective memory aids [4,25,32,82]. These systems fall into the category of lifelogging applications [22]. For example, Aiordachioae [5] proposed a system to share first-person video, captured using smartglasses with embedded video cameras, with remote viewers. For such systems, there is no vision mediation or augmentation, but only video streaming to third parties, who can thus experience the visual perspective of the smartglasses user. Another example is Life-Tags [4], a smartglasses-based system and application for abstracting life in the form of clouds of tags and concepts, automatically extracted from video.

Summary
In this paper, we are interested in providing users with flexible configuration, customization, and control over the mediation and augmentation of visual reality delivered by see-through eyewear devices. To this end, an understanding of what "flexible" means for visual augmentation is paramount, which we address in this work by introducing a new design space, FlexiSee-DS, with several dimensions and design options. FlexiSee-DS enables the instantiation of several prior designs of visual augmentation systems from the literature. For example, from the body of work surveyed and discussed in this section, the work most connected to our goals is represented by the VR and AR systems and corresponding studies conducted by Zhao et al. [75,77-80] for smartglasses and HMDs and people with visual impairments. However, unlike this prior work, such as CueSee [79], ForeSee [78], and SeeingVR [75] (and also unlike other systems; see Table 1 for an overview of prior work on vision augmentation and mediation), our FlexiSee concept and system implements new categories of users (remote monitors and assistants), which opens new dimensions for designing complex systems for vision monitoring, assistance, and enhancement. For example, while vision monitors have access to the live video stream only, vision assistants can specify visual filters and apply those filters for the benefit of the primary users of FlexiSee. Moreover, unlike other systems (Table 1), FlexiSee can be controlled from a variety of devices by means of its mobile-first, responsive web-based user interface, including tablets, smartphones, and smartwatches (see Fig. 1 for illustrations). To better differentiate FlexiSee from prior work, but also to characterize the amount of flexibility presented by any system designed to augment and/or mediate vision, we introduce in the following a four-level categorization of flexibility in vision augmentation:

F1: The authors discuss the customization option for their systems, but do not implement it.
F2: The system is customizable by the system designer alone.
F3: The system can be customized by users, but only partially (only some features can be customized).
F4: The system is fully customizable by users.
For each paper that presented a functional prototype [18,23,31,40,66-68,75,76,78,79], we identified and extracted the features of that prototype that users could control and the extent to which those features could be controlled. Table 1 presents a summary of the prototypes overviewed in the previous subsections from the perspective of how much customization and flexibility they allow their users for controlling mediated and augmented vision (e.g., color correction, edge enhancement, or magnification), characterized by the levels F1 (no customization implemented) to F4 (fully customizable features). For example, Szpiro et al. [68] mentioned the need for customization in mobile systems designed for low vision to account for various preferences for visual enhancement, but did not pursue implementation (level F1) since it was not the main purpose of their study; flexible color correction was implemented by Everingham et al. [18] and Zhao et al. [76], who described systems enabling users to choose from a predefined set of saturation colors (level F3); and Langlotz et al. [31] integrated the option for users to select a custom shift in the RGB color space, making their visual filter fully customizable (F4); see Table 1 for more examples with corresponding quotes from the respective papers in order to precisely identify the options for flexibility. This overview of the related literature shows that customization of augmented and mediated vision has been considered in prior work and addressed at various levels, but without a formal systematization of the dimensions that are customizable, while customization has been restricted to visual filters only. In the next section, we formalize various dimensions of flexibility for augmented and mediated vision in the form of a design space, FlexiSee-DS, that can be used to inform and characterize the features of new prototypes and systems, such as our FlexiSee application.

The FlexiSee-DS design space and FlexiSee application
We describe in this section the design principles, software architecture, and technical implementation of our FlexiSee application. We also introduce the FlexiSee-DS design space that enumerates possible design options for a variety of FlexiSee-like applications.

Visual filters for mediated and augmented vision
We start by defining the notion of a visual filter employed by FlexiSee. A visual filter is any software-based modification of the video frames captured by the built-in video camera of the smart eyewear device that are rendered on the see-through lenses and aligned with the physical world; see Fig. 1, top from Section 1 for an example of a visual filter illustrating edge enhancement. Visual filters can implement either mediation (e.g., contrast adjustment or edge enhancement) or augmentation (e.g., detected faces are highlighted with a flashing rectangle around them); or, they can implement both mediation and augmentation for Augmediated Reality [37]. Using C++ language formalism and OpenCV [49] data structures, a visual filter is any implementation of a function that takes as input a video frame and outputs a modified version of it, as follows:

cv::Mat visualFilter(cv::Mat frame, int filterType, ...)

where cv::Mat is the OpenCV class for implementing n-dimensional dense arrays, which in our case are video frames represented by matrices of RGB pixels; filterType specifies the type of processing that is to be applied to the video frame (e.g., contrast adjustment); and ... indicates a number of optional parameters that may follow (e.g., the amount by which the contrast is adjusted). By defining a visual filter as a function that takes input and returns data of the same type (cv::Mat), applying multiple visual filters in a sequence becomes an easy task both conceptually and from a practical perspective. For example, a contrast adjustment visual filter can be followed by an edge enhancement filter, after which face detection results can be rendered on the contrast- and edge-enhanced video frame. Moreover, such a sequence of visual filters and their corresponding parameters can be specified using standard data representation and data-interchange formats, easy to understand and edit by users. In the next subsections, we present our technical implementation for sequences of visual filters, which are specified in the FlexiSee application in the form of JSON representations.
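As a concrete illustration of this design, the following minimal sketch, written against the OpenCV API, shows one possible realization of the visualFilter(...) prototype and of filter chaining; the filter type constants and the variadic parameter handling are our illustrative assumptions and not necessarily the exact FlexiSee implementation.

#include <opencv2/opencv.hpp>
#include <cstdarg>

// Illustrative filter type constants (assumed, not FlexiSee's actual values)
enum { FILTER_CONTRAST = 1, FILTER_EDGES = 2 };

// One possible realization of the visualFilter(...) prototype: the input
// frame enters, a modified copy leaves, so filters can be chained freely.
cv::Mat visualFilter(cv::Mat frame, int filterType, ...)
{
    cv::Mat result = frame.clone();
    va_list args;
    va_start(args, filterType);
    switch (filterType) {
    case FILTER_CONTRAST: {
        // Optional parameter: multiplicative scale, e.g., 1.3 for +30%
        double alpha = va_arg(args, double);
        frame.convertTo(result, -1, alpha, 0);
        break;
    }
    case FILTER_EDGES: {
        // Detect edges with Canny and overlay them in a distinct color
        cv::Mat gray, edges;
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::Canny(gray, edges, 100.0, 200.0);
        result.setTo(cv::Scalar(0, 255, 255), edges);
        break;
    }
    }
    va_end(args);
    return result;
}

// Because input and output share the same type, sequences compose
// naturally: contrast adjustment first, then edge enhancement on the
// contrast-adjusted frame.
cv::Mat augmediate(cv::Mat frame)
{
    frame = visualFilter(frame, FILTER_CONTRAST, 1.3);
    return visualFilter(frame, FILTER_EDGES);
}

The cv::Mat-in/cv::Mat-out contract is what makes configurability possible: any permutation of filters remains a valid pipeline.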

FlexiSee-DS, a design space for FlexiSee-type applications
Before presenting our FlexiSee application, we introduce the FlexiSee-DS design space and describe the steps that led us to the identification of its three design dimensions. In a first stage, we formulated design principles for the operation of the FlexiSee application in the form of three quality properties (Q1 to Q3, presented below) regarding the flexibility envisaged for customizing visual perception (i.e., the degree to which augmented and mediated vision are customizable), flexibility in terms of user categories and roles (i.e., who controls the visual filters?), and flexibility regarding the control modalities allowed by FlexiSee (i.e., how are visual filters specified, activated, and deactivated?), as follows:

Q1. Specification of visual filters: FlexiSee should enable easy specification of visual filters and of their corresponding parameters to address a wide range of usage scenarios and user categories, including users with low vision.

Q2. Control of visual filters: FlexiSee should enable both local and remote control of visual filters by both the wearer (primary user) and assistants (secondary users).

Q3. Integration with other personal smart devices: FlexiSee should easily integrate with other smart devices, such as smartphones and smartwatches, and with applications and services available on the web via standard web-based protocols.
In the second stage, we extended these quality properties into design dimensions for prototyping FlexiSee-like applications with various functionalities and for various contexts of use by identifying design options for each property. The result is the FlexiSee-DS design space with the following three dimensions (see Fig. 3 for a visual illustration):

1. The Mediation & Augmentation axis specifies possible ways in which users' visual perception may be enhanced by FlexiSee-like applications implementing visual filters on smart eyewear devices. For this dimension, we identify five categories or design options:

(a) Predefined mediation and augmentation visual filters that are built into the hardware/software of the smart eyewear device.

(b) Customizable visual filters, for which users can tune parameters, e.g., the level of contrast for a contrast enhancing filter or the colors that are shifted by a color changing or correction filter.

Fig. 3 The FlexiSee-DS design space, highlighting the three quality dimensions regarding flexible mediation and augmentation, user roles, and control modalities in the form of three independent axes with various design options. Notes: the origin specifies no augmentation, no control, and no user, e.g., regular eye glasses when they are not used. The default point near the origin of the FlexiSee-DS space specifies a system with predefined augmentations that can be controlled by the primary user and via the system exclusively. The most flexible instantiation of a visual perception enhancement system in this space would be anticipative augmediation with mixed control performed by a mixed category of users

(c) Adaptive visual filters tune their parameters automatically based on data collected by sensing and understanding the context of use, e.g., a contrast adjustment filter that adapts to low ambient lighting conditions according to the measurements provided by a light sensor embedded in the eyewear device would fall into this category.

(d) Configurable visual filters, for which users can define new functionality, e.g., by combining multiple filters that, when applied in a specific order, can generate new types of augmediated vision. For example, a color correction filter followed by an edge enhancement filter may lead to a different result compared to the case when the order of the two filters is reversed. Sequences of visual filters that users can specify by themselves achieve the property of configurability, which subsumes customizability.

(e) Anticipative visual filters, where built-in Artificial Intelligence models and algorithms use data (e.g., user's settings and preferences, user profile, logs and usage history of the device and application, etc.) to anticipate needs and to perform corresponding adjustments of the visual filters, including recommendations provided to users.
Note the increasing level of complexity of the way in which mediated and augmented vision can be specified, from predefined filters that can be modified solely via software or hardware updates to an anticipative behavior of the smart eyewear device, reflective of the characteristics of systems and applications pertaining to Ambient Intelligence [16,59] and semantic Ambient Media [55,56] use case scenarios. As we showed in Section 2, customization of augmented and mediated vision has been implemented to a limited extent by the systems introduced in prior work (see Table 1), but we could not find any cases of adaptivity (i.e., automatic customization) or configurability (i.e., repurposing existing application features to define new functionality), let alone anticipatory behavior.

2. The Users axis specifies the various types of users that are involved in customizing or controlling augmented and mediated vision. This axis identifies the primary user, who wears the smart eyewear device and has direct access to the augmented and mediated vision, vision monitors, vision assistants, and the mixed category, where control is shared by several categories of users, e.g., primary and assistants, or primary, monitors, and assistants alike. We distinguish between vision monitors, who only have access to the live stream of the augmediated video, and vision assistants, who have some degree of control over the visual filters applied by the FlexiSee application for the primary user.
To the best of our knowledge, such flexibility in terms of various user categories has never been proposed in the literature (see Table 1 for an overview).

3. The Control axis characterizes the ways in which control of the visual filters (e.g., specification, activation, deactivation, etc.) is implemented. We distinguish four categories of control and corresponding design options: on-the-eyewear, operated by the primary user; on-other-device, such as the primary user's smartphone or smartwatch; remote, via a web-based user interface to which secondary users, such as vision assistants, have access; and mixed, including combinations of the previous categories.

Figure 4 illustrates the software architecture of FlexiSee. Video frames are acquired from the video camera embedded in the smart eyewear device, processed by applying visual filters according to the current configuration, and rescaled to align as best as possible with the user's field of view. Customization and configuration of visual filters are implemented by the primary user via an external device, such as a smartphone or smartwatch, that runs a web browser. Vision assistants have access to the same web-based user interface. Both vision assistants and vision monitors can watch a live video stream of the primary user's mediated and augmented field of view, as delivered by FlexiSee.

Technical details of the implementation of FlexiSee
We implemented FlexiSee using a first-generation Microsoft HoloLens HMD [41] featuring a 32-bit Intel architecture, 64 GB of Flash storage and 2 GB of RAM, and running Windows 10. We used Visual Studio 2017, the Windows Software Development Kit (SDK) for the Windows 10 operating system, and the C++ programming language to implement a Universal Windows Platform (UWP) application. We also used Boost [11] (the boost-system, boost-date-time, and boost-regex components) and RapidJSON, a header-only fast JSON library.
Listing 3 illustrates a section of C++ code implementing the visual filters that FlexiSee provides by default. These filters are configured and their parameters customized either by the primary user or by the vision assistants via the web-based user interface and uploaded to FlexiSee using the JSON data-interchange format; see Fig. 4. The procedure from Listing 3 iterates over all the JSON members to identify relevant keywords, such as "contrast", "edge", "replace," or "color", that match the default visual filters. The applyVisualFilters(...) method calls the prototype function visualFilter(...) presented in Section 3.1. Listing 1 shows an example of a JSON file that customizes a single visual filter (face detection enabled for the "HoloLens-1" user), while Listing 2 presents a more complex JSON configuration file specifying a sequence of visual filters and their corresponding custom parameter values (contrast is increased by 60% and edges are highlighted against the background of the visual reality, among other visual filters).

Listing 1 Example of a JSON customization file activating face detection in FlexiSee
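To sketch how such a JSON configuration can be mapped onto the visual filters, the fragment below parses a configuration with RapidJSON and dispatches to the visualFilter(...) sketch from Section 3.1; the member names ("contrast", "edge") follow the keywords quoted above, while the overall schema is an assumption for illustration rather than FlexiSee's exact format.

#include "rapidjson/document.h"
#include <opencv2/opencv.hpp>
#include <string>

// Apply the filters requested in a JSON configuration such as
//   {"user": "HoloLens-1", "contrast": 60, "edge": true}
// (an assumed schema based on the keywords described in the text);
// FILTER_CONTRAST, FILTER_EDGES, and visualFilter(...) refer to the
// chaining sketch given earlier.
cv::Mat applyVisualFilters(cv::Mat frame, const std::string& jsonConfig)
{
    rapidjson::Document config;
    config.Parse(jsonConfig.c_str());
    if (config.HasParseError() || !config.IsObject())
        return frame; // leave the frame unmodified on malformed input

    if (config.HasMember("contrast") && config["contrast"].IsNumber()) {
        // e.g., 60 means the contrast is increased by 60%
        double alpha = 1.0 + config["contrast"].GetDouble() / 100.0;
        frame = visualFilter(frame, FILTER_CONTRAST, alpha);
    }
    if (config.HasMember("edge") && config["edge"].IsBool()
            && config["edge"].GetBool()) {
        frame = visualFilter(frame, FILTER_EDGES);
    }
    return frame;
}

Because the configuration is plain JSON, the same file can be produced by the web user interface regardless of whether the primary user or a remote vision assistant edits it.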
The following visual filters are implemented by default in FlexiSee using the OpenCV library [49]:

1. Contrast adjustment: Fig. 5, first row demonstrates contrast adjustment by 30%, 60%, and 90%, respectively. The corresponding OpenCV function is convertTo(..): to increase contrast by 30%, the multiplicative alpha parameter of the function is set to 1.3; when alpha is less than 1, contrast is decreased by the specified amount. We decided to implement contrast adjustment in FlexiSee since previous systems from the literature demonstrated its effectiveness for vision mediation; see Harper et al. [23], Peli [51], eSight [17], Zhao et al. [75,79], Tanuwidjaja et al. [70], Hwang et al. [27], and Satgunam et al. [61].

2. Brightness adjustment: this visual filter was equally implemented with the OpenCV convertTo(..) function. This time, the additive argument (beta) is added to each pixel value. Figure 5, second row illustrates brightness adjustment implemented in FlexiSee with parameters 15, 35, and 55, respectively. When the specified value is negative, brightness is decreased. Prior work [17,24,75] inspired us to implement this visual filter in FlexiSee.

3. Edge enhancement: we implemented the Canny filter to detect edges and present them with a distinct color (see Fig. 5, third row, second image) and with coloring effects for the background (Fig. 5, third row, third and fourth images). We implemented this filter since previous work found it useful for vision rehabilitation; see Peli et al. [54], Hwang and Peli [27], Langlotz et al. [31], and Zhao et al. [75,78,80].

4. Color replacement: this visual filter is implemented by creating masks for the lower and upper values of the color that will be replaced; a code sketch of this filter and of face detection is shown after this list. Figure 5, fourth row illustrates the color replacement visual filter customized for three different types of color blindness: deuteranopia, tritanopia, and protanopia. We implemented color replacement in FlexiSee due to the large body of previous work demonstrating its utility for vision rehabilitation; see Zhao et al. [75,76,79], Fuller and Sadovnik [20], Langlotz et al. [31], Melillo et al. [40], and Tanuwidjaja et al. [70].

5. Face detection: to implement this visual filter, we used the detectMultiScale(..) function from OpenCV with a classifier trained with the frontal-face model. Figure 5, last row and column shows the result of applying this filter. The implementation decision was informed by the results of prior systems and studies [50,60,76], e.g., Sandnes [60] reported that face and text recognition were the most important features that people with visual impairments, participants in their study, sought in smartglasses applications.

Listing 2 Example of a complex JSON configuration file specifying a sequence of visual filters and corresponding custom parameter values
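For completeness, the fragment below sketches the masking-based color replacement and the cascade-based face detection filters using the OpenCV calls named in the list above; the color thresholds, the replacement color, and the cascade model are illustrative assumptions, not FlexiSee's exact parameter values.

#include <opencv2/opencv.hpp>
#include <vector>

// Replace every pixel within [lower, upper] with a more distinguishable
// color, as done for the color blindness presets described above.
cv::Mat replaceColor(cv::Mat frame, cv::Scalar lower, cv::Scalar upper,
                     cv::Scalar replacement)
{
    cv::Mat mask;
    cv::inRange(frame, lower, upper, mask); // mask of pixels to replace
    frame.setTo(replacement, mask);
    return frame;
}

// Draw a rectangle around each face found by a cascade classifier
// trained on frontal faces (an augmentation rendered over the frame).
cv::Mat detectFaces(cv::Mat frame, cv::CascadeClassifier& cascade)
{
    cv::Mat gray;
    cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
    cv::equalizeHist(gray, gray); // helps detection under uneven lighting
    std::vector<cv::Rect> faces;
    cascade.detectMultiScale(gray, faces, 1.1, 3);
    for (const cv::Rect& face : faces)
        cv::rectangle(frame, face, cv::Scalar(0, 0, 255), 2);
    return frame;
}

For example, replaceColor(frame, cv::Scalar(0, 0, 100), cv::Scalar(80, 80, 255), cv::Scalar(0, 255, 255)) would map a range of reds (in BGR order) to yellow, in the spirit of the deuteranopia preset.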
While the first four rows of Fig. 5 demonstrate customizability, the last row illustrates configurability, i.e., the application of a sequence of visual filters in a row, as follows: contrast enhancement (second image), contrast enhancement and replacing red with yellow (third image), and face detection (fourth image).
Vision monitor and vision assistant roles are implemented in FlexiSee via a web-based user interface and a video streaming service over the web. The web user interface is illustrated in Fig. 1 from Section 1 and was implemented using standard HTML5 and CSS technology. We decided on a web-based implementation since it can be accessed easily from any web browser and from any device and operating system featuring a web browser, including mobile and wearable devices, such as smartphones and smartwatches for the primary user and tablets and desktop PCs for vision assistants and monitors; see Fig. 1. For the implementation of vision monitoring, we opted for live video streaming over the web. The HoloLens HMD allows streaming live video, referred to as Mixed Reality Capture (MRC). When the HoloLens HMD is connected to a Wi-Fi network, MRC can be visualized via an internal IP address. We used Open Broadcaster in order to distribute the MRC video stream to a dedicated YouTube channel, which can be accessed via any web browser from any device; see Fig. 6. In this implementation, both vision monitors and assistants have easy access to the primary user's video stream of augmented and mediated vision.

User study
We conducted a user study to collect feedback from remote users in order to understand usability aspects and perceptions about the FlexiSee concept and implementation, but also to collect suggestions to inform further developments of FlexiSee-like systems and applications. In our study, we focused on remote users since the performance and perceptions of primary users with regard to HMD-based vision augmentation have been evaluated before in the scientific literature [67,75,78,79].

Participants
Ten young adults, aged between 20 and 32 years (M = 26.6, SD = 3.8, Mdn = 27.5 years), volunteered for our usability study to play the part of remote vision assistants for the FlexiSee HoloLens HMD system worn by one primary user (the first author). Three participants were female. Since our study focused strictly on understanding usability aspects of FlexiSee, N = 10 participants are more than sufficient; see Nielsen's [46] well-known recommendation that the best results in usability tests come from testing with no more than five users, according to Nielsen and Landauer's [45] mathematical model for finding usability problems. All of our participants were young adults and, thus, representative of potential users and adopters of FlexiSee technology, as it is known that young people use a greater breadth of technologies compared to older adults and are more open to new technology, whereas older adults are more likely to use technologies that have been around for a longer period of time [48]. Moreover, all of our participants reported owning and using smartphones on a regular basis, and four participants owned smartwatches; these are the two input devices with which FlexiSee was designed to be controlled.

Fig. 6 A vision monitor has access to the primary user's mediated and augmented vision via live video streaming on their smartphone. Also see Fig. 1, bottom for illustrations of the web-based user interface for vision assistants

Apparatus and task
Participants employed the FlexiSee web-based user interface (see Fig. 1, bottom right for an illustration), from which they could control all the visual filters and observe their effects in a live YouTube video stream representing an exact replica of what the FlexiSee primary user was seeing in the HoloLens HMD. Participants were instructed to test each visual filter in order to understand its operation and effect. They were given complete freedom to explore the web-based user interface and specify, from their remote location, the parameters of the augmented and mediated vision experienced by the primary user. For the purpose of this study, we logged participants' activity in the web user interface and found that, overall, our participants applied the five types of visual filters a total of 213 times (M = 21.3, SD = 15.0, Mdn = 15.0) and that each visual filter received roughly the same attention (M = 20.0%, SD = 2.1%). In descending order of frequency of use, the visual filters tried out by our participants were: edge enhancement (21.6% of the trials), contrast enhancement (21.1%), color replacement (20.7%), brightness enhancement (20.2%), and face detection (16.4%). After testing the FlexiSee web-based user interface, participants filled in a Google Forms questionnaire asking about their experience; see the following subsections for details.

Measures
We employed the following categories of measures to collect feedback from our participants:

1. Use of online video and social media: to understand the use of smart technology in our sample of participants, we asked participants about their regular use of smart devices (smartphones and smartwatches), YouTube video watching, and use of social media platforms. Participants rated the following affirmations: "I use YouTube on a regular basis" and "I use social web sites and applications (Facebook, Messenger, WhatsApp, Instagram, etc.) on a regular basis" using 5-point Likert scales with values from 1 (strongly disagree) to 5 (strongly agree).
2. System Usability Scale (SUS): we applied the SUS test [13] to evaluate the overall perceived usability of our web-based user interface. The SUS test consists of ten statements for which participants rate their degree of agreement using 5-point Likert scales. At the end, participants' answers are aggregated into a score ranging from 0 (corresponding to low usability) to 100 (a perfect usability score); the aggregation formula is given after this list.

3. Perception of the FlexiSee concept: we asked participants to rate their degree of agreement, using 5-point Likert scales from 1 (strongly disagree) to 5 (strongly agree), with affirmations regarding the perceived usefulness of the live video feature, setting visual filters remotely, connectedness with the primary user, and the overall perceived value of the FlexiSee concept and implementation.

4. Freeform feedback: we asked participants for any feedback to further improve our concept and system in the future.
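For reference, the standard SUS aggregation [13] works as follows: with si denoting the 1-5 rating of statement i, each odd-numbered (positively worded) statement contributes si - 1, each even-numbered (negatively worded) statement contributes 5 - si, and the sum of the ten contributions is multiplied by 2.5, i.e., SUS = 2.5 x (sum over odd i of (si - 1) + sum over even i of (5 - si)), yielding a score between 0 and 100.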

Results
Participants reported using YouTube on a regular basis (M = 4.2, SD = 0.8, Mdn = 4) as well as social media platforms (M = 4.6, SD = 0.7, Mdn = 5), a result that confirms the suitability of our sample of participants for evaluating FlexiSee as regular users of smart devices and consumers of online video content and media. We found no statistically significant difference between the self-reported use of YouTube and social media among our participants (Wilcoxon signed-rank test, Z = -1.300, p = .375 > .05, n.s.). SUS scores varied between 57.5 and 100.0 with an average of 75.3, which places the usability of our FlexiSee web-based user interface above the "good" threshold [9] (corresponding to a SUS score of about 68) and within the acceptable range, according to the interpretation scales of SUS scores [9]. Ratings of the various features of FlexiSee are shown in Table 2. We found no statistically significant difference (Z = -1.265, p = .359 > .05, n.s.) between participants' ratings regarding the usefulness of live video for remote assistants and for primary users; see Table 2, rows 1 and 2, for the corresponding questions. Also, there was no significant difference (Z = -1.633, p = .250 > .05, n.s.) between the perceived usefulness of visual filters controlled by remote assistants (M = 4.3, Mdn = 4.5) and by primary users (M = 4.8, Mdn = 5.0), suggesting equally perceived levels of utility; see Table 2, rows 3 and 4. Perceived connectedness was medium (M = 3.6, Mdn = 3.0), which is an acceptable result because our FlexiSee system was not designed for connectedness in the first place; this result informs the opportunity for future work in this direction. Also, all of the participants saw good added value in our concept and implementation (M = 4.8, Mdn = 5 on a scale of 5) and used positive words, such as "beneficial" (P2), "advanced" (P2, P9), "useful" (P4), "interesting" (P6, P7), and "innovative" (P9), to describe the FlexiSee concept and system. For example, P9 commented "I think that FlexiSee is innovative and uses advanced technology" and P2 considered that "the technology is beneficial and advanced." Two participants suggested new visual filters that would be useful for future versions of FlexiSee to implement, such as simultaneous correction of multiple colors (suggested by P3) and a gamma filter for people with nyctalopia (P8).

Open source code for FlexiSee
To foster more research and development regarding vision augmentation and mediation, we are releasing the source code of the FlexiSee application in the open-source domain. The source code, representing a Visual Studio project and application written in C++, can be freely downloaded from http://www.eed.usv.ro/mintviz/projects/Senses++.

Conclusion and future work
We presented in this work FlexiSee, an application for smart eyewear devices that demonstrates flexibility in terms of configuring, customizing, and controlling augmented and mediated vision. FlexiSee addresses a gap in the scientific literature, where previous systems for vision augmentation and/or vision rehabilitation were designed with little flexibility in terms of customizing their features and functionality. Overall, our evaluation showed good usability, favorable perceptions regarding the concept and implementation, and very good added value of our system idea in the landscape of mobile and wearable devices.
Together with FlexiSee, we introduced the FlexiSee-DS design space to inform future developments of FlexiSee-like systems to meet various application and user needs. The source code of FlexiSee is available to the scientific and practitioner communities to foster more research and development in this direction. Future work will look at implementing various FlexiSee-like applications according to the design possibilities enumerated by the FlexiSee-DS space, including anticipative behavior and recommendation systems for custom visual filters. For example, in the FlexiSee-DS space, our system is positioned at the intersection of the configurable, mixed users, and mixed control categories. Other implementations of FlexiSee-type systems may choose other configurations, ranging from simple applications where predefined visual filters are activated and deactivated on the eyewear device by the primary users only, to more complex designs involving other smart devices and categories of users. We also plan to integrate FlexiSee with concept recognition applications and systems, such as Life-Tags [4], toward augmediated applications for lifelogging enthusiasts. In this work, we focused on a usability evaluation of FlexiSee; more detailed examinations are envisaged for the future.
For example, future work will also consider a task-oriented evaluation of FlexiSee in order to understand input performance in detail, e.g., the effect of the input device on user performance quantified using task completion times and error rates, but also to understand the use of FlexiSee in longitudinal studies in order to inform further versions better matched to users' needs and visual abilities. Baseline comparisons against a control condition, such as vision augmentation with no remote assistants or monitors, will complete our understanding of user performance with FlexiSee.

Fig. 4 The software architecture of the FlexiSee application

Listing 3 C++ sequence of code implementing several visual filters in FlexiSee

Fig. 5 Output examples for visual filters implemented by FlexiSee with custom parameter values. From top to bottom: contrast adjustment, brightness adjustment, edge enhancement, color replacement, and a sequence of visual filters (contrast enhancement, color correction, and face detection). Notes: the images shown in the first column are not processed; the other columns show the effects of the respective visual filters with custom parameter values

Table 1 Types of customization for mediated and augmented vision identified in the scientific literature

Feature | Authors' description of system/feature flexibility | F
Color | "The form of output is easy to customise to a particular user's requirements [...] we have used a predefined set of high saturation colours, but these colours may be customised by a user to improve visibility according to his or her particular visual impairment." [18, p. 3] | F3
Magnification | "For each of our interface designs, the user should be able to customize the magnification level, position, text processing and other settings." [66, p. 362] | F1
n/a † | "from the technical point of view, a future development could be tuning of the correction matrix in order to be customized for the specific alteration of each subject." [40, p. 6] | F1

Notes: references are listed in chronological order. † These references do not describe a customizable system or prototype, but highlight the need for personalization or customization of assistive systems. ‡ System exclusively addressing VR, included because of its many customizable features. F: level of flexibility, from F1 to F4; see the text for details

Table 2 Participants' self-reported perceptions of the FlexiSee concept and implementation

"The FlexiSee concept fits well and brings added value to inter-human communication and collaboration in the current context of prevalent smart mobile devices, wearable gadgets, and activity in social media web sites" †

† Higher is better