TRAFFIC SCENARIOS AND VISION USE CASES FOR THE VISUALLY IMPAIRED

We gather the requirements that visually impaired road users have for a camera-based assistive system in order to develop a concept for transferring algorithms from Advanced Driver Assistance Systems (ADAS) to this domain. To this end, we combine procedures from software engineering, especially requirements engineering, with methods from qualitative social sciences, namely expert interviews by means of problem-centered interview methodology. The evaluation of the interviews results in a total of six traffic scenarios that are of interest for the assistance of visually impaired road users, clustered into three categories: orientation, pedestrian, and public transport scenarios. From each scenario, we derive use cases based on computer vision and state which of them are also addressed in ADAS so that we can examine them further in the future to formulate the transfer concept. We present a general literature review of ADAS solutions for the use cases that overlap between both fields and take a closer look at how to adapt ADAS Lane Detection algorithms to the assistance of the visually impaired.


INTRODUCTION
According to the World Health Organization (WHO) [1], 254 million people are estimated to be visually impaired. In addition to a decrease in quality of life and independence, visual impairment causes a decline in personal mobility [2]. To counteract this, we transfer computer vision algorithms from Advanced Driver Assistance Systems (ADAS) to the domain of visually impaired road users. Although some research on object detection in traffic situations for the visually impaired exists, there is considerably more research in the field of ADAS. Hence, it makes sense to profit from the progress in driver assistance and make it applicable to visually impaired pedestrians. However, because the algorithms cannot be transferred without changes, a transfer concept is necessary. In [3], we motivate the need for a transfer concept, give an overview of the current state of research in camera-based driver assistance as well as camera-based assistance of the visually impaired, and present first findings concerning the development of such a transfer concept. We furthermore present a sketch of an assistive system consisting of a camera, headphones, and a smartphone. The algorithms resulting from applying the transfer concept can later be integrated into such a system, but we first focus on the concept development.
To this end, we gather the situations in which visually impaired road users need support. We do this by combining procedures from software engineering, especially Requirements Engineering (RE), with methods from social sciences. As proposed by Sommerville in the chapter about RE [4], we conduct interviews, but whereas RE usually refers to a software system, we stay on a higher level of abstraction by discussing traffic situations for the visually impaired in general. For the interviews, we apply methods from qualitative social sciences, namely expert interviews [5] by means of problem-centered interview methodology [6]. The evaluation of the interviews again relies on RE approaches. We use a slightly modified form of scenarios as described in [4] to cluster the traffic situations mentioned in the interviews, and from these scenarios we derive use cases based on computer vision. Finally, we analyze which of the gathered use cases are addressed in ADAS and have to be examined further in the future in order to formulate the transfer concept. We present a general literature review for these overlapping use cases and examine the use case Lane Detection in particular.
In chapter 2, we describe the methodology, including our approach of using software engineering methods for the representation of qualitative data. Chapter 3 then presents the current situation of visually impaired road users by describing the traffic scenarios and derived computer vision use cases that result from the interviews. Chapter 4 examines ADAS solutions for Lane Detection in particular and gives an overview of existing ADAS solutions for the remaining use cases. Chapter 5 outlines our future work and concludes the article.

METHODOLOGY
First, we explain why we conduct expert interviews and discuss the chosen method of problem-centered interviews. Afterwards, we describe the interview guideline and give an overview of the interviewees' characteristics. We conclude this chapter by presenting our approach of using software engineering methods for the representation of qualitative data.

Expert Interviews
The aim of the expert interviews is to collect the main problems visually impaired persons face in traffic situations. Furthermore, we analyze traffic scenarios where a camera-based assistive system could provide support to visually impaired road users. In this context, we define experts as persons who, through professional or volunteer work, are regularly in contact with a group of visually impaired people that is diverse in age, gender, and degree of impairment. The experts may be visually impaired themselves, but this is not a precondition. Rather than referring to the interviewees' own experiences, expert interviews intend to access the "contextual knowledge" [5] the interviewees have acquired concerning the living conditions of a certain group of people, in our case the visually impaired.
As the method for the expert interviews, we use problem-centered interviews according to Witzel [6], a method situated between narrative and structured guideline interviews. During the interview, the sequence of the questions is handled flexibly, resulting in a dialogue between interviewee and interviewer.
Witzel names four required instruments. The short questionnaire at the beginning gathers basic data about the interviewees. Recording the interview allows for later transcription. The guideline serves as a reminder and guiding framework that ensures the discussion of important issues and problems. Immediately following the interview, the interviewer writes a postscript containing nonverbal aspects, thematic peculiarities, and spontaneous interpretative ideas for the evaluation.
Data analysis and interpretation were conducted with the support of MAXQDA, a software program for qualitative data analysis. Codes were developed in order to categorize the traffic scenarios of visually impaired persons.

Interview Guideline
We create the guideline as a course for the interview, taking into account the four required instruments for problem-centered interviews:
1. Welcome and explanation of our research
2. Privacy policy: Includes the agreement of the interviewee to recording and transcription.
3. Short questionnaire: Gather information about the interviewed person (age, gender, profession) and about the visually impaired persons the interviewee is in contact with (age and gender distribution, kinds of visual impairment, affinity for technology).
4. Traffic scenarios: We first ask the interviewee to name the three biggest problems visually impaired people face in traffic situations and whether the problems differ for people of different age, gender, and degree of impairment. While explaining problems, the interviewee usually mentions keywords for the discussion of further problems. If not, a prepared list of traffic situations (e.g. riding a bus, orientation at train stations, on the way to the bus station, finding an unknown address, differences between known and unknown ways) helps to give impulses to the interviewee. Towards the end of the interview, we ask about the differences between the visually impaired and the sighted when it comes to the preparation for a trip to an unknown address.
5. Thanks and conclusion
6. Postscript

Characteristics of the Interviewees
We interviewed four persons. All of them are male, and their ages range from over 40 to over 70. Three are blind, whereas the fourth person has no visual impairment. Two are active members of interest groups who work as volunteers, and the others are employees of educational institutes. We use the following IDs to identify the interview partners in the course of this paper:
• IP1: male, blind, employee of educational institute
• IP2: male, sighted, employee of educational institute
• IP3: male, blind, volunteer of interest group
• IP4: male, blind, volunteer of interest group
Gender and degree of impairment of the visually impaired persons our interview partners are in contact with are well distributed. Concerning age, the students of one of the educational institutes are generally not older than 19. But as the interest groups have more older members, due to demographic reasons and the fact that visual impairments are often age-related, we cover a wide range of ages. In total, the experts have knowledge about mobility issues of the visually impaired in the whole country, because the educational institutes are situated in different parts of Germany and the interest groups are organized country-wide. A limitation is that their statements refer to German traffic realities and may not be valid in other countries. The interviews were conducted by phone in German. Therefore, the quotes in this paper are translations from German to English.

Evaluation with Software Engineering Methods
The literature provides several procedures for the analysis of data acquired in qualitative research, including codes, which we used for the evaluation in this paper. But as a suitable structuring and representation of the gained knowledge depends highly on the concrete problem, there is no "simple step that can be carried out by following a detailed, 'mechanical,' approach. Instead, it requires the ability to generalize, and think innovatively, and so on from the researcher," [7, p. 63]. We address this challenge by adapting methods from software engineering.
Qualitative methods have been used in the past to improve the software development process (see e.g. [8], [9]). To our knowledge, the reverse, using software engineering methods to improve the representation of qualitative data, is a new approach. The many structuring elements found in software engineering, such as different tables and diagrams, are powerful tools that can help to generalize qualitative data and other procedures beyond software.
In our concrete case, data analysis using codes resulted in six traffic scenarios that are of importance for the visually impaired. The word scenario is used in the sense of software engineering as defined by Sommerville in [4]. In contrast to [4], we do not refer to a concrete software system but discuss the challenges visually impaired people face as road users in general. Sommerville suggests recording scenarios in tables with the keywords Initial Assumption, Normal, What Can Go Wrong, Other Activities, and State Of Completion. As the last two keywords are of no relevance in our case, we modify the table by deleting them. In the added first line, Quote, we introduce the scenario by citing one of the interviewees to underline the importance of the described scenario. We use the line Normal to explain the current procedure for handling the scenario and the line What Can Go Wrong to determine problems that can occur. In the added last line, Vision Use Cases, we record use cases, derived from the line What Can Go Wrong, that can be solved by means of computer vision. This form of scenario record, inspired by [4], gives us a clustered overview of the needs of visually impaired people in traffic situations.
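The modified scenario table can also be written down as a small data structure. The following Python sketch is only an illustration of the five table lines described above; the class name `ScenarioRecord` and its field names are assumptions made for this example and are not part of [4]. The example entry is abridged from the scenario Boarding a bus (Table 5).

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioRecord:
    """One scenario table with the five lines used in this paper (hypothetical type)."""
    name: str
    quote: str                      # added first line: quote of an interviewee
    initial_assumption: str
    normal: str                     # current procedure for handling the scenario
    what_can_go_wrong: list[str] = field(default_factory=list)
    vision_use_cases: list[str] = field(default_factory=list)  # added last line


# Abridged example entry taken from the scenario "Boarding a bus".
boarding_a_bus = ScenarioRecord(
    name="Boarding a bus",
    quote="The bus is also the most difficult means of transport, "
          "because it is so flexible (IP3).",
    initial_assumption="A visually impaired person wants to board a bus.",
    normal="The person waits at the bus stop on the entry field and "
           "enters when the bus arrives.",
    what_can_go_wrong=[
        "The person cannot find the bus stop.",
        "They do not know if the arriving bus has the right number and direction.",
    ],
    vision_use_cases=[
        "Traffic Sign Detection", "TGGS Detection", "Display Detection",
        "OCR", "Door Detection",
    ],
)

print(boarding_a_bus.vision_use_cases)
```

Keeping the line What Can Go Wrong as a list makes it easy to trace each vision use case back to the problem it was derived from.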

CURRENT SITUATION OF VISUALLY IMPAIRED ROAD USERS
This chapter presents the evaluation of the conducted expert interviews. We describe the traffic scenarios we extracted from the interviews using the aforementioned scenario tables. For the social insights we gained from the interviews concerning age, gender, degree of impairment, and affinity for technology, we refer to [10].
A total of six scenarios are derived from the interviews.They can be clustered into three categories: orientation scenarios (general orientation, navigating to an address), pedestrian scenarios (crossing a road, obstacle avoidance), and public transport scenarios (boarding a bus, at the train station).
Two scenarios concerning orientation are extracted from the interviews: General orientation, see Table 1, focuses on self-localization and the awareness of the surroundings in order not to get lost. The second scenario, Navigating to an address, see Table 2, gives details on how to locate a concrete house when trying to find an unknown address. In addition to the discussed orientation issues, the pedestrian scenarios Crossing a road, see Table 3, and Obstacle avoidance, see Table 4, are of high importance for the visually impaired. When it comes to public transport, we identified the two scenarios Boarding a bus, see Table 5, and At the train station, see Table 6. We neglected subways because, concerning vision use cases, they are a mixture of bus stops and train stations. Trams are not discussed because tram stops are very similar to bus stops, but less complex.

Table 1 General orientation

Quote: "[It is] problematic in general to keep an eye on your destination. (...) You can simply get lost easily. (...) And this is a really burdening point for us," (IP3).

Initial Assumption: A visually impaired person wants to know where they are and be aware of their direction and surroundings in order not to get lost.

Normal: Navigational smartphone apps, e.g. Blindsquare, help the person to keep track of their direction and also their surroundings, as the app can announce crossings, shops, restaurants, and the like. On a smaller scale, Tactile Ground Guidance Systems (TGGS) help the person to find their way. If they get lost anyway, they can ask a passer-by or use the app BeMyEyes, which connects them via video chat to a sighted person who is willing to help.

What Can Go Wrong: The navigational app can only announce places in the database. If the data is not maintained or if the person is in a remote area, there might not be enough information for the person to create a "mental map" (IP1) of their surroundings. Depending on the location of the person, GPS can be inaccurate. As TGGS are not directed, the person may not know whether they are following one in the right direction. The person cannot find the TGGS even though it is there.

Vision Use Cases: TGGS Detection, Description of the surroundings, Traffic Sign Detection (e.g. to find pedestrian zones)

Table 5 Boarding a bus

Quote: "The bus is also the most difficult means of transport, because it is so flexible," (IP3).

Initial Assumption: A visually impaired person wants to board a bus.

Normal: The person waits at the bus stop on the entry field (marked with TGGS). When the bus arrives, they enter.

What Can Go Wrong: The person cannot find the bus stop. They cannot find the entry field even though it is there. There is no entry field and the person has to rely on hearing to find the door. They do not know if the arriving bus has the right number and direction, and they might not want to ask every time. At a larger stop, where several buses stop at once, it is difficult to find the right bus.

Vision Use Cases: Traffic Sign Detection (to detect bus stop signs), TGGS Detection, Display Detection (to detect displays with important information), OCR (to read the text on the detected displays), Door Detection

Table 6 At the train station

Quote: "It [the train station] is easier to overlook and (...) at least at most train stations, there is some logic that you can understand," (IP1).

Initial Assumption: A visually impaired person wants to travel by train.

Normal: A TGGS leads the person to the platforms. They find the right platform with the help of Braille indications under the handrails, or they know the design of the station. Additionally, they can use the mobility service of the German railway company.

What Can Go Wrong: There is no TGGS leading to the platforms, or the person cannot find the TGGS even though it is there. There are no Braille indications or they do not find the handrails. The mobility service is not available. Small train stops do not always offer announcements. They do not find the right track section which matches their seat reservation.

Vision Use Cases: TGGS Detection, Traffic Sign Detection (to detect platform numbers and platform section signs), Display Detection (to detect displays with important information), OCR (to read the text on the detected displays), Door Detection

For our further examinations, we focus on the derived use cases. Fig. 1 therefore gives an overview of the use cases identified for the different kinds of scenarios. We marked the use cases that are also of relevance in the field of ADAS in italics; they will be discussed in the following chapter. From the figure, we can see that there are two use cases related to all kinds of scenarios: Traffic Sign Detection and TGGS Detection. Traffic Sign Detection is also addressed in ADAS, and because of its relevance and versatility, it is the most important use case to transfer from ADAS. OCR and Door Detection are of interest in orientation as well as public transport scenarios, but are not discussed in ADAS. In summary, apart from the already mentioned Traffic Sign Detection, there are no ADAS use cases in orientation or public transport scenarios. In pedestrian scenarios, however, the majority of use cases are relevant to ADAS.
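The overlap analysis can be reproduced with plain set operations. The following Python sketch is a minimal illustration, assuming only the use cases listed in Tables 1, 5, and 6 above; the use cases of the remaining scenarios are not included. The set `adas_use_cases` is likewise an assumption: it lists the use cases reviewed in the following chapter.

```python
# Use cases per scenario, copied from the line "Vision Use Cases" of three tables.
use_cases = {
    "General orientation (Table 1)": {
        "TGGS Detection", "Description of the surroundings", "Traffic Sign Detection",
    },
    "Boarding a bus (Table 5)": {
        "Traffic Sign Detection", "TGGS Detection", "Display Detection",
        "OCR", "Door Detection",
    },
    "At the train station (Table 6)": {
        "TGGS Detection", "Traffic Sign Detection", "Display Detection",
        "OCR", "Door Detection",
    },
}

# Assumption: use cases that are reviewed for ADAS in the following chapter.
adas_use_cases = {
    "Lane Detection", "Traffic Sign Detection", "Vehicle Detection",
    "Obstacle Detection", "Traffic Light (State) Detection", "Crosswalk Detection",
}

# Use cases that occur in every listed scenario.
shared = set.intersection(*use_cases.values())
print(sorted(shared))            # ['TGGS Detection', 'Traffic Sign Detection']

# Shared use cases that are also addressed in ADAS.
print(shared & adas_use_cases)   # {'Traffic Sign Detection'}
```

For these three scenarios, the result agrees with the observation that Traffic Sign Detection is the only ADAS use case in orientation and public transport scenarios.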

SOLUTIONS FROM ADAS
We first examine solutions from ADAS for Lane Detection and consider adaptation possibilities for the assistance of the visually impaired. Afterwards, we give an overview of existing ADAS solutions for the remaining overlapping use cases.

Lane Detection
In order to examine the use case Lane Detection from the driver's and the visually impaired pedestrian's perspective, we have to consider the different situations where Lane Detection algorithms can support the users. For driver assistance, the main application is a Lane Departure Warning System (LDWS) that computes whether the vehicle is within a tolerable range or not. The expert interviewees named one situation where Lane Detection could support visually impaired pedestrians: crossing a road would be easier if they knew details about the road, such as its size or whether there is an intermediate platform. In further interviews with members of the target group (see section 5.1), another situation was named: general orientation could be enhanced by knowing the course of the road ahead (e.g. curves).
LDWS in ADAS consist of three steps: After detecting the two stripes that bound the road in a single frame, they are tracked over time. If the car is at risk of departing the road, the danger is communicated to the driver [11]. In the case of visually impaired pedestrians, tracking and departure warning are redundant, meaning that it is sufficient to process single frames. Single frame Lane Detection consists of four steps [11]: After image acquisition and preprocessing (e.g. contrast enhancement, noise reduction), edge detection is the necessary preliminary step for stripe identification by Edge Distribution Function (EDF) or Hough Transform.
ISSN 1335-8243 (print) © 2018 FEI TUKE, ISSN 1338-3957 (online), www.aei.tuke.sk
Afterwards, the detected parameters are fit to a certain kind of function (line fitting). In the case of the Hough Transform, the two most striking local maxima denote the lane boundaries. At this point, we will not look further into algorithms based on the Hough Transform, because in the case of visually impaired pedestrians they could only be used for the crossing of roads, but not for telling the course of the road. EDF-based algorithms, however, can be adapted to the situation of visually impaired road users. In [12], first the Region Of Interest (ROI), namely the road, is extracted so that the following operations can be restricted to this area in order to reduce computation time. These operations are edge extraction, EDF construction and analysis, and departure identification. The EDF is the histogram of the gradient magnitude with respect to the corresponding edge angle. In the case of a road image with lane boundaries taken from a car, the EDF has a specific appearance with two local maxima that correspond to the lane boundaries.
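The EDF construction can be illustrated with a few lines of code. The following pure Python sketch is not the implementation from [12]; it assumes a gray value image given as a list of rows, uses the Sobel operator, and takes the gradient direction folded into 180 one-degree bins as the angle. The function names and the `threshold` parameter are hypothetical, and `strongest_bins` simply returns the largest bins as a simplified stand-in for a proper local maximum search.

```python
import math


def edge_distribution_function(image, threshold=0.0):
    """Sum Sobel gradient magnitudes per angle bin (1..180 degrees).

    `image` is a list of rows of gray values. Index 0 of the result is unused.
    """
    height, width = len(image), len(image[0])
    edf = [0.0] * 181
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sobel responses in horizontal (gx) and vertical (gy) direction.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            magnitude = math.hypot(gx, gy)
            if magnitude <= threshold:
                continue  # ignore flat regions
            # Fold the gradient direction into [0, 180) and map 0 to bin 180.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            edf[int(round(angle)) or 180] += magnitude
    return edf


def strongest_bins(edf, count):
    """Return the `count` angle bins with the largest accumulated magnitude."""
    return sorted(range(1, 181), key=lambda b: edf[b], reverse=True)[:count]


# Synthetic 6x6 image: dark upper half, bright lower half (one horizontal edge).
demo = [[0] * 6 for _ in range(3)] + [[1] * 6 for _ in range(3)]
demo_edf = edge_distribution_function(demo)
print(strongest_bins(demo_edf, 1))  # [90]
```

For a road image taken from a car, one would request two bins, corresponding to the two lane boundaries.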
Fig. 2 proposes a procedure to adapt the previously described ADAS algorithm to visually impaired pedestrians. Red boxes mark modules that have to be added; green boxes can be taken from [12] with a minimum of changes. Fig. 3 shows the algorithm's result for two exemplary images taken from the pedestrian perspective. An evaluation of the algorithm on a representative database is pending.
Although extracting the ROI is part of both algorithms, its box is marked red in the adaptation procedure, because from the pedestrian's view the ROI varies a lot more than from the driver's perspective. Therefore, the ROI has to be calculated in a different way. In the current state of implementation, a rectangular ROI is chosen manually. In the future, the ROI extraction will be automated by a road background segmentation. The next step is the division of the ROI into subregions. Because of perspective, we choose subregions of decreasing size from the bottom to the top of the image's ROI. We start with two subregions of one quarter of the ROI's height, followed by two subregions of one eighth and four subregions of one sixteenth of the ROI's height (see blue lines in Fig. 3). For each subregion, we compute the EDF according to [12] by using the Sobel operator to compute the gradients and the corresponding angle (discretized from one to 180 degrees). In contrast to [12], we then extract only the highest local maximum instead of the two highest, because from the pedestrian perspective the lane boundaries are almost parallel. For the interpolation of the subregions' angles, we use a linear-parabolic fitting as proposed in [11]. The two biggest subregions, which are also the ones closest to the user, are considered linear, whereas the others are considered parabolic. The interpolation process, which leads to a smooth function, results in a curve whose angle does not exactly match the road, but fits its general course. Analyzing the curve enables us to determine whether the road continues straight ahead or takes a turn to the left or the right (see Fig. 3).
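The subregion division can be sketched as follows. This Python snippet only illustrates the height fractions named above (two quarters, two eighths, four sixteenths, from bottom to top); the function name `subregion_rows` and the row convention (row indices grow downwards, lower bound exclusive) are assumptions for this example, which is not the MATLAB implementation behind Fig. 3.

```python
from fractions import Fraction

# Height fractions of the subregions, ordered from the bottom of the ROI to the top.
FRACTIONS = [Fraction(1, 4)] * 2 + [Fraction(1, 8)] * 2 + [Fraction(1, 16)] * 4


def subregion_rows(roi_top, roi_bottom):
    """Return (top_row, bottom_row) pairs, bottom_row exclusive, from bottom to top."""
    height = roi_bottom - roi_top
    bounds = []
    covered = Fraction(0)
    lower = roi_bottom
    for fraction in FRACTIONS:
        covered += fraction
        # Truncate to whole rows; the last subregion always ends at roi_top.
        upper = roi_bottom - int(covered * height)
        bounds.append((upper, lower))
        lower = upper
    return bounds


# Example: a manually chosen ROI covering image rows 0 to 159.
for top, bottom in subregion_rows(0, 160):
    print(top, bottom)
```

For an ROI of 160 rows, this yields subregion heights of 40, 40, 20, 20, 10, 10, 10, and 10 rows. One EDF would then be computed per subregion.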

Other Use Cases
In their chapter about related work, Yuan et al. [13] give an overview of existing methods for Traffic Sign Detection and Recognition. Traffic sign recognition usually consists of two blocks: detection and classification. In addition, tracking is used to increase the recognition rate. For detection, there are on the one hand methods based on color and shape (e.g. [14]), and on the other hand machine learning approaches, e.g. based on Support Vector Machines (SVM) [15]. Tracking of detected signs with Kalman filters is applied for several purposes. For example, tracking can be used to combine detection results from different frames [16] or to eliminate results that cannot be found in successive frames [17]. As traffic sign classification is a typical object classification problem, the corresponding algorithms are applied. Following feature extraction, SVM [15], Neural Networks [18], and other techniques are used. As traffic signs that indicate lane arrangements are often neglected by traffic sign recognition systems, [19] specifically addresses this problem.
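The idea of eliminating results that cannot be found in successive frames can be illustrated with a strongly simplified sketch. The following Python code does not use a Kalman filter and is not the method of [17]; it assumes that each frame yields a set of detected sign labels and keeps only labels seen in a minimum number of consecutive frames. The function name and the parameter `min_consecutive` are hypothetical.

```python
def confirmed_detections(frames, min_consecutive=3):
    """Return the labels that appear in at least `min_consecutive` successive frames."""
    streak = {}        # label -> length of the current run of consecutive frames
    confirmed = set()
    for detections in frames:
        # Labels missing in this frame drop out, which resets their run.
        streak = {label: streak.get(label, 0) + 1 for label in set(detections)}
        confirmed.update(
            label for label, length in streak.items() if length >= min_consecutive
        )
    return confirmed


# Example: "yield" is only detected in isolated frames and is therefore rejected.
frames = [{"stop"}, {"stop", "yield"}, {"stop"}, {"yield"}]
print(confirmed_detections(frames))  # {'stop'}
```

A real tracker would additionally match detections by image position rather than by label alone.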
A Vehicle Detection approach for urban areas that considers the different angles from which a vehicle is recorded is presented in [20]. The paper also contains an analysis of the state of the art in vision-based vehicle detection. The literature is sorted by certain characteristics such as on-road environment (highway, urban, night time), vehicle views (front, rear, side), or whether it covers partial occlusions and part-based detection. Rubio et al. [21], for example, present a vehicle detection for night time highways that considers front and rear views without addressing partial occlusions or part-based detection.
Vehicle Detection can be seen as part of Obstacle Detection, as many obstacle detection developments in ADAS focus on the detection of specific objects. Besides vehicles, pedestrians and bicycles are detected (e.g. [22]). A priori knowledge about the obstacles' texture, color, and shape is used for the training of models. In contrast, Yang and Zheng [23] present a system that responds to every approaching object by exploiting motion patterns in the video of a single dashboard camera. Other work that responds to all kinds of approaching objects can be found, for example, in [24,25]. The general methods are the ones that have to be considered for visually impaired pedestrians, because the kinds of obstacles they face are extremely diverse.

Fig. 2 Proposed procedure to adapt the Lane Detection algorithm in [12] for visually impaired pedestrians. (Recoverable box labels: Region of Interest; Divide into Subregions; For every subregion; EDF Construction (incl. Edge Detection).)

Additionally, [26] describes the general structure of algorithms for Traffic Light (State) Detection. First, candidates for traffic lights are identified by their color, for which different color spaces can be used. The number of false candidates is then reduced by shape features, e.g. candidates that are not round are eliminated using the Hough transform [27]. Furthermore, structure information of the traffic light is extracted using global and local image feature descriptors like HOG [28] or Haar-like features [29]. The performance and robustness of traffic light detection are improved by using a combination of color, shape, and structure features. For the recognition of the traffic light's state, different classifiers such as SVM [28] or CNN [30] can be used.
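The first step, identifying candidates by color, can be sketched with the HSV color space. The following Python snippet is a minimal illustration and not the procedure of [26]; all hue, saturation, and value thresholds are assumptions chosen for this example and would have to be tuned on real images.

```python
import colorsys


def light_color(r, g, b, min_saturation=0.5, min_value=0.5):
    """Classify an RGB pixel (0..255) as 'red', 'yellow', 'green', or None.

    All thresholds are illustrative assumptions, not values from the literature.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if s < min_saturation or v < min_value:
        return None  # too pale or too dark to be a lit traffic light
    hue = h * 360
    if hue < 20 or hue >= 340:
        return "red"
    if 40 <= hue < 70:
        return "yellow"
    if 90 <= hue < 180:
        return "green"
    return None


def candidate_pixels(image):
    """Return (row, column, color) for every pixel classified as a light color."""
    return [
        (y, x, color)
        for y, row in enumerate(image)
        for x, pixel in enumerate(row)
        if (color := light_color(*pixel)) is not None
    ]


# Tiny example image with one red, one gray, and one green pixel.
print(candidate_pixels([[(255, 0, 0), (128, 128, 128), (0, 255, 0)]]))
```

The resulting candidates would then be grouped into regions and filtered by shape and structure features as described above.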
For Crosswalk Detection, [31] proposes to extract the corresponding regions under different illuminations by means of MSER and to eliminate false candidates afterwards. The fact that crosswalks have a horizontal structure from the driver's perspective is used by [32] and [33]. Choi et al. [32] use a 1D mean filter and examine the difference image between the original and the filtered image. Kummert and Hasselhoff [33], on the other hand, take advantage of the bipolarity and the straight lines of crosswalks by applying Fourier and Hough Transform.
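The principle of a 1D mean filter followed by a difference image can be shown on a single scan line. The following Python sketch is written in the spirit of [32] but is not their algorithm; the window size, the synthetic gray values, and the helper names are assumptions for this example.

```python
def mean_filter_1d(signal, window):
    """Smooth a scan line with a centered moving average (truncated at the borders)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def difference_signal(signal, window):
    """Subtract the smoothed scan line from the original one."""
    return [s - m for s, m in zip(signal, mean_filter_1d(signal, window))]


def count_sign_changes(values):
    """Count transitions between positive and negative values."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Synthetic scan line across three bright and three dark stripes of four pixels each.
scan_line = ([200] * 4 + [50] * 4) * 3
difference = difference_signal(scan_line, window=9)
print(count_sign_changes(difference))  # 5
```

A regular alternation of positive and negative differences along a scan line hints at the bright and dark stripes of a crosswalk, whereas a uniform road surface yields differences close to zero.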

Future Work
As a foundation for our future research, it is essential to collect the relevant vision use cases that can support visually impaired road users and that are also addressed in ADAS. Therefore, we evaluate further problem-centered interviews with visually impaired people in which they focus on their own experiences. We conducted and transcribed ten such interviews. A first evaluation indicates that the results of this paper will be confirmed and at some points extended.
In order to evaluate existing ADAS algorithms and proposed adaptations to the domain of visually impaired road users, comparable video test sequences are needed. To achieve this, we create activity diagrams for the overlapping use cases from the driver's as well as the pedestrian's perspective. Every path in every diagram will then correspond to a scene to be filmed. Scenes belonging to a certain use case will be filmed from the driver's and the pedestrian's perspective in the same location and directly one after the other to ensure similar lighting and weather conditions. The use of activity diagrams is another example of how software engineering methods can be used to structure problems that go beyond software.
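Deriving one scene per path can be sketched as a path enumeration on a directed graph. The activity diagram in the following Python example is purely hypothetical and only serves as an illustration; the sketch also assumes that the diagram contains no loops, since otherwise the enumeration would not terminate.

```python
def enumerate_paths(graph, start, end):
    """Yield every path from `start` to `end` in an acyclic activity diagram."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == end:
            yield path
            continue
        for successor in graph.get(node, []):
            stack.append(path + [successor])


# Hypothetical activity diagram for Lane Detection from the pedestrian perspective.
diagram = {
    "start": ["walk along road"],
    "walk along road": ["road straight", "road curves left", "road curves right"],
    "road straight": ["end"],
    "road curves left": ["end"],
    "road curves right": ["end"],
}

scenes = list(enumerate_paths(diagram, "start", "end"))
for scene in scenes:
    print(" -> ".join(scene))
```

Each printed path corresponds to one scene that would be filmed from both perspectives.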
The produced video material will be used to evaluate the Lane Detection adaptations proposed in this paper. Afterwards, we will perform the same steps for the other use cases. Based on a literature review, we will formulate the general ADAS procedure and propose adaptations that will then be evaluated. The required adaptations will be clustered into different categories, and a generalized transfer concept will be formulated.

Conclusion
By combining methods from social sciences and software engineering, we conducted and evaluated expert interviews in order to gather camera-based traffic use cases that can support visually impaired road users in their daily life. In doing so, we presented the new approach of using software engineering methods for the representation of qualitative data.
We extracted different kinds of scenarios from the interviews and derived the matching vision use cases. The six identified scenarios are clustered into three categories: orientation, pedestrian, and public transport scenarios. We extensively described each scenario based on the interviews and showed that there is a large overlap between the use cases derived from the scenarios and the ones addressed in ADAS.
For the use cases belonging to both fields, we gave a short overview of existing solutions from ADAS in general and discussed the use case Lane Detection and its possibilities of adaptation in detail. The concept of EDF can be applied to the assistance of the visually impaired by dividing the image's ROI into subregions of decreasing size (from bottom to top). The identified angles then have to be interpolated in order to get a curve matching the course of the road. A problem still to be solved is the ROI computation by road background segmentation.
In the future, we will extend the examination of the use cases in order to identify and later overcome the problems that occur when using ADAS algorithms for visually impaired road users. For the algorithm evaluation, we produce comparable video material from both the driver's and the pedestrian's perspective.
Besides giving us comprehensive insights into relevant traffic scenarios for the visually impaired, the interviews also encourage us to continue our research, as one of the interviewees pointed out that "we currently have the rapid development of smartphones, and with that we are also experiencing more and more comfort. And in this context, such a development and research as yours is of utmost importance, so that one can achieve more safety in road traffic," (IP3).

Fig. 1 Traffic scenarios and their overlapping use cases. Italic use cases are of relevance in ADAS.

Fig. 3 Exemplary results achieved with the proposed adapted algorithm (Left: Original image. Right: Interpolated course of the road). Blue lines mark the subregions. Shows only preselected ROI. Implementation with MATLAB.

Table 1 General orientation
Quote: "[It is] problematic in general to keep an eye on your destination. (...) You can simply get lost easily. (...) And this is a really burdening point for us," (IP3).

Table 2 Navigating to an address
Due to GPS accuracy, the navigational app cannot lead the person directly to the entrance of the building. If the building is unknown to them, they have to ask in order to get to the entrance and possibly find the right door bell.

Table 3 Crossing a road

Table 4 Obstacle avoidance
What Can Go Wrong: Whereas guide dogs are usually trained to detect ground as well as elevated obstacles, it is not possible to detect elevated obstacles with the white cane. The detection and avoidance of a construction site can be difficult. While moving around an obstacle, the person can lose orientation and/or drift away from the TGGS (see Table 1).
Quote: "They [obstacles] impede the walking flow, they interrupt you, you lose direction," (IP4).
Normal: With the help of the white cane or a guide dog, obstacles are detected and avoided.
Traffic

Table 5 Boarding a bus