Mobile Mixed Reality System for Architectural and Construction Site Visualization

Augmented Reality (AR) is a natural development from virtual reality (VR), which was developed several decades earlier. AR complements VR in many ways. Because the user can see both the real and virtual objects simultaneously, AR is far more intuitive, but it is not completely detached from human factors and other restrictions. AR applications also take less time and effort to build, because the entire virtual scene and environment need not be constructed. In this book, several new and emerging application areas of AR are presented and divided into three sections. The first section contains applications in outdoor and mobile AR, such as construction, restoration, security and surveillance. The second section deals with AR applications in medicine, biology and the human body. The third and final section contains a number of new and useful applications in daily living and learning.


Introduction
The Architecture, Engineering and Construction (AEC) sector is widely recognized as one of the most promising application fields for Augmented Reality (AR). Building Information Models (BIM), and in particular the Industry Foundation Classes (IFC) data format, are another main technology driver increasingly used for data sharing and communication purposes in the AEC sector (Koo & Fischer 2000). For example, the Finnish state-owned facility management company Senate Properties demands the use of IFC compatible software and BIM in all their projects (Senate 2007).
At some advanced construction sites, 3D/4D Building Information Models are starting to replace paper drawings as reference media for construction workers. Thus, workers can check daily work tasks using BIM systems installed at site offices, sometimes with remote connections to BIM databases, and even annotate the virtual model with information relating to the construction site. However, the model data is mostly hosted on desktop systems in the site office, which is situated far away from the target location and not easily accessible. Combined with mobile Augmented Reality and time schedules, 4D BIMs could facilitate on-the-spot comparisons of the actual situation at the construction site with the building's planned appearance and other properties at the given moment.
Besides augmented visualization, the related camera tracking technologies open up further application scenarios, enabling mobile location-based feedback from the construction site to the CAD and BIM systems. Such feedback possibilities include adding elements of reality such as images, reports and other comments to the virtual building model, correctly aligned in both time and space. Our discussion thus addresses the complete spectrum of Mixed Reality as defined by Milgram and Kishino (1994), with the real world augmented with virtual model data, and digital building models augmented with real world data.

Shin and Dunston (2008) evaluated 17 classified work tasks in the AEC industry. They concluded that eight of them (layout, excavation, positioning, inspection, coordination, supervision, commenting and strategizing) could potentially benefit from the use of AR. Additionally, related application areas would be communication and marketing prior to construction work, as well as building life cycle applications after the building is constructed.
Among previous work, the first mobile AR system was developed by Feiner et al. (1997). Their application presented an AR view of campus information at Columbia University. Gleue and Thaene (2001) presented the Archeoguide system to provide tourists with an AR view of historical and cultural sites. More recently, Reitmayr and Drummond (2006) presented a robust feature based and hybrid tracking solution for outdoor mobile AR. Among the first to address practical AEC applications, Schall et al. (2008) presented the mobile handheld AR system Vidente for visualizing underground infrastructure. Their work was extended with state-of-the-art sensor fusion methods for outdoor tracking in (Schall et al. 2009). For further references on mobile AR with building construction models, see the thesis by Behzadan (2008) and the review article (Izkara et al. 2009).
However, little research has been done to integrate mobile AR with real world building models, which often contain millions of triangles and are hundreds of megabytes in size. Integrating the time component into mobile AR solutions is another topic that is seldom addressed in previous literature. Among non-mobile solutions, however, let us note the impressive work of Golparvar-Fard et al. (2010). They provide off-line, still image based tools to compare the situation at the construction site against 4D plans, based on a 3D reconstruction of the construction site created from photographs taken of the site.
Our long term research goal has been to prove the technical validity of bringing real world BIM models to the construction site, for augmenting with lightweight mobile devices. Our work on mobile AR dates back to 2003 with the client-server implementation on a PDA device (Pasman & Woodward 2003). The next generation implementation (Honkamaa et al. 2005) produced a marker-free UMPC solution by combining the building's location in Google Earth, the user's GPS position, optical flow tracking and user interaction for tracking initialization. This work led to the first version of the current system architecture (Hakkarainen et al. 2009) to handle arbitrary OSG formats and IFC (instead of just Google Earth's Collada), 4D models for construction time visualization (instead of just 3D), and mobile feedback from the construction site to the design system ("augmented virtuality"). The system was further extended in (Woodward et al. 2010) to cover more accurate map representations, mobile interaction, operation with data glasses, efficient client-server architecture, tracking methods, as well as discussion on photorealistic visualization for mobile AR.

This article gives an overall presentation of our software system, its background, current state and future plans. Among the most recent developments, we present: the client implementation on mobile phones, based on a lightweight optical tracking solution; results of our field trials in different pilot cases, including application during the construction work and comparing previous visualization results with the appearance of a partially ready building; as well as conclusions on the present status of the research.
The article is organized as follows. Section 2 explains the general implementation and functionality of the core software modules. The mobile phone implementation is discussed in Section 3. Our lightweight feature-based tracking solution is presented in Section 4. The photorealistic rendering functionality for mobile AR is described in Section 5. Results from our field trials are presented in Section 6. Items for future work are pointed out in Section 7 and concluding remarks are given in Section 8.

System overview
This section presents the general implementation of the system. The discussion is given mainly from a functional point of view, while a more detailed discussion is provided in (Woodward et al. 2010).

www.intechopen.com

Software modules
Our system is divided into three parts: 4DStudio, MapStudio and OnSitePlayer. The Studio applications fulfill the authoring role of the system and are typically used at the office, while OnSitePlayer provides the augmented reality view and mobile feedback interface at the construction site. OnSitePlayer can be operated either as a stand-alone application or as a client-server solution, distributing heavy 3D computation to the OnSiteServer extension, and tracking and rendering to the OnSiteClient extension. See Figure 1.

The tracking algorithms are based on our software library ALVAR, A Library for Virtual and Augmented Reality (VTT 2011), and the OpenCV computer vision library. The GUI is built using the wxWidgets framework. For rendering, the open-source 3D graphics library OpenSceneGraph (OSG) is used. The applications can handle all OSG supported file formats via OSG's plug-in interface (e.g. OSG's internal format, 3DS, VRML). The TNO IFC Engine3 (TNO 2010) is used as a platform to process IFC building model files.
Augmented Reality -Some Emerging Application Areas 118

4DStudio
The 4DStudio application takes the building model (in IFC or some other format) and the construction project schedule (in MS Project XML format) as input. 4DStudio can then be used to link these into a 4D BIM. 4D IFC models defined with Tekla Structures can also be read directly by 4DStudio. Once the model has been defined, 4DStudio outputs the project description as an XML file.
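The linking step can be pictured with a small Python sketch. It is an illustration only and is not taken from 4DStudio: the `Task` fields, the `links` mapping and the element IDs are hypothetical, and neither the IFC nor the MS Project XML schema is parsed here. The sketch assumes that each building element is linked to exactly one schedule task, and that an element is shown once its task has started.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Task:
    """A schedule task; field names are illustrative, not the MS Project schema."""
    task_id: str
    start: date
    finish: date


def visible_elements(links, tasks, when):
    """Return IDs of building elements whose linked task has started by `when`.

    `links` maps a building element ID to a task ID (the assumed 4D link).
    """
    by_id = {t.task_id: t for t in tasks}
    return sorted(e for e, tid in links.items() if by_id[tid].start <= when)


def elements_in_progress(links, tasks, when):
    """Return IDs of elements whose linked task is under way at `when`."""
    by_id = {t.task_id: t for t in tasks}
    return sorted(
        e for e, tid in links.items()
        if by_id[tid].start <= when <= by_id[tid].finish
    )


# Hypothetical example data.
tasks = [
    Task("T1", date(2010, 10, 1), date(2010, 10, 31)),
    Task("T2", date(2010, 11, 1), date(2010, 12, 15)),
]
links = {"wall-01": "T1", "slab-01": "T1", "roof-01": "T2"}
```

Calling `visible_elements(links, tasks, some_date)` for successive dates gives the sets of elements that a timeline view would display, while `elements_in_progress` could drive a color coding of ongoing work.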
4DStudio has a list of all the building parts and project tasks, from which the user can select the desired elements for visualization. For interaction, 4DStudio provides various tools to select elements for visualization, user definable color coding, clip planes, and viewing the model along the time line. See Figure 2.

Feedback report items generated with the mobile AR system describe, for example, tasks or problems that have been observed at the construction site by workers. These can also be viewed with 4DStudio. Each item contains a title, a task description, a time and location of the task, and optionally one or several digital photos. Selecting a report item in the list takes the 4D building model to the time and location of the report item in question.

MapStudio
The MapStudio application is used to position the models in a geographic coordinate system, using an imported map image of the construction site. The map can be imported from Google Earth or, for more accurate representations, from geospatial data formats such as GeoTIFF. The image import is done using the open source Geospatial Data Abstraction Library (GDAL).
The models are imported from 4DStudio, and can be in any OSG compatible format or in IFC format. The model can either be a main model or a so-called block model, which is used to enrich the AR view, or to mask the main model with existing buildings. The system can also be used to add clipping information to the models; for example, the basement can be hidden in the on-site visualization.
The user can position the models on the map either by entering numerical parameters or by interactively positioning the model with the mouse (see Figure 3). Once all the model information has been defined, the AR scene information is stored as an XML based scene description, ready to be taken out for mobile visualization on site.

OnSitePlayer
OnSitePlayer is launched at the remote location by opening a MapStudio scene description, or by importing a project file containing additional information. The application then provides two separate views in tabs: a map layout of the site with the models, including the user location and viewing direction (see Figure 4), and an augmented view with the models displayed over the real-time video feed (see Figures 5 and 6).
The user is able to request different types of augmented visualizations of the model based on time, for example defining the visualization start-time and end-time freely, using clipping planes, and/or showing the model partially transparent to see the real and existing structures behind the virtual ones. OnSitePlayer also allows for storing augmented still images and video of the visualization, to be later reviewed at the office.
With OnSitePlayer, the user can also create mobile feedback reports consisting of still images annotated with text comments. Each report is registered in the 3D environment at the user's location, camera direction, and moment in time. The reports are attached to the BIM via XML files and are available for browsing with 4DStudio, as explained above.
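To make the content of such a report concrete, the following Python sketch serializes one report to XML with the standard library. The element and attribute names (`report`, `position`, `heading_deg` and so on) are hypothetical; the actual XML schema used by OnSitePlayer and 4DStudio is not given in this article.

```python
import xml.etree.ElementTree as ET


def build_report(title, description, timestamp, position, heading_deg, photos=()):
    """Build a feedback report element.

    All tag and attribute names are hypothetical, chosen for illustration.
    `position` is (latitude, longitude, elevation in meters).
    """
    report = ET.Element("report")
    ET.SubElement(report, "title").text = title
    ET.SubElement(report, "description").text = description
    ET.SubElement(report, "time").text = timestamp
    lat, lon, elevation = position
    ET.SubElement(
        report,
        "position",
        lat=str(lat),
        lon=str(lon),
        elevation=str(elevation),
    )
    ET.SubElement(report, "camera", heading_deg=str(heading_deg))
    for path in photos:
        ET.SubElement(report, "photo", file=path)
    return report


# Hypothetical example values.
report = build_report(
    title="Missing railing",
    description="Temporary railing not installed on the second floor.",
    timestamp="2010-10-14T09:30:00",
    position=(60.17, 24.94, 12.0),
    heading_deg=135.0,
    photos=["report_0001.jpg"],
)
xml_text = ET.tostring(report, encoding="unicode")
```

Because time, position and camera direction are all stored with the report, a viewer can restore both the model's timeline state and the viewpoint when the report is selected.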

Interactive positioning
As GPS positioning does not always work reliably (e.g. when indoors) or accurately enough, we provide the user with the option to indicate his/her location interactively. The system presents the user with the same map layout as used in the MapStudio application. The user is then able to zoom into the map and place the camera icon at his/her currently known location. Note that with manual positioning, possible errors in the model's and the user's positions are aligned and thus eliminated from the model orientation calculation. Additionally, the user's elevation from ground level can be adjusted with a slider.

The interactive alignment of the video and the building models can be achieved in several ways (Woodward et al. 2010). As one option, block models that represent existing buildings can be used as a reference for the initial alignment. However, this approach requires modeling parts of the surrounding environment, which might not always be possible or feasible.
As a more generally applicable approach (Wither et al. 2006), known elements of the real world are marked in MapStudio as "placemarks" (see Figure 4). The mobile user then selects any of the defined placemarks with the "viewfinder" to initialize real time tracking (see Figure 5). The real time augmented view (Figure 6) is produced as the user "shoots" the placemark by pressing a button on the mobile device.

Client-Server Implementation
Virtual building models are often too complex and large to be rendered with mobile devices at a reasonable frame rate. This problem is overcome with the client-server extension for the OnSitePlayer application. The client extension, OnSiteClient, is used at the construction site while the server extension, OnSiteServer, is running at the site office or at some other remote location. Data communication between the client and server can be done using either WLAN or 3G.
The client and server applications were obtained with relatively small modifications to the OnSitePlayer code. The client and server share the same scene description as well as the same construction site geospatial information. The client is responsible for gathering position and orientation information, but instead of rendering the full 3D model, the client just passes the user location and viewing direction to the server. The server uses this information to calculate the correct model view, which is then sent to the client for augmenting on the mobile device.
In our implementation, the view is represented as a textured spherical view of the virtual scene surrounding the user. The sphere is approximated by triangles. An icosahedron was chosen since it is a regular polyhedron formed from equilateral triangles, therefore simplifying the texture generation process. The icosahedron also provides a reasonable tradeoff between speed (number of faces) and accuracy (resolution of images).
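For readers unfamiliar with the geometry, the following Python sketch builds a unit icosahedron (12 vertices, 20 equilateral faces) and finds which face a given viewing direction falls on. It is an illustration under stated assumptions and not the system's code: the function names are hypothetical, faces are found by testing vertex adjacency rather than from a fixed table, and the winding order of the returned faces is not made consistent.

```python
import itertools
import math


def icosahedron():
    """Return (vertices, faces) of an icosahedron inscribed in the unit sphere."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio
    raw = []
    for a in (-1.0, 1.0):
        for b in (-phi, phi):
            # Cyclic permutations of (0, +-1, +-phi) give the 12 vertices.
            raw += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]
    norm = math.sqrt(1.0 + phi * phi)
    vertices = [(x / norm, y / norm, z / norm) for x, y, z in raw]
    edge = 2.0 / norm  # edge length after scaling to the unit sphere

    def adjacent(i, j):
        return abs(math.dist(vertices[i], vertices[j]) - edge) < 1e-9

    # Every triple of mutually adjacent vertices is one triangular face.
    faces = [
        (i, j, k)
        for i, j, k in itertools.combinations(range(len(vertices)), 3)
        if adjacent(i, j) and adjacent(j, k) and adjacent(i, k)
    ]
    return vertices, faces


def face_containing(direction, vertices, faces):
    """Return the index of the face that a ray from the center along `direction` hits."""
    def alignment(face):
        centroid = [sum(vertices[i][axis] for i in face) for axis in range(3)]
        return sum(c * d for c, d in zip(centroid, direction))

    return max(range(len(faces)), key=lambda f: alignment(faces[f]))
```

Since all 20 faces lie at the same distance from the center, the face hit by a viewing ray is simply the one whose centroid is best aligned with the ray, which keeps the lookup from viewing direction to texture image cheap.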
As the scene is rendered into the sphere representation, alpha values are used to indicate transparent parts of each texture image. If some image does not contain any part of the 3D model to be rendered, the whole image can be discarded and not sent to the client. See (Woodward et al. 2010) for further implementation details.
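The discarding rule can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that each face texture is available as tightly packed 8-bit RGBA bytes keyed by a face index; the function names and the data layout are hypothetical and say nothing about how OnSiteServer stores its images.

```python
def has_visible_pixels(rgba: bytes) -> bool:
    """True if any pixel of a tightly packed RGBA8 image has non-zero alpha."""
    return any(rgba[3::4])  # every fourth byte, starting at index 3, is alpha


def textures_to_send(textures):
    """Keep only the face textures that contain some part of the rendered model.

    `textures` maps a face index to its RGBA8 bytes (assumed layout).
    """
    return {face: rgba for face, rgba in textures.items() if has_visible_pixels(rgba)}
```

When the model covers only part of the surroundings, most face images are fully transparent, so a filter of this kind reduces the amount of data transferred to the client.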
The client augments the scene by aligning the sphere to the virtual camera coordinates according to the user's position and camera direction, and renders the alpha textured sphere over the video image. Camera tracking keeps the 2D visualization in place and the user may pan/tilt the view as desired.
The same sphere visualization can be used as long as the user remains at the same location. Our solution generally assumes that the user does not move about while viewing. This is quite a natural assumption, as viewing and interacting with a mobile device while walking would be quite awkward or even dangerous, especially on a construction site. The user is still free to rotate around 360/360º and view the entire sphere projection.

Mobile phone implementation
In the PC based client-server implementation (Woodward et al. 2010), the client and server extensions were obtained by direct modifications to the OnSitePlayer application. With the mobile phone implementation this was not feasible due to the difference of platforms. Also, to create as lightweight a solution as possible, we implemented a whole new client application for the Nokia N900 smart phone (see Figure 7).
The mobile phone client still supports the network connection and data stream provided by the original server on the PC. The application framework is built using Qt SDK 1.0 and Qt Mobility. Rendering is done with OpenGL ES 2.0. The network connection is ad-hoc WLAN. The functionality of our first mobile phone version is restricted to architect's visualization models, without the time component or other advanced features. Positioning is done using the integrated GPS module, without any user interaction. On the other hand, the N900 does not have a compass, so the user is responsible for defining the viewing direction.
All the user interactions are done via the touch screen. The viewing direction is defined with a slightly modified version of the PC based viewfinder approach. On the mobile phone we first show all of the pre-defined viewfinder positions (authored in MapStudio) in an arbitrary direction. The user is then able to swipe the screen and choose the valid viewfinder(s) for the final alignment. After locking the model in the correct position, the viewfinder images are removed from the view and tracking is started.
Model rendering is based on the sphere projection method, as described above. The time needed to download the sphere images from the server depends on the number of images (triangles) required. A new sphere initialization typically takes some 5 seconds, though in the worst case scenario (20 images, model all around the user) it takes up to 30 seconds. The initialization phase could be improved (by up to some 50%) by compressing the raw images and also packing multiple images in one texture. Alternatively, "hot spot" viewing positions can be defined at the office using OnSitePlayer. In this case the sphere images are stored beforehand in the OnSiteClient's scene description and no downloads or even connection to the server are required.

Tracking
We have developed altogether three vision based tracking methods to be used in different use cases. Two solutions were developed for the OnSiteClient application, one for PC and one for mobile phone. These solutions assume the user stands at one position, at least a few meters away from the target object, and explores the world by panning with the mobile device (camera). A separate solution was developed for the stand-alone OnSitePlayer on PC, allowing the user also to move freely while viewing. While the PC based tracking solutions have been described in our previous article (Woodward et al. 2010), the implementation on mobile phone is new and is described in the following.

Tracking on mobile phone
Our lightweight markerless tracking solution designed for the mobile phone client application is based on rotation-invariant fast features (RIFF) (Takacs et al. 2010) and the FAST interest point detector (Rosten & Drummond 2006). The implementation follows closely the tracking logic of (Takacs et al. 2010) with the following modification. Instead of matching detected RIFF descriptors between two consecutive frames, we maintain a set of 3D features and assign one descriptor to each 3D feature. For each camera frame we select a subset of these 3D features by projecting the features using a predicted camera orientation and choosing features evenly across the image. To maintain real-time performance, only a limited number of features are selected. For each selected feature, matching descriptors are then searched around the projected feature positions. We use the same search radius of 8 pixels as in (Takacs et al. 2010).
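One simple way to choose features evenly across the image is to divide the image into grid cells and keep at most one projected feature per cell. The Python sketch below illustrates this idea only; the grid approach, the cell size, the feature limit and the use of a per-feature quality value for ranking are assumptions made here, not details confirmed by the description above.

```python
def select_evenly(projected, width, height, cell=80, max_features=50):
    """Pick at most one feature per grid cell, up to `max_features`.

    `projected` is a list of (feature_id, u, v, quality) tuples, where (u, v)
    is the feature position predicted in the current frame. `cell` and
    `max_features` are illustrative values.
    """
    best = {}
    for fid, u, v, quality in projected:
        if not (0 <= u < width and 0 <= v < height):
            continue  # feature falls outside the predicted view
        key = (int(u) // cell, int(v) // cell)
        if key not in best or quality > best[key][3]:
            best[key] = (fid, u, v, quality)
    # Prefer the most reliable features when the limit is reached.
    chosen = sorted(best.values(), key=lambda f: f[3], reverse=True)
    return [f[0] for f in chosen[:max_features]]
```

Descriptor matching would then be run only for the returned feature IDs, within the search radius around each predicted position.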
Since descriptor matching gives correspondences between image corners and 3D features, the camera orientation is estimated simply by minimizing the re-projection error of the features. We use the Levenberg-Marquardt optimization routine for orientation estimation as in our previous implementation. We process each image pyramid level separately, and the optimized orientation of the previous pyramid level is used as the initial camera orientation for the next pyramid level. For the first pyramid level, the final result of the previous frame is used instead.

Once all image pyramid levels have been processed, the set of 3D features is updated. First, outliers are detected from the residual re-projection errors. Feature quality values are increased for inliers and decreased for outliers. Once the quality value of a feature drops below a threshold, the feature is completely removed from the feature set. New 3D features are created by choosing strong FAST corners and back-projecting the corners onto the surface of a sphere centered at the camera. New features are created only in image regions where there are no existing features.
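The geometry of the feature bookkeeping can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a pinhole camera with intrinsics `fx, fy, cx, cy`, a rotation-only camera model with a row-major world-to-camera matrix, and a fixed pixel threshold for inliers. The function names and threshold values are hypothetical, and the Levenberg-Marquardt orientation estimation itself is not shown.

```python
import math


def back_project(u, v, fx, fy, cx, cy, rotation):
    """Map pixel (u, v) to a unit direction on a sphere centered at the camera.

    `rotation` is a 3x3 row-major world-to-camera rotation matrix.
    """
    x, y, z = (u - cx) / fx, (v - cy) / fy, 1.0
    n = math.sqrt(x * x + y * y + z * z)
    cam = (x / n, y / n, z / n)
    # world = R^T * cam (the inverse of a rotation is its transpose)
    return tuple(sum(rotation[r][c] * cam[r] for r in range(3)) for c in range(3))


def project(point, fx, fy, cx, cy, rotation):
    """Project a world direction into the image, or return None if behind the camera."""
    cam = [sum(rotation[r][c] * point[c] for c in range(3)) for r in range(3)]
    if cam[2] <= 0.0:
        return None
    return (fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy)


def update_quality(quality, residuals, inlier_px=3.0, step=1, min_quality=0):
    """Raise quality for inliers, lower it for outliers, drop weak features.

    `quality` maps feature ID to its quality value; `residuals` maps the IDs
    matched in this frame to their re-projection error in pixels. The
    threshold values are illustrative.
    """
    updated = {}
    for fid, q in quality.items():
        if fid not in residuals:
            updated[fid] = q  # not matched in this frame: leave unchanged
            continue
        q = q + step if residuals[fid] <= inlier_px else q - step
        if q >= min_quality:
            updated[fid] = q
    return updated
```

Because the features live on a sphere around the camera, only the camera orientation needs to be estimated, which matches the assumption that the user stands in one place and pans the device.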
Compared to our previous lightweight implementation (Woodward et al. 2010), the use of RIFF descriptors and FAST corners gives two clear benefits. Firstly, detecting FAST corners is much faster than the previously used interest point detector (Shi & Tomasi 1994). With a carefully optimized implementation we are able to reach a real-time performance of 30 FPS on the N900 mobile phone. Secondly, by tracking features using descriptor matching instead of the optical flow method of Lucas and Kanade (1981), we gain some ability for local recovery. The orientation of the camera is not updated if the tracker fails to match enough feature descriptors. In that case, the user can rotate the camera to bring more inlier features back into the camera view, thus restoring the previously found orientation.

Rendering
On-site visualization of architectural models differs somewhat from general purpose rendering (Klein & Murray 2008), (Aittala 2010) and the methods should be adapted to the particular characteristics of the application for optimal results. The following special characteristics typical for mobile architectural visualization were identified in (Woodward et al. 2010):

• Uneven tessellation of 3D CAD building models
• Shadow mapping methods, related to the previous item
• Complex and constantly changing lighting conditions
• Aliasing problems with highly detailed building models
• Sharp computer graphics vs. web camera image quality

We have experimented with the rendering and light source discovery methods described in (Aittala 2010) and integrated them into the OnSitePlayer application. Figure 8 shows an example of applying our rendering methods with a pilot project. The present implementation of the rendering methods covers: determining the sun light direction based on GPS, date and time of day; interaction with sliders to adjust day light intensities; screen-space ambient occlusion; soft shadows based on shadow maps; and adjusting the rendered image quality to web camera aberrations. Automatic lighting acquisition from the real scene (Aittala 2010) has not been integrated into our system yet, and the current implementation has been done for the stand-alone OnSitePlayer system only. We plan to implement more advanced features also with the client-server solution, using separate feedback mechanisms for interaction and passing of lighting conditions of the real world scene to the server.
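As an illustration of how a sun light direction can be derived from GPS position, date and time of day, the Python sketch below uses a common textbook approximation: Cooper's formula for the solar declination and an hour angle computed from UTC time and longitude. This is an assumption made for the example and not necessarily the method used in OnSitePlayer; it ignores the equation of time and atmospheric refraction, so the result can be off by a few degrees. The function name is hypothetical.

```python
import math
from datetime import datetime, timezone


def sun_direction(lat_deg, lon_deg, when_utc):
    """Approximate unit vector (east, north, up) pointing toward the sun.

    `when_utc` must be a datetime expressed in UTC. Longitude is positive
    toward the east. Accuracy is limited (no equation of time, no refraction).
    """
    day = when_utc.timetuple().tm_yday
    # Cooper's approximation of the solar declination.
    decl = math.radians(23.45) * math.sin(2.0 * math.pi * (284 + day) / 365.0)
    hours = when_utc.hour + when_utc.minute / 60.0 + when_utc.second / 3600.0
    solar_time = hours + lon_deg / 15.0  # approximate local solar time in hours
    hour_angle = math.radians(15.0 * (solar_time - 12.0))
    lat = math.radians(lat_deg)
    east = -math.cos(decl) * math.sin(hour_angle)
    north = (math.cos(lat) * math.sin(decl)
             - math.sin(lat) * math.cos(decl) * math.cos(hour_angle))
    up = (math.sin(lat) * math.sin(decl)
          + math.cos(lat) * math.cos(decl) * math.cos(hour_angle))
    return east, north, up
```

A negative `up` component means the sun is below the horizon. The returned vector can be used directly as the direction of a directional light once it is converted to the coordinate frame of the scene.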

Field trials
Several iterations of field trials have been performed with three pilot cases. The first mobile use experiments were done with a laptop PC device in summer 2009. We used the 4D model of the Koutalaki hotel in Lapland as an example and augmented it behind our Digitalo offices in Espoo. The experiment enabled us to verify that most of the intended functionality was already operational, including e.g. visualizing the building in various modes and along the timeline, masking the virtual model with the real one, creating and viewing of mobile feedback reports, etc. However, some problems were noticed with the user interface; especially the PC screen brightness was far from sufficient in bright daylight. Also, the poor accuracy of the compass as well as GPS was noticed to be a major problem in practice. This stimulated our decision to develop interactive positioning methods as backup for the sensors.
A second round of experiments was carried out in fall 2009 in the case of the Forchem oil refinery in Sweden, with the purpose of augmenting new equipment to be installed, using a Sony Vaio UX as the mobile device (see Figure 9). Video of these experiments is available in (VTT 2010a).

In October 2010, when the construction work had already started, we finally received the complete 4D model of the Skanska building (IFC model size 60 MB) and went out to try it at the construction site. We could then verify that our solution also worked in practice with this rather demanding experiment. With some user interaction, we were able to augment the complex model on site, and display the construction elements to be installed at different time frames and from various viewpoints. With respect to tracking initialization, managing altitude information interactively was considered to be the biggest problem. The stand-alone laptop PC version was used in these experiments. See Figure 10 and video (VTT 2010b).

Future work
For practical reasons, we still have a number of stand-alone OnSitePlayer features yet to be integrated in the client-server solution. Also, integration of our feature based tracking methods with sensor data as well as photorealistic rendering technology into the AR system is still under way. Some near term plans for interaction, tracking and rendering enhancements were discussed above, and previously in (Woodward et al. 2010). Positioning accuracy could also be improved by applying more accurate methods, e.g. differential GPS, Real Time Kinematics (RTK) and other measurement tools that are routinely employed at construction sites.
In the future, we also look forward to obtaining feedback from different user groups. The first formal user studies with the system will be performed in our next outdoor visualization project in September 2011. Handing out the system to actual end users will certainly bring up various proposals and wishes for improvements to the system. Instead of adding new functionality, however, we anticipate a general request to simplify the user interface and limit it to the most essential features.

Conclusions
In this article, we have described a software system for mobile mixed reality interaction with complex 4D Building Information Models. Our system supports various native and standard CAD/BIM formats, combining them with time schedule information, fixing them to accurate geographic representations, using augmented reality with feature based tracking to visualize them on site, applying photorealistic rendering, with various tools for mobile user interaction and feedback. The client-server solution is able to handle complex models on mobile devices, and an efficient tracking solution enables implementation also on mobile phones.
While there is still some way to go until the technology is in daily use at real construction sites, and there are some general concerns for applicability such as weather conditions, we believe that we have proven the technical validity of the concept. In particular, mobile AR visualization of architectural models is already quite manageable with the present system. We look forward to evaluating our system with user tests in the future, and eventually to bringing our solutions to real production use.

Fig. 4. User position and placemark shown in OnSitePlayer.

Compass (if any) does not always provide sufficient grounds for automatic tracking initialization. As backup, interactive means are provided for model alignment. After the model is properly aligned the system switches to feature-based tracking.

Fig. 9. Mobile AR view of Forchem factory on a UMPC.

In the Forchem case we relied completely on our 3D feature based tracking solution without sensors (Woodward et al. 2010). Tracking was initialized manually by having the user indicate point correspondences between the video image and the 3D model of the factory. As a hypothesis for future work, we believe this initialization step could be avoided by first roughly aligning the video and the model using compass information, and based on that, finding the actual point correspondences automatically.

Our most comprehensive field tests were conducted in a series of experiments with the new Skanska offices in Helsinki 2010-2011. In summer 2010, before the building work started, we compared AR visualization of the planned building with different display devices: laptop PC on a podium, attached data glasses, and UMPC client. The first two devices were used in stand-alone mode while the UMPC was used in client-server mode. For rendering, we compared standard computer graphics without adjustments against our photorealistic rendering methods to account for light direction, intensity and other visual properties. See Figures 2-8 and video (VTT 2010b).

Fig. 10. Mobile AR during construction work.

Harsh winter interrupted our field tests for almost half a year. The most recent experiments with the Skanska pilot were done in May 2011 when the back part of the building was already completed and also the first version of our mobile phone implementation was ready. In these experiments we were able to verify that our mobile phone solution using the new tracking method and pre-defined placemarks on the scene provided a stable augmented view of the building (see Figure 7). Comparison of the OnSitePlayer view which we had computed nine months earlier (Figure 8) against the real situation at the site (Figure 11) also validated the quality of our photorealistic rendering methods.