IMAGE CAPTURE WITH SYNCHRONIZED MULTIPLE-CAMERAS FOR EXTRACTION OF ACCURATE GEOMETRIES

Abstract. This paper presents a project for recording and modelling tunnels, traffic circles and roads from multiple sensors. The aim is the accurate 3D modelling of a selection of road infrastructures as dense point clouds, from which profiles and metrics can be extracted. These models will be used for the sizing of infrastructures in order to simulate routes for exceptional convoy trucks. The objective is to extract directly from the point clouds the heights, widths and lengths of bridges and tunnels and the diameter of traffic circles, and to highlight potential obstacles for a convoy. Light, mobile and fast acquisition approaches based on images and videos from a set of synchronized sensors have been tested in order to obtain usable point clouds. The presented solution is based on a combination of multiple low-cost cameras mounted on an on-board device allowing dynamic captures. The experimental device, containing GoPro Hero4 cameras, has been set up and used for tests in static and mobile acquisitions. Various configurations of multiple synchronized cameras have been tested. These configurations are discussed in order to highlight the best operational configuration according to the shape of the acquired objects. As the precise calibration of each sensor and its optics is a major factor in the creation of accurate dense point clouds, and in order to reach the best quality available from such cameras, the internal parameters of the cameras' fisheye lenses have been estimated. Reference measurements were also made with a 3D TLS (Faro Focus 3D) to allow the accuracy assessment.


INTRODUCTION
Manual recording methods based on direct measurements of road infrastructures are time consuming and dangerous because of the traffic, which can be dense on roads and in tunnels. The challenge is therefore to define a measuring method that is as automatic as possible and carried out under the best safety conditions. In this context, the purpose of this study is to test action cameras and to define the best configurations for acquiring images from which dense point clouds can then be generated by photogrammetry. From these point clouds, the various road infrastructures can be sized or handled in the form of profiles (Browstof et al., 2008). These results can then be used to verify whether exceptional convoy trucks can pass.

APPLICATIONS
During this study, a number of protocols were set up to test the best configurations and to verify whether results at the centimetre accuracy level could be obtained by using action cameras. The first step is to study the equipment used and the devices set up. General protocols are then proposed for the acquisition of tunnel underpasses, traffic circles or even whole streets presenting very constrained geometries.

Action cameras
2.1.1 Acquisition devices: action cameras: GoPro Hero 3 and 4 cameras were used as acquisition devices. Our laboratory has six GoPro Hero 4 Black Edition cameras, which are action cameras with fisheye lenses. These cameras are widely known for their robustness and low cost. The Hero 4 model can film in 4K at 30 images per second and take pictures with a resolution of 12 Mpx, with various configuration modes and Bluetooth and Wi-Fi connectivity. It remains today the reference among action cameras, a market which emerged only a few years ago. We also have a panoramic assembly which can hold up to 6 sensors to produce 360° panoramic images. This tool fixes the position of the cameras, so pictures are always taken in the same way and under the same conditions. The image quality is thus the same, which ensures a homogeneous precision within the project. The total weight of the assembly including 6 GoPro action cameras is 1.270 kg. The cameras can be synchronized and the release can be controlled by a single remote controller. The devices cannot guarantee a perfect synchronization, but synchronization is not the most important factor for this project. It has been observed that the shooting impulses could be synchronized, but each camera then works autonomously and is triggered only after having gathered the correct image parameters (luminosity, sharpness). Depending on the type of object to be acquired, in particular in tunnels, the light conditions were not optimal and sometimes very difficult, which revealed release differences of up to one second in single-shot mode.
For these experiments six cameras have been used.

Calibration:
For photogrammetric purposes, camera calibration is very important (Balletti et al., 2014; Wiggenhagen, 2002). It is required in every photogrammetric process to achieve quality results. Calibration determines the internal parameters of the camera. The propagation of errors must be minimal so as not to affect the final project. The principle of calibration is to use abundant data, by capturing a network of known points, to estimate the internal parameters of the camera. The focal length, the coordinates of the principal point and the optical distortions are thus determined. Knowing them then allows the relative orientation of the images. The calibration of the sensor and the optics is also necessary for the creation of accurate dense point clouds. The PhotoModeler Scanner software is specialized in close-range photogrammetry, a discipline that relies on images taken with professional cameras or with devices specialized in this domain. Its calibration module is therefore optimized for a distortion model based on central projection, and instruments with fisheye optics are not well suited to this tool. It is nevertheless possible to calibrate the internal orientation parameters of GoPro cameras. In their case the projection is equidistant. The mathematical model follows Brown's description (1971). Wiggenhagen (2002) specifies the notations and the conversion formulas between Brown's model and his own.
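As a rough illustration of why a fisheye lens fits poorly into a calibration module built around central projection, the sketch below compares the image radius predicted by the two projection models for the same incidence angle. The function names and the 3 mm focal length are illustrative assumptions and not values from this study.

```python
import math

def central_radius(f_mm, theta_rad):
    """Image radius under central (perspective) projection: r = f * tan(theta)."""
    return f_mm * math.tan(theta_rad)

def equidistant_radius(f_mm, theta_rad):
    """Image radius under equidistant fisheye projection: r = f * theta."""
    return f_mm * theta_rad

# Hypothetical focal length, chosen only to make the comparison readable.
F_MM = 3.0
for deg in (10, 45, 80):
    theta = math.radians(deg)
    print(deg, round(central_radius(F_MM, theta), 3),
          round(equidistant_radius(F_MM, theta), 3))
```

Near the optical axis the two radii are almost identical, but towards the edge of a wide field of view the central-projection radius grows without bound while the equidistant radius stays proportional to the angle.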

Calibration with Photomodeler.
The calibration module offers two possible methods. The first method uses a flat test grid; the second uses coded targets distributed in space. It is also possible to calculate the internal parameters of the camera from a captured object and from numerous control points known in a single coordinate system. We first carried out a calibration on the dedicated test grid. The acquisitions and the results were difficult to manage. The wide viewing angle of the optics required moving closer to the calibration test sheet. The possible capture locations were therefore very restricted and the sensor surface covered was much reduced. We then used another procedure given by Eos Systems PhotoModeler (2013). This methodology recommends the use of targets for calibration, and the procedure led us to create a calibration zone in 3D space. As the objective of the project was to capture roads, tunnels, houses, etc., which are volumes, the size of the targets was calculated so that they are visible at a distance of 4 m from the sensor.
The targets were arranged in a corner of a room. The room lacks sufficient light to obtain sharp targets in the images, a criterion that is necessary for the effective correlation of targets. A 150 W light projector improved the contrast. The calibration of a camera does not require scaling. Numerous calibrations were carried out to find a suitable method giving a consistent result. The calibration zone is presented in Figure 1. The spread of the targets allows a wide variety of possible camera positions in the zone. The GoPro is stabilized on a tripod.
The calibration process used is as follows:
- Installation of the targets;
- GoPro camera setup: single-shot mode, ISO 400 sensitivity, Wi-Fi connection with a smartphone;
- Positioning of the GoPro camera on the tripod;
- Six images of the complete zone, moving the camera between shots;
- Six images of the central zone;
- Six images with the zone on the edge of the sensor.
The ISO sensitivity was selected for the intended mode of use, namely continuous shooting at a regular time interval. For static observations (single shots), this value can be reduced to ISO 100. As the device has no control screen to display the resulting images, and as it has a fisheye lens, the connection to a smartphone makes it possible to check the orientation and the visibility of the targets in the images. The first twelve images provide redundancy and stability in the determination of the internal parameters. The additional images increase the calibrated sensor surface. A poor coverage of the sensor leads to an extrapolation of the distortions, which will probably be inaccurate at the sensor borders. Coloured sticks were placed on the ground so that the configuration of targets is similar for all calibrations. The six GoPro Hero 4 Black Edition cameras were calibrated in this same way. The estimation of the internal parameters of the cameras constitutes the next stage.
Determination of the internal parameters: for the calibration, the PhotoModeler software was used in multi-sheet calibration mode. The software analyses the images by detecting and identifying the coded targets. A correlation process is used to find the position of each target, first approximately and then precisely. Once the detections are complete, the software calculates the parameters a first time. For our camera type, the parameter determination often diverges and the calculation is then stopped. A close inspection of the images makes it possible to verify that the characteristic objects have been taken into account. Where necessary, the "Referencing mode" tool was used to point approximately at the zone of an undetected target and to relaunch the detection by the algorithm. The parameter determination was improved by modifying the configuration. By default, the search for targets is always carried out before the estimation. Radial distortions make up the majority of the systematic errors of the cameras. The number of unknowns was therefore reduced by limiting the calculation of the radial distortion coefficients to K1 and K2 of Brown's model (1971). The various experiments showed a high correlation between the coefficients K1, K2 and K3. The non-uniformity of the sensors is underlined by rather different values. This observation is based on six cameras of the same model, which show different values for the sensor dimensions, the coordinates of the principal point and the focal length. According to the results (Table 1), an individual calibration seems relevant in order to obtain, for each sensor, a distortion model and the parameters essential to the relative orientation. These are very important in the subsequent matching process for the creation of dense point clouds.
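To make the restriction to K1 and K2 concrete, the following sketch applies a radial correction of the Brown type to image coordinates measured from the principal point. It is only an illustration: the sign convention, the units and the function names are assumptions, and calibration packages such as PhotoModeler may define the coefficients differently.

```python
def radial_correction(x, y, k1, k2):
    """Return the radial correction (dx, dy) for a point (x, y) given
    relative to the principal point, using only K1 and K2.

    Assumed form: dr/r = K1 * r^2 + K2 * r^4 (K3 and higher terms omitted).
    """
    r2 = x * x + y * y
    factor = k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor

def correct_point(x, y, k1, k2):
    """Apply the radial correction to the measured coordinates."""
    dx, dy = radial_correction(x, y, k1, k2)
    return x + dx, y + dy
```

Dropping K3 removes one unknown from the adjustment, which is a common way to stabilise the estimation when the coefficients are strongly correlated.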

Calibration using Photoscan.
This software, developed by Agisoft, was chosen for its speed and flexibility of use. The Agisoft Lens application allows calibration and calibration file conversion. The interoperability of this software module makes it possible to import an existing calibration and thus bypass the calibration step. Imported files can come from the Australis, PhotoModeler, 3DM and Calcam software packages. Agisoft Lens also has a calibration algorithm based on OpenCV code. The calibration pattern is a checkerboard. Unlike in PhotoModeler, the internal parameters to be calculated during calibration cannot be freely chosen. The user may only configure the radial distortions K1, K2 and K3. The focal length and the coordinates of the principal point are also estimated.

Comparison between PhotoModeler and Photoscan calibration results.
At this experimental stage, two sets of parameters are available in PhotoScan: one calculated directly in the software and another imported from PhotoModeler. The method used to convert the PhotoModeler parameters to PhotoScan is not exactly known, but Table 2 shows that the results are very close. These similarities indicate that the calibration of the GoPro cameras was successful and that the internal parameters are now known with precision and accuracy. In future, an in-situ calibration will define these parameters precisely.

Static acquisition process for tunnels
In order to evaluate the photogrammetric point clouds qualitatively and quantitatively, two underpasses were chosen. These two tunnels differ in several aspects:
- The size (2.60 m, and from 4.40 m to 6.60 m, in height);
- The environment (urban and rural);
- The geometry (top structure straight or oblique);
- The number of elements (single or double);
- The traffic.

Reference model by Faro TLS:
A TLS supplies outstanding results in a short time if the acquisition process is well mastered. Our laboratory uses a Faro Focus X330 TLS. This ultra-fast TLS can scan objects at up to 330 metres, even in strong sunlight. Two scan stations were set up, one on each side of the first tunnel, to digitize it. For the second tunnel, three stations were necessary to survey both extremities and a junction element in the centre of the structure. Spheres were distributed mainly under the tunnel so that they were visible from all scan stations. It is important to have at least three common spheres per pair of stations in order to perform an accurate consolidation. To ensure a controlled spatial resection, ten spheres were distributed in the project zone at different heights (some laid directly on the ground and others raised on tripods). Depending on the desired accuracy of the cloud and on the scanned object, various acquisition modes are possible. To achieve centimetre accuracy, an angular resolution of 1/5 was selected. With the chosen quality (4x), the scanner takes four measurements for each surveyed point. It was also decided to scan the entire environment around the device. The acquisition time for each scan station was about 10 minutes.

Processing of point clouds:
For the processing, the Faro Scene software package was used. After importing the point clouds, the consolidation was performed using the spheres. Although sphere detection is automatic, the software sometimes fails to find the objects, in which case the operator has to pick the object in the point cloud. The identification of these elements allows the position of the TLS to be calculated. The reported RMS of the relative scanner positions was 3 mm. Once the consolidation is complete, the point cloud of each scan station has to be exported in a standard format. The operator then has to clean up the point clouds by removing unwanted elements (moving vehicles, spheres used for registration). The resulting data are used to scale the photogrammetric model and serve as the reference for comparison. At this stage, the clouds were resampled to ease handling and reduce the processing time of the evaluations. The chosen density is one point every 2 cm.
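One simple way to obtain a density of roughly one point every 2 cm is a voxel grid filter, sketched below. This is an assumption made for illustration: the text does not state which resampling method the software applies, and the function name and the choice of the cell centroid as representative point are hypothetical.

```python
import math
from collections import defaultdict

def voxel_downsample(points, cell=0.02):
    """Keep one point (the centroid) per cubic cell of side `cell` metres.

    `points` is an iterable of (x, y, z) tuples in metres.
    """
    cells = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell), math.floor(y / cell), math.floor(z / cell))
        cells[key].append((x, y, z))
    reduced = []
    for pts in cells.values():
        n = len(pts)
        reduced.append((sum(p[0] for p in pts) / n,
                        sum(p[1] for p in pts) / n,
                        sum(p[2] for p in pts) / n))
    return reduced
```

Applying the same cell size to the reference and to the photogrammetric cloud keeps their densities comparable before any distance computation.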

Bar mounting device:
A mounting bar was designed especially for the simultaneous mounting of 4 cameras. The system used in this experiment consists of a perforated metal bar onto which handlebar mounts that can house small cameras are screwed. It was designed to be attached to a car roof bar so that acquisitions can be completed in dynamic mode. This is the first version of the system. To ensure optimal image capture, the prototype will be improved by strengthening the metal bar to avoid twisting and to minimize the vibrations that directly affect the quality of the images and therefore the accuracy of the final result.

Static acquisition:
The mounting device allows the use of multiple configurations which are set out in Figure 2.

Modeling a tunnel by static image acquisition:
The images were captured on an overcast day. The exposure time was long enough for most of them. The mounting was supported by a tripod, perpendicular to the side walls. The assembly was held at 1.50 m for pictures of the top structure and at 1.10 m for the acquisition of the roadway (configuration #1) and the walls (configuration #5). Figure 3 shows the shooting locations for the mounting at -45° relative to the side wall (configuration #4). Additional photos were taken at positions between the blue dots. The panoramic mounting with five sensors was used on a tripod in the centre of the roadway. We used the same acquisition mode as in a test corridor, i.e. two passes for each surface. In total, with the four-camera mounting, we made eight single passes of six positions each, giving 192 pictures. On average, the overlap between the images taken at the same position is 84%. This high value is explained by the fisheye lenses used. The overlap between two positions is about 70%. With this configuration we obtain an object pixel size of 2 mm. The panoramic assembly allowed the capture of 30 photos in six different positions, which corresponds to three stations per roadway.
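The image counts quoted above follow directly from the acquisition plan, and the spacing between positions can be related to the overlap. The short sketch below reproduces this arithmetic; the helper names and the footprint length used in the test are hypothetical, since the image footprint on the object is not given in the text.

```python
def image_count(passes, positions_per_pass, cameras):
    """Number of images for a set of passes with a multi-camera mounting."""
    return passes * positions_per_pass * cameras

def step_for_overlap(footprint_m, overlap):
    """Distance between two positions that yields the given overlap
    (as a fraction) for an image footprint of `footprint_m` metres."""
    return footprint_m * (1.0 - overlap)

# Four-camera bar: eight passes of six positions each.
bar_images = image_count(passes=8, positions_per_pass=6, cameras=4)    # 192
# Panoramic mounting: five sensors at six positions.
pano_images = image_count(passes=1, positions_per_pass=6, cameras=5)   # 30
```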

Results:
The best result obtained is derived from 102 images (Figure 4). The majority of the images come from the mountings of configurations #1 and #5.

Comparison to the reference point cloud:
For this step, the free software CloudCompare was used. Its ease of use and additional features make it a reference tool for the analysis of point clouds. The first result of this comparison is based on all the points of the model. It shows a dispersion of about 80 cm at the extremities of the reference point cloud. The histogram of this first analysis shows that 99.5% of the points lie within [-20 cm; +20 cm] of the reference point cloud. After refining the analysis, the mean deviation was 3.0 cm with a standard deviation of 3.4 cm. These results suggest an unfavourable systematic effect in this case study. The errors are probably due to a heterogeneous distribution of the control points during the absolute orientation and the scaling of the model.
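The statistics reported here can be reproduced from any list of signed cloud-to-cloud distances. The sketch below is a minimal version under one assumption that the text does not confirm: that "refining the analysis" means restricting the statistics to the points inside the ±20 cm interval. The function name and dictionary keys are illustrative.

```python
import statistics

def deviation_summary(distances_m, limit_m=0.20):
    """Summarise signed distances (metres) to the reference cloud.

    Returns the share of points within [-limit_m, +limit_m] and the mean
    and standard deviation computed on those points only (assumed filter).
    """
    inside = [d for d in distances_m if -limit_m <= d <= limit_m]
    return {
        "share_inside": len(inside) / len(distances_m),
        "mean": statistics.fmean(inside),
        "std": statistics.pstdev(inside),
    }
```

A mean that is clearly different from zero, as in this case study, points to a systematic shift between the clouds rather than to random noise alone.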

Conclusion:
The error distribution shows a systematic effect. The facades without control points are more prone to modelling errors. The roadway has no control points, and the errors seem to increase towards the end of the densified portion. The upper corners between the side walls and the top structure were not reconstructed by the process. However, this part is not very important for obtaining the desired profiles.

Fig. 4: Point cloud of the tunnel
The result is very encouraging. It shows that the mounting, the number of photos and the processing were handled properly. The resulting model is consistent with the expected accuracy.

Objective:
Likewise, the objective was to establish a simple and rapid acquisition protocol to obtain the precise geometry of traffic circles using the same type of sensors. The acquisition hardware consists of the same mounting, fixed to a vehicle to enable mobile acquisition. The device is mounted on a roof bar at the front of the vehicle (Figure 5). Each camera has a large degree of freedom so that it can be oriented as required.

Fig. 5: Acquisition device mounted on a roof bar
Acquisition was performed with synchronized cameras (even though synchronization was not mandatory for this experiment, as previously explained). The vehicle drives slowly around the traffic circle, allowing the simultaneous acquisition of 4 images all along the way. Figure 6 shows the trajectories of the cameras. In this case, it is the circular shape of the object, and hence the circular trajectory, that makes it possible to obtain convergent images.
Test #1: For these captures, the following settings were used:
- Picture mode: 12 Mpx
- Angle: wide (170°)
- Time lapse: 2 images/sec
- Vehicle speed: about 10 km/h
With these parameters we obtain the following project characteristics:
- Displacement between 2 images: 1.80 m (camera #1) and 2.50 m (camera #4)
- Total number of images: 126
- GCP: 6

Dynamic acquisition of streets and obstacles
At another test site, the suitability of the device for the acquisition of streets was checked.

Acquisition device:
The cameras were installed on the vehicle at the back and on the right side. With this device, the cameras can always be oriented with a wide degree of freedom (Figures 9 and 10).

Conditions of dynamic image acquisition:
The images are captured as videos (Teo, 2015; Forlani et al., 2005) at different camera resolutions (2.1 Mpx or 4K). The frame rate is 30 frames/sec. The vehicle speed was about 15 km/h.

Processing of videos sequences:
The Photoscan software was used again for the generation of the dense point clouds. First, the video sequences were resampled by keeping only one image out of ten, which corresponds to one image every 0.33 sec. For a section of 150 metres there were thus 486 images per camera, i.e. 1944 images in total to be processed. The sections were captured in both directions (go and return), with the device fixed first on the back and then on the side of the vehicle. Four passes were thus necessary for a complete acquisition. The scenes were georeferenced using 14 reference points (Figure 11). A comparison of profiles extracted from the point clouds gives satisfactory results, although the comparison is distorted by ill-defined elements (vegetation). Other tests should be conducted with higher resolutions; here only 4K was available.
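The resampling figures can be checked with a few lines of arithmetic. The sketch below uses the values given in the text (30 frames/sec, one frame kept out of ten, about 15 km/h, 486 images per camera, four cameras); the function names are illustrative, and the spacing is only indicative because the speed is approximate.

```python
def frame_interval_s(fps, keep_every):
    """Time between two retained frames when keeping one frame out of `keep_every`."""
    return keep_every / fps

def spacing_m(speed_kmh, interval_s):
    """Distance travelled by the vehicle between two retained frames."""
    return speed_kmh / 3.6 * interval_s

def retained_indices(n_frames, keep_every):
    """Indices of the frames kept from a video of `n_frames` frames."""
    return list(range(0, n_frames, keep_every))

interval = frame_interval_s(fps=30, keep_every=10)   # about 0.33 s
base = spacing_m(speed_kmh=15, interval_s=interval)  # roughly 1.4 m
total_images = 486 * 4                               # 1944 images for four cameras
```

Keeping one frame out of twenty, as in the additional test mentioned in this study, doubles the spacing between retained frames and halves the number of images to process.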
The mounting was fixed on the right side of the vehicle. Fixing it on the left side can be expected to give a better connection between the images, because it would increase the distance to the object and the overlap percentage.

Final recommendations
As final recommendations, the following advice can be given:
- A good and uniform spatial distribution of GCP seems important. GCP on facades are also encouraged, to provide sufficient height differences in the scene;
- The use of a single camera or of convergent cameras will increase the redundancy. Current software packages have difficulty combining different acquisition devices with different parameters;
- Use a high-resolution sensor to increase the number of points in the dense point clouds;
- Use camera orientations with favourable intersections.

PERSPECTIVES AND IMPROVEMENTS
The dynamic acquisition mode can be improved. The quality of images obtained from a moving vehicle, moreover on rather dark surfaces, is not optimal. Additional lighting could help to improve the device, but would also weigh it down. Other technologies such as 3D cameras or structured light projection sensors could be an alternative to solve the problem of low light in tunnels. This experimental system is intended to be continuously improved in order to enhance the whole recording and post-processing workflow and therefore the accuracy of the 3D geometric models.
From the point clouds obtained, it is possible to extract profiles (Figure 16) which are useful for the sizing of road structures and adjacent infrastructures. Even if the accuracy is not as good as that of a TLS point cloud, the method is fast and accurate enough to estimate whether a convoy can pass. The method must be improved through further tests on other structures, in other environments and under other image acquisition conditions.
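Profile extraction of this kind can be sketched as cutting a thin slice through the cloud and measuring its extent. The example below is a simplified illustration and not the procedure used in this study: it assumes the cloud has been rotated so that the x axis follows the road axis, and the function names and the 10 cm slice thickness are hypothetical.

```python
def extract_profile(points, station, thickness=0.10):
    """Return the (y, z) coordinates of the points lying in a slice of the
    given thickness (metres), perpendicular to x and centred on `station`."""
    half = thickness / 2.0
    return [(y, z) for x, y, z in points if abs(x - station) <= half]

def profile_extent(profile):
    """Overall width (along y) and height (along z) covered by a profile."""
    ys = [p[0] for p in profile]
    zs = [p[1] for p in profile]
    return max(ys) - min(ys), max(zs) - min(zs)
```

Comparing the extent of each profile with the dimensions of the convoy gives a first indication of whether it can pass, subject to the accuracy of the point cloud.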

Fig. 3: Static image acquisition
To keep the analysis objective, the different point clouds were not cleaned. They were only resampled to one point every 2 cm to reduce the computation load. The resulting photogrammetric point cloud was composed of 621,000 points. The CloudCompare M3C2 plug-in (Multiscale Model to Model Cloud Comparison) (Lague et al., 2013) was used to compare the point clouds. This algorithm determines the distance between two point clouds via the normals, without altering the data. It calculates a local distance between the point clouds along the normal direction and determines, for each distance, a confidence interval from the thickness of the point clouds and the accuracy of the consolidations.
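The core idea of a distance measured along the normal can be illustrated with a strongly simplified sketch. It is not the CloudCompare implementation: the normal is supplied by the caller instead of being estimated at a given scale, the search cylinder has no depth limit, and no confidence interval is computed. All names and the 10 cm radius are assumptions.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean_offset_along_normal(cloud, core, unit_normal, radius):
    """Mean signed offset, along `unit_normal`, of the points of `cloud` that
    fall inside a cylinder of the given radius whose axis passes through `core`."""
    offsets = []
    for p in cloud:
        v = tuple(pi - ci for pi, ci in zip(p, core))
        along = _dot(v, unit_normal)
        radial_sq = _dot(v, v) - along * along
        if radial_sq <= radius * radius:
            offsets.append(along)
    if not offsets:
        return None
    return sum(offsets) / len(offsets)

def normal_distance(reference, compared, core, normal, radius=0.10):
    """Signed distance from `reference` to `compared` at `core`, measured along
    `normal`. Returns None when either cloud has no point in the cylinder."""
    length = math.sqrt(_dot(normal, normal))
    unit = tuple(c / length for c in normal)
    a = mean_offset_along_normal(reference, core, unit, radius)
    b = mean_offset_along_normal(compared, core, unit, radius)
    if a is None or b is None:
        return None
    return b - a
```

Averaging the points inside the cylinder is what makes this kind of distance less sensitive to local roughness and point spacing than a plain nearest-neighbour distance.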

Fig. 6: Trajectories of cameras in a traffic circle
Image sequences were conventionally processed with the Photoscan software:
- Relative orientation,
- Manual addition of tie points,
- Re-iteration on non-referenced images,
- Deletion of uncertain tie points on image pairs,
- Auto-calibration of cameras,
- Generation of the dense point cloud.
The results are shown in Figure 7.

Fig. 7: Point cloud from processing of the images in the case of a traffic circle (Test #1)
Significant noise is present in the resulting point cloud. This noise is caused by movements (including people) during the image capture, but this corresponds to the conditions of normal future use. The resulting point cloud includes 5.5 million points. In this context, it is recommended to add coded targets along the outer contour or inside the traffic circle to facilitate the automatic orientation of the images. Some areas remain poorly reconstructed, including the sidewalk borders, which generate shadows and make the recognition of homologous points more difficult. It also appears essential to remove the occlusions generated by moving elements, including pedestrians. Overall, this acquisition remains fast and the processing is relatively simple to execute because it follows a predefined user protocol. The use of high-resolution (12 Mpx) images significantly reduces the number of images to be processed.

Fig. 8: Point cloud from processing of the images in the case of a traffic circle (Test #2)
The results are similar to those of Test #1, with a lower point density in the final point cloud. It is also recommended to place coded targets on the main outlines of the traffic circle to facilitate the links between the image sequences. Other vehicles in the traffic circle have a negative effect, so periods without activity should be preferred. Cast shadows add noise and erroneous information to the point cloud.

Fig. 11: Reference points for scene georeferencing
For the generation of dense point clouds, several software packages were compared: MicMac (IGN), VisualSFM (OS) and ContextCapture (Bentley). The different processing runs showed problems in handling connections between the different passes (go / return). Tests were also conducted with a resampling of one picture out of twenty. In another test, only 6 GCP were used to reference the image sequences. The processing is relatively time consuming but obviously depends on IT resources. From the problems encountered, the following recommendations can be made:
- It seems important to limit the number of cameras and the number of images to reduce the processing time,
- It is necessary to ensure a strong overlap between images to produce a complete dense point cloud,
- Images should be captured at different times to avoid repeating unwanted mobile features in the study area.

2.4.5 Dense point clouds analysis: A georeferenced TLS point cloud was used as a reference. The consolidation precision of the reference point cloud is about 5 mm. The dispersion could be assessed on flat facades observed from at least 3 TLS stations; the RMS obtained is 5 mm. This point cloud is used as a reference for comparison with the dense point cloud obtained by processing the image sequences in Photoscan. The comparison was performed in CloudCompare with the M3C2 plug-in, which enables cloud-to-cloud comparisons as described above. The results of the comparison are shown in Figure 12.