
Telecentric camera calibration with virtual patterns


Published 23 August 2021 © 2021 IOP Publishing Ltd
Citation: Chao Chen et al 2021 Meas. Sci. Technol. 32 125004. DOI: 10.1088/1361-6501/ac1bec


Abstract

We propose an easy-to-implement and automatic telecentric camera calibration method. The proposed method adopts a virtual planar target, which is actively transformed and displayed on a fixed liquid crystal display (LCD) screen, for calibrating telecentric cameras. Compared with regular methods that require human interaction to move physical planar targets, the proposed method offers the merits of simple and automatic calibration, since it eliminates the need to manufacture and manually move a physical calibration target. For validation, a single-camera telecentric pseudo-stereo vision system was calibrated using the proposed method. The calibrated telecentric stereo vision system was further used for three-dimensional reconstruction of both planar and complex surfaces. Experimental results demonstrate that the proposed calibration method can be used for accurate telecentric stereo vision measurement.


1. Introduction

Telecentric lenses have been highly advocated and increasingly used for high-precision three-dimensional (3D) reconstruction of small objects [1-3] due to their advantages of nearly zero distortion and constant magnification within the working distance range. Different from the perspective projection model of pinhole cameras, a telecentric camera (a camera attached with a telecentric lens) follows the orthographic projection model. As such, most existing calibration methods for pinhole cameras are not applicable to telecentric cameras. Therefore, accurate calibration of a telecentric camera is generally considered a challenging task.

In the literature, several calibration methods have been developed to retrieve the intrinsic and extrinsic parameters of telecentric cameras. These methods can be basically grouped into three types [4] according to the calibration target used: self-calibration, planar-target-based calibration, and 3D-target-based calibration.

  • (a)  
    Self-calibration. Methods of this type [5-7] do not need any calibration target. A telecentric camera is moved in a static scene to capture a series of images from various positions. Then, feature correspondences in the captured image sequence are established and used for reconstructing the 3D data of the scene, as well as the camera motion in affine space. Although flexible, methods of this type are hard to apply in practice because they require a mass of data, impose a heavy computational burden, and are highly sensitive to noise.
  • (b)  
    Planar-target-based calibration. In contrast to self-calibration methods, the position of the telecentric camera is fixed in this type of method. The fixed camera captures a planar calibration target in different poses and orientations to estimate the extrinsic and intrinsic parameters. For example, Li et al [8] used a planar target with a circle pattern to calibrate a single telecentric camera. However, the method did not consider the uncertainty in the signs of the third column of the rotation matrix when recovering the extrinsic parameters of the camera. To address this problem of sign ambiguity, a planar target was first placed on an accurate translation stage. Then, the target was moved a known distance along its normal vector. The images before and after the movement were captured and subsequently used for unambiguously determining the rotation matrix [9-14]. However, these methods are inflexible because of the extra translation stage. Rather than using a translation stage, Rao et al [15] applied the perspective projection model to express the imaging process of the telecentric camera, based on the assumption that the rays passing through the telecentric lens are not strictly parallel to each other. However, this method is not suitable for lenses with good telecentricity, which strictly follow the parallel projection model.
  • (c)  
    3D-target-based calibration. This type of method uses a 3D calibration target with a known geometry, which is obtained either from fabrication [16] or from 3D reconstruction with the help of a perspective projector [17]. Images of the 3D calibration target in different poses and orientations are first acquired. Then a linear least-squares algorithm is used to calibrate the camera parameters. Though the determination of the projection parameters is very simple, this method needs high-accuracy 3D targets fabricated in different sizes for telecentric cameras with different fields of view (FOVs).

In short, the aforementioned methods, which calibrate telecentric cameras by manually operating a 2D or 3D physical target, are both cumbersome and inflexible. Manually placing and moving a physical calibration target within the small FOV of a telecentric camera is tedious, and the requirement of an extra translation stage makes the calibration procedure even less flexible.

Recently, an automatic and flexible camera calibration method using active displays of a virtual pattern was proposed [18, 19]. It has been successfully applied for calibrating pinhole cameras because it requires no manual operations. However, existing camera calibration methods using virtual patterns are only applicable to pinhole cameras following the perspective projection model, not to telecentric cameras. Inspired by the idea of using virtual patterns, we propose an easy-to-implement and automatic virtual-pattern-based telecentric camera calibration method. The method uses a virtual planar target, rather than a 2D or 3D physical target, displayed on a liquid crystal display (LCD) screen for telecentric camera calibration. During the calibration, the camera and the LCD screen are fixed without any manual operations, which greatly simplifies the calibration process. Also, a virtual calibration target is automatically generated using pre-defined parameters, a rotation matrix and a translation vector, to solve the problem of sign ambiguity without using a translation stage, thus increasing the flexibility of the proposed method. In the following, the imaging model of telecentric cameras is first described. Based on this model, methods for calibrating a single telecentric camera and a telecentric stereo vision system are introduced. The effectiveness and accuracy of the proposed telecentric camera calibration method were validated by calibrating a single-camera telecentric pseudo-stereo vision system and applying it for 3D reconstruction of both planar and complex surfaces.

2. Single telecentric camera calibration using a virtual target

2.1. Telecentric camera imaging model

A telecentric camera is generally described by an orthographic projection model. Based on the model, an arbitrary physical point (xw, yw, zw ) on the 3D object and its projection (u, v) on the imaging plane can be formulated as [20]:

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} \alpha & 0 & u_0 \\ 0 & \beta & v_0 \\ 0 & 0 & 1 \end{bmatrix}}_{A}\underbrace{\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ 0 & 0 & 0 & 1 \end{bmatrix}}_{C}\begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} \quad (1)$$

where R3×3 and T3×1 = (t1, t2, t3)T represent the 3 × 3 rotation matrix and the 3 × 1 translation vector, respectively; α and β are the magnification ratios along the U and V axes of the imaging plane; and (u0, v0) is the coordinate of the center of the imaging plane.
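As an illustration of the orthographic model above, the following sketch (assuming numpy; the function name `telecentric_project` is ours, not from the paper) projects a world point through equation (1) and shows that the resulting pixel coordinates are independent of depth:

```python
import numpy as np

def telecentric_project(Pw, R, T, alpha, beta, u0, v0):
    """Orthographic telecentric projection of equation (1): only the first
    two rows of [R | T] influence the pixel, so depth along the optical
    axis drops out of the image coordinates."""
    xw, yw, zw = Pw
    u = alpha * (R[0, 0]*xw + R[0, 1]*yw + R[0, 2]*zw + T[0]) + u0
    v = beta  * (R[1, 0]*xw + R[1, 1]*yw + R[1, 2]*zw + T[1]) + v0
    return u, v

# With identity rotation, the magnifications simply scale x and y, and the
# result is independent of zw (the hallmark of telecentric imaging):
u, v = telecentric_project((1.0, 2.0, 5.0), np.eye(3), np.zeros(3),
                           100.0, 100.0, 812.0, 618.0)
# u = 912.0, v = 818.0 for any value of zw
```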

We denote the element in the ith row and jth column of matrix R by rij. Thus, we have

$$\begin{aligned} u &= \alpha\,(r_{11}x_w + r_{12}y_w + r_{13}z_w + t_1) + u_0 \\ v &= \beta\,(r_{21}x_w + r_{22}y_w + r_{23}z_w + t_2) + v_0 \end{aligned} \quad (2)$$

When a planar calibration target is used, the zw coordinate of the world coordinate system is generally set to zero. Consequently, equation (2) can be written as

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} \alpha r_{11} & \alpha r_{12} & \alpha t_1 + u_0 \\ \beta r_{21} & \beta r_{22} & \beta t_2 + v_0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x_w \\ y_w \\ 1 \end{bmatrix} = H \begin{bmatrix} x_w \\ y_w \\ 1 \end{bmatrix} \quad (3)$$

where H is a 3 × 3 homography matrix, which can be decomposed to calculate all the parameters of the telecentric camera.

2.2. Estimation of matrices A and C

Figure 1 shows the schematic diagram of the proposed telecentric camera calibration method. Specifically, a virtual target is first produced by preset parameters and orthogonally projected on an LCD screen. Then, the patterns displayed on the screen are acquired by a telecentric camera. Lastly, the intrinsic and extrinsic parameters of the camera are calculated according to the corresponding relationships between virtual world points and their projection points on the camera imaging plane.

Figure 1. Schematic diagram of the proposed telecentric camera calibration method.

We assume that Pv on the virtual target is an arbitrary 3D feature point with coordinates (xv, yv , 0) in the virtual world coordinate system {Ov; Xv, Yv, Zv}. The coordinates of its projection in the screen coordinate system {Os; Xs, Ys } and in the camera pixel coordinate system {O; U, V} are Ps (xs, ys , 0) and P(u, v), respectively. The imaging process from the virtual world point Pv to image point P can be described as

$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = H \begin{bmatrix} x_v \\ y_v \\ 1 \end{bmatrix} = H_c H_v \begin{bmatrix} x_v \\ y_v \\ 1 \end{bmatrix} \quad (4)$$

where the 3 × 3 homography matrix Hv denotes the preset orthogonal projection from the virtual calibration target to the screen. The 3 × 3 homography matrix Hc denotes the orthogonal projection from the screen to the camera imaging plane, which comprises the camera parameters to be solved. The 3 × 3 homography matrix H denotes the orthogonal projection from the virtual world coordinate system to the camera pixel coordinate system, which can be directly calculated from a set of corner points on the virtual target and their 2D projections on the camera imaging plane.
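Since the last row of H is fixed to [0, 0, 1] under orthographic projection, H can be estimated from corner correspondences by plain linear least squares. A minimal sketch (a hypothetical helper, assuming numpy; not the authors' exact implementation):

```python
import numpy as np

def estimate_affine_homography(world_xy, pixels):
    """Estimate the 3 x 3 homography H (last row fixed to [0, 0, 1]) from
    matched target corners (x_v, y_v) and image points (u, v) by linear
    least squares; each image coordinate gives one linear equation in the
    three unknowns of the corresponding row of H."""
    world_xy = np.asarray(world_xy, dtype=float)
    pixels = np.asarray(pixels, dtype=float)
    A = np.hstack([world_xy, np.ones((len(world_xy), 1))])  # n x 3 design matrix
    h1, *_ = np.linalg.lstsq(A, pixels[:, 0], rcond=None)   # u = h11 x + h12 y + h13
    h2, *_ = np.linalg.lstsq(A, pixels[:, 1], rcond=None)   # v = h21 x + h22 y + h23
    return np.vstack([h1, h2, [0.0, 0.0, 1.0]])
```

At least three non-collinear corners are required; in practice all detected chessboard corners are stacked for robustness against noise.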

We denote the element in the ith row and jth column of matrix H by $h_{ij}$, of matrix Hc by $h_{ij}^c$, and of matrix Hv by $h_{ij}^v$. Then equation (4) can be rewritten as

$$\begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} h_{11}^c & h_{12}^c & h_{13}^c \\ h_{21}^c & h_{22}^c & h_{23}^c \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} h_{11}^v & h_{12}^v & h_{13}^v \\ h_{21}^v & h_{22}^v & h_{23}^v \\ 0 & 0 & 1 \end{bmatrix} \quad (5)$$

From equation (5), we have

$$\begin{aligned} h_{11}^c &= K\,(h_{11}h_{22}^v - h_{12}h_{21}^v), & h_{12}^c &= K\,(h_{12}h_{11}^v - h_{11}h_{12}^v), \\ h_{21}^c &= K\,(h_{21}h_{22}^v - h_{22}h_{21}^v), & h_{22}^c &= K\,(h_{22}h_{11}^v - h_{21}h_{12}^v) \end{aligned} \quad (6)$$

with $K = \frac{1}{h_{11}^v h_{22}^v - h_{21}^v h_{12}^v}$.
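Equation (6) is simply the matrix product $H H_v^{-1}$ written elementwise for the upper-left 2 × 2 block. The sketch below (assuming numpy; the helper name is ours) checks the closed-form entry against a direct matrix inverse:

```python
import numpy as np

def screen_homography(H, Hv):
    """Recover Hc = H Hv^{-1} (equation (6)). Because Hv is affine with last
    row [0, 0, 1], its inverse exists whenever the preset 2 x 2 block is
    non-singular, and K is the reciprocal of that block's determinant."""
    K = 1.0 / (Hv[0, 0]*Hv[1, 1] - Hv[1, 0]*Hv[0, 1])
    Hc = H @ np.linalg.inv(Hv)
    # The closed-form entry of equation (6) agrees with the matrix inverse:
    assert np.isclose(Hc[0, 0], K*(H[0, 0]*Hv[1, 1] - H[0, 1]*Hv[1, 0]))
    return Hc
```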

According to equation (3), the following equation can be obtained

$$H_c = \begin{bmatrix} \alpha r_{11} & \alpha r_{12} & \alpha t_1 + u_0 \\ \beta r_{21} & \beta r_{22} & \beta t_2 + v_0 \\ 0 & 0 & 1 \end{bmatrix} \quad (7)$$

Based on equation (7), we can get

$$r_{11} = \frac{h_{11}^c}{\alpha},\quad r_{12} = \frac{h_{12}^c}{\alpha},\quad r_{21} = \frac{h_{21}^c}{\beta},\quad r_{22} = \frac{h_{22}^c}{\beta} \quad (8)$$

Since the rotation matrix R is orthonormal, we have the following three independent constraints

$$r_{11}^2 + r_{12}^2 + r_{13}^2 = 1,\qquad r_{21}^2 + r_{22}^2 + r_{23}^2 = 1,\qquad r_{11}r_{21} + r_{12}r_{22} + r_{13}r_{23} = 0 \quad (9)$$

where $\boldsymbol{r}_1 = (r_{11}, r_{12}, r_{13})$ and $\boldsymbol{r}_2 = (r_{21}, r_{22}, r_{23})$ are the first two row vectors of the matrix R. Substituting the first two equations in equation (9) into the last equation in equation (9) and eliminating $r_{13}$ and $r_{23}$, we have

$$\|R_l\|_F^2 = 1 + |R_l|^2 \quad (10)$$

where $\|R_l\|_F$ and $|R_l|$ are the Frobenius norm and determinant of the upper-left 2 × 2 submatrix $R_l = \begin{bmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \end{bmatrix}$ of R, respectively. After rearranging equation (10) and substituting equation (8), we can derive a linear equation:

$$-l_1 + F_1 l_2 + F_2 l_3 = -Q \quad (11)$$

where $l_1 = \alpha^2\beta^2$, $l_2 = \alpha^2$, $l_3 = \beta^2$, $F_1 = (h_{21}^c)^2 + (h_{22}^c)^2$, $F_2 = (h_{11}^c)^2 + (h_{12}^c)^2$, and $Q = -\left(h_{11}^c h_{22}^c - h_{12}^c h_{21}^c\right)^2$.

If n virtual target images from different poses and orientations are captured, the n measured homography matrices H and their preset counterparts Hv yield n corresponding matrices Hc through equation (6). The element in the ith row and jth column of the matrix associated with the nth image is denoted $h_{ij}^{c(n)}$, and the corresponding coefficients $F_1$, $F_2$ and $Q$ are written as $F_1^{(n)}$, $F_2^{(n)}$ and $Q^{(n)}$, respectively. Therefore, equation (11) can be stacked into an n × 3 linear system as follows

$$\underbrace{\begin{bmatrix} -1 & F_1^{(1)} & F_2^{(1)} \\ \vdots & \vdots & \vdots \\ -1 & F_1^{(n)} & F_2^{(n)} \end{bmatrix}}_{E}\begin{bmatrix} l_1 \\ l_2 \\ l_3 \end{bmatrix} = \underbrace{\begin{bmatrix} -Q^{(1)} \\ \vdots \\ -Q^{(n)} \end{bmatrix}}_{\boldsymbol{q}} \quad (12)$$

In equation (12), it should be mentioned that at least three images are required to solve for the unknown vector L = (l1, l2, l3)T. The linear least-squares solution is given by

$$\boldsymbol{L} = \left(E^{\mathrm{T}}E\right)^{-1}E^{\mathrm{T}}\boldsymbol{q} \quad (13)$$

where E and $\boldsymbol{q}$ denote the stacked n × 3 coefficient matrix and the n × 1 right-hand-side vector of equation (12), respectively.

Therefore, the intrinsic parameters can be recovered by

$$\alpha = \sqrt{l_2},\qquad \beta = \sqrt{l_3} \quad (14)$$
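The stack-and-solve procedure of equations (11)-(14) can be sketched as follows (assuming numpy; `recover_magnifications` is a hypothetical name, and the $h_{ij}^c$ values would come from equation (6)):

```python
import numpy as np

def recover_magnifications(Hc_list):
    """Stack one instance of equation (11) per pose, solve the resulting
    over-determined linear system for (l1, l2, l3) by least squares
    (equation (13)), and return alpha, beta via equation (14). At least
    three poses are required for a unique solution."""
    rows, rhs = [], []
    for Hc in Hc_list:
        F1 = Hc[1, 0]**2 + Hc[1, 1]**2
        F2 = Hc[0, 0]**2 + Hc[0, 1]**2
        Q = -(Hc[0, 0]*Hc[1, 1] - Hc[0, 1]*Hc[1, 0])**2
        rows.append([-1.0, F1, F2])      # coefficients of (l1, l2, l3)
        rhs.append(-Q)
    L, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(rhs), rcond=None)
    l1, l2, l3 = L                       # l1 = (alpha*beta)^2 is redundant
    return np.sqrt(l2), np.sqrt(l3)      # alpha, beta
```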

Once α and β are known, according to equation (7), t1 and t2 can be calculated by

$$t_1 = \frac{h_{13}^c - u_0}{\alpha},\qquad t_2 = \frac{h_{23}^c - v_0}{\beta} \quad (15)$$

Besides, the extrinsic parameters $r_{ij}^{n}$ (i = 1, 2, j = 1, 2) for the nth image can also be acquired based on equation (8).

2.3. Recovery of parameters r13 and r23

It is noteworthy that, at each target position, the extrinsic matrix C can be directly obtained except for the elements r13 and r23. According to the first two equations in equation (9), the magnitudes of ${r_{13}}$ and ${r_{23}}$ can be calculated by

$$r_{13} = \pm\sqrt{1 - r_{11}^2 - r_{12}^2} \quad (16)$$

$$r_{23} = \pm\sqrt{1 - r_{21}^2 - r_{22}^2} \quad (17)$$

Unfortunately, the signs of ${r_{13}}$ and ${r_{23}}$ cannot be directly determined from equations (16) and (17), since the orthogonality of R offers only one additional constraint. To solve this problem, conventional methods provide a depth value for the 2D physical calibration target by means of a translation stage, which makes the calibration procedure inflexible. We instead use pre-defined parameters to locate a virtual calibration target rather than using a translation stage to move a physical one. The original pattern, defined in the screen coordinate system and displayed on the LCD screen, is converted into the local virtual world coordinate system; the converted pattern is then used for identifying the signs of ${r_{13}}$ and ${r_{23}}$. To simplify the notation, the coordinates of a feature point on the screen are still denoted (xs, ys); the corresponding point (xw, yw, zw) in the virtual world coordinate system can then be expressed by

$$\begin{bmatrix} x_w \\ y_w \\ z_w \end{bmatrix} = R_v \begin{bmatrix} x_s \\ y_s \\ 0 \end{bmatrix} + T_v \quad (18)$$

where Rv and Tv are the pre-defined 3 × 3 rotation matrix and 3 × 1 translation vector, respectively, and the pixel coordinates of this point on the camera imaging plane are (ut, vt). Then, by combining equations (2) and (18), r13 and r23 meet the following criteria

$$u_t = \alpha\,(r_{11}x_w + r_{12}y_w + r_{13}z_w + t_1) + u_0,\qquad v_t = \beta\,(r_{21}x_w + r_{22}y_w + r_{23}z_w + t_2) + v_0 \quad (19)$$

Since zw is non-zero for the transformed pattern, the signs of r13 and r23 can be determined as the combination that satisfies equation (19).
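The sign-disambiguation step can be sketched as an exhaustive test of the four sign combinations against the observed projection of an out-of-plane virtual point (assuming numpy; the helper name and argument layout are ours):

```python
import numpy as np

def resolve_r3_signs(Rl, alpha, beta, t, center, Pw, observed_uv):
    """Complete r13 and r23 (equations (16)-(17)) and select the sign
    combination whose predicted pixel, via equation (19), best matches the
    observed projection of a virtual point with non-zero zw."""
    r13_mag = np.sqrt(max(0.0, 1.0 - Rl[0, 0]**2 - Rl[0, 1]**2))
    r23_mag = np.sqrt(max(0.0, 1.0 - Rl[1, 0]**2 - Rl[1, 1]**2))
    best, best_err = (r13_mag, r23_mag), np.inf
    for s1 in (1.0, -1.0):
        for s2 in (1.0, -1.0):
            r13, r23 = s1*r13_mag, s2*r23_mag
            u = alpha*(Rl[0, 0]*Pw[0] + Rl[0, 1]*Pw[1] + r13*Pw[2] + t[0]) + center[0]
            v = beta*(Rl[1, 0]*Pw[0] + Rl[1, 1]*Pw[1] + r23*Pw[2] + t[1]) + center[1]
            err = np.hypot(u - observed_uv[0], v - observed_uv[1])
            if err < best_err:
                best, best_err = (r13, r23), err
    return best
```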

3. Calibration of telecentric stereo vision system using a virtual target

Based on the method described in section 2, a telecentric stereo vision system consisting of two cameras can be calibrated. Figure 2 illustrates the coordinate systems of the system. Olc and Orc are respectively the origins of the left and right virtual camera coordinate systems, which are located at infinity. Consequently, the principal rays PwOlc and PwOrc are parallel to the optical axes OlcZl and OrcZr, respectively. Suppose that the two principal rays emitted by the point Pw (xw, yw, zw) in the world coordinate system intersect the left and right camera imaging planes at the points Pl (ul, vl) and Pr (ur, vr), respectively, which are called the corresponding points. According to equation (2), the imaging process of the system can be expressed as

$$\begin{bmatrix} u_l - m_{14}^l \\ v_l - m_{24}^l \\ u_r - m_{14}^r \\ v_r - m_{24}^r \end{bmatrix} = \begin{bmatrix} m_{11}^l & m_{12}^l & m_{13}^l \\ m_{21}^l & m_{22}^l & m_{23}^l \\ m_{11}^r & m_{12}^r & m_{13}^r \\ m_{21}^r & m_{22}^r & m_{23}^r \end{bmatrix}\begin{bmatrix} x_w \\ y_w \\ z_w \end{bmatrix} = G \begin{bmatrix} x_w \\ y_w \\ z_w \end{bmatrix} \quad (20)$$

Figure 2. Coordinate systems of a telecentric stereo vision system.

where $m_{ij}^l$ and $m_{ij}^r$ are the model parameters of the left and right cameras, respectively.

Once these parameters are obtained through the proposed single telecentric camera calibration method, the coordinates of the point Pw can be calculated by carrying out a reverse operation on equation (20):

$$\begin{bmatrix} x_w \\ y_w \\ z_w \end{bmatrix} = G^{-1}\begin{bmatrix} u_l - m_{14}^l \\ v_l - m_{24}^l \\ u_r - m_{14}^r \\ v_r - m_{24}^r \end{bmatrix} \quad (21)$$

Here, $G^{-1} = (G^{\mathrm{T}}G)^{-1}G^{\mathrm{T}}$ denotes the least-squares pseudo-inverse of the 4 × 3 matrix G, since equation (20) provides four equations for three unknowns. It is worth noting that a unique world coordinate system for the two cameras needs to be established for 3D reconstruction. In this work, the world coordinate system was defined on the LCD screen, with its X and Y axes on the screen plane and its Z axis perpendicular to the plane and pointing towards the system.
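A minimal triangulation sketch following equations (20) and (21) (assuming numpy; the 2 × 4 parameter-block layout of `Ml` and `Mr` is our own convention, not the paper's):

```python
import numpy as np

def triangulate(Ml, Mr, pl, pr):
    """Invert equation (20): stack the four affine imaging equations of the
    left and right virtual cameras and solve the over-determined 4 x 3
    system G Pw = b for Pw in the least-squares sense (the pseudo-inverse
    of G). Ml and Mr hold the parameters m_i1..m_i4 as 2 x 4 blocks."""
    G = np.vstack([Ml[:, :3], Mr[:, :3]])                  # 4 x 3
    b = np.array([pl[0] - Ml[0, 3], pl[1] - Ml[1, 3],
                  pr[0] - Mr[0, 3], pr[1] - Mr[1, 3]])
    Pw, *_ = np.linalg.lstsq(G, b, rcond=None)
    return Pw
```

Note that G must have rank 3, which requires the two virtual viewing directions to differ; two identical orthographic views carry no depth information.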

4. Experiments and results

4.1. Experimental system

To reconstruct the 3D shape of an object, a single-camera telecentric pseudo-stereo vision system is established, as shown in figure 3(a). The proposed system is mainly composed of a telecentric lens, a digital camera, and a four-mirror adapter. The four planar mirrors (denoted as M1, M2, M3, and M4) are fixed in front of the telecentric camera. Note that the inner two mirrors M3 and M4 are perpendicularly fixed on the two sides of a triangular prism-shaped block and constitute a right angle, while the outer two mirrors M1 and M2 are mounted on a rotation stage. By properly adjusting the position of the tested object, its surface can be clearly imaged by the camera through both the left and right optical paths in a single shot. As shown in figure 3(a), aided by the reflection of the mirror array, this system is equivalent to a pseudo-stereo vision system comprising two virtual telecentric cameras. The image recorded by each virtual telecentric camera corresponds to one half of the image captured by the real telecentric camera. Since the two virtual cameras are derived from the same physical camera, the synchronization problem is avoided in the system.

Figure 3. (a) Schematic illustration of the experimental system and (b) photograph of the experimental system.

Figure 3(b) shows a photograph of the established experimental system. The system consists of a four-mirror adapter, a digital camera (TXG20, Baumer Electric AG, Switzerland) with a resolution of 1624 × 1236 pixels and a pixel size of 0.0044 mm, and a bilateral telecentric lens (Xenoplan 1:5, Schneider Optics, Inc., Germany) with a fixed working distance of 268 mm and an FOV of 35.7 × 27.2 mm. Because the used telecentric lens has a distortion ratio of less than 0.1%, lens distortion is neglected for simplicity. An iPad mini 5 with a physical resolution of 2048 × 1536 pixels and a pixel size of 0.078 mm is used as the display screen for system calibration. Before the experiments, the screen was placed in front of the established system to guarantee that the whole camera view is covered by the screen. By carefully adjusting the position of the telecentric camera and rotating the mirrors, patterns displayed on the screen can be simultaneously and clearly imaged on the left and right halves of the camera imaging plane.

4.2. Calibration results

To calibrate the developed system, a chessboard pattern with 5 × 8 corner points was first generated by computer software, as shown in figure 3. The distance between adjacent corner points is 1.56 mm. Then, by applying ten groups of preset parameters to the generated original pattern, ten virtual calibration targets with different positions and orientations in 3D space were created. Next, these virtual targets were orthogonally projected and displayed on the screen, as shown in figure 4. Afterwards, each image displayed on the screen was acquired for camera calibration. The feature points in all the captured images were extracted using the method in [21] for calculating the matrices A and C. Finally, the parameters r13 and r23 were recovered with the assistance of another virtual calibration target. The matrices A and C of the two virtual telecentric cameras expressed in equation (2) were calibrated as follows:

Figure 4. Images displayed on the screen used for calibration.

Furthermore, the matrix G can be estimated as

Based on the obtained camera parameters, the re-projection errors of the two virtual telecentric cameras were respectively calculated and are shown in figure 5. Their standard deviations are (0.0700, 0.0778) pixels for the left virtual camera and (0.0751, 0.0752) pixels for the right virtual camera, confirming the accuracy of the proposed calibration method.

Figure 5. Re-projection errors of the left and right virtual cameras.

4.3. 3D reconstruction with the calibrated system

After calibrating the single-camera telecentric pseudo-stereo vision system, it can be used for 3D reconstruction of a test object surface. To quantitatively evaluate the performance of the calibrated system, two tests were performed: 3D shape measurement of the corner points on the LCD calibration pattern and of a flat plate. After that, the proposed system was also applied to measure a key with complex geometry to validate its practicability in reconstructing complicated shapes on small objects.

In the first test, the original pattern was displayed on the screen and captured. Then, the pixel coordinates of all the corner points, where the black squares intersect, were extracted from the captured image, as shown by the red and green dots in figure 6(a). Finally, their 3D coordinates were reconstructed using the obtained system parameters, as shown in figure 6(b). The Euclidean distance between each pair of adjacent corner points in 3D space can then be calculated; there are 35 and 32 pairs of adjacent corner points in the X and Y directions, respectively. The measured distances among the reconstructed 3D coordinates have a mean value of 1.555 mm, which is close to the true value (1.56 mm) of the chessboard pattern generated by computer software.
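The adjacent-corner-distance metric used above can be sketched as follows (assuming numpy; the grid-ordering convention is ours):

```python
import numpy as np

def mean_adjacent_distance(points3d, rows, cols):
    """Mean Euclidean distance between horizontally and vertically adjacent
    reconstructed corners of a rows x cols grid; for the 5 x 8 pattern this
    covers the 35 + 32 adjacent pairs counted in the text, and the result
    should approach the 1.56 mm corner pitch."""
    P = np.asarray(points3d, dtype=float).reshape(rows, cols, 3)
    dx = np.linalg.norm(P[:, 1:] - P[:, :-1], axis=2)   # rows*(cols-1) pairs
    dy = np.linalg.norm(P[1:, :] - P[:-1, :], axis=2)   # (rows-1)*cols pairs
    return (dx.sum() + dy.sum()) / (dx.size + dy.size)
```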

Figure 6. (a) Captured original pattern and (b) 3D distribution of the reconstructed corner points.

In the second test, we reconstructed the 3D shape of a flat plate using both the proposed method and the conventional method that requires human interaction to move a physical target. Before the experiment, black and white random speckles were sprayed on the plate surface to establish the correspondence between the left and right virtual cameras. In the experiment, the plate was placed at an appropriate position in front of the calibrated system, so that two views of the plate surface could be imaged on the camera imaging plane, as shown in figure 7(a). By correlating the left and right virtual camera images within a rectangular region of interest (ROI) using the advanced inverse-compositional Gauss-Newton (IC-GN) algorithm [22] with a subset size of 29 × 29 pixels, the 3D data of the ROI were calculated based on the calibrated extrinsic and intrinsic parameters. The reconstructed surface was then compared with an ideal plane obtained through least-squares fitting. Figures 7(b) and (c) show the 3D geometry of the plate and the color-coded error distribution map obtained with the proposed method, respectively. The corresponding results using the conventional method are shown in figures 7(d) and (e). The proposed method yields a slightly smaller error than the conventional method: 1.6 μm compared with 1.8 μm. In the root mean square (RMS) error maps, the errors mainly appear at the boundary regions, which may be caused by residual lens distortion. In future work, a distortion model with radial and tangential coefficients could be adopted to further increase the measurement accuracy. The results from the two tests confirm that the established system and the proposed calibration method offer high measurement accuracy.
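The plane-fitting evaluation can be sketched as follows (assuming numpy; we fit z = ax + by + c, which is adequate only for a roughly front-parallel plane; the authors' exact fitting procedure is not specified, and a total least-squares SVD fit would be the general alternative):

```python
import numpy as np

def plane_fit_rms(points3d):
    """Fit z = a*x + b*y + c to the reconstructed points by least squares
    and return the root mean square of the out-of-plane residuals."""
    P = np.asarray(points3d, dtype=float)
    A = np.column_stack([P[:, 0], P[:, 1], np.ones(len(P))])
    coef, *_ = np.linalg.lstsq(A, P[:, 2], rcond=None)
    residuals = P[:, 2] - A @ coef
    return float(np.sqrt(np.mean(residuals**2)))
```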

Figure 7. Experimental results of measuring the flat plate. (a) Speckle image captured by the system; (b) and (c) results with the proposed method: (b) reconstructed 3D shape and (c) 3D error map; (d) and (e) results with the conventional method: (d) reconstructed 3D shape and (e) 3D error map.

To visually evaluate the performance of the calibrated system in reconstructing complex shapes on small objects, a region on the surface of a key was measured by the calibrated system. Prior to the experiment, high-contrast random speckles were sprayed on the region for finding the corresponding points between the left and right sub-images. During the measurement, the key was placed in front of the system and clearly imaged on a single camera imaging plane via the left and right optical reflection paths. As shown in figure 8(a), a circular ROI was first specified in the left sub-image, and the measured points within the ROI were searched in the right sub-image to determine their disparity data using the advanced IC-GN algorithm with a subset size of 29 × 29 pixels. Then, according to the known intrinsic and extrinsic parameters, the 3D shape of the specified ROI was reconstructed, as shown in figure 8(b). It is clear that the 3D details of the specified ROI have been reconstructed. The results demonstrate that the proposed system is capable of reconstructing small objects with complex shapes.

Figure 8. Experimental results of measuring a complex surface shape. (a) Speckle image captured by the system; (b) reconstructed 3D shape.

5. Concluding remarks and future work

We propose an easy-to-operate and automatic method for calibrating telecentric cameras. The method uses a virtual planar target displayed on an LCD screen, rather than a 2D or 3D physical target, to calibrate the system. During the whole calibration procedure, the camera and the LCD screen are fixed without requiring human interaction to move a physical target, which makes the method easy to operate. In addition, a pre-defined transform is performed on the virtual calibration target to solve the problem of sign ambiguity in telecentric camera calibration without using a translation stage, further guaranteeing the flexibility of the method. 3D reconstruction tests are presented to show the metrological performance of the proposed method. By combining the proposed calibration method with digital image correlation (DIC) [23], an easy-to-implement and accurate single-camera telecentric stereo-DIC technique, which can measure the kinematic fields of small-scale specimens, can be developed. This single-camera telecentric stereo-DIC technique is expected to offer higher accuracy than the previously reported single-camera microscopic stereo-DIC technique [24]. Ongoing work on this subject will be reported elsewhere.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 11925202, 11872009, 11632010) and China Postdoctoral Science Foundation (Grant Nos. 2019M660013, 2021T140043).

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary files).
