Vision-based adaptive stereo measurement of pins on multi-type electrical connectors

Delong Zhao; Feifei Kong; Fuzhou Du

doi:10.1088/1361-6501/ab198f

1. Introduction

As a fundamental component of most electrical, electronic and mechanical products, connectors build bridges for communication between modules, which ensures the reliability and effectiveness of products. Application scenarios for connectors include aerospace, vehicle engineering, computer technology, medical engineering, industrial automation and so on. The pins regularly located on the connector are the key to achieving continuity between the electrical circuit and the socket, and the position of the pins affects the overall appearance of the connector, hence the quality of the pins receives significant attention. In the appearance inspection of the pins, it is necessary to observe their distribution state, and judge whether there are missing, skewed, jutting, and shortened pins according to the design basis. This requires not only qualitative analysis, but also the quantitative measurement of the pins.

The development of the manufacturing industry and the promotion of intelligent manufacturing bring about subtle changes in the production mode and a continued increase in quality requirements. Under the mode of multi-category and small-batch production, many types of products are subjected to many quality inspections, whether in the process of mixed-line batch production or post-production secondary sampling. With the development of sensor technology, many advanced solutions for vision-based measurement and defect detection have been implemented, which are usually equipped with customized vision hardware and servo control systems. These solutions tend to focus on solving the problems of a single model, so that any change in the imaging environment or features will invalidate the method, especially for measurement tasks with a high precision requirement. Therefore, adjusting the system to obtain the optimal image of the target and ensuring the stability of the internal variables of the system are crucial steps in traditional machine vision solutions.

Figure 1 shows the vision environment of three different types of electrical connectors when detecting the pins. It can be seen that, in order to display more stable and distinguishable pin features in the images, the imaging conditions required for different electrical connectors are bound to vary. However, it is impossible to configure a dedicated vision system for every type of connector as the production scale expands and the number of models increases, so the detection of multiple models using a single vision system is more in line with actual production requirements.

For electrical connectors, the main problem with inspecting pins of multi-type models in a fixed imaging environment is that there are many uncertain changes in the image features obtained in the non-optimal view. Factors leading to the uncertainty include the appearance of the electrical connector, the type and mounting position of the pins, etc. The key to pin inspection is to describe the pin location with features in the image and to extract these features stably. In the current state-of-the-art vision-based approaches, there are two solutions to locate the pins: one is to use the centroid of the highlighted area produced by the strong reflection in the best view, and the other is to use the center of the rectangle obtained by template matching (TM). These strategies rely heavily on the visual environment and image quality, and are therefore not suitable for multi-model tasks. In addition, due to the structure and material specificity of the pins, the existing sensors based on structured light, infrared and ultrasound are not suitable for their inspection, so the inspection of multi-model electrical connectors is still limited to manual work, which has certain limitations in accuracy, efficiency, objectivity and repeatability.

In this paper, taking the electrical connector as the research object, we establish a vision inspection and measurement scheme for multi-category products, which can be effectively applied to situations such as small-batch production, mixed-line batch production, and offline product sampling, for which traditional vision solutions are not suitable. Using the structural invariance of industrial products, we propose an algorithm that combines deep learning (DL), weight-based set registration with key point constraints and pattern matching to solve the problem of image diversity and tiny feature recognition. We construct a robust feature scheme to represent the position of pins and further propose a hierarchical extraction algorithm. Additionally, inspired by the iterative closest point (ICP) algorithm commonly used in computer-aided manufacturing, we improve the algorithm and employ it in the proposed framework to identify outlier data without a reference and to learn the rules of target feature arrangement. The experimental results show that the proposed method has a high degree of robustness and adaptability, and can be successfully applied to the inspection and measurement of vertical pins on multi-category electrical connectors. In addition, analysis of the 3D reconstruction results verifies that the proposed method has higher stability and accuracy than the centroid-based or TM-based strategies.

The remainder of this paper is organized as follows. The related work is arranged in section 2. Section 3 describes the vertical pin inspection problem of multi-type connectors. Section 4 shows the proposed method. Experimental results and the discussion are provided in section 5. Section 6 draws the conclusion and shows future plans.

2. Related work

As a non-contact inspection method, machine vision technology can replace humans to perform tasks with high efficiency while reducing costs. In the field of industrial automation, vision-based applications are mainly in qualitative analysis [1–4] and quantitative inspection [5–7]. The quantitative task is generally a comprehensive process that involves not only target recognition, but also reasonable feature design methods (such as natural features of the object itself or special features provided by structured light technology [8, 9] or advanced depth sensors) as well as an extraction strategy (generally a robust image processing algorithm which can adaptively capture features of interest such as key points or edges) to further reconstruct the spatial coordinates through triangulation for quality analysis.

Over the years, pin inspection on connectors, and in particular the measurement of pin position, has involved both contact and non-contact approaches. The state-of-the-art contact method takes advantage of a mechanical template to verify the correctness of the spatial position by touching the pins on the fixed connector. In addition, an array of micro switches is embedded into the template so that the pin height information can be captured by checking the state of the switches. Intuitively, this approach has the potential to expand to pin inspection of multi-type connectors because of its insensitivity to the shape of the connectors and the machining quality of the pins. However, it causes cumulative damage to the pins, and close attention must be paid throughout the rigorous operation.

On the other hand, non-contact measurement technologies in the industrial field are drawing increasing attention. These technologies mainly include laser triangulation, phase measurement, fringe projection, depth from focus/defocus and stereo vision methods. Among them, laser triangulation is the most widespread technology for stereo measurement [10]. As an active vision method using structured light, this approach is suitable for measuring products with inconspicuous features or complex shapes [11, 12], owing to its reliability, efficiency, accuracy and low cost. However, laser triangulation is limited when measuring reflective and shiny targets [13] such as vertical pins. The reason is that the shape of the pin tip generates a strong reflection, which greatly reduces the visibility of the target and the accuracy of reconstruction. Moreover, it is difficult to select a proper position for the laser projector to scan the plane of the connector where the pins are located while maintaining robustness to the noise introduced by reflections from the pin bodies. As active vision technologies, phase measurement [14], time-of-flight [15], fringe projection [16] and some complex combinations [17] require a large number of hardware devices as well as demanding installation. Like many traditional visual measurement solutions [18–20], especially those for industrial manufacturing, the set of customized precision equipment provides a rigorous and simple environment, thereby simplifying the algorithms while reducing the adaptability and flexibility of the method. In addition, similar to the problems faced by laser triangulation, the reconstruction of small features such as the pin tip is also hampered by the resolution and the lack of texture. Therefore, to the best of our knowledge, such vertical pin detection on electrical connectors can only rely on the small features formed by the pin tip itself in the image.

Compared with the abovementioned methods, stereo vision-based measurement has made effective progress in addressing problems in the manufacturing industry, and has already found many applications such as printed circuit boards (PCBs) [21, 22], chips [23], ball grid arrays (BGAs) [18, 24], and many mechanical parts [2, 25]. With the advantages of image processing and photographic geometry, the stereo vision approach enables measurement to scale to some previously intractable tasks. Most applications involve the same basic procedure, i.e. image pre-processing, feature definition, feature identification, reconstruction and analysis. The prime concern in this work is how to define appropriate features for representing the position of the pins and accurately match the corresponding pixels in each given image. After this, the triangulation method can be employed to calculate the location of the target in the camera coordinate system. Unlike horizontally arranged and exposed pins [20, 26, 27], the pins on the connectors are densely distributed and vertically arranged, which means that the cameras can only be placed above to provide a top view. Furthermore, from the perspective of image processing, the features appear textureless and edgeless due to the smooth surface of the pin tip, which makes them difficult to construct and extract. Therefore, features such as edges, corners, colors, and regions that are often used for matching [28, 29] cannot be directly applied. A compromise that sacrifices accuracy is to extract the centroid of the highlight area [9, 30] produced by the pin tip reflection under appropriate illumination. Another approach [19] is to obtain a rough position through the TM algorithm to determine whether the pin is qualified. Lei [31] extracted the center of the pin highlight area for a specific type of automotive electrical connector. After aligning the left and right views by rotation, the pin number was determined according to a manually designed confirmation rule and the stereo coordinates of the pin tips were then derived. Similarly, Li [18] proposed a stereo vision-based measurement scheme for BGA flatness inspection. They classified the targets by the imaging features of the highlighted areas of the solder balls and organized the strategies for representing the pin tip position through a connected component analysis. However, regardless of the BGA model, the shape and image of the solder ball are highly similar. For this reason, an inspection task with such low image diversity essentially belongs to a single-category problem. Although this method cannot solve the problem of pin detection, its divide-and-conquer idea gives important inspiration to this research.

Some research focusing on electronic components prefers to employ high-precision but complex equipment to provide a stable scene and a high-quality image, so that the difficulty of the algorithm can be reduced and the accuracy improved. Deokhwa Hong et al [26] designed vision equipment with two telecentric lenses and performed the 3D reconstruction of the components on the PCB using stereo vision and active projection. Building on a solution to the phase ambiguity problem caused by the projection system, the vision module uses the projection features to calculate the height of the target. Yann Armand and Hideo Saito [27] proposed a method of changing the position of the light source in a circular manner to inspect the skew angle of horizontally arranged and exposed pins. As the light source gradually moves under the control of the servo system, the normal pins and skewed pins exhibit a series of different reflection results, by which the authors achieved the identification task. Woerner and Klaus [32] invented a complex set of equipment for the inspection of electronic products in industrial production lines, which facilitated much other research and many applications [33, 34]. Two cameras equipped with telecentric lenses are vertically and symmetrically mounted on the Z-axis of the machine and capture images while the rotation of the Z-axis is controlled. A rotation of 180° is usually chosen to obtain symmetrical views, which effectively expands the field of view of the telecentric lens and allows binocular techniques to be used. Additionally, a compact lighting system consisting of infrared LEDs, filters, and reflectors is also quite important. Overall, the previous research is more demanding on the environment and equipment, lacks flexibility and is inherently limited to single-category problems. If these methods are extended to multi-category inspection tasks, the existing rigorous imaging environment will be disturbed, and the original algorithms cannot overcome the diversity problems.

3. Vertical pin inspection task of multi-type connectors

Generally, there are many necessary inspection items to ensure proper operation of the electrical connectors. In this work, we mainly focus on the position of the vertical pins on multiple models of electrical connectors, identify defects such as skewed pins, shortened pins, jutting pins and missing pins, and extract the numbers of the defective pins, as shown in figure 2(c).

**Figure 2.** Multi-type electrical connector overview. (a) Examples with various pin arrangement rules, (b) shape and size (especially relative to the height of the lateral casing of the connector), (c) items to check.

Figures 2(a) and (b) show that pins in different connectors vary in size, shape and arrangement rules. The shapes of the pin tips include curved surfaces (ball-head), tapered surfaces (cone-head), and flat surfaces (square-head). The difference in pin size refers to the difference in the height of the pins relative to the connector shell. The diversity of arrangement rules means that the pin number cannot be obtained using a fixed strategy or template [31]. As shown in figure 3, we define the clear and unobstructed highlight area provided by the best image as the A-region. Moreover, for the tip with a cylinder or a smaller curvature, the reflectivity is poor, thus forming a more accurate low-gray value area for positioning the pin. We define this area as the B-region. The diversity of the tip features and influencing factors are shown in table 1.

(1) Type: The type of connector determines the array rules for the pins and the pins with different sizes and shapes tend to exhibit different tip features;
(2) Relative height: The height of the pin relative to the electrical connector shell affects the reception of light, which makes the pixel intensity of the interested feature inconsistent;
(3) Machining quality and installing position: Under the non-optimal view, the shape and structure of the tip feature (A-region, B-region) are highly correlated with the machining quality and the mounting position of the pin.
(4) Imaging posture: Similar to (3), in vision-based applications, target pose is also an important factor influencing features.
(5) Viewpoint: Due to the above-mentioned factors, the target captured by the binocular vision system in different perspectives may exhibit different features. For example, as shown in table 1, the B-region is clearly visible in the left view but may be clustered or lost in the right view. At this time, an effective correspondence cannot be established.

Table 1. Examples of the diversity of pin tip features.

Influencing factor	Under the unified vision environment
Type of pin
Relative height
Machining quality, installation position
Pin-imaging posture
Viewpoint (Two cameras)

**Figure 3.** Examples of optimal imaging environments for different types of electrical connectors and the unified vision environment we established.

Therefore, this paper focuses on constructing a highly robust method to analyze the above problems adaptively, estimate the most suitable feature to describe the position of the pin, and establish the correct correspondence between views. Identification of the pin number under various arrangement rules also needs to be performed. Additionally, it is difficult in practice to find a reference that accurately describes the position error in the pin area. A general approach [19] is therefore to compare the current result with the data obtained from previously prepared good products. However, a qualified product may have many states, for which one or two artificially selected examples cannot provide strong support. It is therefore necessary to construct a strategy that can continuously approximate the data range of qualified products based on historical results, and to identify outliers beyond the limit without references.

4. Proposed adaptive inspection method

This section provides a detailed discussion of the position detection and defect identification of the vertical pins for multi-category electrical connectors. Figure 4 shows the flowchart of the proposed adaptive inspection algorithm. The block diagram summarizes the analysis of the relation among the pin features in the image and position information, the identification of electrical connectors and pins, the hierarchical fitting and extraction strategy of the position data, the arrangement rule learning, and the judgement of three abnormal problems. More details about the proposed method are provided in the following sections.

**Figure 4.** Diagram of proposed adaptive inspection method.

4.1. DNN-based recognition of multi-type connectors and pins

In recent years, the rapid development of DL technology with powerful representation capabilities has provided an excellent solution to target detection problems. Common recognition frameworks based on DL mainly include Faster R-CNN [35], SSD [36] and YOLO2 [37]. Among them, Faster R-CNN is known for its region proposal idea and anchor mechanism, while YOLO2 and SSD both combine the regression idea with the anchor mechanism.

The first issue encountered in this paper is how to effectively identify the pins. The recognition of the electrical connector is also essential, because the extra prior information that is important for the subsequent algorithm execution can only be obtained according to the model of the product. However, a problem exists, i.e. the area ratio between the pin and the image is too small. This actually belongs to the problem of small target detection [38, 39], which has drawn increasing attention in the DL field. The properties of the current problem are quite common in industrial automation, which is rich in prior knowledge compared with everyday computer vision tasks: products are rigid and the targets of interest are often arranged in a regular manner. With this observation, this paper proposes a two-step identification strategy based on prior knowledge constraints through embedded manufacturing information of the products. Firstly, the tasks are defined as prior knowledge loading and pin recognition, and two Faster R-CNN models are separately trained. The former realizes the classification of the electrical connectors and the recognition of the pins region, and then obtains all prior information related to the corresponding model. Subsequent operations are performed within the pins region, which also reduces background interference.

The principle of object detection based on the Faster R-CNN framework is shown in figure 5. Faster R-CNN consists of two branch modules and a shared convolutional layer. One branch is the region proposal network (RPN), which is used to generate a series of candidate regions; the other is the Fast R-CNN [40] detector used to perform specific target detection on candidate regions. The shared layer employs the structure of ZF-Net [41] and contains five convolution layers. During the training phase, the RPN and Fast R-CNN detectors share the convolutional layer to achieve joint training. In the inference phase, the images are processed sequentially by pre-trained RPN and Fast R-CNN detectors to obtain the final result.

However, for some large connectors with a large number of pins, the small area ratio between the tip feature and the pins region is still a major factor limiting the recognition rate. Therefore, we define the pin type, feature size, and distribution rule as a priori information and bind it to the corresponding connector model. The size of the tip feature can provide a strong scaling constraint for the regressor to prevent the regression operation from shifting too far. Once the pin distribution rules are known, an empirically based 2D scatter template can be constructed in advance and registered to the pins that have been identified, in order to supplement the pin recognition and to judge whether pins are absent. The advantage of this approach stems from the fact that most targets in industry are rigid and structurally invariant, and do not undergo excessive deformation. The method flow is shown in figure 6.

**Figure 6.** Aided detection based on prior information constraints.

Four vertices of the pin region are selected as the control points to play a major role in implementing the registration process, where the registration strategy we apply is inspired by weight-based ICP algorithms [42]. The given 2D points are defined by ${\bf P}(\in {{\mathbb{R}}^{N\times 2}})$ , and the centers of these obtained positions are indicated as ${\bf X}(\in {{\mathbb{R}}^{M\times 2}})$ , which can be formed as

$${\bf X}=\{{\bf x}^{1},{\bf x}^{2},{\bf x}^{3},{\bf x}^{4},{\bf x}^{5},\ldots ,{\bf x}^{M}\}, \tag{1}$$

where $\{{{{\bf x}}^{1}},{{{\bf x}}^{2}},{{{\bf x}}^{3}},{{{\bf x}}^{4}}\}$ represent the vertices. The registration problem can be considered as gradually correcting the correspondence of the point pairs $\{\ldots ,{{{\bf p}}^{i}},{{{\bf x}}^{i}},\ldots \}$ during the iteration, and configuring the weights $w$ according to the current result, so that the objective function (2)

$$f\left( {\bf R},{\bf t} \right)=\sum_{i=1}^{M}\frac{w^{i}\left\Vert {\bf R}{\bf p}^{i}+{\bf t}-{\bf x}^{i} \right\Vert_{2}^{2}}{\sigma} \tag{2}$$

converges to the global (or a local) minimum. In this problem, ${\bf R}\in {{\mathbb{R}}^{2\times 2}}$ is the rotation matrix, ${\bf t}\in {{\mathbb{R}}^{2}}$ is the translation vector and $\sigma$ represents the scale factor. According to [42, 43], equation (2) can be simplified to

$$\begin{aligned} f\left( {\bf R},{\bf t} \right)=&\,\sum_{i=1}^{M}\frac{w^{i}}{\sigma}\left( \left\Vert {\bf p}^{i}-\bar{\bf p} \right\Vert_{2}^{2}+\left\Vert {\bf x}^{i}-\bar{\bf x} \right\Vert_{2}^{2} \right) \\ &+\frac{\hat{w}}{\sigma}\left( \left\Vert {\bf C} \right\Vert_{2}^{2}-2\,{\rm Trace}\left( {\bf RH} \right) \right), \end{aligned} \tag{3}$$

where

$$\begin{aligned} &{\bf H}=\frac{1}{\hat{w}}\sum_{i=1}^{M}w^{i}{\bf p}^{i}({\bf x}^{i})^{T}-\bar{\bf p}\bar{\bf x}^{T},\quad {\bf C}={\bf R}\bar{\bf p}+{\bf t}-\bar{\bf x}, \\ &\hat{w}=\sum_{i=1}^{M}w^{i},\quad \bar{\bf p}=\frac{1}{\hat{w}}\sum_{i=1}^{M}w^{i}{\bf p}^{i},\quad \bar{\bf x}=\frac{1}{\hat{w}}\sum_{i=1}^{M}w^{i}{\bf x}^{i}. \end{aligned} \tag{4}$$

A common method is to search for an ${{{\bf R}}^{*}}$ that maximizes ${\rm Trace}({\bf RH})$ using singular value decomposition (SVD), and then calculate ${{{\bf T}}^{*}}$ by setting $\left\Vert {\bf C} \right\Vert_{2}^{2}=0$. A good initial state is critical for ICP-based algorithms, so we perform the initialization by aligning (or approximately aligning) the template ${{{\bf p}}_{{\rm c}}}(0)$ with the obtained corners ${{{\bf x}}_{{\rm c}}}(0)=\{{\bf x}_{{\rm ul}}^{1},{\bf x}_{{\rm ur}}^{2},{\bf x}_{{\rm bl}}^{3},{\bf x}_{{\rm br}}^{4}\}$, from which the scaling factor $\sigma$ can also get an excellent initial value ${{\sigma }^{0}}$. At the same time, a corner movement threshold $\delta$ is introduced to limit the registration process and avoid unexpected optimization results, as shown in case III in figure 5. The constraints can be described as

$$\left\Vert {\bf R}_{k}^{*}\overline{{\bf p}_{\rm c}}(k-1)+{\bf T}_{k}^{*}-\frac{\overline{{\bf p}_{\rm c}}(0)}{\sigma^{0}} \right\Vert_{2}^{2}\leqslant \frac{\delta}{\sigma^{0}}, \tag{5}$$

where ${\bf R}_{k}^{*}$ and ${\bf T}_{k}^{*}$ are the optimal solutions in the $k$th iteration, and $\overline{{{{\bf p}}_{{\rm c}}}}(k-1)$ represents the center of the template corners after $k-1$ iterations. During the iterations, the weight factors are updated according to the weight function

$$w_{\rm Tu}\left( s \right)=\begin{cases} \left( 1-\dfrac{s^{2}}{\kappa_{\rm Tu}^{2}} \right)^{2}, & \left| s \right|\leqslant \kappa_{\rm Tu} \\ 0, & \left| s \right|>\kappa_{\rm Tu} \end{cases} \tag{6}$$

in which the Tukey bi-weight is adopted as the criterion function for the protection of data, and ${{\kappa }_{{\rm Tu}}}$ is set to 7.0 according to [42]. The parameter

$$s^{i}(k)=\frac{\left\Vert {\bf p}^{i}(k-1)-{\bf x}^{i}(k-1) \right\Vert_{2}}{\sigma^{k-1}} \tag{7}$$

provides the result of the current registration. Considering that the value of $\sigma$ should approach the optimal ${{\sigma }^{*}}$ from above, we prepare a large initial scale ( ${{\sigma }^{0}}>1.0$ ). In this way, ${{\sigma }^{k}}$ can be updated with exponential decay using ${{\sigma }^{k}}=0.995\left( {{\sigma }^{k-1}}-{{\sigma }^{*}} \right)+{{\sigma }^{*}}$ (we let ${{\sigma }^{*}}=\frac{{\bf x}_{{\rm ul}}^{1}-{\bf x}_{{\rm br}}^{4}}{100}$ ).
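One iteration of the weighted registration in equations (2)–(7) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the point correspondences are taken as already fixed (the closest-point search is omitted), the corner constraint (5) is not enforced, and the SVD step is replaced by the closed-form rotation angle, which gives the same maximizer of ${\rm Trace}({\bf RH})$ for proper rotations in 2D. All function names are hypothetical.

```python
import math


def tukey_weight(s, kappa=7.0):
    """Tukey bi-weight of equation (6); kappa = 7.0 as quoted in the text."""
    if abs(s) > kappa:
        return 0.0
    return (1.0 - (s * s) / (kappa * kappa)) ** 2


def weighted_rigid_fit_2d(P, X, w):
    """Return (theta, t) minimizing sum_i w_i * ||R(theta) p_i + t - x_i||^2.

    P, X: equally long lists of (x, y) pairs, already in correspondence.
    w:    non-negative weights with a positive sum.
    """
    w_hat = sum(w)
    if w_hat <= 0.0:
        raise ValueError("weights must have a positive sum")
    # Weighted centroids p_bar and x_bar, as in equation (4).
    p_bar = [sum(wi * p[k] for wi, p in zip(w, P)) / w_hat for k in (0, 1)]
    x_bar = [sum(wi * x[k] for wi, x in zip(w, X)) / w_hat for k in (0, 1)]
    dot = cross = 0.0
    for wi, p, x in zip(w, P, X):
        px, py = p[0] - p_bar[0], p[1] - p_bar[1]
        qx, qy = x[0] - x_bar[0], x[1] - x_bar[1]
        dot += wi * (px * qx + py * qy)
        cross += wi * (px * qy - py * qx)
    # The trace term equals cos(theta)*dot + sin(theta)*cross (up to 1/w_hat),
    # so it is maximized by this angle.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Setting C = R p_bar + t - x_bar = 0 gives the translation.
    t = (x_bar[0] - (c * p_bar[0] - s * p_bar[1]),
         x_bar[1] - (s * p_bar[0] + c * p_bar[1]))
    return theta, t


def reweight(P, X, theta, t, sigma, kappa=7.0):
    """Scaled residuals as in equation (7), mapped through the Tukey weight."""
    c, s = math.cos(theta), math.sin(theta)
    weights = []
    for p, x in zip(P, X):
        rx = c * p[0] - s * p[1] + t[0] - x[0]
        ry = s * p[0] + c * p[1] + t[1] - x[1]
        weights.append(tukey_weight(math.hypot(rx, ry) / sigma, kappa))
    return weights


def next_sigma(sigma, sigma_star, decay=0.995):
    """Exponential decay of the scale towards sigma_star, as in the text."""
    return decay * (sigma - sigma_star) + sigma_star
```

A full registration loop would alternate `weighted_rigid_fit_2d`, `reweight` and `next_sigma` until the transform stops changing. Since $\sigma$ divides every term of (2) equally, it does not change the minimizer for fixed weights; it only controls how quickly the Tukey weights suppress large residuals.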

Each point in the scatters template is bound to the pin number. With the completion of the registration, the pins in the image are numbered one by one. Obviously, once a point in the template does not match a suitable target, there must be an unrecognized pin or a missing pin. Then, a rectangular area is expanded with the point $\left(\,p_{x}^{i},p_{y}^{i} \right)$ as the center, and m templates are randomly selected from the identified pins. Selection of the area size can be described as

$$\left( a_{x}^{i},a_{y}^{i} \right)=\left( p_{x}^{i}-\frac{p_{x}^{i-1}-p_{x}^{i}}{\beta_{x}},\;p_{y}^{i}-\frac{p_{y}^{i-k}-p_{y}^{i}}{\beta_{y}} \right), \tag{8}$$

where $\left( a_{x}^{i},a_{y}^{i} \right)$ is the top-left corner, $[{{\beta }_{x}},{{\beta }_{y}}]$ is a pair for controlling the cutoff ratio in the X and Y directions (we choose [1/2, 1/3]), and the parameter $k$ is used to navigate to the previous row. Matching is performed within the rectangular area to complete the secondary recognition, where the degree of similarity is judged by calculating the normalized cross-correlation (NCC) according to

$${\rm NCC}=\frac{1}{N_{T^{m}}-1}\sum_{i,j}\frac{\left( T^{m}\left( i,j \right)-\overline{T^{m}} \right)\left( S^{c}\left( i,j \right)-\overline{S^{c}} \right)}{\sqrt{\left( T^{m}\left( i,j \right)-\overline{T^{m}} \right)^{2}}\,\sqrt{\left( S^{c}\left( i,j \right)-\overline{S^{c}} \right)^{2}}}, \tag{9}$$

where ${{T}^{m}}$ and ${{N}_{{{T}^{m}}}}$ represent the mth template and its number of pixels, respectively, and ${{S}^{c}}$ is the sub-image currently covered by ${{T}^{m}}$ .
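The similarity score can be illustrated with the common zero-mean form of NCC. This is a minimal sketch, assuming that the square roots in equation (9) stand for the standard deviations of the template and of the sub-image, in which case the score reduces to the expression below; the function name `ncc` and the list-of-rows patch representation are assumptions made for illustration.

```python
import math


def ncc(template, window):
    """Zero-mean normalized cross-correlation of two equal-size gray patches.

    Both arguments are lists of rows of pixel intensities. The score lies in
    [-1, 1]; a patch with no intensity variation yields 0.0 by convention here.
    """
    t = [v for row in template for v in row]
    s = [v for row in window for v in row]
    if not t or len(t) != len(s):
        raise ValueError("patches must be non-empty and of equal size")
    t_mean = sum(t) / len(t)
    s_mean = sum(s) / len(s)
    num = sum((a - t_mean) * (b - s_mean) for a, b in zip(t, s))
    den = math.sqrt(sum((a - t_mean) ** 2 for a in t)
                    * sum((b - s_mean) ** 2 for b in s))
    return num / den if den > 0.0 else 0.0
```

Because the mean is subtracted and the result is divided by the spread of both patches, the score is unaffected by a uniform gain or offset in brightness, which is the property that makes it usable across pins that receive different amounts of light.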

The selection criterion for the parameter $m$ depends on the total number of pins of the connector model. More pins usually mean more possible imaging states, so $m$ tends to be larger. If the score after matching is too low, the pin is regarded as missing and its corresponding number is saved.
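The numbering step described in this subsection, in which every registered template point either claims a detected pin or is flagged for secondary recognition, can be sketched as follows. This is only an illustration: the greedy nearest-neighbour assignment, the distance threshold `max_dist` and the function name are assumptions, not details given in the text.

```python
import math


def assign_pin_numbers(template, detections, max_dist):
    """Match registered template points to detected pin centers.

    template:   dict mapping pin number -> (x, y) after registration.
    detections: list of detected pin centers (x, y).
    max_dist:   hypothetical acceptance radius in pixels.
    Returns (numbered, unmatched): a dict pin number -> detection, and the
    list of pin numbers that found no detection within max_dist.
    """
    numbered = {}
    unmatched = []
    used = set()
    for number, (tx, ty) in sorted(template.items()):
        best, best_d = None, max_dist
        for idx, (dx, dy) in enumerate(detections):
            if idx in used:
                continue  # each detection may carry only one pin number
            d = math.hypot(dx - tx, dy - ty)
            if d <= best_d:
                best, best_d = idx, d
        if best is None:
            unmatched.append(number)
        else:
            used.add(best)
            numbered[number] = detections[best]
    return numbered, unmatched
```

The pin numbers returned in `unmatched` are the candidates for the template-matching pass: each is either a pin the detector failed to recognize or a pin that is actually missing, and the NCC score decides between the two.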

4.2. Hierarchical extraction strategy of target features

So far, the model of the electrical connector and its pins, together with their position rectangles, have been identified. On this basis, a hierarchical extraction strategy is constructed to search for the expected position characterization points described in section 3 within the rectangular boxes of different kinds of pins, and to accommodate any possible imaging diversity issues.

4.2.1. Adaptive binarization strategy.

The pin type has been defined before the training of the model classifier and is provided in this section as prior knowledge. The first thing to consider is how to extract the A-region and B-region for subsequent analysis. By observing the features of different types of pins in the fixed visual environment, we first design an adaptive binarization strategy based on shape and structure constraints, which can effectively solve the problem of incomplete A- and B-regions obtained by traditional binarization algorithms. As shown in figure 7, in addition to a few normal states (similar to (a)2), the common diversity of tip features includes: different backgrounds ((a)1 and (b)1), region adhesions ((b)2), region mixing and hybrids ((c)1, (d)1 and (d)3), noise and region deformation.

**Figure 7.** The control process of the proposed adaptive binarization algorithm. (a)–(d) Provide some common diversity in different types of pins, where it can be seen that different stages of the algorithm can be adapted to different tip features.

The shape and quality score rules are separately formulated to examine the target so that the desired features can be flexibly discriminated in different states. In the initial stage, binarization based on the percentage of pixel intensity is performed. Generally, considering the fact that the pixel intensity of the noise is lower than the tip, the threshold $\mu$ is chosen to be more relaxed and thus the noise will gradually degenerate during the subsequent iterations (The noise in ((c)2) gradually disappears with the iteration). The shape scoring criterion $Q(a,b,S,e)$ is established by the aspect ratio $a$ of bounding rectangle (BR), $b$ of minimum area bounding box (MABB), area S, and roundness e of the connected component:

$$Q(a,b,S,e)=\frac{a_{t}b_{t}+a_{t}+b_{t}+\frac{a_{t}}{b_{t}}}{S_{t}}+e_{t}. \tag{10}$$
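As a minimal sketch, equation (10) can be evaluated directly from the four shape descriptors of one connected component. The function name, the argument names and the example values are illustrative assumptions and are not taken from the paper.

```python
def shape_score(a, b, area, roundness):
    """Shape score Q of equation (10) for one connected component.

    a         : aspect ratio of the bounding rectangle (BR)
    b         : aspect ratio of the minimum area bounding box (MABB)
    area      : area S of the component in pixels
    roundness : roundness e of the component
    """
    if area <= 0 or b <= 0:
        raise ValueError("area and b must be positive")
    return (a * b + a + b + a / b) / area + roundness


# Hypothetical example: a square-like, fairly round blob of 400 pixels.
q_example = shape_score(a=1.0, b=1.0, area=400.0, roundness=0.9)
```

In this form the score is dominated by the roundness term once the area is large, which matches the role of $e$ as the main quality indicator in the text.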

The binarization operation also determines the amount of information available to subsequent processing steps and is key to the accuracy and consistency of the measurement. By varying the parameters, we found that the consistency of the reconstructed 3D points is highly correlated with the parameters $b$ and $e$. Polynomial fitting is therefore performed on parameter values that provided good results, and a feature quality condition $T\left( b,e \right)$ is constructed. The overall control process and the best result (red rectangle) of the algorithm are shown in figure 7.

In algorithm 1, we use the notation $f$ to denote a contour contraction operation. That is, $\hat{\bf A}^{c}=f({\bf A}^{c})$ is a contracted contour made up of pixel points $\hat{\boldsymbol{a}}_{i}\in \hat{\bf A}^{c}$ and

$$\hat{\boldsymbol{a}}_{i}=\xi \left( \boldsymbol{a}_{i}-\hat{\bf A}^{c} \right)+\hat{\bf A}^{c}, \tag{11}$$

where $\xi$ ( $0<\xi <1$ ) is a contraction coefficient positively related to ${{S}_{{{t}_{A}}}}$ and type. We get the following values of the parameters:

$$\left\{\begin{array}{l} \xi_{\rm ball\text{-}head}=\gamma_{\rm ball}+\dfrac{S_{t_{A}}}{\rm wh} \\ \xi_{\rm cone\text{-}head}=\gamma_{\rm cone}+\dfrac{S_{t_{A}}}{\rm wh} \\ \xi_{\rm square\text{-}head}=\gamma_{\rm square}+\dfrac{S_{t_{A}}}{\rm wh} \end{array}\right. \tag{12}$$

Algorithm 1. Adaptive iterative binarization.

Input: Pins $I_{1}$, $I_{2}$, ..., $I_{\rm pins}$ with size of $W\times H$, percentage $\mu$, step $\tau$, area limit $\Delta$, condition $T$.

Output: A-region ${\bf A}$ and its binary threshold $t_{A}$; ${\bf B}$ and $t_{B}$.

/* intensity percentage */

• For each $I_{k}$, the $N$ pixels are sorted by gray-value $g$ and binarization is performed with $t_{0}=g_{\mu N}$.

• Label connected component A, B (if exists) and noise.

/* iterations based on shape constraints */

• Let $t=t_{0}$ and repeat if $t<255$ and $S_{t}>\Delta$:

a) Binarize by $t$ and fit the external polygon ${\bf A}_{t}^{c}$ of A (and noise).

b) Calculate the $S_{t}$, $e_{t}$, $a_{t}$ and $b_{t}$ of ${\bf A}_{t}^{c}$ and save the score $Q_{t}(a,b,S,e)$.

c) Save $T\left( b \right)=p_{1}b^{2}+p_{2}b+p_{3}$. If $e_{t}>T\left( b_{t} \right)$ and $S_{t}>|\varepsilon |N$ ($|\varepsilon |<0.02$), break.

d) Let $t=t+\tau$.

• $t_{A}=\underset{t}{\arg \max }\,(Q_{t}+e_{t}-T(b_{t}))$, ${\bf A}^{c}={\bf A}_{t_{A}}^{c}$.

/* layered strategy */

• $\hat{\bf A}^{c}=f({\bf A}^{c})$; ${\bf B}_{t}$ is obtained by binarizing $\hat{\bf A}^{c}$ with $t_{B}(>240)$.
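The control flow of algorithm 1 can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: pixels are assumed to be sorted in ascending order when picking $t_0$, and the image processing of steps a) and b) is hidden behind a hypothetical caller-supplied function `measure(t)` that returns $(S_t, e_t, a_t, b_t)$ for the A-region found at threshold $t$. All function and parameter names are illustrative.

```python
def initial_threshold(gray_values, mu):
    """t0 = g_{mu N}: gray value at fraction mu of the sorted pixels.

    Assumption: ascending sort order (the paper does not state it).
    """
    ordered = sorted(gray_values)
    index = min(int(mu * len(ordered)), len(ordered) - 1)
    return ordered[index]


def select_threshold(measure, t0, tau, area_limit, n_pixels, p, eps=0.01):
    """Scan t0, t0 + tau, ... and return the t maximizing Q_t + e_t - T(b_t).

    measure(t) is a hypothetical callback returning (S_t, e_t, a_t, b_t).
    p = (p1, p2, p3) are the coefficients of the quality polynomial T(b).
    """
    p1, p2, p3 = p
    best_t, best_value = None, float("-inf")
    t = t0
    while t < 255:
        S, e, a, b = measure(t)
        if S <= area_limit:          # region became too small, stop iterating
            break
        Q = (a * b + a + b + a / b) / S + e   # equation (10)
        T = p1 * b * b + p2 * b + p3          # quality condition T(b)
        value = Q + e - T
        if value > best_value:
            best_t, best_value = t, value
        if e > T and S > abs(eps) * n_pixels:  # early exit of step c)
            break
        t += tau
    return best_t
```

Separating the loop from `measure` keeps the sketch independent of any particular image library; in practice the callback would binarize the pin image and compute the contour descriptors.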

The variable $\gamma$ is related to the size of the feature. Generally, several images are selected as input samples, and these parameters are set empirically by observing the degree of separation of the A- and B-regions. In this work we choose 0.84, 0.68 and 0.75 for $\gamma_{\rm ball}$, $\gamma_{\rm cone}$ and $\gamma_{\rm square}$, respectively.
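The contraction of equations (11) and (12) can be illustrated with a short sketch. Two readings are assumed here and are not confirmed by the text: the reference point in equation (11) is taken to be the centroid of the contour, and ${\rm wh}$ is taken to be the pixel area of the pin image (width times height). Function names are illustrative.

```python
# Empirical offsets reported in the text for the three tip types.
GAMMA = {"ball": 0.84, "cone": 0.68, "square": 0.75}


def contraction_coefficient(pin_type, area, width, height):
    """xi = gamma_type + S / (w * h), following equation (12).

    Assumption: 'wh' is the pin image area in pixels.
    """
    xi = GAMMA[pin_type] + area / (width * height)
    if not 0.0 < xi < 1.0:
        raise ValueError("contraction coefficient must lie in (0, 1)")
    return xi


def contract_contour(points, xi):
    """Shrink a contour towards its centroid by the factor xi.

    Assumption: the reference point of equation (11) is the contour centroid.
    points is a list of (x, y) pixel coordinates.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [(xi * (x - cx) + cx, xi * (y - cy) + cy) for x, y in points]
```

For example, a cone-head tip whose A-region covers 100 pixels of a 40 by 50 pixel patch would be contracted with a coefficient of about 0.73 under these assumptions.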

Finally, the high-quality A- and B-regions ${\bf A},{\bf B}\in \mathbb{R}^{K\times 2}$, their contours ${\bf A}^{c},{\bf B}^{c}$, and the binarization thresholds $t_{A}$, $t_{B}$ are output. In this way, for tips with good-quality features as well as tips with feature adhesion, noise, or intensity inconsistency, the A- and B-regions can be effectively distinguished and sent to the hierarchical analyzer for estimating the position of the pins. However, when regional mixing occurs (A-, B- and noise regions), we can only use the intensity percentage and directly employ the hierarchical analyzer, relying on the invariance of the regional structure. Define ${\bf N}_{A}$ and ${\bf N}_{B}$ as the noise generated by A and B, and ${\bf N}_{A}^{c}$, ${\bf N}_{B}^{c}$ as the corresponding external contours.

4.2.2. Hierarchical analyzer.

For convenience of description, $\tilde{*}^{c}$ denotes in this section the new connected component formed by the external contour of $*$. For any tip image $I$, the basic task of the second stage is to perform structural analysis on the received $\boldsymbol{\Omega }=(\tilde{\bf A}^{c},\tilde{\bf N}_{A}^{c},\tilde{\bf B}^{c},\tilde{\bf N}_{B}^{c})$, adaptively distinguish these elements, and extract the most appropriate target point ${\bf x}$ for describing the pin position while meeting the data consistency requirements. The consistency of the measurement data derives from the consistency of pixel extraction and the correctness of point-pair relationships. Let the function $\Psi$ denote a target point estimation operation, ${\bf x}=\Psi ({\bf Q}^{c})$, where ${\bf Q}^{c}$ is the region of interest inferred by the hierarchical analyzer from $\boldsymbol{\Omega }$.

The sets $\left\{{\bf Q}_{\rm left}^{c}\left( 1 \right),{\bf Q}_{\rm left}^{c}\left( 2 \right),\ldots \right\}$ and $\left\{{\bf Q}_{\rm right}^{c}\left( 1 \right),{\bf Q}_{\rm right}^{c}\left( 2 \right),\ldots \right\}$ obtained by the binocular vision system for multiple postures of the same pin need to meet the following conditions:

(1) for any ${\bf Q}_{\rm left}^{c}\left( i \right)$, once ${\bf Q}_{\rm left}^{c}\left( i \right)\subseteq \tilde{\bf B}_{\rm left}^{c}$, the rest should also be $\subseteq \tilde{\bf B}_{\rm left}^{c}$;
(2) for any posture $i$, if ${\bf Q}_{\rm left}^{c}\left( i \right)\subseteq \tilde{\bf B}_{\rm left}^{c}$, then ${\bf Q}_{\rm right}^{c}\left( i \right)\subseteq \tilde{\bf B}_{\rm right}^{c}$.

The same rule applies to $\tilde{\bf A}^{c}$. Although $\boldsymbol{\Omega }$ is subject to a variety of factors (as shown in table 1), the structure of the tip feature is invariant and can be described as

$$\begin{array}{l} \tilde{\bf B}^{c}\subset \tilde{\bf A}^{c},\quad \tilde{\bf N}_{B}^{c}\subset \tilde{\bf A}^{c} \\ \tilde{\bf N}_{B}^{c}\subset \tilde{\bf B}^{c}\ {\rm or}\ \tilde{\bf N}_{B}^{c}\cap \tilde{\bf B}^{c}=\varnothing \ {\rm or}\ (\tilde{\bf N}_{B}^{c}\cap \tilde{\bf B}^{c})\subset {\bf B}^{c} \\ (\tilde{\bf N}_{A}^{c}\cap \tilde{\bf A}^{c})\subset {\bf A}^{c}\ {\rm or}\ \tilde{\bf N}_{A}^{c}\cap \tilde{\bf A}^{c}=\varnothing . \end{array} \tag{13}$$

In addition, when the pixel $x_{1}$ in the left view is selected, the epipolar line in the right view can be calculated, and the corresponding pixel $x_{2}$ will theoretically fall on this line. This epipolar constraint between the two corresponding pixels is expressed through the fundamental matrix $\boldsymbol{F}$ as $x_{2}^{T}\boldsymbol{F}x_{1}=0$. In practice, due to calibration errors, noise, and pixel quantization errors, correct point pairs may not satisfy this equation exactly. We therefore introduce an error tolerance factor $\gamma$ and design a noise reduction strategy based on epipolar geometry to support the analysis. In this strategy, any pixel point $\boldsymbol{b}_{\rm left}\in \tilde{\bf B}_{\rm left}^{c}$ and the corresponding $\boldsymbol{b}_{\rm right}\in \tilde{\bf B}_{\rm right}^{c}$ should meet the conditions

$$\frac{\left| \boldsymbol{b}_{\rm right}^{T}\boldsymbol{F}\boldsymbol{b}_{\rm left} \right|}{\left\Vert \boldsymbol{F}\boldsymbol{b}_{\rm left} \right\Vert_{2}}\leqslant \gamma ,\quad \frac{\left| \boldsymbol{b}_{\rm left}^{T}\boldsymbol{F}^{T}\boldsymbol{b}_{\rm right} \right|}{\left\Vert \boldsymbol{F}^{T}\boldsymbol{b}_{\rm right} \right\Vert_{2}}\leqslant \gamma . \tag{14}$$
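The two residuals in equation (14) can be checked with a few lines of code. The sketch below follows the equation as written, normalizing by the full 2-norm of $\boldsymbol{F}\boldsymbol{b}$; a point-to-line distance in pixels would normalize by the first two components only. Points are homogeneous pixel coordinates $(u, v, 1)$, and all names are illustrative assumptions.

```python
import math


def _mat_vec(M, v):
    """Product of a 3x3 matrix (nested lists) with a 3-vector."""
    return [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]


def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def epipolar_residuals(F, b_left, b_right):
    """Return the two normalized residuals on the left side of equation (14)."""
    Ft = [list(row) for row in zip(*F)]          # transpose of F
    line_right = _mat_vec(F, b_left)             # F b_left
    line_left = _mat_vec(Ft, b_right)            # F^T b_right
    n_right = math.sqrt(_dot(line_right, line_right))
    n_left = math.sqrt(_dot(line_left, line_left))
    if n_right == 0.0 or n_left == 0.0:
        raise ValueError("degenerate epipolar line (point at an epipole)")
    return (abs(_dot(b_right, line_right)) / n_right,
            abs(_dot(b_left, line_left)) / n_left)


def satisfies_epipolar_tolerance(F, b_left, b_right, gamma):
    """True if both residuals are within the tolerance factor gamma."""
    r1, r2 = epipolar_residuals(F, b_left, b_right)
    return r1 <= gamma and r2 <= gamma
```

Candidate pixels of $\tilde{\bf B}_{\rm right}^{c}$ that fail this test for every pixel of $\tilde{\bf B}_{\rm left}^{c}$ would be natural candidates for the noise sets, although the exact filtering rule is not spelled out in the text.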

If ${\bf Q}_{\rm left}^{c}$ is determined in the left image, the search range in the right image can be reduced. Likewise, if the noise regions $\tilde{\bf N}_{A}^{c}$ and $\tilde{\bf N}_{B}^{c}$ are found in one view, they can also be checked in the other view.

Based on the above analysis, a hierarchical extraction strategy tree is constructed, and six kinds of fitting approaches $\Psi$ based on connected components are assembled to describe the various states of the A-region and B-region. The labeled connected component map can be effectively represented by an inclusion tree structure [43]. As shown in figure 8, the hierarchical analyzer proposed in this paper analyzes the structure and describes the remaining $\tilde{\bf A}^{c},\tilde{\bf N}_{A}^{c},\tilde{\bf B}^{c},\tilde{\bf N}_{B}^{c}$ using eleven layout states, yielding eleven different scenarios. The scenarios are named after letters whose shapes resemble their layouts (O, Q, G, D, M, X, V, B, C, U, H), and corresponding solutions are provided.

**Figure 8.** Hierarchical analyzer operation process and 11 kinds of solutions for different layouts of the connected component map.

For the ball-head, after A- and B-regions are analyzed by their decision makers, ${{{\bf Q}}^{c}}$ will be determined according to their scores. Methods O, M, and G are then selected based on their relative positions. Besides, method D portrays the solution to the deformation of A-region. It is worth emphasizing that when the relative position of the child node B-region and the parent node A-region is appropriate, the centroid of B-region is directly used and, in contrast, the A-region needs to be truncated based on the empirical value like D (right).

For the ball-head with cylinder, the top of the cylinder $\tilde{\bf B}_{1}^{c}$ can provide a strong positional constraint, and its sidewall shows an arcuate low-gray area $\tilde{\bf B}_{2}^{c}$. The task is to find $\tilde{\bf B}_{1}^{c}$ accurately, and when $\tilde{\bf B}_{1}^{c}$ is missing, its possible position must be inferred from the shape and direction of $\tilde{\bf B}_{2}^{c}$. Method X generally uses the new $\tilde{\bf B}^{c}$ labeled by the A-region decision maker. Methods D, B, and C correspond to three different cases (only $\tilde{\bf B}_{1}^{c}$, only $\tilde{\bf B}_{2}^{c}$, or both). Method V addresses region mixing or adhesions by estimating the possible region of $\tilde{\bf B}_{1}^{c}$ from the defect direction of $\tilde{\bf B}_{2}^{c}$.

For the cone-head, method O is usually used to perform centroid fitting of A- and B-regions with a high roundness, which is the most basic strategy. Method Q usually contains an obvious-noise elimination operation on this basis. In addition, when roundness is poor in B-region (B-region being damaged) but good in the circumscribed contour, method Q can also be used. Sometimes, the A-region will be destroyed by B-region, which leads to some cracks and defects. If the cracks are small or skew with respect to the central position of the A-region, method X is more inclined to be used. In contrast, once the crack is large enough to penetrate the A-region, the B-region centroid can be calculated according to the U-layout. When the A-region has a defect and is accompanied by a neighbor node, we can identify whether it is noise or a part of the region by observing the positional relationship between the node and the defect orientation. X- and H-methods distinguish A- and B-regions in different ways and both fit the centroid.

According to the above strategy, the corresponding pin position characterization points are extracted and bound to the pin numbers registered in section 3.1. These point pairs are then used to reconstruct the 3D data through triangulation, so that defects can be recognized and the arrangement rule can be learned.

4.3. Triangulation and outlier judgment through rule learning

Generally, triangulation is employed to obtain the stereo reconstruction of these corresponding 2D pixel points. Through the camera models, a point ${\bf P}$ in 3D space is projected into the two pixel coordinate systems, where it is denoted by ${\bf p}$ and ${\bf p}^{\prime }$, respectively. We use $\hat{\bf X}$, ${\bf x}_{l}$, and ${\bf x}_{r}$ to represent the homogeneous coordinates of these three points, which are related by

$$\left\{\begin{array}{l} {\bf x}_{l}={\bf M}_{1}\hat{\bf X} \\ {\bf x}_{r}={\bf M}_{2}\hat{\bf X} \end{array}\right. \tag{15}$$

$${\bf A}=\left[\begin{array}{c} x_{l}{\bf M}_{1}^{3T}-{\bf M}_{1}^{1T} \\ y_{l}{\bf M}_{1}^{3T}-{\bf M}_{1}^{2T} \\ x_{r}{\bf M}_{2}^{3T}-{\bf M}_{2}^{1T} \\ y_{r}{\bf M}_{2}^{3T}-{\bf M}_{2}^{2T} \end{array}\right]. \tag{16}$$

After the equation ${\bf A}\hat{\bf X}=0$ is formed, $\hat{\bf X}$ can be derived by using the direct linear transform (DLT). The $3\times 4$ matrices ${\bf M}_{1}$ and ${\bf M}_{2}$ are obtained by camera calibration and are composed of ${\bf R}_{1}$, ${\bf R}_{2}$, ${\bf t}_{1}$ and ${\bf t}_{2}$, which can be obtained by decomposing the essential matrix ${\bf E}$. However, due to noise and digitization errors, the rays ${\rm CP}$ and ${\rm C}^{\prime }{\rm P}$ generally do not intersect exactly, so the epipolar-based optimized triangulation [44] is usually applied to improve accuracy. Assuming that the noise follows a Gaussian distribution, the epipolar lines are denoted as ${\bf l}_{l}$ and ${\bf l}_{r}$, respectively. A pair of rectified points conforming to the epipolar constraint is selected so as to minimize the Euclidean distance to the original noisy points ${\bf x}_{l}$ and ${\bf x}_{r}$ in both views. This process can be expressed as

$$\min_{t}\left\{ d\left( {\bf x}_{l},{\bf l}_{l}\left( t \right) \right)^{2}+d\left( {\bf x}_{r},{\bf l}_{r}\left( t \right) \right)^{2} \right\}, \tag{17}$$

where the parameter $t$ is introduced to describe the position of the rectified pixel and to parametrize the epipolar lines. In this way, the problem is converted into the minimization of a univariate function. In addition, considering the relevance of the pin data, the objective function is rewritten as

$$\min_{\boldsymbol{t}}\left\{ \sum_{i=1}^{\rm pins}\left[ d\left( {\bf x}_{il},{\bf l}_{l}\left( t_{i} \right) \right)^{2}+d\left( {\bf x}_{ir},{\bf l}_{r}\left( t_{i} \right) \right)^{2} \right] \right\}. \tag{18}$$

A set of corrected corresponding point pairs satisfying the epipolar constraint is obtained from $\boldsymbol{t}$, and the global error is minimized. After reconstruction, the position of each pin position characterization point in the left camera coordinate system is represented by $(x,y,z)$.
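The linear step of equations (15) and (16) can be sketched with NumPy as follows. This covers only the DLT solution of ${\bf A}\hat{\bf X}=0$ and not the epipolar-optimized correction of equations (17) and (18); the function name and the example cameras are illustrative assumptions.

```python
import numpy as np


def triangulate_dlt(M1, M2, x_left, x_right):
    """Linear triangulation of one point pair.

    M1, M2  : 3x4 projection matrices of the left and right cameras
    x_left  : (x_l, y_l) pixel coordinates in the left view
    x_right : (x_r, y_r) pixel coordinates in the right view
    Returns the 3D point as a length-3 array.
    """
    M1 = np.asarray(M1, dtype=float)
    M2 = np.asarray(M2, dtype=float)
    # Rows of the matrix A in equation (16).
    A = np.vstack([
        x_left[0] * M1[2] - M1[0],
        x_left[1] * M1[2] - M1[1],
        x_right[0] * M2[2] - M2[0],
        x_right[1] * M2[2] - M2[1],
    ])
    # The solution of A X = 0 is the right singular vector that belongs
    # to the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]


# Hypothetical normalized cameras separated by a unit baseline along x.
M_left = np.hstack([np.eye(3), np.zeros((3, 1))])
M_right = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
point = triangulate_dlt(M_left, M_right, (0.125, 0.05), (-0.125, 0.05))
```

With noise-free inputs the recovered point reproduces the original 3D position; with noisy pixels the result is only the algebraic least-squares estimate, which is why the text applies the epipolar correction first.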

The size benchmark of a product usually cannot be obtained accurately by a visual measurement program designed for inspection, and without a reference, judging whether the acquired point cloud (PCB) deviates in position depends mainly on horizontal comparison. The solution therefore needs a point cloud registration algorithm, and a known reference point cloud (PCA) is also essential. Since the pin numbers of PCB and PCA are known at this stage, we can apply the rigid transform directly to calculate the rotation matrix ${\bf R}$ and the translation vector ${\bf T}$ that align them. This operation can be expressed as

$$\left( {\bf R},{\bf T} \right)={\rm argmin}\left\{ \sum_{i=1}^{\rm pins}\left\Vert {\bf R}\,{\bf PCA}_{i}+{\bf T}-{\bf PCB}_{i} \right\Vert_{2} \right\}, \tag{19}$$

where singular value decomposition (SVD) can be used directly to calculate the transformation. However, minimizing the global deviation generally causes well-located points to share the offset of an outlier, similar to the effect of abnormal points on point registration, and therefore easily leads to false detection. In the current problem, excessive skew or uneven height affects only a minority of pins and there are no noise points. Thus, this paper simply uses the distance between corresponding points after registration as the analysis target.
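A minimal NumPy sketch of the SVD-based rigid alignment with known correspondences is given below. It uses the standard closed form, which minimizes the sum of squared distances, and adds a reflection guard that is not mentioned in the text. Points are stored as rows, and all names are illustrative assumptions.

```python
import numpy as np


def rigid_transform(pca, pcb):
    """Estimate R, T such that R @ pca[i] + T approximates pcb[i].

    pca, pcb : (N, 3) arrays of corresponding points (same pin order).
    """
    pca = np.asarray(pca, dtype=float)
    pcb = np.asarray(pcb, dtype=float)
    c_a, c_b = pca.mean(axis=0), pcb.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (pca - c_a).T @ (pcb - c_b)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (an addition not stated in the paper).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    T = c_b - R @ c_a
    return R, T


def registration_distances(pca, pcb, R, T):
    """Per-pin distance between the transformed PCA and PCB."""
    pca = np.asarray(pca, dtype=float)
    pcb = np.asarray(pcb, dtype=float)
    return np.linalg.norm(pca @ R.T + T - pcb, axis=1)
```

The per-pin distances returned by `registration_distances` correspond to the quantity that the text analyzes after registration.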

To obtain a more fine-grained discrimination, the number of clusters is usually set to 4. In a single iteration, once the number of points whose distance from the target points exceeds the threshold meets the quantity requirement, the corresponding data are removed from PCB and PCA, and the matching is performed again. Because outliers are a minority, the quantity requirement is limited by a linear control condition $(t_{1}{\rm pins}+t_{2}){\rm pins}$, in which the percentage parameter $(t_{1}{\rm pins}+t_{2})$ decreases as the number of pins increases. Additionally, the segmentation threshold $\sigma$ is adjusted dynamically so that an excessive amount of data is not diagnosed as abnormal.

In addition to using the features and extraction algorithms designed above, a further strategy for avoiding inspection errors is to construct a horizontal comparison mode. The idea is that PCA is not generated from a CAD model; instead, it is obtained from a known good connector used as a template [11]. This paper designs a strategy based on algorithm 2 that provides an automatic learning ability for pin arrangement rules, avoiding the subjectivity caused by manual selection. As shown in figure 10, the strategy performs iterative fitting on the historical data to gradually generate a data state space for different types of products.

Algorithm 2. Abnormal diagnosis based on clustering and rigid transform.

Input: 3D points PCB and PCA, and a split threshold $\sigma$.

Output: rotation matrix ${\bf R}$, translation vector ${\bf T}$ and abnormal index I.

• Let $w=1\times 10^{6}$ and repeat while $w>\sigma$:

a) Calculate the centroids $\boldsymbol{C}_{\bf PCA}$ and $\boldsymbol{C}_{\bf PCB}$, then construct the matrix ${\bf H}=({\bf PCA}-\boldsymbol{C}_{\bf PCA})({\bf PCB}-\boldsymbol{C}_{\bf PCB})^{T}$.

b) Perform SVD on ${\bf H}$ to get matrices $\boldsymbol{U}$ and $\boldsymbol{V}$.

c) Let ${\bf R}=\boldsymbol{V}\boldsymbol{U}^{T}$, ${\bf T}=\boldsymbol{C}_{\bf PCB}-{\bf R}\boldsymbol{C}_{\bf PCA}$ and obtain the distance vector $\boldsymbol{D}$ with entries $D_{i}=\left\Vert {\bf R}\,{\bf PCA}_{i}+{\bf T}-{\bf PCB}_{i} \right\Vert_{2}$, $i=1,\ldots ,{\rm pins}$.

d) Use k-means to cluster $\boldsymbol{D}$ into four categories and obtain the category centroids $\boldsymbol{C}$. Let $w=\underset{\boldsymbol{C}_{{\rm kind}\,i}}{\rm argmax}\,(\boldsymbol{C}_{{\rm kind}\,i}-\boldsymbol{C}_{\rm median})$.

e) If $\left( w-\boldsymbol{C}_{\rm median} \right)>\sigma$ and the number of data in this category satisfies $N_{i}<(t_{1}{\rm pins}+t_{2}){\rm pins}$, then save their indices into I and remove them from PCB as well as PCA. Let ${\rm pins}={\rm pins}-N_{i}$ and $\sigma =1.1\sigma$.

• Return ${\bf R}$, ${\bf T}$, I.

**Figure 9.** Examples of different types of pins handled in different ways.

**Figure 10.** Arrangement rule learning strategy based on algorithm 2.

The generation of PCA depends on continuously collected data with no obvious defects. The judgment of defects relies mainly on the manually specified tolerance threshold vector $\boldsymbol{\gamma }$, which represents the warning line for outliers (usually, $\sigma \leqslant 0.9\left\Vert \boldsymbol{\gamma } \right\Vert_{2}$ is set so that $\sigma$ adapts to a manually modified $\boldsymbol{\gamma }$). After the reconstructed PCB is registered with PCA, we collect I and define the corresponding data as PCB_I. By determining whether PCB_I exceeds $\boldsymbol{\gamma }_{xy}$ in the XY-plane and whether it exceeds $\boldsymbol{\gamma }_{z}$ in the Z direction, we can diagnose the abnormal value and further identify skewness or unusual height. Conversely, no defect means that PCB_I is empty or that its data do not exceed the threshold, in which case PCA can be updated using the reconstructed data and historical data. In fact, every point in ${\bf PCA}$ is the centroid of the historical data fitted at each pin position. In the updating stage, the currently reconstructed data are registered to PCA, which is then updated according to the newly generated point cluster at each pin position. In addition, points that have been recorded in I must be continuously filtered out so as to gradually approximate a more reliable pin position arrangement space.
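The tolerance check described above can be sketched as a per-pin comparison. The sketch assumes that the measured point has already been registered to PCA, that $\boldsymbol{\gamma }_{xy}$ is compared with the Euclidean offset in the XY-plane, and that $\boldsymbol{\gamma }_{z}$ is compared with the absolute offset along Z. These readings, the function name and the example values are assumptions for illustration.

```python
import math


def diagnose_pin(measured, reference, gamma_xy, gamma_z):
    """Compare one registered pin position with its reference position.

    measured, reference : (x, y, z) coordinates in millimetres
    gamma_xy, gamma_z   : tolerance thresholds (hypothetical scalars)
    Returns a dict with the offsets and the two defect flags.
    """
    dx, dy, dz = (m - r for m, r in zip(measured, reference))
    lateral = math.hypot(dx, dy)          # offset within the XY-plane
    return {
        "lateral_offset": lateral,
        "height_offset": dz,
        "skewed": lateral > gamma_xy,           # exceeds gamma_xy
        "abnormal_height": abs(dz) > gamma_z,   # exceeds gamma_z
    }


# Hypothetical example: a pin displaced by 0.3 mm along y.
report = diagnose_pin((1.0, 2.0, 5.0), (1.0, 2.3, 5.0), gamma_xy=0.2, gamma_z=0.1)
```

A pin for which both flags are false would, under the same assumptions, be eligible to contribute to the update of the reference cloud.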

Hence, it is possible to flexibly learn the target data with various rules and adjust the tolerance threshold according to different periods, different operators, and different detection requirements to modify the warning intensity and control the accuracy of PCA generation in this way. Moreover, the visual-based diagnosis of the quality of product geometry, in cases where it is difficult to obtain a design benchmark, can be improved by the proposed strategy.

5. Experiment and discussion

5.1. Preliminary work

For this study, 33 kinds of electrical connectors belonging to eight series were prepared. Some of these products are shown in figure 2(a), and the pin types cover the above-mentioned profiles. The experimental platform based on binocular vision is shown in figure 3(a) (right). A low-angle ring light was used and adjusted to a suitable brightness to highlight the A-region and B-region. The working distance is about 250 mm, the depth of field is about 23.6 mm, the field of view is $80\,{\rm mm}\times 65\,{\rm mm}$, and the image resolution is $2592\times 1944$ pixels. The proposed stereo vision system was run on a computer with an Intel i7 2.40 GHz processor and 8 GB of RAM.

In order to verify the performance of the proposed method, connectors were held in hand and inspected in the current fixed imaging environment. There are two reasons for this: first, it keeps all types of electrical connectors within the depth of field; second, it increases the uncertainty of the features and the difficulty of the problem. On this basis, the self-adaptation and robustness of the proposed method were analyzed, and the accuracy and repeatability of the measurement were further verified. Two datasets were prepared accordingly.

(1) Dataset I: we kept the end faces of all electrical connector pins as close as possible to the same visual working distance. Without rotation, as shown in figure 11(a), 20 images were sequentially captured for each model, giving a total of 660 experimental images as dataset I, in which a small amount of horizontal translation was allowed. Dataset I is intended to simulate the on-line visual inspection of a single category of products, emphasizing the detection efficiency for various models, the success rate, and the random error.
(2) Dataset II: connectors were allowed to rotate $\pm 10^{\circ }$ around each of the three coordinate axes and to undergo a displacement that does not exceed the depth of field. Samples were taken as shown in figure 11(b); 300 images were taken for each type of electrical connector (150 left images and 150 right images), and a total of 9900 images were used as dataset II. Through the various imaging postures of different products, the diversity of the environment and of the target features was increased to simulate the conditions of off-line sampling or mixed-line batch production.

**Figure 11.** Experimental preparation. (a) Handheld approach employed to ensure that each type of connector can be captured by the camera within the depth of field (the product is fixed when preparing dataset I). (b) Pose variation: generating diversity by changing the posture to verify the adaptive ability and robustness of the proposed method.

In this study, for the model identification task, 100 images were prepared for each model, and the training, validation and test sets accounted for 25%, 25% and 50%, respectively, following the strategy of the classic VOC2007 [45]. Image flips were used during training for data augmentation. The training process was divided into two phases. In each phase, the RPN and Fast R-CNN were iterated 8000 and 4000 times, respectively, to achieve alternate joint training. The remaining hyperparameters were set to their default values. The original dataset of the pin recognition task contains 1348 images, which are divided into four categories. For this task, the numbers of iterations of the RPN and Fast R-CNN are 15 000 and 8000, respectively, and the other hyperparameters are the same as for the model identification task.

Due to the difficulty of reconstructing the vertical pins, research in this field tends to focus on the reproducibility of the data. Nevertheless, considering the particularity of the proposed method, we also have to pay attention to accuracy. Unfortunately, laser scanning or confocal detection, as used for BGA in [10], was not suitable for obtaining high-precision reference data for multi-type pins. Therefore, we built the corresponding optimal imaging environment for different models of electrical connectors as shown in figure 3, in which the lighting effects and working distances were repeatedly adjusted to produce a more refined B-region, thereby reducing interference. With this approach, the arrangement data of the pin tips could theoretically be approximated to a resolving accuracy of 0.01 mm and was defined as ${\bf PCA}_{\rm opt}^{m}$, where $m$ denotes the electrical connector model.

It should be pointed out that we artificially caused the skewness of pin No. 5 in model Y2-36 ZJLM and the absence of pin No. 32 in model J36A-52ZJ, and added 80 and 600 images to datasets I and II, respectively, to build ${\bf PCA}_{\rm opt\text{-}d}^{\rm Y2\text{-}36}$ and ${\bf PCA}_{\rm opt\text{-}d}^{\rm J36A\text{-}52ZJ}$.

5.2. Comprehensive assessment of the main aspects

In this section, dataset I was employed for the experiments. According to the algorithm procedure, connectors, pin regions, and pins were recognized in sequence. Afterwards, the position points were extracted according to the layout state of the tips, and then the 3D reconstruction and the defect diagnosis were performed. The last step was to register each ${\bf PCB}_{k}^{m}$ to ${\bf PCA}_{\rm opt}^{m}$ using formula (19) and obtain new data $\widehat{\bf PCB}_{k}^{m}$, where $k$ denotes the $k$th of the 20 images. Theoretically, for images captured from the model $m$ connector, the inspection results $\widehat{\bf PCB}^{m}$ would coincide completely. However, due to target translation and random system noise in dataset I, the results are not perfectly consistent. According to

$$\overline{\bf PCB}^{m}=\frac{1}{20}\sum_{k=1}^{20}\widehat{\bf PCB}_{k}^{m}, \tag{20}$$

we calculated the data set centroid at each pin position for the current model. Then as shown in table 2, we defined the average difference $A{{D}^{m}}$ as

$$AD^{m}=\frac{1}{P_{m}}\sum_{i=1}^{P_{m}}\left\Vert \overline{\bf PCB}^{m}(i)-{\bf PCA}_{\rm opt}^{m}(i) \right\Vert_{2}, \tag{21}$$

where ${{P}_{m}}$ is the total number of pins. Range was defined as

$${\rm Range}^{m}=\max_{i}\left\Vert \overline{\bf PCB}^{m}(i)-{\bf PCA}_{\rm opt}^{m}(i) \right\Vert_{2}. \tag{22}$$

Table 2. Evaluation results on dataset I.

| Product features | Model | Success (%) | Average time (s) | Recognition rate, model (%) | Recognition rate, pins (%) | $AD^{m}$ (mm) | ${\rm Range}^{m}$ (mm) | $F_{\max }^{m}$ (mm) |
|---|---|---|---|---|---|---|---|---|
| Ball-head; aligned | J6W-9 | 100 | 0.102 | 100 | 100 | 0.044 | 0.094 | 0.019 |
| Ball-head; aligned | J6W-15 | 100 | 0.107 | 100 | 100 | 0.029 | 0.091 | 0.017 |
| Ball-head; aligned | J6W-25J | 100 | 0.112 | 100 | 100 | 0.031 | 0.093 | 0.018 |
| Ball-head; aligned | J6W-37 | 100 | 0.134 | 100 | 100 | 0.037 | 0.101 | 0.016 |
| Ball-head; aligned | J6W-50 | 100 | 0.153 | 100 | 100 | 0.032 | 0.114 | 0.017 |
| Cylinder; low | J14A-15ZJ | 100 | 0.098 | 100 | 100 | 0.017 | 0.075 | 0.018 |
| Cylinder; low | J14A-26ZJ | 100 | 0.103 | 100 | 100 | 0.019 | 0.064 | 0.015 |
| Cylinder; low | J14A-38ZJ | 100 | 0.108 | 100 | 100 | 0.021 | 0.081 | 0.019 |
| Cylinder; low | J14A-51ZJ | 100 | 0.111 | 100 | 100 | 0.013 | 0.051 | 0.011 |
| Ball-head; low | J18-9P | 100 | 0.114 | 100 | 100 | 0.021 | 0.083 | 0.019 |
| Ball-head; low | J18-15P | 100 | 0.132 | 100 | 100 | 0.020 | 0.098 | 0.021 |
| Ball-head; low | J18-25P | 100 | 0.144 | 100 | 100 | 0.035 | 0.097 | 0.019 |
| Ball-head; low | J18-37P | 100 | 0.163 | 100 | 100 | 0.041 | 0.060 | 0.019 |
| Ball-head; low | J18-50P | 100 | 0.187 | 100 | 100 | 0.039 | 0.079 | 0.021 |
| Ball; exposed | J30JHT05P | 100 | 0.117 | 100 | 100 | 0.051 | 0.085 | 0.021 |
| Ball-head; inside the hole | J30JHT9TJ | 100 | 0.065 | 100 | 100 | 0.023 | 0.058 | 0.013 |
| Ball-head; inside the hole | J30JHT15TJ | 100 | 0.066 | 100 | 100 | 0.017 | 0.087 | 0.013 |
| Ball-head; inside the hole | J30JHT25TJ | 100 | 0.071 | 100 | 100 | 0.017 | 0.067 | 0.013 |
| Ball-head; inside the hole | J30JHT51TJ | 100 | 0.075 | 100 | 100 | 0.014 | 0.071 | 0.012 |
| Cone; low | J36A-9ZJ | 100 | 0.090 | 100 | 100 | 0.020 | 0.074 | 0.011 |
| Cone; aligned | J36A-17TJ | 100 | 0.121 | 100 | 100 | 0.036 | 0.090 | 0.017 |
| Cone; low | J36A-17ZJ | 100 | 0.092 | 100 | 100 | 0.020 | 0.073 | 0.015 |
| Cone; aligned | J36A-26TJ | 100 | 0.154 | 100 | 100 | 0.030 | 0.065 | 0.015 |
| Cone; low | J36A-26ZJ | 100 | 0.099 | 100 | 100 | 0.019 | 0.059 | 0.012 |
| Cone; aligned | J36A-38TJ | 100 | 0.177 | 100 | 100 | 0.024 | 0.069 | 0.018 |
| Cone; aligned | J36A-52TJ | 100 | 0.192 | 100 | 100 | 0.035 | 0.064 | 0.019 |
| Cone; low | J36A-52ZJ | 100 | 0.113 | 100 | 100 | 0.017 | 0.078 | 0.016 |
| Cone; aligned | J36A-62TJ | 100 | 0.185 | 100 | 100 | 0.039 | 0.072 | 0.019 |
| Ball; exposed | CEFC004-X07 | 100 | 0.134 | 100 | 100 | 0.040 | 0.101 | 0.017 |
| Cone; low | J7-9ZJ | 100 | 0.093 | 100 | 100 | 0.017 | 0.085 | 0.016 |
| Ball; aligned | PRC20C-8002 | 100 | 0.112 | 100 | 100 | 0.028 | 0.124 | 0.022 |
| Cylinder; aligned | Y2-36 ZJLM | 97.5 | 0.122 | 97.5 | 100 | 0.040 | 0.119 | 0.020 |
| Cylinder; aligned | Y2-36 ZJLM (skewness) | 100 | 0.121 | 100 | 100 | 0.043 | 0.121 | 0.021 |
| Cylinder; aligned | Y2-50 ZJLM | 97.5 | 0.154 | 97.5 | 100 | 0.034 | 0.124 | 0.020 |

Furthermore, we used the average fluctuation value F to describe the fluctuation of the data group at each pin position and it was expressed as

$$F_{i}^{m}=\frac{1}{20}\sum_{k=1}^{20}\left\Vert \widehat{{\bf PCB}}_{k}^{m}(i)-\overline{{\bf PCB}}^{m}(i)\right\Vert_{2}. \tag{23}$$

It should be noted that table 2 reports, for each model, only the value at the pin position with the largest fluctuation, which is denoted as $F_{\max }^{m}$ .
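The fluctuation measure of formula (23) can be sketched in the same way. The snippet below is a hedged illustration only: `fluctuation_per_pin` and the `registered[k][i]` layout are hypothetical names chosen for this example. It returns one $F_{i}^{m}$ value per pin, and the largest entry plays the role of $F_{\max }^{m}$.

```python
import math


def fluctuation_per_pin(registered):
    """Mean distance of each pin's samples to that pin's centroid.

    registered[k][i] is the registered 3D position of pin i in image k
    (hypothetical layout). Returns a list with one value per pin.
    """
    num_frames = len(registered)
    num_pins = len(registered[0])
    fluctuations = []
    for i in range(num_pins):
        samples = [frame[i] for frame in registered]
        mean_i = tuple(sum(s[d] for s in samples) / num_frames for d in range(3))
        # Formula (23): average distance between each sample and the centroid.
        fluctuations.append(sum(math.dist(s, mean_i) for s in samples) / num_frames)
    return fluctuations


def max_fluctuation(registered):
    """Largest per-pin fluctuation, corresponding to F_max for the model."""
    return max(fluctuation_per_pin(registered))
```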

It can be observed that both the target recognition rate and the calculation accuracy remain good under a small amount of displacement. The small values of $AD$ and Range for most models show that, even outside the optimal imaging state, the proposed strategy remains effective, and the corresponding extraction algorithm successfully divides and conquers the problem of feature diversity. The precision is highest for the cone tips and for the tips located in holes (*-ZJ and J30JHT*), owing to their structural advantages. On the other hand, the ball tips tend to show relatively low precision, particularly those with larger top spherical surfaces, which expand the range of the A- and B-regions and thereby reduce the accuracy.

By observing $F_{\max }^{m}$ , we find that the fluctuations of the reconstructed data are largely independent of the product type. Analysis of the captured images shows that a moderate amount of target movement does not cause a large variation in its features. Therefore, the precision of the final results mainly depends on the random errors of the visual devices themselves (such as the camera and the light source), rather than on the extraction of the corresponding pixels. It should be pointed out that a Y2-36 ZJLM identification error occurred once, with the connector erroneously judged as Y2-50 ZJLM; in our opinion, this can be compensated by the subsequent pin matching process in practical applications.

As shown in figure 12, the results of Y2-36 ZJLM and J36A-52ZJ indicate that the data of each pin position is quite stable and close to the reference generated from the optimal view. Furthermore, the abnormal point does not have a significant impact on the registration results of other locations, proving that the algorithm can effectively filter outliers, thereby avoiding error allocation. Additionally, it is verified that this operation of embedding a priori information to the template point set and performing the registration can successfully return the abnormal pin number. More importantly, once the established method can guarantee the stability and accuracy of the reconstructed data, it is reasonable to believe that quantitative determination of the defects and the strategy for generating PCA based on historical data are feasible.

**Figure 12.** Data analysis and detection identification. (a) Reconstructed data and registration result of model Y2-36ZJLM, (b) registered effect obtained from the corresponding ${\rm PC}{{{\rm A}}_{{\rm opt}}}$ before and after generating the skewness, (c) registered results of reconstructed data containing skewness with PCA generated from 20 non-defective samples, (d)–(f) are the same as (a)–(c), with the target being replaced by J36A-52ZJ containing a missing pin.

It should be noted that we did not arrange a control group here, because the $F_{\max }^{m}$ of a control group is bound to be the same as that of the proposed method. Besides, dataset II can be regarded as an extension of dataset I, so the comparison of the parameters $AD$ and Range is presented in table 4, which better shows the performance and avoids excessive content.

5.3. Analysis of repeatability and robustness

Experiment I verified that horizontal displacement is not the main factor affecting the results, and that all the steps of the method accomplish their tasks. Therefore, in this section, dataset II was used for the experiment. By adding translation and rotation, we simulated the uncertainty of manual operation during off-line secondary sampling. This allows the robustness of the proposed method and the precision and repeatability of the results to be further verified.

Table 3 compares the performance of seven pin recognition strategies. Method 1 (direct method) treats electrical connectors, pins, and backgrounds as different categories of a single identification task [35]. Method 2 directly applies the two-step strategy. Method 3 assumes that the electrical connector has already been identified; we employed image processing algorithms [9, 27, 30, 31] to find the highlight areas, which are taken as the identified pin positions. However, this method requires considerable noise reduction as well as threshold tuning of the image processing algorithm, and its generalization capability is seriously insufficient. Methods 4 to 6 apply the template matching algorithm [19, 31], for which we tested both a single template and multiple templates. The single-template strategy is prone to missed matches and requires proper termination strategies and noise reduction methods. The multi-template strategy has a higher coverage; however, additional processing is needed to prevent noise from being matched before all the correct pins have been found. Clearly, this approach is highly dependent on a prior analysis of the images and the careful selection of appropriate templates. Method 7 is the method proposed in this paper.

Table 3. Comparison of the pins recognition.

	DL-based, direct method	DL-based, two-step approach	Image processing based	TM-based, one template	TM-based, two templates	TM-based, three templates	Proposed strategy
Pins detection	27.955%	82.746%	67.546%	69.412%	77.732%	81.323%	98.644%

The pin recognition rate was defined as

$$\frac{1}{300\sum_{m=1}^{35}p_{m}}\left\{\sum_{m=1}^{35}\left(300p_{m}-\sum_{k=1}^{150}\varrho_{{\rm left}}-\sum_{k=1}^{150}\varrho_{{\rm right}}\right)\right\}, \tag{24}$$

where ${{\varrho }_{{\rm left}}}$ and ${{\varrho }_{{\rm right}}}$ are obtained according to the condition

$$\left\Vert \widehat{{\bf PCB}}_{k}^{m}(i)-{\bf PCA}_{{\rm opt}}^{m}(i)\right\Vert_{2}\geqslant \frac{80\times 11}{2592}. \tag{25}$$
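To make the bookkeeping of formulas (24) and (25) concrete, the sketch below counts the pins that meet condition (25) and converts the counts into a recognition rate. It assumes that $\varrho_{{\rm left}}$ and $\varrho_{{\rm right}}$ are counts of such pins per image, that the deviations are expressed in the same unit as the threshold (the text does not state the unit), and that each model contributes 300 images (150 left and 150 right). All function and parameter names are hypothetical.

```python
# Right-hand side of condition (25), approximately 0.3395.
THRESHOLD = 80 * 11 / 2592


def count_failures(deviations, threshold=THRESHOLD):
    """Number of pins whose deviation meets or exceeds the threshold."""
    return sum(1 for d in deviations if d >= threshold)


def recognition_rate(pins_per_model, failures_per_model, images_per_model=300):
    """Formula (24): share of correctly recognized pins over all models.

    pins_per_model[m] is p_m; failures_per_model[m] is the summed count of
    failed pins over the left and right images of model m (assumed layout).
    """
    total = images_per_model * sum(pins_per_model)
    recognized = sum(
        images_per_model * p - f
        for p, f in zip(pins_per_model, failures_per_model)
    )
    return recognized / total
```

With two hypothetical models of 9 and 15 pins and 27 and 45 failed pins respectively, the function returns 0.99.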

We corrected these inaccurate recognitions to obtain dataset III, labeled with rectangular boxes ${{r}_{m}}(i)$ , and defined the pin recognition rate for the other methods as

$$\frac{1}{300\sum_{m=1}^{35}p_{m}}\left\{\sum_{m=1}^{35}\left(300p_{m}-\lambda_{1}-\lambda_{2}\right)\right\}, \tag{26}$$

where ${{\lambda }_{1}}$ is the total number of unrecognized pins and ${{\lambda }_{2}}$ is the total number of regions that overlap with ${{r}_{m}}(i)$ by less than 75%.
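The two counts in formula (26) can be sketched as follows. The text does not specify how the overlap is measured, so this example assumes it is the share of the labeled box area covered by the detected region; an intersection-over-union measure would be an equally plausible reading. Boxes are given as `(x_min, y_min, x_max, y_max)` tuples, and all names are hypothetical.

```python
def overlap_fraction(label, detection):
    """Share of the labeled box area covered by the detected region."""
    width = min(label[2], detection[2]) - max(label[0], detection[0])
    height = min(label[3], detection[3]) - max(label[1], detection[1])
    if width <= 0 or height <= 0:
        return 0.0  # The boxes do not intersect.
    label_area = (label[2] - label[0]) * (label[3] - label[1])
    return width * height / label_area


def count_lambdas(labels, detections, min_overlap=0.75):
    """Return (lambda_1, lambda_2) for one image.

    labels[i] is the labeled box of pin i; detections[i] is the matched
    detected box, or None when the pin was not recognized at all.
    """
    lambda_1 = sum(1 for d in detections if d is None)
    lambda_2 = sum(
        1
        for box, d in zip(labels, detections)
        if d is not None and overlap_fraction(box, d) < min_overlap
    )
    return lambda_1, lambda_2
```

Summing both counts over all images of a model and substituting them into formula (26) gives the recognition rate used for the comparison methods.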

Deep learning handles the recognition of the electrical connectors well. The recognition rate reaches 99.086% on dataset II, and the errors mainly occur on images captured under excessive tilt angles and on a few blurred images. However, table 3 shows that, because a small target carries very little information, the direct application of deep learning to pin recognition achieves a success rate of only 27.955%, accompanied by a large number of false matches (mainly on the background outside the electrical connector). In the two-step strategy, the pin region is recognized in advance, which reduces the disparity in area between the pins and the background. This yields a recognition rate of 82.746%; however, models with densely arranged pins, such as J36A-52ZJ and J36A-62TJ, still cannot be handled. For the image processing based method, the best result we obtained after repeatedly adjusting the parameters and functions was 67.546%, and many poorly reflective pins could not be extracted. For the template matching methods, building on the high accuracy of model identification, we applied the corresponding pin arrangement rules to guide the matching instead of relying solely on the template matching algorithm. However, the recognition rates of 69.412% to 81.323% indicate that the rate does not increase linearly with the number of templates, while the algorithm becomes increasingly difficult to control. The core problem is that the selected templates cannot adequately cover the various features. As a result, the termination condition is difficult to design, since the score of the noise tends to exceed that of the remaining pins after several matches. In contrast, the proposed algorithm achieves a success rate of 98.644%, outperforming the other methods. Its errors tend to appear on excessively tilted or blurred images.

Similar to table 2, we produced table 4, in which control group II uses the rough pin position obtained in the previous step, a common strategy in some applications [32], and control group I adopts the conventional centroid strategy [19, 31, 33]. It should be noted that the method we used for comparison is not a direct application of the coarse positioning results, but an adaptive version that requires the special binarization operations designed in section 3.2. Since these approaches cannot run without a high recognition rate, the control groups need to be embedded in our framework. The operational efficiencies of the three groups are roughly the same and are close to the average times shown in table 2.

Table 4. Evaluation results on dataset II.

Product features	Model	Control group I, $AD^{m}$ (mm)	Control group I, ${\rm Range}^{m}$ (mm)	Control group I, $F_{\max }^{m}$ (mm)	Control group II, $AD^{m}$ (mm)	Control group II, ${\rm Range}^{m}$ (mm)	Control group II, $F_{\max }^{m}$ (mm)	Proposed strategy, $AD^{m}$ (mm)	Proposed strategy, ${\rm Range}^{m}$ (mm)	Proposed strategy, $F_{\max }^{m}$ (mm)
Ball-head; aligned	J6W-9	0.141	0.201	0.113	0.191	0.297	0.159	0.112	0.143	0.044
	J6W-15	0.139	0.213	0.114	0.189	0.274	0.153	0.109	0.161	0.053
	J6W-25	0.141	0.224	0.115	0.189	0.281	0.155	0.111	0.159	0.053
	J6W-37	0.159	0.256	0.177	0.180	0.254	0.161	0.099	0.183	0.064
	J6W-50	0.171	0.274	0.189	0.204	0.331	0.164	0.120	0.181	0.060
Cylinder; low	J14A-15ZJ	0.077	0.122	0.022	0.093	0.171	0.065	0.076	0.059	0.018
	J14A-26ZJ	0.093	0.132	0.017	0.101	0.184	0.079	0.089	0.104	0.018
	J14A-38ZJ	0.084	0.149	0.016	0.121	0.197	0.071	0.084	0.061	0.011
	J14A-51ZJ	0.063	0.097	0.018	0.119	0.167	0.060	0.041	0.064	0.012
Ball-head; low	J18-9P	0.154	0.207	0.159	0.177	0.283	0.174	0.121	0.174	0.062
	J18-15P	0.131	0.199	0.108	0.190	0.241	0.158	0.116	0.153	0.044
	J18-25P	0.169	0.217	0.031	0.197	0.281	0.193	0.153	0.187	0.026
	J18-37P	0.074	0.101	0.035	0.164	0.254	0.185	0.048	0.062	0.014
	J18-50P	0.067	0.098	0.041	0.184	0.273	0.190	0.064	0.070	0.019
Ball; exposed	J30JHT05P	0.099	0.153	0.037	0.194	0.301	0.175	0.101	0.154	0.036
Ball-head; inside the hole	J30JHT9TJ	0.042	0.062	0.015	0.245	0.487	0.253	0.042	0.062	0.015
	J30JHT15TJ	0.051	0.087	0.019	0.227	0.474	0.214	0.051	0.077	0.019
	J30JHT25TJ	0.047	0.079	0.017	0.221	0.437	0.231	0.047	0.079	0.017
	J30JHT51TJ	0.045	0.066	0.011	0.230	0.454	0.235	0.045	0.056	0.011
Cone; low	J36A-9ZJ	0.097	0.142	0.026	0.107	0.194	0.064	0.085	0.095	0.017
Cone; aligned	J36A-17TJ	0.081	0.131	0.079	0.110	0.209	0.139	0.046	0.078	0.018
Cone; low	J36A-17ZJ	0.069	0.134	0.057	0.108	0.157	0.137	0.083	0.121	0.060
Cone; aligned	J36A-26TJ	0.051	0.121	0.022	0.095	0.180	0.174	0.049	0.077	0.020
Cone; low	J36A-26ZJ	0.058	0.139	0.055	0.105	0.199	0.145	0.069	0.103	0.044
Cone; aligned	J36A-38TJ	0.048	0.115	0.041	0.121	0.201	0.177	0.048	0.079	0.023
Cone; aligned	J36A-52TJ	0.047	0.125	0.044	0.159	0.184	0.162	0.044	0.096	0.027
Cone; low	J36A-52ZJ	0.067	0.134	0.040	0.118	0.219	0.071	0.070	0.111	0.032
Cone; aligned	J36A-62TJ	0.066	0.113	0.051	0.134	0.191	0.075	0.065	0.095	0.024
Ball; exposed	CEFC004-X07	0.101	0.159	0.047	0.154	0.222	0.135	0.077	0.104	0.035
Cone; low	J7-9ZJ	0.083	0.121	0.029	0.105	0.137	0.048	0.077	0.071	0.018
Ball; aligned	PRC20C-8002	0.099	0.154	0.085	0.114	0.149	0.105	0.104	0.175	0.065
Cylinder; aligned	Y2-36 ZJLM	0.174	0.259	0.117	0.169	0.283	0.154	0.141	0.178	0.037
	Y2-36 ZJLM (skewness)	0.168	0.247	0.119	0.166	0.291	0.161	0.139	0.181	0.038
	Y2-50 ZJLM	0.170	0.236	0.120	0.173	0.252	0.164	0.117	0.201	0.045

As can be seen, the proposed method achieves better performance. The results of Group II are generally unstable and are suitable only for less demanding applications. The results of Group I are close to those of the proposed strategy in some special situations, in which the A-region is the only feature and is of excellent quality. However, for electrical connectors with a high feature diversity (such as Y2- and J6W-), the difference is still large. The values of $F_{\max }^{m}$ imply that the proposed strategy maintains a high stability on dataset II. Moreover, given the height abnormality determination range (>0.5 mm), the $F_{\max }^{m}$ of each model is in the range of 0.010 to 0.065 mm and most AD values are within 0.100 mm, which demonstrates the capability of the proposed method for this multi-category product inspection work. Range describes the few cases in which large errors may occur. The table shows that the data of each model basically stay within 0.2 mm, suggesting that, under the current system configuration, the method retains a strong ability of discrimination even when some abnormal conditions are encountered.

Judged from the type and the structure of the pins, the ball-tip usually contains more information, in which A- and B-regions are prone to various unpredictable changes. As shown in figure 13(a), the deformation of A-region will bring about a great deviation to the feature extraction, for which Group I often loses accuracy in these cases. The cracks and the missing of B-region will lead to the extracted pixels not having the correct correspondence. Consequently, the proposed strategy selectively repairs B-region according to the geometric properties of A-region and further ensures the accuracy. On the other hand, as can be seen in figure 13(b), the area of A-region in the tapered-tip is extremely small, for which the ranges of A- and B-regions are all severely restricted. Therefore, both Group I and the proposed method can perform well. Similarly, for the pins mounted in holes (such as J30JHT*TJ), both Group I and the proposed method employ the proposed binarization operation and apply the centroid method, thus the same results are collected and shown in figure 13(c). However, Group II shows a large number of errors due to the fact that the rough recognition phase of this model is different from other products, the target of which is a hole rather than a tip. In addition, given that the machining quality determines the state of the feature, models J18-37P and J18-50P maintain significant A- and B-regions under the light source used, hence the results of Group I and the proposed method are quite desirable. Similarly, for model J30JHT05P, the proposed method automatically judges that the A-region enjoys good geometric properties, thus the centroid method is finally adopted to obtain the same results as Group I.

**Figure 13.** Influence of pin type and structure on the extraction of the tip position characterization point. (a) Ball head, (b) cone head, (c) the pin with a flattened tip installed in the hole, (d) the features of a few connectors in the captured image maintaining robustness to a degree of pose variation.

In terms of the pin layout of the electrical connector, longer rectangular electrical connectors tend to produce larger errors. The left and right ends of the horizontally arranged pins are closer to the edges of the field of view. These two positions are more sensitive to perspective when the target changes its posture (J6W-50, J6W-37, J18-50). As shown in figure 14, the left side of the left view is often accompanied by greater feature deformation and slide resulting from perspective effects. Groups I and II are generally unable to handle both normal and bad conditions, and thus often introduce large errors. The proposed hierarchical extraction strategy fully considers these conditions and ultimately ensures a high performance. However, for different degrees of feature slide, our method is still limited to estimating the possible movement distance of the B-point according to the sliding process of the B-region from good to normal condition. This compensation may introduce small errors and some unavoidable instability.

**Figure 14.** Influence of target poses on features during capture.

The angle parameter is limited to ±10° because the tip will encounter an information hiding phenomenon once it exceeds the range from 12° to 15°. Although the pins located on the left side of a long rectangular product enjoy good extraction results in the right view, the corresponding pixel in the left view cannot be obtained due to an excessive tilt angle. Therefore, the uncertainty brought by the various postures mainly lies in the illumination allocation and the diversity of image features. These are the core factors that affect the inspection results, rather than the posture itself. On the other hand, we observed that the 10° threshold is sufficient for most scenes and applications, even when capturing images by a handheld approach. In general, the proposed method can provide greater flexibility for the traditional vision solutions and successfully complete the detection and measurement tasks of multi-category electrical connectors.

6. Conclusion

Injecting a stronger robustness and adaptability to the traditional vision solution, this paper proposes a vision-based inspection and measurement scheme for multi-category products as well as various target features, extending applications of the visual technology. The ability to recognize small targets is enhanced by combining Faster RCNN-based DL technology, prior knowledge, and conventional template matching algorithm, by which the interested objects in industrial products can be free from the limitations of image processing with manual features and fixed parameters. Based on the analysis of the influence of multi-category problems on visual measurement, a hierarchical feature extraction strategy is constructed to flexibly perform the key points fitting of target features with different types and shapes. In practical applications, many products are similar to the pins of electrical connectors, whose design reference is difficult to use in directly analyzing the quality. The designed rule learning approach applies the historical data to a continuous approximation of the real state of the product to generate a reference, which can also be used to analyze products from different batches or manufacturers. Compared with many existing pins inspection vision schemes, the proposed method shows a successful inspection of 33 kinds of products, of which the adaptivity and robustness demonstrate the potential for conversion to portable measurement. Moreover, the proposed feature strategy shows better accuracy, repeatability, and robustness than traditional centroid methods and coarse positioning methods. In addition, compared to other existing single-category online visual inspection technologies for electronic products, the proposed method has lower requirements for equipment and imaging environments.

In the future, the structured information of industrial products can be used as prior knowledge and combined with a graph convolution network (GCN) to realize few-shot inspection and detection. Besides, tiny features could be analyzed by establishing an end-to-end neural network, and the possibility of reducing perspective interference can also be explored.

Acknowledgments

This work was supported by the Graduate Student Innovation Fund of Beihang University and National Defense Pre-Research Foundation of China (Grant No. 414230104).

Compliance with ethical standards

Funding: This study was funded by the Graduate Student Innovation Fund of Beihang University and the National Defense Pre-Research Foundation of China (Grant No. 414230104).

Conflict of Interest: The authors declare that they have no conflict of interest.
