Image Judgment Auxiliary System for Table Tennis Umpiring under Low Light Conditions

ABSTRACT In table tennis competitions, the most controversial rule-violation judgment concerns the height of the ball toss during service, because inaccuracy in judging the ball height, and therefore erroneous judgment, is unavoidable. In this study, we therefore designed an automatic image judgment auxiliary system for the ball height during table tennis service. We used a high-speed camera to record the ball toss in the table tennis service. It is often difficult to provide enough light for high-speed photography, which leads to underexposure, so the designed algorithm architecture automatically searches for the ball and the position of the hand under low light conditions. The algorithm is mainly divided into hue-saturation-value (HSV) color space processing, followed by a Hough transform to search for the circular ball and morphology processing to find the hand. The experimental results show that color segmentation can successfully and accurately determine the ball position under low light conditions. The morphology method can find the position of the hand and help determine the moment when the ball leaves the hand during the service ball toss. Finally, the actual size of a target in the scene is used to estimate the actual distance represented by one image pixel.


Foreword
Table tennis is currently a popular competitive sport. To achieve good performance, athletes must continuously improve their table tennis skills. As a result, the rules, and the requirement that athletes comply with them, are becoming stricter. The judgment of rule violations in almost all types of sports competitions relies on the professionalism and subjective judgment of the umpire. However, even though umpires have gone through professional training to make their judgment more objective, uncertainties in the umpire's subjective judgment and level of concentration can occasionally cause erroneous judgments. These erroneous judgments can have a significant negative impact on the athlete's psychological state, thereby affecting the competition results. In table tennis competitions, the rule violation that is currently most controversial is the height of the service ball toss. A table tennis service usually takes a few seconds to complete. However, an umpire needs to make more than 10 observations and reach a judgment before or just after the service is completed. This is a very complex task that requires many judgments, even for an experienced umpire [1]. According to Article 2.6.2 of the International Table Tennis Federation Handbook 2017, 'The server shall then project the ball near vertically upwards, without imparting spin, so that it rises at least 16 cm after leaving the palm of the free hand and then falls without touching anything before being struck' [2]. Unlike an out-of-bounds call, where there is a clear boundary as a reference, there is no clear reference for the minimum 16 cm height in a table tennis service, and it is difficult for umpires to accurately judge with the naked eye whether a violation has occurred. The inaccuracy of the ball height judgment has resulted in protests from athletes regarding this penalty. Thus, the use of image technology to help umpires with their judgment is necessary.
Many sports have already introduced image technology to help umpires make the final judgment or to propose ruling changes for erroneous judgments. Some examples are Hawk-Eye in tennis and badminton, and goal-line technology and VAR in soccer. These auxiliary judgment systems can provide clear image proof for a judgment that has been questioned. Currently, the judgment in  [4]. Table tennis umpiring auxiliary systems are still in the developmental stage. Therefore, we focused on building a table tennis umpiring auxiliary system. Desai et al. presented an algorithm for detecting and tracking ping pong balls in sports videos. The proposed method uses motion as the primary cue for detection. The detected object is tracked using the multiple filter bank approach [5]. Wong used videography, image processing, and artificial neural network (ANN) technology to help determine the table tennis serving height [1]. Wong continued to develop an intelligent system that is able to identify and track the location of the ball from live video images and evaluate the service according to the service rules [3]. Wong and Dooley presented a system to automatically detect and track the ball during table tennis services from real match videos [6]. The first task in calculating the height of the service ball toss in table tennis is identifying the circular table tennis ball and tracking its movement. Wong used ANN technology to detect the circular ball. The disadvantage is that the ANN model requires training before it can make the detection, and the trained model can be easily affected by background interference and produce detection errors. Wong then used a multi-layer perceptron and a radial basis function network to improve the table tennis ball identification capability. However, this method is only suitable for analyzing one picture [3]. The advantage of using the circular Hough transform to find circular objects is that the transform resists noise and the algorithm is robust [7][8][9][10].
For image noise filtering, not only can the commonly seen averaging and median filters be used, but morphology-based processing methods can also be used to filter out the noise. These include erosion, dilation, opening, and closing [11,12]. The most important step is color separation, which separates the background from the foreground. Using color separation to separate the background and foreground of the target can significantly reduce the amount of calculation. The first difficulty that this technology encounters is the lighting problem [13,14]. Lighting stability and lighting strength both affect the quality of the color separation. The selected color model can also affect the effectiveness of the separation. Many studies related to skin color detection have mentioned this problem. Chaves-González et al. compared the following color models: red-green-blue (RGB); cyan-magenta-yellow (CMY); YUV, where Y is the luminance component and U and V are two chrominance components; YIQ (luminance, in-phase, quadrature); YPbPr (green (Y)-blue (Pb)-red (Pr)); YCbCr, where Y′ is the luma component and CB and CR are the blue-difference and red-difference chroma components; YCgCr (luminance, chroma-green, chroma-red); YDbDr, the color space used in the SÉCAM analog terrestrial color television broadcasting standard; hue-saturation-value (HSV); hue-intensity-saturation; and the 1931 International Commission on Illumination XYZ color space (CIE-XYZ). Among these, HSV is the most suitable for detecting skin color [15,16]. Cho et al. proposed using an adaptive threshold to detect skin in the HSV color space [17]. Sigal et al. also built a skin color model and used this model in the HSV color space [18]. The key in image technology is the architecture design of the image processing algorithm. Thus, in this study we separated and transformed the image in the HSV color space and used the Hough transform and morphology calculation to build an automatic image processing procedure for table tennis service in low light conditions.

Research Method
The research architecture of this study is described in Section 2.2. Section 2.1 describes the image collection, Section 2.2 describes the color image processing (including the automatic sphere tracking and hand position determination), and Section 2.3 describes the transformation relationship between image pixel and real size. The objective of this study is to determine the location of the ball and the hand's upper edge, which are used to determine the ball height during the serve in relation to the hand. This relation is used to determine the legality of the height of the serve ball, which can provide a reference for researchers who are building an auxiliary judgment system.

Image Collection
In this experiment, we used an ix-cameras i-SPEED 210 (F-mount, 1280 × 1024 resolution, 2 μs shutter, 79,500 fps) high-speed camera. The camera is placed in a position parallel to the table's end line. The shooting distance is 200 cm from the table's side line, and the camera is at a vertical height of 100 cm above the ground. The high-speed camera is used to capture the ball toss action during table tennis services. Sampling takes 525 frames per second and the image size is 880 × 1194 pixels. Image processing is conducted on each frame during the post-analysis. Generally, high-speed camera recording requires sufficient light. However, shining an excessively strong light source on the server not only affects the server's vision, but also does not conform to the lighting at an actual competition. Thus, the high-speed shooting in this study was done with a low light source. The ball used was of a type designated for the 2016 Rio Olympics.

Color Image Processing Procedure
Serving in table tennis can be divided into the serving preparation action, the instant that the ball leaves the hand, the apex of the ball toss, and the moment the ball starts to fall. The objective of this study is to obtain the image track from when the ball leaves the hand to the apex of the ball toss, and then calculate the ball height. The algorithm processing architecture is shown in Figure 1. The challenge for the auxiliary judgment system is that it must make a judgment within a few seconds after the serve is complete. Therefore, the region of interest (ROI) is cut out from each frame to reduce the amount of data and remove unnecessary noise, which increases the calculation and determination speed. Type 1 and Type 2 image processing, both based on the HSV color space, are then used to find the approximate positions of the ball and the hand, respectively. The two image processing methods differ only in their threshold parameters. Partial screenshots of the frames are shown in Figures 2 and 3.
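The ROI cropping and HSV thresholding step can be illustrated with a minimal sketch in standard-library Python. The paper does not list its threshold parameters, so the function name `hsv_mask` and the two threshold sets `TYPE1_BALL` and `TYPE2_HAND` are hypothetical placeholders chosen only to show that the two processing types share one routine and differ in their thresholds.

```python
import colorsys

def hsv_mask(frame, roi, h_range, s_range, v_range):
    """Return a binary mask (rows of 0/1) for the ROI of an RGB frame.

    frame: list of rows, each row a list of (r, g, b) tuples in 0-255.
    roi: (top, bottom, left, right) pixel bounds, bottom/right exclusive.
    h_range, s_range, v_range: (low, high) bounds on the 0.0-1.0 colorsys scale.
    """
    top, bottom, left, right = roi
    mask = []
    for row in frame[top:bottom]:          # crop rows to the ROI
        mask_row = []
        for r, g, b in row[left:right]:    # crop columns to the ROI
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = (h_range[0] <= h <= h_range[1]
                      and s_range[0] <= s <= s_range[1]
                      and v_range[0] <= v <= v_range[1])
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask

# Hypothetical thresholds (assumptions, not the paper's values):
# Type 1 keeps low-saturation, reasonably bright pixels (a white ball),
# Type 2 keeps reddish-orange hues with moderate saturation (skin).
TYPE1_BALL = {"h_range": (0.0, 1.0), "s_range": (0.0, 0.25), "v_range": (0.35, 1.0)}
TYPE2_HAND = {"h_range": (0.0, 0.14), "s_range": (0.25, 0.8), "v_range": (0.2, 1.0)}
```

Calling `hsv_mask(frame, roi, **TYPE1_BALL)` and `hsv_mask(frame, roi, **TYPE2_HAND)` on the same cropped frame gives the two binary images that the later steps work on. In practice the thresholds would have to be tuned to the actual lighting.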
The approximate ball position can already be found in the image after Type 1 processing. To find the precise position, image thresholding was first conducted, and then the Hough transform was used to search for a circular object. If a circular object appears, the ball has left the palm of the server, and the center and radius of the circle are recorded as the ball's precise position. The processed partial frame is shown in Figure 4.
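A circular Hough transform on a thresholded image can be sketched as follows. This is a simplified, unoptimized illustration rather than the implementation used in the study: every edge pixel votes for all candidate centers that lie at distance r from it, and the strongest accumulator cell is accepted only if enough of its ring is covered. The function names, the radius range, and the `min_fraction` acceptance threshold are assumptions.

```python
import math

def edge_pixels(mask):
    """Foreground pixels (x, y) with at least one background or out-of-image 4-neighbour."""
    h, w = len(mask), len(mask[0])
    edges = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                if not (0 <= nx < w and 0 <= ny < h) or not mask[ny][nx]:
                    edges.append((x, y))
                    break
    return edges

def ring_offsets(r):
    """Integer offsets whose distance from the origin is within 0.5 pixel of r."""
    return [(dx, dy)
            for dy in range(-r - 1, r + 2)
            for dx in range(-r - 1, r + 2)
            if abs(math.hypot(dx, dy) - r) <= 0.5]

def hough_circle(mask, r_min, r_max, min_fraction=0.4):
    """Return (cx, cy, r) of the strongest circle, or None if no circle is found."""
    edges = edge_pixels(mask)
    best = None  # (fraction of ring covered, cx, cy, r)
    for r in range(r_min, r_max + 1):
        ring = ring_offsets(r)
        acc = {}  # accumulator: candidate centre -> number of votes
        for x, y in edges:
            for dx, dy in ring:
                key = (x - dx, y - dy)
                acc[key] = acc.get(key, 0) + 1
        for (cx, cy), votes in acc.items():
            fraction = votes / len(ring)
            if best is None or fraction > best[0]:
                best = (fraction, cx, cy, r)
    if best is None or best[0] < min_fraction:
        return None
    return best[1:]
```

A return value of `None` corresponds to the case in which no circular object has appeared yet, which the procedure above interprets as the ball still being in the hand. Because isolated noise pixels spread their votes over many centers, they rarely reach the acceptance threshold, which is the noise resistance the text refers to.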
The approximate hand position can be found in the image after Type 2 processing. To find a more accurate position, we first used morphology to conduct opening and then closing to solve the noise problem. Afterwards, connected component analysis was used to find the upper edge of the hand. The methods used included calculating the eight-connected components and the size of each connected component. The result was used to eliminate images that do not include the hand position. Finally, thresholding of the image's upper edge was used as a basis for the hand's upper edge. The processed partial frame is shown in Figure 5.
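The hand-detection steps (opening, closing, eight-connected component labeling, and a size filter) can be sketched in standard-library Python as below. The 3 × 3 structuring element, the `min_area` parameter, and the choice of taking the topmost row of the largest component as the hand's upper edge are assumptions made for illustration; the paper does not state these details.

```python
from collections import deque

NEIGHBOURS_8 = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dx, dy) != (0, 0)]

def _window(mask, x, y):
    """Values of the in-bounds pixels in the 3x3 window centred on (x, y)."""
    h, w = len(mask), len(mask[0])
    return [mask[ny][nx]
            for ny in range(max(0, y - 1), min(h, y + 2))
            for nx in range(max(0, x - 1), min(w, x + 2))]

def erode(mask):
    return [[1 if all(_window(mask, x, y)) else 0 for x in range(len(mask[0]))]
            for y in range(len(mask))]

def dilate(mask):
    return [[1 if any(_window(mask, x, y)) else 0 for x in range(len(mask[0]))]
            for y in range(len(mask))]

def opening(mask):
    return dilate(erode(mask))   # removes small bright specks

def closing(mask):
    return erode(dilate(mask))   # fills small dark gaps

def components_8(mask):
    """Return the eight-connected components as lists of (x, y) pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, comp = deque([(x, y)]), []
            while queue:
                cx, cy = queue.popleft()
                comp.append((cx, cy))
                for dx, dy in NEIGHBOURS_8:
                    nx, ny = cx + dx, cy + dy
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            comps.append(comp)
    return comps

def hand_upper_edge(mask, min_area):
    """Row index of the top of the largest component, or None if no hand-sized blob exists."""
    cleaned = closing(opening(mask))
    big = [c for c in components_8(cleaned) if len(c) >= min_area]
    if not big:
        return None
    return min(y for _, y in max(big, key=len))
```

Frames for which `hand_upper_edge` returns `None` would be treated as not containing the hand, mirroring the elimination step described above.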

Conversion Relationship between the Image Pixels and Actual Size
In this study, we used the white end line of the table, which is 2 cm wide, as the size reference. The coordinates of its two edges in Table 1 are (X, Y) = (84, 1097) and (X, Y) = (124, 1097). Thus, the white end line width is 40 pixels. We can then calculate that, at the actual shooting site, each pixel represents 2 cm/40 pixels = 0.05 cm. Because the server is viewed from the side, the Type 1 algorithm is used to automatically determine the ball location. The track is recorded, as shown in Figure 6. Figure 7 shows that the pixel difference of the ball center position from when the ball leaves the hand to the apex of the toss is 113 - 75 = 38 pixels. Thus, the ball height is 38 pixels × 0.05 cm = 1.9 cm. Because this is below the required 16 cm, we can judge that this service ball toss violated the rule. In addition, the Type 2 algorithm can be used to automatically detect the hand's upper edge position, which can be used as a second verification for determining the position when the ball leaves the hand.
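The conversion above is simple arithmetic and can be written as a short sketch. The function names and the constant `MIN_TOSS_CM` are hypothetical; the input numbers (end-line edges at X = 84 and X = 124, a 2 cm reference width, and ball-center rows 113 and 75) are the ones given in the text.

```python
MIN_TOSS_CM = 16.0  # minimum rise required by the service rule quoted in the Foreword

def cm_per_pixel(reference_width_cm, x_left, x_right):
    """Scale factor from the known width of a reference object in the image."""
    return reference_width_cm / abs(x_right - x_left)

def toss_height_cm(y_release, y_apex, scale_cm_per_px):
    """Vertical rise of the ball center between release and apex, in cm."""
    return abs(y_release - y_apex) * scale_cm_per_px

scale = cm_per_pixel(2.0, 84, 124)        # 2 cm over 40 pixels
height = toss_height_cm(113, 75, scale)   # 38 pixels, about 1.9 cm
is_legal = height >= MIN_TOSS_CM          # False for this toss
```

This linear scaling assumes that the ball moves in roughly the same plane as the reference object, so that one pixel covers the same distance for both.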

Result and Discussion
The system framework proposed by Myint et al. [4] is as shown in Figure 8. The idea of searching for the ball location by left and right views is provided. First the possible binary area is searched by adaptive color thresholding and motion detection (ACTMD). Then the possible ball location is found by feature-based ball detection (FBD) and the ball location is confirmed by inter-view self-correction (IVSC). If this is in error, correction is made by the second-order motion model. The ROI and color processing methods used in this paper are similar to ACTMD, while the Hough transform determines the object structure. This part is more like FBD and IVSC. The ROI and color processing methods reduce the amount of information that needs to be processed to as little as possible. The binary image is used to do follow-up computation. The Hough transform searches for a circular shape and the anti-noise feature is excellent. Theoretically, in the same environment, the difference will not be too big. This is also one of the reasons for the choice of HSV color space.
The height of the service ball toss in table tennis must be at least 16 cm. However, without a clear image reference, accurately judging this with the naked eye is difficult. It only takes a few seconds to complete the table tennis service process (from serving preparation to service completion). Therefore, it is a challenge for an image auxiliary system to provide a judgment within a short time after a service is completed. The system must be able to rapidly identify the position where the ball leaves the hand, track the ball's movement, and then calculate the ball toss height within the shortest time. The advantage of high-speed camera shooting is that each key frame can be obtained. The more frames taken each second, the more clearly the athlete's action and the track of the ball can be re-rendered. Although a high-speed camera has this advantage, the short exposure time in high-speed photography requires strong lighting so that the light-sensitive components can capture enough light. This is why an additional strong light source is normally required. In this study, we shot videos under simple light source conditions and used image processing techniques to automatically track the ball and the skin of the hand. We constructed a series of image processing methods to solve the low light source problem so that the athlete is not affected. Unlike Wong, who used ANN technology to detect the circular table tennis ball [1], we implemented image segmentation in the color space to track the ball and the skin of the hand (results shown in Figures 2-5). The Hough transform can automatically and accurately find the ball position in a thresholded image with high noise. The blue track in Figure 6 clearly shows the ball track, and the track conforms to the ball position in each frame. Thus, searching for circular objects with the Hough transform can resist noise, and the result of this example is excellent. Once the ball position is determined, we can also determine the point at which the ball leaves the hand and calculate the apex height of the ball toss. The ball track in Figure 6 shows that the ball did not reach the regulated height, but the distance of the ball drop is very long. This can easily cause the umpire to misjudge and give the athlete a wrong impression of the toss. The primary objective of this study is to help the umpire determine whether a violation has occurred. Thus, the recorded video can be repeatedly replayed, and the processed image can be superimposed on the video and displayed. This can help the umpire make the final judgment. During image processing, the actual distance that corresponds to a pixel position can be obtained from the ratio of a target's actual size to its size in pixels in the image. No complex calculation is required for the conversion. In consideration of the amount of computation and the availability of only one camera, we did not set up a mechanism to determine the ball toss angle. We hope that future technological advances can solve the problems posed by large calculation requirements and high-cost hardware so that studies in this field can make further breakthroughs.
Figure 6. Automatic ball tracking and recording (recording starts the moment that the ball appears).

Conclusion
In this study, we used the characteristics of high-speed cameras to design an automatic image processing procedure, which can accurately record the key moment when the service ball leaves the hand during the toss and the apex of the ball toss. This can help umpires easily determine if a rule violation has occurred during the serve. The special feature of this study is using the actual size of a background object to calculate the actual distance that corresponds to the pixel size in the film. Thus, no complex calculation is required. In addition, image processing can be implemented in a low light source environment. Usually, with high-speed camera recording, the light source becomes an important issue. However, we only used basic lighting in this study and avoided interference from strong light sources. The algorithm was still able to automatically process the image within a short period of time.