Development of Automated Physical Defect Inspection Systems in Greige Tarpaulin Fabric Based on Machine Learning Algorithms

The textile industry in Indonesia faces several challenges, one of which is improving process efficiency. PT NTX is a weaving company that produces greige tarpaulin textile products. In this industry, greige cloth coming off the loom is generally inspected conventionally, using a cloth inspection machine and visual examination by the human eye. Traditional inspection methods cannot be applied to greige tarpaulin textile products because the woven thread construction is very prone to shifting or slipping when the fabric is pulled, touched, or rewound. Defect detection and quality control for greige tarpaulin are therefore carried out by the operator at the loom during weaving. This requires highly skilled inspection operators, and the consistency of inspection results depends heavily on the operator's condition. To overcome this problem, this research uses image processing techniques based on machine learning algorithms to examine the product directly on the machine. Two approaches are used: the Mean Pixel Value method combined with a Logistic Regression Model, and the Local Binary Pattern method combined with a Support Vector Machine. On greige tarpaulin images, the Mean Pixel Value method with the Logistic Regression Model achieved an accuracy of 61%, and the Local Binary Pattern method with the Support Vector Machine achieved an accuracy of 68%.


I. INTRODUCTION
The textile and textile product sector significantly impacts the national economy, accounting for 18.79% of the manufacturing sector's workforce. In recent years, the nation's textile sector has evolved into a major global exporter (Prihandono & Religi, 2019), prompting a transition to Industry 4.0 to retain its competitive edge. Industry 4.0 fosters the integration of traditional manufacturing with information technology, creating interlinked, efficient systems (Harja et al., 2019). At PT NTX, located in Karawang, West Java, the inspection of woven fabrics predominantly uses conventional methods. The assessment is visual, aided by specialized lighting, and relies on operators' keen eyesight to uphold set quality benchmarks. However, quality control presents unique challenges for greige tarpaulin textile. These textiles are sensitive to manual handling and, given their considerable size, conventional inspection is impractical for them.
Advancements in image processing technology offer innovative solutions, and researchers have explored various methods in this area. Early work emphasizes the significance of industrial vision units that employ basic image processing for fabric inspection. Anagnostopoulos et al. (2001) propose an algorithm that combines statistical measurements with thresholding and morphological operations to improve accuracy and speed. Deep learning, specifically Convolutional Neural Networks (CNN), has also been introduced; CNNs automatically discern and learn image features across varying scales. Jing et al. (2020) adapted the CNN model YOLOv3 and augmented the framework using the k-means algorithm. In the continuous effort to improve the precision and efficiency of fabric defect detection, new methods keep emerging, each with distinct advantages and challenges. More recent research by Li et al. (2014) centers on the Local Binary Pattern (LBP) for defect detection, where LBP serves primarily as a tool to extract feature values from fabric images.
This paper introduces an automatic textile defect detection system designed to identify prevalent faults in textile production, specifically the damaged edge defect. Using computer vision, the proposed system merges image processing and machine learning techniques, which have gained traction in textile research. Such technologies can effectively detect textile defects during active production (Iqbal et al., 2020). The innovation of this research lies in combining image processing with machine learning models, namely the Mean Pixel Value-Logistic Regression Model and the Local Binary Pattern-Support Vector Machine. These algorithms were first tested on an artificial fabric dataset and then adapted for defect detection on the Greige Tarpaulin fabric dataset. The primary goal of this research is to evaluate the accuracy of both algorithms. The results will then be used to design an automatic visual inspection system that uses machine learning algorithms and image processing techniques to classify defects more efficiently and precisely.

II. RESEARCH METHOD
The research methodology follows from the problem formulation and research objectives. The system design is separated into two parts. The first covers the methods and the overview of the system, depicted in Figure 1 and Figure 2; the second focuses on processing the images to detect defects, as shown in Figure 5. The proposed system can serve as a model for recognizing textile defects in the real world.
Figure 2 offers a more granular view of the system's workflow. The system starts with a continuous image-capturing phase, in which a camera records images of the target object or area for one hour. All images captured within this time frame are stored temporarily for preprocessing. At the end of each hour-long interval, the images are preprocessed and passed to a dedicated machine learning model for defect detection; for each image, the model returns a prediction indicating whether a defect is likely present. Images predicted to contain defects are moved to permanent storage for in-depth inspection or future reference. All processed images in the temporary store are then deleted to save storage and sustain operational efficiency. While the images from the preceding hour are being analyzed, the system continues capturing images for the next cycle, so operation is uninterrupted. By design, the system can be stopped manually by an operator or automatically under pre-defined conditions. The overarching objective is to use this system as an efficient tool for textile defect detection in real-world applications.
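The per-interval processing step described above might be sketched as follows. This is only an illustration under stated assumptions: the function name `process_batch`, the folder layout, the `.png` extension, and the `predict_is_defect` callback are hypothetical and do not come from the described system, and the parallel capture of the next hour's images is not modeled.

```python
import shutil
from pathlib import Path
from typing import Callable


def process_batch(temp_dir: Path, defect_dir: Path,
                  predict_is_defect: Callable[[Path], bool]) -> int:
    """Classify every image captured during the last interval.

    Images flagged as defective are moved to permanent storage; all other
    images are deleted, so the temporary folder is empty afterwards.
    Returns the number of images kept as defects.
    """
    defect_dir.mkdir(parents=True, exist_ok=True)
    kept = 0
    for image_path in sorted(temp_dir.glob("*.png")):  # assumed file type
        if predict_is_defect(image_path):
            # Keep the image for later inspection.
            shutil.move(str(image_path), str(defect_dir / image_path.name))
            kept += 1
        else:
            # Purge non-defect images to save storage.
            image_path.unlink()
    return kept
```

In a deployment, a scheduler would call such a function once per hour while a separate capture process keeps writing new images into a fresh temporary folder.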

Experiments
Our fabric defect detection research relies on two primary datasets, covering both artificial and real-world fabric scenarios. The first, the Artificial Fabric Dataset, originates from the work of Bergmann et al. (2019). It comprises fabric images in two categories: those with defects and those without. This dataset initially underpins the development of our primary binary classification algorithm. As the study advances, it becomes a reference standard against which real-world fabric data can be compared and validated.
The second is the Greige Tarpaulin Dataset, which was acquired experimentally. Like its artificial counterpart, it contains images of greige tarpaulin fabric, distinguished by the presence or absence of defects. The images were acquired with an imaging camera under a rigorous protocol, with care taken to ensure optimal resolution and accurate lighting conditions so that the fabric was captured in its genuine state. Beyond acquisition, data annotation was foundational. A systematic procedure was applied across both datasets, classifying each image as 'defect' or 'non-defect'; precision in this labeling mattered especially for the Greige Tarpaulin Dataset.
This binary approach gives our research clarity and robustness and keeps the outcomes of the algorithms unambiguous. Figure 3 shows images of the non-defect class from the artificial fabric dataset and the greige tarpaulin fabric dataset. Figure 4 shows images of the defect class from the same two datasets.

Data Preparation
Each of the two datasets, the Artificial Fabric Dataset derived from Bergmann et al. and the Greige Tarpaulin Dataset sourced from experimental acquisition, contains 868 images. As shown in Figure 5, a notable aspect of these datasets is their balanced composition: 50% of the images in each dataset depict defects, while the remaining half show fabrics without defects. Such an even distribution helps keep the machine learning models from being biased toward one class, so they can learn to distinguish the two classes effectively. The image labeling was binary: '0' for images with defects and '1' for those without. This categorical labeling allowed the machine learning algorithms, namely the Logistic Regression Model and the Support Vector Machine, to classify the images based on their inherent characteristics.
Prior to training, the datasets were divided. In line with standard practice, 80% of the images (approximately 694 images from each dataset) were allocated for training, while the remaining 20% (approximately 174 images from each dataset) were reserved for testing.
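As a rough illustration of the 80/20 division, the sketch below splits a balanced label list class by class so that both subsets stay balanced. The helper name `stratified_split`, the fixed seed, and the choice to stratify are assumptions made for this example; the text itself only states that random sampling was used.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_indices, test_indices), splitting each class separately."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = int(train_fraction * len(idx))  # floor of 80% within the class
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


# 868 images: half labelled 0 (defect) and half labelled 1 (non-defect).
labels = [0] * 434 + [1] * 434
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 694 174
```

With 434 images per class, 347 per class go to training and 87 per class to testing, which reproduces the approximate counts of 694 and 174 quoted above.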

Mean Pixel Value-Logistic Regression Model
This method combines image processing and machine learning. Central to the image processing component is the Mean Pixel Value method, which calculates the average pixel value of the entire image or of a defined region. In grayscale images, pixel values range from 0 (black) to 255 (white). For color images, each of the three channels (red, green, and blue) adheres to this range. The Mean Pixel Value is determined by summing all the pixel values and dividing by the total pixel count. It was observed that regions with defects often exhibited a Mean Pixel Value that deviated significantly from the rest of the image, indicating potential anomalies. The machine learning component is the Logistic Regression Model. A significant feature of this model is its ability to capture the relationship between independent variables and a binary dependent variable, offering a probabilistic perspective (Perreault et al., 2017).
Logistic regression models are advantageous when the dependent variable is categorical and the relationship between the independent variable and the outcome is not linear. In logistic regression, the dependent variable is usually binary, representing two possible outcomes: here, defect or non-defect. The model estimates the probability that the dependent variable falls into a specific category based on the values of the independent variables (Kost et al., 2019).
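The two ingredients can be sketched together as below. This is a minimal illustration rather than the implementation used in the study: the gradient-descent fitting routine, its learning rate and step count, and the synthetic bright and dark images are all assumptions introduced only to make the example runnable.

```python
import numpy as np


def mean_pixel_value(image: np.ndarray) -> float:
    """Sum of all pixel values divided by the total pixel count."""
    return float(image.sum() / image.size)


def sigmoid(z):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(x, y, lr=0.1, steps=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = sigmoid(w * x + b)
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return w, b


# Synthetic stand-in data (NOT the paper's images): 20 bright and 20 dark
# 8x8 grayscale patches, labelled 1 and 0 respectively.
rng = np.random.default_rng(0)
bright = [rng.integers(150, 256, size=(8, 8)) for _ in range(20)]
dark = [rng.integers(0, 106, size=(8, 8)) for _ in range(20)]
features = np.array([mean_pixel_value(im) for im in bright + dark])
targets = np.array([1] * 20 + [0] * 20)

x = (features - features.mean()) / features.std()  # standardise the feature
w, b = fit_logistic(x, targets)
predictions = (sigmoid(w * x + b) >= 0.5).astype(int)
print("training accuracy:", (predictions == targets).mean())
```

The synthetic classes are deliberately easy to separate; on real fabric images the mean pixel values of the two classes overlap far more, which is consistent with the modest accuracy reported for this method.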
Figure 6 shows that a sample image of the "non-defect" class has a lower average pixel value than a sample image of the "defect" class. Defects can cause sudden changes in pixel values; for example, scratches on a smooth surface can produce dark lines or sudden bright regions that contrast with the surrounding pixel values.
We used random sampling when splitting the data to guard our data preparation against potential biases. To make the model evaluation more robust, we also applied k-fold cross-validation on the training set. This iterative method exposes the model to various subsets of the training data, with the aim of optimizing its generalization performance.
Figure 7 shows the results of our cross-validation, including accuracy and variability across folds. The accuracy values for K-1, K-2, K-3, K-4, and K-5 are 0.57, 0.59, 0.60, 0.59, and 0.58, respectively, yielding a mean accuracy of 0.59. This method, widely used in machine learning studies, evaluates model performance across diverse training and validation combinations. Despite the inherent randomness in our sampling approach, we ensured a balanced representation across classes to maintain the reliability of our analysis, particularly in light of the challenges posed by imbalanced datasets.
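For illustration, the fold construction and the averaging of the reported accuracies can be sketched as follows. The helper `k_fold_indices` is hypothetical, and the sketch assumes k = 5 (implied by the five reported folds) and that the sample order has already been shuffled.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, validation) index lists for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


# Fold accuracies reported in the text for this model.
fold_accuracies = [0.57, 0.59, 0.60, 0.59, 0.58]
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print(round(mean_accuracy, 3))  # 0.586, which the text rounds to 0.59
```

Applied to the 694 training images, this scheme gives four validation folds of 139 images and one of 138.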
After completing cross-validation, which provided insight into our model's robustness, we evaluated its performance on the independent 20% test dataset. This phase is pivotal in assessing how well the model generalizes to unseen data and indicates its real-world applicability. To examine the model's classification performance in detail, we employed the Confusion Matrix. The resulting Confusion Matrix, shown in Figure 8, presents a detailed breakdown of the model's performance across the classes. The matrix shows 89 correctly classified cases of class 0 (true positives), underlining the model's ability to handle the specific characteristics of this category. The precision rate of 65% further reflects how well the algorithm identifies instances of class 0. The picture for class 1 is more complex. While 72 instances were accurately classified, contributing to an overall accuracy of 61%, the 47 misidentifications (instances of class 0 marked as class 1) point to potential overlaps or ambiguities in the feature space between the classes. Additionally, the 56 cases where genuine class 1 items were mistakenly classified as class 0 highlight the difficulty of differentiating the two classes.
The Confusion Matrix provides a snapshot of our model's performance after cross-validation, showing its strengths and the areas that need improvement (see Figure 8). This analysis sets the stage for a broader evaluation of the model's accuracy, precision, recall, and F1 score, which we detail in the subsequent sections of this paper.
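The metrics named above can be derived from the quoted counts along the following lines. The matrix layout (rows as true class, columns as predicted class) is an assumption about how the figure is organised, and the helper names are hypothetical.

```python
def class_metrics(matrix, cls):
    """Precision, recall and F1 for one class.

    matrix[t][p] counts samples whose true class is t and predicted class is p.
    """
    tp = matrix[cls][cls]
    fn = sum(matrix[cls]) - tp                  # true cls, predicted otherwise
    fp = sum(row[cls] for row in matrix) - tp   # predicted cls, true otherwise
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Counts quoted in the text, arranged as rows = true class, columns = predicted
# (assumed layout).
mpv_lr = [[89, 47],
          [56, 72]]

print(f"accuracy: {accuracy(mpv_lr):.2f}")  # 0.61
for cls in (0, 1):
    p, r, f1 = class_metrics(mpv_lr, cls)
    print(f"class {cls}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Under this layout, the overall accuracy is 161/264, about 0.61, and the ratio 89/(89+47), about 0.65, corresponds to the 65% figure quoted for class 0.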

Local Binary Pattern-Support Vector Machine
To strengthen our experiment, the study next applied the Local Binary Pattern (LBP) method for a focused texture analysis of the digital images. LBP's strength lies in its ability to capture detailed texture features by comparing each pixel with its surrounding pixels (Li et al., 2014). The LBP method offers flexibility through the parameters P and r, where P is the number of surrounding pixels and r, the radius, is the circular distance between these surrounding pixels and the central one. Having extracted features through LBP, the study moved on to machine learning, specifically the Support Vector Machine (SVM). Introduced by Cortes and Vapnik in 1995, the SVM is a powerful tool for classification and regression. Its core function is to find the optimal hyperplane in an expanded feature space, reached through linear or non-linear mappings. This hyperplane then acts as the classifier that separates the data into categories. The texture data obtained from LBP served as the input for the SVM classification (Cortes et al., 1995; Wang et al., 2017).
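The role of P and r can be illustrated with a basic LBP implementation. This is a simplified sketch, not the code used in the study: it assumes the plain (non-uniform) LBP variant, rounds neighbour positions to the nearest pixel instead of interpolating, and skips pixels closer than r to the border.

```python
import numpy as np


def lbp_codes(image: np.ndarray, P: int = 4, r: int = 1) -> np.ndarray:
    """Basic LBP code for every pixel at least r pixels from the border.

    Each of the P neighbours on a circle of radius r contributes one bit:
    1 if the neighbour is >= the centre pixel, else 0.
    """
    img = image.astype(np.int64)
    h, w = img.shape
    centre = img[r:h - r, r:w - r]
    codes = np.zeros_like(centre)
    for k in range(P):
        angle = 2.0 * np.pi * k / P
        dy = int(round(-r * np.sin(angle)))  # image rows grow downwards
        dx = int(round(r * np.cos(angle)))
        # Shifted view holding the k-th neighbour of every centre pixel.
        neighbour = img[r + dy:h - r + dy, r + dx:w - r + dx]
        codes += (neighbour >= centre).astype(np.int64) << k
    return codes
```

With P = 4 there are 2^4 = 16 possible codes per pixel; a call such as `lbp_codes(image, P=4, r=8)` would correspond to one of the parameter settings examined in this study.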
To put this procedure into practice, each image was first processed with the LBP method, which translates the image's texture into features that the SVM can consume. The LBP parameters P and r were tested to find the setting that gives the best texture detection. Three (P, r) combinations were evaluated: (4,4), (4,8), and (4,10). Visual representations of the LBP processing for 'non-defect' and 'defect' class images are given in Figures 9 and 10.
After the LBP process was applied, the resulting images were converted into histograms to represent their pixel intensity distributions. Figure 11 presents a histogram derived from a non-defect class image after LBP processing. The (P, r) parameters P=4 and r=8 are used for the further experiments.
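The conversion of an LBP image into a histogram feature vector can be sketched as below. The function name, the normalisation to relative frequencies, and the small example code map are assumptions for illustration only.

```python
import numpy as np


def lbp_histogram(codes: np.ndarray, P: int = 4) -> np.ndarray:
    """Normalised histogram of LBP codes with one bin per possible code (2**P)."""
    counts = np.bincount(codes.ravel(), minlength=2 ** P)
    return counts / counts.sum()


# Hypothetical code map; real input would come from an LBP step with P = 4.
codes = np.array([[0, 15, 15],
                  [3, 15, 0]])
hist = lbp_histogram(codes, P=4)
print(hist.shape)  # (16,)
print(hist[15])    # 0.5
```

Each image is thus reduced to a fixed-length vector (16 values for P = 4) that a classifier such as an SVM can take as input, regardless of the image size.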
These individual histograms highlight the pixel intensity distributions of each class. Figure 13 shows our cross-validation results, including accuracy and variability across folds. The accuracy values for K-1, K-2, K-3, K-4, and K-5 are 0.80, 0.71, 0.78, 0.82, and 0.76, respectively, yielding a mean accuracy of 0.77.
After finalizing our model and incorporating insights from cross-validation, we evaluated its performance on the 20% test dataset. This phase is crucial for understanding the model's effectiveness on unseen data, offering insights into its real-world applicability.
To examine the model's performance in detail, we used the Confusion Matrix shown in Figure 14, an essential tool in classification tasks. This matrix provides a clear view of true positives, false positives, and false negatives, which helps identify the model's strengths and the areas that require improvement.
The confusion matrix in Figure 14 shows a marked improvement in the true positive rate for class 0: 110 instances were accurately identified, highlighting the model's improved ability to recognize patterns specific to this class. However, challenges persist. Although lower than for the previous model, the 29 misclassifications of class 0 as class 1 remain an area needing attention. For class 1, the model made 80 accurate detections. Nevertheless, the 59 false negatives indicate a continuing difficulty in recognizing this class consistently. Overall, this model performs clearly better than its predecessor, yet the remaining misclassifications suggest room for iterative refinement. In quantitative terms, the model achieved an accuracy of 68%. Together with the confusion matrix, this metric gives a fuller picture of the model's capabilities, highlighting its strengths and pinpointing potential areas for enhancement.
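The comparison between the two models can be made concrete by recomputing accuracy and per-class recall from the counts quoted in the text. The arrangement of the counts (rows as true class, columns as predicted class) is an assumption about the figures, and the dictionary keys are labels chosen for this example.

```python
# Counts quoted in the text; rows = true class, columns = predicted class
# (assumed layout).
results = {
    "MPV + logistic regression": [[89, 47], [56, 72]],
    "LBP + SVM": [[110, 29], [59, 80]],
}

summary = {}
for name, m in results.items():
    total = sum(sum(row) for row in m)
    summary[name] = {
        "accuracy": (m[0][0] + m[1][1]) / total,
        "recall_class_0": m[0][0] / sum(m[0]),
        "recall_class_1": m[1][1] / sum(m[1]),
    }
    print(name, {k: round(v, 2) for k, v in summary[name].items()})
```

Under this reading, most of the gain of the LBP-based model comes from class 0 (110 of 139 recognized, against 89 of 136), while the share of class 1 images recognized changes only slightly.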

IV. CONCLUSION
In this study, we explored the integration of image processing techniques with machine learning algorithms to achieve efficient classification. Two primary methodologies were assessed: one leveraged the Mean Pixel Value method with a Logistic Regression Model, and the other the Local Binary Pattern method with a Support Vector Machine. After completing cross-validation, which provided insight into each model's robustness, we evaluated performance on the independent 20% test dataset. This phase is pivotal in assessing how well a model generalizes to unseen data and indicates its real-world applicability.
The cross-validation results gave a mean accuracy of 0.59 for the Mean Pixel Value-Logistic Regression Model and 0.77 for the Local Binary Pattern-Support Vector Machine, reflecting each model's overall performance across the folds. This metric serves as a measure of stability and generalization capability and provides a foundation for further evaluation.
To better understand the models' classification abilities, we used a Confusion Matrix. This matrix provided a detailed breakdown of performance across the classes, highlighting both strengths and areas for improvement.
These results give a clear picture of the models' capabilities, highlighting their strengths and pinpointing potential areas for enhancement. Future research could focus on the underlying reasons for misclassifications, on refining feature engineering techniques, or on exploring alternative model architectures to further improve classification accuracy.
In conclusion, the fusion of image processing with machine learning algorithms holds promising potential for robust classification tasks.By leveraging insights from experimental results and embracing ongoing advancements in AI and machine learning, we can develop more accurate and effective models in similar applications.

Figure 2. Overview of the system

Figure 5. Dataset notation for training and testing: (0) for defect class, (1) for non-defect class

Figure 7. Cross validation of Greige Tarpaulin dataset

Figure 12 offers a comparative histogram juxtaposing the results from the 'good' and 'defect' class images to accentuate the differences further. This comparative visualization underscores distinct patterns and variations, providing crucial insights for the following machine-learning phase. After extracting numerical information and Local Binary Pattern (LBP) image histograms from each class, these

Figure 14. Confusion matrix of Greige Tarpaulin with Support Vector Machine algorithm