Adjacent Image Augmentation and Its Framework for Self-Supervised Learning in Anomaly Detection

Anomaly detection has gained significant attention with the advancement of deep neural networks. Effective training requires both normal and anomalous data, but this often leads to class imbalance, as anomalous data are scarce. Traditional augmentation methods struggle to maintain the correlation between anomalous patterns and their surroundings. To address this, we propose an adjacent augmentation technique that generates synthetic anomaly images, preserving object shapes while distorting contours to enhance this correlation. Experimental results show that adjacent augmentation captures high-quality anomaly features, achieving superior AU-ROC and AU-PR scores compared to existing methods. Additionally, our technique produces synthetic normal images, which aid in learning detailed features of normal data and reduce sensitivity to minor variations. Our framework treats all training images within a batch as positive pairs with one another, pairs them with synthetic normal images as additional positive pairs, and pairs them with synthetic anomaly images as negative pairs. This compensates for the lack of anomalous features and effectively distinguishes normal from anomalous features, mitigating class imbalance. Using the ResNet50 network, our model achieved perfect AU-ROC and AU-PR scores of 100% in the bottle category of the MVTec-AD dataset. We also investigate the relationship between anomalous pattern size and detection performance.


Introduction
Anomaly detection is a critical task that involves identifying data patterns that deviate significantly from the norm [1][2][3]. This process is essential across various domains such as manufacturing quality inspection [4], medical diagnostics [5], cybersecurity [6], financial monitoring [7], CCTV surveillance [8], and autonomous driving [9]. Typical anomaly detection methods leverage deep learning to map normal data features into a latent space, thereby creating a distribution of normal data. Anomalies are detected by comparing the features of input data against this distribution. Despite its importance, anomaly detection faces significant challenges, primarily due to the class imbalance between normal and anomalous data. Anomalies are rare compared to the vast amount of normal data, making it difficult for models to learn to detect them effectively. Recent research has focused on self-supervised learning techniques, which can help mitigate class imbalance by generating synthetic anomaly data [10][11][12].
One notable approach in self-supervised learning is the use of various augmentation techniques. For instance, CutPaste augmentation involves cutting a rectangular patch from a training image and randomly pasting it back into the original image [13]. This technique introduces anomalies by disrupting normal patterns. Another method, SmoothBlend augmentation, entails cutting a small round patch, applying color jitter, and reinserting it into the image [14]. This method aids in detecting small defects by creating more challenging patterns for the model to learn.
In recent years, contrastive learning frameworks such as SimCLR and SimSiam have gained significant traction [15][16][17][18]. These methods generate two samples by applying different augmentations to the same training image. The SimCLR method designates the two generated images as positive pairs to each other and as negative pairs to images generated from other training data within the same batch. In contrast, the SimSiam method passes the two augmented images through an encoder to create vectors, with only one vector passing through a projection. The vector that passes through both the encoder and projection is then designated as a positive pair with the vector that passes only through the encoder. However, these methods do not designate the training data within the same batch as positive pairs to each other, which limits their ability to effectively learn the nuances of normal images.
To enhance anomaly detection, we propose an adjacent augmentation technique that generates synthetic anomaly images by preserving object shapes and distorting the contours of specified regions. There are three adjacent augmentation methods for generating synthetic anomaly images: Mosaic, Liquify, and Mosiquify. The Mosaic method reduces the resolution of a selected area and applies color jitter, producing defects that appear more natural. The Liquify method distorts contours to mimic real-world defects such as scratches and sagging. The Mosiquify method combines the Mosaic and Liquify augmentations to generate even more realistic anomalies. Additionally, we introduce the Strong Overall and Weak Overall methods for augmenting synthetic normal images. By applying these synthetic images and an anomaly detection benchmark dataset [14,19,20] to our framework, we establish positive pairs between the training images within each batch, and between the training images and synthetic normal images, and negative pairs between the training images and synthetic anomaly images [15,21]. This approach not only helps mitigate class imbalance but also improves the model's ability to differentiate between normal and anomalous data. Table 1 demonstrates that our proposed augmentation method does not show a significant difference in speed compared to previous augmentation techniques. Our main contributions can be summarized as follows:
• We propose novel augmentation techniques and a framework for self-supervised learning aimed at addressing class imbalance in anomaly detection.
• Our adjacent augmentations generate synthetic anomalies with realistic contour distortions, enhancing the model's learning process.
• We develop a contrastive learning framework that leverages characteristics from anomaly detection benchmark datasets, improving the overall effectiveness of anomaly detection models.

Related Work

2.1. MVTec-AD Dataset
The MVTec-AD dataset is a benchmark for anomaly detection, specifically designed for the precise inspection of defects in industrial manufacturing [19]. This dataset includes five texture categories and ten object categories, addressing limitations in the scope of previous anomaly detection datasets. The training set comprises 3629 normal images, while the test set contains 1725 normal images and a mix of anomaly images. Despite the increased dataset size, the issue of class imbalance persists, with significantly fewer anomaly images. All images are captured using high-resolution RGB sensors, and the anomaly images accurately reflect real-world defects. In the texture categories, images exhibit repeating patterns, whereas object category images are captured in specific locations. Our adjacent framework leverages the fact that all training data in this dataset consist of normal images. Figure 1 shows samples from the MVTec-AD dataset. Table 2 provides a description of the MVTec-AD dataset.

2.2. Representative Anomaly Detection
Semi-supervised learning techniques for one-class anomaly detection leverage the feature distribution of normal data to identify anomalies. During training, the model encodes normal data features into a latent space, establishing a distribution that represents normalcy. At inference, the model classifies input data as normal if its features fall within the decision boundary of the normal data distribution. Conversely, if the input data features lie outside this boundary, the data are classified as anomalous [22,23]. Figure 2 illustrates the process of this method.

Autoencoder-based methods perform anomaly detection by reconstructing compressed input data as normal data. During the training phase, the model learns by repeatedly compressing and reconstructing normal data. In the inference phase, the model calculates the reconstruction error between the input data and the reconstructed data. Since the autoencoder reconstructs normal data well, the error is low, and the model classifies such data as normal. Conversely, the autoencoder does not reconstruct anomaly data well, resulting in a high error, and the model classifies it as anomalous [24]. Figure 3 illustrates the process of this method.

Finally, feature matching methods detect anomalies by comparing the features of normal data with those of input data. Normal images are divided into small patches, with key features stored in memory. The model calculates the similarity between the input image features and the stored normal features. If the input image features significantly deviate from the stored normal features, the image is classified as anomalous [25][26][27]. Figure 4 illustrates the process of this method.
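The reconstruction-error recipe above can be illustrated with a tiny stand-in: PCA acting as a linear encoder/decoder, scoring inputs by how poorly they reconstruct. The data, dimensions, and scoring are illustrative only, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Normal" training data lies near a 2-D subspace of a 3-D feature space.
basis = np.array([[1.0, 0.5, 0.0],
                  [0.0, 0.3, 1.0]])
normal = rng.normal(size=(200, 2)) @ basis

# Fit a linear "encoder/decoder": the top-2 principal components.
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
components = vt[:2]

def anomaly_score(x: np.ndarray) -> float:
    """Reconstruction error: small for normal-like inputs, large for anomalies."""
    latent = (x - mean) @ components.T   # compress into the latent space
    recon = latent @ components + mean   # reconstruct back to input space
    return float(np.linalg.norm(x - recon))

low = anomaly_score(normal[0])                       # in-distribution sample
high = anomaly_score(np.array([10.0, -10.0, 10.0]))  # off-subspace anomaly
```

A deep autoencoder plays the same role nonlinearly: inputs far from the learned normal manifold reconstruct badly and receive high scores.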

2.3. Class Imbalance
The anomaly detection methods discussed in Section 2.2 typically include only normal data for training due to the class imbalance problem [14,28]. This imbalance arises when normal data significantly outnumber anomaly data. In a latent space with class imbalance, the feature distribution of normal data dominates, biasing input data towards being classified as normal. This bias can negatively impact anomaly detection performance, necessitating strategies to mitigate class imbalance. Figure 5 visualizes the class imbalance.

2.4. SimCLR
Our adjacent framework draws inspiration from the SimCLR framework [15], a contrastive learning method that embeds images into a latent space where positive pairs are closer together and negative pairs are farther apart [29]. SimCLR effectively extracts visual representations through unsupervised learning by generating two differently augmented versions of each training image and treating them as positive pairs, while all other images in the batch are treated as negative pairs [15]. We reference characteristics from benchmark training datasets [14,19,20] to slightly modify the concepts of the SimCLR framework. Our adjacent framework generates two synthetic normal images and one synthetic anomaly image for each training image.
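SimCLR's pairing scheme can be sketched with a minimal numpy version of its NT-Xent loss; the batch layout and temperature below are assumptions for illustration.

```python
import numpy as np

def nt_xent(z: np.ndarray, tau: float = 0.5) -> float:
    """NT-Xent over 2N L2-normalized embeddings; rows i and i+N are the
    two augmented views (positive pair) of image i."""
    n2 = z.shape[0]
    n = n2 // 2
    sim = z @ z.T / tau                    # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)         # a view is never its own pair
    # The positive of view i is the other view of the same image;
    # every remaining view in the batch acts as a negative.
    pos = np.concatenate([np.arange(n, n2), np.arange(0, n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return float((logsumexp - sim[np.arange(n2), pos]).mean())

# Two images, two views each: aligned positive pairs give a lower loss
# than pairs whose views point in different directions.
aligned = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
shuffled = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
```

Note that every other batch element is a negative here, which is precisely the behavior our framework modifies.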


Methods
This chapter outlines the methods used to generate synthetic data through adjacent augmentations. Our approach involves augmenting training images to create synthetic normal and synthetic anomaly images. We employ the Strong Overall and Weak Overall methods for generating synthetic normal images and the Mosaic, Liquify, and Mosiquify methods for generating synthetic anomaly images. These synthetic anomaly images closely resemble real defects, helping to address class imbalance. Additionally, we discuss previous augmentation methods, such as CutPaste [13] and SmoothBlend [14], and how adjacent augmentation synthesizes anomalous patterns. The final section details our adjacent framework, which integrates synthetic images for enhanced anomaly detection.

Augmentation
This section describes the augmentation techniques used in our framework and those from previous work. The first two methods generate positive samples, while the remaining methods generate negative samples.
Our augmentation methods are chosen for their effectiveness in generating synthetic anomaly images that closely resemble real-world defects. These methods introduce realistic variations that challenge the model's ability to distinguish between normal and anomalous data, which is crucial for enhancing anomaly detection performance. Additionally, these techniques allow us to simulate a wide range of defects, addressing the class imbalance issue by providing diverse and realistic anomaly samples. They are particularly effective because they exploit the strong correlation between anomalous patterns and surrounding pixels. We believe that these methods strengthen our framework by improving the model's robustness and generalization capabilities.

Weak Overall
In industrial manufacturing, images are captured individually under varying conditions of lighting, angle, and position, resulting in slight differences. To reduce sensitivity to these minor variations, we use the Weak Overall augmentation from the Spot-the-Difference method. This augmentation helps the model better classify normal images despite these small variations [14]. Figure 7 shows a Weak Overall sample.
Algorithm to generate Weak Overall samples:
1. Crop the anchor to between 90% and 100% of its size, then resize it back to the anchor size.
2. Adjust the brightness, contrast, saturation, and hue of the anchor by random values between 0% and 10%.
3. Apply a Gaussian blur with a 5 × 5 kernel and a sigma value between 0.1 and 0.3.
4. Apply a horizontal flip with random probability.
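The steps above can be sketched in numpy. This is a simplified stand-in, not the paper's implementation: nearest-neighbour resampling replaces proper resizing, only brightness and contrast jitter are shown (hue and saturation are omitted), and a 3 × 3 box filter stands in for the 5 × 5 Gaussian blur.

```python
import numpy as np

rng = np.random.default_rng(0)

def weak_overall(img: np.ndarray) -> np.ndarray:
    """Simplified Weak Overall sketch on an HxWx3 uint8 image."""
    h, w, _ = img.shape
    # Step 1: crop to 90-100% of the original size, then resize back
    # (nearest-neighbour resampling keeps the sketch dependency-free).
    scale = rng.uniform(0.9, 1.0)
    ch, cw = int(h * scale), int(w * scale)
    y0 = rng.integers(0, h - ch + 1)
    x0 = rng.integers(0, w - cw + 1)
    crop = img[y0:y0 + ch, x0:x0 + cw]
    ys = np.arange(h) * ch // h
    xs = np.arange(w) * cw // w
    out = crop[ys][:, xs].astype(np.float32)
    # Step 2: mild jitter (brightness and contrast only here).
    out = out * rng.uniform(0.9, 1.1)                              # brightness
    out = (out - out.mean()) * rng.uniform(0.9, 1.1) + out.mean()  # contrast
    # Step 3: light blur via a 3x3 box filter (stand-in for the Gaussian).
    pad = np.pad(out, ((1, 1), (1, 1), (0, 0)), mode="edge")
    out = sum(pad[dy:dy + h, dx:dx + w]
              for dy in range(3) for dx in range(3)) / 9.0
    # Step 4: horizontal flip with probability 0.5.
    if rng.random() < 0.5:
        out = out[:, ::-1]
    return np.clip(out, 0, 255).astype(np.uint8)

sample = weak_overall(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
```

Because every perturbation is mild, the output remains a plausible normal image and is safe to treat as a positive pair.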

Strong Overall
Normal images in industrial manufacturing typically have consistent shapes and contours. To detect small anomalies, the model must analyze detailed features of these images. The Strong Overall augmentation focuses on learning the intricate details of normal images, aiding in the detection of subtle anomalies. Figure 8 shows a Strong Overall sample.
Algorithm to generate Strong Overall samples:
1. Crop the anchor to a random size, then resize it back to the anchor size.
2. Apply horizontal flipping with random probability.
3. Adjust the brightness, contrast, and saturation of the anchor by random values between 0% and 80%, and the hue by random values between 0% and 20%.
4. Convert the image to grayscale with a 20% probability.
5. Apply a Gaussian blur using a kernel sized at 10% of the anchor.
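Relative to Weak Overall, only the jitter strengths and the grayscale step change. A minimal numpy sketch of those stronger steps follows; the crop, flip, and blur mirror Weak Overall and are omitted, and hue/saturation jitter are likewise simplified away, so this is an approximation rather than the exact recipe.

```python
import numpy as np

rng = np.random.default_rng(1)

def strong_overall_jitter(img: np.ndarray) -> np.ndarray:
    """Strong jitter + random grayscale from the Strong Overall recipe."""
    out = img.astype(np.float32)
    # Step 3: brightness and contrast varied by up to +/-80%.
    out = out * rng.uniform(0.2, 1.8)
    out = (out - out.mean()) * rng.uniform(0.2, 1.8) + out.mean()
    # Step 4: convert to grayscale with 20% probability.
    if rng.random() < 0.2:
        out = np.repeat(out.mean(axis=2, keepdims=True), 3, axis=2)
    return np.clip(out, 0, 255).astype(np.uint8)

sample = strong_overall_jitter(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
```

The much wider jitter range forces the model to rely on structural detail rather than global color statistics when matching positives.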

CutPaste
CutPaste augmentation involves cutting a square patch from a training image and pasting it back onto the original image [13]. This augmentation distorts the continuous pattern, teaching the model to recognize such disruptions as anomalies. The CutPaste method is effective in highlighting discontinuous patterns indicative of anomalies. Figure 9 shows a CutPaste sample.
Algorithm to generate CutPaste samples:
1. Apply the Weak Overall augmentation.
2. Set the size ratio of the patch to between 2% and 15% and the aspect ratio to between 0.3 and 3.
3. Cut the square patch from the anchor at the specified size.
4. Paste the patch into a random location in the original image.
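Steps 2-4 amount to a rectangle cut-and-paste, sketched here in numpy (the patch-size arithmetic is one reasonable reading of the area and aspect-ratio parameters):

```python
import numpy as np

rng = np.random.default_rng(2)

def cutpaste(img: np.ndarray) -> np.ndarray:
    """Cut a rectangular patch (2-15% of the image area, aspect ratio
    0.3-3) and paste it at a random location."""
    h, w, _ = img.shape
    area = h * w * rng.uniform(0.02, 0.15)
    aspect = rng.uniform(0.3, 3.0)
    ph = min(int(round(np.sqrt(area * aspect))), h)
    pw = min(int(round(np.sqrt(area / aspect))), w)
    # Cut the patch from a random source position...
    sy, sx = rng.integers(0, h - ph + 1), rng.integers(0, w - pw + 1)
    patch = img[sy:sy + ph, sx:sx + pw].copy()
    # ...and paste it at a random destination, breaking the continuous pattern.
    dy, dx = rng.integers(0, h - ph + 1), rng.integers(0, w - pw + 1)
    out = img.copy()
    out[dy:dy + ph, dx:dx + pw] = patch
    return out

sample = cutpaste(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
```

The hard rectangular seam is what distinguishes CutPaste samples from the softer distortions introduced later.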

SmoothBlend
SmoothBlend augmentation cuts a small, round patch from a training image and pastes it onto the original image, distorting its continuous pattern. This augmentation helps the model learn to identify small defects by focusing on these local distortions [14]. Figure 10 shows a SmoothBlend sample.
Algorithm to generate SmoothBlend samples:
1. Apply the Weak Overall augmentation.
2. Set the size ratio of the patch to between 0.5% and 1% and the aspect ratio to between 0.3 and 3.
3. Cut the round patch from the anchor at the specified size.
4. Apply random contrast of up to 100%, random saturation of up to 100%, and random color jitter of up to 50% to the patch.
5. Alpha blend the original image and its patch.
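A numpy sketch of the round cut, jitter, and alpha blend follows. The circle radius is derived from the area ratio, the aspect ratio is fixed at 1 for simplicity, a channel-wise jitter stands in for the listed color operations, and the blend weight `alpha` is an assumed value.

```python
import numpy as np

rng = np.random.default_rng(3)

def smoothblend(img: np.ndarray, alpha: float = 0.7) -> np.ndarray:
    """Cut a small round patch (0.5-1% of the image area), jitter its
    color, and alpha-blend it back at a random location."""
    h, w, _ = img.shape
    # Radius such that the circle covers 0.5-1% of the image area.
    r = int(round(np.sqrt(h * w * rng.uniform(0.005, 0.01) / np.pi)))
    sy, sx = rng.integers(r, h - r), rng.integers(r, w - r)   # source center
    dy, dx = rng.integers(r, h - r), rng.integers(r, w - r)   # destination
    yy, xx = np.ogrid[-r:r + 1, -r:r + 1]
    mask = (yy ** 2 + xx ** 2 <= r ** 2)[..., None]           # round mask
    patch = img[sy - r:sy + r + 1, sx - r:sx + r + 1].astype(np.float32)
    # Step 4 stand-in: simple per-channel color jitter.
    patch = patch * rng.uniform(0.5, 1.5, size=(1, 1, 3))
    out = img.astype(np.float32)
    region = out[dy - r:dy + r + 1, dx - r:dx + r + 1]
    # Step 5: alpha-blend the jittered patch inside the round mask only.
    out[dy - r:dy + r + 1, dx - r:dx + r + 1] = np.where(
        mask, alpha * patch + (1 - alpha) * region, region)
    return np.clip(out, 0, 255).astype(np.uint8)

sample = smoothblend(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
```

The alpha blend softens the patch boundary, so the defect correlates with its surroundings more than a hard CutPaste seam does.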

Mosaic
Drawing inspiration from the SmoothBlend technique, Mosaic augmentation modifies the resolution and color within a specified circular region, rather than employing a cut-and-paste approach. This augmentation introduces subtle, natural-looking anomalies that challenge the model by distorting patterns in a manner highly pertinent to the surrounding pixels. Figure 11 shows a Mosaic sample.
Figure 10. An image with SmoothBlend augmentation. SmoothBlend involves cutting a small, round patch from the anchor image and pasting it onto the original image. These samples, which distort local detailed patterns in normal images, encourage the learning of detailed features found in anomaly data.

Algorithm to generate Mosaic samples:
1. Apply the Weak Overall augmentation.
2. Set the size ratio of the round area to be converted to between 0.5% and 1% and the aspect ratio to 1.
3. Reduce the specified area by a rate of ζ and restore it to its original size.
4. Apply random brightness of up to 50%, random contrast of up to 50%, random saturation of up to 50%, and random color jitter of up to 20%.
5. Alpha blend the original image and the converted area.
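The downscale-restore-blend pipeline can be sketched as follows. Here `zeta` and `alpha` are assumed values, nearest-neighbour resampling is used in both directions, and a simple channel-wise jitter stands in for the listed color operations.

```python
import numpy as np

rng = np.random.default_rng(4)

def mosaic(img: np.ndarray, zeta: float = 0.25, alpha: float = 0.7) -> np.ndarray:
    """Pixelate a round region by downscaling to `zeta` of its size and
    restoring it, then jitter and alpha-blend it back."""
    h, w, _ = img.shape
    r = int(round(np.sqrt(h * w * rng.uniform(0.005, 0.01) / np.pi)))
    cy, cx = rng.integers(r, h - r), rng.integers(r, w - r)
    region = img[cy - r:cy + r + 1, cx - r:cx + r + 1].astype(np.float32)
    side = 2 * r + 1
    # Step 3: reduce the region to rate zeta, then restore it, which
    # lowers its effective resolution.
    small = max(1, int(side * zeta))
    idx_down = np.arange(small) * side // small
    low = region[idx_down][:, idx_down]
    idx_up = np.arange(side) * small // side
    pixelated = low[idx_up][:, idx_up]
    # Step 4 stand-in: simple per-channel color jitter.
    pixelated = pixelated * rng.uniform(0.5, 1.5, size=(1, 1, 3))
    # Step 5: alpha-blend inside the round mask only.
    yy, xx = np.ogrid[-r:r + 1, -r:r + 1]
    mask = (yy ** 2 + xx ** 2 <= r ** 2)[..., None]
    out = img.astype(np.float32)
    out[cy - r:cy + r + 1, cx - r:cx + r + 1] = np.where(
        mask, alpha * pixelated + (1 - alpha) * region, region)
    return np.clip(out, 0, 255).astype(np.uint8)

sample = mosaic(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
```

Because the pixelated content comes from the region itself, the defect stays strongly correlated with the surrounding pixels.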

Liquify

Liquify augmentation distorts image contours by displacing random points, thereby generating patterns reminiscent of liquid flow. This technique aids the model in learning to classify distorted contour patterns, effectively simulating natural defects such as scratches and sagging. Figure 12 shows a Liquify sample.
Algorithm to generate Liquify samples:
1. Apply the Weak Overall augmentation.
2. Assign a random point in the image.
3. Specify the coordinates of the four triangles centered around the designated point.
4. Move the specified point to a random location at a distance of the image size × (1/η)%.
5. As the point moves, the four triangles move with it, creating contour distortion.
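The four-triangle mesh warp is involved to reproduce exactly; the sketch below approximates it by pulling pixels near a random point toward the point's displaced location with a smooth falloff. The falloff radius is an assumption, and the displacement is scaled from the image size and η in the spirit of step 4 rather than literally.

```python
import numpy as np

rng = np.random.default_rng(5)

def liquify(img: np.ndarray, eta: float = 10.0) -> np.ndarray:
    """Approximate Liquify: displace pixels around one random point."""
    h, w, _ = img.shape
    # Step 2: pick a random point.
    py, px = rng.integers(0, h), rng.integers(0, w)
    # Step 4 stand-in: displacement scaled by image size and 1/eta.
    d = min(h, w) / eta
    dy, dx = rng.uniform(-d, d, size=2)
    # Smooth falloff replaces the moving triangles of steps 3 and 5:
    # pixels at the point move fully, distant pixels barely move.
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float32)
    radius = min(h, w) / 4.0                      # falloff radius (assumption)
    weight = np.exp(-((yy - py) ** 2 + (xx - px) ** 2) / (2 * radius ** 2))
    # Inverse warp: each output pixel samples a displaced source location.
    src_y = np.clip(yy - dy * weight, 0, h - 1).astype(int)
    src_x = np.clip(xx - dx * weight, 0, w - 1).astype(int)
    return img[src_y, src_x]

sample = liquify(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))
```

Any contour crossing the displaced neighborhood bends smoothly, mimicking scratch- and sag-like defects.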

Mosiquify

Mosiquify augmentation combines the effects of the Liquify and Mosaic augmentations, distorting contour, resolution, and color. This technique introduces complex and varied anomalies, facilitating the model's ability to recognize a wide range of anomalous features. Figure 13 shows a Mosiquify sample.

Adjacent Framework
The adjacent framework encompasses both image augmentation and the learning process. Traditional anomaly detection contrastive learning frameworks typically employ straightforward augmentations and a single loss function. In contrast, our framework introduces novel augmentations and utilizes two distinct loss functions to enhance the learning of features from both normal and anomalous data. Furthermore, unlike previous frameworks that focus solely on augmenting anomalous images, our framework applies augmentation to both normal and anomalous images. Finally, our contrastive learning framework maximizes the embedding distance between normal and anomalous data by leveraging NCE loss and cosine similarity loss. Figure 14 shows the augmentations used in the adjacent framework. The detailed process of the adjacent framework is shown in Figure 15.

Our contrastive learning framework focuses on learning effective representations by embedding similar (positive) samples closer together in a latent space, while pushing dissimilar (negative) samples farther apart. Specifically, we augment each anchor image with both positive samples (such as Weak Overall and Strong Overall augmentations) and negative samples (synthetic anomaly images). We utilize losses such as the NCE loss and cosine similarity loss to ensure that positive pairs are closely aligned and negative pairs are distinct within the feature space. This approach not only improves the model's ability to distinguish between normal and anomalous data but also addresses class imbalance by leveraging synthetic anomaly images.
Our adjacent framework leverages synthetic images in conjunction with the anomaly detection benchmark training dataset. This framework employs a self-supervised learning method known as contrastive learning. In this approach, artificial labels are generated from the data to train the model, enabling the effective utilization of unlabeled data. The framework enhances similarity between positive pairs while reducing similarity between negative pairs. The contrastive learning loss function aims to maximize the similarity of positive sample pairs by minimizing their distance. Conversely, for negative sample pairs, the objective is to maximize the distance, thereby minimizing their similarity. By optimizing this loss function, the model learns meaningful representations, effectively distinguishing between similar and dissimilar data points [15].
In our framework, training images paired with synthetic normal images are designated as positive pairs, while training images paired with synthetic anomaly images are designated as negative pairs. Furthermore, all training images within the batch are considered positive pairs. We employ InfoNCE loss and cosine similarity loss to train the model. The InfoNCE loss function encourages the model to draw the anchor and positive pair representations closer together while pushing the anchor and negative pair representations further apart [14,21]. The cosine similarity loss function maximizes the similarity between positive pairs and minimizes the similarity between negative pairs. For a negative pair, the loss is

$\mathcal{L}_{\mathrm{Negative}}(x_i, x_j) = -\log \exp\big((m - \mathrm{Similarity}(x_i, x_j))/\tau\big),$

where $m$ is the minimum distance (margin) enforced between negative pairs and $\tau$ is the temperature. Generating synthetic normal data assists the model in learning detailed features. Anchors and Strong Overall samples are set as positive pairs, with InfoNCE loss bringing them closer together, thereby reducing sensitivity to environmental changes. Anchors and Weak Overall samples are also set as positive pairs, while synthetic anomaly data are designated as negative pairs, with cosine similarity loss managing these relationships [14,30].
Each training image serves as an anchor, generating Strong Overall, Weak Overall, and negative samples. The framework ensures that anchors, training data, and Strong Overall samples are closely embedded, while anchors and synthetic anomaly images are kept distinct. In summary, our adjacent augmentations and framework generate synthetic images and utilize them for contrastive learning. The training image passes through the encoder to become a representation, which then undergoes projection and normalization. This process enables the model to effectively learn the differences between normal and anomalous images, addressing class imbalance and enhancing anomaly detection performance.
Algorithm 1 summarizes the proposed method. Algorithm 1 augments one anchor with three samples (a Weak Overall sample, a Strong Overall sample, and a Negative sample). The anchor and positive samples are embedded closer together using NCE loss and cosine similarity loss. The anchor and negative sample are embedded farther apart using cosine similarity loss.
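The two losses used in Algorithm 1 can be sketched in NumPy. This is an illustrative single-anchor version, not the authors' code; note that the margin-based negative loss from the text simplifies algebraically to (sim − m)/τ, since −log exp(x) = −x.

```python
import numpy as np

def cosine_sim(a, b):
    """Row-wise cosine similarity between two sets of vectors."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE: pull the anchor toward its positive, push it from negatives."""
    pos = np.exp(cosine_sim(anchor[None], positive[None])[0, 0] / tau)
    neg = np.exp(cosine_sim(anchor[None], negatives)[0] / tau).sum()
    return -np.log(pos / (pos + neg))

def negative_pair_loss(anchor, negative, m=1.0, tau=0.1):
    """Margin-style negative-pair loss from the text:
    L(x_i, x_j) = -log exp((m - sim(x_i, x_j)) / tau)."""
    sim = cosine_sim(anchor[None], negative[None])[0, 0]
    return -np.log(np.exp((m - sim) / tau))
```

With a well-aligned positive the InfoNCE loss is near zero, and it grows as the positive drifts away from the anchor; the negative-pair loss is zero when a negative sits exactly at the margin.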

Experiments
In our experiments, we employed a ResNet50 backbone network pre-trained on the ImageNet 1K dataset, with output classes designated as normal and anomaly. Input images were resized to 512 × 512 pixels and subsequently augmented. We used an NVIDIA GeForce RTX 2080 Ti GPU for computational efficiency. The hyperparameters were configured identically to those used in the Spot-the-Difference experiments [14]. The Adam optimizer was utilized with a learning rate of 0.0001 and a weight decay of 0.00003. Additionally, we applied the Cosine Annealing Learning Rate method to gradually decrease the optimizer's learning rate following a cosine curve. The batch size was set to 16, and the temperature parameter was set to 0.1. Training was conducted for 800 epochs, with model evaluation performed after each epoch using the test dataset. We saved the model when the accuracy, AU-ROC curve, and AU-PR curve achieved their highest values. The model was trained on a single category at a time, ensuring a one-to-one correspondence between the category and the model. We conducted experiments under these conditions and compared the results by applying different augmentations within the adjacent framework. We report the maximum Area Under the Receiver Operating Characteristic (AU-ROC) and Area Under the Precision-Recall (AU-PR) curves for each category in the MVTec-AD dataset. Finally, ζ is a parameter that controls the size of Mosaic anomaly patterns, and η is a parameter that controls the size of Liquify anomaly patterns. Figure 16 shows images of real-world defects alongside images generated using adjacent augmentation. Additionally, Appendix A provides explanations and experimental results that were not included in the main paper. Figure A1 shows the change in loss over 500 epochs of training. Figure A2 shows the changes in accuracy curves over 500 epochs of training. Figure A3 shows the changes in ROC curves over 500 epochs of training. Figure A4 provides a brief explanation of the evaluation metrics we use.
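The cosine-annealed schedule described above can be written as a short helper; `lr_max` is taken from the stated learning rate of 0.0001, while `lr_min = 0` is an assumption of this sketch.

```python
import math

def cosine_annealing_lr(epoch, total_epochs, lr_max=1e-4, lr_min=0.0):
    """Cosine-annealed learning rate: lr_max at epoch 0, lr_min at the end."""
    return lr_min + 0.5 * (lr_max - lr_min) * (
        1 + math.cos(math.pi * epoch / total_epochs))
```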
Table 3 illustrates the anomaly detection performance of models trained with Liquify augmentation within the adjacent framework. Our framework designates training data within the batch as positive pairs, thereby standardizing the features of normal data. We compared the performance of various augmentations based on the adjacent framework with those of the SimCLR framework. Table 4 compares synthetic anomaly images generated by previous augmentation methods with those generated by highly correlated adjacent augmentation. Parameters ζ and η indicate the degree of transformation applied by the adjacent augmentation. We present the maximum AU-ROC and AU-PR for 10 categories in the MVTec-AD dataset. Table 5 provides an ablation study on the impact of excluding synthetic anomaly data as negative samples within the adjacent framework. Synthetic anomaly images generated by adjacent augmentations have contours like real-world defects. By using these synthetic anomaly images as negative samples, the model learns improved anomaly features. The 'none' column represents learning without generating negative samples from the adjacent framework. Figure 17 illustrates the size of the Liquify pattern according to the parameter η, which controls the distance that a point moves. Table 6 shows the relationship between anomaly detection performance and the size of Liquify patterns. We provide maximum AU-ROC and AU-PR for 15 categories in the MVTec-AD dataset. Finally, we compared our method with various anomaly detection algorithms. Our adjacent framework, incorporating synthetic images and contrastive learning, demonstrated superior performance across multiple categories, highlighting its effectiveness in addressing class imbalance and improving anomaly detection. Table 7 shows the results of applying our method to the VisA dataset. Table 8 compares our proposed method with various anomaly detection approaches.
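Since results are reported as AU-ROC, a compact way to compute it is the rank (Mann-Whitney) formulation, sketched below: the AU-ROC equals the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal sample. This is a generic implementation, not tied to the paper's evaluation code.

```python
import numpy as np

def au_roc(scores, labels):
    """AU-ROC via pairwise comparisons (labels: 1 = anomaly, 0 = normal)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Count anomaly-vs-normal wins; ties count half.
    wins = (pos[:, None] > neg[None, :]).sum() \
        + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))
```

A perfect detector (all anomalies scored above all normals) yields 1.0, a fully inverted one yields 0.0, and chance-level scoring yields 0.5.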

Summary of Findings
In this paper, we introduce the adjacent augmentation technique and its framework to address the persistent challenge in anomaly detection. Our method integrates image augmentation with a learning framework to improve the recognition and identification of anomalous patterns. Adjacent augmentation addresses class imbalance by generating high-quality anomalous image features that retain shape while distorting contours, thus enhancing correlation with normal images. The adjacent framework standardizes the distribution of normal features by treating all training data within a batch as positive pairs and effectively learns the distinctions between normal and anomalous features using synthetic images. In other words, our augmentation methods simulate real-world defect patterns by introducing controlled distortions that resemble actual anomalies. For instance, as shown in Figure 15, positive samples generated through adjacent augmentations are embedded closer to the anchor using NCE loss and cosine similarity loss. In contrast, negative samples generated by the Mosaic, Liquify, and Mosiquify methods are embedded farther from the anchor using cosine similarity loss. The advantage of our generation method lies in its ability to produce a wide range of realistic anomalies that closely mimic real-world defects. This enhances the model's ability to distinguish between normal and anomalous data, making it more robust than other generation methods that might focus only on simpler or less varied synthetic defects.

Comparison with Existing Methods
CutPaste and SmoothBlend are effective in generating synthetic anomalies, but they primarily rely on simple cut-and-paste operations or blending techniques, which may not fully capture the complexity of real-world defects. These methods often struggle to simulate the intricate anomaly patterns found in diverse industrial settings, and they can sometimes introduce unrealistic artifacts that hinder the model's generalization ability.
In contrast, our proposed methods, Mosaic, Liquify, and Mosiquify, create more complex and realistic synthetic anomalies that better resemble real-world defects. By addressing both local and global image distortions, our methods effectively simulate a broader range of anomaly types. Furthermore, our adjacent framework leverages the correlation between anomalous patterns and surrounding pixels, leading to more robust learning and better detection performance.
Building on this, traditional approaches like CutPaste and SmoothBlend often suffer from low correlation between the anomalous pattern and its surrounding area, which can result in ineffective learning of anomalies. In contrast, our adjacent augmentation technique generates highly correlated anomalous patterns, facilitating more effective integration into normal images. This was evidenced by our experiments, which demonstrated the significant impact of these highly correlated patterns on anomaly detection performance. As a result, our method outperformed existing techniques such as CutPaste and SPD, significantly improving AU-ROC and AU-PR scores across various categories in the MVTec-AD dataset.

Impact of Deep Learning Architecture
The effectiveness of anomaly detection is significantly influenced by the choice of deep learning architecture. In our study, we employed the ResNet50 backbone network, renowned for its capability to learn complex representations in image data. The residual connections in ResNet50 mitigate the vanishing gradient problem and enable the training of deeper networks, thereby capturing intricate patterns in the data. Furthermore, our framework utilizes contrastive learning to enhance the model's ability to learn meaningful representations. By maximizing the similarity between positive pairs and minimizing it between negative pairs, the model can effectively discriminate between similar and dissimilar data points. This approach aligns with advancements in self-supervised learning, which have demonstrated superior performance across various tasks.
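The role of the residual (identity) shortcut can be illustrated with a toy block; the two-matrix form of F below is a simplification of ResNet50's bottleneck blocks, used only to show that the input always survives through the skip connection.

```python
import numpy as np

def residual_block(x, w1, w2):
    """Toy residual block: y = x + F(x), with F(x) = ReLU(x @ w1) @ w2.

    The identity shortcut guarantees a direct path for activations (and
    gradients), even when the learned branch F contributes nothing.
    """
    return x + np.maximum(x @ w1, 0.0) @ w2
```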

Limitations
Our adjacent augmentation method is currently focused on image-based anomaly detection, thereby limiting its applicability to other types of datasets, such as text, time-series, or video data. Achieving comparable performance on these data types may necessitate additional research and the development of specialized augmentation techniques. Furthermore, feature matching-based anomaly detection methods can offer more accurate detection through advanced feature extraction and matching algorithms. However, our method does not fully integrate these complex matching techniques, which may constrain its ability to detect subtle differences in high-dimensional feature spaces. Specifically, our approach may not perform optimally in scenarios where detecting subtle anomalous patterns that closely resemble normal patterns is critical. Considering these limitations, our research introduces a novel approach to image-based anomaly detection, but further investigation and improvements are required to extend its applicability to diverse data types and more complex anomaly detection scenarios. Future work will focus on overcoming these limitations and enhancing our method to increase its applicability across various domains.

Conclusions
Our adjacent augmentation method enhances anomaly detection performance by generating high-quality synthetic anomalies that are closely correlated with their surroundings. Through extensive experiments, we demonstrated the effectiveness of our approach in alleviating class imbalance and improving model performance. By leveraging contrastive learning and robust deep learning architectures, our framework makes significant contributions to the field of anomaly detection. The potential applications of our method are vast, offering improved reliability and accuracy in various industrial contexts.

Figure 1 .
Figure 1. Images from the MVTec-AD dataset. This dataset comprises object and texture classes. Normal images feature a green border, while anomaly images are outlined in red. Defects within the images are likewise indicated by a red border.

Figure 2 .
Figure 2. Utilizing deep one-class classification for anomaly detection. This algorithm determines the normalcy of input data by assessing whether they reside within a hypersphere formed by normal data. The figure illustrates the process of constructing such a hypersphere using a neural network to discern the features characteristic of normal data.
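The hypersphere test described in the caption can be sketched as a simple distance check; in practice `center` and `radius` would come from training a deep one-class model on embedded normal features, so both are assumed inputs here.

```python
import numpy as np

def hypersphere_score(features, center):
    """Deep one-class style score: distance of a feature from the centre."""
    return np.linalg.norm(np.asarray(features) - np.asarray(center), axis=-1)

def is_anomaly(features, center, radius):
    """Flag inputs whose embedded features fall outside the hypersphere."""
    return hypersphere_score(features, center) > radius
```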

Figure 3 .
Figure 3. This figure compares the ℓ2-autoencoder and SSIM-autoencoder for anomaly detection using autoencoders. An autoencoder trained on normal data compresses input fabric textures and then reconstructs them as normal fabric textures. The ℓ2-autoencoder removes defects during reconstruction, while the SSIM-autoencoder retains defects. Therefore, the SSIM-autoencoder shows better anomaly detection performance than the ℓ2-autoencoder.

Figure 4 .
Figure 4. Utilizing a memory bank for anomaly detection. The memory bank retains features extracted from normal patches. The model then compares the features of the input image with those stored in the memory bank. If there is at least one discrepancy between the input patches and the stored normal patches, the model classifies the input image as an anomaly.
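The memory-bank comparison can be sketched as a nearest-neighbour distance: each patch is scored by its distance to the closest stored normal feature, and the image score is the worst patch, matching the caption's "at least one discrepancy" rule. This is a generic sketch, not the cited method's code.

```python
import numpy as np

def memory_bank_score(patch_features, memory_bank):
    """Image-level anomaly score from patch features.

    patch_features: (P, D) features of the input image's patches.
    memory_bank:    (M, D) features of stored normal patches.
    """
    # Pairwise distances between every patch and every stored normal feature.
    d = np.linalg.norm(
        patch_features[:, None, :] - memory_bank[None, :, :], axis=-1)
    # Per-patch nearest-neighbour distance; the image score is the worst patch.
    return d.min(axis=1).max()
```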

Figure 5 .
Figure 5. Illustration of the class imbalance problem. In anomaly detection, class imbalance occurs when the quantity of normal data points greatly surpasses that of anomaly data points. This imbalance poses challenges for both model training and performance assessment. Particularly, when anomaly data are scarce, the model may struggle to differentiate between normal and anomaly instances.

to slightly modify the concepts of the SimCLR framework. Our adjacent framework generates two synthetic normal images and one synthetic anomaly image from each training image. The training image and synthetic normal images are set as positive pairs, while each training image and synthetic anomaly image is set as a negative pair. Additionally, all training images within the batch are treated as positive pairs, helping to establish a robust normal image distribution. This framework enhances the learning of distinctions between normal and anomalous images and employs synthetic anomaly images to address class imbalance. Figure 6 compares our framework with the SimCLR framework.
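The pairing difference described here can be expressed as batch-level masks; this is a minimal sketch with assumed function names, where a True entry marks a positive pair between batch items.

```python
import numpy as np

def simclr_positive_mask(n):
    """SimCLR-style pairing: each image is positive only with its own
    augmented view; all other batch items act as negatives."""
    return np.eye(n, dtype=bool)

def adjacent_positive_mask(n):
    """Adjacent framework pairing: every training image in the batch is
    treated as a positive pair with every other (negatives come only from
    synthetic anomaly samples, handled separately)."""
    return np.ones((n, n), dtype=bool)
```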

Figure 6 .
Figure 6. The difference between (a) the SimCLR framework and (b) the adjacent framework. While the SimCLR framework designates the training data within the batch as negative pairs, the adjacent framework pairs them as positive pairs. Notably, the adjacent framework embeds the features of normal data into the hypersphere space, resulting in improved discrimination between the features of normal data and those of anomaly data.

Figure 7 .
Figure 7. Depicted here is an image with Weak Overall augmentation. Weak Overall augmentation involves subtle adjustments to the anchor's size and a mild application of Gaussian blur. Additionally, horizontal flipping occurs randomly with a specific probability. These Weak Overall samples aid in reducing sensitivity to minor overall changes.

Figure 8 .
Figure 8. Depicted here is an image with Strong Overall augmentation. Strong Overall augmentation significantly alters the size and color of anchor images. Moreover, Gaussian blur, horizontal flipping, and grayscale are applied with varying probabilities. Strong Overall samples promote the learning of intricate features within normal images.
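The Weak/Strong Overall contrast can be sketched as two pipelines. A random flip, a random gain, and occasional grayscale stand in for the full resize/blur/color-jitter recipes, and all probabilities and ranges below are assumptions of this sketch, not the paper's values.

```python
import numpy as np

def weak_overall(img, rng=None):
    """Weak Overall sketch: mild change only (random horizontal flip here)."""
    rng = np.random.default_rng(rng)
    return img[:, ::-1] if rng.random() < 0.5 else img

def strong_overall(img, rng=None):
    """Strong Overall sketch: heavier photometric change (random gain,
    plus an occasional grayscale conversion)."""
    rng = np.random.default_rng(rng)
    out = img.astype(float) * rng.uniform(0.6, 1.4)
    if rng.random() < 0.2:  # occasional grayscale, probability assumed
        out = np.repeat(out.mean(axis=-1, keepdims=True), 3, axis=-1)
    return np.clip(out, 0, 255).astype(np.uint8)
```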

Figure 9 .
Figure 9. Depicted here is an image with CutPaste augmentation. CutPaste augmentation entails cutting a square patch from the anchor image and pasting it onto the original image. These CutPaste samples, which distort continuous patterns in normal images, facilitate the learning of discontinuous features present in anomaly data.
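CutPaste as described can be sketched in a few lines; the `patch` size and the uniform placement of source and destination are assumptions of this sketch.

```python
import numpy as np

def cutpaste(img, patch=16, rng=None):
    """CutPaste sketch: cut a square patch and paste it at a random spot."""
    rng = np.random.default_rng(rng)
    h, w = img.shape[:2]
    # Random source and destination corners for the square patch.
    sy, sx = rng.integers(0, h - patch), rng.integers(0, w - patch)
    dy, dx = rng.integers(0, h - patch), rng.integers(0, w - patch)
    out = img.copy()
    out[dy:dy + patch, dx:dx + patch] = img[sy:sy + patch, sx:sx + patch]
    return out
```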

Figure 10 .
Figure 10. Depicted here is an image with SmoothBlend augmentation. SmoothBlend augmentation involves cutting a small, round patch from the anchor image and pasting it onto the original image. These SmoothBlend samples, which distort local detailed patterns in normal images, encourage the learning of detailed features found in anomaly data.

Figure 11 .
Figure 11. Depicted here is an image with Mosaic (ζ = 20) augmentation. Mosaic augmentation transforms color and resolution by specifying circular areas in anchor images. These Mosaic samples, which distort the resolution and color patterns of normal images, encourage the learning of natural and small defects present in anomaly data.

Figure 12 .
Figure 12. Depicted here is an image with Liquify (η = 0.03) augmentation. Liquify augmentation randomly selects a point on the training image and transforms its contours as they move. These Liquify samples maintain the shape of the normal image while distorting the contours, facilitating the learning of unnatural contours present in anomaly data.

3.1.7. Mosiquify
Mosiquify augmentation combines the effects of the Liquify and Mosaic augmentations, distorting contour, resolution, and color together. This technique introduces complex and varied anomalies, facilitating the model's ability to recognize a wide range of anomalous features. Figure 13 shows a Mosiquify sample. Algorithm to generate Mosiquify samples:
1. Apply the Weak Overall augmentation.
2. Apply the Mosaic (ζ = 20) augmentation.
3. Apply the Liquify (η = 0.05) augmentation.
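Since Mosiquify is a sequential composition of existing augmentations, it can be expressed as a simple pipeline. The helper `compose` and the stand-in augmentations below are illustrative only; the actual Weak Overall, Mosaic, and Liquify transforms are defined elsewhere in the paper.

```python
import numpy as np

def compose(*augs):
    """Build an augmentation pipeline that applies augs left to right;
    each aug takes (image, rng) and returns a new image."""
    def pipeline(image, rng=None):
        rng = np.random.default_rng(rng)
        for aug in augs:
            image = aug(image, rng)
        return image
    return pipeline

# stand-in augmentations, for illustration only
brighten = lambda img, rng: img + 1.0
double = lambda img, rng: img * 2.0

# Mosiquify would be: compose(weak_overall, mosaic, liquify)
mosiquify_like = compose(brighten, double)
```

Sharing one random generator across the stages keeps the composed augmentation reproducible from a single seed.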

Figure 13. Depicted here is an image with Mosiquify augmentation. Mosiquify augmentation applies both the Mosaic (ζ = 20) and Liquify (η = 0.03) augmentations to an image. These Mosiquify samples, containing two kinds of distorted anomalous patterns, promote the learning of varied features from anomaly images.

Figure 14. Images generated by adjacent augmentations. Strong Overall and Weak Overall augmentation produce synthetic normal data, while Mosaic, Liquify, and Mosiquify augmentation generate synthetic anomaly data.

Figure 15. Overview of the adjacent augmentation and its framework. x_i: normal image serving as the anchor. x̃_i: positive sample generated by Strong Overall. x̃_i^+: positive sample generated by Weak Overall. z̃_i^-: negative sample generated by mimicking actual defects. The image passes through the encoder f(·) to become a representation h; the representation h passes through the projector g(·), and then ℓ2 normalization is applied to obtain the projection z.

Figure 16. Comparison of real-world defects and synthetic anomaly images. Table 4 compares synthetic anomaly images generated by previous augmentation methods with those generated by the highly correlated adjacent augmentation. The parameters ζ and η indicate the degree of transformation applied by the adjacent augmentation.

Figure A3. ROC curves for each augmentation in the Bottle category.

Figure A4. The confusion matrix used to calculate performance metrics such as accuracy, precision, recall, and F1 score.

Table 1. Comparison of augmentation speeds across various methods. This table presents the speeds of five augmentations: CutPaste, SmoothBlend, Mosaic, Liquify, and Mosiquify. Our proposed adjacent augmentation offers a simple augmentation approach with speed comparable to previous methods.

Table 2. Detailed information on the MVTec-AD dataset. This table lists the number of normal images in the training set and the numbers of normal and anomaly images in the test set, along with the number and types of defects within each category. Although this dataset represents an improvement over previous ones, anomaly data remain scarce.

Table 3. Differences in anomaly detection performance between the SimCLR framework and the adjacent framework (A.F.).

Table 4. Maximum Area Under the Receiver Operating Characteristic (AU-ROC) and maximum Area Under the Precision-Recall (AU-PR) curves when various augmentations are applied.

Table 5. Ablation study of negative samples.

Table 6. Relationship between the size of the Liquify anomalous pattern and anomaly detection performance.

Table 7. Maximum Area Under the Receiver Operating Characteristic (AU-ROC) and maximum Area Under the Precision-Recall (AU-PR) curves when various augmentations are applied to the Visual Anomaly (VisA) dataset [14].

Table 8. Comparison of Liquify with various anomaly detection methods.