Detection of Mask Usage Using Image Processing and Convolutional Neural Network (CNN) Methods

ABSTRACT


INTRODUCTION
The World Health Organization (WHO) classified COVID-19 as a worldwide pandemic in mid-March 2020. The virus, which originated in the Wuhan area of China, spread to 123 nations in just three months [1]. The first positive case in Indonesia was announced in March 2020. Since then, the number of positive COVID-19 cases has increased every day [2]. As of January 26, 2022, 4,301,193 patients had been confirmed positive for COVID-19. Of these cases, 96% or around 4,127,662 patients have been declared cured, and 3.4% or approximately 144,254 people have died [3].
To curb the spread of COVID-19, the government has implemented a large-scale social restriction program in various districts, limiting citizens' movement. However, transmission of the virus is still on the rise [4]. In addition, the government is increasingly encouraging the public to comply with health protocols, raising public awareness of self-protection in order to prevent transmission and break the chain of the spread of the virus [5] [6]. Various health protocols have been promoted through various media, such as reducing visits to crowded, narrow, and closed places, maintaining social distance, always using face masks, and diligently washing hands properly [7]. Instructions on washing hands properly and wearing a mask correctly are everywhere, but many people still do not heed them. For example, many people are still seen wearing a face mask in a way that does not meet the specified requirements, such as covering the nose and chin [8].
https://doi.org/10.24036/tip.v15i1 P.ISSN: 2086-4981 E.ISSN: 2620-6390 tip.ppj.unp.ac.id
This research departs from the concern that many people have not adhered to the proper use of masks as recommended. Information technology is advancing rapidly, including the development and processing of digital information. A system based on digital image classification is needed to address this problem with technology. Image classification aims to classify images into specific categories [9]. It is a long-standing problem in computer vision [10]: how to replicate the human ability to comprehend digital image information so that computers, like humans, can detect objects in an image. Feature engineering techniques are often quite constrained, as they can only be applied to specific datasets and do not generalize [11].
Because of its strong capability in modeling complex data such as images and sound, Deep Learning has recently become one of the most active topics in Machine Learning. The Convolutional Neural Network (CNN) is the Deep Learning approach that currently produces the most significant results in image recognition [8][9] and is frequently used with image data. Essentially, CNN is regarded as one of the best models for solving object detection and identification problems [13]. In 2012, CNN research demonstrated digital image recognition on some datasets with accuracy equivalent to that of humans. A CNN typically applies the convolution method by sliding a convolution kernel (filter) of a specified size over an image. By multiplying each area of the image with the filter, the computer obtains new representative information [15].
a. ... websites, and scientific works related to this research.
b. The analysis and design stage describes the software and hardware needed and the preparation of tools for prototype development.
c. The prototype manufacturing stage produces a system prototype that will be tested.
d. The testing phase of the mask detection prototype aims to determine whether the tool built is functioning or not. If there are problems, repairs are carried out as early as possible.
e. Once the prototype has been tested, the next step is to deploy it under actual conditions and check whether the system output meets the expected target.

System Design
Application Block Diagram Design
The system architecture consists of several components: a camera serves as the input, the captured images are processed on a laptop, and the output is displayed on the laptop monitor.

Flowchart of the Convolutional Neural Network Model
After gathering the data, the CNN model must be trained. In general, a CNN comprises two stages: feature learning and classification. The CNN model takes a 64x64x3 image as input, where the 3 denotes the three color channels: Red, Green, and Blue (RGB). The input image is first processed through the convolution and pooling operations in the feature learning stage. The complete flowchart can be seen in Figure 3.
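The convolution and pooling operations of the feature learning stage can be illustrated with a minimal pure-Python sketch. It is not the model used in this study: the 3x3 kernel values, the 2x2 pooling window, the stride of 1, and the absence of padding are all assumptions chosen for illustration, and only a single 64x64 channel is processed rather than the full RGB input.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.

    As in most CNN libraries, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the image patch by the filter element-wise and sum.
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping `size` x `size` window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[i * size + di][j * size + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# One synthetic 64x64 channel (assumed values, not real image data).
channel = [[float((r + c) % 256) for c in range(64)] for r in range(64)]
# Hypothetical 3x3 vertical-edge filter.
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

features = conv2d_valid(channel, kernel)
pooled = max_pool2d(features)
print(len(features), len(features[0]))  # 62 62
print(len(pooled), len(pooled[0]))      # 31 31
```

Under these assumptions a 64x64 channel shrinks to 62x62 after the convolution and to 31x31 after pooling, which shows how the feature learning stage reduces spatial size before classification.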

System Implementation
Data Set
A data set is a collection of objects representing data and their relationships in storage media. There are various ways to describe datasets, such as attributes used to define types of things, both qualitative and quantitative. This research uses a public dataset, that is, a dataset from a public repository that has been approved by previous researchers and can be accessed globally [15].
The data used to evaluate the mask detection system is two-dimensional image data. For this experiment, 1484 images were used, separated into training data and testing data. Training data is used to train the machine learning model, in this case the CNN, so that its outputs are consistent with the expected predictions. Testing data is used to test the accuracy and performance of the trained model.
The dataset used is a public dataset created by a researcher named Prajna Bhandary. It can be accessed at https://github.com/prajnasb/observations. The dataset consists of 1484 images classified into 2 classes, namely the class wearing a mask and the class without a mask. Table 1 shows the amount of data in each class for the training data, and Table 2 shows the amount of data in each class for the testing data.
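The division of the 1484 images into the 1260 training images and 224 testing images reported in this paper can be sketched as follows. The paper does not state how the split was performed, so the seeded random shuffle, the function name `split_dataset`, and the file names are assumptions made purely for illustration.

```python
import random


def split_dataset(items, n_train, seed=0):
    """Shuffle a copy of `items` and return (training, testing) lists."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for the 1484 dataset images.
image_ids = [f"image_{i:04d}.jpg" for i in range(1484)]

train_ids, test_ids = split_dataset(image_ids, n_train=1260)
print(len(train_ids), len(test_ids))  # 1260 224
```

A split that preserves the class proportions in both subsets (a stratified split) would apply the same function to each class separately.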

Mask Detection System
The training stage uses a pre-trained CNN. Training is carried out on the face dataset for each class, with and without a mask, using the pre-trained CNN. The 1484 images are divided so that the accuracy on the dataset can be tested. The first step is dividing the face dataset into training and testing data. The next step is training with a previously trained (pre-trained) network, using transfer learning, to determine the accuracy on the dataset.
Recognition of face mask usage is then tested, with the categories of wearing and not wearing a mask. The test is carried out by taking images from the camera's video stream. Figure 6 depicts an example of the facial recognition results. From this test, it was concluded that the developed system can recognize mask usage not only for a single face object but also for several faces at once.

Determination of Test Parameters
The test parameters for this CNN model are chosen to produce the best system. The parameters used are the accuracy per class, the accuracy per epoch, and the loss per epoch.

Testing Results
The following are the results for the test parameters, evaluated on 1260 images consisting of 703 images in the class wearing masks and 557 images in the class without masks. The results are as follows:
1. Accuracy per class
Accuracy is defined as the ratio of correct predictions (both positive and negative) to the total data. This is stated as follows:
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
where TP (true positives) and TN (true negatives) are the numbers of correct positive and negative predictions, and FP (false positives) and FN (false negatives) are the numbers of incorrect positive and negative predictions.
The learning rate is one of the training parameters used to determine the weight correction value during the training process. In this test, the epoch value is 50, the batch size is 16, and the learning rate is 0.001. Figure 7 depicts the accuracy value for each epoch.
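The accuracy ratio defined in this section (correct predictions over total data) can be computed with a short sketch like the one below. The labels in the example are invented for illustration and are not results from this study. The label names `"mask"` and `"no_mask"` and the choice of `"mask"` as the positive class are also assumptions.

```python
def confusion_counts(y_true, y_pred, positive="mask"):
    """Count TP, TN, FP, FN for a two-class problem."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Ratio of correct predictions to all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("accuracy is undefined without predictions")
    return (tp + tn) / total


# Invented labels for illustration only (not data from the study).
y_true = ["mask", "mask", "no_mask", "no_mask", "mask"]
y_pred = ["mask", "no_mask", "no_mask", "no_mask", "mask"]

counts = confusion_counts(y_true, y_pred)
print(counts)             # (2, 2, 0, 1)
print(accuracy(*counts))  # 0.8
```

An accuracy of 1 corresponds to FP = FN = 0, that is, every prediction matches the actual class.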

CONCLUSION
The CNN approach is employed in this study using 1484 images, divided into 1260 training images and 224 testing images, in two classes: wearing and not wearing a mask. The developed system model is capable of identifying and distinguishing several face objects from an image or video source displayed on the camera screen. Each object can be recognized and grouped into one of the 2 predefined classes. From the results of training data processing, an accuracy value of 1 was obtained, which means that all predictions match the actual labels. In other words, the system can recognize all images and correctly determine the class of each image. The accuracy per epoch on the training data is close to or equal to 1.00 by epoch 10, and likewise in testing on the testing data. On the training data, the loss value is already 0 at epoch 10. On the testing data there is a slight difference, because a loss value of 0 was only reached at epoch 50. The difference is small, however, because the loss value is already close to 0, so the developed model is in a good-fit condition and does not fall into the underfitting or overfitting category.

Figure 1
Figure 1 below shows the steps taken in completing this research.

Figure 3.
Figure 3. Flowchart of Object Detection with CNN

Figure 4
Figure 4 below is an example of the dataset used for the training process. The dataset is divided into 2 classes, namely images with masks and images without masks. The dataset has been labeled using the LabelImg module in Python.

Figure 8.
Figure 8. Loss per epoch value

Figure 8
Figure 8 indicates that by epoch 10, the loss value for the training data (blue line) is already zero. The loss value is also measured on the testing data to check for overfitting. Overfitting happens when a model is fitted too closely to a specific training dataset, so that it cannot generate accurate predictions when given other comparable datasets. In the same figure, the loss value for the testing data (yellow line) decreases as the epoch value increases and finally approaches 0 at epoch 50. This indicates that the system works well (good fit) on the testing data.
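The comparison of training and testing loss described here can be expressed as a simple rule of thumb. The sketch below is only a heuristic: the function name `diagnose_fit`, the two thresholds, and all loss values are assumptions chosen for illustration and are not the measurements plotted in Figure 8.

```python
def diagnose_fit(train_loss, test_loss, low=0.05, max_gap=0.10):
    """Label the final-epoch behaviour of two loss curves.

    `low` and `max_gap` are hypothetical thresholds: a final training
    loss above `low` is treated as underfitting, and a testing loss more
    than `max_gap` above the training loss is treated as overfitting.
    """
    final_train = train_loss[-1]
    final_test = test_loss[-1]
    if final_train > low:
        return "underfitting"
    if final_test - final_train > max_gap:
        return "overfitting"
    return "good fit"


# Invented loss curves for illustration only.
train_curve = [0.60, 0.20, 0.05, 0.01, 0.00]
test_curve_close = [0.65, 0.30, 0.12, 0.05, 0.02]
test_curve_diverging = [0.65, 0.40, 0.45, 0.60, 0.80]

print(diagnose_fit(train_curve, test_curve_close))      # good fit
print(diagnose_fit(train_curve, test_curve_diverging))  # overfitting
```

Both losses ending near 0, as reported for epoch 50, fall into the "good fit" branch of this heuristic.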

Table 1.
Details of Total Training Data

Table 2.
Details of Total Testing Data

Table 3.
Accuracy Results per Class
Accuracy per epoch
The epoch parameter specifies how many times the algorithm will iterate through the full dataset. One epoch occurs when the entire dataset has passed forward and backward through all nodes of the neural network once. The number of training samples used in one batch is called the batch size. In neural network training, processing a whole epoch in a single pass takes a long time, since all of the data is incorporated at once. Typically, the dataset is therefore separated into batches to simplify and speed up the training process.
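The relationship between epochs and batches can be made concrete with the training settings reported in the Testing Results section (1260 training images, batch size 16, 50 epochs). The sketch assumes that the final, smaller batch of each epoch is kept rather than dropped. The paper does not state this.

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once.

    Assumes the last, partially filled batch is still processed.
    """
    return math.ceil(n_samples / batch_size)


n_train, batch_size, epochs = 1260, 16, 50

steps = steps_per_epoch(n_train, batch_size)
last_batch = n_train - (steps - 1) * batch_size
total_updates = steps * epochs

print(steps)          # 79
print(last_batch)     # 12
print(total_updates)  # 3950
```

Under this assumption, each epoch consists of 78 full batches of 16 images plus one batch of 12, and the 50 epochs amount to 3950 weight updates in total.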