Harish Kumar Saini, Monika Poriye
Department of Computer Science and Applications, Kurukshetra University, India
ABSTRACT
The Internet of Things (IoT) is the fastest-growing collection of physical entities embedded with technologies to sense and exchange information with other connected devices over the Internet.Since IoT systems are resource-constrained and ad hoc, they are an obvious target for cyberattacks. IoT system security thus requires continual observation and analysis. The application of machine learning (ML) to IoT security holds particular promise for identifying any anomalies in the system’s typical operation. In this paper, we propose to design a Random Forest-Support Vector Machine (RF-SVM) based Anomaly detection framework for IoT. The RF classifier is applied for selecting the optimal features from the extracted traffic data. It includes removing the outliers, redundant data, and choosing the best features with high weight values. Then, SVM is applied for classifying the extracted features and detecting the anomalies. The fitness function is derived in terms of true positives, false positives, and false negatives. From the detected anomalies, the attack type is then determined, and a corresponding warning is sent to the monitoring nodes.In the experimental results, it is shown that the proposed RF-SVM classifier attains
increased detection accuracy with reduced detection overhead and packet drops
KEYWORDS
IoT, Machine Learning, Ensemble, Anomaly detection
1.INTRODUCTION
The Internet of Things (IoT) is the network of physical devices, such as smartphones, and other smart objects that exchange information and provide useful services online. The Internet of Things is a global revolution. It provides the potential for use in a wide range of application areas. It has been widely used in retail, agriculture, smart cities, smart homes, smart industries, and environment monitoring, among other areas. Connecting devices is the aim of the Internet of Things. Wireless Sensor Network (WSN) upgrades are very common. WSNs connect IoT devices to gather environmental data. Due to its limited energy, memory, and processing capabilities, IoT is resource-constrained [1]
Due to IoT systems’ ad hoc and limited resources, they are an obvious target for cyber-attacks. As a result, protecting IoT systems requires constant monitoring and analysis. Prior to an attack, it’s critical to know what to do in the event of an unforeseen situation, take precautions, protect important data, and assure continuity [2]. During the routing of data packets, data packets are quite likely to be exposed. The data packets would be lost if the rogue node invaded the nodes.As a result, the security of data packets in IoT-constrained devices has a significant impact because it is linked to the users [3]. For resource-constrained environments, standard security measures are prohibitively expensive. [4]
In IoT based sensor network, a Distributed Denial of Service (DDoS) attack is feasible whose main purpose is to interrupt the data transfer between end users. This exploit generates malicious
traffic flooding, causing other valid nodes to receive unnecessary packets. Actions of these attackers contribute to the deterioration of the network in terms of greater bandwidth usage, memory utilization, energy consumption, etc. [5]. Malicious Control, Malicious Operation, and Wrong Setup are just some of the additional assaults and abnormalities that might cause an IoT device to malfunction [6]
An intrusion-detection system (IDS) is a must for any IoT traffic environment that is particularly sensitive. A majority of current research on IDS for the Internet of Things is based on rule-based detection. Anomaly-based detection methods are crucial in IoT environments [7] for efficient threat detection
Any anomalies in the system’s behaviour can be detected by using machine learning (ML) for IoT security. An aberrant situation can be detected and protected by a variety of machine learning algorithms. [6][7].
Among the ML algorithms,SVM and RF have been widely used in recent years to suggest feasible solutions to the IDS problem. SVM can provide good decision surfaces by maximizing margins using soft-margin approaches. Though SVM is slightly more accurate,it consumes more time. RF produces similar accuracy in a much faster manner if given modelling parameters. Hence,by combining these two classifiers and creating a hybrid RF-SVM classifier will result in increased accuracy in less time [11]
In this paper, RF-SVM classifier is designed to detect the IoT network traffic anomalies. In contrast to the existing RF-SVM classifiers, here RF classifier is applied for extracting the optimal features from the network traffic data and the SVM is applied for classifying the extracted features and detecting the anomalies.
This paper is organized as follows. Section 2 presents the related works on anomaly detection using ML classifiers and RF-SVM classifiers. Section 3 presents the detailed methodology of the proposed RF-SVM classifier. Section 4 presents the experimental results and Section 5 presents the conclusion
2.RELATED WORKS
2.1. Anomaly Detection using Machine learning (ML) classifiers
IoT attacks and anomalies can be detected using a group of ML classifiers [6]. Decision Tree (DT), Logistic Regression (LR), Support Vector Machine (SVM), Neural Network (ANN). Final models were created using an optimization method based on the training datasets
This system uses a Deep Learning [7] algorithm to identify fraudulent traffic in IoT networks. Network traffic is organised into sessions and anomalous activity is examined. During the training phase, the work is done offline and spans a long period of time. When the data is preprocessed, tuples of features are generated and used to train the model. The perceptual learning model utilises information gained at each perceptual layer to filter out the preferred traits before feeding it to the next perceptual layer
To combat DDoS attacks, a machine-learning framework [8] was developed. The IoT device traffic capture mechanism is capable of capturing a wide range of data. Categorizing and retrieving features based on IoT activity has been done and as a final step, a variety of binary
classification techniques were used to correctly distinguish between normal communication and DoS communications.
For low-resource IoT devices, a game theory-based lightweight anomaly detection approach has been proposed [9]. IoT security has been shown as a game between IDS agents and the attackers in this approach. There are new attack patterns that need to be tracked down by the IDS agent. The training, classification, and rule-making phases of the anomaly detection process are all included.
Using deep migrating learning, a new IoT data feature extraction and IDS has been developed [10]. This document outlines the migration learning model and data feature extraction. Migrating from one subject or activity to another is a process of acquiring new knowledge. Research shows that an IDS model can effectively shorten clustering times while retaining the accuracy required to identify intrusions. However, the accuracy of categorisation can suffer throughout the compression process
2.2. RF-SVM Classifiers for Anomaly Detection
Shanmuga sundari et al [12] have shown that fraud detection using RF and SVM techniques may be compared in terms of accuracy. Using data mining algorithms, they are able to identify both normal and fraudulent transactions based on the past information, including exchanges that have been misrepresented.
Prithi et al [13] have proposed a two-stage hybrid classification technique for intrusion detection. Anomaly detection is done using SVM, while abuse detection is done using (RF)/Decision Tree (DT). In the beginning, the abnormalities are spotted. Second-stage investigation recognises the most common types of DoS and Probe, as well as recognised R2L assaults and User to Root (U2R) assaults.
Two categorization models have been developed by Md. Al MehediHasan et al [14]. The SVM and RF model are used for each. Experiments have shown that any classifier works here. SVM is a little more accurate, but it takes a lot longer to run. RF provides the same level of precision in a considerably faster manner if the model parameters are provided. These classifiers can help improve the accuracy of an IDS system. The KDD’99 Dataset is employed in this study in order to determine which intrusion detector is more effective on this dataset.
In [15], RF-SVM classifier has been applied to classify the gene expression data in Chronic Kidney Disease (CKD). Here, RF is highly accurate and is interpretable and SVM effectively predicts the gene expression data with very high dimensions.
2.2.1. Challenges
Developing an effective and efficient anomaly detection model using machine learning algorithms is a challenging process because of the following reasons[16]
In this paper, we propose to design RF-SVM-basedAnomaly detection mechanism for IoT. Here RF-SVM classifier module is applied to detect the IoT network traffic anomalies. Although there are hybrid RF–SVM pipelines, our work is different in three significant ways: A fitness-driven SVM objective that explicitly optimizes TP, FP, and FN to prioritize anomalous-event recall under IoT IDS constraints, a gateway-centric RF feature selection with an empirically tuned stability threshold that filters redundant/noisy traffic fields before classification, and a deployment-oriented evaluation in NS-2 with TwoRayGround propagation and workload sweeps (monitoring interval and attack-rate) that reports accuracy as well as detection delay and packet drop as first-class metrics pertinent to IoT networks. Together, these components represent a novel experimental and methodological contribution that goes beyond the simple statement, “RF for features, SVM for classification. The RF classifier is applied for selecting the optimal features from the extracted traffic data. It includes removing the outliers, redundant dataand choosing the best features with high weight values. Then, SVM is applied for classifying the extracted features and detecting the anomalies. The fitness function is derived in terms of true positives, false positives and false negatives. From the detected anomalies, the attack type is then determined, and a corresponding warning will be sent to the monitoring nodes.
The decision trees adopt a top-down approach in splitting the data samples in smaller subsets basedd on different decision criteria, which will be discussed. The root node is considered the best predictor. The decision node is the attribute where the highest splitting criterion (information gain for example) is achieved. At a terminal node or leaf node, the splitting process halts; it represents a decision. In this case, a splitting criterion such as the information gain is equal to zero
A major type of decision tree is an ensemble-based decision tree or Random Forest (RF)
3.2.1. Random Forest (RF) Algorithm
RF algorithm is a set of trees and a supervised classification algorithm that generates each tree using a basic sample of the exclusive training data. In order to classify a new item from an input trace, the trace will be positioned beneath each tree in the forest. There is a direct correlation between the number of trees in a forest and the potential fallout; that is, the more trees there are, the more accurate the result. Each and every tree offers a vote to indicate the tree’s preference regarding the item’s category. The group that generates the most votes out of all the trees is chosen by the forest. [14][17].
There are two stages in the RF algorithm: (i) RF creation, (ii) creating a calculation from the arbitrary forest classifier made in the preliminary phase [17].
The whole process is shown below:
(i) Arbitrary forest formation
(ii) Arbitrary forest estimation
3.2.2. RF-based Feature Selection
RF uses a technique called bootstrap aggregation (Bagging), which samples the data set used in the classification task, randomly with replacement. The bootstrap method is a resampling technique to generate slightly different data sets from the original training data set, and bagging combines many classifiers trained with slightly.
Let m represent how many instances there are in the real training set. Create a bootstrap model of dimension m using the real training data. Let m represent the total number of input structures found in the real training set. For each tree where k<m, only k features are randomly chosen from the bootstrap model data. At each node of the tree, the traits from this group form the best possible fragment. The value of m should remain constant throughout the forest’s ascent [18].
The original Packet Capturing Files (PCAP),which have the network packages were primarily changed and characterized in a Packet Description Markup Language (PDML) format
The features that reflect the device characteristics and behaviour related to various attacks are only considered. Each feature has been assigned a weight value. The best subset of features with higher weight values is selected by RF algorithm
Table 1.Extracted features and assigned weights
Let S be the source node and {F} be the set of selected features. Let DS(A) be the actual data set, BSm(A) represents the bootstrap model of m features, and DS(C) represent the collected dataset by each gateway . The overall weight value is the number of most repeated features divided by the total number of features. Let MinWbe the minimum threshold or lower bound of W. In our work, the value of MinW is kept as 3 based on sensitivity analysis by testing the MinW value (1- 5) and observing that MinW = 3 provided the best detection accuracy without an increase in false positives.The features which are having weight values greater than MinW are included for classification. The features that were given the highest weight values, as shown in Table 1, were rated the highest due to being accountable, through the Random Forest feature selection and importance ranking, for the greatest strength in distinguishing normal traffic from compromised traffic or exhibiting the most discriminative power. These features (e.g., repeated failed attempts or session errors) are clearly related to attack behavior and are therefore useful signals for the SVM classifier. The selection of these parameters, along with their high weighting, is corroborated by previous research and domain knowledge, confirming that anomalies found in these domains are significant indicators of malicious activity or intrusion attempts in IoT environments.
The proposed RF-based feature selection algorithm is presented below:
Algorithm: RF-based feature selection
According to this algorithm, the optimum features selected by the RF classifier algorithm are: serv_error_rate, Dst_host_serror_rate, No_failed_attempts, No_file_creations, No_access_files, and No_compromized.
3.3. SVM Classifier
The basic principle of SVM is finding the optimal linear hyperplane in the feature space that maximally separates the two target classes [16]. Geometrically, the SVM modelling algorithm finds an optimal hyperplane with the maximal margin to separate two classes,
In SVM, the training set is provided as
Here, xj is the input characteristic vector of thej thmodel, and yj is the output catalogue = +1 or -1.
SVM splits the +ve and -ve instances by means of a hyperplane as
Here, w.x signifies the dot product of w & x.
SVM calculates the finest hyperplane by exploiting the border.
The choice function f(x) = sgn(g(x)) for an event is provided as
where i is the numerical limit for imminent vector xi
Figure 3. Concept of SVM
3.4. Detection Algorithm using SVM
For classifying the extracted features and detecting the anomalies, we use SVM. The fitness function (F) is derived in terms of True positives (TP), False positives (FP) and False negatives (FN) as follows:
Since the algorithm focuses on anomalous event prioritization and minimization of missed detections True negatives are excluded, as recommended in IoT IDS literature.
By means of the training vector in two sets and the label vector y, the provision vector method needs the resolution of the succeeding issue
subject to
where w Rd is the mass vector
C R+ is the regularization constant
is the mapping function that projects the training data into a suitable feature space so as
to allow non-linear decision surfaces.
The following algorithm shows the process of the SVM classifier to classify the malicious traffic
flows
Procedure: SVM Classification
The features of legitimate devices will have unique fitness values, whereas the features of compromised devices will deviate from others. Hence, the fitness function is applied to the
extracted features and the, features having the least fitness values are considered as anomalies. The corresponding device or user is fetched from the PCAP history and blocked from further operations.
4.EXPERIMENTAL RESULTS
4.1. Dataset and PDML
The training and testing were performed on the DARPA 2009 IDS dataset. Though the DARPA 2009 dataset is old and not specific to IoT, it is still used in some research for baseline benchmarking and method validation purposes. The dataset is well-structured and the labeled traffic is publicly available, allowing researchers to compare their results with a large number of previous studies that were conducted using the same dataset and its labeled traffic. In doing so, the DARPA 2009 dataset allows researchers to have a common dataset to refer to when evaluating a new algorithm against the established results from a long-standing body of intrusion detection research. Additionally, the DARPA 2009 dataset can also serve as a useful avenue for testing the general detection capacity of models as they are deployed and applied to more complex and heterogeneous datasets after testing them first under a controlled and wellestablished environment.
The dataset comprises about 7000 PCAP files. The dataset comprises a variety of security events and attack types. PCAP files are data files generated using tools such as Libpcap of Linux. These files contain packet data of a network and are used to analyze the network characteristics. They alsocontribute to controlling the network traffic and determining network status.
Wireshark can save network packet dissections in a PDML file. PDML conforms to the XML standard and contains details about the packet analysis.
4.2. Comparison with Existing Techniques
The proposed RF-SVM based anomaly detection framework is simulated in NS2 and compared with the existing Lightweight Anomaly Detection (LAD) [10] and Deep Migration Learning (DML) based IDS [11]. The performance is evaluated in terms of detection delay, detection accuracy, and packet drop. Table 3 shows the experimental parameters used in the simulation.
Table 3.Experimental parameters
A. Varying the Monitoring interval
In this first experiment we vary the simulation time as 20,40,60,80 and 100sec.
Table 4.Results of detection delay for various intervals
Figure 7.Detection Delay for various intervals
Table 4 and Figure 7 show the detection delay measured for RF-SVM, LAD and DML when the monitoring intervals flows are varied. As we can see from the figure, the delay of RF-SVM is 25% of lesser than DML and 27% lesser than LAD
Table 5.Results of detection accuracy for various intervals
Figure 8.Detection Delay for various intervals
Table 5 and Figure 8 show the detection accuracy measured for RF-SVM, LAD and DML when themonitoring intervals are varied. As we can see from the figure, the detection accuracy of RFSVM is 28% higher when compared to DML and 65% higher than LAD.
Table 6 .Results of packet drop for various intervals
Figure 9.Packet drop for various intervals
Table 6 and Figure 9 show the packet drop measured for RF-SVM, LAD and DML when the monitoring intervals are varied. As we can see from the figure, the packet drop of RF-SVM is 50% less when compared to DML and 93% lesser than LAD.
B. Based on Attack Frequency
In our second experiment we vary the frequency of attacks from 25Kb/s to 125Kb/s
Table 7 .Results of detection delay for various attack frequency
Figure 10.Detection delay for various attack frequency
Table 7 and Figure 10 show the detection delay measured for RF-SVM,LAD and DML when the attack frequencies are varied. As we can see from figure, the detection delay of RF-SVM is 27% lesser then DML and 44% lesser than LAD.
Table 8 .Results of detection accuracy for various attack frequency
Figure 11. Detection accuracy for various attack frequency
Table 8 and Figure 11 shows the detection accuracy measured for RF-SVM, LAD and DML when the attack frequencies are varied. As we can see from figure, the detection accuracy of RFSVM is 19% high when compared to DML and 42% higher than LAD.
Table 9 .Results of packet drop for various attack frequency
Figure 12. Packet drop for various attack frequency
Table 9 and Figure 12 show the packet drop measured for RF-SVM, LAD and DML when the flows attack frequencies are varied. We can see that, the packet drop of RF-SVM is 69% less when compared to DML and 92% lessthan LAD.
5.CONCLUSION
In this paper, we propose to develop a Random Forest-Support Vector Machine (RF-SVM) based Anomaly detection mechanism for IoT. The RF classifier is applied for selecting the optimal features from the extracted traffic data. It includes removing the outliers, redundant data, and choosing the best features with high weight values. Then, SVM is applied for classifying the extracted features and detecting the anomalies. The fitness function is derived in terms of true positives, false positives, and false negatives. From the detected anomalies, the attack type is then determined, and a corresponding warning is sent to the monitoring nodes.
The proposed RF-SVM attains the highest accuracy, precision, recall, and the F1-score when compared to these algorithms. The proposed RF-SVM-based anomaly detection framework is simulated in NS2 and compared with the existing LAD and DML-based IDS techniques. The performance is evaluated in terms of detection delay, detection accuracy, and packet drop. In the experimental results, it is shown that the proposed RF-SVM classifier attains increased detection accuracy with reduced packet drops.
CONFLICTS OF INTEREST
The authors have no competing interests to declare that are relevant to thecontent of this article.
There is no conflict of interest.
REFERENCES
[1] R. Stephen and L. Arockiam, “RIAIDRPL: Rank Increased Attack (RIA) Identification Algorithm for Avoiding Loop in the RPL DODAG,” Int. J. Pure Appl. Math., vol. 119, no. 16, 2018.
[2] F. Y. Yavuz, D. Ünal, and E. Gül, “Deep learning for detection of routing attacks in the Internet of Things,” Int. J. Comput. Intell. Syst., vol. 12, pp. 39–58, 2018.
[3] Z. A. Almusaylim, N. Z. Jhanjhi, and A. Alhumam, “Detection and mitigation of RPL rank and version number attacks in the Internet of Things: SRPL-RP,” Sensors, vol. 20, 2020.
[4] A. Aris, S. F. Oktug, and B. O. Yalcin, “RPL version number attacks: In-depth study,” in Proc. IEEE Conf., 2016.
[5] B. A. Alabsi, M. Anbar, S. Manickam, and O. E. Elejla, “DDoS attack aware environment with secure clustering and routing based on RPL protocol operation,” IET Circuits Devices Syst.,2019.
[6] M. Hasan, M. M. Islam, M. I. Islam Zarif, and M. M. A. Hashem, “Attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches,” Internet Things, vol. 7, 2019.
[7] G. Thamilarasu and S. Chawla, “Towards deep-learning-driven intrusion detection for the Internet of Things,” Sensors, 2019.
[8] R. Doshi, N. Apthorpe, and N. Feamster, “Machine learning DDoS detection for consumer Internet of Things devices,” arXiv:1804.04159v1 [cs.CR], 2018.
[9] H. Sedjelmaci, S. M. Senouci, and M. Al-Bahri, “A lightweight anomaly detection technique for low-resource IoT devices: a game-theoretic methodology,” in IEEE Int. Conf. Commun. (ICC), Mobile and Wireless Networking Symp.,2016.
[10] D. Lia, L. Deng, M. Lee, and H. Wang, “IoT data feature extraction and intrusion detection system for smart cities based on deep migration learning,” Int. J. Inf. Manag., vol. 49, pp. 533–545, 2019
[11] F. Huang, J. Shen, Q. Guo, and Y. Shi, “eRFSVM: A hybrid classifier to predict enhancers—
integrating random forests with support vector machines,” Hereditas, 2016.
[12] M. Shanmugasundari and R. K. Nayak, “Master card anomaly detection using random forest and support vector machine algorithms,” J. Crit. Rev., vol. 7, no. 9, 2020.
[13] S. Prithi and S. Sumathi, “Intrusion detection system using hybrid SVM-RF and SVM-DT in wireless sensor networks,” Int. J. Recent Technol. Eng., vol. 8, no. 2S8, 2019.
[14] M. A. M. Hasan, M. Nasser, B. Pal, and S. Ahmad, “Support vector machine and random forest modelling for intrusion detection system (IDS),” J. Intell. Learn. Syst. Appl., vol. 6, no. 1, Feb. 2014.
[15] Z. Rustam, E. Sudarsono, and D. Sarwinda, “Random-Forest (RF) and support vector machine (SVM) implementation for analysis of gene expression data in chronic kidney disease (CKD),” in Proc. 9th Annu. Basic Sci. Int. Conf., 2019.
[16] A. Diro et al., “A comprehensive study of anomaly detection schemes in IoT networks using machine learning algorithms,” Sensors, vol. 21, no. 24, Art. no. 8320, 2021.
[17] Pughazendi N, Valarmathi K, Rajaraman PV, Balaji S. RETRACTED: Reliable cluster based data collection framework for IoT-big data healthcare applications. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology. 2023;0(0). doi:10.3233/JIFS-233505.
[18] B. Duraisamy, S. Gopalakrishnan, S.-Y. Hsieh, and S.-L. Peng, Intelligent Computing and Innovation on Data Science: Proceedings of ICTIDS 2021. Springer, 2021.