Automated Microsegmentation for Lateral Movement Prevention in Industrial Internet of Things (IIoT)

The integration of IoT networks with Operational Technology (OT) networks is increasing rapidly. However, incorporating IoT devices into the OT network makes the industrial control system vulnerable to various cyber threats. By hacking an IoT device at the network edge, an attacker can move laterally to compromise the main control server and manipulate the whole control system of the industrial infrastructure. In this paper, we propose an automated micro-segmentation (MS) model based on Machine Learning (ML) algorithms to reduce the lateral movement of an attacker or malware. The proposed model generates micro-segments based on network traffic and blocks malicious traffic at each segment. We use the UNSW-NB15 and IoTID20 datasets for our experiments. Experimental results show that, after generating micro-segments and separating the normal traffic, the model limits redundant links and blocks malicious traffic. Limiting the use of redundant links reduces lateral movement and the spread of malware. We also use the deterministic epidemic model to analyze the device infection rate due to lateral movement or malware propagation.


3rd Sergei Petrovski
School of Electric Stations, Samara State Technical University
Samara, Russian Federation
petrovski.sv@samgtu.ru

978-1-7281-9266-6/21/$31.00 ©2021 IEEE

I. INTRODUCTION
Information technology (IT) is the application of computers, networking devices, and communication technologies for collecting, processing, storing, and communicating digital data [1]. In contrast, OT involves industrial infrastructures, which use SCADA or control system networks for directly monitoring and controlling industrial equipment [2]. Unlike IT networks, the OT network, which includes devices such as Programmable Logic Controllers (PLCs), has power constraints, slower processing capability, low memory, and a much longer upgrade cycle [3]. The integration of the Industrial Internet of Things (IIoT) into the industrial manufacturing environment converges the IT and OT networks. This convergence offers various benefits, including improved safety, increased productivity, efficiency, and predictive maintenance [4].
Along with these benefits, the convergence of IT and OT networks introduces severe security risks. Due to its connection with the IIoT network, the OT network becomes accessible through the Internet [5]. Moreover, OT devices such as PLCs and other controllers were not designed with security vulnerabilities in mind [5]. An attacker can gain access to the OT network by bypassing the IoT network using lateral movement. Lateral movement, or east-west traffic, enables an attacker to compromise the entire network, including internal servers and other devices [6]. Such a compromise of controlling devices may result in massive damage in the industrial domain. For instance, in February 2021 an attacker took control of the main server of the water treatment plant in Oldsmar, Florida, USA [7]. Having taken control, the attacker abnormally increased the amount of NaOH in the water, which may cause vision problems, pain, and shock if consumed. Securing the IoT devices may prevent lateral movement. However, the IoT network is vulnerable to various security threats [8], and these vulnerabilities create loopholes for lateral movement. Moreover, replacing the cloud network with edge devices removes centralized control over the IoT devices. An attacker can hack or steal IoT devices at the network edge and inject malware. Without special security measures, any device in the network can access any other device, as in a mesh topology [9], which enables the malware to reach anywhere in the network. This malware enables an attacker to compromise the internal servers. Therefore, securing the IoT network to prevent lateral movement has become indispensable. However, according to a survey, 99% of security professionals struggle to secure IoT devices and face challenges in applying security patches through firmware updates [10].
Network MS is a promising way to prevent lateral movement throughout the IoT network. MS prevents lateral movement and reduces the attack surface by splitting a large network into several smaller network segments [11]. The access of each device in a micro-segment is then restricted to the segment perimeter by imposing specific security rules. Therefore, the devices within a micro-segment cannot communicate with devices outside their restricted perimeter. Restricting access in this way can confine malware or an attacker within the segment and reduce further movement outside the compromised device's segment. Although MS is widely applied to secure the cloud and server workloads [11], it is challenging for IoT networks for several reasons. First, the IoT network is large and dynamic, which makes it difficult to identify proper segments. Second, it is difficult to periodically maintain and update a large number of micro-segments and their security rules. Intelligent algorithms can be used to take over these tedious jobs of maintaining MS and security policies for IoT networks.
In this work, we propose an automated MS procedure and security rule generation for each segment based on ML algorithms. The micro-segments are generated through the OPTICS clustering algorithm. A Decision Tree (DT) classification algorithm is then used to separate malicious network traffic from legitimate traffic. These traffic data are then used to generate a packet-filtering policy.
In Section II, we discuss related work. Section III presents the system model, including the network model, the threat model, and the proposed MS process. Section IV demonstrates the experimental results. In Section V, we analyze the security enhancement provided by MS, and finally, Section VI concludes this paper with future work.

II. LITERATURE REVIEW
Very few research works have addressed the prevention of lateral movement in the IoT network domain; some related studies are discussed in this section. The authors in [6] proposed a micro-segmentation technique based on an edge-cloud architecture for smart home IoT networks, using OpenFlow rules. The proposed model blocks attackers from accessing the LAN and WAN of the smart home IoT network. However, the OpenFlow rules are static and need to be updated manually. Also, the approach applied to smart homes is not suitable for large-scale and dynamic IIoT networks.
The authors in [12] proposed a lateral movement detection technique based on evidence reasoning for the cloud-edge environment. The authors also introduced a vulnerability correlation process into lateral movement detection. However, this model is not appropriate for networks that replace the cloud architecture with edge computing devices only.
A micro-segmentation technique based on the K-means clustering algorithm is proposed in [13] for enterprise networks. However, the K-means algorithm requires the number of clusters to be defined in advance, which is impractical for a large-scale network such as an industrial IoT or other sensor network.
The MITRE ATT&CK framework also takes lateral movement into consideration. MITRE ATT&CK can be defined as the set of individual techniques performed by an attacker to accomplish malicious tasks. It was shown in [14] that MITRE ATT&CK encompasses 440 attack techniques belonging to 27 different tactics. These malicious activities may include gaining access to the IoT network through phishing links and then compromising other devices through lateral movement. Furthermore, a public repository (referred to as the MITRE ATT&CK Framework) is available, which contains adversary tactics, techniques and procedures based on real-world observations [15]. This publicly available knowledge base provides a rich resource for the development of specific attack detection, prediction and mitigation models.

III. SYSTEM MODEL
A. Network Model
In a traditional OT network such as SCADA, all the data are collected and analyzed in a centralized server. The IIoT network improves on the SCADA network by introducing edge devices at the network edge. Figure 1 shows the IIoT and edge-enabled SCADA network, where edge devices are connected with the RTUs. These edge devices receive data from the sensors, which are connected with the industrial equipment. After receiving the data, the edge devices process them and provide real-time decisions for maintaining the industrial machinery. An administrator can control the whole network from the control centre and send commands through the RTUs. The data are also stored in the central servers for future analysis and optimization [16]. This integration of IIoT and edge devices enables administrators to monitor and control the industrial control system remotely.

B. Threat Model
In this work, we consider the threat posed by an attacker's lateral movement. Advanced Persistent Threats (APTs) [17] are severe and long-lasting cyber attacks, in which lateral movement is the attack phase where the attacker moves from the compromised devices to other devices [18] [19]. APTs are prolonged, stealthy attacks that aim at the theft of intellectual property or espionage rather than immediate financial gain [20] [21]. To take control of the main server of the industrial control system, the attacker moves deeper inside the network after hacking an IoT device. The attacker can therefore gradually compromise the whole network, which may result in a devastating situation. Moreover, an internal employee may intentionally try to compromise a device to achieve a malicious goal.
C. Background on ML algorithm
1) OPTICS Clustering algorithm: OPTICS is an upgraded version of the DBSCAN algorithm. It was demonstrated in [22] that DBSCAN performs well in clustering network traffic compared to other models. However, unlike DBSCAN, OPTICS is better suited to large-scale datasets [23] and does not require the epsilon parameter (which demands domain knowledge). For these reasons, we chose the OPTICS clustering algorithm.
2) DT algorithm: A DT is a supervised classification technique in which internal nodes represent the features of the traffic data (for instance, IP address or Flow ID), branches represent the decision rules, and leaf nodes represent the outcomes (Malicious or Normal). The algorithm uses feature selection measures such as information gain or the Gini index to select the best features for the root node and the internal nodes. Information gain (IG), which tells us how much information a feature provides about a class, can be defined as in equation (1) [24]:
$$IG(S, A) = E(S) - \sum_{i=1}^{n} \frac{|S_i|}{|S|} E(S_i) \qquad (1)$$
where n is the number of partitions induced by attribute A, |S_i| is the number of cases in partition S_i, |S| is the total number of cases, and E is the entropy, defined as:
$$E(S) = -\sum_{j=1}^{c} p_j \log_2 p_j \qquad (2)$$
where c is the number of classes and p_j is the proportion of cases in S that belong to class j.

D. Proposed MS process
In this subsection, we discuss the MS generation process using the ML algorithms introduced in the previous subsection. Figure 2 shows the proposed MS creation model based on ML algorithms. As shown in [11], MS implementation consists of several steps.
First, we need to identify and group the devices that show similar functionality or behavior. Here, we use the similarity of traffic data to group the IoT devices through the OPTICS clustering algorithm. Each group of IoT devices then works as a micro-segment.
After the groups of similar devices are generated, the traffic of each group is classified as malicious or normal in order to create the security policies. For the classification task, we use the DT classifier. After classification, the algorithm looks for IoT devices with multiple connections and restricts access to the redundant links, leaving one link for each IoT node. Upon failure of the current link, the algorithm makes one of the restricted links available for use. This also results in blocking the malicious traffic.
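The link-restriction step described above can be illustrated with a minimal sketch. It assumes a hypothetical selection policy that the text does not specify: each non-gateway device keeps its link to the gateway active when one exists and holds its remaining links in standby for failover. The function names `restrict_links` and `fail_over` are illustrative only.

```python
from collections import defaultdict
from itertools import combinations


def restrict_links(devices, links, gateway):
    """Keep one active link per non-gateway device; park the rest as standby.

    Assumed policy (not from the paper): prefer the link to the gateway.
    Returns (active, standby), mapping device -> peer and device -> peers.
    """
    neighbours = defaultdict(list)
    for a, b in links:
        neighbours[a].append(b)
        neighbours[b].append(a)

    active, standby = {}, {}
    for device in devices:
        if device == gateway or not neighbours[device]:
            continue
        # Gateway first (False sorts before True), then alphabetical order.
        peers = sorted(neighbours[device], key=lambda p: (p != gateway, p))
        active[device] = peers[0]
        standby[device] = peers[1:]
    return active, standby


def fail_over(device, active, standby):
    """Promote the first standby link of `device` after its active link fails."""
    if not standby[device]:
        raise RuntimeError(f"{device} has no standby link left")
    active[device] = standby[device].pop(0)
    return active[device]


if __name__ == "__main__":
    devices = ["D1", "D2", "D3", "D4", "D5"]
    full_mesh = list(combinations(devices, 2))
    active, standby = restrict_links(devices, full_mesh, gateway="D5")
    allowed = {frozenset(pair) for pair in active.items()}
    print(len(full_mesh), "links before,", len(allowed), "links after")
```

Under this assumed policy, the allowed links form a star around the gateway, and the standby lists preserve the redundancy needed when an active link fails.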

IV. EXPERIMENTAL RESULTS
A. Dataset
For the experiments, we used the UNSW-NB15 [25] and IoTID20 [26] datasets. These datasets contain various features of network traffic, including anomalous and normal data. The UNSW-NB15 dataset contains 48 features of the network traffic. The last feature of this dataset is the class label, which is either 0 for normal or 1 for malicious traffic. The IoTID20 dataset comprises 80 network features, including three class labels. The Normal and Anomaly classes are subdivided based on various cyber attacks.

B. Data Pre-processing
Before applying the ML models to the datasets, we performed data pre-processing. First, the categorical features were encoded using the LabelEncoder() function and normalized using the StandardScaler() function. Both datasets are high dimensional. Therefore, we conducted a correlation analysis, using 0.95 as the correlation threshold, and found that 8 pairs of features are highly correlated with each other in the UNSW-NB15 dataset. However, among these 8 pairs, only the ('swin', 'dwin') and ('Stime', 'Ltime') pairs showed 100% correlation. Therefore, the 'swin' and 'Stime' features were dropped from the UNSW-NB15 dataset. From the IoTID20 dataset, in contrast, 21 highly correlated features that showed 100% correlation were dropped.
To further reduce the dimensions of the datasets, we applied Principal Component Analysis (PCA). From PCA, we found that the first 30 principal components are sufficient to represent the overall information of the UNSW-NB15 dataset. For the IoTID20 dataset, the first 20 principal components are adequate. However, we did not apply PCA for the DT classifier.

C. Training and testing
The OPTICS clustering algorithm and the DT classifier are implemented using Python's scikit-learn library. We randomly took 1000 samples from each dataset for the OPTICS clustering operations, since our experimental configuration could not cluster the entire dataset. We set min_samples=2, max_eps=np.inf, metric='chebyshev', cluster_method='xi' as the parameters of the OPTICS clustering method. We found that OPTICS yields good results with the 'chebyshev' distance metric.
For the classification algorithm, on the other hand, our environment supported the entire dataset. We split the entire dataset in a 70:30 ratio for training and testing the DT classifier.

D. Results
After running the OPTICS clustering algorithm, we obtained 178 clusters (micro-segments) for the UNSW-NB15 dataset and 295 clusters (micro-segments) for the IoTID20 dataset, based on the 1000 random samples of each dataset. Table I shows the clustering results.
After training and testing the DT classifier on both datasets, we computed the Accuracy, Sensitivity and Specificity metrics. Table II shows the evaluation results. From this table, we can see that the DT classifier performed similarly on both datasets in terms of Accuracy and Specificity, but the Sensitivity for the UNSW-NB15 dataset is slightly lower than that for the IoTID20 dataset. Figures 3 and 4 show the confusion matrices of the DT classifier for the IoTID20 and UNSW-NB15 datasets, respectively.
We then used this trained DT classifier to differentiate between normal and malicious traffic in each cluster or micro-segment (as depicted in Figure 2). Table III shows the security policies for a security group generated by the clustering algorithm. The MS model with the DT classifier blocks traffic generated outside a security perimeter from entering the micro network enclosed by that perimeter. Any malicious traffic is also blocked automatically. A single device is thus restricted from accessing the redundant links (Section V explains this in more detail).
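The pre-processing, clustering and classification steps described in this section can be sketched end to end as follows. This is a minimal illustration on synthetic data, not on UNSW-NB15 or IoTID20: the feature layout, the labelling rule, the three principal components and the helper names `drop_correlated` and `run_sketch` are all assumptions made for the example. Only the OPTICS parameters, the 0.95 correlation threshold and the 70:30 split come from the text. Sensitivity and specificity use their standard definitions, TP/(TP+FN) and TN/(TN+FP).

```python
import numpy as np
from sklearn.cluster import OPTICS
from sklearn.decomposition import PCA
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier


def drop_correlated(X, threshold=0.95):
    """Drop the first feature of each pair whose |Pearson r| exceeds threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    dropped = set()
    for i in range(corr.shape[0]):
        for j in range(i + 1, corr.shape[0]):
            if i not in dropped and j not in dropped and corr[i, j] > threshold:
                dropped.add(i)
    kept = [k for k in range(corr.shape[0]) if k not in dropped]
    return X[:, kept], kept


def run_sketch(n_samples=300, seed=0):
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for flow records (hypothetical, not a real dataset).
    proto = rng.choice(["tcp", "udp", "icmp"], size=n_samples)
    numeric = rng.normal(size=(n_samples, 5))
    duplicate = 2.0 * numeric[:, 0] + 1.0  # perfectly correlated column
    y = (numeric[:, 1] + numeric[:, 2] > 0.5).astype(int)  # 1 = malicious

    # Encode the categorical column, then standardise every feature.
    X = np.column_stack([LabelEncoder().fit_transform(proto), numeric, duplicate])
    X = StandardScaler().fit_transform(X.astype(float))
    X, kept = drop_correlated(X, threshold=0.95)

    # Clustering branch: PCA, then OPTICS with the parameters quoted in the text.
    X_pca = PCA(n_components=3).fit_transform(X)
    segments = OPTICS(min_samples=2, max_eps=np.inf, metric="chebyshev",
                      cluster_method="xi").fit_predict(X_pca)

    # Classification branch: no PCA, 70:30 split, decision tree.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    clf = DecisionTreeClassifier(random_state=seed).fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(
        y_te, clf.predict(X_te), labels=[0, 1]).ravel()
    return {
        "kept": kept,
        "segments": segments,  # label -1 marks points OPTICS treats as noise
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    result = run_sketch()
    n_segments = len(set(result["segments"]) - {-1})
    print("micro-segments found:", n_segments)
    print({k: round(float(result[k]), 3)
           for k in ("accuracy", "sensitivity", "specificity")})
```

Each non-negative OPTICS label plays the role of one micro-segment, and the trained tree can then be applied to the traffic of each segment separately.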

V. SECURITY ANALYSIS
If no restriction is imposed explicitly, an IoT device can communicate with multiple other devices, as in a mesh topology [9]. Multiple links therefore help malware to spread more rapidly within a network, since attackers get more paths to move laterally within the network and compromise the devices. It is not acceptable to block all the redundant links of an IoT device, since IoT devices must communicate through other available links if the current link fails. However, we can control the number of links to reduce the spreading of malware through lateral movement. MS has the potential to minimize the spreading of malware over the network by imposing specific security policies. In this section, we theoretically analyze the effectiveness of MS in terms of reducing malware dissemination.
As an example, let us consider a segment of the IoT network shown in Figure 5, where D5 is the gateway node and D1, D2, D3 and D4 are the sensor nodes. The devices are connected in a mesh topology. Without any specific security measures, the malware may spread through all the links. The number of links in this mesh topology is $N(N-1)/2 = 10$ for $N = 5$ devices. Suppose that, under MS, device D1 communicates only through the link numbered 2. Therefore, MS restricts access to the links numbered 1, 3 and 4 for device D1. Similarly, after restricting all the redundant links of the other devices, the total number of allowed links in this security group is reduced from 10 to 4.
Now, we can use the deterministic epidemic model to determine the IoT device infection rate with and without MS. The epidemic model can be defined as [27]:
$$\frac{dI(t)}{dt} = \beta I(t)\,[N - I(t)] \qquad (3)$$
where $\beta$ is the infection rate, which is constant for a specific malware, N is the total number of devices, and I(t) is the number of infected devices at time t. However, from the above analysis, we can see that the parameter $\beta$ is proportional to the number of links in the IoT network for any malware: as the number of links increases, the probability of device infection also rises. Therefore, we treat $\beta$ as the link parameter. A closed-form solution of the epidemic model is also given in [27] as:
$$I(t) = \frac{I(0)\,N}{I(0) + [N - I(0)]\,e^{-\beta N t}} \qquad (4)$$
where I(0) is the number of devices infected at t = 0. Figure 6 (log plot) shows the device infection rate of the mesh network shown in Figure 5 with I(0) = 1, t = 15 time units, $\beta$ = 10 without MS and $\beta$ = 4 with MS, and N = 5. We can see that the device infection rate is higher without MS than with MS. Therefore, it is evident that MS reduces the device infection rate by limiting lateral movement (at t = 14, almost 3 devices are infected without MS but only 2 devices are infected with MS). The device infection rate increases exponentially according to equation (4). Therefore, if we consider a large network instead of the simple mesh network depicted in Figure 5, the difference between the two curves shown in Figure 6 will increase. After applying MS, it takes more time to move from the compromised device to the internal nodes. Therefore, the administrator will be able to identify and revoke the compromised devices before the attacker takes control of the main server or device.

VI. CONCLUSION
In this work, we have proposed an automated MS model based on the OPTICS clustering algorithm and a DT classifier for preventing lateral movement in IIoT. We have considered ML algorithms to automate the micro-segmentation process, since it is difficult and tedious to maintain micro-segmentation for large-scale IoT networks. We have considered the network traffic to find and group similar IoT devices using the OPTICS clustering algorithm.
The IoT devices that produce similar traffic information can be grouped together. We then trained a DT classifier and used the resulting DT model to separate normal traffic from malicious traffic. The model restricts access to the redundant links of each IoT device, which reduces the spreading of malware. MS also reduces the lateral movement of an attacker or malware over the entire IIoT network by imposing security rules. Furthermore, we analyzed the effectiveness of MS in the IIoT network and showed that MS reduces the device infection rate.
However, in the security analysis section only a static mesh topology of IoT devices is considered. In reality, the IoT network is more complex, heterogeneous, and dynamic. Therefore, in future work, we will apply statistical distributions to model the dynamic nature of large-scale IIoT networks. We also intend to integrate a malware detection model with the MS process to identify and revoke infected devices before an extensive portion of the network becomes compromised through lateral movement. We also believe our work will open the door to further experiments on lateral movement prevention using ML in IoT networks.
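As a numerical companion to the security analysis of Section V, the effect of the link parameter on the infection curve can be explored with a short sketch. It assumes the classical simple epidemic form dI/dt = βI(N − I) and its logistic closed-form solution; this is an assumption about the model, and the exact form used in [27] may differ. The values β = 10 and β = 4, N = 5 and I(0) = 1 are those quoted in the text, while the time points are arbitrary and the sketch is not meant to reproduce Figure 6. The function names are illustrative.

```python
import math


def mesh_links(n_devices: int) -> int:
    """Number of links in a full mesh of n_devices nodes: n(n-1)/2."""
    return n_devices * (n_devices - 1) // 2


def infected(t: float, beta: float, n_devices: int, i0: float = 1.0) -> float:
    """Closed-form solution of dI/dt = beta * I * (N - I) with I(0) = i0.

    Assumed model: the classical simple (logistic) epidemic equation.
    """
    decay = math.exp(-beta * n_devices * t)
    return i0 * n_devices / (i0 + (n_devices - i0) * decay)


def infected_euler(t_end, beta, n_devices, i0=1.0, steps=100_000):
    """Forward-Euler integration of the same ODE, used only as a sanity check."""
    dt = t_end / steps
    current = i0
    for _ in range(steps):
        current += dt * beta * current * (n_devices - current)
    return current


if __name__ == "__main__":
    n = 5
    print("links in a full mesh of", n, "devices:", mesh_links(n))
    for label, beta in (("without MS", 10.0), ("with MS", 4.0)):
        curve = [round(infected(t, beta, n), 3) for t in (0.0, 0.02, 0.05, 0.1)]
        print(label, curve)
```

Under this assumed model, a smaller link parameter delays saturation of the curve, which is the qualitative behaviour the analysis attributes to MS.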