Sequence-based detection of sleeping cell failures in mobile networks

This article presents an automatic malfunction detection framework based on data mining approach to analysis of network event sequences. The considered environment is long term evolution (LTE) of Universal Mobile Telecommunications System with sleeping cell caused by random access channel failure. Sleeping cell problem means unavailability of network service without triggered alarm. The proposed detection framework uses N-gram analysis for identification of abnormal behavior in sequences of network events. These events are collected with minimization of drive tests functionality standardized in LTE. Further processing applies dimensionality reduction, anomaly detection with K-Nearest Neighbors, cross-validation, postprocessing techniques and efficiency evaluation. Different anomaly detection approaches proposed in this paper are compared against each other with both classic data mining metrics, such as F-score and receiver operating characteristic curves, and a newly proposed heuristic approach. Achieved results demonstrate that the suggested method can be used in modern performance monitoring systems for reliable, timely and automatic detection of random access channel sleeping cells.


Introduction
Modern cellular mobile networks are becoming increasingly diverse and complex, due to coexistence of multiple Radio Access Technologies (RATs), and their corresponding releases. Additionally, small cells are actively deployed to complement the macro layer coverage, and this trend will only grow. In the future this situation is going to evolve towards even higher complexity, as in 5th Generation (5G) networks there will be much more end-user devices, served by different technologies, and connected to cells of different types [21,33,59,62,63]. New applications and user behavior patterns are daily coming into play. In such environment, network performance and robustness are becoming critical values for mobile operators. In order to achieve these goals, efficient flow of Quality and Performance Management (QPM) [36], which is a sequence of fault detection, diagnosis and healing, should be developed and applied in the network in addition to other optimization functions.
The concept of self-organizing network (SON) [57,58] has been proposed to automate and optimize the most tedious manual tasks in mobile networks, including QPM. Automation is the key idea in SON, and it has been proposed for self-configuration, self-optimization and self-healing in LTE and UMTS networks [27,36,68]. In traditional systems, detection, diagnosis and recovery of network failures is a mostly manual task, heavily based on pre-defined thresholds, aggregation and averaging of large amounts of performance data, the so-called Key Performance Indicators (KPIs). Self-healing [32,67] automates the functions of the QPM process to improve reliability of network operation. However, self-healing is still among the least studied functions of SON, and the developed solutions and use cases require improvement prior to application in real networks. This is especially important for non-trivial network failures such as the sleeping cell problem [13,14,36]. This is a special term used to denote a breakdown which causes partial or complete degradation of network performance, and which is hard to detect with conventional QPM within reasonable time. In the research and standardization community, automatic fault detection and diagnosis functions, enhanced with the most recent advancements in data analysis, are seen as the future of self-healing. Thus, development of improved self-healing functions for detection of sleeping cell problems through application of anomaly detection techniques is of high importance. This article presents a novel framework based on N-gram analysis of MDT event sequences for detection of random access channel sleeping cells.
The rest of this paper is organized as follows. Section 2 describes common practices of quality and performance management in mobile networks, including MDT functionality, and advanced methods based on knowledge mining algorithms. Section 3 defines the concept of sleeping cell and its possible root cause failures. In Sect. 4 the simulation environment, assumptions and the random access channel problem are presented. Section 4 also describes the generated and analyzed MDT performance data. Section 5 concentrates on the suggested knowledge mining framework for sleeping cell detection. It includes an overview of the applied methods: N-gram analysis, minor component analysis, K-NN anomaly outlier scores, postprocessing and data mining performance evaluation techniques. Section 6 is devoted to the actual research results. Data structures at different stages of analysis are shown, and the efficiency of different postprocessing methods is compared. In Sect. 7 the concluding remarks regarding the findings of the presented research are given.

Quality and performance management in cellular mobile networks
Performance management in wireless networks includes three main components: data collection, analysis and results interpretation. Data gathering can be done either by aggregation of cell-level statistics, i.e. collection of KPIs, or by collection of detailed performance data with drive tests. The main weakness of KPI analysis is that a lot of statistics is left out at the aggregation stage, due to averaging over time and network elements, and because fixed threshold values are applied. Even though drive test campaigns provide far more elaborate information regarding network performance, they are expensive to carry out and do not cover the whole area of network operation. Root cause analysis is done manually in the majority of cases, and because of that there is room for more intelligent approaches to detection and diagnosis of network failures, e.g. with data mining and anomaly detection techniques. This would make it possible to automate the performance monitoring task further.

Minimization of drive tests
Yet another way to improve network QPM is to collect a detailed performance database. This is enabled by the MDT functionality standardized in the 3rd Generation Partnership Project (3GPP) [28]. MDT is designed for automatic collection and reporting of user measurements, complemented with location information where possible. Collected data is reported to the serving cell, which in turn sends it to the MDT server [39]. Thus, a large amount of network and user performance data is available for analysis. This is where the power of data mining and anomaly detection can be applied. The specification describes several use cases for MDT: improvement of network coverage, capacity, mobility robustness and end user quality of service [36]. According to the standard, MDT measurements and reporting can be done in both idle and connected Radio Resource Control (RRC) modes. In logged MDT, the User Equipment (UE) stores measurements in memory, and reporting is done at the next transition from idle to connected state. In immediate MDT, measurements are reported through the existing connection as soon as they are taken. In turn, there are two measurement modes in immediate MDT: periodic and event-triggered [39]. Periodic measurements are very useful for coverage and capacity verification at initial network deployment, as they provide a detailed map of network performance, for example in terms of signal propagation or throughput. The main disadvantage of periodic measurements is that they consume a lot of network and user resources. In contrast, the event-triggered approach provides less information regarding the network status, but can be very efficient for mobility robustness and resource savings. In our study, immediate event-triggered MDT is used for collection of the performance database. Table 1 presents the list of network events which triggered MDT measurements and reporting.

Location estimation in MDT
One of the important features of MDT is collection of geo-location information at the moments of measurement. Whenever the UE location is provided in an MDT report, there are several ways to associate it with a particular cell: serving cell ID, dominance maps and a new approach based on target cell ID information.
Serving cell ID is available with the MDT event-triggered report, even for early releases of LTE. However, in case of a coverage hole or problems with new connection establishment, this approach can lead to mistakes in UE location association, because in the worst case the faulty cell would never become the serving cell. This limits the usage of the serving cell method for sleeping cell detection. To overcome this problem, the dominance map method can be used. A dominance map shows the E-UTRAN Node B (eNB) with the strongest radio signal at each point of the network, see Fig. 1. The main advantage of dominance maps is that mapping of cell ID to the location coordinates of a UE MDT measurement is very precise, and this results in higher accuracy of sleeping cell detection. The downside of the dominance map approach is that it requires a lot of detailed input measurement information. However, MDT functionality is one of the efficient ways to create such maps [25].
The last method for cell ID and UE report location association uses the target cell ID feature. This information is available in the network events A3 RSRP, HO COMMAND, HO COMPLETE and RLF REESTABLISHMENT. The strong side of this method is that detection of the sleeping cell becomes possible with a very limited amount of information, as shown in Sect. 6.
The key aspects which should be taken into account when selecting a location association method are accuracy and amount of information to create mapping between cell and user location.

Advanced data analysis approaches in QPM
Studies in advanced data analysis for QPM can be divided into several groups. In certain studies, the data reported by the users is used for the analysis. For instance, in [55] the authors suggest a method for detection of sleeping cells caused by a transmitted signal strength problem, on the basis of neighbor cell list information. Application of non-trivial preprocessing and different classification algorithms made it possible to achieve relatively good accuracy in detection of cell hardware faults. However, the proposed anomaly detection system is prone to a relatively high false alarm rate. In [71] a method based on analysis of TRACE-based user data with diffusion maps is presented. More extensive application of diffusion maps for network performance monitoring can also be found in [49].
Even though user-level statistics is more detailed, the majority of studies devoted to improvement of QPM still rely on cell-level data. The first proposals of sleeping cell detection automation using statistical methods of network monitoring are presented in [13,14]. Preparation of a normal cell load profile and evaluation of the deviation in observed cell behavior is suggested as a way to identify problematic cells. The idea of the statistical approach has been further studied in [61,70], where a profile-based system for performance monitoring is proposed. The strong side of this study is that real data from a 3rd Generation (3G) network has been analyzed. Moreover, a complete system for fault detection and diagnosis is developed. However, the disadvantage of the proposed method is the substantial time needed for training the algorithm (about 4 days of observation), and the necessity to manually input diagnosis options. In [60], the latter drawback is overcome using the Kolmogorov-Smirnov two-sample test [48] for automatic creation of the diagnosis profile database. Bayesian networks have also been applied for diagnosis and root cause probability estimation, given certain KPIs [3-5,51]. The complications here are preparation of a correct probability model and appropriate KPI threshold parameters. More advanced data mining methods are applied to analysis of cell-level performance statistics, and a novel method of using an ensemble of classification algorithms is proposed [17,20]. The idea is to use multiple algorithms for fault detection. At the training phase classification is done with real, manually labeled data, and the best performing methods are prioritized with higher weight. One of the core drawbacks of this approach is that a rather extensive set of data is needed to achieve reliable detection. Data collection has been done for 2 months of network operation and 12 KPIs are observed. Labeling of the collected dataset is also a tedious manual task. In [18] application of classification and clustering methods for detection and diagnosis of strangely behaving network regions is presented. For this study a huge data collection campaign has been done: 4000 cells have been monitored for over 2.5 months, and 11 KPIs gathered. The authors manage to create a complete detection and diagnosis system. The largest achievement of this study is that no training or error-free data is needed to find the anomalous/problematic cells. However, the critical question is the applicability of the presented method with a smaller input data set, both in terms of geographical scope and time scale. The continuation study [19] makes an initial attempt to address changes in the network behavior, through adjustment of data mining model parameters on the fly. Some studies also consider neural network algorithms for detection of malfunctions [53,65].
Table 1 (excerpt) Network events which triggered MDT measurements and reporting: HO COMMAND, handover command received [69]; HO COMPLETE, handover complete received [69].
Among the drawbacks of the reviewed studies on advanced performance monitoring is that collection of appropriate statistical base takes substantial amount of time (from days to months). This increases reaction time in case of outages and does not completely solve the problems of operators in optimization of their QPM. For some of the proposed methods a very large geographical scope of data collection is also required.
In order to overcome the weaknesses of the traditional QPM systems and the advanced approaches described above, we propose a sequence-based analysis method. The scope of our study is concentrated on the analysis of user-level data collected with immediate MDT functionality [40,46]. Efficient detection and localization of the faulty cell is achieved through application of a knowledge mining framework based on N-gram analysis. Data collection for the training and testing phases of the framework can be done within minutes of network operation. This becomes possible by configuring and running a compact management-based MDT campaign. Overall detection, together with the initial learning stage, takes in the order of tens of minutes rather than days or even weeks. Subsequent detection, where training is not involved, should be even faster.
In the early works cell outage detection caused by signal strength problems (antenna gain failure) is studied [10,11,72]. This area matches the 3GPP use case called ''cell outage detection'' [32]. Identification of the cell, in malfunction condition is done by means of analysis of numerical properties of multidimensional dataset. Each data point represents either periodic or event-triggered user measurement. Such methods as diffusion maps dimensionality reduction algorithm, k-means clustering and k-nearest neighbor classification methods are applied.
To increase robustness of the proposed solutions in MDT data analysis and make the developed detection system suitable for application in real networks, a more sophisticated experimental setup is considered. Sleeping cell caused by malfunction of random access channel, discussed in Sect. 3, does not produce coverage holes from perspective of radio signal, but still makes service unavailable to the subscribers. This problem, which is an instance of physical channel malfunction, is considered to be one of the most complex for mobile network operators, as detection of such failures may take days or even weeks, and negatively affects user experience [36]. To make fault detection framework more flexible and independent from user behavior, such as variable mobility and traffic variation, analysis of numerical characteristics of MDT data is substituted with processing of network event sequences with N-gram method. Network events can include different mobility or signaling related nature, such as A2, A3 or handover complete message [44]. Initial results in this area are presented in [12].

Sleeping cell problem
Sleeping cell is a special kind of cell service degradation, which leads to network performance decrease, invisible for the operator, but affecting user Quality of Experience (QoE). On one hand, detection of sleeping cell problem with traditional monitoring systems is complicated, as in many cases KPI thresholds do not indicate the failure. On the other hand fault identification can be very sluggish, as creation of cell behavior profile requires long time, as it is discussed in the previous section. Regular, less sophisticated types of failures usually produce cell level alarms to performance monitoring system of mobile network operator. In contrast, for sleeping cells degradation occurs seamlessly and no direct notification is given to the service provider.
In general, any cell can be called degraded if it is not 100% functional, which negatively affects user experience. Sleeping cells are classified by the extent of performance degradation, from the lightest to the most severe [13,15]: impaired or deteriorated, with the smallest negative impact on the provided service; crippled, characterized by severely decreased capacity; and catatonic, a kind of outage which leads to complete absence of service in the faulty area, so that the cell does not carry any traffic.
Degradation can be caused by malfunction of different hardware or software components of the network. Depending on the failure type, a different extent of performance degradation can be induced. In this study the considered sleeping cell problem is caused by Random Access Channel (RACH) failure. This kind of problem can appear due to RACH misconfiguration, excessive load or a software/firmware problem at the eNB side [1,73]. RACH malfunction leads to inability of the affected cell to serve any new users, while earlier connected UEs still get served. This problem can be classified as the crippled sleeping cell type, and with time the affected cell tends to become catatonic. In many cases the RACH problem becomes visible to the operator only after a long observation time or even through user complaints. For this reason, it is very important to detect such cells in a timely manner and apply recovery actions.

Random access sleeping cell
Malfunction of RACH can lead to severe problems in network operation as it is used for connection establishment in the beginning of a call, during handover to another cell, connection re-establishment after handover failure or Radio Link Failure (RLF) [69]. Malfunction of random access in cell with ID 1, is caused by erroneous behavior of T304 timer [30], which expires before random access procedure is finished. Modeling of this failure is done so that at certain moment of network operation cell 1 loses capability to successfully go through random access procedure. Thus, whenever UE tries to initiate random access to this cell, this attempt fails. Malfunction area covers around 5 % of the overall network (1 out of total 21 cells).

Experimental setup

Simulation environment
Experimental environment is dynamic system level simulator of LTE network, designed according to 3GPP Releases 8, 9, 10 and partly 11. Throughput, spectral efficiency and mobility-related behavior of this simulator are validated against results from other simulators of several companies in 3GPP [31,50,52].
Step resolution of the simulator is one Orthogonal Frequency-Division Multiplexing (OFDM) symbol. Methodology for mapping link level Signal to Interference plus Noise Ratio (SINR) to the system level is presented in [7]. Simulation scenario is an improved 3GPP macro case 1 [29] with wrap-around layout, 21 cells (7 base stations with 3-sector antennas), and inter-site distance of 500 meters. Modeling of propagation and radio link conditions includes slow and fast fading. Users are spread randomly around the network, so that on average there are 15 dynamically moving UEs per cell. The main configuration parameters of the simulated network are shown in Table 2.

Generated performance data
Generated performance data includes dominance map information and the MDT log, which contains the following fields:
• MDT triggering event ID. The list of possible events is presented in Table 1. This is categorical (nominal) and sequential data, i.e. sequences of events are meaningful from a data mining perspective;
• UE ID. This is also categorical data;
• UE location coordinates [m]. This is numerical, spatial data;
• Serving and target cell ID. This is spatial, categorical data.
It is important to know the type of the analyzed data to construct efficient knowledge mining framework [9,37].
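As a minimal sketch of how the MDT log fields listed above could be held in memory and turned into per-user event sequences, consider the following. The class name, field names and the grouping of records by UE ID are illustrative assumptions, not the data model of the simulator.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MdtRecord:
    """One MDT log entry; all field names are hypothetical."""
    event: str                         # categorical, sequential (event ID)
    ue_id: int                         # categorical
    location_m: Tuple[float, float]    # numerical, spatial (x, y) in meters
    serving_cell: int                  # categorical, spatial
    target_cell: Optional[int] = None  # reported only with some events


def event_sequences(log: List[MdtRecord]) -> Dict[int, List[str]]:
    """Group a time-ordered log into one event sequence per UE."""
    sequences: Dict[int, List[str]] = defaultdict(list)
    for record in log:
        sequences[record.ue_id].append(record.event)
    return dict(sequences)


# Toy log with invented coordinates and cell IDs.
log = [
    MdtRecord("A3 RSRP", 7, (120.0, 40.5), serving_cell=3, target_cell=1),
    MdtRecord("HO COMPLETE", 9, (300.0, 10.0), serving_cell=5, target_cell=5),
    MdtRecord("HO COMMAND", 7, (121.0, 41.0), serving_cell=3, target_cell=1),
]
calls = event_sequences(log)
```

Keeping the event ID as an ordered list per UE preserves the sequential nature of the data, which is what the later sequence analysis relies on.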
Simulations done for this study cover three types of network behavior: "normal", network operation without a random access sleeping cell; "problematic", network with RACH failure in cell 1; and "reference", no sleeping cell, but different slow and fast fading maps, i.e. compared to the "normal" case it is a different network propagation-wise. The latter case is used for validation purposes. All three cases have different mobility random seeds, i.e. call start locations and UE traveling paths are not the same. Each of these 3 cases is represented with 6 data chunks. The training and testing phases of sleeping cell detection are done with pairs of MDT logs by means of the K-fold approach [37]. For example, "normal"-"problematic" or "normal"-"reference" pairs are considered. Thus, in total there are 72 unique combinations of analyzed MDT log pairs, which provides a reasonably reliable statistical basis.
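One reading that is consistent with the stated total of 72 is that each of the 6 "normal" chunks is used for training and paired with each of the 6 "problematic" and each of the 6 "reference" chunks for testing. The enumeration below is a sketch under that assumption; the exact pairing scheme is not spelled out in the text.

```python
from itertools import product

CHUNKS = range(1, 7)  # 6 data chunks per type of network behavior

# (training type, training chunk, testing type, testing chunk)
pairs = [
    ("normal", train_chunk, test_type, test_chunk)
    for test_type in ("problematic", "reference")
    for train_chunk, test_chunk in product(CHUNKS, CHUNKS)
]

print(len(pairs))  # 2 * 6 * 6 = 72
```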

Sleeping cell detection framework
The core of the presented study is the sleeping cell detection framework based on knowledge mining, Fig. 2. Both training and testing phases are done in accordance with the process of Knowledge Discovery in Databases (KDD), which includes the following steps [24,37]: data cleaning, integration from different sources, feature selection and extraction, transformation, pattern recognition, pattern evaluation and knowledge presentation. The constructed data analysis framework for sleeping cell detection is semi-supervised, because unlabeled error-free data is used for training of the data mining algorithms. The analysis can be logically separated into two parts: identification of the anomalous data points in MDT data, and localization of these points in the real network with assignment of a sleeping cell score to each cell (which can be treated as the extent of cell performance abnormality). The first problem is solved with preprocessing and pattern recognition, while the latter is more a task of pattern evaluation and postprocessing. In the testing phase problematic data is analyzed to detect abnormal behavior. Reference data is used for testing in order to verify how prone the designed framework is to false alarms.
[Fig. 2: sleeping cell detection framework; recoverable step labels: "Step 2: Store the output of training", "Step 3: Testing"]
Wireless Netw (2016) 22:2029-2048

Feature selection and extraction
Feature selection and extraction is the first step of sleeping cell detection. At this stage, input data is prepared for further analysis. Preprocessing is needed as reported UEs MDT event sequences have variable lengths, depending on the user call duration, velocity, traffic distribution and network layout.

Sliding window preprocessing
The sliding window approach [64] divides calls into sub-calls of constant length, and thereby unifies the input data. The sliding window algorithm has two parameters: window size m and step n. After transformation, one sequence of N events (a call) is represented by several sequences of equal size, which overlap if n < m, except for the last sub-call, which holds the remainder of the sequence (determined by N modulo n).
In the presented results overlapping sliding window size is 15, and the step is 10 events. Such setup allows to maintain the context of the data after processing [49]. The number of calls and sub-calls for all three data sets are shown in Table 3.
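The sliding window step can be sketched as follows, using the window size 15 and step 10 stated above as defaults. The function name and the handling of the final, possibly shorter sub-call are assumptions made for illustration.

```python
def sliding_windows(events, m=15, n=10):
    """Split one call (a sequence of events) into sub-calls.

    m is the window size and n is the step; windows overlap when n < m.
    The last sub-call may be shorter than m.
    """
    if m <= 0 or n <= 0:
        raise ValueError("window size and step must be positive")
    windows = []
    start = 0
    while start < len(events):
        windows.append(list(events[start:start + m]))
        if start + m >= len(events):
            break  # this window already reaches the end of the call
        start += n
    return windows


# A call of 32 events gives sub-calls starting at events 0, 10 and 20.
sub_calls = sliding_windows(list(range(32)))
print([len(s) for s in sub_calls])  # [15, 15, 12]
```

With a step smaller than the window, consecutive sub-calls share events, which is how the context between neighboring sub-calls is retained.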

N-gram analysis
When input user-specific MDT log entries are standardized with sliding window method, the data is transformed from sequential to numeric format. It is done with N-gram analysis method, widely used e.g. for natural language processing and text analysis applications such as speech recognition, parsing, spelling, etc. [6,8,35,45,56]. In addition, N-gram is applied for whole-genome protein sequences [26] and for computer virus detection [16,23].
N-gram is a sub-sequence of N overlapping items or units from a given original sequence. The items can be characters, letters, words or anything else. The idea of the method is to count how many times each sub-sequence occurs. This is the transformation from sequential to numerical space.
Here is an example of N-gram analysis applied to two words, 'performance' and 'performer', with N = 2 and a single character as the unit. The resulting frequency matrix after N-gram processing is shown in Table 4. In the case of sequence analysis of MDT data, a letter from this example corresponds to an MDT event given in Table 1. Thus, for 2-gram analysis pairs of network events are considered, such as "PL PROBLEM - RADIO LINK FAILURE" or "A3 RSRP - HO COMMAND".
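The two-word example can be reproduced with a short sketch. The function names are illustrative; the same code works on lists of MDT event names instead of characters.

```python
from collections import Counter


def ngram_counts(sequence, n=2):
    """Count every run of n consecutive items in the sequence."""
    return Counter(
        tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)
    )


def frequency_matrix(sequences, n=2):
    """Return (vocabulary, matrix): one row per sequence, one column per N-gram."""
    counts = [ngram_counts(s, n) for s in sequences]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[gram] for gram in vocabulary] for c in counts]
    return vocabulary, matrix


vocabulary, matrix = frequency_matrix(["performance", "performer"], n=2)
er = vocabulary.index(("e", "r"))
print(matrix[0][er], matrix[1][er])  # 1 2
```

The pair "er" occurs once in 'performance' and twice in 'performer', so the two rows differ in that column. Each row is a numeric feature vector that replaces the original sequence.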

Dimensionality reduction with minor component analysis
Dimensionality reduction is applied to convert high-dimensional data to a smaller set of derived variables. In the presented study Minor Component Analysis (MCA) method is applied [54]. This algorithm has been selected on the basis of comparison with other dimensionality reduction methods such as Principal Component Analysis (PCA) [47] and diffusion maps [22]. MCA extracts components of covariance matrix of the input data set and uses minor components (eigenvectors with the smallest eigenvalues of covariance matrix). 6 minor components are used as a basis of the embedded space. This number is defined by means of Second ORder sTatistic of the Eigenvalues (SORTE) method [42,43].
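The idea of projecting data onto the eigenvectors with the smallest eigenvalues can be sketched in plain Python as below. This is an illustration only: the Jacobi eigen-solver is a stand-in chosen to keep the example dependency-free (a library routine such as numpy.linalg.eigh would normally be used), the number of minor components is a free parameter here, and SORTE is not implemented.

```python
import math


def covariance(rows):
    """Return the column means and the sample covariance matrix of rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return mean, cov


def jacobi_eigh(matrix, sweeps=50, tol=1e-20):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, vectors); column i of vectors pairs with eigenvalue i.
    """
    n = len(matrix)
    a = [list(row) for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (rotate columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rotate rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J accumulates eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def fit_mca(rows, n_minor):
    """Return the mean and the n_minor eigenvectors with smallest eigenvalues."""
    mean, cov = covariance(rows)
    values, vectors = jacobi_eigh(cov)
    order = sorted(range(len(values)), key=lambda i: values[i])[:n_minor]
    basis = [[vectors[k][i] for k in range(len(values))] for i in order]
    return mean, basis


def transform(rows, mean, basis):
    """Project centered rows onto the minor components."""
    return [[sum((r[j] - mean[j]) * b[j] for j in range(len(mean)))
             for b in basis] for r in rows]


# Toy training data lying on the line y = 2x: the minor component is the
# direction perpendicular to it, so normal points project to about zero.
train = [(float(x), 2.0 * x) for x in range(5)]
mean, basis = fit_mca(train, n_minor=1)
embedded_train = transform(train, mean, basis)
embedded_odd = transform([(2.0, 0.0)], mean, basis)
```

In this toy case a point that breaks the relation between the two features stands out along the minor component, while the training points stay close to the origin of the embedded space.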

Pattern recognition: K-NN anomaly score outlier detection
In order to extract abnormal instances from the testing dataset K-NN anomaly outlier score algorithm is applied.
In contrast with K-NN classification, this method is not supervised but semi-supervised, as the training data does not contain any abnormal labels. In general, there are two approaches to the implementation of this algorithm: the anomaly score assigned to each point is either the sum of distances to the k nearest neighbors [2] or the distance to the k-th neighbor [66]. The first method is employed in the presented sleeping cell detection framework, as it is more statistically robust. Thus, the algorithm assigns an anomaly score to every sample in the analyzed data based on the sum of distances to its k nearest neighbors in the embedded space. The Euclidean metric is applied as the similarity measure. Points with the largest anomaly scores are called outliers. Separation into normal and abnormal classes is defined by the threshold parameter T, equal to the 95th percentile of anomaly scores in the training data. Configuration parameters of the data analysis algorithms in the presented sleeping cell detection framework are summarized in Table 5.
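A minimal sketch of the sum-of-distances anomaly score and the 95th percentile threshold is given below. The value of k, the nearest-rank percentile definition, the choice to skip a training point's zero distance to itself, and the toy points are all assumptions for illustration.

```python
import math


def knn_anomaly_scores(points, reference, k, skip_self=False):
    """Score each point by the sum of Euclidean distances to its k nearest
    neighbors in reference. With skip_self=True, points[i] is assumed to be
    reference[i] and is not counted as its own neighbor."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(
            math.dist(p, q)
            for j, q in enumerate(reference)
            if not (skip_self and i == j)
        )
        scores.append(sum(distances[:k]))
    return scores


def nearest_rank_percentile(values, pct):
    """Percentile by the nearest-rank definition (one of several in use)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


K = 2  # hypothetical value; the actual setting is listed in Table 5
train = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
test = [(0.5, 0.5), (5.0, 5.0)]

train_scores = knn_anomaly_scores(train, train, K, skip_self=True)
T = nearest_rank_percentile(train_scores, 95)
test_scores = knn_anomaly_scores(test, train, K)
outliers = [score > T for score in test_scores]
print(outliers)  # [False, True]
```

Only error-free data is needed to set T, which is what makes the scheme semi-supervised: the testing data is compared against the learned notion of normal behavior.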

Pattern evaluation
The main goal of pattern evaluation is conversion of the output of the K-NN anomaly score algorithm into knowledge about the location of the network malfunction, i.e. the RACH sleeping cell. This is achieved with postprocessing of the anomalous data samples through analysis of their correspondence to particular network elements, such as UEs and cells. For this purpose we developed 4 postprocessing methods: Dominance Cell Sub-Call Deviation, Dominance Cell 2-Gram Deviation, Dominance Cell 2-Gram Symmetry Deviation, and Target Cell Sub-Calls. The essence of these methods, discussed throughout this section, is reflected in their names. The first part describes which geo-location information is used for mapping data samples to cells, e.g. dominance map information, target or serving cell ID. The second part denotes what is used as the feature space for postprocessing. It can be either "sub-calls", when rows of the dataset are used as features, or "2-gram", when individual event pair combinations, i.e. columns of the dataset, are used as features. The third and last part of the method name describes whether the analysis considers the difference between training and testing data ("deviation" keyword), or whether only information about the testing set is used to build the sleeping cell detection histogram.
Output from the postprocessing methods described above is a set of values-sleeping cell scores, which correspond to each cell in the analyzed network. High value of this score means higher abnormality, and hence probability of failure. To achieve clearer indication of problematic cell presence, additional non-linear transformation is applied. It is called amplification, as it allows to emphasize problematic areas in the sleeping cell histogram. Sleeping cell score of each cell is divided by the sum of Sleeping Cell (SC) scores of all non-neighboring cells. Sleeping cell scores, received after postprocessing and amplification are then normalized by the cumulative SC score of all cells in the network. Normalization is necessary to get rid of dependency on the size of the dataset, i.e. number of calls and users.
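The amplification and normalization steps can be sketched as follows. Several details are assumptions: the cell itself is excluded from its own set of non-neighboring cells, a small constant guards against division by zero, the result is scaled so that the scores sum to 100, and the five-cell ring topology is invented for the example.

```python
def amplify(scores, neighbors, eps=1e-12):
    """Divide each cell's SC score by the summed scores of all cells that are
    neither the cell itself nor one of its neighbors."""
    amplified = {}
    for cell, score in scores.items():
        excluded = neighbors.get(cell, set()) | {cell}
        denominator = sum(
            s for other, s in scores.items() if other not in excluded
        )
        amplified[cell] = score / (denominator + eps)
    return amplified


def normalize(scores, total=100.0):
    """Scale the scores so that they sum to total over the whole network."""
    cumulative = sum(scores.values())
    return {cell: total * s / cumulative for cell, s in scores.items()}


# Hypothetical ring of 5 cells; cell 1 has the highest raw score.
raw = {1: 8.0, 2: 2.0, 3: 2.0, 4: 2.0, 5: 2.0}
ring = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}}

final = normalize(amplify(raw, ring))
```

In this example cell 1 holds half of the raw score mass and a larger share after amplification, because cells far from the suspected fault contribute little to its denominator.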

Knowledge interpretation and presentation
The final step of the data analysis framework is visualization of the fault detection results. It is done with construction of a sleeping cell detection histogram and network heat map. However, sleeping cell histogram does not show how cells are related to each other: are they neighbors or not, and which area of the network is causing problems. Heat map method shows more anomalous network regions with darker and larger spots, while normally operating regions are in light grey color. The main benefit of network heat map is that mobile network topology and neighbor relations between cells are illustrated.

Performance evaluation
To apply data mining performance evaluation metrics, labels of the data points must be known. A cell is labeled as abnormal if its SC score deviates by more than $3\sigma$ (where $\sigma$ is the standard deviation of sleeping cell scores) from the mean SC score in the network. The mean value and standard deviation of the sleeping cell scores are calculated jointly from the 72 runs produced by the K-fold method for "normal"-"problematic" and "normal"-"reference" dataset pairs. Availability of the labels and the outcomes of different postprocessing methods enables application of such data mining performance metrics as accuracy, precision, recall, F-score, True Negative Rate (TNR) and False Positive Rate (FPR) [34]:
$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$, $\mathrm{Precision} = \frac{TP}{TP + FP}$, $\mathrm{Recall} = \frac{TP}{TP + FN}$,
$\mathrm{F\text{-}score} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$, $\mathrm{TNR} = \frac{TN}{TN + FP}$, $\mathrm{FPR} = \frac{FP}{FP + TN}$.
In these equations TP, TN, FP, FN denote elements of the confusion matrix [37,38,41], and represent correspondingly the number of true positive, true negative, false positive and false negative detections.
In addition to the conventional performance evaluation metrics described above, a heuristic method is applied to complement the analysis. This approach measures how far the achieved performance is from the a priori known ideal solution. Performance of the sleeping cell detection algorithm can be described by a point in the space "cumulative standard deviation"-"sleeping cell magnitude". "Sleeping cell magnitude" is the highest SC score, which can reach the value 100 due to normalization. The "cumulative standard deviation" coordinate equals the standard deviation of the SC scores of all other cells. This plane contains two points of interest: $[0; 100]$ and $[0; 100/N_{\mathrm{cells}}]$, where $N_{\mathrm{cells}}$ is the number of cells in the network. In the case of a malfunctioning network, the ideal sleeping cell detection algorithm assigns an SC score of 100 to the broken cell and zero to the remaining cells, which corresponds to the point $[0; 100]$. In the case of an error-free network, the ideal performance is mapped to the point $[0; 100/N_{\mathrm{cells}}]$, because all the cells have equal SC scores of $100/N_{\mathrm{cells}}$.
Thus, the smaller the Euclidean distance between the achieved and ideal sleeping cell histograms, the better the performance of the sleeping cell detection algorithm.
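The two evaluation approaches described above can be illustrated with a minimal Python sketch. It assumes the standard confusion-matrix definitions, takes the F-score to be the balanced F1 measure, normalizes SC scores so that they sum to 100, and uses the population standard deviation; these choices, as well as the function names `confusion_metrics` and `heuristic_distance`, are illustrative assumptions and not the exact implementation used in this study.

```python
import math
import statistics


def confusion_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion-matrix counts (F-score assumed to be F1)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,          # also called true positive rate
        "f_score": f_score,
        "tnr": tn / (tn + fp) if tn + fp else 0.0,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
    }


def heuristic_distance(sc_scores, faulty_network):
    """Euclidean distance to the ideal point in the plane
    (cumulative standard deviation, sleeping cell magnitude)."""
    total = sum(sc_scores)
    scores = [100.0 * s / total for s in sc_scores]  # assumed normalization
    magnitude = max(scores)
    others = list(scores)
    others.remove(magnitude)                  # all cells except the top one
    cumulative_std = statistics.pstdev(others)
    # Ideal point: [0; 100] for a faulty network, [0; 100/N] otherwise.
    ideal_magnitude = 100.0 if faulty_network else 100.0 / len(scores)
    return math.hypot(cumulative_std, magnitude - ideal_magnitude)
```

With this sketch, a histogram that puts the whole score on one cell has zero distance in the faulty case, and a perfectly flat histogram has zero distance in the error-free case.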

Results of sleeping cell detection
This section presents the results of sleeping cell detection for different postprocessing algorithms. In addition, the data at different stages of the detection process are illustrated. Performance metrics are then used to compare the effectiveness of the developed SC identification algorithms.

Preprocessing and K-NN anomaly score calculations
After preprocessing with the sliding window and N-gram methods we obtain a so-called 2-gram popularity matrix. The number of rows of this matrix equals the data chunk size, and it has 32 features, which is the number of non-zero 2-grams. This popularity matrix is transformed with MCA. The output of dimensionality reduction with MCA has 6 features, which are the coordinates of points in a 6-dimensional embedded space based on the eigenvectors with the smallest eigenvalues. The training MDT data is then processed with the K-NN anomaly score algorithm. As discussed in Sect. 5.3, the anomaly score threshold used for separating data points into normal and abnormal classes is selected to be the 95th percentile of the outlier score in the training data. The shape of the normal training dataset in the embedded space is shown in Fig. 3a. In this figure, 3 dimensions are selected on the basis of visual inspection to best demonstrate the distribution of data points. Sorted anomaly outlier scores are presented in Fig. 3b. It can be seen that the data points are very compact in the embedded space, and because of that there is no big difference in the anomaly score values.
The main goals of analyzing the testing dataset are to find anomalies, detect the sleeping cell, and keep the false alarm rate as low as possible. At the testing phase either problematic or reference data is analyzed. After the same preprocessing stages as for training, the testing data is represented in the embedded space. When the testing data is the problematic dataset, some of the samples are significantly further away from the main dense group of points, see Fig. 4. These abnormal points are labeled as outliers, and the corresponding anomaly scores for these samples are much higher, as can be seen from Fig. 4b. On the other hand, some of the points with relatively low anomaly scores are still above the abnormality threshold. This means that there is a certain percentage of false alarms, i.e. some "good" points are treated as "bad". The extent of the negative effect caused by false alarms is discussed further in Sect. 6.4. However, the opposite behavior, referred to as "misdetection", does not occur: none of the anomalous points is treated as normal.
Validation of the data mining framework is done by using the error-free reference dataset as testing data, so that no real anomalies are present in the network behavior. Reference testing data in the embedded space and the corresponding anomaly outlier scores are shown in Fig. 5. Only a few points can be treated as outliers, and in general the shapes of the normal (Fig. 3a) and reference (Fig. 5a) datasets in the embedded space are very similar. Anomaly outlier scores of the reference testing data are low for all points except 2 outliers.
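The training and testing steps described above can be sketched as follows. This is a simplified illustration and not the implementation used in the study: it assumes that the K-NN anomaly score of a point is the sum of Euclidean distances to its k nearest training points, uses a nearest-rank percentile, and omits the MCA projection. The names `knn_scores` and `percentile` and the value of `k` are hypothetical.

```python
import math


def knn_scores(points, reference, k, exclude_self=False):
    """Anomaly score per point: sum of distances to its k nearest reference points.

    Set exclude_self=True when scoring the reference (training) set against itself,
    so that a point is not counted as its own neighbor.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(reference)
                       if not (exclude_self and i == j))
        scores.append(sum(dists[:k]))
    return scores


def percentile(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def detect_outliers(train, test, k=2, pct=95):
    """Flag test points whose score exceeds the pct-th percentile of training scores."""
    threshold = percentile(knn_scores(train, train, k, exclude_self=True), pct)
    return [score > threshold for score in knn_scores(test, train, k)]
```

Because the threshold is a percentile of the training scores, a small share of normal points is expected to exceed it, which is consistent with the false alarms observed above.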

Application of postprocessing methods for sleeping cell detection
After the training and testing phases, certain sub-calls are marked as anomalies. The next step is to convert this information into knowledge about the location of the malfunctioning cell or cells, which is done through the postprocessing described in Sect. 5.4.

Detection based on dominance cell sub-call deviation
In our earlier study [12], postprocessing based on dominance cells and call deviation for sleeping cell detection is presented. One problem of using calls as samples is that, if the duration of the analyzed user call is long, the corresponding number of visited cells is large, especially for fast UEs. Hence, even if a certain call is classified as abnormal, it is very hard to say which cell has anomalous behavior. To overcome this problem, the analysis is done for sub-calls, derived with the sliding window method, see Sect. 5.1.1. Sub-calls contain the same number of network events, and the length of the analyzed sequence is short enough to identify the exact cell with problematic behavior. Deviation measures the difference between training and testing data, and it is used to construct the sleeping cell detection histogram presented in Fig. 6a. From this figure, it can be seen that abnormal sub-calls are encountered more frequently in the area of dominance of cell 1, which has the highest deviation. There are 2 types of bars: colored (in this case blue) and grey. The grey bars correspond to an additional postprocessing step, amplification, described in Sect. 5.4. In addition to cell 1, its neighboring cells 8, 9, 11 and 12 also have increased deviation values, as can be seen from the network heat map in Fig. 6c. The sleeping cell detection histogram and the network heat map for the reference dataset used as testing data are shown in Fig. 6b, d correspondingly. Even though cells 6 and 17 have higher SC scores than other cells, they are not marked as abnormal, because their abnormality does not reach the $\mu + 3\sigma$ level.
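Two ingredients of this method, the splitting of calls into equal-length sub-calls and the thresholding of SC scores, can be sketched as follows. The window size, the step, and the use of a single histogram for the mean and standard deviation are assumptions made for illustration (in the evaluation the statistics are pooled over all K-fold runs), and the function names are hypothetical.

```python
import statistics


def sliding_window(events, size, step=1):
    """Split one call (a list of network events) into equal-length sub-calls."""
    return [events[i:i + size] for i in range(0, len(events) - size + 1, step)]


def abnormal_cells(sc_scores, n_sigma=3.0):
    """Return IDs of cells whose SC score exceeds mean + n_sigma * std.

    sc_scores maps cell ID -> SC score; population standard deviation is assumed.
    """
    values = list(sc_scores.values())
    threshold = statistics.mean(values) + n_sigma * statistics.pstdev(values)
    return sorted(cell for cell, score in sc_scores.items() if score > threshold)
```

Note that with very few cells a single cell cannot exceed the three-sigma level at all, because its own score inflates the standard deviation; the rule is therefore meaningful only for a sufficiently large set of scores.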

Detection based on dominance cell 2-gram deviation
In this method, problematic network elements are found through the comparison of 2-gram frequencies in different areas of the dominance map. For this purpose we compare all sub-calls from the training dataset against the sub-calls assigned to the abnormal class in the testing dataset. If there is a large increase or decrease, the cell associated with these changes is marked as abnormal. From the sleeping cell detection histogram in Fig. 7a it can be seen that cell 1 has a clear difference in the number of 2-gram occurrences in testing data compared to training data. This happens because handovers toward this cell fail. As a result, 2-grams with events related to handovers become imbalanced in testing data compared to training data. For instance, 2-grams like Handover (HO) Command-HO Complete and HO Complete-A2 RSRP ENTER become very rare. On the other hand, the 2-gram HO Command-A2 RSRP ENTER, which can be treated as an indication of unsuccessful handovers, becomes very popular in testing data, while in training data it does not exist at all. Among the neighbors of problematic cell 1, only cell 11 has a slightly increased sleeping cell score. Testing the sleeping cell detection framework with reference data and postprocessing with the Dominance Cell 2-Gram Deviation method demonstrates a lower false alarm rate than Dominance Cell Sub-Call Deviation, as can be seen from Fig. 7b, d.
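The comparison of 2-gram frequencies can be illustrated with a short sketch. It counts adjacent event pairs in sub-calls and reports the change in relative frequency between training data and abnormal testing sub-calls. Attribution of the changes to dominance cells is omitted, and the use of relative frequencies as well as the function names are assumptions made for illustration.

```python
from collections import Counter


def two_gram_counts(sub_calls):
    """Count adjacent event pairs (2-grams) over a collection of sub-calls."""
    counts = Counter()
    for events in sub_calls:
        counts.update(zip(events, events[1:]))
    return counts


def relative_frequencies(counts):
    """Convert raw 2-gram counts to shares of all observed 2-grams."""
    total = sum(counts.values())
    return {gram: n / total for gram, n in counts.items()} if total else {}


def two_gram_deviation(train_sub_calls, abnormal_test_sub_calls):
    """Change in relative frequency of every 2-gram (testing minus training)."""
    train = relative_frequencies(two_gram_counts(train_sub_calls))
    test = relative_frequencies(two_gram_counts(abnormal_test_sub_calls))
    return {gram: test.get(gram, 0.0) - train.get(gram, 0.0)
            for gram in set(train) | set(test)}
```

In this sketch a 2-gram such as HO Command-A2 RSRP ENTER that is absent from training data but present in abnormal testing sub-calls receives a positive deviation, while HO Command-HO Complete receives a negative one.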

Detection based on dominance cell 2-gram symmetry deviation
This postprocessing method analyzes the symmetry imbalance of network event 2-grams. The symmetry imbalance is evaluated based on all sub-calls from the training dataset and the sub-calls assigned to the abnormal class in the testing dataset. Information about the number of 2-grams directed to the cell and from the cell is extracted from the training set. The considered 2-grams consist of events which occur sequentially in the dominance areas of 2 cells. This means that if in the training data the number of handovers from Cell A to Cell B and from Cell B to Cell A is roughly the same, and in the testing set it is not, it can be concluded that the symmetry of this particular 2-gram is skewed. The most common types of 2-grams analyzed with this method are related to handovers, e.g. A3-HO COMMAND sequences. From Fig. 8 it can be seen that Dominance Cell 2-Gram Symmetry Deviation finds sleeping cell 1, while its neighboring cells 8, 9, 11 and 12 have suspiciously high sleeping cell scores compared to other cells in the network.
Comparison of the symmetry analysis method with the two previously described postprocessing approaches shows that this method is very efficient in detecting the sleeping cell and its neighbors. At the same time, the stability of this method, i.e. its false alarm rate, is also very good, as can be seen from Fig. 8b.
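The notion of symmetry imbalance can be made concrete with a small sketch. It assumes that directed 2-gram counts between pairs of cells are already available as a mapping from (source cell, destination cell) to a count, and it uses a normalized difference as the imbalance measure. Both the measure and the function names are hypothetical choices for illustration; the exact formula used by the method is not reproduced here.

```python
def imbalance(counts, a, b):
    """Normalized asymmetry of transitions between cells a and b, in [-1, 1]."""
    ab = counts.get((a, b), 0)
    ba = counts.get((b, a), 0)
    return (ab - ba) / (ab + ba) if ab + ba else 0.0


def symmetry_deviation(train_counts, test_counts):
    """Per-cell sum of changes in pairwise imbalance between training and testing."""
    pairs = {tuple(sorted(pair)) for pair in list(train_counts) + list(test_counts)}
    per_cell = {}
    for a, b in pairs:
        change = abs(imbalance(test_counts, a, b) - imbalance(train_counts, a, b))
        per_cell[a] = per_cell.get(a, 0.0) + change
        per_cell[b] = per_cell.get(b, 0.0) + change
    return per_cell
```

Because every skewed pair contributes to both of its cells, a faulty cell with several neighbors accumulates the largest value in this sketch, while each neighbor receives a smaller contribution.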

Detection based on target cell sub-calls
As discussed in Sect. 5.4, the deviation between training and testing data is not calculated in this method. Extensive location information, such as dominance map information, is not required for sleeping cell detection with the target cell sub-call method. The sleeping cell detection histogram, presented in Fig. 9, is constructed by counting all unique target cell IDs for each anomalous sub-call. It can be clearly seen that cell 1 is successfully detected. Neighboring cells 8, 9, 11 and 12 also indicate a malfunction in this area, as can be seen from the heat map shown in Fig. 9b. For this method, the SC score of cell 1 is slightly lower than for the postprocessing methods based on dominance cell deviation. On the other hand, the target cell sub-call method is much simpler, and requires significantly less information about the location of user event occurrences.
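The counting step of this method is simple enough to sketch directly. The example assumes that each anomalous sub-call is represented by the list of target cell IDs carried by its events, and that the histogram is normalized to sum to 100; the representation and the function name `target_cell_histogram` are assumptions made for illustration.

```python
from collections import Counter


def target_cell_histogram(anomalous_sub_calls):
    """Count each unique target cell once per anomalous sub-call, then normalize.

    anomalous_sub_calls: iterable of lists of target cell IDs, one list per sub-call.
    Returns a dict cell ID -> score, with scores summing to 100.
    """
    hist = Counter()
    for target_cells in anomalous_sub_calls:
        hist.update(set(target_cells))   # unique IDs within one sub-call
    total = sum(hist.values())
    return {cell: 100.0 * n / total for cell, n in hist.items()} if total else {}
```

For example, `target_cell_histogram([[1, 1, 8], [1], [1, 9]])` counts cell 1 three times and cells 8 and 9 once each, so no dominance map is needed to obtain a ranking of suspicious cells.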

Combined method of sleeping cell detection
The idea of this method is to create a cumulative sleeping cell detection histogram based on the results from all 4 postprocessing methods described above. The resulting amplified SC histogram is shown in Fig. 10. Cell 1 has a sleeping cell score well over the $\mu + 3\sigma$ threshold. Neighboring cells 8, 9, 11 and 12 also have increased sleeping cell scores compared to other cells, though they do not exceed the $\mu + 3\sigma$ threshold. Reference data used as testing data also demonstrates the stability of the combined approach: no false alarms are triggered. However, it can be seen that the usage of the target cell sub-call method introduces some noise. It is important to note that the postprocessing methods are applied with equal weights. It is possible to emphasize a more accurate method by increasing its weight, and to penalize an unreliable one by reducing its weight. The selection of optimal weights is a matter for a separate study and is not discussed in this article.
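A weighted combination of histograms can be sketched as below. Equal weights reproduce the setting used in this article; the renormalization of the combined histogram to a sum of 100 and the function name `combine_histograms` are assumptions made for illustration.

```python
def combine_histograms(histograms, weights=None):
    """Weighted sum of per-method SC histograms, renormalized to sum to 100.

    histograms: list of dicts cell ID -> SC score, one dict per postprocessing method.
    weights: optional list of non-negative weights; equal weights by default.
    """
    if weights is None:
        weights = [1.0] * len(histograms)
    combined = {}
    for hist, weight in zip(histograms, weights):
        for cell, score in hist.items():
            combined[cell] = combined.get(cell, 0.0) + weight * score
    total = sum(combined.values())
    if not total:
        return combined
    return {cell: 100.0 * score / total for cell, score in combined.items()}
```

Passing, for instance, `weights=[3.0, 1.0]` for two methods gives the first method three times the influence of the second, which is one way a more reliable method could be emphasized.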

Comparison of algorithms and performance evaluation
The postprocessing methods discussed above have their own advantages and disadvantages. Traditional data mining metrics, discussed in Sect. 5.5.1, are applied for quantitative comparison of the sleeping cell detection methods, see Fig. 11a. Ideal performance is presented with the solid double black line, and corresponds to the maximum area of the hexagon. The K-fold cross-validation method is utilized to obtain statistically significant results. Figures 6, 7 [...] Fig. 11b. The true positive rate equals 1 for all postprocessing methods. This is not always the case for classification applications in real-world systems. However, the proposed framework is able to create such a projection of the input MDT data that all anomalous points are correctly classified as abnormal, i.e. in the new space normal and anomalous data points are separable. This can be seen from the sorted anomaly scores of the problematic datasets, shown in Fig. 4b. The negative side is that some methods mistakenly classify some normal points as abnormal, and this is reflected in the false positive rate. Hence, the suggested data mining framework for sleeping cell detection is successful, and to reduce the false alarm rate it is necessary to devise a better separation rule than the $3\sigma$ deviation from the mean SC score, see Figs. 6, 7, 8, 9 and 10.
Another method for comparison of the postprocessing algorithms is the heuristic approach described in Sect. 5.5.1. According to this method, the more accurate postprocessing algorithm is the one which has the smallest distance to the ideal solution point for either the problematic or the error-free case. Cumulative distances for the different algorithms in the non-amplified and amplified cases are presented in Fig. 12a, b correspondingly. The coordinates of the different postprocessing methods in the heuristic performance measure plane are shown in Fig. 12c, d. It can be seen that the Dominance Cell 2-Gram Symmetry Deviation method has the smallest distance from the ideal detection case. Thus, from the perspective of the heuristic performance evaluation approach, this method outperforms the other postprocessing methods. With respect to the same performance metric, we may conclude that amplified histograms show better results than non-amplified ones, which holds for all postprocessing methods.

Conclusions
This article presents a novel sleeping cell detection framework based on the knowledge mining paradigm. MDT reports are used for the detection of a random access channel malfunction in one of the network cells. The experimental setup implements a simulated LTE network, used to generate a diverse statistics base with several thousands of user calls and tens of thousands of MDT samples. The investigated failure case is a sleeping cell caused by a RACH malfunction. Even though the studied problem is rather specific, the proposed framework does not rely on any properties or peculiarities of the random access failure for the detection. Moreover, analysis of event sequences makes the presented method applicable to data collected with MDT, TRACE functionality, mobile quality agents, and any other method which is capable of gathering user-specific sequences of network events. The studied type of sleeping cell problem is rather complex, and detection of this problem has not been done before. The applicability of our sequential analysis to other network failures might be beneficial, but it has to be studied.
The designed knowledge mining framework is semi-supervised. From the perspective of SONs the proposed system has a centralized architecture, but it can also be hybrid, with the preprocessing and transformation stages done in a distributed manner. The heart of the developed detection framework is the analysis of sequences with the N-gram method in the series of user event-triggered MDT measurement reports. Data preprocessing with the sliding window transformation method makes the statistics base more reliable through standardization of the input event sequences. 2-gram analysis is used to convert sequential data to a numeric format in the new feature space. To simplify the analysis of the data in the new space, dimensionality reduction with the minor component analysis method is applied. The K-NN anomaly score detection algorithm is used to find the outliers in the data. Using this information, anomalous data points are converted with postprocessing into knowledge about the location of the problematic regions in the network. Different location mapping postprocessing methods are compared. Additionally, so-called amplification is used to take into account neighbor relations between cells and the network topology, in order to improve sleeping cell detection performance. As can be seen, the amplified sleeping cell score of the truly problematic cell is higher than the corresponding non-amplified score.
[Figure caption fragment: (c) Heuristic performance distances for problematic case. (d) Heuristic performance distances for reference case.]
The results demonstrate that the developed framework, based on sequence analysis, allows for efficient detection of the random access sleeping cell problem in the network. The projection of the data in the new space is such that accurate separation of normal and abnormal data points becomes possible. The evaluation shows that the postprocessing method named Dominance Cell 2-Gram Symmetry Deviation demonstrates the best combination of results with respect to the heuristic performance measure. According to the same metric, the proposed amplification approach improves the detection quality of the postprocessing methods. However, this approach is an additional element of the developed framework and is not the most important outcome of our research.
The results of this work lay the groundwork and suggest concrete methods for building advanced performance monitoring systems in modern mobile networks. One of the possible directions in this area is extensive usage of data mining techniques in general, and anomaly detection in particular. New network maintenance systems would make it possible to address the growing complexity and heterogeneity of modern mobile networks, and would help to meet the requirements of 5G.
Future work in this field includes validation of the developed system in more complex scenarios, detection of several or different types of malfunctions, and substitution of the semi-supervised approach with an unsupervised one. The ultimate goal is to achieve accurate and timely detection of different sleeping cell types in highly dynamic mobile network environments. A low level of false alarms must be maintained, and at the same time a significant increase in computational complexity should be avoided.