Secrecy Analysis and Learning-based Optimization of Cooperative NOMA SWIPT Systems

Non-orthogonal multiple access (NOMA) is considered one of the best candidates for future networks due to its ability to serve multiple users in the same resource block. Although early studies focused on transmission reliability and energy efficiency, recent works consider cooperation among the nodes. Cooperative NOMA techniques allow the user with the better channel (near user) to act as a relay between the source and the user experiencing a poor channel (far user). This paper considers the link security of energy harvesting cooperative NOMA users. In particular, the near user applies the decode-and-forward (DF) protocol to relay the message of the source node to the far user in the presence of an eavesdropper. Moreover, we consider that all devices use a power-splitting architecture for energy harvesting and information decoding. We derive the analytical expression of the intercept probability. Next, we employ deep learning based optimization to find the optimal power allocation factor. The results show the robustness and superiority of deep learning optimization over a conventional iterative search algorithm.


I. INTRODUCTION
Non-orthogonal multiple access (NOMA) has received considerable attention due to its promise to utilize the wireless spectrum effectively. NOMA works by allowing users to share the same temporal/spatial resources while the receiving side carries out successive interference cancellation (SIC) [1], [2]. Cooperative communications, in turn, can improve the system capacity, extend the coverage area and achieve a higher degree of freedom with single-antenna nodes. Thus, the idea of user cooperation in NOMA has attracted much interest due to its applications in 5G and has given rise to an important research topic called cooperative NOMA. It was first proposed in [3], wherein a user with the stronger channel decodes the message and then assists by relaying it to the far NOMA user.
Despite substantial improvements in terms of spectral efficiency, research on energy-efficient cooperative NOMA schemes is still in its infancy. To that end, simultaneous wireless information and power transfer (SWIPT) has drawn much research interest due to the ability of RF signals to carry both information and energy to the receiver [4]. Thus, applications of SWIPT in NOMA have been studied from the perspective of outage performance, cooperation, and energy harvesting (EH) efficiency [5]. However, owing to the dual function of the RF signal and the broadcast nature of NOMA, the transmission from source to destination can be eavesdropped by a malicious user. More specifically, the EH receivers can intercept the confidential information being exchanged between legitimate users. In order to provide security to low-powered devices, physical layer security (PLS) has been introduced as an alternative to computation-heavy cryptographic techniques [6]. PLS techniques can improve the secrecy performance of wireless networks by means of cooperative relaying, jamming and multiple-antenna beamforming.
In [7], the authors proposed an energy- and spectrum-efficient protocol by combining NOMA with SWIPT. They showed that the proposed scheme does not jeopardize the diversity gain of the edge users while enabling the cell-center users to self-power themselves. In [4], Diamantoulakis et al. consider downlink and uplink multiple access protocols for SWIPT systems. In particular, they investigate the performance in the downlink for NOMA and time division multiple access (TDMA), while for uplink conditions they consider NOMA with time sharing. The authors of [8] extended these works to multiple-input-single-output (MISO) NOMA with a hybrid time-switching and power-splitting SWIPT architecture. They also derived tight closed-form expressions of the outage probability and demonstrated the superiority of cooperative NOMA over conventional NOMA and OMA systems. Besides these developments, few investigations have been conducted on improving the secrecy performance of NOMA using SWIPT. The authors of [9] maximized the secrecy sum rate by optimizing the allocated power. A closed-form expression of the optimal power-splitting ratio was derived, and it was shown that the proposed method outperforms uniform power allocation. In [10], Zhou et al. proposed a cooperative jamming technique for energy harvesting MISO-NOMA cognitive radio systems. The authors claimed that the proposed scheme for NOMA outperforms the conventional orthogonal multiple access (OMA) scheme in terms of power efficiency.
Of late, deep learning has emerged as a key technique for improving the performance of wireless networks. Deep learning is a branch of machine learning that uses networks with multiple hidden layers [11]. More specifically, in contrast to shallow machine learning methods, deep learning has multiple intermediate layers of neurons between the input and output layers. At each hidden layer, a weighted sum of the outputs of the previous layer is computed and an activation function is applied [12]. The authors of [13] first proposed the idea that deep learning is an important and powerful tool for handling non-linear and complex problems. Some other works considered deep learning for the physical layer, multiple-input-multiple-output (MIMO) systems, and channel coding [14], [15]. This positive trend also attracted much-needed attention to multiple access schemes. Thus, the authors of [16] optimized the sparse code multiple access (SCMA) scheme using deep learning. To do so, they developed a strategy for selecting the codebook that minimizes the bit error rate (BER) while using the minimum amount of computation time. Another important study that integrates orthogonal frequency division multiplexing (OFDM) and deep learning was conducted by the authors of [17]. It was shown that the deep learning approach performs best for signal detection and channel estimation. More recently, the authors of [18] used long short-term memory (a branch of supervised deep learning) for data detection in uplink NOMA. They showed that the deep learning based NOMA scheme is more reliable than conventional hard-decision optimization solutions.
From the above, it is evident that the work on the secrecy performance of energy harvesting cooperative NOMA systems is very limited. Moreover, work on deep learning approaches for the physical layer security of NOMA is nonexistent. Therefore, in order to advance this promising field of wireless communications, we consider a scenario where energy harvesting cooperative NOMA users communicate in the presence of an energy harvesting eavesdropper. We derive the analytical expression of the intercept probability of a DF energy harvesting cooperative NOMA system which, to the best of the authors' knowledge, has not been derived in the literature. We also optimize the secrecy performance of the system by using deep learning to find the optimal value of the power allocation factor. The results of the deep learning approach are then compared with a benchmark iterative search algorithm. We show that the deep learning based NOMA scheme is robust and computationally lightweight.
The remainder of the paper is organized as follows. Section II provides details of the system model. In Section III, the analytical results for the intercept probability are provided. In Section IV, the deep learning based neural network model is discussed. Section V provides numerical results and the relevant discussion. Section VI provides some concluding remarks.

II. SYSTEM MODEL
Let us consider a cooperative relaying system consisting of a source ($S$) and two destinations ($U_N$ and $U_F$) in the presence of an eavesdropper ($E$), as shown in Figure 1. The nodes $U_N$, $U_F$ and $E$ are able to decode information and harvest energy from the received RF signal. It is assumed that $U_N$, $U_F$ and $E$ have the channel state information (CSI) of their corresponding links, whereas $S$, being the source, has the CSI of all the nodes. The channel gains of the links $S \to U_N$, $S \to U_F$, $S \to E$ and $U_N \to E$ are assumed to be Rayleigh distributed and are denoted $h_{SU_N}$, $h_{SU_F}$, $h_{SE}$ and $h_{U_N E}$, respectively.
The transmission takes place in two time slots. In the first phase, $S$ transmits the superimposed message $x = \sum_{i \in \{N,F\}} \sqrt{\alpha_i P}\, s_i$, where $s_i$ and $\alpha_i$ are the data symbol and power allocation coefficient of the $i$-th destination and $P$ denotes the total transmit power. It is assumed that $h_{SU_N} > h_{SU_F}$; hence the power allocation factors should satisfy $\alpha_N < \alpha_F$, where $\alpha_N + \alpha_F = 1$. The nodes $U_N$, $U_F$ and $E$ are assumed to use the power-splitting receiver architecture for information decoding (ID) and EH. According to the power-splitting architecture, the received power is split into two streams by a power-splitting factor: a fraction $\rho$ for EH and $(1 - \rho)$ for ID, where $0 < \rho < 1$.
The received signals at $U_N$, $U_F$ and $E$ during the first time slot are determined by the power-splitting factors $\rho_{N,1}$, $\rho_{F,1}$ and $\rho_{E,1}$ applied at $U_N$, $U_F$ and $E$ during the first phase, and by the noise terms $n_{SU_N}$, $n_{SU_F}$ and $n_{SE}$, which represent additive white Gaussian noise (AWGN) with zero mean and variance $N_0$. The node $U_N$ first decodes its own symbol $s_N$ by treating $s_F$ as interference. After obtaining $s_N$, $U_N$ cancels its own signal by using successive interference cancellation (SIC) to get $s_F$. Consequently, $s_N$ is decoded at a signal-to-interference-plus-noise ratio (SINR), whereas $s_F$ is decoded at an interference-free signal-to-noise ratio (SNR). The far user, $U_F$, treats $s_N$ as interference, which determines the received SINR at $U_F$. It is assumed that the link between the source and the near user is secure and that the eavesdropper tries to decode the information signal of the far user. In order to decode $s_F$, the eavesdropper treats $s_N$ as noise, which determines the SINR at $E$.
In the second phase, $U_N$ transmits the decoded symbol $s_F$ to $U_F$ with power $P$. It is assumed that $U_N$ can perfectly decode $s_F$ and uses all the energy harvested during the first phase, $E_N^T = \rho_{N,1} \eta P |h_{SU_N}|^2$, to transmit $s_F$ to $U_F$, where $\eta$ denotes the energy harvesting efficiency. The received SNRs at $U_F$ and at $E$ during the second phase follow from this transmit energy.
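The quantities described in this section can be sketched explicitly as follows. This is a reconstruction under the standard power-splitting model and not necessarily the exact form used in the derivation. It assumes that the fraction $(1-\rho)$ of the received power is used for information decoding, that noise of variance $N_0$ is added after splitting, that the harvested energy is used directly as transmit power over a unit-length slot, and that $\rho_{F,2}$ and $\rho_{E,2}$ are the second-phase power-splitting factors. The symbols $y_{(\cdot)}$ and the superscripts on $\gamma_{(\cdot)}$ are naming assumptions.

```latex
\begin{align}
x &= \sqrt{\alpha_N P}\, s_N + \sqrt{\alpha_F P}\, s_F, \\
y_{SU_N} &= \sqrt{1-\rho_{N,1}}\; h_{SU_N}\, x + n_{SU_N}, \\
y_{SU_F} &= \sqrt{1-\rho_{F,1}}\; h_{SU_F}\, x + n_{SU_F}, \\
y_{SE}   &= \sqrt{1-\rho_{E,1}}\; h_{SE}\, x + n_{SE}, \\
\gamma_{SU_N}^{s_N} &= \frac{(1-\rho_{N,1})\,\alpha_N P\,|h_{SU_N}|^2}
                            {(1-\rho_{N,1})\,\alpha_F P\,|h_{SU_N}|^2 + N_0}, \\
\gamma_{SU_N}^{s_F} &= \frac{(1-\rho_{N,1})\,\alpha_F P\,|h_{SU_N}|^2}{N_0}, \\
\gamma_{SU_F}^{(1)} &= \frac{(1-\rho_{F,1})\,\alpha_F P\,|h_{SU_F}|^2}
                            {(1-\rho_{F,1})\,\alpha_N P\,|h_{SU_F}|^2 + N_0}, \\
\gamma_{SE}^{(1)}   &= \frac{(1-\rho_{E,1})\,\alpha_F P\,|h_{SE}|^2}
                            {(1-\rho_{E,1})\,\alpha_N P\,|h_{SE}|^2 + N_0}, \\
\gamma_{U_N U_F}^{(2)} &= \frac{(1-\rho_{F,2})\,\rho_{N,1}\,\eta P\,|h_{SU_N}|^2\,|h_{U_N U_F}|^2}{N_0}, \\
\gamma_{U_N E}^{(2)}   &= \frac{(1-\rho_{E,2})\,\rho_{N,1}\,\eta P\,|h_{SU_N}|^2\,|h_{U_N E}|^2}{N_0}.
\end{align}
```

Under these assumptions the first-phase SINRs saturate at $\alpha_F/\alpha_N$ for large $P/N_0$, whereas the second-phase SNRs grow without bound because the relay transmission is free of NOMA interference.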

III. INTERCEPT PROBABILITY
In this section, we derive the analytical expression of the intercept probability for the considered case. An intercept event occurs when the achievable secrecy rate $C_{sec}$ falls below 0 [6]. The achievable secrecy rate is the difference between the rates of the main and wiretap links, i.e., $C_{sec} = [C_s - C_e]^+$.
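Since we consider the DF protocol at $U_N$, the end-to-end SNR is determined by the bottleneck between the link from $S$ to $U_F$ and the link from $U_N$ to $U_F$, i.e., $\gamma_S^{DF} = \min(\gamma_{SU_F}^{(1)}, \gamma_{U_N U_F}^{(2)})$, while the achievable rate for the link $S \to U_N \to U_F$ is given as $C_s^{DF} = \frac{1}{2}\log_2(1 + \gamma_S^{DF})$. The eavesdropper is assumed to select the best of the messages received during the first and second phases, i.e., $\gamma_E^{DF} = \max(\gamma_{SE}^{(1)}, \gamma_{U_N E}^{(2)})$. Therefore, the achievable rate for the wiretap links can be given as $C_e = \frac{1}{2}\log_2(1 + \gamma_E^{DF})$. The intercept probability is then the probability that the main-link rate falls below the wiretap rate, $\Pr(C_s^{DF} < C_e)$. Because both rates are the same increasing function of their respective SNRs, this can be re-written as $\Pr\big(\min(\gamma_{SU_F}^{(1)}, \gamma_{U_N U_F}^{(2)}) < \max(\gamma_{SE}^{(1)}, \gamma_{U_N E}^{(2)})\big)$. Evaluating this probability requires the cumulative distribution functions (CDFs) of the individual SNR terms. The CDF of the first-phase term involves the parameter $\lambda_{SU_F}$, which is defined in terms of $m$ and $\Omega_{SU_F}$. The term $\gamma_{U_N U_F}^{(2)}$ depends on the event that the near user has successfully decoded the symbol of the far user. In this case, the CDF of $\gamma_{U_N U_F}^{(2)}$ is expressed in terms of $X_1 = |h_{SU_N}|^2$ and $X_2 = |h_{U_N U_F}|^2$.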
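The intercept event described above can also be estimated by simulation. The following Monte Carlo sketch is illustrative only and is not the authors' MATLAB code. It assumes the standard power-splitting SINR forms (noise normalized to one, so `snr` is $P/N_0$), equal power-splitting factors at all nodes, perfect decoding of $s_F$ at $U_N$, and independent Rayleigh fading with a common mean channel power. The function name, the parameter `mean_gain` and the default value of `eta` are hypothetical.

```python
import numpy as np

def intercept_probability(snr_db, alpha_f=0.8, rho=0.3, eta=0.7,
                          mean_gain=1.0, n_trials=200_000, seed=0):
    """Monte Carlo estimate of Pr(gamma_S^DF < gamma_E^DF) under assumed SNR forms."""
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)      # transmit SNR P / N0 on a linear scale
    alpha_n = 1.0 - alpha_f

    # |h|^2 of a Rayleigh-faded link is exponentially distributed.
    # Rows: S->U_N, S->U_F, S->E, U_N->U_F, U_N->E (assumed independent).
    g_sun, g_suf, g_se, g_unuf, g_une = rng.exponential(mean_gain, size=(5, n_trials))

    def first_phase_sinr(gain):
        # s_F is decoded while s_N is treated as interference.
        signal = (1.0 - rho) * alpha_f * snr * gain
        interference = (1.0 - rho) * alpha_n * snr * gain
        return signal / (interference + 1.0)

    # Second phase: U_N relays s_F using the energy harvested in phase one.
    relay_snr = rho * eta * snr * g_sun
    gamma_s = np.minimum(first_phase_sinr(g_suf), (1.0 - rho) * relay_snr * g_unuf)
    gamma_e = np.maximum(first_phase_sinr(g_se), (1.0 - rho) * relay_snr * g_une)

    # Intercept event: the main link is weaker than the wiretap link.
    return float(np.mean(gamma_s < gamma_e))
```

Because both rates are the same monotone function of their SNRs, comparing `gamma_s` with `gamma_e` is equivalent to comparing the rates, so the logarithms need not be evaluated.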

IV. DEEP LEARNING BASED OPTIMIZATION
In this section, we present a deep learning based resource allocation scheme for optimizing the achievable secrecy rate of the far user. We employ neural networks to learn the relationship between inputs and outputs and to predict the optimal power allocation factor that maximizes the achievable secrecy rate. To do so, we train a multi-layer artificial neural network in which each layer consists of multiple neurons, as illustrated in Figure 2. We show in the numerical results section that computational efficiency is one of the highlights of deep learning models. After the model has been trained on a set of inputs, the testing (i.e., the real-time running phase) involves only nonlinear transformations and vector multiplications, without compromising the performance.

A. Problem Formulation
We now optimize the secrecy performance of the far user, since it is the most vulnerable to an eavesdropping attack. Considering the worst-case scenario, the legitimate receivers have no option but to maximize their own achievable rate. Under this approach, maximizing the achievable rate also maximizes the secrecy rate, since the secrecy rate is the difference between the rate of the legitimate link and the rate of the wiretap link. Under this condition, the optimization problem of the achievable rate becomes equivalent to $\max_{\alpha_F > 0.5} \frac{1}{2}\log_2\big(1 + \min(\gamma_{SU_F}^{(1)}, \gamma_{U_N U_F}^{(2)})\big)$, where $\gamma_{SU_F}^{(1)}$ and $\gamma_{U_N U_F}^{(2)}$ are the SNRs received at $U_F$ in the first and second phases. However, the second term (i.e., $\gamma_{U_N U_F}^{(2)}$) in (20) does not contain the power allocation factor $\alpha_F$. Thus, the optimization problem can be re-formulated as $\max_{\alpha_F > 0.5} \log_2(1 + \gamma_{SU_F}^{(1)})$.
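A conventional iterative search for this kind of single-variable problem can be sketched as an exhaustive scan over candidate values of $\alpha_F$. The code below is a minimal illustration and not the benchmark implementation used in the paper: the grid step, the function names, and the SINR expression inside `far_user_rate` (standard power-splitting form with noise normalized to one) are all assumptions.

```python
import numpy as np

def iterative_search(objective, lower=0.5, upper=1.0, step=0.01):
    """Scan alpha_F over the open interval (lower, upper) and return the best point."""
    n_points = int(round((upper - lower) / step)) - 1
    grid = lower + step * np.arange(1, n_points + 1)   # excludes both end points
    values = np.array([objective(a) for a in grid])
    best = int(np.argmax(values))
    return float(grid[best]), float(values[best])

def far_user_rate(alpha_f, snr=10.0, rho=0.3, gain=1.0):
    """Assumed first-phase rate of the far user for one channel realization."""
    signal = (1.0 - rho) * alpha_f * snr * gain
    interference = (1.0 - rho) * (1.0 - alpha_f) * snr * gain
    return np.log2(1.0 + signal / (interference + 1.0))

best_alpha, best_rate = iterative_search(far_user_rate)
```

The cost of this scan grows linearly with the number of grid points and is paid again for every channel realization, which is the overhead a trained network is meant to avoid.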

B. Deep Learning Network Setup
Our neural network consists of multiple hidden layers, a single input layer and a single output layer. The main reason for using multiple hidden layers is to avoid under-fitting while maintaining a sufficient level of complexity. Moreover, by utilizing multiple hidden layers, the network can capture the complex interplay of inputs and outputs during the learning phase. In our case, the inputs are the channel realizations and the outputs are the power allocation factors of the far user. For each channel realization, we take samples from the Rayleigh distribution while fixing all the other parameters. These values are generated for the training and validation datasets that are fed into the network during the training phase.
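The input generation described above can be sketched with NumPy as follows. This is an illustrative assumption about the data pipeline rather than the authors' script: the number of links per sample, the unit mean channel power and the function names are hypothetical, while the 30000/6000 split follows the sample counts quoted in the numerical results.

```python
import numpy as np

def rayleigh_channels(n_samples, n_links=5, mean_gain=1.0, seed=0):
    """Draw Rayleigh-faded amplitudes |h| whose mean-square value is mean_gain."""
    rng = np.random.default_rng(seed)
    # For a Rayleigh variable with scale sigma, E[|h|^2] = 2 * sigma^2.
    scale = np.sqrt(mean_gain / 2.0)
    return rng.rayleigh(scale, size=(n_samples, n_links))

def split_dataset(features, n_train):
    """Use the first n_train rows for training and the remainder for testing."""
    return features[:n_train], features[n_train:]

channels = rayleigh_channels(36_000)
train_x, test_x = split_dataset(channels, 30_000)
```

The corresponding target for each row would be the power allocation factor returned by a reference optimizer for that channel realization; that labelling step is omitted here.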
At each hidden layer, we use the rectified linear unit (ReLU) activation function. Mathematically, the ReLU activation function is represented as $z = \max(0, y)$, where $z$ denotes the output of the activation function and $y$ is its input. We use the mean square error as the cost function and apply the mini-batch algorithm on the training data samples for calculating the gradients.
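The three ingredients named above (ReLU hidden layers, a mean square error cost, and mini-batch gradient updates) can be combined in a few lines of NumPy. The sketch below is a generic illustration, not the network used in the paper: the layer sizes, the He initialization, the linear output layer, the learning rate and all function names are assumptions.

```python
import numpy as np

def relu(y):
    """ReLU activation: z = max(0, y), applied element-wise."""
    return np.maximum(0.0, y)

def init_params(sizes, rng):
    """One (weights, bias) pair per layer, with He-scaled random weights."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return the activations of every layer (input first, prediction last)."""
    acts = [x]
    for i, (w, b) in enumerate(params):
        y = acts[-1] @ w + b
        is_output = i == len(params) - 1
        acts.append(y if is_output else relu(y))    # linear output layer
    return acts

def mse(pred, target):
    """Mean square error cost."""
    return float(np.mean((pred - target) ** 2))

def sgd_step(params, x, t, lr):
    """One mini-batch gradient step on the MSE cost via backpropagation."""
    acts = forward(params, x)
    grad = 2.0 * (acts[-1] - t) / t.size            # d(MSE) / d(prediction)
    updated = []
    for i in reversed(range(len(params))):
        w, b = params[i]
        grad_w = acts[i].T @ grad
        grad_b = grad.sum(axis=0)
        if i > 0:
            # Back through the weights, then through the ReLU feeding layer i.
            grad = (grad @ w.T) * (acts[i] > 0)
        updated.append((w - lr * grad_w, b - lr * grad_b))
    return updated[::-1]

def train(params, x, t, epochs=50, batch_size=32, lr=0.05, seed=0):
    """Shuffle the data each epoch and update on successive mini-batches."""
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        order = rng.permutation(len(x))
        for start in range(0, len(x), batch_size):
            idx = order[start:start + batch_size]
            params = sgd_step(params, x[idx], t[idx], lr)
    return params
```

Once trained, producing a prediction only requires the `forward` pass, i.e., a handful of matrix products and element-wise maxima, which is why inference is cheap compared with a per-sample search.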

V. NUMERICAL RESULTS
This section provides numerical results and the relevant discussion. It is worth mentioning that the analytical and simulation results have been generated using MATLAB, whereas the deep learning optimization is performed in Python 3.6.7. Unless stated otherwise, the parameters used for the generation of plots are as follows: $\rho_{N,1} = \rho_{N,2} = \rho_{E,1} = \rho_{E,2} = \rho_{F,1} = \rho_{F,2} = \rho = 0.3$, $\Omega = 5$ dB, decay rate = 0.9, training samples = 30000, test samples = 6000, and epochs = 100. Figure 3 illustrates the intercept probability against different values of the transmit SNR $\Omega$ (in dB). It can be seen that the intercept probability decreases with an increase in the values of $\Omega$.
As anticipated, the values of $\eta$ also have a prominent impact on the intercept probability. Specifically, the intercept probability generally increases with a reduction in the values of $\eta$. Moreover, the separation between the curves for different $\eta$ grows as the transmit SNR increases. This shows that a low energy harvesting efficiency is more harmful at higher values of SNR, giving rise to a higher intercept probability.
In addition, we note that the intercept probability increases significantly at higher values of $\alpha_F$. This is partly because of the low power allocated to the near cooperating user. At lower values of its power allocation factor, decoding the message of the far user becomes difficult for the near cooperating user. It can also be seen that the simulation results closely follow the analytical results, which validates the derived expression.
To further highlight the impact of the power-splitting factor, Figure 4 shows the intercept probability as a function of $\rho$. It can be observed that the intercept probability generally increases with an increase in the value of the power-splitting factor. This trend can be attributed to the low amount of energy reserved for information decoding, which makes it difficult to maintain cooperation between the near and far users. Evidently, increasing values of $\eta$ cause the intercept probability to decrease. However, the power allocation factors for the near and far users show different trends as the value of $\rho$ changes. Precisely, we note that at lower values of the power-splitting factor, the separation between the curves of $\alpha_F = 0.9$ and $\alpha_F = 0.6$ is quite large. But, as the value of the power-splitting factor increases, the difference between the curves becomes smaller. This shows that the impact of the power allocation factors of NOMA reduces at higher values of the power-splitting factor.
In Figure 5, we demonstrate the optimization results for the deep learning approach. Here, "Optimal-Iterative" refers to the optimal results achieved through the iterative search scheme, "DL" denotes the results for the deep learning approach, and "Random" represents the results for a random power allocation factor generated using a uniform distribution. Figure 5(a) shows the achievable secrecy rate against increasing values of the power-splitting factor. As shown in this plot, larger values of the power-splitting factor reduce the achievable secrecy capacity, as more power is reserved for energy harvesting. It can be seen that the deep learning approach closely follows the optimal results, always achieving an accuracy of more than 90%. By contrast, the random power allocation achieves a very low secrecy capacity. To further highlight the robustness of the deep learning approach, Figure 5(b) plots the computation time against different values of $\rho$. For this case, the deep learning approach performs significantly better than the iterative search. This shows that, once trained, deep learning models can provide a lightweight solution to achieve optimal results.
Figure 6 examines the suitable number of layers for the neural network by plotting the achievable secrecy rate and computation time for 5 hidden layers. It can be seen in Figure 6(a) that the separation between the deep learning and iterative approaches slightly increases. We attribute this increase to over-fitting of data during the training phase, which causes the neural network to memorize the training set. However, when new testing data is presented, the network fails to generalize the results. This increase in the number of hidden layers also affects the computation time, as shown in Figure 6(b). Specifically, we observe that the computation time of the deep learning approach considerably increases as compared to Figure 5(b). This increase in computation time is due to the increase in the number of hidden layers and in the total number of neurons in the neural network.

VI. CONCLUSION
This study provides a secrecy analysis and deep learning optimization of SWIPT-based cooperative NOMA systems. We derive the analytical expression of the intercept probability when the near user acts as a cooperative node in the presence of an eavesdropper. We have shown that the impact of the power allocation factors of NOMA reduces at higher values of the power-splitting factor. Moreover, we have shown that the deep learning approach is more robust and computationally efficient than the conventional iterative search approach. In the future, we aim to use deep learning for optimizing the secrecy performance of cooperative NOMA systems under colluding eavesdroppers.

ACKNOWLEDGMENT
This work is partially supported by the National Key R&D Plan (2017YFC0803403), the National Natural Science Foundation of China (61371188) and the Fundamental Research Funds of Shandong University (2018GN051).