# A Residual Phase Noise Compensation Method for IEEE 802.15.4 Compliant Dual-Mode Receiver for Diverse Low Power IoT Applications

Abdullah Zubair Mohammed\*, Ajay Kumar Nain\*, Jagadish Bandaru, Ajay Kumar, D. Santhosh Reddy, Rajalakshmi Pachamuthu, *Member, IEEE*

Abstract—The Internet of Things (IoT), with its plethora of applications, brings new challenges in optimizing the power consumption, error performance, latency and throughput of its communication devices. Recently, there has been an increasing need for reconfigurable and multi-standard transceivers that let IoT devices adapt to the nature of the application while satisfying the aforementioned performance metrics. In this work, we focus on the efficient design of receivers compliant with IEEE 802.15.4, which is a prominent protocol for low power IoT applications. We propose a robust phase noise compensation method which efficiently removes the residual phase noise that remains after coarse frequency offset compensation due to direct-sequence spread spectrum operation on large packets. We also propose a dual-mode receiver using the proposed compensation method and showcase its ability to cater to the diverse nature of IoT applications. The detailed architecture of the proposed dual-mode receiver is presented along with its FPGA prototyping and ASIC implementation. We analyze the overall power consumption of the proposed dual-mode receiver considering the packet error rate and retransmission scenario. The results show that the proposed receiver saves significant energy by changing its mode in favorable channel environments.

*Index Terms*—Digital baseband receiver, IEEE 802.15.4, hardware architecture, frequency and phase offset compensator, IoT, phase noise cancellation, dual-mode receiver.

## I. INTRODUCTION

THE prominence of the Internet of Things (IoT) and its consequences for wireless technology have been gaining attention in the recent past. The race for a unified communication framework for IoT devices continues even as IoT finds applications in diverse areas. IoT applications are being deployed in areas ranging from health care to entertainment, banking to home automation, indoor and outdoor, mobile and static [1]. On the basis of their requirements, these diverse IoT applications can be broadly categorized as sustainable IoT and critical IoT. A similar classification is done in [2] for cellular IoT. Critical IoT applications like remote healthcare, industrial automation and control have high demands for reliability, accuracy and low latency. On the other hand, sustainable IoT applications like environmental monitoring and smart agriculture require their nodes to survive on batteries for a long duration. However, sustainable applications can make do with a relatively higher error tolerance. A transceiver designed for critical IoT applications may not cater to the needs of the other class of applications due to its higher power consumption. Similarly, a transceiver designed for sustainable IoT applications is not suitable for critical applications due to its relatively poor reliability. Hence, there is a requirement for a universal communication chip-set for all IoT applications.

A.Z. Mohammed, A.K. Nain, B. Jagadish, A. Kumar, D.S. Reddy, P. Rajalakshmi are with Indian Institute of Technology Hyderabad, India.

Email: {ee14mtech01003, ee14resch11001, ee15resch02010, ee15resch02002, ee15resch11005, raji} @iith.ac.in

"Copyright (c) 2012 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org."


Low power consumption is strongly desired in all IoT applications. There are various approaches to reducing the power consumption in a network [3]. One method is to optimize the MAC layer by changing parameters such as the back-off period and the duty cycle. Modifying the duty cycle is a simple way to reduce the average power consumption of the receiver, but it adds latency and additional complexity in synchronizing the nodes [4]. Another method to increase the lifetime of the network is to reduce the power consumption at the PHY layer by choosing an appropriate design for the transceiver, which consumes a significant share of the power of a wireless node. The demand for low power consumption motivates us to design an optimized digital baseband that minimizes the power consumption in favorable IoT applications.

Many wireless protocols have been standardized or are being proposed, such as IEEE 802.15.4, Bluetooth Low Energy, LoRa, Narrow-Band IoT, and so forth. Each protocol has its own power consumption, range, and other specifications. Our work focuses on short-range communication and on IEEE 802.15.4, which is a prominent protocol used in low power IoT applications [5]. Usually, IEEE 802.15.4 compatible nodes are battery operated, and the batteries are expected to be replaced or recharged once they die out [6]. However, the energy consumption of such a node can be significantly reduced by modifying it according to the requirements. Since IEEE 802.15.4 uses offset quadrature phase shift keying (OQPSK) modulation with half-sine pulse shaping, receivers are generally designed to demodulate the signal non-coherently as a minimum shift keying (MSK) signal. However, an OQPSK signal can also be demodulated coherently under the class of M-ary PSK signals. Coherent demodulation requires carrier synchronization and gives better error rate performance at the cost of additional power and hardware complexity. This creates a trade-off between power consumption and error performance. The trade-off can be exploited to make the receiver adaptable to IoT applications, viz. critical and sustainable applications.

<sup>\*</sup> The first two authors contributed equally to this work. Asterisk indicates the first authors.

In this work, we focus on the IEEE 802.15.4 baseband receiver and make it suitable for diverse IoT applications, ranging from critical to sustainable, with the goal of optimum power consumption. We propose a novel residual phase noise compensation method for coherent demodulation of the signals. The proposed method is applied after the coarse frequency offset compensation and provides the better error rate performance required by critical IoT applications. We employ the error correction capability of the direct-sequence spread spectrum (DSSS) used in 802.15.4 to estimate the current symbol from despread and corrected preceding symbols. As an application of the proposed algorithm, we demonstrate a dual-mode IEEE 802.15.4 receiver and showcase how it can meet the requirements of diverse IoT applications. The proposed dual-mode receiver can be configured according to the channel quality or the application requirements, viz. critical or sustainable. For instance, in an industrial IoT scenario, if an IEEE 802.15.4 node is deployed to monitor temperature, humidity etc., then the receiver can be configured to the sustainable mode [7]. On the other hand, if it is deployed for a critical application like fire alarms or fault detection, then it can be configured to operate in the critical mode.

Following are the main contributions of the work:

- 1) Propose and implement a novel DSSS-aided residual phase noise compensation algorithm to provide high error performance for critical IoT applications.
- 2) Propose a dual-mode IEEE 802.15.4 receiver architecture with two demodulator chains optimized for error performance and power consumption depending on the application and/or channel conditions.
- 3) Employ a shared hardware architecture, wherein the low-complexity MSK receiver is derived completely from the hardware of the more complex OQPSK receiver.
- 4) Compare the power consumption and resource allocation between the individual receiver chains and the proposed dual-mode architecture.
- 5) Compare the two modes of the proposed dual-mode receiver using bit error rate, packet error rate, and hardware complexity as performance metrics.

The rest of the paper is organized as follows. Section II describes the related works. Section III explains the need for the proposed residual phase noise compensation and presents its algorithm in detail. Section IV describes the system architecture of the dual-mode receiver, which uses the proposed compensation algorithm for the critical mode (critical IoT applications). Section V explains the circuit level architecture of the complete dual-mode receiver. Section VI explains the results and performance analysis of the proposed system. The paper is concluded in Section VII.

## II. RELATED WORKS

The effect of carrier frequency offset on the IEEE 802.15.4 class of signals has been studied well in the literature. In [8], the authors presented a receiver with frequency offset estimation but did not address the issue of the residual phase noise remaining after applying coarse frequency compensation. In [9], a simple design of detector (SDD) is presented in the presence of frequency offset, where correlators are implemented using multiplexers to reduce the hardware complexity. In contrast, we utilize the capability of DSSS to propose a phase noise compensation method which is less complex than the SDD. A few methods have been proposed in the literature to utilize the capabilities of the DSSS used in 802.15.4 for purposes other than error rate reduction. In [10], the authors used the property of DSSS to create a covert channel along with the primary channel for secret data transmission. However, to the best of the authors' knowledge, DSSS is unexplored for the purpose of frequency offset compensation.

The development of low power transceivers for IoT applications has been in increasing demand in the recent past, and multiple works in the literature address IoT communication using low power transceivers. In [11], [12], [13], the authors proposed IoT transceivers which perform well for body area networks with a coverage range of up to 1-2 meters. However, these are not suitable for low power applications which need to cover a large geographical area, like environment monitoring and industrial applications. In [14], [15], the authors proposed IEEE 802.15.4 compliant transceivers for longer range low power IoT applications. They proposed techniques to reduce the power consumption in the RF section of the IEEE 802.15.4 receiver and used standard baseband processing to obtain a fully integrated receiver. In this work, we propose a method for the digital baseband section of the transceiver which reduces the overall power consumption in IoT applications. The proposed baseband can be integrated with the low power RF sections proposed in [14], [15] for a further reduction of the overall power consumption.

Recently, there has been an increasing need for reconfigurable and multi-standard transceivers due to the growing variety of communication technologies. In [16], a reconfigurable analog-to-digital converter (ADC) has been proposed to provide various high data rates for a multi-standard receiver. In [17], P. Wu et al. proposed a multi-mode multi-band SoC (system on chip) which includes three modulation schemes and uses a single synchronization mechanism for all of them in order to save power. However, it does not use the channel statistics to choose the scheme and requires user intervention for every switch. In [18], [19], the authors proposed an all-digital, fully compliant IEEE 802.11b and 802.15.4 configurable baseband receiver which can be configured to receive signals based on the requirements. In [20], the authors proposed a reconfigurable baseband receiver which supports dual mode for the IEEE 802.15.3c and IEEE 802.11ad standards. In [21] and [22], an MSK receiver for IEEE 802.15.4 has been proposed, while an OQPSK approach in the presence of frequency offset has been proposed in [8], [9]. In [23], a dual-mode IEEE 802.15.4 digital baseband receiver has been proposed. However, it had no residual offset cancellation module, which limits its capability for critical IoT applications. In this work, we propose a dual-mode receiver with a residual phase noise compensation algorithm which makes it suitable for critical IoT applications. For sustainable IoT applications, the MSK chain is utilized to reduce the energy consumption at the cost of a tolerable degradation in error rate performance.

Fig. 1: Constellation plot of received symbols with coarse frequency & phase offset estimation and initial (frequency and phase) offset compensation. (a) Received (without compensation), small packet (20 bytes). (b) Received (without compensation), large packet (100 bytes).

## III. PROPOSED RESIDUAL PHASE NOISE COMPENSATION

To compensate for the effect of frequency offset in the carrier signal of any communication system, the first step is to apply a coarse frequency estimation method which gives the approximate value of the frequency offset [24]. A few fine frequency offset correction methods [25], [26] have also been proposed, but these methods impose a heavy computational load to find the correlation values for all possible carrier frequency offsets [24]. To overcome the remnant error after applying the coarse frequency estimation, we propose a DSSS-aided phase noise compensator, which works in tandem with de-spreading and uses the *corrected* preceding symbols to estimate the current symbol. In this section, the effect of the remnant error due to coarse frequency estimation on large packets is discussed, followed by the proposed residual phase noise algorithm in detail.

#### A. Effect of residual offset on large packets

As a coarse frequency estimation method, we have used the R & B algorithm [27], which is described in Section V. We have analyzed the method for different packet sizes. The constellation diagrams of the received symbols are shown in Figs. 1a and 1b for packet lengths of 20 bytes and 100 bytes respectively and a frequency offset of 80 ppm. As we can observe, there cannot be an optimum decision boundary due to the continuous rotation of the symbols. After the coarse frequency offset compensation, the constellations for both the packets are shown in Figs. 1c and 1d respectively. From these figures, it can be observed that for a small packet, there is no problem in the further detection of symbols. For the larger packet, on the other hand, the symbols cannot be detected perfectly because they are still rotating. This phenomenon can be explained by equation 1, which represents the received signal  $\hat{z}_k$  after coarse frequency estimation and initial offset compensation.

$$\hat{z_k} = x_k e^{j2\pi (f_d - \hat{f_d}) T_s k} + n_k \tag{1}$$

where  $T_s$  represents one symbol period, $k$ refers to the  $k^{th}$  symbol of the frame, and  $x_k$  and  $n_k$  represent the  $k^{th}$  symbol and the noise corresponding to the  $k^{th}$  symbol respectively.  $f_d$  and  $\hat{f}_d$  represent the actual and the estimated frequency offset respectively. Let  $\hat{e}_f$  be the phase noise remaining after coarse frequency compensation. Then, equation 1 can be written as:

$$\hat{z_k} = x_k e^{j2\pi \hat{e_f} k T_s} + n_k \tag{2}$$

where  $\hat{e_f} = (f_d - \hat{f_d})$ . From equation 2, it can be observed that the effect of  $\hat{e_f}$  increases with the symbol index $k$. If $k$ is large because the frame is long, the product  $\hat{e_f}k$  becomes large enough to rotate the symbol out of its decision region on the constellation plane. For a packet of 100 bytes, an IEEE 802.15.4 transmitter transmits a DSSS signal of 3200 symbols. The maximum rotation an OQPSK symbol can tolerate is  $\pi/4$ . Therefore, for the accurate detection of the last symbol ($k = 3200$), we need  $2\pi \hat{e_f}kT_s < \pi/4$ . For a sampling rate of 1 Mega samples per second (after modulation), this gives  $\hat{e_f} < 40$  Hz, which demands a frequency estimate accurate to 0.02% of the 80 ppm frequency offset. This is a strenuous task for an estimator.
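The bound above can be checked numerically. The following is a minimal sketch of that arithmetic; the 2.4 GHz carrier frequency used to convert 80 ppm into hertz is an assumption (it is not stated in this section), and the variable names are illustrative.

```python
import math

T_s = 1e-6                  # symbol period for 1 Msample/s (from the text)
k_last = 3200               # OQPSK symbols in a 100-byte packet (from the text)
max_rotation = math.pi / 4  # largest rotation an OQPSK symbol tolerates

# 2*pi*e_f*k*T_s < pi/4  =>  e_f < 1 / (8*k*T_s)
e_f_max = max_rotation / (2 * math.pi * k_last * T_s)

f_carrier = 2.4e9                # ASSUMPTION: carrier in the 2.4 GHz band
f_offset = 80e-6 * f_carrier     # 80 ppm expressed in Hz
accuracy_percent = 100 * e_f_max / f_offset

print(f"maximum residual offset: {e_f_max:.2f} Hz")
print(f"80 ppm offset: {f_offset / 1e3:.0f} kHz")
print(f"required accuracy: {accuracy_percent:.3f} % of the offset")
```

Under these assumptions the bound evaluates to roughly 39 Hz, which is consistent with the approximate 40 Hz and 0.02% figures quoted above.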

As discussed, the estimation error leaves a residual phase error which increases with the number of symbols. To address this issue, we propose a DSSS-aided method, named the residual phase noise compensation algorithm, which is described in detail in the subsequent section.

#### B. Description of Proposed Residual Phase Noise Compensation

In the proposed method, we utilize the knowledge of preceding symbols to nullify the effect of a large value of $k$. If we ignore the noise  $n_k$  for the frequency offset compensation, the received signal  $z_k$  is represented by the following:

$$z_k = x_k e^{j2\pi f_d T_s k} \tag{3}$$

Similarly, if we use the  $(k - i)_{th}$  symbol for correction in frequency offset the equation for  $(k - i)_{th}$  received symbol will be

$$z_{k-i} = x_{k-i} e^{j2\pi f_d T_s(k-i)} \tag{4}$$

From the equations 3 and 4, we can write the following equation

$$\begin{aligned} z_{k-i}^* \hat{x}_{k-i} z_k &= x_{k-i}^* e^{-j2\pi f_d T_s(k-i)} \hat{x}_{k-i} x_k e^{j2\pi f_d T_s k} \\ &= x_{k-i}^* \hat{x}_{k-i} e^{j2\pi f_d T_s i} x_k \end{aligned} \tag{5}$$


The symbol  $x_k$  can now be estimated as

$$\hat{x}_{k_i} = \frac{z_{k-i}^* \hat{x}_{k-i}}{|\hat{x}_{k-i}|^2} z_k e^{-j2\hat{v}i} \tag{6}$$

Here the notation  $\hat{x}_{k_i}$  represents the estimate of  $x_k$  using the  $i^{th}$  preceding symbol, where  $i = 1, 2, 3, ..., N_{rcfo}$  and  $N_{rcfo}$  is the total number of preceding symbols used. It should be noted here that  $\hat{x}_{k-i}$  is the symbol generated from the chips of the original DSSS sequence which best matches the received DSSS sequence. This eliminates up to 5 bit errors (the minimum Hamming distance between any two DSSS sequences used in IEEE 802.15.4 is 12) and hence gives us an accurate estimate of the current symbol. This process returns a set  $\hat{\boldsymbol{X}}_{k} = \{\hat{x}_{k_{i}} | i = 1, 2, 3, ..., N_{rcfo}\}$  of length  $N_{rcfo}$ . The final estimate of the current symbol is then given as  $\hat{x}_k = mode(\hat{\boldsymbol{X}}_k)$ , where the function $mode$ returns the element that occurs most often in the input set. The pseudo-code of the algorithm is given in Algorithm 1. In the algorithm, $z$ represents the received symbols,  $\hat{x}$  represents the preceding estimated symbols and  $\hat{v}$  represents the coarse estimate of the frequency offset.  $N_{rcfo}$  represents the number of previous symbols used to estimate the current symbol. The function DeSpreadSequence finds the actual DSSS sequence that is closest to the sequence of accumulated 32 bits, generates the OQPSK symbols from it, and provides the corresponding symbol Symbols\_out by performing chip-to-symbol mapping.

Algorithm 1 Proposed Residual Phase Noise Compensation

| 1:  | <b>procedure</b> PRPNC $(z, \hat{x}, \hat{v}, N_{rcfo})$ |
|-----|-----------------------------------------------------------|
| 2:  | $l \leftarrow 1$ |
| 3:  | for all $k$ in $N$ do |
| 4:  | for all $i$ in $N_{rcfo}$ do |
| 5:  | $x_{k_i} \leftarrow \frac{z_{k-i}^* \hat{x}_{k-i}}{\lvert \hat{x}_{k-i} \rvert^2} z_k e^{-j2\hat{v}i}$ |
| 6:  | $\hat{\boldsymbol{X}}_k \leftarrow \hat{\boldsymbol{X}}_k \cup \{x_{k_i}\}$ |
| 7:  | end for |
| 8:  | $\hat{x}_k \leftarrow mode(\hat{\boldsymbol{X}}_k)$ |
| 9:  | Case $\{\hat{x}_k\}$ |
| 10: | $\in Q1: \{b_{k_i}, b_{k+1_i}\} \leftarrow \{1, 1\}$ |
| 11: | $\in Q2: \{b_{k_i}, b_{k+1_i}\} \leftarrow \{-1, 1\}$ |
| 12: | $\in Q3: \{b_{k_i}, b_{k+1_i}\} \leftarrow \{-1, -1\}$ |
| 13: | $\in Q4: \{b_{k_i}, b_{k+1_i}\} \leftarrow \{1, -1\}$ |
| 14: | EndCase |
| 15: | $l \leftarrow l+1$ |
| 16: | if $l = 16$ then |
| 17: | $\{\hat{x}[k], \hat{x}[k - 1], \dots, \hat{x}[k - 15]\} \leftarrow DeSpreadSequence(\{\{b_{j_i}, b_{j+1_i}\} \mid j = 1, 2, \dots, 16\})$ |
| 18: | $Symbols\_out \leftarrow DeSpreadSequence(\{\{b_{j_i}, b_{j+1_i}\} \mid j = 1, 2, \dots, 16\})$ |
| 19: | $l \leftarrow 1$ |
| 20: | end if |
| 21: | end for |
| 22: | return Symbols_out |
| 23: | end procedure |
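The core of Algorithm 1 (equation (6) followed by the mode decision) can be illustrated with a short simulation. The sketch below is a simplified illustration and not the authors' implementation: it uses plain QPSK points instead of half-sine shaped OQPSK, it is noiseless, it omits the DeSpreadSequence correction step so that decided symbols are fed back directly, and it writes the coarse term of equation (6) as $2\pi \hat{f}_d T_s i$. The function names, the seed symbols and the 200 Hz estimation error are all assumptions chosen for the demonstration.

```python
import numpy as np

# Unit-energy QPSK points indexed by quadrant Q1..Q4 (illustrative mapping).
QPSK = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)

def quadrant(c):
    """Return 0..3 for the quadrant (Q1..Q4) in which the complex value c lies."""
    return int(np.floor(np.angle(c) / (np.pi / 2))) % 4

def residual_phase_compensate(z, x_seed, f_d_hat, T_s, n_rcfo):
    """Decide z[len(x_seed):] using the n_rcfo previously decided symbols."""
    assert len(x_seed) >= n_rcfo
    x_hat = list(x_seed)          # known symbols (e.g. a preamble) seed the history
    for k in range(len(x_seed), len(z)):
        votes = []
        for i in range(1, n_rcfo + 1):
            # Equation (6): estimate of x_k from the i-th preceding symbol.
            est = (np.conj(z[k - i]) * x_hat[k - i] * z[k]
                   * np.exp(-2j * np.pi * f_d_hat * T_s * i)
                   / abs(x_hat[k - i]) ** 2)
            votes.append(quadrant(est))
        winner = int(np.bincount(votes, minlength=4).argmax())  # mode of the votes
        x_hat.append(QPSK[winner])
    return np.array(x_hat)

# Demo: 3200 symbols, 192 kHz offset, coarse estimate wrong by 200 Hz (assumed values).
rng = np.random.default_rng(0)
T_s, f_d, f_d_hat, n_rcfo = 1e-6, 192e3, 192e3 - 200.0, 8
idx = rng.integers(0, 4, size=3200)
x = QPSK[idx]
k = np.arange(x.size)
z = x * np.exp(2j * np.pi * f_d * T_s * k)

coarse_only = z * np.exp(-2j * np.pi * f_d_hat * T_s * k)
errors_coarse = sum(quadrant(c) != q for c, q in zip(coarse_only, idx))
x_hat = residual_phase_compensate(z, x[:n_rcfo], f_d_hat, T_s, n_rcfo)
errors_proposed = sum(quadrant(c) != q for c, q in zip(x_hat, idx))
print("symbol errors, coarse compensation only:", errors_coarse)
print("symbol errors, with equation (6) and mode vote:", errors_proposed)
```

Because each estimate only spans $i \le N_{rcfo}$ symbols, the residual rotation it sees stays small regardless of how large $k$ becomes, which is the property the method relies on. In a noisy channel, wrong decisions would propagate through the feedback, which is where the DSSS correction step of Algorithm 1 is expected to help.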

## IV. SYSTEM ARCHITECTURE AND WORKING PRINCIPLE OF PROPOSED DUAL-MODE RECEIVER

The working principle of the proposed dual-mode receiver lies in the fact that OQPSK with half-sine pulse shaping can be analyzed as an MSK signal under the category of continuous phase frequency shift keying (CPFSK) [28], and also as QPSK with an offset in the quadrature phase under the class of M-ary phase shift keying (PSK). CPFSK signals can be demodulated non-coherently, so the carrier frequency and phase synchronization, which consumes significant power, can be avoided. The proposed receiver has two modes of operation and can be switched automatically or manually between coherent and non-coherent detection on the basis of the channel condition or the application requirement.

The block diagram of the proposed receiver is depicted in Fig. 2. The *External Memory* block stores the samples from the ADC of the RF section of the receiver. The *Memory Controller* block controls the accesses to the external memory by the different blocks of the receiver. The *Symbol Timing Recovery* block finds the appropriate timing boundary in order to locate the samples at the peaks of the pulse for maximum signal-to-noise ratio (SNR). Then, as depicted, the chain is divided into two parts: the upper chain performs OQPSK demodulation and the other chain performs MSK demodulation. The *Shared Cordic Vector* and *Shared Cordic Rotation* blocks serve all the blocks of the two chains, converting incoming samples from memory to polar coordinates. Sharing them reduces the hardware complexity of the receiver, as described in detail in the next section.

In the OQPSK chain, the *Coarse Frequency and Phase Offset Estimation* block first provides the coarse estimates of the frequency and phase offset, computed using the FFT-based estimation algorithm. In the *Initial Offset Compensation and Preamble Detection* block, the estimated offsets are removed from the samples and preamble detection is performed to find the exact start of the data frame. If a preamble is detected, the remaining samples are fed to the *Proposed Residual Phase Noise Compensation* block, which minimizes the phase noise error remaining after the removal of the coarse offset. In this block, the DSSS is utilized to correct the errors in the chip sequence, and the corrected symbols are used to estimate the upcoming samples. The algorithm is described in Section V.

In the MSK chain, *Differential Phase Detection* is employed to annul the effects of the frequency and phase offset and then to demodulate the samples according to the detected phase. After that, frame synchronization is performed in the *Preamble Detection* block and the samples are mapped to symbols in the *Despreading* block. The data bits are then fed to the MAC layer after symbol-to-bit conversion in the *Decoding* block.
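To illustrate why differential phase detection is insensitive to a constant phase offset and tolerant of a frequency offset, the sketch below detects an MSK-like phase trajectory from the sign of the phase change between consecutive samples. It is an assumption-laden illustration, not the circuit used in the receiver: one noiseless sample per chip, a chip period of 0.5 µs, a 192 kHz offset and the function name are all choices made for this example.

```python
import numpy as np

def differential_phase_detect(z):
    """Return +1/-1 decisions from the sign of the phase change between samples."""
    dphi = np.angle(z[1:] * np.conj(z[:-1]))   # a constant phase offset cancels here
    return np.where(dphi > 0, 1, -1)

rng = np.random.default_rng(1)
chips = rng.choice([-1, 1], size=64)
# MSK: each chip advances the phase by +pi/2 or -pi/2 (one sample per chip here).
phase = np.concatenate(([0.0], np.cumsum(chips * np.pi / 2)))

T_c, f_d, theta = 0.5e-6, 192e3, 0.7           # ASSUMED chip period, offset, phase
n = np.arange(phase.size)
z = np.exp(1j * (phase + 2 * np.pi * f_d * T_c * n + theta))

detected = differential_phase_detect(z)
print("all chips recovered:", bool(np.array_equal(detected, chips)))
```

The frequency offset only adds a constant $2\pi f_d T_c$ to every phase difference, so the sign decision is unaffected as long as that term stays below $\pi/2$; no carrier synchronization is required.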

The memory controller manages the accesses to the external memory by the different blocks and also controls the receiver to operate in one of the two modes. The proposed receiver architecture is designed to operate in two modes namely Manual Mode and Automatic Mode.

In the Manual Mode, the receiver is manually configured to operate in one of the two demodulator chains based on the application's requirements. In the Automatic Mode, on the other hand, the receiver configures itself based on the input from the channel indicator. The preamble detection block in each chain can be used to indicate whether the channel condition is good or bad. If the received preamble is corrupted beyond a certain threshold, the SNR is decided to be low, else good. In the beginning, the receiver starts in MSK mode, assuming the channel is good. If the SNR is found to be good enough, the MSK chain continues to be used for reception. Conversely, if the SNR is below the threshold, the channel will have poor error performance, which makes the MSK chain unsuitable. The OQPSK chain performs better at lower SNRs. Depending on the preamble detection correlation peak, the memory controller switches between the two chains. The noteworthy point is that the transmitter need not change its modulation scheme. For the same transmitted signal, the receiver can demodulate it using either of the two chains.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2018.2884654, IEEE Internet of Things Journal

Fig. 2: Architecture of the proposed dual-mode receiver with a novel residual phase noise compensator

## V. IMPLEMENTATION OF PROPOSED DUAL-MODE RECEIVER

In this section, we describe the received signal model, followed by the implementation of the different blocks of the dual-mode receiver. The received baseband signal, z(t), is given as

$$z(t) = x(t - \tau)e^{j(2\pi f_d t + \theta)} + n(t) \tag{7}$$

where x(t) is the transmitted signal,  $\tau$  is the timing error,  $f_d$  is the frequency offset,  $\theta$  is the constant phase offset and n(t) is additive white Gaussian noise (AWGN). For simplicity, we decompose  $\tau$  into two parts as given below:

$$\tau = \mu T_s + \epsilon T_s \tag{8}$$

where  $\mu \ge 0$  is an integer delay and  $-0.5 < \epsilon < 0.5$  represents the fractional delay.  $T_s$  represents one symbol period.  $\mu$  represents the symbol delay, which can be dealt with by preamble detection. For timing recovery, we can assume that  $\mu$  is known, and hence our goal is to estimate  $\epsilon$ . In discrete form, the received signal can be represented as below:

$$z_k(i) = x(kT_s + i\frac{T_s}{N} - \epsilon T_s)e^{j(2\pi f_d(kT_s + i\frac{T_s}{N}) + \theta)} + n_k(i) \tag{9}$$

where  $z_k(i)$  represents the  $i^{th}$  sample of the  $k^{th}$  symbol and  $n_k(i)$  represents the noise on the corresponding sample. $N$ is the total number of samples per symbol. The first step at the receiver is to perform timing recovery, which is done by the *Symbol Timing Recovery* (STR) block. We employ a non-linear transformation based algorithm [29] for STR. The STR block provides the estimated time shift  $\hat{\tau}$ , which is used to find the peak of the pulse (only one of the $N$ samples per symbol pulse is kept). After performing timing recovery, the received signal expressed in equation 9 can be represented as below:

$$z_k = x_k e^{j(2\pi f_d(kT_s) + \theta)} + n_k \tag{10}$$

where  $z_k$ ,  $x_k$  and  $n_k$  represent the received symbol, the transmitted symbol and the noise corresponding to the  $k^{th}$  modulated symbol. The estimated time shift  $\hat{\tau}$  determines the memory index, which is fed to the memory controller so that it fetches the optimum samples (the peak of the pulse for each modulated symbol) from the memory for further processing. After the calculation of the time shift, the *Symbol Timing Recovery* block is bypassed and the required samples are read from the memory by the succeeding blocks. In the following subsections, the algorithm and the detailed circuit design are presented for each block.

#### A. Coarse Frequency and Phase Offset Estimation

We consider the R & B algorithm [27], which uses the discrete Fourier transform (DFT) to compute the maximum likelihood estimates of the frequency offset  $f_d$  and the phase offset  $\theta$ . The estimates  $\hat{f}_d$  and  $\hat{\theta}$  are given as

$$\hat{f}_d = \arg\max_{f_d} \left|\sum_{k=0}^{N_{FFT}-1} q_k e^{-j2\pi f_d k T_s}\right| \tag{11}$$

$$\hat{\theta} = \arg\max_{\theta} e^{-j\theta} \left|\sum_{k=0}^{N_{FFT}-1} q_k e^{-j2\pi \hat{f}_d k T_s}\right| \tag{12}$$

where  $N_{FFT}$  is the length of the DFT to be performed and the value of  $q_k$  is  $q_k = e^{jM\phi_k}$ , where  $\phi_k = arg(z_k)$  and M = 2 for OQPSK signal [30]. The length of the sequence  $q_k$  is taken as 128 samples.
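A DFT-peak search of the kind expressed by equation (11) can be sketched with a zero-padded FFT. The example below is only an illustration under stated assumptions: the input $q_k$ is taken as an already modulation-free complex tone (the $M$-th power step is not modelled), the search grid is the FFT bin grid, the phase is read as the argument of the DFT sum at the peak, and the function name, FFT size and test values are hypothetical.

```python
import numpy as np

def coarse_estimate(q, T_s, n_fft=4096):
    """Return the DFT-peak frequency (Hz) and the phase of the DFT sum at that peak."""
    spectrum = np.fft.fft(q, n=n_fft)        # zero-padded DFT of q_k
    freqs = np.fft.fftfreq(n_fft, d=T_s)     # candidate frequencies in Hz
    peak = int(np.argmax(np.abs(spectrum)))  # grid version of the arg max in (11)
    return float(freqs[peak]), float(np.angle(spectrum[peak]))

T_s, f_true, theta_true = 1e-6, 50e3, 0.4    # illustrative values, not from the paper
k = np.arange(128)                           # 128 samples, as in the text
q = np.exp(1j * (2 * np.pi * f_true * T_s * k + theta_true))

f_hat, theta_hat = coarse_estimate(q, T_s)
print(f"frequency estimate: {f_hat:.1f} Hz, phase estimate: {theta_hat:.3f} rad")
```

The frequency resolution of this search is limited by the bin spacing $1/(N_{FFT} T_s)$, so an estimation error of some tens of hertz or more remains. This is the kind of residual error that the compensation of Section III targets.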



Fig. 3: Circuit architecture of initial offset compensation and preamble detection



Fig. 4: Output of the correlator of preamble detection as a SNR indicator

#### B. Initial Offset Compensation and Preamble Detection

The samples  $z_k$  are read from the memory and converted to polar  $(r_k, \phi_k)$  form. The magnitude  $r_k$  can be neglected since the symbol-detection is made only from  $\phi_k$  by checking the quadrant on which the symbol lies. Compensated phase  $\hat{\phi}_k$  for the initial 128 samples using the coarse estimates  $(\hat{f}_d, \hat{\theta})$  can be expressed as

$$\hat{\phi_k} = \phi_k - 2\hat{v}k - \hat{\theta} \tag{13}$$

where  $\hat{v} = \hat{f}_d T_s$  and  $\hat{f}_d$ ,  $\hat{\theta}$  are the estimates of the frequency and phase offset respectively. To find the detection rule on  $\hat{\phi}_k$ , we first have to find its probability distribution. Here  $\hat{\phi}_k$  is the phase of a complex random variable whose real and imaginary components are independent Gaussian random variables (due to the additive white Gaussian noise channel). It can be easily proved, according to the method described in [31], that  $\hat{\phi}_k$  follows the uniform distribution in the period  $[0, 2\pi]$ . Hence, the detection is performed using the following rule:

$$(\hat{b}_{l}, \hat{b}_{l+1}) = \begin{cases} (1,1) ; & \hat{\phi}_{2l} \in (-\pi/2, \pi/2) \ \& \ \hat{\phi}_{2l+1} \in (0,\pi) \\ (1,0) ; & \hat{\phi}_{2l} \in (-\pi/2, \pi/2) \ \& \ \hat{\phi}_{2l+1} \in (\pi, 2\pi) \\ (0,1) ; & \hat{\phi}_{2l} \in (\pi/2, 3\pi/2) \ \& \ \hat{\phi}_{2l+1} \in (0,\pi) \\ (0,0) ; & \hat{\phi}_{2l} \in (\pi/2, 3\pi/2) \ \& \ \hat{\phi}_{2l+1} \in (\pi, 2\pi) \end{cases} \tag{14}$$
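The decision rule of equation (14) amounts to checking the sign of the cosine of the even-indexed phase and the sign of the sine of the odd-indexed phase. A minimal sketch follows; the function name is hypothetical and the handling of phases exactly on a boundary is an assumption, since the rule uses open intervals.

```python
import math

def detect_bit_pair(phi_even, phi_odd):
    """Map the compensated phases of samples 2l and 2l+1 to the bit pair of (14)."""
    # Even sample: phase in (-pi/2, pi/2) means a positive cosine.
    b_first = 1 if math.cos(phi_even) > 0 else 0
    # Odd sample: phase in (0, pi) means a positive sine.
    b_second = 1 if math.sin(phi_odd) > 0 else 0
    return b_first, b_second

# One example per row of equation (14).
for phases in [(0.1, 1.0), (0.1, 4.0), (3.0, 1.0), (3.0, 4.0)]:
    print(phases, "->", detect_bit_pair(*phases))
```

Using the cosine and sine signs makes the rule independent of whether the phase is expressed in $(-\pi, \pi]$ or in $[0, 2\pi)$.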

After the detection, 256-bit XOR operation is performed on the bit-stream  $\hat{b}$  and the reference preamble bit-stream. If the output crosses a specified threshold, a valid packet is detected and the start of payload index is written in the memory controller for further processing. The circuit architecture is shown in Fig. 3.

For the Automatic Mode of the receiver, the output of the XOR is used to indicate the channel conditions. Fig. 4 shows the output varying with SNR. A threshold is set, and the OQPSK chain is selected if the XOR output is below the threshold. On the other hand, the MSK chain is selected if the output is above the threshold, and the receiver operates in the sustainable mode. We have selected 110 as the threshold for taking this decision. If the output is found to be less than 80, we reject the frame due to insufficient channel quality and count it as a frame loss.
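The chain selection described above can be summarized as a small decision function. This is a sketch under assumptions: it reads the rejection condition as the correlator output falling below 80, it treats an output exactly equal to a threshold as belonging to the higher band, and the function and label names are hypothetical.

```python
def select_chain(correlator_output, switch_threshold=110, reject_threshold=80):
    """Pick the demodulator chain from the preamble correlator output."""
    if correlator_output < reject_threshold:
        return "reject"   # channel too poor: frame counted as lost
    if correlator_output < switch_threshold:
        return "oqpsk"    # low SNR: coherent chain (critical mode)
    return "msk"          # good SNR: non-coherent chain (sustainable mode)

for value in (70, 95, 150):
    print(value, "->", select_chain(value))
```

Keeping both thresholds as parameters reflects that the text presents 110 and 80 as selected values, which may need re-tuning for a different correlator length or channel.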

## C. Proposed Residual Phase Noise Compensator

The working principle and algorithm of the proposed residual phase noise compensator were discussed in Section III-B. To reduce the hardware complexity of the system, we have implemented equation (6) in its polar form, which can be written as

$$\phi_{k_i} = -\phi_{z_{k-i}} + \phi_{\hat{x}_{k-i}} + \phi_{z_k} - 2\hat{v}i \tag{15}$$

where  $\phi_{k_i}$  represents the phase of the signal at index  $k$  estimated from the  $i^{th}$  preceding symbol.

The architecture of the proposed residual frequency offset compensation is shown in Fig. 5. The samples are read from the memory blocks and fed to the *CORDIC Vector* block, which converts the received samples into polar form. The partial sum of equation (15),  $-\phi_{z_{k-i}} + \phi_{\hat{x}_{k-i}}$, is stored in *register bank 0* and the offset term  $2\hat{v}i$  in *register bank 1*. *Register bank 0* is updated for every set of 32 samples. The adder then completes the sum of equation (15) and, after detection, the bits are sent to a shift register of length  $N_{rcfo}$  bits. This is repeated  $N_{rcfo}$  times. The decision  $b_{k_i}$  on  $\hat{\phi}_{k_i}$  is made using the decision rule of equation (14), and the bit  $\hat{b}_k$  is decided in the *decision* block as the mode of the collected bits:

$$\hat{b}_k = \mathrm{mode}(\{b_0, b_1, \ldots, b_{N_{rcfo}}\}) \tag{16}$$
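To make the data flow of equations (15) and (16) concrete, the following Python sketch computes the $N_{rcfo}$ candidate phases for one sample and takes a majority vote over the resulting decisions. It is a simplified software model under stated assumptions, not the implemented circuit: all function names are hypothetical, `phase_step` stands for the per-sample offset term written as $2\hat{v}$ in (15), the per-sample decision reuses the half-plane tests of rule (14), and a tie in the vote is resolved to 0.

```python
import numpy as np

def candidate_phases(phi_z, phi_x_hat, k, phase_step, n_rcfo):
    """Equation (15): one phase estimate per preceding sample i = 1..n_rcfo."""
    i = np.arange(1, n_rcfo + 1)
    return -phi_z[k - i] + phi_x_hat[k - i] + phi_z[k] - phase_step * i

def decide_bit(phi, k):
    """Half-plane decision of rule (14): even k uses cos, odd k uses sin."""
    return int(np.cos(phi) > 0) if k % 2 == 0 else int(np.sin(phi) > 0)

def majority_bit(bits):
    """Equation (16): mode of the collected bits (ties resolved to 0)."""
    bits = np.asarray(bits)
    return int(2 * bits.sum() > bits.size)

def detect_sample(phi_z, phi_x_hat, k, phase_step, n_rcfo):
    """Detect bit k from the phases of the n_rcfo preceding samples (k >= n_rcfo)."""
    phis = candidate_phases(phi_z, phi_x_hat, k, phase_step, n_rcfo)
    return majority_bit([decide_bit(p, k) for p in phis])
```

In a noiseless channel with a constant phase step, every candidate in `candidate_phases` collapses to the transmitted phase of sample $k$, so the vote is unanimous; with noise, the vote averages out occasional wrong candidates.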

The bits are then stored in a 32-bit shift register, which performs de-spreading on every 32-bit cycle by matching the bits against sixteen 32-bit reference chip sequences. From the *DSSS Look Up Table*, the closest chip sequence is written to *register bank 0* and is used for estimation of the upcoming samples. This gives a better estimate than using only the currently estimated bits. The chip sequence is also converted to the corresponding 4-bit symbol, which is fed to the upper layers for further processing.
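The de-spreading step described above amounts to a nearest-sequence search, which can be sketched as follows. This is a minimal illustration assuming a minimum Hamming distance criterion for "closest"; the function names are hypothetical, and `placeholder_chip_table` builds a Walsh-type stand-in table purely so that the example is self-contained. It is not the chip table defined in IEEE 802.15.4, which would be loaded in its place.

```python
import numpy as np

def placeholder_chip_table():
    """Sixteen 32-chip rows used only for illustration (NOT the standard's table).

    Row s, chip j is the parity of the bitwise AND of s and j, so any two
    distinct rows differ in exactly 16 positions.
    """
    return np.array([[bin(s & j).count("1") % 2 for j in range(32)]
                     for s in range(16)])

def despread(chips, chip_table):
    """Return (symbol index, reference chips) of the closest table row."""
    chips = np.asarray(chips)
    # Hamming distance between the received chips and every reference row
    distances = np.count_nonzero(chip_table != chips, axis=1)
    symbol = int(np.argmin(distances))
    return symbol, chip_table[symbol]

table = placeholder_chip_table()
noisy = table[5].copy()
noisy[[0, 7, 20]] ^= 1          # three chip errors
symbol, reference = despread(noisy, table)
```

The returned `reference` row plays the role of the sequence written back to *register bank 0*, while `symbol` is the 4-bit value passed to the upper layers.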

As is apparent from Fig. 5, for the detection of one symbol the proposed implementation uses  $3N_{rcfo}$  adders for the computation of equation (15) and  $N_{rcfo}$  adders for the computation of *register bank 1*. As will be discussed in the subsequent sections, coarse frequency offset estimation need not be performed for every packet. Therefore, *register bank 1* remains the same for at least a single burst of packets, and hence the total number of adders is approximately  $3N_{rcfo}$. The comparison of the proposed method with the method proposed in [9] is given in Table I in terms of the number of operations.

Using the proposed algorithm, the constellation plot shows distinct decision regions even for large frames, as shown in Fig. 6. The error rate performance, power consumption and hardware utilization of the proposed method are discussed in detail in the next section.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2018.2884654, IEEE Internet of Things Journal

Fig. 5: Circuit architecture of proposed residual phase noise compensation

TABLE I: Comparison of hardware complexity of proposed method in terms of number of additions

| Parameters (lag) | [9] | Parameters $(N_{rcfo})$ | Proposed method |
|------------------|-----|-------------------------|-----------------|
| d = 1            | 60  | 4                       | 12              |
| d = 2            | 58  | 8                       | 24              |
| d = 3            | 56  | 16                      | 48              |

Fig. 6: Constellation plot of received symbols with the proposed residual phase noise compensation for a packet length of 100 bytes

## D. Differential Phase Detection of MSK Signal

The baseband representation of the MSK modulated signal can be written as

$$z(t) = e^{j(\phi(t-\tau) + 2\pi f_d t + \theta)} + n(t) \tag{17}$$

where  $\tau$,  $f_d$  and  $\theta$  represent the timing error, frequency offset and phase offset, respectively, and  $z(t)$  and  $n(t)$  represent the baseband received signal and the noise at time instant  $t$.  $\phi(t)$  is the phase of the transmitted modulated signal, given by

$$\phi(t) = 2\pi h \sum_{k} b_k q(t - kT_s) \tag{18}$$

where  $b_k$  is the  $k^{th}$  bit in the transmitted bit-stream,  $h = 0.5$  is the modulation index,  $T_s$  is the symbol duration and  $q(t)$  is given by

$$q(t) = \begin{cases} 0, & t < 0\\ t/(2T_s), & 0 \le t \le T_s\\ 1/2, & t > T_s \end{cases} \tag{19}$$

To detect the MSK signal, timing recovery is first performed as in the OQPSK chain. Then, differential detection [32] is used to nullify the effects of the frequency and phase offsets. The discrete representation of the received MSK signal of equation (17) can be expressed as

$$z_k(i) = e^{j(\phi_k(i-\tau) + 2\pi f_d(kT_s + i\frac{T_s}{N}) + \theta)} + n_k(i) \tag{20}$$

where *i* denotes the  $i^{th}$  sample of the  $k^{th}$  symbol when  $N$  samples are used per symbol. After timing recovery, only one sample per symbol is taken for the decision. Hence, the received signal after timing recovery is

$$z_k = e^{j(\phi_k + 2\pi f_d k T_s + \theta)} + n_k \tag{21}$$

where  $\phi_k$  is the discrete sample of the phase of the symbol modulated according to equation (18). In differential detection, the difference between the phases of two adjacent samples is calculated and the sine of that difference is used to detect the symbol. Hence, the decision is made on  $\sin(\angle z_k - \angle z_{k-1}) = \sin(\phi_k - \phi_{k-1} + 2\pi f_d T_s)$. The  $f_d$  term remains, but the product  $f_d T_s$  has a maximum value of 0.01 (for a frequency offset of 100 ppm [8]). Therefore, since  $\cos(2\pi/100) \approx 0.998$  and  $\sin(2\pi/100) \approx 0.063$, the signal for the decision reduces to

$$\begin{aligned} \sin(\phi_{k} - \phi_{k-1} + 2\pi f_{d}T_{s}) &= \sin(\phi_{k} - \phi_{k-1} + 2\pi/100) \\ &= \sin(\phi_{k} - \phi_{k-1})\cos(2\pi/100) + \cos(\phi_{k} - \phi_{k-1})\sin(2\pi/100) \\ &\approx \sin(\phi_{k} - \phi_{k-1}) \end{aligned}$$

2327-4662 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.



Fig. 7: Circuit architecture of differential phase detection of MSK

The bit stream  $b_L$  is then detected using the following rule:

$$b_L = \begin{cases} 0; & \sin(\phi_k - \phi_{k-1}) > 0\\ 1; & \sin(\phi_k - \phi_{k-1}) < 0 \end{cases} \tag{22}$$

The circuit implementation of MSK differential phase detection is shown in Fig. 7. The phase of the samples is evaluated by the *Shared CORDIC Vector* block and the sinusoidal function of the differential phase is evaluated by the *Shared CORDIC Rotation* block. The *Detection* block then computes  $b_L$  using equation 22.
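The differential detection rule can be checked numerically with the short Python sketch below. It is an illustration under assumptions rather than a model of the CORDIC-based circuit: the function name is hypothetical, and instead of computing the two angles explicitly it uses the imaginary part of $z_k z_{k-1}^{*}$, which has the same sign as $\sin(\angle z_k - \angle z_{k-1})$.

```python
import numpy as np

def msk_differential_detect(z):
    """Detect bits from one-sample-per-symbol MSK samples using rule (22).

    Im(z_k * conj(z_{k-1})) = |z_k||z_{k-1}| sin(angle(z_k) - angle(z_{k-1})),
    so its sign equals the sign of the sine of the differential phase.
    """
    z = np.asarray(z, dtype=complex)
    metric = np.imag(z[1:] * np.conj(z[:-1]))
    # Rule (22): bit 0 when the sine is positive, bit 1 when it is negative
    return (metric < 0).astype(int)

# Noiseless example with f_d * T_s = 0.01 and an arbitrary phase offset
bits = np.array([0, 1, 1, 0, 0, 1, 0, 1])
steps = np.where(bits == 0, np.pi / 2, -np.pi / 2)   # +/- pi/2 per symbol
k = np.arange(bits.size + 1)
phase = np.concatenate(([0.0], np.cumsum(steps))) + 2 * np.pi * 0.01 * k + 0.3
detected = msk_differential_detect(np.exp(1j * phase))
```

In this example the constant phase offset cancels in the difference and the frequency offset only adds $2\pi/100$ to each phase step, which is too small to change the sign of the sine.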

## VI. PERFORMANCE ANALYSIS AND RESULTS

In this section, the performance of the dual-mode receiver is analyzed in terms of bit error rate (BER), packet error rate (PER), power consumption and hardware utilization. For BER and PER, we have implemented the system in MATLAB with a packet size of 80 bytes. To model the channel, we have introduced a frequency offset of 80 ppm and added AWGN with the signal-to-noise ratio varying from -10 dB to 10 dB. A system frequency of 16 MHz is used for the hardware implementation of the proposed baseband receiver. The proposed receiver is implemented on FPGA and discussed at the end of this section along with its ASIC implementation.

## A. Error Performance Analysis

To analyze the performance of the residual frequency offset compensation, the BER of the OQPSK receiver chain is plotted for different values of  $N_{rcfo}$; the result is shown in Fig. 8a. From Fig. 8a, it can be inferred that for  $N_{rcfo}$  values of 8 and 16, the OQPSK chain considerably outperforms MSK demodulation. We have compared the BER result with the work presented in [9], in which a simple design of detector (SDD) has been proposed for detecting the signal in the presence of frequency offset. The comparison is shown in Fig. 8b, which plots the result of the proposed method with  $N_{rcfo} = 2, 4$  against SDD. It can be inferred from Fig. 8b that the proposed method outperforms the previous method with a 3 dB gain for  $N_{rcfo} = 4$. For  $N_{rcfo} = 16$, the BER performance improves further.

Fig. 8c depicts the packet error rate of the two demodulator chains. The PER result has been obtained by simulating 40,000 packets of 80 bytes each and is compared with the PER result of the system proposed in [33]. The PER of the proposed system is better than that of the previous method.

## B. Power Consumption Analysis

The power consumption of the two chains of the proposed dual-mode receiver is analyzed considering the packet error rate and the retransmission scenario. Let  $p_1, p_2$  be the probabilities of packet error for the MSK and OQPSK chains, respectively, and let the power consumption of the baseband processor for the two chains be  $P_{D1}, P_{D2}$, respectively. Let the power consumption of the RF section be  $P_A$; since we do not change the analog section, it is the same for both chains. Let  $r_1, r_2$  be the estimated transmissions per packet (ETP) needed for a successful transmission, with  $1 \le r_1, r_2 < \infty$, as there will be at least one transmission per packet.  $r_1$  and  $r_2$  can be calculated as

$$r_1 = \sum_{i=1}^{\infty} i P_1(i) \quad \text{and} \quad r_2 = \sum_{i=1}^{\infty} i P_2(i) \tag{23}$$

where  $P_1(i)$  and  $P_2(i)$  are the probabilities that a packet is transmitted successfully on the  $i^{th}$  attempt after failing  $i-1$  times, i.e.  $P_j(i) = p_j^{i-1}(1-p_j)$. Substituting  $P_1(i)$  and  $P_2(i)$, the above equations simplify as follows:

$$r_1 = \sum_{i=1}^{\infty} i p_1^{i-1} (1-p_1) = \frac{1}{(1-p_1)} \tag{24}$$

$$r_2 = \sum_{i=1}^{\infty} i p_2^{i-1} (1-p_2) = \frac{1}{(1-p_2)} \tag{25}$$
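The closed form in equations (24) and (25) is the mean of a geometric distribution and can be checked numerically. The Python sketch below compares a truncated version of the series with $1/(1-p)$; the function names and the truncation length are illustrative assumptions, and a packet error probability of 1 is mapped to an infinite ETP.

```python
def etp_closed_form(p):
    """Estimated transmissions per packet, 1 / (1 - p), for 0 <= p <= 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return float("inf") if p == 1.0 else 1.0 / (1.0 - p)

def etp_series(p, terms=2000):
    """Truncated series sum_{i>=1} i * p^(i-1) * (1 - p)."""
    return sum(i * p ** (i - 1) * (1.0 - p) for i in range(1, terms + 1))

# PER of 0.05 (5 % of packets lost) gives about 1.0526 transmissions per packet
r = etp_closed_form(0.05)
```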

Here we have considered a retransmission scenario in which a packet is retransmitted until it is received successfully. Let the total estimated power consumption per packet be  $P_1$  and  $P_2$  for the MSK and OQPSK chains, respectively. These powers are given by

$$P_1 = r_1(P_{D1} + P_A) \quad \text{and} \quad P_2 = r_2(P_{D2} + P_A) \tag{26}$$

With the MSK chain, the packet error rate is high but the power consumption per transmission is low. With the OQPSK chain, the average number of transmissions per packet is smaller due to the lower packet error rate, but each transmission consumes more power than with the MSK chain. So, considering the retransmission scenario, the estimated powers per packet,  $P_1$  and  $P_2$, should be compared. For the estimated power consumption of the MSK chain to be less than that of the OQPSK chain, i.e.  $P_1 < P_2$, the following inequality must be satisfied:

$$r_1(P_{D1} + P_A) < r_2(P_{D2} + P_A) \tag{27}$$

The PER (from Fig. 8c) and the corresponding values of  $r_1$  and  $r_2$  are given in Table II. From the hardware implementation, we can conclude that the OQPSK chain uses approximately twice the resources of the MSK chain. Hence, we can assume  $P_{D2} \approx 2P_{D1}$  to simplify equation (27). With this assumption, we get

$$P_{D1}(r_1 - 2r_2) + P_A(r_1 - r_2) < 0 \quad \text{or} \quad P_{D1}(r_1 - 2r_2) < P_A(r_2 - r_1) \tag{28}$$

(a) BER of the proposed residual phase noise compensator for different values of  $N_{rcfo}$ 

(b) BER comparison of the proposed residual phase noise compensator with SDD

(c) PER comparison of the receiver chains: MSK and OQPSK

Fig. 8: BER and PER results of the proposed dual-mode receiver

TABLE II: Estimated transmission per packet (ETP) with different values of SNR

| SNR (dB) | MSK PER $(p_1)$ | MSK ETP $(r_1)$ | OQPSK PER $(p_2)$ | OQPSK ETP $(r_2)$ | $r_2 \le r_1 \le 2r_2$ | $\frac{P_{D1}}{P_A}$ | $P_1 < P_2$ | Optimal chain |
|----------|--------|---------|--------|----------|-------|---------|-----------------------------------------|----------------------------------------------------------|
| -4       | 1      | inf     | 1      | inf      | False | -       | False always                            | OQPSK                                                    |
| -3       | 1      | inf     | 0.9973 | 371.8571 | False | -       | False always                            | OQPSK                                                    |
| -2       | 1      | inf     | 0.8808 | 8.386    | False | -       | False always                            | OQPSK                                                    |
| -1       | 1      | inf     | 0.4274 | 1.7463   | False | -       | False always                            | OQPSK                                                    |
| 0        | 0.985  | 66.6667 | 0.1033 | 1.1152   | False | -1.0173 | False always                            | OQPSK                                                    |
| 1        | 0.7241 | 3.6245  | 0.0150 | 1.0152   | False | -1.6369 | False always                            | OQPSK                                                    |
| 2        | 0.2316 | 1.3014  | 0.0015 | 1.0015   | True  | 0.4276  | True if $\frac{P_{D1}}{P_A} > 0.4276$   | MSK if $\frac{P_{D1}}{P_A} > 0.4276$; OQPSK otherwise    |
| 3        | 0.0500 | 1.0526  | 0.0001 | 1.0001   | True  | 0.0554  | True if $\frac{P_{D1}}{P_A} > 0.0554$   | MSK if $\frac{P_{D1}}{P_A} > 0.0554$; OQPSK otherwise    |
| 4        | 0.0043 | 1.0043  | 0      | 1        | True  | 0.0044  | True if $\frac{P_{D1}}{P_A} > 0.0044$   | MSK if $\frac{P_{D1}}{P_A} > 0.0044$; OQPSK otherwise    |
| 5        | 0.0003 | 1.0003  | 0      | 1        | True  | 0.0003  | True if $\frac{P_{D1}}{P_A} > 0.0003$   | MSK if $\frac{P_{D1}}{P_A} > 0.0003$; OQPSK otherwise    |
| 6        | 0      | 1       | 0      | 1        | True  | 0       | True always                             | MSK                                                      |
| $\geq 7$ | 0      | 1       | 0      | 1        | True  | 0       | True always                             | MSK                                                      |

For high SNR values,  $r_1, r_2 \approx 1$  and equation (28) holds. It can therefore be inferred that the MSK chain is better in terms of the overall power consumed in transmitting a packet successfully when channel conditions are good. From Table II, it can be observed that the estimated transmission per packet for the MSK chain is always greater than or equal to that of the OQPSK chain, because the PER performance of the OQPSK chain is always better than that of the MSK chain (Fig. 8c). So,  $r_1 \ge r_2$  in equation (28), and its right-hand side is non-positive. Hence, the necessary condition for equation (28) to hold is  $(r_1 - 2r_2) \le 0$. Equation (28) can hold only if  $r_2 \le r_1 \le 2r_2$, i.e. the MSK chain can perform better than the OQPSK chain in terms of power consumption only if  $r_2 \le r_1 \le 2r_2$. For this range of  $r_1$, we can rewrite the equation as below:

$$\frac{P_{D1}}{P_A} > \frac{r_2 - r_1}{(r_1 - 2r_2)} \tag{29}$$

The value of  $\frac{P_{D1}}{P_A}$  (i.e. the ratio of digital baseband power to analog power) and ETP corresponding to different SNRs is given in Table II.
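The chain-selection condition of equations (26) to (29) can be reproduced with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, powers are normalised by $P_A$, and the assumption $P_{D2} \approx 2P_{D1}$ from the text is built in. `msk_is_cheaper` compares the two sides of (27) directly, which avoids having to track the sign of $(r_1 - 2r_2)$ when dividing.

```python
import math

def etp(p):
    """Estimated transmissions per packet, 1 / (1 - p); infinite when p = 1."""
    return math.inf if p >= 1.0 else 1.0 / (1.0 - p)

def ratio_threshold(p1, p2):
    """Right-hand side of (29), or None when it is undefined."""
    r1, r2 = etp(p1), etp(p2)
    if math.isinf(r1) or math.isinf(r2) or r1 == 2.0 * r2:
        return None
    return (r2 - r1) / (r1 - 2.0 * r2)

def msk_is_cheaper(p1, p2, pd1_over_pa):
    """True when P_1 < P_2 in (27), with P_D2 = 2 * P_D1 and powers divided by P_A."""
    r1, r2 = etp(p1), etp(p2)
    if math.isinf(r1):
        return False          # MSK packets never get through
    if math.isinf(r2):
        return True           # only the MSK chain ever succeeds
    return r1 * (pd1_over_pa + 1.0) < r2 * (2.0 * pd1_over_pa + 1.0)

# PER values at 3 dB SNR: p1 = 0.05 (MSK), p2 = 0.0001 (OQPSK)
threshold_3db = ratio_threshold(0.05, 0.0001)    # about 0.0554
choose_msk = msk_is_cheaper(0.05, 0.0001, 0.1)   # ratio 0.1 exceeds the threshold
```

With the PER values listed for 0 dB, the same function returns a negative value (about -1.0173) and `msk_is_cheaper` is false for any positive power ratio, which is the "False always" case.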

We can conclude that for noisy channel conditions (i.e. very low SNR) the OQPSK chain gives better performance, as the computed ratio  $\frac{P_{D1}}{P_A}$  in Table II is negative at very low SNR. On the other hand, at high SNR the MSK chain outperforms the OQPSK chain, as the ratio  $\frac{P_{D1}}{P_A}$  is positive. Moreover, the condition on the ratio  $\frac{P_{D1}}{P_A}$  is that it should be greater than a positive value which is less than one (from Table II). This condition is easily achievable from a design point of view, as the baseband power is always less than its RF counterpart in an IEEE 802.15.4 transceiver [34]. As the proposed transceiver switches between the two chains according to the channel conditions, it will choose the chain for which the product of power and ETP is smaller, resulting in optimized power consumption once a proper threshold is set.

## C. Latency Analysis

The latency of the two chains is analyzed in terms of clock cycles and discussed below.

1) Coarse Frequency and Phase Offset Estimation: The latency-determining part of the frequency offset estimation block is a radix-2 FFT that takes approximately  $\log_2 N$  stages, each having  $N$  additions,  $N/2$  multiplications, and  $2N$  memory read and  $2N$  memory write operations. Therefore, for  $N = 1024$, where  $N$  is the number of FFT points, the number of clock cycles is  $\log_2 N \cdot (N + N/2 + 2N + 2N) = 10 \cdot (1024 + 512 + 2048 + 2048) = 56320$.


2) Initial Offset Compensation and Preamble Detection: This block employs the pipelined CORDIC algorithm, which has an initial delay of 16 clock cycles. For the preamble to be detected, at least 128 samples need to be compensated. Each sample takes 2 cycles for the memory read and the sum. Thus, the number of cycles required for compensation is 16 + 128 \* 2 = 272. The detection of 128 samples to get the bit stream, the correlation with the reference preamble and the comparison with the threshold require 128 + 1 + 1 = 130 clock cycles. Thus, the total number of cycles required for this block is 272 + 130 = 402.

3) Proposed Residual Phase Noise Compensation: For a set of 32 samples, i.e. one symbol, the proposed algorithm encounters the following delays in terms of clock cycles: 16 cycles of initial CORDIC delay, 64 cycles in the register banks, 32 cycles in sum and detection, and 2 cycles in de-spreading and comparison. In this way, it adds a latency of 114 clock cycles to every symbol. Considering the clock frequency of 16 MHz, this latency is quite reasonable.

4) Differential Phase Detection: This block has an initial delay of 32 clock cycles from the two CORDIC modules and requires 8 more clock cycles for the sample delay. After that, one cycle is required for the detection of each bit. In this way, for a packet size of 100 bytes the total delay of this block is 40 + (200\*32) = 6440 cycles.

The latency in terms of the number of cycles caused by all the blocks is summarized in Table III. As the system uses a clock frequency of 16 MHz, resulting in a clock period  $T_c = 0.0625 \mu s$, the total latency of both chains is calculated and shown in the same table. It can be observed from Table III that the MSK chain has considerably lower latency, 0.4 ms, amounting to only 28% of that of the OQPSK chain (1.45 ms), and is thus highly suitable for applications that require low latency but are error tolerant. Considering a typical 100 byte frame size, the frame duration is 3.2 ms according to the IEEE 802.15.4 standard. In this way, switching from the OQPSK chain to the MSK chain makes the transmission around 30% faster if the communication channel is good. However, if the communication environment is rapidly changing, as in vehicular networks, ocean sensor mesh networks etc., the proposed method may not give optimal performance, because the proposed receiver needs to switch its modes rapidly, resulting in more latency and power consumption.

TABLE III: Latency analysis of the proposed system for a packet of length 100 bytes

| Algorithm                                            | OQPSK Chain                          | MSK Chain                            |
|------------------------------------------------------|--------------------------------------|--------------------------------------|
| Coarse Frequency and Phase Offset Estimation\*       | 56320                                | -                                    |
| Initial Offset Compensation and Preamble Detection   | 402                                  | -                                    |
| Proposed Residual Phase Noise Compensation           | 200 × 114                            | -                                    |
| Differential Phase Detection and Preamble Detection  | -                                    | 40 + (200 × 32)                      |
| Total number of cycles                               | 23202                                | 6440                                 |
| Total Latency                                        | $23202 \times T_c = 1.45 \text{ ms}$ | $6440 \times T_c = 0.402 \text{ ms}$ |

\*Frequency offset estimation is not included in the total number of cycles due to the fact that it need not be performed for every packet. Once the offset is estimated, the value can be used to compensate for other packets unless there is a major change in hardware or environment.
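The cycle counts and latencies of Table III follow from simple arithmetic, which the Python sketch below reproduces for a 100-byte packet (200 symbols). The function names are illustrative; the per-block cycle counts are taken from the latency analysis in this section, and, as in the table, the coarse frequency offset estimation is excluded from the totals.

```python
import math

CLOCK_HZ = 16e6  # system clock used for the hardware implementation

def fft_cycles(n):
    """log2(N) stages, each with N adds, N/2 multiplies, 2N reads and 2N writes."""
    return int(math.log2(n)) * (n + n // 2 + 2 * n + 2 * n)

def preamble_cycles():
    """CORDIC delay + 128 compensated samples, then detection, correlation, compare."""
    return (16 + 128 * 2) + (128 + 1 + 1)

def residual_cycles_per_symbol():
    """CORDIC delay + register banks + sum/detection + de-spreading/compare."""
    return 16 + 64 + 32 + 2

def oqpsk_cycles(n_symbols):
    return preamble_cycles() + n_symbols * residual_cycles_per_symbol()

def msk_cycles(n_symbols):
    """40 cycles of initial delay, then one cycle per chip (32 chips per symbol)."""
    return 40 + n_symbols * 32

def latency_ms(cycles):
    return cycles / CLOCK_HZ * 1e3

oqpsk = oqpsk_cycles(200)   # 23202 cycles
msk = msk_cycles(200)       # 6440 cycles
ratio = msk / oqpsk         # roughly 0.28
```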

TABLE IV: Summary of Power Consumption and Hardware Utilization of the Proposed Receiver

| Receiver Mode | Slices | FFs  | LUTs  | Static Power (mW) | Dynamic Power (mW) | Total Power (mW) |
|---------------|--------|------|-------|-------------------|--------------------|------------------|
| MSK           | 1520   | 2783 | 5141  | 127.77            | 223.71             | 351.48           |
| OQPSK         | 5468   | 5803 | 17644 | 165.77            | 487.84             | 653.61           |
| Dual-Mode     | 5836   | 6514 | 18181 | 129.82            | 533.03             | 662.85           |

TABLE V: ASIC results of the proposed dual-mode receiver

| Result       | OQPSK                           | Dual-mode                       |
|--------------|---------------------------------|---------------------------------|
| Gate Count   | 229202                          | 288087                          |
| Power $(mW)$ | 8.89                            | 10.27                           |
| Core Area    | $1.33 \text{ x } 1.33 \ (mm^2)$ | $1.49 \text{ x } 1.48 \ (mm^2)$ |
| Chip Area    | $1.67 \text{ x } 1.67 \ (mm^2)$ | $1.87 \text{ x } 1.86 \ (mm^2)$ |

## D. Hardware Implementation

We have developed the Verilog RTL for the proposed dual-mode receiver. The complete design is simulated, synthesized, and validated using the Xilinx Kintex-7 FPGA KC705. The summary of hardware utilization and power consumption is given in Table IV. As the MSK receiver chain is mostly built from shared hardware blocks derived from the OQPSK chain, it adds negligible power and hardware overhead on top of the OQPSK chain. It can be inferred from the power consumption results in Table IV that when the MSK chain operates alone, it consumes a dynamic power of 223.71 mW, much less than the 487.84 mW of the OQPSK chain operating alone. The power consumption by shared architecture is only 45% of the power consumed by the individual chains. Thus, switching from the OQPSK chain to the MSK chain in favorable channel conditions results in a significant saving in power consumption.

The proposed dual-mode receiver has also been implemented as an application-specific integrated circuit (ASIC) using the UMC 0.18  $\mu m$  CMOS technology with a global operating voltage of 1.8 V. The Faraday Technology SRAM compiler has been used for the internal memory architecture of the proposed dual-mode receiver, and the ASIC implementation results are shown in Table V. The proposed dual-mode baseband receiver consumes a power of 10.27 mW with a gate count of 288K. On the other hand, the power consumption of the OQPSK receiver without the proposed phase noise compensation method is 8.89 mW. In this way, with a justified energy increment, the dual-mode receiver can cater to the diverse nature of IoT applications, because it switches to the MSK chain in a good channel environment. The clock frequency used in the implementation is 16 MHz and the proposed dual-mode receiver design is synthesized using the Synopsys Design Compiler. The power consumption and the chip area can be further reduced by performing multiple refinements in the layout and by using a low power CMOS process technology.

## VII. CONCLUSION AND FUTURE SCOPE

In this paper, we proposed a DSSS-aided residual phase noise compensation method for IEEE 802.15.4 to provide improved error rate performance for critical IoT applications. The proposed method provides better BER than existing methods in channels with a frequency offset of up to 80 ppm. We also proposed a dual-mode receiver utilizing the proposed method, which works for critical as well as sustainable IoT applications. The proposed receiver can be configured to run for sustainable or critical applications in the manual mode. It can also adjust itself according to the channel condition to optimize power consumption and error performance when in automatic mode. We compared the performance of the proposed receiver in both modes in terms of BER, PER, power consumption and hardware utilization. The simulation results show that the proposed receiver significantly reduces the overall power consumed to transmit a packet successfully, considering the packet error rate and the retransmission scenario.

The future scope of the work includes the integration of the proposed digital baseband receiver with a low power RF front end compliant with IEEE 802.15.4, in order to obtain a fully integrated dual-mode receiver. It further includes the analysis of the fully integrated receiver in terms of parameters such as sensitivity, energy efficiency, channel rejection and noise figure.

#### ACKNOWLEDGMENT

The authors would like to thank Tata Consultancy Services (TCS) for partially funding this work through TCS Research Scholar Program.

#### REFERENCES

- [1] L. D. Xu, W. He and S. Li, "Internet of Things in Industries: A Survey," in *IEEE Transactions on Industrial Informatics*, vol. 10, no. 4, pp. 2233-2243, Nov. 2014.
- [2] Cellular networks for massive IoT, Ericsson white paper, January 2016, https://www.ericsson.com/res/docs/whitepapers/wpiot.pdf
- [3] E. Nilsson and C. Svensson, "Power Consumption of Integrated Low-Power Receivers," in *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 4, no. 3, pp. 273-283, Sept. 2014
- [4] T. Kim, I. H. Kim, Y. Sun and Z. Jin, "Physical Layer and Medium Access Control Design in Energy Efficient Sensor Networks: An Overview," in *IEEE Trans. on Industrial Informatics*, vol. 11, no. 1, pp. 2-15, Feb. 2015.
- [5] IEEE Standard for Local and metropolitan area networks–Part 15.4: Low-Rate Wireless Personal Area Networks (LR-WPANs), IEEE 802.15.4-2011.
- [6] U. Noreen, A. Bounceur, L. Clavier and R. Kacimi, "Performance evaluation of IEEE 802.15.4 PHY with impulsive network interference in cupcarbon simulator," 2016 Int. Symp. on Networks, Computers and Comm. (ISNCC), Yasmine Hammamet, 2016, pp. 1-6.
- [7] V. C. Gungor and G. P. Hancke, "Industrial Wireless Sensor Networks: Challenges, Design Principles, and Technical Approaches," in *IEEE Trans. on Industrial Electronics*, vol. 56, no. 10, pp. 4258-4265, Oct. 2009
- [8] Shengchen Dai, Hua Qian, Kai Kang, Weidong Xiang, "A Robust Demodulator for OQPSK-DSSS System", *Circuits System and Signal Processing* (2015), vol. 34, no. 1, pp. 231-247.
- [9] D. Park, C. S. Park and K. Lee, "Simple Design of Detector in the Presence of Frequency Offset for IEEE 802.15.4 LR-WPANs," in *IEEE Trans. on Circuits and Systems II: Express Briefs*, vol. 56, no. 4, pp. 330-334, April 2009.
- [10] A. K. Nain and P. Rajalakshmi, "A reliable covert channel over IEEE 802.15.4 using steganography," 2016 IEEE 3rd World Forum on Internet of Things (WF-IoT), Reston, VA, 2016, pp. 711-716.
- [11] Y. Liu, A. Ba, J. H. C. van den Heuvel, K. Philips, G. Dolmans and H. de Groot, "A 1.2 nJ/bit 2.4 GHz Receiver With a Sliding-IF Phase-to-Digital Converter for Wireless Personal/Body Area Networks," in IEEE Journal of Solid-State Circuits, vol. 49, no. 12, pp. 3005-3017, Dec. 2014.
- [12] D. Liu, X. Liu, W. Rhee and Z. Wang, "A 19.2mW 1Gb/s secure proximity transceiver with ISI pre-correction and hysteresis energy detection," 2016 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), San Francisco, CA, 2016, pp. 75-78
- [13] D. Liu, X. Ni, R. Zhou, W. Rhee and Z. Wang, "A 0.42-mW 1-Mb/s 3- to 4-GHz Transceiver in 0.18- μm CMOS With Flexible Efficiency, Bandwidth, and Distance Control for IoT Applications," in IEEE Journal of Solid-State Circuits, vol. 52, no. 6, pp. 1479-1494, June 2017.

- [14] J. Gil et al., "A Fully Integrated Low-Power High-Coexistence 2.4-GHz ZigBee Transceiver for Biomedical and Healthcare Applications," in IEEE Transactions on Microwave Theory and Techniques, vol. 62, no. 9, pp. 1879-1889, Sept. 2014
- [15] B. Xia and N. Qi, "Low-power 2.4GHz ZigBee transceiver with inductor-less radio-frequency front-end for Internet of things applications," in IET Circuits, Devices & Systems, vol. 12, no. 2, pp. 209-214, 3 2018
- [16] L. Bettini, T. Christen, T. Burger and Q. Huang, "A Reconfigurable DT  $\Delta\Sigma$  Modulator for Multi-Standard 2G/3G/4G Wireless Receivers," in *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 5, no. 4, pp. 525-536, Dec. 2015
- [17] P. Wu, C. Zhang, C. Wei, H. Jiang and Z. Wang, "A baseband transceiver for multi-mode and multi-band SoC," 2012 IEEE 55th International Midwest Symposium on Circuits and Systems (MWSCAS), Boise, ID, 2012, pp. 770-773.
- [18] A. Espinoza-Rhoton et al., "An FPGA-based all-digital 802.11b & 802.15.4 receiver for the Software Defined Radio paradigm," in *International Conference on ReConFigurable Computing and FPGAs (ReCon-Fig14)*, Cancun, 2014, pp. 1-6.
- [19] A. Di Stefano, G. Fiscelli and C. G. Giaconia, "An FPGA-Based Software Defined Radio Platform for the 2.4GHz ISM Band," Ph.D. Research in Microelectronics and Electronics, Otranto, 2006, pp. 73-76
- [20] C. Y. Liu; M. S. Sie; E. W. J. Leong; Y. C. Yao; C. W. Jen; W. C. Liu; C. F. Wu; S. J. Jou, "Dual-Mode All-Digital Baseband Receiver With a Feed-Forward and Shared-Memory Architecture for Dual-Standard Over 60 GHz NLOS Channel," in *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol.PP, no.99, pp.1-11
- [21] N. Dehaese, S. Bourdel, H. Barthelemy and G. Bas, "Simple demodulator for 802.15.4 low-cost receivers," in *IEEE Radio and Wireless Symposium*, 2006, pp. 315-318.
- [22] Shouyi Yin, Jianwei Cui, Ao Luo, Leibo Liu and Shaojun Wei, "A high efficient baseband transceiver for IEEE 802.15.4 LR-WPAN systems," in ASIC (ASICON), 2011 IEEE 9th International Conference on, Xiamen, 2011, pp. 224-227.
- [23] M. A. Zubair, A. K. Nain, J. Bandaru, P. Rajalakshmi and U. B. Desai, "Reconfigurable Dual Mode IEEE 802.15.4 Digital Baseband Receiver for Diverse IoT Applications," accepted for publication in *Internet of Things (WF-IoT), IEEE World Forum on*, Reston, Dec. 2016.
- [24] Bo-Seok Seo, Su-Chang Kim and Jinwoo Park, "Fast coarse frequency offset estimation for OFDM systems by using differentially modulated subcarriers," in IEEE Transactions on Consumer Electronics, vol. 48, no. 4, pp. 1075-1081, Nov 2002.
- [25] T. M. Schmidl and D. C. Cox, "Robust Frequency and Timing Synchronization for OFDM," IEEE Trans. Commun., vol. COM-45, no. 12, pp. 1613-1621, Dec. 1997.
- [26] P. Ciblat and E. Serpedin, "A fine blind frequency offset estimator for OFDM/OQAM systems," in IEEE Transactions on Signal Processing, vol. 52, no. 1, pp. 291-296, Jan. 2004.
- [27] D. Rife and R. Boorstyn, "Single tone parameter estimation from discrete-time observations," in *IEEE Transactions on Information Theory*, vol. 20, no. 5, pp. 591-598, Sep 1974.
- [28] S. Pasupathy, "Minimum shift keying: A spectrally efficient modulation," in IEEE Communications Magazine, vol. 17, no. 4, pp. 14-22, July 1979.
- [29] R. Mehlan, Yong-En Chen and H. Meyr, "A fully digital feedforward MSK demodulator with joint frequency offset and symbol timing estimation for burst mode mobile radio," in *IEEE Transactions on Vehicular Technology*, vol. 42, no. 4, pp. 434-443, Nov 1993.
- [30] M. Morelli and U. Mengali, "Feedforward frequency estimation for PSK: A tutorial review," Eur. Trans. Telecomm., vol. 9, pp. 103-116, 1998.
- [31] Athanasios Papoulis and S. Unnikrishna Pillai, "Two Random Variables," in Probability, Random Variables and Stochastic Processes, 4th ed. India: Tata McGraw-Hill, 2002, ch. 6, pp. 169-242.
- [32] T. Masamura, S. Samejima, Y. Morihiro and H. Fuketa, "Differential Detection of MSK with Nonredundant Error Correction," in *IEEE Transactions on Communications*, vol. 27, no. 6, pp. 912-918, Jun 1979.
- [33] C. C. Wang, C. C. Huang, J. M. Huang, C. Y. Chang and C. P. Li, "ZigBee 868/915-MHz Modulator/Demodulator for Wireless Personal Area Network," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 16, no. 7, pp. 936-939, July 2008.
- [34] W. Kluge et al., "A Fully Integrated 2.4-GHz IEEE 802.15.4-Compliant Transceiver for ZigBee Applications," in *IEEE Journal of Solid-State Circuits*, vol. 41, no. 12, pp. 2767-2775, Dec. 2006.