

www.arpnjournals.com

# FPGA BASED FIBER OPTIC GYROSCOPE SIGNAL DENOISING USING DISCRETE WAVELET TRANSFORM

P. Rangababu, K. Shravan Kumar and Samrat L. Sabat

School of Physics, University of Hyderabad, Hyderabad, India

E-Mail: p.rangababu@gmail.com, slssp@uohyd.ernet.in

## ABSTRACT

This paper presents a field programmable gate array (FPGA) implementation of the forward/inverse discrete wavelet transform (DWT) for denoising a Fiber Optic Gyroscope (FOG) signal. An extensive study of the effect of different DWT threshold techniques on denoising the FOG signal is carried out. Three architectures, namely multiply and accumulate (MAC), Distributed Arithmetic (DA) and Systolic MAC (SMAC), are used to implement the DWT algorithm on a XILINX Virtex-5FXT-1136 FPGA development board. The resource utilization of the three architectures is compared, and the experimental results show that the DA architecture is the best of the three, with a latency of 1057 clock cycles for processing 1024 samples and a maximum operating frequency of 174 MHz. Further, the DWT algorithm reduces the drift by 5 and 100 times for the gyroscope and accelerometer signals, respectively, of the considered FOG. These results confirm that the DWT gives a good improvement when denoising noisy FOG and accelerometer signals.

Keywords: Discrete wavelet transforms, Distributed arithmetic, Fiber Optic Gyroscope, FPGA, Signal denoising.

## 1. INTRODUCTION

A gyroscope is used in navigation applications to measure the position of an object precisely. Accuracy of the measurement and of the measured parameters is important for tracking applications. Various kinds of gyroscopes are used in Inertial Navigation Systems for measuring the rotation rate, among which the Fiber Optic Gyroscope (FOG) is popular due to its sensitivity, low drift, wide dynamic range, high accuracy and reliability [1]. The rotation angle is obtained by integrating the measured rotation rate over a period of time [2]. Thus small errors in the rotation rate accumulate into a larger error over the integration time. Apart from the drift, many other internal and external noise sources also degrade the performance of the FOG.

To improve the accuracy and performance of the FOG, both manufacturers and users are keen to quantify the noise and to identify its sources. Once the noise is quantified, manufacturers can minimize the sources causing it, and users can improve the navigation solution by applying denoising algorithms. The Discrete Wavelet Transform (DWT) is a popular algorithm for denoising non-stationary signals. In the literature, the DWT has been applied to denoise the FOG signal, where it effectively eliminates the short-term errors of a sensor system [3][4][5].

Due to recent advances in SoC technology and decreasing costs, there is an increasing demand for efficient DWT architectures that can be implemented in an FPGA. Real-time implementation of the DWT algorithm for FOG denoising on an FPGA is challenging, because the execution techniques must be reorganized to meet the time and resource constraints of the existing technology. Different techniques such as convolution and lifting schemes have been proposed in the literature to optimize the processing time and resources of the DWT algorithm. However, the major disadvantages of those architectures are complexity and delay. These shortcomings can be overcome by using DA or SMAC based DWT architectures. In this paper three different DWT architectures are developed using MAC, DA and systolic MAC techniques, implemented in the FPGA using Xilinx System Generator, and validated by denoising the three-axis FOG signal in real time.

The rest of the paper is organized as follows. Section 2 presents the DWT algorithm. Section 3 describes in detail the FPGA implementation of the DWT. Simulation and FPGA results are presented and discussed in Sections 4 and 5, respectively. Finally, conclusions and future work are given in Section 6.

## 2. DISCRETE WAVELET TRANSFORM

A wavelet is a small, localized wave of a particular shape having an average value of zero. The wavelet transform decomposes a signal into scaled and translated versions of a mother wavelet [7]. It is widely used to reconstruct the signal of interest after removing the noise. The details of the algorithm are presented in [4].

Mallat proposed the pyramid algorithm for computing the DWT of a signal. It has a filter bank structure, and the scale and translation steps are discrete and dyadic in nature [7]. The algorithm has two stages: decomposition and reconstruction. In the decomposition stage the signal is analyzed in different frequency bands with different resolutions by decomposing it into a coarse approximation and detail information, as shown in Figure-1. At each level of decomposition the signal is passed through half-band low pass and high pass filters with coefficients H0 and G0, respectively. These filter coefficients are the wavelet coefficients; they differ from one wavelet to another, and their values affect the denoising result. Each filter output is then down-sampled by a factor of 2. Each level of decomposition thus produces a low frequency component (approximation) and a high frequency component (detail) of the signal.

The choice of the decomposition level also plays an important role in denoising. To select it, the mean value of the detail coefficients at each level is computed. Since white noise has zero mean, the maximum level of decomposition is the level up to which the mean value of the detail coefficients is zero. As the detail coefficients contain the high frequency components, most of the high frequency noise lies in the detail part, whereas the approximation coefficients contain the rotation rate information. The noise in the high frequency (detail) coefficients is removed by applying a suitable threshold, and the thresholded detail coefficients are then used in the reconstruction stage, which performs the reverse procedure of the analysis part.

The type of threshold technique also affects the denoising result. The efficiency of the wavelet de-noising algorithm depends on the threshold method and on the threshold selection rule [7]. There are two threshold methods: (a) hard threshold and (b) soft threshold. There are four rules for selecting the value of the threshold: Rigrsure, Sqtwolog, Heursure and Minimaxi. Different threshold techniques were tested to identify a suitable method. Comparing the four selection rules, the Minimaxi and SURE rules are found to be more conservative and convenient when small details lie near the noise range [6]. The square root log (Sqtwolog) threshold removes the noise more efficiently and requires less computation than the other threshold techniques, and is therefore more suitable for hardware implementation.
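One level of the decomposition and reconstruction stages described in this section can be sketched in software as follows. This is a minimal illustration, not the hardware design: it assumes the db2 wavelet (the wavelet used later in the paper), periodic extension at the frame boundaries (the paper does not state its boundary handling), and the function names `dwt_level` and `idwt_level` are hypothetical.

```python
import math

# db2 (Daubechies-2) low-pass coefficients, called H0 in the text.
_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
H0 = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
# High-pass coefficients G0, derived from H0 by the quadrature-mirror relation.
G0 = [((-1) ** k) * H0[len(H0) - 1 - k] for k in range(len(H0))]


def dwt_level(x):
    """One decomposition level: filter, then keep every second output."""
    n = len(x)
    if n % 2 or n < len(H0):
        raise ValueError("length must be even and at least the filter length")
    approx, detail = [], []
    for i in range(0, n, 2):  # a step of 2 performs the down-sampling by 2
        # Assumption: samples wrap around at the frame edge (periodic extension).
        approx.append(sum(H0[k] * x[(i + k) % n] for k in range(len(H0))))
        detail.append(sum(G0[k] * x[(i + k) % n] for k in range(len(G0))))
    return approx, detail


def idwt_level(approx, detail):
    """Inverse of dwt_level: up-sample by 2 and filter with the same taps."""
    n = 2 * len(approx)
    x = [0.0] * n
    for j, (a, d) in enumerate(zip(approx, detail)):
        for k in range(len(H0)):
            x[(2 * j + k) % n] += H0[k] * a + G0[k] * d
    return x
```

Calling `dwt_level` repeatedly on the returned approximation gives the multi-level pyramid; without thresholding, `idwt_level(*dwt_level(x))` returns the input up to rounding error.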

## 3. FPGA IMPLEMENTATION OF DWT ALGORITHM

In this section we describe the hardware platform for the FPGA implementation of the DWT algorithm, give a detailed description of the Distributed Arithmetic (DA) FIR filter used for convolving the signal with the filter coefficients, and finally compare the DA architecture with the MAC and SMAC architectures.



Reconstruction stage

Figure-1. Decomposition and reconstruction structure of DWT.

©2006-2012 Asian Research Publishing Network (ARPN). All rights reserved.





Figure-2. Sys-gen architecture of Decomposition block.







Figure-4. Sys-gen architecture of square root log hard threshold of DWT.




#### a) Hardware implementation

The DWT based denoising algorithm is designed in the System Generator (Sysgen) for DSP environment using three different FIR filter architectures, i.e., DA, MAC and SMAC. Sysgen is a system-level modeling tool that facilitates FPGA hardware design. It is an add-on tool for Simulink. The tool provides high-level abstractions that are automatically compiled into an FPGA. It supports bit-accurate and cycle-accurate blocks as well as simulations [8]. Bit-accurate blocks produce values in Simulink that match the corresponding values produced in hardware; cycle-accurate blocks produce simulation results that help to verify the design in real time. It also supports co-simulation, which allows the compiled portion of the design to be tested in actual hardware and can speed up simulation dramatically [8].

#### b) Distributed-arithmetic (DA) FIR filter

The efficiency of any DSP algorithm mainly depends on its hardware architecture. The selection of the architecture for system-level DSP algorithms such as the DWT plays an important role in the achieved performance. Hardware implementation of the computationally intensive DWT algorithm requires several half-band high pass and low pass filters. We have realized these filters using Distributed Arithmetic (DA), multiply and accumulate (MAC) and Systolic MAC structures. DA is an efficient method for computing the inner product operation, which constitutes the core of the discrete wavelet transform. The DA architecture is realized with look-up tables (LUTs). Since an FPGA is essentially a bank of LUTs, it is well suited to this algorithm. The implementation of the DWT de-noising algorithm consists of the Sysgen for DSP compiler, up/down-converters, blocks such as FIR interpolators, summers and reinterpret blocks for the wavelet decomposition and reconstruction modules, and a Black Box module for thresholding the wavelet coefficients.
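The way DA replaces multipliers with a look-up table can be illustrated with a small software model of the inner product. The sketch below assumes integer (fixed-point) coefficients and two's-complement input samples processed one bit per step; the names `build_da_lut` and `da_inner_product` are hypothetical, and the internal organization of the FIR compiler block used in the paper may differ.

```python
def build_da_lut(coeffs):
    """Precompute every possible sum of coefficients (2**taps entries)."""
    taps = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << taps)
    ]


def da_inner_product(lut, samples, bits):
    """Bit-serial DA: one LUT read and one shift-add per input bit."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if any(not lo <= s <= hi for s in samples):
        raise ValueError("sample does not fit in the given word length")
    acc = 0
    for b in range(bits):
        # Bit b of every sample forms the LUT address.
        addr = 0
        for k, s in enumerate(samples):
            addr |= ((s >> b) & 1) << k
        term = lut[addr] * (1 << b)
        # The sign bit of a two's-complement word carries negative weight.
        acc += -term if b == bits - 1 else term
    return acc
```

For example, with coefficients `[3, -1, 4, 2]` and 4-bit samples `[5, -3, 7, -8]`, `da_inner_product(build_da_lut([3, -1, 4, 2]), [5, -3, 7, -8], 4)` equals the directly computed inner product, 30, without any multiplication of a sample by a coefficient.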

#### c) Comparison between MAC, systolic MAC and DA based FIR filters

Throughput is the performance measure for a filter. It is defined as the 'number of clock cycles per output sample' [8]. In a conventional multiply-accumulate (MAC) based FIR realization, the sample throughput is coupled to the filter length: the sample rate is inversely proportional to the number of filter taps, so as the filter length increases, the system sample rate decreases proportionally. This is not the case with DA based architectures, where the filter sample rate is decoupled from the filter length. As the filter length of a DA FIR filter is increased, more logic resources are consumed, but the throughput is maintained. The SMAC based filters are area efficient, with a modular and efficient data-driven array architecture [8]. In the FIR compiler block we have selected three different architectures: DA, MAC and SMAC. The Sysgen architecture of the decomposition stage is shown in Figure-2, the reconstruction/synthesis part in Figure-3 and the architecture of the threshold block in Figure-4.
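The coupling between filter length and throughput can be made concrete with an idealized first-order model. The sketch below assumes a MAC filter that shares a fixed number of multipliers among all taps and a DA filter that consumes a fixed number of input bits per clock; both function names and the model itself are illustrative assumptions and do not reproduce the measured figures of the implemented cores.

```python
import math


def mac_cycles_per_sample(taps, mac_units=1):
    """Idealized MAC FIR: every tap needs one multiply on a shared unit."""
    if taps < 1 or mac_units < 1:
        raise ValueError("taps and mac_units must be positive")
    return math.ceil(taps / mac_units)


def da_cycles_per_sample(input_bits, bits_per_cycle=1):
    """Idealized DA FIR: cost depends on the word length, not on the taps."""
    if input_bits < 1 or bits_per_cycle < 1:
        raise ValueError("input_bits and bits_per_cycle must be positive")
    return math.ceil(input_bits / bits_per_cycle)
```

In this model, doubling the number of taps doubles `mac_cycles_per_sample` for a single MAC unit, while `da_cycles_per_sample` is unchanged because the extra taps only enlarge the look-up tables.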

#### d) DWT core for denoising

The DWT core is designed for processor-based system-on-chip designs on FPGA [10], which are useful for real-time signal processing or for direct interfacing with peripherals such as RS232 (UART). The DWT core uses two asynchronous FIFOs, one at the input and one at the output of the DWT logic. Each FIFO has a depth of 2048 and a width of 32 bits. The input signal is processed as a stream, with each stream having 2048 samples. However, only 1024 samples are used for processing one frame of the signal; the other 1024 samples are used for buffering and for checking whether the FIFO is 50% full. The DWT core accepts the Start signal and the input data (*fifo\_ip\_data*), and delivers the denoised data through the output data port (*fifo\_op\_data*). For handshaking, the associated full/empty qualifiers of the FIFOs are used for reading and writing.
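The frame buffering described above can be modeled behaviorally as follows. This is one possible reading of the text, in which a frame of 1024 samples is released once the 2048-deep input FIFO is at least 50% full; the class `FrameFifo` and its method names are hypothetical and do not correspond to signals of the actual core.

```python
from collections import deque

FIFO_DEPTH = 2048  # FIFO depth stated in the text
FRAME_SIZE = 1024  # samples processed per frame


class FrameFifo:
    """Behavioral model of the input FIFO with full/empty/half-full flags."""

    def __init__(self, depth=FIFO_DEPTH, frame_size=FRAME_SIZE):
        self.depth = depth
        self.frame_size = frame_size
        self._data = deque()

    @property
    def full(self):
        return len(self._data) >= self.depth

    @property
    def empty(self):
        return not self._data

    @property
    def half_full(self):
        return 2 * len(self._data) >= self.depth

    def write(self, sample):
        """Store one sample; returns False (write refused) when full."""
        if self.full:
            return False
        self._data.append(sample)
        return True

    def read_frame(self):
        """Return one frame in arrival order, or None if too few samples.

        With the default sizes this condition coincides with half_full.
        """
        if len(self._data) < self.frame_size:
            return None
        return [self._data.popleft() for _ in range(self.frame_size)]
```

A producer keeps calling `write` while the consumer polls `read_frame`; the second half of the FIFO absorbs new samples while the previous frame is being processed.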

## 4. SIMULATION AND IMPLEMENTATION RESULTS

The signals from all three axes, i.e., x-axis, y-axis and z-axis, are collected in static condition with a sampling frequency of 200 Hz at room temperature (20°C) for 1 hour. Since the FOG signal is sampled at 200 Hz, according to the Nyquist criterion the maximum frequency content of the signal is 100 Hz, as shown in Figure-5(a). After the signal is passed through the half-band high pass and low pass filters, the low pass output contains the frequency content from 0-50 Hz, as shown in Figure-5(b); after the second level of decomposition, the approximation part contains 0-25 Hz, as shown in Figure-5(c). Hence, after the third, fourth, fifth, sixth and seventh levels of decomposition, the data contain only very low frequency components, as shown in Figure-5(d), Figure-5(e), Figure-5(f) and Figure-6(a). Figure-6(b) shows the reconstructed denoised signal for one complete frame. In this work, the db2 wavelet with 7 levels of decomposition is used for denoising the FOG signal. In the figures, SW DWT and HW DWT correspond to the results obtained using MATLAB simulation and the Sysgen generated RTL, respectively.
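The band halving at each level can be written as a short calculation. The sketch below assumes ideal half-band filters, so the band edges are nominal; the function names are hypothetical.

```python
def approximation_band(fs_hz, level):
    """Nominal band (Hz) of the approximation after `level` decompositions."""
    if level < 1:
        raise ValueError("level must be at least 1")
    return 0.0, fs_hz / 2 ** (level + 1)


def detail_band(fs_hz, level):
    """Nominal band (Hz) of the detail coefficients at the given level."""
    if level < 1:
        raise ValueError("level must be at least 1")
    return fs_hz / 2 ** (level + 1), fs_hz / 2 ** level
```

For the 200 Hz sampling rate used here, `approximation_band(200.0, 1)` gives 0-50 Hz and `approximation_band(200.0, 2)` gives 0-25 Hz, matching the values quoted above, and at level 7 the approximation is nominally limited to 0-0.78125 Hz.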






Figure-5. DWT decomposition results up to sixth level.

#### VOL. 7, NO. 11, NOVEMBER 2012

#### ARPN Journal of Engineering and Applied Sciences






Figure-6. DWT 7<sup>th</sup> level decomposition and reconstruction results for one frame.

The effects of different thresholding techniques are studied and the denoised signals corresponding to different threshold techniques are shown in Figures-7 and 8. From these figures it is observed that soft thresholding gave smoother results compared to hard thresholding. Hence, for better noise elimination for the given data, either *sqtwolog* or *heursure* rules are preferred.
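The difference between the two threshold methods, together with the *sqtwolog* (square root log) selection rule, can be sketched as follows. The rule is taken here as sigma * sqrt(2 ln N) for N coefficients; the scaling by a noise level `sigma` (default 1) is an assumption, since the paper does not state how the noise level is estimated, and the function names are hypothetical.

```python
import math


def sqtwolog_threshold(n_coeffs, sigma=1.0):
    """Square root log (universal) threshold: sigma * sqrt(2 * ln(N))."""
    if n_coeffs < 2:
        raise ValueError("need at least two coefficients")
    return sigma * math.sqrt(2.0 * math.log(n_coeffs))


def hard_threshold(coeffs, thr):
    """Keep coefficients above the threshold unchanged, zero the rest."""
    return [c if abs(c) > thr else 0.0 for c in coeffs]


def soft_threshold(coeffs, thr):
    """Zero small coefficients and shrink the others toward zero by thr."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]
```

With a threshold of 1.0, the detail coefficients `[0.5, -2.0, 3.0]` become `[0.0, -2.0, 3.0]` under hard thresholding and `[0.0, -1.0, 2.0]` under soft thresholding. The shrinkage of the surviving coefficients removes the jump at the threshold, which is consistent with the smoother soft-threshold results observed here.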



Figure-7. Threshold selection rules in Soft threshold for x gyro.

## 5. FPGA IMPLEMENTATION RESULTS

To test the FPGA implementation of the DWT hardware for denoising the FOG signal, the data of all three axes are simulated; however, as a case study only the x-gyro, y-gyro, x-accelerometer and y-accelerometer signals are shown in the figures. The data were collected for a duration of four minutes and contain 40,000 samples. These data contain scale-factor error and bias drift error, which are random in nature. Ideally the data should indicate zero degree/sec, as the gyro sensor is stationary, but because of bias drift the data fluctuate between -0.4 and 0.4. The hardware DWT de-noising reduces the bias drift error and, indirectly, eliminates the high frequency component of the gyro signal without losing the rotation information. The noisy signal and the signal denoised in the RTL simulation are shown in Figure-9.




Figure-8. Threshold selection rules in hard threshold for x-gyro.



Figure-9. Co-simulation of x-gyro denoising DWT.

After the DWT de-noising, the bias drift is reduced and the values fluctuate between -0.01 and 0.01 with fewer variations. The wavelet de-noising removes short-term errors present in the gyro sensor data. Figure-10 shows that the SW DWT results closely match the FPGA HW DWT results, even though the SW results are computed in floating point arithmetic. As the level of decomposition increases, the closeness of the results decreases, as shown in Figure-10. This can be improved by increasing the bit-width of the signal in the FPGA. The standard deviations of the x, y and z gyros and accelerometers are tabulated in Table-1. The drift of the denoised signal is reduced by 5 times for the gyros and by 100 times for the accelerometer signals. Figure-10 and Figure-11 show the x and y gyro denoising results, and Figure-12 and Figure-13 show the x and y accelerometer denoising results, respectively. To present the effectiveness of denoising, the plots are shown for 40 frames only.

The following conclusions can be drawn from the gyro and accelerometer denoising results. The algorithmic and hardware implementation results are the same when denoising the static FOG signal. Figure-12 to Figure-15 show that the signals denoised in MATLAB and in hardware are consistent and that the signal is denoised effectively. The standard deviation, which is a measure of the drift of the signals, is computed before and after applying the DWT algorithm in hardware and is tabulated in Table-1. From this table it is observed that the gyro and accelerometer drift is reduced by 6 and 100 times, respectively, compared with the raw signals. The accelerometer standard deviation (STD) is calculated before the transition takes place.
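The drift measure used in this comparison, the standard deviation of a static signal before and after denoising, can be computed as in the sketch below. It assumes the population standard deviation (the paper does not say whether the population or sample form was used), and the function name `drift_reduction` is hypothetical.

```python
import statistics


def drift_reduction(raw, denoised):
    """Ratio of the standard deviations of the raw and denoised signals."""
    std_raw = statistics.pstdev(raw)
    std_denoised = statistics.pstdev(denoised)
    if std_denoised == 0:
        raise ValueError("denoised signal has zero standard deviation")
    return std_raw / std_denoised
```

A ratio greater than 1 means that the denoised signal fluctuates less than the raw one; for instance, a raw record `[1, -1, 1, -1]` and a denoised record `[0.5, -0.5, 0.5, -0.5]` give a ratio of 2.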






Figure-12. Comparison of RTL implementation x gyro denoising.



Figure-13. Comparison of RTL implementation y gyro denoising.







Figure-15. Comparison of RTL implementation y accelerometer denoising.


Table-1. Drift comparison of denoised signal.

| Signal  | STD (raw signal) | STD (DWT denoised signal) |
|---------|------------------|---------------------------|
| x-gyro  | 0.0689           | 0.0187                    |
| y-gyro  | 0.0571           | 0.0195                    |
| z-gyro  | 0.0297           | 0.0182                    |
| x-accel | 0.1186           | 0.0697                    |
| y-accel | 0.1313           | 0.0937                    |
| z-accel | 0.0682           | 0.0283                    |

Table-2. Resource utilization of various DWT architectures in Virtex-5 FPGA.

| Resources                          | DAFIR based DWT | MAC based DWT | SMAC based DWT |
|------------------------------------|-----------------|---------------|----------------|
| Slice registers                    | 17838 (39%)     | 2562 (5%)     | 2530 (5%)      |
| Slice LUTs                         | 15589 (34%)     | 2208 (4%)     | 2179 (4%)      |
| Slices                             | 5340 (47%)      | 929 (8%)      | 1163 (10%)     |
| LUT FF pairs                       | 15087 (82%)     | 11635 (52%)   | 11764 (59%)    |
| DSP48E                             | 13 (8%)         | 92 (71%)      | 107 (83%)      |
| BRAM                               | 12 (5%)         | 12 (5%)       | 12 (5%)        |
| Maximum frequency                  | 172 MHz         | 140 MHz       | 108 MHz        |
| Latency for 1 frame (clock cycles) | 1057            | 3125          | 1506           |

Table-3. Resource utilization of DA based DWTs in different FPGAs.

| Resources         | XUP         | Virtex-4 XtremeDSP | Virtex-5 FXT |
|-------------------|-------------|--------------------|--------------|
| Slice registers   | 25147 (91%) | 24236 (78%)        | 17838 (39%)  |
| Slice LUTs        | 19164 (69%) | 18616 (60%)        | 15589 (34%)  |
| Slices            | 13696 (99%) | 13613 (88%)        | 5340 (47%)   |
| LUT FF pairs      | 19983 (72%) | 17427 (56%)        | 15087 (82%)  |
| DSP48E/Mult       | 24 (18%)    | -                  | 13 (8%)      |
| BRAM              | 16 (15%)    | 16 (15%)           | 12 (5%)      |
| Maximum frequency | 126 MHz     | 155 MHz            | 172 MHz      |

The resource utilization of the DWT core for the different architectures, namely DAFIR, MAC and SMAC, is tabulated in Table-2. The latency of DAFIR is smaller than that of the other architectures. Its resource utilization mainly affects the number of Slices, LUTs and LUT FF pairs, and it consumes fewer DSP48 slices than the other architectures. The other architectures presented in this paper show a lower frequency of operation and a higher number of DSP48s, but use fewer of the other resources. The resource utilization of the DWT core on different FPGA development boards is shown in Table-3. The DWT core consumes a higher amount of resources on the XUP and Virtex-4 boards than on the Virtex-5 board, as expected.
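The latency figures of Table-2 can be converted from clock cycles to time with a one-line calculation. The sketch below assumes that each core is clocked at its reported maximum frequency, which may not be the clock actually used on the board; the function name `latency_seconds` is hypothetical.

```python
def latency_seconds(cycles, clock_hz):
    """Processing time of one frame for a given clock frequency."""
    if clock_hz <= 0:
        raise ValueError("clock frequency must be positive")
    return cycles / clock_hz


# (latency in clock cycles, maximum frequency in Hz), copied from Table-2.
TABLE_2 = {
    "DAFIR": (1057, 172e6),
    "MAC": (3125, 140e6),
    "SMAC": (1506, 108e6),
}

for name, (cycles, f_max) in TABLE_2.items():
    print(f"{name}: {latency_seconds(cycles, f_max) * 1e6:.2f} us per frame")
```

Under this assumption all three architectures process a frame within tens of microseconds, whereas acquiring one frame of 1024 samples at the 200 Hz sampling rate takes 1024 / 200 = 5.12 s.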

## 6. CONCLUSIONS AND FUTURE WORK

This paper presents an architecture for the FPGA implementation of the DWT algorithm. Three different architectural techniques are used and their hardware performance is compared. A DWT hardware core is developed using the DA architecture. An extensive study of different threshold methods is carried out for denoising the FOG signal. The core is implemented on the Virtex-5 FXT development board. The DWT algorithm based on the DAFIR architecture gives better results in terms of frequency of operation, resource utilization, DSP48 resources and latency. The DWT core reduces the noise of the FOG signal in static condition satisfactorily. Future work includes system-on-chip development using this core for real-time denoising of the FOG signal.


## ACKNOWLEDGMENTS

The authors are thankful to Research Center Imarat (RCI), DRDO, Hyderabad, Government of India for providing financial support to carry out this work.

## REFERENCES

- [1] Culshaw B. and Giles I. P. 1983. Fiber-optic gyroscopes. J. Phys. E (16): 5-15.
- [2] Nayak J. 2011. Fiber-optic gyroscopes: from design to production. Appl. Opt. Sep; 50(25): E152–E161.
- [3] S. Dang, W. Tian, and Z. Jin. 2009. De-noising stochastic noise in fog based on second-generation db4 wavelet and sure-threshold. Wuhan University Journal of Natural Sciences. 14: 494-498.
- [4] X. Chen. 2005. Adaptive filtering based on the wavelet transform for fog on the moving base. In Advances in Intelligent Computing, ser. Lecture Notes in Computer Science. D.-S. Huang, X.-P. Zhang and G.-B. Huang (Eds.). Springer Berlin Heidelberg. 3644: 447-455.
- [5] Mao Ben, Wu Jun Wei, Wu Jian Tong and Zhou Xue Mei. 2010. MEMS Gyro Denoising Based on Second Generation Wavelet Transform. Proceedings of the 9<sup>th</sup> International Conference on Pervasive Computing Signal Processing and Applications (PCSPA). pp. 255-258.
- [6] M. Bahoura and H. Ezzaidi. 2011. FPGA-implementation of discrete wavelet transform with application to signal denoising. Circuits, Systems, and Signal Processing. pp. 1-29.
- [7] S. Mallat. 1989. A theory for multiresolution signal decomposition: the wavelet representation. Pattern Analysis and Machine Intelligence, IEEE Transactions on. 11(7): 674-693, July.
- [8] D. J. M. Emmert. 2008. System Generator for DSP User Guide. Xilinx, Inc. 2100 Logic Drive, San Jose, CA 95124-3400, Tech. Rep. 10.1.3.
- [9] Pingree P, Blavier JF, Toon G and Bekker D. 2007. An FPGA/SoC Approach to On-Board Data Processing Enabling New Mars Science with Smart Payloads. Proceedings of the IEEE Conference on Aerospace. pp. 1-12.
- [10] 2008. Reference Guide UG200 Embedded Processor Block in Virtex-5 FPGAs. Technical Report 10.1.3, Xilinx, Inc., 2100 Logic Drive, San Jose, CA 95124-3400.

