
www.arpnjournals.com

# ON CHIP COMMUNICATION ARCHITECTURE POWER ESTIMATION IN HIGH FREQUENCY HIGH POWER MODEL

Khalid B. Suliman<sup>1</sup>, Rashid A. Saeed<sup>2</sup> and Raed A. Alsaqour<sup>3</sup>

<sup>1</sup>Department of Electrical and Electronic Engineering, Omdurman Islamic University, Khartoum, Sudan <sup>2</sup>College of Engineering, Sudan University of Science and Technology (SUST), Khartoum, Sudan <sup>3</sup>School of Computer Science, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia

E-Mail: raed.ftsm@gmail.com

# ABSTRACT

System-on-Chip (SoC) on-chip communication architecture solved the problem of how to interconnect hundreds of processing elements (PEs) and storage elements (SEs) inside one chip, but on the other hand it added the power consumed in communication elements such as bridges, bus wires, bus interfaces and arbiters to the overall power usage of the chip. Various power estimation techniques have been introduced, but most focus only on parts of the SoC communication architecture, such as the global bus interconnect or the bus wires, and therefore capture only part of the overall consumed power. This paper proposes a system-level power consumption estimation model for SoCs that covers all of the communication elements and takes high-frequency effects and system communication activity into consideration.

Keywords: system-on-chip, on-chip communication, power estimation, power consumption.

# INTRODUCTION

Embedded system chips continue to evolve in capability to meet the ever-growing demands of high-end technology for production and manufacturing worldwide. This development has led to a plethora of System-on-Chip (SoC) designs with high processing capabilities, large memories and many interfaces. All of these requirements increase the power consumed within the chip. With this in mind, system designers worldwide are optimizing the power efficiency of such chips because of their limited power budgets (e.g., batteries). Consequently, a good power optimization system must be built on an accurate power estimation model.

To show the significance of the power consumed in the communication architecture, we compare it with the power consumed in other system components (Table-1) [1]. The table presents data obtained from gate-level power measurements and manufacturer data sheets of several commercial SoC components, including a complete communication architecture. The table indicates that the communication architecture can consume significantly more power than many system components. In fact, its power is comparable in magnitude to well-known primary sources of power consumption (e.g., processors, caches).

| System component     | Part name                  | Power (mW) |
|----------------------|----------------------------|------------|
| Embedded processor   | ARM946E-S <sup>1</sup>     | 60         |
| Memory controller    | DW_ahb_memctl <sup>2</sup> | 29.1       |
| On-chip bus          | DW_AMBA <sup>2</sup>       | 22.6       |
| Cache                | ARM946E-S <sup>1</sup>     | 36         |
| Interrupt controller | DW_ahb_ictl <sup>2</sup>   | 2.6        |
| UART                 | DW_apb_uart <sup>2</sup>   | 4.1        |

Table-1. Power consumption of SoC components at 200 MHz.
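As a rough illustration of the comparison in Table-1, the sketch below sums the listed component powers and reports the share drawn by the on-chip bus. Treating the six parts as one system is an assumption made only for this illustration, because the table lists independent commercial components; the function name `power_share` is likewise hypothetical.

```python
# Power figures in mW, copied from Table-1.
table1_mw = {
    "Embedded processor": 60.0,
    "Memory controller": 29.1,
    "On-chip bus": 22.6,
    "Cache": 36.0,
    "Interrupt controller": 2.6,
    "UART": 4.1,
}


def power_share(powers_mw, name):
    """Return the fraction of the summed power drawn by one component."""
    total = sum(powers_mw.values())
    return powers_mw[name] / total


# Assumption: the six listed parts are treated as a single system.
bus_share = power_share(table1_mw, "On-chip bus")
print(f"On-chip bus share of the listed components: {bus_share:.1%}")
```

Under that assumption the bus draws roughly 15% of the summed power, more than the interrupt controller and UART combined and the same order of magnitude as the memory controller.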

This work introduces an SoC power estimation model based on a system-level analysis of a modified state-of-the-art Advanced Microcontroller Bus Architecture (AMBA) chip. The model decomposes the SoC power into two parts. The first is the power consumed in the logic elements, such as the arbiter, decoder, input devices and output devices, estimated with a power model that includes each device's functionality. The second is the power consumed in the bus wires, estimated with a power model that takes into account high-frequency operation and variation in system activity.

# **RELATED WORK**

Many works have addressed power estimation in SoCs using analytic techniques at the register transfer level, as in [1-4]. In [1], a system-level analysis of the on-chip communication system decomposed the chip into logic elements, bus wires and bus interface elements, calculated the power in each part of the chip, and recommended a number of power reduction schemes. In [2], the abstraction level of hardware design entry was raised from the register transfer level (RTL) to the electronic system level (ESL) to address two recognized problems: how to enable a power-aware design flow with the design entry point at the ESL, and how to enable power-aware high-level synthesis that produces the RTL implementation from the ESL automatically. A platform for power analysis, and an improved system built with power analysis, was suggested in [5]. Another model featured three important functions: power measurement, power analysis and power management. This system model helped designers develop high-performance embedded system designs with low power consumption. A framework for describing power behavior in system-level designs was proposed in [6]. A model with a set of resources, an environmental workload specification and a power management policy serves as the heart of the system. This model is mapped to a simulation-based framework to obtain an estimate of the system's power dissipation. The authors also proposed an algorithm to optimize power management policies; it can be used in a tight loop with the estimation engine to derive new power-management policies for a given system-level description.

#### OUR APPROACH

This paper addresses the estimation of the energy consumed in an on-chip communication system, specifically a simplified AMBA chip as shown in Figure-1. It has an Advanced High-performance Bus (AHB), which is a pipelined bus; in other words, the address and data phases of different transactions may overlap in time [7]. All the slaves are memory mapped. For each transfer, multiplexers route the address, write data and control parameters from the masters to the slaves. The decoder generates the slave select signals that reach the correct slave, and it also handles the slave responses and read data returned from the slaves to the masters. The arbiter regulates access to the shared bus using a configurable arbitration scheme. Burst transactions enable a master to perform a sequence of transfers without requiring arbitration for each transfer. We build a system-level analysis on the theory in [4] and modify the model to include high-frequency effects in the form of resistance, inductance and capacitance (RLC) effects. We focus on the bus wire model and make it respond to system activity by proposing two bus models: the first is a high-frequency model and the second is a high-level power model with high-frequency effects. The first model is taken as the normal-activity model, while the system switches to the second model in high-activity situations. Figure-2 shows the system flowchart.

#### POWER ESTIMATION MODEL

A macro model was used to obtain the energy consumption of the AMBA AHB bus matrix communication architecture; it gives the total energy consumption of the bus. This macro model was first introduced in [7]. We modified the wire energy model to include high-frequency effects, as can be seen in equation 8.



Figure-1. Used AMBA chip model.



Figure-2. System flowchart.

The wire model is switched according to the system activity, which is related to the master device activity. Master device activity is observed through the energy consumed by the master device, and on that basis we switch between the two wire models (equations 6 and 9). The whole system model is expressed in equation 1:

$$E_{TOTAL} = E_{INP} + E_{DEC} + E_{ARB} + E_{OUT} + E_{WIRE} \tag{1}$$


where $E_{INP}$ and $E_{DEC}$ are the energies of the input stage (master) and decoder components for all the master devices connected to the bus matrix, $E_{ARB}$ and $E_{OUT}$ are the energies of the arbiters and output (slave) stages that connect the slaves to the bus matrix, and $E_{WIRE}$ is the energy of the bus wires that connect the masters and slaves.

Each master device is connected to the bus matrix through its own input stage, which buffers the address and control bits of a transaction. The input stage model can be expressed as:

$$E_{INP} = \alpha_{inp0} + \alpha_{inp1} \cdot \psi_{load} + \alpha_{inp2} \cdot \psi_{desel} + \alpha_{inp3} \cdot \psi_{HDin} + \alpha_{inp4} \cdot \psi_{drive} \tag{2}$$

A decoder is connected to every master. It consists of logic elements that generate the select signal for each slave after decoding the destination address of an issued transaction. It also handles multiplexing of read data and response signals from the slaves. The decoder energy consumption can be calculated as:

$$E_{DEC} = \alpha_{dec0} + \alpha_{dec1} \cdot \psi_{slavesel} + \alpha_{dec2} \cdot \psi_{respsel} + \alpha_{dec3} \cdot \psi_{HDin} + \alpha_{dec4} \cdot \psi_{sel} \tag{3}$$

Each slave is connected to the bus matrix through an output stage, which handles multiplexing of the address and control bits from the input stages. It also interacts with the arbiter to determine which master is allowed to use the bus. The energy consumption of the output stage is given by:

$$E_{OUT} = \alpha_{out0} + \alpha_{out1} \cdot \psi_{addrsel} + \alpha_{out2} \cdot \psi_{datasel} + \alpha_{out3} \cdot \psi_{HDin} + \alpha_{out4} \cdot \psi_{noport} \tag{4}$$

The arbiter is influenced by the output stage, and works according to an arbitration scheme to grant access to one of the potentially several masters requesting an access to the slave. The cycle energy model for the arbiter is given by:

$$E_{ARB} = \alpha_{arb0} + (\alpha_{arb1} + n\alpha_{arb2}) \cdot \psi_{arb} + \alpha_{arb3} \cdot \psi_{arb+1} + (\alpha_{arb4} + n\alpha_{arb5}) \cdot \psi_{arb3} + \alpha_{arb5} \cdot \psi_{arb+1} \tag{5}$$

where $\alpha$ represents an energy coefficient, $\psi$ represents a control signal and $n$ represents the number of input devices.
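The input stage, decoder and output stage models (equations 2 to 4) share one linear form: a constant coefficient plus a weighted sum of control-signal values. A minimal sketch of that form, together with the summation of equation 1, might look as follows. The coefficient and signal values are placeholders chosen for illustration and are not taken from the paper; the function names are also hypothetical. The arbiter model (equation 5) groups some coefficients with $n$ and is not covered here.

```python
def macro_model_energy(alpha, psi):
    """Evaluate E = alpha[0] + sum(alpha[k] * psi[k - 1]) for one component.

    alpha: energy coefficients [alpha_0, alpha_1, ..., alpha_K]
    psi:   control-signal values [psi_1, ..., psi_K] for the current cycle
    """
    if len(alpha) != len(psi) + 1:
        raise ValueError("need exactly one more coefficient than control signals")
    return alpha[0] + sum(a * p for a, p in zip(alpha[1:], psi))


def total_energy(e_inp, e_dec, e_arb, e_out, e_wire):
    """Equation 1: total energy as the sum of the five component energies."""
    return e_inp + e_dec + e_arb + e_out + e_wire


# Placeholder values (assumptions, NOT from the paper).
alpha_inp = [1.0, 0.5, 0.2, 0.1, 0.3]   # alpha_inp0 .. alpha_inp4
psi_inp = [1, 0, 4, 1]                  # psi_load, psi_desel, psi_HDin, psi_drive

e_inp = macro_model_energy(alpha_inp, psi_inp)
print(f"Input stage energy (arbitrary units): {e_inp:.2f}")
```

Keeping the coefficients in a list per component lets the same function serve all three linear models; only the coefficient set and the signal vector change.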

The total power consumption of the bus in the high-activity power model is given by:

$$P_{total} = P_{sw} + P_{vias} + P_{rep} \tag{6}$$

where $P_{sw}$ is the power consumed in switching the interconnect capacitance and the inter-wire coupling, $P_{vias}$ is the power consumed in the vias due to the use of multiple metal layers (equation 7), and $P_{rep}$ is the power consumed by the repeaters, which are used to minimize signal delay (equation 8). The switching energy model makes use of a table first presented by Taylor *et al.* in [8], in which the total switching power is determined by the types of transitions that can occur on the interconnect, not by their number.

$$P_{vias} = V_N \cdot P_{via} \tag{7}$$

where $V_N$ is the number of vias and $P_{via}$ is the power consumed by a single via.

$$P_{rep} = Z_{rep} V_{dd}^2 f \cdot \sum_{i \in I} \rho_i N_{Ri} + P_{via} \cdot \sum_{i \in I} V_{Ri} \tag{8}$$

where $P_{rep}$ is the power consumed in the repeaters, $Z_{rep}$ is the repeater impedance, $V_{dd}$ is the operating voltage, $\rho_i$ is the switching activity, $N_R$ is the number of repeaters, $f$ is the clock frequency and $V_R$ is the total number of repeater vias, calculated as twice $N_R$.
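Equations 6 to 8 can be sketched in code as shown below. The index set $I$ is not defined in the text, so the sketch assumes it ranges over the entries of two equal-length lists (for example, one entry per wire segment); the function names and the example numbers are hypothetical and only exercise the arithmetic. The repeater-via count uses the stated relation $V_{Ri} = 2 N_{Ri}$.

```python
def vias_power(num_vias, p_via):
    """Equation 7: P_vias = V_N * P_via."""
    return num_vias * p_via


def repeater_power(z_rep, v_dd, f, activity, n_rep, p_via):
    """Equation 8, with V_Ri = 2 * N_Ri as stated in the text.

    activity[i] is rho_i and n_rep[i] is N_Ri for each index i in I.
    """
    if len(activity) != len(n_rep):
        raise ValueError("activity and n_rep must have the same length")
    switching = z_rep * v_dd ** 2 * f * sum(r * n for r, n in zip(activity, n_rep))
    via_part = p_via * sum(2 * n for n in n_rep)
    return switching + via_part


def bus_power(p_sw, p_vias, p_rep):
    """Equation 6: P_total = P_sw + P_vias + P_rep."""
    return p_sw + p_vias + p_rep


# Illustrative numbers only (assumptions, NOT from the paper).
p_rep = repeater_power(z_rep=2.0, v_dd=1.0, f=10.0,
                       activity=[0.5, 0.25], n_rep=[2, 4], p_via=0.1)
p_vias = vias_power(num_vias=10, p_via=0.1)
p_total = bus_power(p_sw=1.0, p_vias=p_vias, p_rep=p_rep)
print(f"P_rep = {p_rep:.2f}, P_vias = {p_vias:.2f}, P_total = {p_total:.2f}")
```

The switching term $P_{sw}$ is passed in as a number because the transition table from [8] that determines it is not reproduced in this paper.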

Another wire model was proposed in [9]. It is used here as the high-frequency RLC model and is based on a model order reduction technique.

$$E_{bus} = r^2 c^2 \times \frac{c(2l - cr^2)}{(r^2 + \sqrt{c(2l - cr^2)})} e^{-x} \frac{r(\sqrt{c(2l - cr^2)} + \sqrt{cr})}{l - cr^2} \tag{9}$$

where $r$ is the resistance per unit length, $l$ is the inductance per unit length, $c$ is the capacitance per unit length and $x$ is the bus length.
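The activity-driven switch between the two wire models (equations 6 and 9) can be sketched as below. The paper states that master activity is observed through the master's consumed energy but gives no numeric criterion, so the threshold here is a hypothetical parameter, and the two wire models are passed in as callables rather than implemented.

```python
def select_wire_model(master_energy, activity_threshold):
    """Choose the wire model using master energy as a proxy for activity.

    activity_threshold is a hypothetical tuning parameter (not from the paper).
    """
    if master_energy > activity_threshold:
        return "high_activity"        # equation 6: switching + vias + repeaters
    return "high_frequency_rlc"       # equation 9: RLC model for normal activity


def estimate_wire(master_energy, activity_threshold, high_activity_model, rlc_model):
    """Return the chosen model name and its estimate.

    high_activity_model and rlc_model are zero-argument callables that return
    the wire estimate of the corresponding model.
    """
    name = select_wire_model(master_energy, activity_threshold)
    if name == "high_activity":
        return name, high_activity_model()
    return name, rlc_model()


# Stand-in models returning fixed values (assumptions for illustration).
name, value = estimate_wire(
    master_energy=5.0,
    activity_threshold=3.0,
    high_activity_model=lambda: 0.8,
    rlc_model=lambda: 0.2,
)
print(name, value)
```

Passing the models as callables keeps the selection logic independent of how each wire estimate is computed.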

#### **RESULTS AND DISCUSSIONS**

Figure-3 shows the estimated power consumed in the master device.



Figure-3. Power consumed in the master device.


The master devices are assumed to be active for the entire simulation period, communicating with and sending data to the slave devices.

Figure-4 shows the estimated power consumed in the decoder. Decoder activity is low compared with the master device because the decoder is active only when the master needs to select a slave device and when the slave device requires multiplexing.



Figure-4. Power consumed in the decoder.

Figure-5 shows the estimated power consumed in the arbiter. The arbitration process depends on the activity of the master devices, since the arbiter's job is to ensure, by means of an arbitration scheme, that only one master uses the bus at any time.



Figure-5. Power consumed in the arbiter.

The power dissipated in the wires was estimated using two power models. The first model was developed for a high-frequency system with low system activity; it treats the wire as an RLC network, and its power estimate is shown in Figure-6. The second model was developed for a high-frequency system with high system activity; it depends on the transition activity between wire interconnects and on the power consumed in the repeaters and the *vias*, and its power estimate is shown in Figure-7.



Figure-6. Power consumed in the wire for normal high frequency system.



Figure-7. Power consumed in the wire for high system activity.

The largest share of power, about 58% of the total power dissipation, was consumed in the master device (input stage), which can be attributed to the high activity of the master device in initiating data transmission and managing all the connected devices (see Figure-8). On the other hand, the smallest share was consumed in the bus wires, at less than 1% of the total dissipated power (see Figure-8), because this power depends only on the bus wire material, length and wire interconnects. The other devices consumed comparable shares: 14% of the total power in the decoder, 17% in the output stage and 10% in the arbiter. This can be attributed to the nature of their work, as they are active only in bursts while communicating with the master device. Figure-9 shows the power consumption of the system devices as a whole.
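As a quick consistency check on the reported breakdown, the sketch below sums the four logic-element shares quoted above and ranks them. The percentages are the rounded values given in the text; the variable names are hypothetical.

```python
# Rounded shares of total power (percent) as reported in the text.
reported_share_pct = {
    "input stage": 58,
    "decoder": 14,
    "output stage": 17,
    "arbiter": 10,
}

# Sum of the logic-element shares and the remainder left for the bus wires.
logic_total_pct = sum(reported_share_pct.values())
wire_remainder_pct = 100 - logic_total_pct

# Components ordered from largest to smallest share.
ranking = sorted(reported_share_pct, key=reported_share_pct.get, reverse=True)

print(f"Logic elements: {logic_total_pct}%, remainder for wires: {wire_remainder_pct}%")
print("Ranking:", ", ".join(ranking))
```

The four logic shares sum to 99%, which leaves about 1% for the bus wires and agrees with the reported figure of less than 1% once rounding is allowed for.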



Figure-8. Power consumption percentage for all system components.



Figure-9. Power consumed in the AMBA chip parts.

# CONCLUSIONS

This paper has presented a power estimation for a simplified AMBA bus chip architecture, with an emphasis on system-level analysis using Matlab. The system power estimation model was divided into five power models. One part was dedicated to estimating the power in the input stage, which was operated at a high activity level, transmitting data to the output device and managing all the connected devices.

The results revealed that most of the power, about 60% of the total, was consumed in this device. Another part of the power estimation model was dedicated to the output stage, which was treated as a slave device that only receives and transmits data with the input stage; the results showed that only 10% of the power was consumed in this stage. A further model was dedicated to the bus wires, the main focus of this research; it makes the power estimate depend on the system activity by using two wire power models, one for high system activity and one for normal system activity.

The high system activity case was modeled with a high-level power model. The normal system activity case was modeled as a modified RLC network. The results for both models show that the least power was dissipated in the bus wires, at less than 1% of the total power. The other two power estimation models were dedicated to the middle-stage devices (decoder and arbiter). These two devices work only to help the input device communicate with the output device and vice versa. The results show that only about 31% of the total power was consumed in these devices.

As future work, power reduction techniques can be explored for high-level or multilevel synthesis based design frameworks that enable clock gating from an ANSI C description. This approach can be extended to some of the dynamic and static power reduction techniques.

#### ACKNOWLEDGMENTS

This research was supported in part by the Ministry of Higher Education, Malaysia and Universiti Kebangsaan Malaysia (UKM), Malaysia, under research grant numbers FRGS/1/2012/SG05/UKM/02/7 and DIP-2014-037.

# REFERENCES

- [1] K. Lahiri and A. Raghunathan. 2004. Power analysis of system-level on-chip communication architectures. In Proceedings of the 2<sup>nd</sup> IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis. pp. 236-241.
- [2] S. Ahuja. 2010. High Level Power Estimation and Reduction Techniques for Power Aware Hardware Design. Virginia Polytechnic Institute and State University.
- [3] M. Onouchi, T. Yamada, K. Morikawa, I. Mochizuki and H. Sekine. 2006. A system-level power-estimation methodology based on IP-level modeling, power-level adjustment, and power accumulation. In Proceedings of the 2006 Asia and South Pacific Design Automation Conference. pp. 547-550.
- [4] E. Macii, M. Pedram and F. Somenzi. 1998. High-level power modeling, estimation, and optimization. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on. 17: 1061-1079.




- [5] L.-B. Chen, T.-Y. Ho, C.-H. Lin and J. Huang. A Real-time Power Analysis System for an Embedded Systems Development Platform. In Proc. of the 17th VLSI Design/CAD Symposium.
- [6] L. Benini, R. Hodgson and P. Siegel. 1998. System-level power estimation and optimization. In Low Power Electronics and Design, 1998. Proceedings. 1998 International Symposium on. pp. 173-178.
- [7] S. Pasricha and N. Dutt. 2010. On-chip communication architectures: system on chip interconnect: Morgan Kaufmann.
- [8] C. N. Taylor, S. Dey and Y. Zhao. 2001. Modeling and minimization of interconnect energy dissipation in nanometer technologies. In Proceedings of the 38th annual Design Automation Conference. pp. 754-757.
- [9] R. Kar, V. Maheshwari, M. Maqbool, A. K. Mal and A. Bhattacharjee. 2010. Power-Estimation for On-Chip VLSI Distributed RLC Global Interconnect Using Model Order Reduction Technique. Power. Vol. 1.