# ANN based STLF of Power System


*International Journal of Computer Applications (0975 - 8887)*

*Volume 30- No.4, September 2011*

**Artificial Neural Network based Short Term Load Forecasting of Power System**

Salman Quaiyum, Yousuf Ibrahim Khan, Saidur Rahman, Parijat Barman

Department of Electrical and Electronic Engineering,

American International University - Bangladesh,

Banani, Dhaka - 1213.

**ABSTRACT**

Load forecasting is the prediction of future loads of a power system. It is an important component of power system energy management. Precise load forecasting helps to make unit commitment decisions, reduce spinning reserve capacity, and schedule device maintenance properly. Besides playing a key role in reducing the generation cost, it is also essential to the reliability of power systems. With forecasts, experts can anticipate future loads and make vital decisions for the system accordingly. This work presents a study of short-term hourly load forecasting using different types of Artificial Neural Networks.

**General Terms**

Artificial Intelligence, Neural Networks.

**Keywords**

Load Forecasting, Power System, Particle Swarm Optimization.

**1. INTRODUCTION**

Load forecasting is one of the central functions in power system operations, and it is extremely important for energy suppliers, financial institutions, and other participants involved in electric energy generation, transmission, distribution, and supply. Load forecasts can be divided into three categories: short-term, medium-term, and long-term forecasts. Short-term load forecasting (STLF) is an important part of the power generation process. Previously it relied on traditional approaches such as time series analysis, but new methods based on artificial and computational intelligence have started to replace the old ones in the industry.

Artificial Neural Networks are increasingly preferred over traditional forecasting techniques, and the most popular artificial neural network architecture for load forecasting is the back-propagation network. This network uses continuously valued functions and supervised learning: the numerical weights assigned to element inputs are determined by matching historical data (such as time and weather) to desired outputs (such as historical loads) in a pre-operational "training session". The model can forecast load profiles from one to seven days ahead.

Evolutionary algorithms such as Genetic Algorithm (GA) [1, 2], Particle Swarm Optimization (PSO) [3-5], Artificial Immune System (AIS) [6], and Ant Colony Optimization (ACO) [7] have been used for training neural networks in short-term load forecasting applications. These algorithms are better than back-propagation in convergence and search space capability.

**2. ARTIFICIAL NEURAL NETWORK**

An Artificial Neural Network (ANN) is a mathematical or computational model based on the structure and functional aspects of biological neural networks. It consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase.

**2.1 Recurrent Neural Network**

A Recurrent Neural Network (RNN) is a class of neural network where connections between units form a directed cycle. Unlike feed-forward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. A recurrent neural network contains at least one feedback loop. It may consist of a single layer of neurons with each neuron feeding its output signal back to the inputs of all the other neurons.

In this work, Elman's recurrent neural network has been chosen as the model structure, as it has been shown to perform well in comparison to other recurrent architectures [8]. Elman's network contains recurrent connections from the hidden neurons to a layer of context units consisting of unit delays, which store the outputs of the hidden neurons for one time step and then feed them back to the input layer.

**Fig 1: Elman recurrent neural network topology.**
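To make the role of the context units concrete, the following is a minimal Python sketch of a forward pass through an Elman-style network. It is an illustration under stated assumptions, not the authors' MATLAB implementation: the function names, the weight layout (`w_in`, `w_ctx`, `w_out`), the zero initial context, and the single linear output neuron are all hypothetical choices, and the hidden and output transfer functions are taken to be tanh and the identity.

```python
import math


def elman_step(x, context, w_in, w_ctx, b_hid, w_out, b_out):
    """One time step of an Elman-style network (illustrative layout).

    x       : list of m input values for this time step
    context : list of r hidden outputs stored from the previous step
    Returns (output, hidden); `hidden` becomes the next step's context.
    """
    hidden = []
    for k in range(len(b_hid)):
        # Weighted sum over the external inputs and the context units.
        s = b_hid[k]
        s += sum(w_in[k][i] * x[i] for i in range(len(x)))
        s += sum(w_ctx[k][j] * context[j] for j in range(len(context)))
        hidden.append(math.tanh(s))  # tan-sigmoid transfer function
    # Pure linear output neuron.
    output = b_out + sum(w_out[k] * hidden[k] for k in range(len(hidden)))
    return output, hidden


def elman_run(inputs, w_in, w_ctx, b_hid, w_out, b_out):
    """Run a whole input sequence, starting from an all-zero context."""
    context = [0.0] * len(b_hid)
    outputs = []
    for x in inputs:
        y, context = elman_step(x, context, w_in, w_ctx, b_hid, w_out, b_out)
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    # Hypothetical weights for m = 2 inputs and r = 2 hidden neurons.
    outs = elman_run(
        [[1.0, 0.5], [1.0, 0.5]],
        w_in=[[0.5, -0.2], [0.1, 0.3]],
        w_ctx=[[0.4, 0.0], [0.0, 0.4]],
        b_hid=[0.0, 0.0],
        w_out=[1.0, -1.0],
        b_out=0.0,
    )
    print(outs)
```

Feeding the same input twice gives two different outputs, because the second step also sees the hidden outputs stored from the first step. This is the one-step memory that distinguishes the Elman structure from a feed-forward network.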


Figure 1 shows an Elman recurrent neural network topology, where **w** denotes a vector of the synaptic weights, **x** and **u** are vectors of the inputs to the layers, **m** is the number of input variables, and **r** is the number of neurons in the hidden layer.

The weighted sums for the hidden and the output layers are:

(1)

(2)

where $k = 1, \ldots, r$, $n = 1, \ldots, N$, and $N$ is the number of data points used for training of the model. The outputs of the neurons in the hidden layer and output layer are computed by passing the weighted sum of inputs through the tan-sigmoid and pure linear transfer functions, respectively.

Mathematically, the outputs of the hidden layer and the output layer can be defined as:

(3)

(4)

where $K$ is a coefficient of the pure linear transfer function.

Another training parameter considered is the momentum factor, which is used in an attempt to prevent the network from getting stuck in a shallow local minimum. Equation (5) shows how the synaptic weights are adjusted and how the network determines the value of the increment on the basis of the previous value of the increment:

(5)

The other important training parameter is the learning rate, which controls the amount of change imposed on connection weights during training and helps provide faster convergence. Mathematically, the weights are updated using the equation:

(6)

in which the learning rate enters as a parameter.

**3. PARTICLE SWARM OPTIMIZATION**

PSO is a tool that provides a population-based search procedure in which individuals called particles change their position (state) with time. In a PSO system, particles move around in a multidimensional search space. During flight, each particle adjusts its position in accordance with its own experience and the experience of a neighboring particle, making use of the best position encountered by itself and by its neighbor. Thus, a PSO system combines local search methods with global search methods, attempting to balance exploration and exploitation.

The basic concept is that at every time instant, the velocity of each particle, also known as a potential solution, changes between its **pbest** (personal best) and **lbest** (local best) locations. The particle associated with the best solution (fitness value) is the leader, and each particle keeps track of its coordinates in the problem space. The best fitness value a particle has achieved is stored and referred to as **pbest**. Another "best" value tracked by the particle swarm optimizer is the best value obtained so far by any particle in the neighborhood of the particle. This location is called **lbest**. When a particle takes the whole population as its topological neighbors, the best value is a global best and is called **gbest**.

(7)

(8)

where $V_i$ is the current velocity, $t$ defines the discrete time interval over which the particle will move, $V_{i-1}$ is the previous velocity, **presLocation** is the present location of the particle, **prevLocation** is the previous location of the particle, and **rand( )** is a random number between 0 and 1. The update also includes an inertia weight. $c_1$ and $c_2$ are the learning factors (also called stochastic factors or acceleration constants) for "gbest" and "pbest" respectively.

**Fig 2: PSO-ERNN Learning Process**

Figure 2 shows the learning process of the PSO-ERNN (Particle Swarm Optimized Elman Recurrent Neural Network). In step 1, learning is initialized with a group of random particles, which are assigned random PSO positions (weights and biases). In step 2, the PSO-ERNN is trained using the initial particle positions. In step 3, it produces the learning error (particle fitness) based on the initial weights and biases. The learning error at the current epoch is reduced by changing the particle positions, which updates the weights and biases of the network. In step 4, the "pbest" and "gbest" values are applied to the velocity update according to (7) to produce a value for adjusting the positions toward the best solution or targeted learning error. In step 5, the new sets of positions (weights and biases) are produced by adding the calculated velocity value to the current position value. These new positions are then used to produce a new learning error in the PSO-ERNN. This process is repeated until the stopping
conditions of either minimum learning error or maximum number of iterations are met, as shown in step 6.

**3.1 Global best PSO**

The global version of PSO is faster but might converge to a local optimum for some problems. The local version, though slower, does not easily get trapped in a local optimum. We implemented the global version to achieve a quicker result. The position of a particle is influenced by its best visited position and the position of the best particle in its neighborhood.

Particle position, $x_i$, was adjusted using

$$x_i(t+1) = x_i(t) + v_i(t+1) \tag{9}$$

where the velocity component, $v_i$, represents the step size. For the basic PSO,

$$v_{i,j}(t+1) = w\,v_{i,j}(t) + c_1 r_{1,j}(t)\,[y_{i,j}(t) - x_{i,j}(t)] + c_2 r_{2,j}(t)\,[\hat{y}_{i,j}(t) - x_{i,j}(t)] \tag{10}$$

where $w$ is the inertia weight, $c_1$ and $c_2$ are the acceleration coefficients, $r_{1,j}, r_{2,j} \sim U(0,1)$, $y_i$ is the personal best position of particle $i$, and $\hat{y}_i$ is the neighborhood best position of particle $i$.

If a fully-connected topology is used, then $\hat{y}_i$ refers to the best position found by the entire swarm. That is,

$$\hat{y}_i(t) \in \{y_1(t), \ldots, y_s(t)\} \quad \text{such that} \quad f(\hat{y}_i(t)) = \min\{f(y_1(t)), \ldots, f(y_s(t))\} \tag{11}$$

where $s$ is the swarm size.

The pseudo-code for PSO is shown below:

    for each particle i = 1, ..., s do
        Randomly initialize x_i
        Set v_i to zero
        Set y_i = x_i
    endfor
    repeat
        for each particle i = 1, ..., s do
            Evaluate the fitness of particle i, f(x_i)
            Update y_i
            Update ŷ_i using equation (11)
            for each dimension j = 1, ..., N_d do
                Apply velocity update using (10)
            endfor
            Apply position update using (9)
        endfor
    until some convergence criterion is satisfied

**4. DATA COLLECTION AND PREPROCESSING**

A simple data collection method was employed to ensure adequate historical samples of the load. Load data used in this work were collected from the Bangladesh Power Development Board (BPDB) and ISO New England.

**4.1 Data Scaling Methods**

Data scaling is carried out in order to improve the interpretability of network weights. The equation below has been adopted and implemented to normalize the historical load data.

(12)

The load value is normalized into the range between 0 and 1, and then the neural networks are trained using a suitable algorithm. Neural networks provide improved performance with normalized data. Using the original data as input to the neural network may increase the possibility of a convergence problem.

**4.2 Data Storage**

Data can be stored in many ways. In this work, the collected data was stored in Microsoft Excel worksheets. The worksheets were then imported into MATLAB using the command "xlsread".

**4.3 Composition of the Input Vector of the prediction models**

The structure of the input vector specifies the selected endogenous and exogenous variables. In the work of T. Gowri Monohar et al. [9], five inputs were selected from the previous day and five each from the same day of the previous weeks to predict the load of the next day. If data points were insufficient, the forecasting would be poor. If data points were useless or redundant, modeling would be difficult or even skewed [10].

The input vector (IV) structure used here consists of the active power values of the two previous hours, $L(t-1)$ and $L(t-2)$, and homologous past load values from the previous week, $L(t-168)$ and $L(t-169)$. The inclusion of these values provides information regarding the consumption trend in the past homologous periods [11]. It was found that loads 24 hours and 168 hours apart are highly correlated. The structure of the IV is given in Figure 3.

**Fig 3: Composition of the input vector (IV) (non-weather and weather sensitive model)**
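The lagged-load part of the input vector can be sketched in a few lines of Python. This is a minimal illustration, assuming the hourly load is available as a plain list. The function name `build_input_vectors` and the `lags` parameter are hypothetical. The weather inputs of the weather sensitive model are not included.

```python
def build_input_vectors(load, lags=(1, 2, 168, 169)):
    """Build (inputs, targets) from an hourly load series.

    Each input row is [L(t-1), L(t-2), L(t-168), L(t-169)] and the
    matching target is L(t). Hours without a full set of lags are skipped.
    """
    max_lag = max(lags)
    inputs, targets = [], []
    for t in range(max_lag, len(load)):
        inputs.append([load[t - lag] for lag in lags])
        targets.append(load[t])
    return inputs, targets


if __name__ == "__main__":
    # Synthetic series in which the load at hour h is simply h.
    series = [float(h) for h in range(200)]
    rows, targets = build_input_vectors(series)
    print(rows[0], targets[0])
```

Because the largest lag is 169 hours, the first 169 samples of a series can only serve as history. A data set therefore needs more than one week of hourly values before the first training pair can be formed.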


The ISO New England historical load data was first collected from their website. The load data was taken every hour for a period of one week. The training data was defined from Monday, 31st March to 6th April, 2008, and the corresponding target was defined for the period from 7th to 13th April, 2008. The models were also evaluated for generalization capability, so a testing data set was defined from the first week of March (3rd to 9th). After correlation analysis, Dry Bulb and Dew Point temperatures were used as the inputs for the Elman Weather Sensitive Model.
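As a companion to the scaling step of Section 4.1, the sketch below shows one common way to map load values into the range 0 to 1. Equation (12) is not reproduced in this text, so the min-max form used here is an assumption and may differ from the authors' formula. The function names are hypothetical.

```python
def minmax_scale(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1.

    Assumed form: (v - min) / (max - min). Returns the scaled list together
    with the minimum and maximum, which are needed to undo the scaling.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled, lo, hi


def minmax_unscale(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


if __name__ == "__main__":
    scaled, lo, hi = minmax_scale([4300.0, 4600.0, 4900.0])
    print(scaled)
    print(minmax_unscale(scaled, lo, hi))
```

Keeping the minimum and maximum from the training data allows network outputs to be converted back to load units after forecasting.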

**5. SIMULATION AND RESULTS**

To evaluate the proposed load forecast model, several neural network architectures were implemented in MATLAB version 7.10.0.499 (R2010a). The Feed Forward Network and the Elman Recurrent Network were implemented according to their default architectures as provided in MATLAB. In addition, a Jordan Recurrent Network was created using the custom network creation process. Before training, each layer of each network was individually initialized using the *initnw* function.

Each model was then trained using the *traingdx* training algorithm (gradient descent with momentum and adaptive learning rate backpropagation) with the help of the Neural Network Training tool. After that, each network was simulated and its performance was observed. Finally, the Elman Network was trained using the Particle Swarm Optimization method. The Mean Square Error (MSE) performance function was employed along with other measures such as MPE (Mean Percentage Error) and MAE (Mean Absolute Error).

(13)

(14)

(15)

where $L_{actual}(n)$ is the actual load, $L_{predicted}(n)$ is the forecasted value of the load, and $N$ is the number of data points.

**Fig 4(a): The performance goal met**

**Fig 4(b): Forecasting Result of the maximum demand**

[Plot "Load Forecasting: Day Model": actual and predicted load values, Load [MW] versus Day (1 to 7).]
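The three error measures named in this section can be sketched in Python as follows. Equations (13) to (15) are not reproduced in this text, so the forms below are the conventional definitions and should be read as assumptions. In particular, the sign convention of the MPE (actual minus predicted) and its scaling by 100 are illustrative choices, and the function names are hypothetical.

```python
def mse(actual, predicted):
    """Mean square error: average of squared differences."""
    n = len(actual)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n


def mae(actual, predicted):
    """Mean absolute error: average of absolute differences."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mpe(actual, predicted):
    """Mean percentage error (assumed form): signed relative error in percent.

    Positive and negative errors cancel, so a small MPE does not by itself
    imply a small MSE or MAE.
    """
    n = len(actual)
    return 100.0 * sum((a - p) / a for a, p in zip(actual, predicted)) / n


if __name__ == "__main__":
    actual = [100.0, 200.0, 400.0]
    predicted = [110.0, 190.0, 400.0]
    print(mse(actual, predicted), mae(actual, predicted), mpe(actual, predicted))
```

Because the MPE is signed, over-forecasts and under-forecasts offset each other. That is one reason a model can show a very small MPE while its MSE and MAE remain comparable to those of other models.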

**5.1 Daily Maximum Demand Prediction**

For maximum demand prediction, the data collected from BPDB was insufficient for training the network using *traingdx*. Therefore, the network was trained using *trainlm* (Levenberg-Marquardt backpropagation). This network performed better: it achieved a smaller MSE than the network trained with *traingdx*.

**Fig 4(c): Actual Network Error**

[Plot "Network Error": Error (×10⁻³) versus Day (1 to 7).]


**5.2 Forecasting with ISO New England Load Data**

**Fig 4(d): Output of Feedforward Network**

[Plot "168 hours ahead STLF using training data": actual and predicted load values, Load [kW] (×10⁴) versus Time [hrs].]

**Fig 4(e): Network Error (Feedforward)**

[Plot "Network Error": Error (×10⁻³) versus Time [hrs].]

**Fig 4(f): Output of Elman Network**

[Plot "168 hours ahead STLF using training data": actual and predicted load values, Load [kW] (×10⁴) versus Time [hrs].]

**Fig 4(g): Network Error (Elman)**

[Plot "Network Error": Error (×10⁻³) versus Time [hrs].]

**5.3 Particle Swarm Optimized Elman Recurrent Neural Network (PSO-ERNN)**

**Fig 4(h): The performance goal met**

[Plot: average error and error goal, Performance (×10⁻⁴) versus Epochs.]

**Fig 4(i): Output of PSO-ERNN**

[Plot "168 hours ahead STLF using training data": actual and predicted load values, Load [kW] (×10⁴) versus Time [hrs].]
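The global-best PSO loop of Section 3.1 can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB code. The fitness function `sphere` is a stand-in: in a PSO-ERNN each position would encode the network weights and biases and the fitness would be the learning error, which is not shown here. The parameter values (`w=0.7`, `c1=c2=1.5`, 20 particles, 100 iterations), the search bounds, and all function names are assumptions chosen for the example.

```python
import random


def sphere(position):
    """Stand-in fitness (assumption): sum of squares, minimized at the origin."""
    return sum(p * p for p in position)


def pso_gbest(fitness, dim, swarm_size=20, iters=100,
              w=0.7, c1=1.5, c2=1.5, bounds=(-1.0, 1.0), seed=0):
    """Global-best PSO for minimization.

    Returns (best_position, best_fitness, history), where history holds
    the global best fitness recorded after each iteration.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero velocities, personal bests y_i = x_i.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
    v = [[0.0] * dim for _ in range(swarm_size)]
    y = [xi[:] for xi in x]
    y_fit = [float("inf")] * swarm_size
    g_pos, g_fit = x[0][:], float("inf")
    history = []
    for _ in range(iters):
        for i in range(swarm_size):
            fit = fitness(x[i])
            if fit < y_fit[i]:       # update personal best
                y[i], y_fit[i] = x[i][:], fit
            if y_fit[i] < g_fit:     # update global best
                g_pos, g_fit = y[i][:], y_fit[i]
            for j in range(dim):     # velocity update, one dimension at a time
                r1, r2 = rng.random(), rng.random()
                v[i][j] = (w * v[i][j]
                           + c1 * r1 * (y[i][j] - x[i][j])
                           + c2 * r2 * (g_pos[j] - x[i][j]))
            for j in range(dim):     # position update
                x[i][j] += v[i][j]
        history.append(g_fit)
    return g_pos, g_fit, history


if __name__ == "__main__":
    best_pos, best_fit, history = pso_gbest(sphere, dim=3)
    print(best_fit)
```

The recorded `history` never increases, because the global best is replaced only when a strictly better personal best appears. A stopping rule on minimum error, as in step 6 of the learning process, could be added by breaking out of the loop once `g_fit` falls below a chosen goal.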


**Fig 4(j): Network Error (PSO-ERNN)**

[Plot "Network Error": Error versus Time [hrs].]

**Table 1: Comparison between different network performances**

| Network | Training Epoch | MSE | MAE | MPE |
|---|---|---|---|---|
| Feedforward | 8 | 2.0351e-6 | 0.0012 | 0.0011 |
| Jordan | 3 | 4.6124e-6 | 0.0018 | 0.0021 |
| Elman | 7 | 1.868e-6 | 0.0012 | 9.1269e-4 |
| Elman (WS) | 8 | 2.3895e-6 | 0.0013 | 1.8288e-8 |
| PSO-ERNN | 3 | 1.6296e-5 | 2.6042e-3 | 7.1934e-3 |

From the above table, it can be seen that PSO-ERNN is faster (3 training epochs) but more prone to error (its MSE of 1.6296e-5 is the largest in the table); however, this model could outperform the others if more data sets were used, since it can handle a large amount of data in a short amount of time. The Elman weather sensitive (WS) model achieves a much lower MPE than the simple Elman network, but it takes longer to train (8 epochs versus 7) and its MSE and MAE are slightly higher. Dry bulb and dew point temperatures had a very weak correlation with the load, so their presence did not have any significant effect on forecasting. Providing different weather parameters would likely have increased the network performance. Overall, this table shows that there is always a trade-off between the networks, and the choice depends on the amount of data, the quality of data, the time required and, most importantly, the design requirements and the designers.

**6. CONCLUSION**

Several neural network models for short-term load forecasting were studied in this work. The discussion and the comparison of model forecast accuracy indicate that the Particle Swarm Optimized Elman Recurrent Neural Network (PSO-ERNN) is the best model for 168 hours ahead load forecasting. This type of network can be very efficient in terms of predicting future loads.

Though the simulations seemed very promising, the models developed here still need to be tested on data sets from other sources, so that the reliability of these models can be verified for other load patterns.

For future work, the error in the network can be further reduced if a larger dataset is used for network training. Also, the load forecasting model can be improved by including other weather parameters like temperature, wind speed, rainfall etc.

**7. ACKNOWLEDGMENT**

We would like to thank Engr. K. M. Hassan, CSO, BPDB for providing us with necessary information.

**8. REFERENCES**

[1] Heng, E.T.H., Srinivasan, D., Liew, A.C., "Short term load forecasting using genetic algorithm and neural networks", Energy Management and Power Delivery, 1998. Proceedings of EMPD '98. 1998 International Conference on, Volume 2, 3-5 March 1998, Page(s):576 - 581 vol.2.

[2] Worawit, T., Wanchai, C., "Substation short term load forecasting using neural network with genetic algorithm", TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering; Volume 3, 28-31 Oct. 2002, Page(s):1787 - 1790 vol.3.

[3] Azzam-ul-Asar, ul Hassnain, S.R., Khan, A., "Short term load forecasting using particle swarm optimization based ANN approach", Neural Networks, 2007. IJCNN 2007. International Joint Conference on; 12-17 Aug. 2007, Page(s):1476 - 1481.

[4] Wei Sun, Ying Zou, "Short term load forecasting based on BP neural network trained by PSO", Machine Learning and Cybernetics, 2007 International Conference on; Volume 5, 19-22 Aug. 2007, Page(s):2863 - 2868.

[5] Bashir, Z.A., El-Hawary, "Short-term load forecasting using artificial neural network based on particle swarm optimization algorithm", Electrical and Computer Engineering, 2007. CCECE 2007. Canadian Conference on; 22-26 April 2007, Page(s):272 - 275.

[6] You Yong, Wang Sun'an, Sheng Wanxing, "Short-term load forecasting using artificial immune network", Power System Technology, 2002. Proceedings. PowerCon 2002. International Conference on; Volume 4, 13-17 Oct. 2002, Page(s):2322 - 2325 vol.4.

[7] Chengqun Yin, Lifeng Kang, Wei Sun, "Hybrid neural network model for short term load forecasting", Third International Conference on Natural Computation, 2007.

[8] Almeida L.B. et al., "Parameter adaptation in stochastic optimization", On-line learning in neural Networks (Ed. D. Saad), Cambridge University Press, 1998.

[9] T. Gowri Monohar and V.C. Veera Reddy, "Load forecasting by a novel technique using ANN", ARPN Journal of Engineering and Applied Sciences, VOL. 3, NO. 2, April 2008.

[10] Lendasse, J. Lee, V. Wertz, M. Verleysen, "Time Series Forecasting using CCA and Kohonen Maps - Application to Electricity Consumption", ESANN'2000 proceedings - European Symposium on Artificial Neural Networks Bruges (Belgium), 26-28 April 2000, D-Facto public., ISBN 2-930307-00-5, pp. 329-334.


[11] P.J. Santos, A.G. Martins, A.J. Pires, J.F. Martins, and R.V. Mendes, "Short Term load forecast using trend information and process reconstruction", International Journal of Energy Research, 2006; 30:811-822.

[12] Simaneka Amakali, "Development of models for short-term load forecasting using artificial neural networks", Cape Peninsula University of Technology, 2008.

[13] Ayca Kumluca Topalli, Ismet Erkme, and Ihsan Topalli, "Intelligent short-term load forecasting in Turkey", International Journal of Electrical Power & Energy Systems, Volume 28, Issue 7, September 2006, pp. 437-447.

[14] Mohsen Hayati and Yazdan Shirvany, "Artificial Neural Network Approach for Short Term Load Forecasting for Illam Region", International Journal of Electrical, Computer, and Systems Engineering, Volume 1, 2007, pp.1307-5179.

[15] T. Gowri Monohar and V.C. Veera Reddy, "Load forecasting by a novel technique using ANN", ARPN Journal of Engineering and Applied Sciences, VOL. 3, NO. 2, April 2008.

[16] George I Evers, "An automatic regrouping mechanism to deal with stagnation in particle swarm optimization", Graduate thesis for the degree of Master of Science, University of Texas-Pan American, pp. 35-40, 2009.

[17] S. Sumathi, Surekha P., "Computational Intelligence Paradigm: Theory and application using MATLAB", CRC Press, 2010.
