International Journal of Recent Advances in Engineering & Technology (IJRAET)
_______________________________________________________________________________________________
_______________________________________________________________________________________________
ISSN (Online): 2347 - 2812, Volume-4, Issue -4, 2016 93
A Stream Flow Forecasting Using Data Driven Techniques
1A. M. Moperkar, 2Seema A. Jagtap
1Civil Engineering Department, Yadavrao Tasgaonkar College of Engineering & Management, University of Mumbai, Maharashtra, India
2Civil Department, Thakur College of Engineering & Technology, Mumbai, Maharashtra, India
Abstract: Data-driven modelling is an area of hydro-informatics undergoing fast development. This paper reviews the main concepts and approaches of data-driven modelling, which is based on computational intelligence and machine-learning methods.
A stream-flow model represents the conversion of rainfall to runoff, evapotranspiration, the movement of water to and from the groundwater system, and the change in the volume of water within the catchment using a series of mathematical relationships.
The present work deals with modelling of stream-flow using the M5 model tree (MT) to forecast daily average discharge one day ahead at four stations, Paud, Rakshewadi, Khamgaon and Nighoje, in the Krishna river basin in Maharashtra state, India. The stream-flow models are developed using previous values of measured stream-flow to forecast stream-flow one day in advance. The applicability of this technique is studied by predicting the runoff (discharge) of the Krishna River. The results show that the M5 technique identifies low flows with very high accuracy, but most of the high flows were underestimated.
Index terms: Stream Flow, Model tree, rainfall-runoff
I. INTRODUCTION
Reliable estimates of the stream flow generated from a catchment are required as part of the information sets that help policy makers make informed decisions on water planning and management.
The need to analyze and predict hydrologic variables such as stream-flow and precipitation has been recognized for centuries. Stream-flow forecasting, and the improvement of existing forecasting techniques, is vital to hydrology and many associated areas. Once stream-flow generation has been estimated, the required reservoir capacity can be calculated.
A rainfall-runoff model is a mathematical model describing the rainfall-runoff relations of a catchment area, drainage basin or watershed. More precisely, it produces the surface runoff hydrograph as a response to a rainfall hydrograph given as input. In other words, the model calculates the conversion of rainfall into runoff. A rainfall-runoff model is therefore helpful for calculating the discharge from a basin. The transformation of rainfall into runoff over a catchment is known to be a very complex hydrological phenomenon, as this process is highly nonlinear, time-varying and spatially distributed. Over the years researchers have developed many models to simulate this process. Based on the problem statement and on the complexities involved, these models are categorized as empirical, black-box, conceptual or physically-based distributed models.
Physically based distributed models are very complex, require large amounts of data and are tedious to apply. Conceptual models attempt to represent the known physical processes occurring in the rainfall-runoff transformation in a simplified manner by way of linear or nonlinear mathematical formulations, but their implementation and calibration are complicated and time consuming. Black-box models, in contrast, establish a relationship between the input and output functions without considering the complex physical laws governing a natural process such as the rainfall-runoff transformation. The unit hydrograph, which is a linear rainfall-runoff model, is one well-known example of such a relationship. However, these simpler models normally fail to represent the nonlinear dynamics inherent in the rainfall-runoff transformation, which can be captured by Artificial Neural Networks and fuzzy logic.
In the present work the predictions were carried out using the relatively advanced technique of the model tree, a piecewise linear model that lies between linear models such as ARIMA and truly nonlinear models such as ANN. A model tree is a linear regression method based on the assumption of linear dependencies between input and output. The work deals with modelling of stream flow using the M5 model tree to forecast daily average discharge one day in advance at four stations, Paud, Rakshewadi, Khamgaon and Nighoje, in the Krishna river basin of India. Daily stream flow data at these four stations from 1999 to 2012 are used to develop the model after division into training (calibration) and testing subsets. The data were obtained from the Maharashtra Engineering Research Institute, Nashik, Maharashtra, India.
II. MODEL TREE
A model tree is a regression tree in which linear regression functions, rather than constant values, serve as the predictions. The computational process is represented by a tree structure consisting of a root node (decision box) branching out to numerous other nodes and leaves.
A model tree is locally linear but non-linear overall. A tree in which each leaf contains a linear regression model, used to predict the target variable at that leaf, is called a model tree.
The M5 model tree algorithm was originally developed by Quinlan (1992). A detailed description of this technique is beyond the scope of this paper and can be found in Witten and Frank (1999). A brief description of the technique follows.
The M5 algorithm constructs a regression tree by recursively splitting the instance space using tests on a single attribute that maximally reduce the variance in the target variable. Figure 1 illustrates this concept. The formula to compute the standard deviation reduction (SDR) is (Quinlan, 1994):
$$SDR = sd(T) - \sum_{i} \frac{|T_i|}{|T|} \, sd(T_i)$$
where T represents the set of examples that reaches the node, T_i represents the subset of examples that have the i-th outcome of the potential test, and sd represents the standard deviation.
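The SDR criterion can be illustrated with a short Python sketch. This is not the Weka implementation used in this work; the function names (`sd`, `sdr`, `best_split`), the use of the population standard deviation, and the restriction to binary splits on one numeric attribute are assumptions made for illustration.

```python
import math

def sd(values):
    """Population standard deviation of a list of numbers (0.0 if empty)."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)

def sdr(parent, subsets):
    """Standard deviation reduction obtained by splitting `parent` into `subsets`."""
    total = len(parent)
    weighted = sum(len(s) / total * sd(s) for s in subsets)
    return sd(parent) - weighted

def best_split(x, y):
    """Return (threshold, sdr) of the best binary test x <= threshold.

    x holds the values of one attribute, y the target values.
    Returns (None, 0.0) when no split reduces the standard deviation.
    """
    best = (None, 0.0)
    for threshold in sorted(set(x))[:-1]:  # the largest value cannot split
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        gain = sdr(y, [left, right])
        if gain > best[1]:
            best = (threshold, gain)
    return best
```

For example, with attribute values `[1, 2, 3, 10, 11, 12]` and targets `[1, 1, 1, 9, 9, 9]`, the test `x <= 3` separates the two groups perfectly, so the weighted standard deviation of the subsets is zero and the SDR equals the standard deviation of the whole set.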
After the tree has been grown, a multiple linear regression model is built for every inner node using the data associated with that node and all the attributes that participate in tests in the subtree below that node. Every subtree is then considered for pruning to overcome the overfitting problem. Pruning occurs if the estimated error of the linear model at the root of a subtree is smaller than or equal to the expected error of the subtree. Finally, a smoothing process is employed to compensate for the sharp discontinuities between adjacent linear models at the leaves of the pruned tree.
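The pruning test and the smoothing step can be sketched as follows. The pruning rule is the one stated above. The smoothing formula p' = (n p + k q) / (n + k) is the one commonly given in the M5 literature and is not stated in this paper, so it is an assumption here, as are the function names and the value k = 15.

```python
def should_prune(model_error, subtree_error):
    """Prune when the linear model at the subtree root is at least as good
    as the subtree itself (estimated error <= expected subtree error)."""
    return model_error <= subtree_error

def smooth_prediction(leaf_prediction, path, k=15.0):
    """Smooth a leaf prediction on its way up to the root.

    path: list of (n_below, node_prediction) pairs, ordered from the
          leaf's parent up to the root, where n_below is the number of
          training examples reaching the node below and node_prediction
          is the value predicted by the linear model at this node.
    k:    smoothing constant (assumed value, not taken from the paper).
    """
    p = leaf_prediction
    for n_below, q in path:
        p = (n_below * p + k * q) / (n_below + k)
    return p
```

With few training examples below a node, the prediction is pulled towards the model of the node above; with many examples it stays close to the leaf model, which softens the jumps between adjacent leaves.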
III. AIMS & OBJECTIVE
The present work aims at forecasting stream flow one day ahead at four stations, Paud, Rakshewadi, Khamgaon and Nighoje, in Maharashtra, India, using previous values of stream flow and rainfall and the M5 model tree technique.
The Krishna River basin is India's fourth-largest river basin and covers 258,948 km2 of southern India. The Krishna River originates in the Western Ghats at an elevation of about 1337 m just north of Mahabaleshwar in Maharashtra, India, about 64 km from the Arabian Sea. It flows for about 1400 km and falls into the Bay of Bengal, traversing three states: Karnataka (113,271 km2), Andhra Pradesh (76,252 km2) and Maharashtra (69,425 km2). The basin is divided into twelve sub-basins for hydrological studies, namely Upper Krishna, Middle Krishna, Ghataprabha, Malaprabha, Upper Bhima, Lower Bhima, Lower Krishna, Tungabhadra, Venavati, Musi, Pallaru and Munnaru. The data were collected by the Surface and Ground Water Hydrology Department (M.E.R.I.) through the Hydro-project, Nashik.
Krishna River Brief Description of the Basin
The Krishna is the second largest eastward draining interstate river in Peninsular India. It rises in the Mahadev range of the Western Ghats at an altitude of 1,337 m near Mahabaleshwar in Maharashtra State, about 64 km from the Arabian Sea. It flows for a distance of 305 km in Maharashtra, 483 km in Karnataka and 612 km in Andhra Pradesh before finally falling into the Bay of Bengal. The length of the river is about 1,400 km. The Krishna basin lies between latitudes 13º 07’ N and 19º 20’ N and longitudes 73º 22’ E and 81º 10’ E. On the north, the basin is bounded by the range separating it from the Godavari basin, on the south and east by the Eastern Ghats and on the west by the Western Ghats. The drainage area of the basin is 258,948 km2.
IV. MODEL DEVELOPMENT
The freely available software Weka (Waikato Environment for Knowledge Analysis) was used to obtain the regression relations from the model tree. Weka is free software available under the GNU General Public License (source: http://www.cs.waikato.ac.nz/ml/weka/). Weka supports several standard data mining tasks, more specifically data preprocessing, classification, clustering, regression and feature selection. For all four locations, models were developed using 70% of the available data for training and 30% for testing. Trials were carried out by varying the number of inputs. Initially, trials used only previous values of daily average runoff at the respective station to predict the next day's runoff at that station. First, two inputs, Qt-1 (discharge for the previous day) and Qt (discharge for the day), were chosen and Qt+1 (discharge for the next day) was the
output. Next, Qt-2 (discharge two days previous), Qt-1 and Qt were the inputs and Qt+1 was the output.
Finally, Qt-3 (discharge three days previous), Qt-2, Qt-1 and Qt were the inputs and Qt+1 was the output. Thus three models were developed for each monsoon month at each station using previous values of discharge alone.
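The construction of the discharge-only inputs and the 70%/30% division can be sketched in Python as below. This is a minimal sketch, not the Weka workflow used in this work. The function names are hypothetical, and the split is assumed to be chronological (first 70% of samples for training), which the paper does not state.

```python
def make_lagged_samples(discharge, max_lag):
    """Build (inputs, targets) from a daily discharge series.

    Each input row is [Q(t-max_lag), ..., Q(t-1), Q(t)] and the
    matching target is Q(t+1). max_lag=1 gives the inputs Qt-1, Qt.
    """
    inputs, targets = [], []
    for t in range(max_lag, len(discharge) - 1):
        inputs.append(discharge[t - max_lag : t + 1])
        targets.append(discharge[t + 1])
    return inputs, targets

def split_train_test(inputs, targets, train_percent=70):
    """Split samples in time order: the first train_percent % for
    training, the rest for testing (chronological order is assumed)."""
    cut = len(inputs) * train_percent // 100
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])
```

Calling `make_lagged_samples(series, 1)`, `make_lagged_samples(series, 2)` and `make_lagged_samples(series, 3)` yields the input sets of the three discharge-only models described above.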
Rainfall data were then added, i.e. in addition to runoff, rainfall was used for training and testing the models for runoff prediction. The first such model used Qt-1, Qt and Rt (rainfall for the day) as inputs and Qt+1 as output. The next model used Qt-1, Qt, Rt-1 (rainfall for the previous day) and Rt as inputs and Qt+1 as output. The next model used Qt-2, Qt-1, Qt and Rt as inputs and Qt+1 as output.
The next model used Qt-2, Qt-1, Qt, Rt-1 and Rt as inputs and Qt+1 as output. The next model used Qt-3, Qt-2, Qt-1, Qt and Rt as inputs and Qt+1 as output. The last model used Qt-3, Qt-2, Qt-1, Qt, Rt-1 and Rt as inputs and Qt+1 as output.
Thus in total 9 models (3 with previous values of discharge and 6 with previous values of both discharge and rainfall) were developed for each month at each station, making a total of 180 MT models (9 models x 5 monsoon months x 4 stations). From these trials the best model was selected for each month at each station on the basis of statistical measures, namely the root mean square error and the coefficient of correlation, as well as by plotting hydrographs and scatter plots.
The following table is a sample of the nine different models for one monsoon month at one station; the same nine models are used for all four stations.
Sr. No.  Station name  Model name  Input                             Output
1        Paud          PJUN1       Qt-1, Qt                          Qt+1
2        Paud          PJUN2       Qt-2, Qt-1, Qt                    Qt+1
3        Paud          PJUN3       Qt-3, Qt-2, Qt-1, Qt              Qt+1
4        Paud          PJUN4       Qt-1, Qt, Rt                      Qt+1
5        Paud          PJUN5       Qt-1, Qt, Rt-1, Rt                Qt+1
6        Paud          PJUN6       Qt-2, Qt-1, Qt, Rt                Qt+1
7        Paud          PJUN7       Qt-2, Qt-1, Qt, Rt-1, Rt          Qt+1
8        Paud          PJUN8       Qt-3, Qt-2, Qt-1, Qt, Rt          Qt+1
9        Paud          PJUN9       Qt-3, Qt-2, Qt-1, Qt, Rt-1, Rt    Qt+1
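The nine input combinations and the total of 180 models can be enumerated with a short Python sketch. The naming pattern (station initial, month abbreviation, model number) is inferred from the sample names PJUN1 to PJUN9; the month abbreviations other than JUN and the function names are assumptions made for illustration.

```python
STATIONS = ["Paud", "Rakshewadi", "Khamgaon", "Nighoje"]
MONTHS = ["JUN", "JUL", "AUG", "SEP", "OCT"]  # abbreviations other than JUN are assumed

def discharge_lags(max_lag):
    """Discharge inputs Qt-max_lag, ..., Qt-1, Qt."""
    return [f"Qt-{k}" for k in range(max_lag, 0, -1)] + ["Qt"]

def input_sets():
    """The nine input combinations, in the order of the sample table."""
    sets = [discharge_lags(max_lag) for max_lag in (1, 2, 3)]  # models 1-3
    for max_lag in (1, 2, 3):                                  # models 4-9
        q = discharge_lags(max_lag)
        sets.append(q + ["Rt"])
        sets.append(q + ["Rt-1", "Rt"])
    return sets

def model_catalog():
    """Map each model name (e.g. PJUN1) to its list of inputs."""
    catalog = {}
    for station in STATIONS:
        for month in MONTHS:
            for number, inputs in enumerate(input_sets(), start=1):
                catalog[f"{station[0]}{month}{number}"] = inputs
    return catalog
```

The catalog holds 4 x 5 x 9 = 180 entries, each with Qt+1 as the output.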
Monthly Average Discharge Values for Paud, Rakshewadi, Khamgaon, and Nighoje Stations
Daily Average Discharge
Table 3: Correlation Coefficient between Observed and Predicted Discharge Values for All the Models Developed Using Model Trees
Sr. No.  Model    Linear Models  RMSE     Correlation Coefficient 'r'
01       P062001  1              107.53   0.0442
02       P072005  3              78.02    0.4108
03       P082008  5              73.84    0.7825
04       P092002  4              6.52     0.7497
05       P102009  5              4.63     0.6395
06       R062001  6              29.044   0.5138
07       R072006  1              221.08   0.7127
08       R082003  1              59.86    0.5925
09       R092006  2              45.48    0.8454
10       R102005  3              26.05    0.9671
11       K062001  3              70.10    0.6952
12       K072008  1              61.71    0.3768
13       K082007  1              144.49   0.6454
14       K091999  1              22.19    0.4172
15       K102003  7              8.56     0.9209
16       N062004  6              27.417   0.4717
17       N072001  2              38.71    0.5114
18       N082000  1              43.74    0.7572
19       N092004  3              10.06    0.6344
20       N102005  1              4.81     0.9413
V. RESULT AND DISCUSSION
The nine models developed above for each monsoon month (June, July, August, September and October) at each station were tested for their performance using statistical parameters and by plotting hydrographs and scatter plots.
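The two statistical measures reported in Table 3, the root mean square error and the correlation coefficient r, can be computed as in the sketch below. The standard textbook definitions are assumed, since the paper does not give the formulas, and the function names are illustrative.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted discharge."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def correlation(observed, predicted):
    """Pearson correlation coefficient r between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)
```

The two measures are complementary: predictions that follow the observed pattern but are consistently too low or too high can have r close to 1 while the RMSE remains large, which is why both are inspected together with the hydrographs and scatter plots.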
VI. CONCLUSION
The rainfall-runoff transformation is an important phenomenon of great practical use to mankind. A large number of models have previously been developed and applied to predict this process. These models are broadly classified into empirical, conceptual and physically based models.
The model was initially developed using previous discharge values. Rainfall was then added as an input. Including rainfall improves the results obtained from the model. After various trials it was
found that the testing accuracy did not increase with more than five inputs, counting both precipitation and runoff.
In practice, the work can help to predict runoff within a few seconds.