METHODOLOGY OF THE STUDY
3.1. Selection of Models
In the past, studies have focused on estimating climate change impacts on mean agricultural outcomes only (Guiteras 2009, Greenstone 2007, Fishman 2012, Gupta et al. 2014). Only a handful of studies estimate climate change impacts on yield variability, with just one such study for India (Kotani 2010), wherein increased climate variability is found to augment rice yield variability. Studies for other countries report similar results (Cabas et al. 2010, McCarl et al. 2008). Implicit in the mean-only approach is the idea that climate change leads to a mean shift in agricultural outcomes, with no change in the underlying relationship between agricultural outcomes and climate. This is problematic for several reasons. For instance, much of the scientific literature on climate change focuses on changes in variability as a result of an altered climate (especially in the hydrologic cycle, which determines both long- and short-run availability of water supply, a critical ingredient in agriculture); while such changes are incorporated in a "mean effect" framework, they are not restricted to it (Krishnamurthy 2012).
To estimate the effect of climate change on the mean and variance of rice yields, this study estimates the stochastic production function formulated by Just and Pope (1978, 1979), which allows the effect of inputs on mean yield to differ from that on yield variance.
The basic specification is:
$$y = f(X,\beta) + \mu = f(X,\beta) + h(X,\alpha)\,\varepsilon$$
where $y$ is a measure of output, $X$ is the input vector, $f(\cdot)$ is the production function relating $X$ to output with $\beta$ being the vector of estimable parameters, and $h(X,\alpha)$ is the risk (variance) function, such that $h^2$ is the yield variance; $\varepsilon$ is a random shock distributed with mean zero and unit variance, so that $\mu = h(X,\alpha)\,\varepsilon$ is the heteroscedastic disturbance; and $\alpha$ is the vector of estimable parameters associated with the risk function (where $\alpha > 0$ implies that yield variance increases as $X$ increases, and vice versa).
Most empirical studies estimate this model by Feasible Generalized Least Squares (FGLS), although Maximum Likelihood Estimation (MLE) is an alternative that is more efficient and unbiased in small samples (Saha et al. 1997). Given the large sample size here, FGLS was used, as described in Judge et al. (1988), to estimate a form of fixed effects panel model. The exact procedure is described below (Just and Pope 1978, Cabas et al. 2010).
The first stage regresses $y$ on $f(X,\beta)$, which gives the least squares residuals $\hat{\mu} = y - f(X,\hat{\beta})$, a consistent estimator of $\mu$. The second stage uses the least squares residuals from the first stage to estimate the marginal effects of the explanatory variables on the variance of production ($\alpha$): $\hat{\mu}^2$ is regressed on its asymptotic expectation $h^2(X,\alpha)$, with $h(\cdot)$ assumed to be an exponential function. The third and final stage uses the predicted values from the second stage as weights for generating FGLS estimates of the mean yield equation. The resulting estimator of $\beta$ in this final step is consistent and asymptotically efficient under a broad range of conditions, and the whole procedure corrects for the heteroscedastic disturbance term (Just and Pope 1978).
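The three stages can be written compactly as follows. This is only an illustrative restatement: it assumes a linear mean function $f(X,\beta) = X'\beta$ and the exponential variance form $h^2(X,\alpha) = \exp(X'\alpha)$; the text above states only that $h(\cdot)$ is exponential, so the exact functional forms and the auxiliary error term $e$ are assumptions made here.

```latex
\begin{align*}
\text{Stage 1 (OLS):}\quad
  & y = X'\beta + \mu,
  & \hat{\mu} &= y - X'\hat{\beta}_{OLS} \\
\text{Stage 2 (variance function):}\quad
  & \ln \hat{\mu}^{2} = X'\alpha + e,
  & \hat{h}^{2} &= \exp\!\left(X'\hat{\alpha}\right) \\
\text{Stage 3 (weighted least squares):}\quad
  & \frac{y}{\hat{h}} = \frac{X'}{\hat{h}}\,\beta + \frac{\mu}{\hat{h}},
  & \hat{\beta}_{FGLS} &= \text{OLS on the weighted data}
\end{align*}
```

Under these assumptions, a positive element of $\hat{\alpha}$ indicates that the corresponding input raises yield variance, and dividing through by $\hat{h}$ in the third stage gives a disturbance with approximately constant variance.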
3.2. Basic Design of FGLS Estimator
The GLS estimator requires that $\sigma_t$ be known for each observation in the sample. To make the GLS estimator feasible, we can use the sample data to obtain an estimate of $\sigma_t$ for each observation and then apply the GLS estimator using these estimates. The result is a different estimator, called the Feasible Generalized Least Squares (FGLS) estimator.
Suppose that we have the following general linear regression model.
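$$Y_t = \beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + \mu_t \quad \text{for } t = 1, 2, \ldots, n$$
$$\mathrm{Var}(\mu_t) = \sigma_t^2 = \text{some function} \quad \text{for } t = 1, 2, \ldots, n$$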
The rest of the assumptions are the same as the classical linear regression model.
Suppose that we assume that the error variance is a linear function of Xt2 and Xt3. Thus, we are assuming that the heteroscedasticity has the following structure.
$$\mathrm{Var}(\mu_t) = \sigma_t^2 = \alpha_1 + \alpha_2 X_{t2} + \alpha_3 X_{t3} \quad \text{for } t = 1, 2, \ldots, n$$
To obtain FGLS estimates of the parameters $\beta_1$, $\beta_2$, and $\beta_3$, proceed as follows.
Step 1: Regress $Y_t$ against a constant, $X_{t2}$, and $X_{t3}$ using the OLS estimator.
Step 2: Calculate the residuals from this regression, $\hat{\mu}_t$.
Step 3: Square these residuals, $\hat{\mu}_t^2$.
Step 4: Regress the squared residuals, $\hat{\mu}_t^2$, on a constant, $X_{t2}$, and $X_{t3}$, using OLS.
Step 5: Use the estimates of $\alpha_1$, $\alpha_2$, and $\alpha_3$ to calculate the predicted values $\hat{\sigma}_t^2$. This is an estimate of the error variance for each observation. Check the predicted values. For any predicted value that is non-positive, replace it with the squared residual for that observation. This ensures that the estimate of the variance is a positive number (a variance cannot be negative).
Step 6: Find the square root of the estimate of the error variance, $\hat{\sigma}_t$, for each observation.
Step 7: Calculate the weight $w_t = 1/\hat{\sigma}_t$ for each observation.
Step 8: Multiply $Y_t$, the constant, $X_{t2}$, and $X_{t3}$ for each observation by its weight.
Step 9: Regress $w_t Y_t$ on $w_t$, $w_t X_{t2}$, and $w_t X_{t3}$ using OLS.
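The nine steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the estimation code used in the study (which was run in STATA): the helper names `ols` and `fgls`, the simulated data and the parameter values are all hypothetical, and only the standard library is used so that the sketch is self-contained.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def fgls(x2, x3, y):
    """FGLS with error variance assumed linear in x2 and x3 (Steps 1 to 9)."""
    X = [[1.0, a, b] for a, b in zip(x2, x3)]
    # Steps 1-3: OLS, residuals, squared residuals.
    b_ols = ols(X, y)
    resid_sq = [(yi - sum(bj * xj for bj, xj in zip(b_ols, row))) ** 2
                for row, yi in zip(X, y)]
    # Step 4: regress squared residuals on a constant, x2 and x3.
    a_hat = ols(X, resid_sq)
    # Step 5: predicted variances; non-positive values are replaced
    # by the squared residual of the same observation.
    var_hat = [sum(aj * xj for aj, xj in zip(a_hat, row)) for row in X]
    var_hat = [v if v > 0 else e for v, e in zip(var_hat, resid_sq)]
    # Steps 6-7: weights are the inverse of the estimated standard deviation.
    w = [1.0 / v ** 0.5 for v in var_hat]
    # Steps 8-9: OLS on the weighted data (the constant is weighted too).
    Xw = [[wi * xj for xj in row] for wi, row in zip(w, X)]
    yw = [wi * yi for wi, yi in zip(w, y)]
    return ols(Xw, yw)


# Simulated heteroscedastic data (hypothetical values, for illustration only).
random.seed(0)
n = 1000
x2 = [random.uniform(1, 10) for _ in range(n)]
x3 = [random.uniform(1, 10) for _ in range(n)]
y = [2.0 + 1.5 * a - 0.8 * b
     + random.gauss(0, (2.0 + 0.3 * a + 0.2 * b) ** 0.5)
     for a, b in zip(x2, x3)]
beta_fgls = fgls(x2, x3, y)
print(beta_fgls)
```

With a correctly specified variance function, the estimated slopes should lie close to the simulated values of 1.5 and -0.8; the sketch makes no attempt to compute standard errors.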
Properties of the FGLS Estimator
If the model of heteroscedasticity that you assume is a reasonable approximation of the true heteroscedasticity, then the FGLS estimator has the following properties.
1) It is non-linear.
2) It is biased in small samples.
3) It is asymptotically more efficient than the OLS estimator.
4) Monte Carlo studies suggest it tends to yield more precise estimates than the OLS estimator.
However, if the model of heteroscedasticity that you assume is not a reasonable approximation of the true heteroscedasticity, then the FGLS estimator will yield worse estimates than the OLS estimator.
Assumptions for Panel data:
1. The data generating process is linear:
$$y_{it} = x'_{it}\beta + z'_i\gamma + \varepsilon_{it}$$
where $i = 1, 2, 3, \ldots, N$ indexes individuals or groups and $t = 1, 2, 3, \ldots, T_i$ indexes time periods; usually, $N \gg T$.
2. $E[\varepsilon_i \mid X, z] = 0$; that is, $X$ and $z$ are exogenous.
3. $\mathrm{Var}[\varepsilon_i \mid X, z] = \sigma^2 I$; heteroscedasticity can be allowed.
4. Rank(X) = full rank.
3.3. Estimation
As the dataset is a Panel dataset, the following tests were conducted prior to estimation.
Panel unit root test
If the average T is ten or higher, the data series may be subject to a serious unit root problem. However, if the data are constructed from a few other variables, the unit root problem may be eliminated. Pre-testing the data for unit roots is useful for establishing whether the series are stationary. Therefore, panel unit root tests were conducted in STATA, and all of the tests listed below were run.
The hypotheses are-
H0: Panels contain unit roots
H1: Panels are stationary
We reject the null hypothesis when p<0.001.
a) Levin-Lin-Chu test:
To test the stationarity of the series, the panel unit root test was run under the null hypothesis of a unit root. Results reported in Appendix I show that all variables accept the null of a common unit root process at level but reject it at first difference; we therefore conclude that all variables are stationary at first difference.
b) Harris-Tzavalis test:
All variables were tested for non-stationarity and found to be stationary. The results are shown in Appendix I.
c) Breitung test:
The test returns a p-value of 0.37, so the null hypothesis of a unit root cannot be rejected. The results are shown in Appendix I.
d) Im-Pesaran-Shin test:
All variables were tested for non-stationarity, and some panels were found to be stationary. The results are shown in Appendix I.
e) Augmented Dickey-Fuller and Phillips-Perron tests:
Since this study models the effects of climate variation on different rice crops, we need to confirm a zero order of integration for each variable under study. If the variables have different orders of integration, they cannot be used for correlation, causality and OLS estimation. Thus, we first need to make sure that the data series are free of unit roots, i.e. that the series are stationary, so that all results are valid and all estimates consistent (Enders, 1995). To this end, we chose two widely used methods, the Augmented Dickey-Fuller (ADF) test (Dickey and Fuller, 1979) and the Phillips-Perron (PP) test (Phillips and Perron, 1988), to check for the presence of unit roots in the data series; the outputs are presented in Table 1. Once stationarity of all variables is confirmed, we can run our comprehensive regression model.
All variables were tested for non-stationarity, and the majority of them were found to be stationary. The results are shown in Appendix I.
Table 1: Results of Unit Root Tests (Augmented Dickey-Fuller and Phillips-Perron)
Variable          Order of integration (mean yield of rice)
logMeanyield      I(1)
logTemperature    I(0)
logRainfall       I(0)
logHumidity       I(0)
logHYV            I(0)
Note: MacKinnon (1996) one-sided p-values are used; the critical values at the 1%, 5% and 10% levels are -3.605, -2.936 and -2.606, respectively.
Source: Author's own estimation based on BMD, BBS and DAE.
Testing for Cross Sectional Dependence
The Pesaran, Friedman and Frees tests were performed, and cross-sectional dependence was found in all data sets.
Testing for fixed versus random effects
The Hausman test was performed, and the random effects model was found to be appropriate (Appendix IV).
In light of the above results, panel-corrected standard error (PCSE) estimates were obtained, which correct for cross-sectional dependence, heteroscedasticity and autocorrelation. The parameters are estimated using a Prais-Winsten (or OLS) regression.
Equations have been estimated with district and year fixed effects.
The regression was run for rice, with mean yield as the dependent variable. Mean yield depends on climate and non-climate inputs. Our results, however, show that mean yields are best explained by the levels of rainfall and temperature. We therefore surmise that it is variability in climate that makes agriculture more risky.
3.4. Model Specification
The regression equation estimated for rice is:
$$\ln \text{MeanYield}_{it} = \beta_1 + \beta_2 \ln \text{Temperature}_{it} + \beta_3 \ln \text{Rainfall}_{it} + \beta_4 \ln \text{Humidity}_{it} + \beta_5 \ln \text{HYV}_{it} + \alpha_i + \delta_t + \nu_{it}$$
where $i$ refers to the district and $t$ refers to the year; $\alpha_i$ denotes district-level fixed effects; $\delta_t$ denotes year fixed effects; $\text{Temperature}_{it}$ is the average temperature; $\text{Rainfall}_{it}$ is the annual rainfall; $\text{HYV}_{it}$ is the gross cropped area of rice under HYV seeds (reported per hectare in metric tons); $\text{Humidity}_{it}$ is annual humidity; and $\nu_{it}$ is the stochastic error term, where $\nu_{it} \sim N(0,1)$.
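One common way to absorb district and year fixed effects in a balanced panel is the two-way within transformation, which subtracts district means and year means from each variable and adds back the grand mean. The sketch below illustrates that idea only; the study does not state how the fixed effects were implemented in STATA, and the function name `two_way_demean`, the district labels and the numbers are hypothetical.

```python
def two_way_demean(panel):
    """Two-way within transformation for a balanced panel.

    panel maps (district, year) -> value; the result maps the same keys to
    value - district mean - year mean + grand mean.
    """
    districts = sorted({d for d, _ in panel})
    years = sorted({t for _, t in panel})
    grand = sum(panel.values()) / len(panel)
    d_mean = {d: sum(panel[(d, t)] for t in years) / len(years)
              for d in districts}
    t_mean = {t: sum(panel[(d, t)] for d in districts) / len(districts)
              for t in years}
    return {(d, t): panel[(d, t)] - d_mean[d] - t_mean[t] + grand
            for (d, t) in panel}


# Hypothetical panel made up only of a district effect plus a year effect:
# the transformation should remove both and leave zeros.
district_effect = {"DistrictA": 1.0, "DistrictB": 2.5, "DistrictC": -0.5}
year_effect = {2011: 0.0, 2012: 0.3, 2013: -0.2}
panel = {(d, t): district_effect[d] + year_effect[t]
         for d in district_effect for t in year_effect}
demeaned = two_way_demean(panel)
print(max(abs(v) for v in demeaned.values()))
```

Applying the same transformation to the log of yield and to each log regressor, and then running OLS on the transformed data, gives the same slope estimates as including district and year dummies when the panel is balanced.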
In our specifications we do not include inputs such as irrigation and fertilizer in the regression. Irrigation is likely to reduce production risk (Foudi and Erdlenbruch 2011), though some argue otherwise (Guttormsen and Roll 2013). Fertilizer use typically increases production risk even as it increases expected output (Just and Pope 1979, Rosegrant and Roumasset 1985, Roumasset et al. 1987, Ramaswami 1992, Di Falco, Chavas, and Smale 2007). However, since the two are correlated with each other and also with HYV use, their interactive effect is unclear, and we leave this for further research.
3.4. Sources of Data
The data set used in this study is a panel data set. A panel data set has multiple entities, each of which has repeated measurements at different time periods.
Panel data may have individual (group) effects, time effects, or both, which are analyzed by fixed effect and/or random effect models. The crop under study, rice, accounts for about 75 percent of agricultural land use (and 28 percent of GDP). Paddy production increased during the decade (1987), except for fiscal year 09, but annual growth was generally modest, keeping pace with the population.
About 75% of the total crop area and more than 80% of the total irrigated land is planted to rice.
Thus, rice plays an important role in the livelihood of the people of Bangladesh. The rice fields of Bangladesh expanded somewhat during the period 2001-10. However, the area under irrigation has increased from about 5% to 73% from 1995 to 2008. At the same time, the proportion of modern varieties has increased from 12% to about 5%. Two flash-flood-resistant varieties for the submergence zones, BRRI Dhan 3 (Swarna-Sub 1) and BRRI dhan52 (BR11-Sub-1), were released. An early-maturing variety, BINA Dhan7, was also released for safe cultivation. BRRI dhan51 was developed after IRRI scientists introduced a submergence-resistance gene into a popular high-yielding Indian rice variety in 2004.
1. Agricultural Data
Data on agricultural variables span the period 2011-2018 and have been obtained from the BBS Agricultural Yearbooks for 2017, 2015, 2013 and 2011. This is a district-level database. In addition, agricultural data such as the production per hectare (in metric tons) of Aus, Aman and Boro, and of HYV Aus, HYV Aman and HYV Boro, are taken from the Department of Agricultural Extension (DAE) and the Bangladesh Bureau of Statistics (BBS).
Districts in this database follow the 1984 base. Data on districts formed after 1966 is given 'back' to the parent districts, i.e. apportioned, based on the percentage area of the parent district transferred to the new district. Hence, the final database comprises data for the parent districts only.
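The apportioning of newer districts back to their parents can be illustrated with a short Python sketch. It is an interpretation rather than the procedure actually applied to the BBS data: it assumes that, for every new district, the share of its area originating from each parent is known, and it returns that share of the new district's value to the parent. The function name, district names and figures are hypothetical.

```python
def apportion_to_parents(values, origin_share):
    """Fold the values of newer districts back into their parent districts.

    values: district -> reported value (e.g. rice output in metric tons).
    origin_share: new district -> {parent district: share of the new
        district's area that came from that parent}; shares sum to 1.
    """
    # Start from the districts that are not themselves newly formed.
    totals = {d: v for d, v in values.items() if d not in origin_share}
    for child, shares in origin_share.items():
        for parent, share in shares.items():
            totals[parent] = totals.get(parent, 0.0) + share * values.get(child, 0.0)
    return totals


# Hypothetical example: NewC was carved out of ParentA (75%) and ParentB (25%).
output_tons = {"ParentA": 1000.0, "ParentB": 500.0, "NewC": 200.0}
origin_share = {"NewC": {"ParentA": 0.75, "ParentB": 0.25}}
result = apportion_to_parents(output_tons, origin_share)
print(result)
```

The total across districts is preserved by construction, so the parent-district panel adds up to the same national figure as the original data.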
The variables of interest in this database include the output of rice (measured in metric tons), the total gross cropped area in each district (measured in hectares, and accounting for rice cultivation), and the area under HYV seeds for rice (measured in hectares, again accounting for multiple cropping).
In our study the dependent variable is the mean yield of rice (metric tons of output per hectare). The independent variable taken from this database is the gross cropped area under HYV seeds (tons) across all available varieties of rice (Aus, Aman, Boro).
2. Climate Data
Climate data has been procured from the Bangladesh Meteorological Department (BMD). The department holds 30 years of monthly climate data for the 64 districts of Bangladesh for variables such as rainfall, temperature, cloud cover, humidity and ground frost frequency, among others. The meteorological dataset was compiled from the publicly available Bangladesh Bureau of Statistics (BBS) database. It consists of monthly data on variables such as rainfall and temperature from 2011 to 2018.
The three independent variables used for this study from this dataset are rainfall, temperature and humidity. Rainfall is defined as the 12-month summation of monthly rainfall values. Temperature is the 12-month average of monthly average temperatures. Humidity is defined as the 12-month average of monthly average humidity.
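As a small illustration of these definitions, the annual climate variables for one district-year can be built from twelve monthly values as follows. The function name `annual_climate`, the dictionary layout and the monthly figures are hypothetical and are not taken from the BMD data.

```python
def annual_climate(monthly):
    """Aggregate monthly series into the annual variables defined above.

    monthly: dict with 12-element lists under the keys 'rainfall',
    'temperature' and 'humidity' for a single district and year.
    """
    for key in ("rainfall", "temperature", "humidity"):
        if len(monthly[key]) != 12:
            raise ValueError(f"{key} must contain 12 monthly values")
    return {
        "rainfall": sum(monthly["rainfall"]),            # 12-month total
        "temperature": sum(monthly["temperature"]) / 12,  # 12-month average
        "humidity": sum(monthly["humidity"]) / 12,        # 12-month average
    }


# Hypothetical monthly values for one district-year.
example = {
    "rainfall": [10.0, 20.0, 30.0, 50.0, 200.0, 400.0,
                 500.0, 450.0, 300.0, 150.0, 40.0, 10.0],
    "temperature": [18.0, 20.0, 25.0, 28.0, 29.0, 29.0,
                    29.0, 29.0, 28.0, 27.0, 23.0, 15.0],
    "humidity": [70.0] * 6 + [80.0] * 6,
}
annual = annual_climate(example)
print(annual)
```

The regression uses the natural logarithm of each of these annual values.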
All 64 districts were selected for rice; together they account for 100 percent of Bangladesh's crop production in recent years.
As previously mentioned, districts included in the BBS database are those that existed as of 1978. However, the climate dataset has been created using district boundaries as of 2002, which are very different from those of 1978. The districts that comprise the panel sample are those that existed in the BBS database, and climate variables for these districts have been approximated from the district to which the largest area of the parent district was allocated (provided that it is more than 50 percent of the total area of the parent district).
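The matching rule described above can be expressed as a short Python sketch. The function name and the area shares are hypothetical, and returning `None` when no district exceeds the threshold is an assumption, since the text does not say how such cases were handled.

```python
def assign_climate_district(allocation, threshold=0.5):
    """Pick the current district whose climate data represents a parent district.

    allocation: current district -> share of the parent district's area
        that was allocated to it.
    Returns the district with the largest share if that share exceeds the
    threshold, otherwise None (assumed behaviour, not stated in the text).
    """
    district, share = max(allocation.items(), key=lambda item: item[1])
    return district if share > threshold else None


# Hypothetical area shares for two parent districts.
clear_case = assign_climate_district({"DistrictX": 0.7, "DistrictY": 0.3})
split_case = assign_climate_district({"DistrictX": 0.4, "DistrictY": 0.35, "DistrictZ": 0.25})
print(clear_case, split_case)
```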
3.4. Data Input
As this is a desk-based study that relies on secondary data, the data were collected from different sources, mainly the Statistical Yearbooks published by the Bangladesh Bureau of Statistics. The data were entered into a Microsoft Excel sheet and checked carefully for errors. The data were drawn according to the variables required for the study, which were selected by the author and her supervisor.
3.5. Software and Statistical Packages
Analytical software STATA version 12.0 has been used for data analysis in this study along with Microsoft Excel 2007.