Data Analysis - DOC Wealth Distribution - Universiti Tunku Abdul Rahman

research depends on the type of interviewing method used in the survey and also the availability of equipment.

3.6.5 Data Cleaning

Consistency checks and treatment of missing responses are both included in data cleaning (Malhotra, 2007). The consistency checks in this phase are more thorough and extensive than the checks during editing phase because the checks now involve the use of computer. Consistency checks are used to identify data that are logically inconsistent, out-of-range, or with outliers (Malhotra, 2007). Out-of range data with values that cannot be defined by coding scheme are unacceptable and corrections must be made. Computer packages like SPSS and EXCEL are very useful in identifying and systematically checking out-of-range values for each variable. Apart from that, missing responses are the unknown values of a variable due to either the ambiguity of answers provided by respondents or improper recorded answers.

draw conclusions to the rest of the population (Independent Institute of Education, 2012).

3.7.1 Descriptive analysis

Descriptive studies or analysis is used to identify and describe the key features of the variables in a circumstance (Sekaran and Bougie, 2010). In descriptive statistics, the main focus is on the collection, summarization, presentation, and analysis of the data. The main objective of conducting descriptive analysis is to provide the researchers with a more thorough profile and relevant aspects of the area of interest from different viewpoints. Descriptive analysis includes frequency analysis, measures of central tendency, and measures of dispersion.

Quantitative data in terms of frequencies, mean, standard deviations, and coefficient of variation are vital in descriptive analysis (Sekaran and Bougie, 2010).

3.7.1.1 Frequency Analysis

Frequency analysis used to 51nalyse the frequency occurrence of an observation. The mean, mode, and median can be determined using frequency analysis. Through these data, probabilities and confident intervals can be obtained. With probability, the chances of an event happening can be known.

With confident intervals, the percentage of how true the hypothesis is can be determined (Malhotra, 2007). For research on demographic characteristics and bequest motive specifically age, gender, ethnic, marital status, education levels and health status, it is important to have frequency analysis to find out just how many percentage of the sample is from these demographics.

Frequency analysis also gives an overview of the background of the respondents. With this, more detailed information can be found and a better understanding of the respondents is achieved. With this information, it can

ensure that all the results are non-biased and fair as well as accurate and complete.

3.7.2 Inferential Analysis

Inferential statistics is the mathematics and logic of how this generalization from sample to population can be made (Gabrenya, 2003). Research often uses inferential analysis to determine if there is a relationship between an intervention and anoutcome, as well as the strength of that relationship. Data that have been collected from sample can be used to draw conclusion about the population of the research with inferential analysis.

3.7.2.1 Logistic Regression

Logistic regression opens the possibility for multivariate analysis for data that are incompatible with linear regression. Logistic regression is a common statistical technique used in sociological studies. It provides a method for modelling a binary response variable, which takes values 0 and 1. Logistic regression is useful in non-linear regression. Logistic regression is highly effective at estimating the probability that event will occur, given a set of condition.

Logistic regression offer same advantages as linear regression, including the ability to construct multivariate models and include control variable. It can perform analysis on two types of independent variable which are numeric and dummy variable just like linear regression. In addition, logistic regressions offer a new way of interpreting relationships by examining the relationships between a set of conditions and the probability of an event occurring.

Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event’s occurrence.

It does this by examining the relationship between one or more independent

variables and the log odds of the dichotomous outcome by calculating changes in the log odds of the dependent as opposed to the dependent variable itself.

There are two models of logistic regression to include binomial/binary logistic regression and multinomial logistic regression. Binary logistic regression is typically used when the dependent variable is dichotomous and the independent variables are either continuous or categorical variables. Logistic regression is best used in this condition. Binary logistic regression has been chosen in this paper on testing of the pattern of wealth distribution among the married non-Muslim in Malaysia. When the dependent variable is not dichotomous and is comprised of more than two cases, a multinomial logistic regression can be employed. Also referred to as logit regression, multinomial logistic regression has very similar results to binary logistic regression.

Logistic regression is used when the dependent variable is binary, but can be useful for other dependent variable if they are recoded to binary form. When constructing the binary dependent variable, it is important that the categories be mutually exclusive, so that a case cannot be in both categories at the same time (Stephen A.Sweet, 2003).

Assumptions of logistic regression

 Logistic regression does not assume a linear relationship between the dependent and independent variables.

 The dependent variable must be a dichotomy (2 categories).

 The independent variables need not be interval, nor normally distributed, nor linearly related, nor of equal variance within each group.

 The categories (groups) must be mutually exclusive and exhaustive; a case can only be in one group and every case must be a member of one of the groups.

 Larger samples are needed than for linear regression because maximum likelihood coefﬁcients are large sample estimates. A minimum of 50 cases per predictor is recommended.

Dalam dokumen DOC Wealth Distribution - Universiti Tunku Abdul Rahman (Halaman 50-54)