Data collection is the most important process in a study. This is because accurate data collection, analysis and processing are needed to answer the research questions or meet the objectives of a study. The data collected or observed should correspond to the purpose of the study. There are two types of data that can be used in a study, namely primary and secondary data.
8.2.2 What is not Secondary Data?
Any data that a researcher has obtained first-hand are NOT secondary data; they are considered primary data. Examples of primary data sources are:
Direct observation
In situ readings
Questionnaires and surveys
Interviews
Laboratory experiments
8.2.3 Sources of Secondary Data
Official Statistics: Official statistics are statistics collected by governments, and this information is readily available in annual statistical reports. For example, the Department of Statistics Malaysia conducts censuses and surveys on various activities, including the environment. The Department of Environment also publishes an annual report on the quality of the environment on its website.
Technical reports from completed or on-going research projects.
Scientific journals are a good source of secondary information, as their articles usually undergo peer review and are first-hand reports of original findings.
Review articles assemble and summarize the related publications on a specific topic. Reviews are usually written by experts in the relevant field. A review article attempts to give an overview of the latest developments and lists all the relevant publications from which the information is derived.
Books (of course).
International organizations: World Health Organization (WHO), Food and Agriculture Organization (FAO), United Nations Environment Programme (UNEP) and others.
8.2.4 Where to Begin?
The internet (look for reliable/valid websites)
The library
The reference list at the end of a journal article, report, book, etc.
Various governmental agencies
Chapter 8: Secondary Data Sources for Research
8.3 QUALITY CONTROL
Quality control is the key to judging the quality of secondary data. Questions to ask:
Is the source reliable?
Does it include a method section, and is the method sound?
When was the source published, and is that date consistent with the information reported (sometimes data for the year 2011 can appear in a 2008 report)? Is it up to date?
Is it primary or secondary data?
Is it well referenced?
Does it make sense?
8.4 WHY DO WE USE SECONDARY DATA?
Most of the time, secondary data (as in a Literature Review) are used to give us a better understanding of the topic that we are researching.
They can also be valuable in generating hypotheses and identifying areas of interest.
They help in planning primary data collection, to ensure that the data collected are comparable with the secondary data.
The analysis of secondary data also helps in identifying the possible root of a problem.
8.5 ADVANTAGES OF USING SECONDARY DATA
More readily available (can be obtained from public sources): secondary data are readily available either online or in printed form. Some departments or agencies periodically upload data onto their websites.
Cost-saving: it is less expensive.
Provides a basic idea for designing the new study.
Serves as a starting point in formulating the research problem, research hypothesis or research methods.
Data collected by government and commercial research institutions are probably more reliable.
Time-saving. The study does not need to collect information that is already known.
Helps decide whether a study should be carried out.
Helps shape the various hypotheses.
There is no hassle of data collection.
It is not time-consuming.
It may allow the researcher to cover a wider spatial or temporal range.
8.6 DISADVANTAGES OF USING SECONDARY DATA
Information may be outdated or obsolete. Old data will often be disputed if they are used as the main data in a study.
Concept definitions may differ from those of other studies. Data acquired in the past were collected to answer the questions of that particular period, and may not be able to meet the objectives of the current study.
Units of measurement may differ.
It is difficult to ascertain the previous research design.
The data may be incomplete or inaccurate (some researchers may be biased during data collection).
The data may have been converted, so that the secondary data do not follow the format required by the researcher.
The researcher cannot decide what is collected.
One can only hope that the data are of good quality.
Incompatibilities.
Limited access.
Usually, researchers use data observed in the field as a control or benchmark to check and ensure the quality of the secondary data.
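Two of the points above, differing units of measurement and benchmarking against field data, can be illustrated with a short sketch. The temperature readings, the choice of Fahrenheit and Celsius, and the helper names `fahrenheit_to_celsius` and `mean_absolute_difference` are all hypothetical, chosen only for this example; a real study would use its own variables and an agreed comparison measure.

```python
from statistics import mean

def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0

def mean_absolute_difference(field, secondary):
    """Average absolute gap between paired field and secondary readings."""
    if len(field) != len(secondary):
        raise ValueError("field and secondary readings must be paired")
    return mean(abs(f - s) for f, s in zip(field, secondary))

# Hypothetical readings: the secondary source reports degrees Fahrenheit,
# while the field observations were taken in degrees Celsius.
secondary_f = [86.0, 89.6, 84.2]
field_c = [30.5, 31.5, 29.0]

# Bring the secondary data into the same unit before comparing.
secondary_c = [fahrenheit_to_celsius(t) for t in secondary_f]
gap = mean_absolute_difference(field_c, secondary_c)
print(f"mean absolute difference: {gap:.2f} degrees Celsius")
```

A small gap suggests the secondary data agree with what was observed in the field; how small is acceptable is a decision for the researcher.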
Normal Distribution: The normal distribution is a descriptive model of how a set of data representing a situation, phenomenon or experiment is distributed. A normal distribution curve is bell-shaped. The shape and position of the curve depend on two parameters, the mean and the standard deviation. The mean sets the position of the centre of the curve, and the larger the standard deviation, the more dispersed or spread out the distribution is. It is therefore important to identify the distribution of a set of data before analyzing it further with statistical calculations.
[Figure: normal distribution curves with (a) the same mean but different standard deviations, (b) different means but the same standard deviation, and (c) different means and different standard deviations.]
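The effect of the two parameters can also be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the means and standard deviations are illustrative values assumed for this example, not taken from any data set.

```python
from statistics import NormalDist

# Illustrative parameters (assumed for this sketch only).
narrow = NormalDist(mu=0.0, sigma=1.0)
wide = NormalDist(mu=0.0, sigma=2.0)     # same mean, larger standard deviation
shifted = NormalDist(mu=3.0, sigma=1.0)  # different mean, same standard deviation

def peak_height(dist):
    """Height of the bell curve at its centre, which is the mean."""
    return dist.pdf(dist.mean)

for name, dist in [("narrow", narrow), ("wide", wide), ("shifted", shifted)]:
    print(f"{name}: centre={dist.mean}, sd={dist.stdev}, peak={peak_height(dist):.4f}")
```

Changing the mean moves the curve along the axis without changing its height, while a larger standard deviation gives a lower and wider curve.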
The total area under the bell-shaped normal distribution curve is 100%, or a probability of 1. The area under part of the curve gives the percentage or probability used to judge significance when making a decision on a hypothesis about the situation, phenomenon or experiment being studied.
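As a rough illustration of reading an area as a probability, the sketch below computes the area under a standard normal curve (mean 0, standard deviation 1, assumed here for simplicity) between two points. The helper name `area_between` is hypothetical; it relies on the cumulative distribution function provided by `statistics.NormalDist`.

```python
from statistics import NormalDist

def area_between(dist, low, high):
    """Area under the normal curve between low and high (a probability)."""
    return dist.cdf(high) - dist.cdf(low)

# Standard normal distribution, assumed for illustration.
standard = NormalDist(mu=0.0, sigma=1.0)

within_1_sd = area_between(standard, -1.0, 1.0)
within_2_sd = area_between(standard, -2.0, 2.0)
print(f"within 1 standard deviation of the mean: {within_1_sd:.4f}")
print(f"within 2 standard deviations of the mean: {within_2_sd:.4f}")
```

Roughly 68% of the area lies within one standard deviation of the mean and roughly 95% within two, which is why values far out in the tails are treated as unusual.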
A normal distribution table (found in any statistics textbook) is used as a reference to find the critical value for a chosen significance level alpha (α).
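The same lookup can be done in software instead of a printed table. The sketch below is one possible way, assuming a standard normal distribution and an illustrative significance level of 0.05; the helper name `critical_z` and its `two_tailed` parameter are hypothetical names introduced for this example.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Critical z value of the standard normal distribution for a given alpha.

    For a two-tailed test, alpha is split equally between the two tails.
    """
    tail = alpha / 2.0 if two_tailed else alpha
    return NormalDist().inv_cdf(1.0 - tail)

# Illustrative significance level (an assumption, not from the text).
alpha = 0.05
z_two = critical_z(alpha)
z_one = critical_z(alpha, two_tailed=False)
print(f"two-tailed critical value: {z_two:.3f}")
print(f"one-tailed critical value: {z_one:.3f}")
```

These are the values a normal distribution table would give for the same alpha, about 1.96 for a two-tailed test and about 1.645 for a one-tailed test.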