3.2 DATA COLLECTION
3.3.1 DATA MANAGEMENT
The S02 and PMIO data were subjected to internally standardised verification in keeping with the quality assurance methods, from analyser installation, data acquisition to data analysis. The following explains how collected pollution data was handled to ensure data quality and integrity.
S02 data was handled in Microsoft Excel spreadsheets, since data was down loaded and saved in electronic format as text files it was exported into Microsoft Excel, by setting delimiters to separate the data into station number, date, time, S02 concentrations measured and internal instrument temperature. This data was then checked and appended to the last data point from the previous down loads, matched to the preceding date and time. During periods of missing data, the date and time was recorded while S02 concentrations were represented as not available (N/A).
Missing data was identified resulting from instrument failure, power failure or when instrument temperature increased etc. These events were flagged as such and the resultant data censored for exclusion from data analysis.
Other data events that warranted data flagging and censoring were results from the span and zero checks and quarterly calibrations. These data points were identified as extreme outliers that would adversely influence the true atmosphericSOzconcentrations.
The raw SOz data was subjected to a series of quality control checks to achieve a cleaned and useable data set. The criteria used for the coding, adjustment, cleaning and censoring of the raw SOz data for specified periods is referred to as the technical quality control process. This process was reliant on the use of the following records:
• Extemal quarterly calibration reports compiled for each SOz analyser following calibration;
• Intemal span and zero checks;
• Records retrieved from the analyser event logs following data acquisition;
• Recorded events and actions noted in the analyser logbooks by field technicians during data acquisition, maintenance and site visits.
At each site except Femdale SOzdata was collected as 5 min averages, which was processed to 1 hour averages if half of the data for that hour were available. The 1 hour averages were processed to 24 hour averages, from noon to noon, if half of the hourly data in the period were available. Several of the monitors at the school sites experienced drift problems, probably a result of inadequate temperature control in the instrument or in the enclosure. Drift resulted in slowly varying negative or positive biases that was easily detected. Bias was corrected on a monitor-specific basis by subtracting the long term baseline, computed as a running average (typically considering a 400 h window) of low (1SI) percentile hourly concentrations in a time window typically 68 hours before and 4 hours after the current value. Minimum values were allowed to vary only slowly
«
5 ppb) otherwise a new window was utilised. This approach is reasonable since background SOz values at all sites approached zero almost every day due to strong variation in source emissions and meteorology, and since background levels were negligible. Statistical and visual checks ensured that this approach yielded reasonable and robust values. Some small«
-I ppb) negative values remain after this correction, a normal result for this measurement, even at monitors that do not experience excessive drift r29].PM10 data was received in a Microsoft Excel Spreadsheet containing a unique identification number, date, time, temperature, humidity and individual weightings of filters for each weighting session and labelled according to the sampler in which it was to be deployed.
(APPENDIX 3: Gravimetric data sheet for mass calculation).
Following the 24 hour sampling period, "field filter data" relating to a sampled filter, documenting the start and end time, date, total time sampled, average total volume of air sampled, average flow rate, average temperature, average pressure was retrieved manually from the sampling equipment. This "field filter data" information was for the filter that was sampled on the previous day of sampling. All field and laboratory data were compiled into this Microsoft Excel spreadsheet, which included the formulae for automatic calculation of the final mass concentration calculation for each filter.
Mass concentration calculations were also performed on the field and laboratory blanks to determine possible sources of contamination, inside or outside the laboratory or during deployment and retrieval of the sampled filters. Extreme negative and positive values for both control and sample filters were flagged and investigated to identify the cause, such as filter contamination or filter loss due to tears etc. These filters were compared to laboratory blanks for similar variations. The dataset was further explored to identify times when analyser failure occurred. These episodes were identified from analyser deployment records ("field filter data") and from examining the total sample time (ideally 1440 minutes) and total volume (depending on sampler type). Data which was determined to result from these episodes were flagged or censored from further data analysis.
A flag coding was established for the data set to aid In identifying when PMIO data was acceptable for use, these were:
I. The sample duration: - was described as the total time a sampler was in operation during a 24 hour sampling period.
Table 3: Flag codes for sample duration
Flag Code Description - Sample Duration Time (min) 0 Sample duration of 90% to 100% 1296 -- 1440
1 Sample duration of 80% to 90% 1152 -- 1440
2 Sample duration less than 80% < 1152
3 Not sampled 0
4 Sample duration greater than 100% > 1440
A flag code of 0 and I was used to identify acceptable sample durations, indicating that the data was of acceptable quality for further use. Data with a flag code of 2 or 4 could not be used due to reliability concerns about the data. Data with flag code 3 did not produce any results and therefore censored.
2. The volume sampled: - was described as the total volume of air passed through the filter during the 24 hour sampling period (this value varied according to the type of PM analyser).
Table 4: Flag codes for sample volume
Flag Code Description - Sample Volume 0 Volume OK - normal operation 1 Volume <5% of operating volume 2 Volume <10% of operating volume 3 Zero volume due to analyser failure 4 Volume> operating volume of sampler
A flag code of 0 and 1 was used to identify a sample volume as being acceptable and representative of a full sample volume. Data with a flag code of 2 or 4 was not used due to reliability issues of the data. Data with flag code 3 did not produce any results and therefore censored.