
Regression analysis: results and discussion

From the document: University of Cape Town (pages 83-88)

In this section I statistically analyse a number of socio-demographic and environmental factors that can affect variability in toddlers’ vocabulary production. Because the dependent variable (percentage of words produced) is continuous, I perform an ordinary least squares (OLS) multiple linear regression (MLR) analysis, which estimates the unknown parameters of the MLR model by minimising the sum of squared differences between the observed responses in the data set and the responses predicted by the linear approximation of the data. The results presented are for the independent variables that remain after the exclusion restrictions (F-test) have been applied. As explained in Chapter 3, the F-test is run to determine whether the variables ‘Ear problems’, ‘Number of adults in the home’, ‘Number of children in the home’, and ‘Household income’ have a jointly significant effect on vocabulary production, which is not the case. The model is thus specified as follows:

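$$
\begin{aligned}
\text{Percentage\_words\_produced} ={}& \beta_{1}\,\text{Child\_Age\_Mths} + \beta_{2}\,\text{Gender} + \beta_{3}\,\text{not\_completed\_sec} \\
&+ \beta_{4}\,\text{completed\_sec} + \beta_{5}\,\text{tertiary} + \beta_{6}\,\text{Sibling\_caregiver} + \beta_{7}\,\text{Twin} + \beta_{8}\,\text{First\_born} \\
&+ \beta_{9}\,\text{Creche\_attendance} + \beta_{10}\,\text{Number\_of\_caregivers} + \beta_{11}\,\text{Number\_of\_siblings} + \beta_{12}\,\text{Agegender}
\end{aligned}
$$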

Table 4-1 provides the regression coefficients and significance levels for the model.
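To make the estimation step concrete, the following is a minimal sketch of OLS with an age-gender interaction term, using only the Python standard library. It is not the procedure used for this chapter (the output in Table 4-1 comes from STATA). The function name `fit_ols`, the coefficient values in `TRUE_BETA` and the noise-free synthetic data are all assumptions made for illustration, and an intercept is included because the regression output reports a constant.

```python
def fit_ols(rows, y):
    """Estimate OLS coefficients (intercept first) by solving the
    normal equations (X'X) b = X'y with Gaussian elimination."""
    X = [[1.0, *row] for row in rows]          # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Forward elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (A[i][k] - tail) / A[i][i]
    return beta


# Hypothetical coefficients: constant, age, female, age x female.
TRUE_BETA = [-0.3, 0.025, 0.5, -0.027]

# Synthetic, noise-free data: one boy and one girl at each age from 17 to 30 months.
rows, y = [], []
for age in range(17, 31):
    for female in (0, 1):
        row = (age, female, age * female)      # last entry is the interaction term
        rows.append(row)
        y.append(TRUE_BETA[0] + sum(b * x for b, x in zip(TRUE_BETA[1:], row)))

beta_hat = fit_ols(rows, y)
print([round(b, 4) for b in beta_hat])
```

Because the synthetic responses contain no noise, the fitted coefficients should reproduce `TRUE_BETA` up to rounding error; with real data the same routine returns the least-squares estimates.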


Table 4-1 Multiple linear regression output (from STATA)

| Variable | Coefficient | Standard error |
|---|---|---|
| Child age (months) | 0.0251** | (0.00808) |
| Gender | 0.541* | (0.256) |
| Mother education: not completed secondary school | 0.167** | (0.0684) |
| Mother education: completed secondary school | 0.123 | (0.0787) |
| Mother education: completed higher education | 0.253* | (0.124) |
| Sibling secondary caregiver | -0.186*** | (0.0520) |
| Twin | -0.0771 | (0.0434) |
| First born | -0.286*** | (0.0619) |
| Crèche attendance | 0.368*** | (0.0826) |
| Number of secondary caregivers | 0.113*** | (0.0258) |
| Number of siblings | -0.0505** | (0.0207) |
| Age-gender | -0.0268* | (0.0114) |
| Constant | -0.316 | (0.188) |
| Observations | 20 | |
| R-squared | 0.964 | |

Standard errors in parentheses

*** p<0.01, ** p<0.05, * p<0.1
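The significance stars can be cross-checked from the reported coefficients and standard errors. The sketch below computes each t-statistic (coefficient divided by standard error) and compares it with two-sided critical values of the t-distribution. It assumes conventional (non-robust) standard errors and 7 degrees of freedom, that is, 20 observations minus 13 estimated parameters; neither assumption is stated in the STATA output, and the names `TABLE`, `CRITICAL` and `stars` are illustrative.

```python
# (coefficient, standard error, stars as reported) copied from Table 4-1.
TABLE = {
    "Child age (months)": (0.0251, 0.00808, "**"),
    "Gender": (0.541, 0.256, "*"),
    "Mother: not completed secondary": (0.167, 0.0684, "**"),
    "Mother: completed secondary": (0.123, 0.0787, ""),
    "Mother: completed higher education": (0.253, 0.124, "*"),
    "Sibling secondary caregiver": (-0.186, 0.0520, "***"),
    "Twin": (-0.0771, 0.0434, ""),
    "First born": (-0.286, 0.0619, "***"),
    "Creche attendance": (0.368, 0.0826, "***"),
    "Number of secondary caregivers": (0.113, 0.0258, "***"),
    "Number of siblings": (-0.0505, 0.0207, "**"),
    "Age-gender": (-0.0268, 0.0114, "*"),
    "Constant": (-0.316, 0.188, ""),
}

# Two-sided critical t-values for 7 degrees of freedom (assumed: 20 - 13),
# at the 1%, 5% and 10% levels respectively.
CRITICAL = [(3.499, "***"), (2.365, "**"), (1.895, "*")]


def stars(coef, se):
    """Return the significance stars implied by the t-statistic coef / se."""
    t = abs(coef / se)
    for cutoff, mark in CRITICAL:
        if t > cutoff:
            return mark
    return ""


for name, (coef, se, reported) in TABLE.items():
    print(f"{name:<36} t = {coef / se:6.2f}  implied: {stars(coef, se):<3}  reported: {reported}")
```

Under these assumptions the implied stars agree with the reported ones for every row, including the age-gender term, whose t-statistic of about -2.35 falls just short of the 5% cutoff.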

Encouragingly, a large number of the variables are statistically significant, indicating that, despite the small sample size, conclusions can be drawn about the majority of the variables. Additionally, as indicated by the coefficient of determination (R-squared), 96.4% of the variability in the percentage of words produced by toddlers is explained by this model, indicating a very good fit. It must be cautioned, however, that the results in this chapter apply to a rural context only: whilst for many monolingual isiXhosa-speaking children the realities may be similar with regard to living arrangements and SES, an urban context can be markedly different in many ways. Thus what may be true for a rural child cannot automatically be assumed to be true for an urban child.

Based on the regression results in Table 4-1, I start my discussion by focussing on the variable for age, followed by gender, and then the interaction between age and gender. Perhaps surprisingly, given the variability in the scatter plot analysis, the coefficient on age is statistically significant at the 5% level. It indicates that increasing age by one month is associated with a 2.5 percentage point increase in vocabulary production on average (see the coefficient of 0.0251 in Table 4-1). This is a relatively small effect, however, considering that, holding all else constant, being female is associated with producing 54.1 percentage points more words on average.

Gender is significant at the 10% level (p<0.1), which for small sample sizes is considered pertinent because estimates from small samples are less precise (Wooldridge, 2003). However, this coefficient gives the effect of gender when age is zero, which is an uninteresting and implausible scenario. In order to capture the effect of gender when a child is older than zero, the coefficient on the age-gender interaction term needs to be considered. The statistically significant coefficient on the age-gender interaction term indicates that there is a significant effect of gender on vocabulary production, which differs with age. What this means is that with every month that a female child becomes older, the gender effect is reduced by 2.68 percentage points. Therefore, when a female is 17 months old (1;5, the youngest of the cohort), the effect of being female, holding all else constant, is only an 8.54 percentage point (54.1 - 45.56)[51] increase in percentage words produced on average. Whilst this is a bigger effect of gender than that which is found by Fenson et al. (1994), it is not out of line with findings from Simonsen et al. (2013) (see Section 2.3.3), and it is nonetheless one of the relatively smaller effects observed when considering the group of variables as a whole.

[51] 17 × 2.68 = 45.56
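The age-dependent gender effect follows directly from the two coefficients in Table 4-1: the estimated effect of being female at a given age is the gender coefficient plus the interaction coefficient multiplied by age in months. A minimal sketch of that arithmetic, assuming (as the percentage-point interpretation in the text implies) that the dependent variable is measured as a proportion; the names `BETA_GENDER`, `BETA_AGE_GENDER` and `gender_effect` are illustrative:

```python
BETA_GENDER = 0.541        # coefficient on Gender (Table 4-1)
BETA_AGE_GENDER = -0.0268  # coefficient on the age-gender interaction (Table 4-1)


def gender_effect(age_months):
    """Estimated effect of being female at a given age, as a proportion
    of words produced, holding all other variables constant."""
    return BETA_GENDER + BETA_AGE_GENDER * age_months


effect_17 = gender_effect(17)  # youngest child in the cohort (1;5)
print(f"{100 * effect_17:.2f} percentage points")  # prints "8.54 percentage points"
```

The same function can be evaluated at any other age in the cohort to see how the estimated gender effect changes as children get older.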

Dummy variables for mother’s education level use ‘completed primary school’ as the base category, since no mother was reported to have been in the lowest category, ‘not completed primary school’.

Holding all else constant, a child with a mother who has not completed secondary school (it is implied that she has started secondary school, else she would fall into the category below) will produce 16.7 percentage points more words on average than a child whose mother has only completed primary school (statistically significant at p<0.05). Having a mother who has completed secondary school is associated with 12.3 percentage points more words produced by a child on average, compared to a child with a mother who has completed primary school only. This result, however, is not statistically different from 0. Intuitively this makes sense, since it would be odd for a child whose mother has lower education (not completed secondary school, compared to completed secondary school) to produce a higher percentage of words (16.7 vs. 12.3). The statistical insignificance of this variable may stem from the fact that only four mothers were in this category. Despite the fact that only one mother in the sample had completed a tertiary education, the effect is statistically and practically significant (p<0.1): compared to a child whose mother has only completed primary school, a child whose mother has some level of tertiary education will produce 25.3 percentage points more words on average, holding all else constant. This corresponds to qualitative findings from pre-pilot research in Cata, where an extremely talkative toddler (aged 2;10) had a mother who was a teacher. When we visited her home we observed the extraordinary level of stimulation she and her older brother were getting.
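The role of the base category can be illustrated with a short sketch of dummy coding. The variable names `not_completed_sec`, `completed_sec` and `tertiary` follow the model specification; the label `completed_primary` and the function `education_dummies` are assumptions introduced here for illustration.

```python
LEVELS = ["completed_primary", "not_completed_sec", "completed_sec", "tertiary"]
BASE = "completed_primary"  # omitted from the regression; all dummies are 0 for it


def education_dummies(level):
    """Code a mother's education level as 0/1 dummies, leaving out the base category."""
    if level not in LEVELS:
        raise ValueError(f"unknown education level: {level}")
    return {name: int(name == level) for name in LEVELS if name != BASE}


print(education_dummies("completed_primary"))
print(education_dummies("tertiary"))
```

Because the base category has all dummies equal to zero, each education coefficient is read as a difference relative to children whose mothers completed primary school only.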

As suggested in prior literature, the results confirm that the presence of a sibling as a secondary caregiver has a negative effect on vocabulary production. On average, having a sibling secondary caregiver is associated with a relatively large 18.6 percentage point decrease in expressive vocabulary size, holding all else constant (p<0.01). This echoes Vogt, Mastin and Aussems’s (2015) finding that children (aged 1-2;1) who are reported as having a sibling as a secondary caregiver produce 18% fewer words than children with a primary caregiver or adult secondary caregiver only.

On the other hand, each additional secondary caregiver is associated with a highly significant 11.3 percentage point increase in words produced (p<0.01). It is important to note that sibling secondary caregivers were not excluded from the ‘Number of secondary caregivers’ variable.[52] This potential endogeneity would only work to reduce the statistical significance of these variables, but given that both are significant this is unproblematic. Vogt, Mastin and Aussems (2015) find a negative but not statistically significant effect of an adult secondary caregiver. My results suggest a slightly more nuanced understanding, namely that whilst having a sibling secondary caregiver is associated with lower vocabulary production, holding all else constant, the presence of multiple secondary caregivers may in fact benefit the child, arguably through increased interaction and stimulation. In Vogt, Mastin and Aussems’s (2015) sample, having a secondary caregiver is much more strongly associated with the rural community. If a similar pattern persists in South Africa, these findings may be moderated by the inclusion of urban data.

[52] This problem should be considered in the development of the final CDI questionnaire.

The coefficient on being a twin (or alternatively, being born early) is not statistically different from 0, which is more in line with findings from Saudino et al. (1998) than with those of Thorpe (2006), who finds mild language delay. Being a mother’s first child is associated with the production of 28.6 percentage points fewer words on average, which is highly significant at the 1% level (p<0.01). This is in the opposite direction to the effect found by Fenson et al. (1994), and is slightly counterintuitive, since one may expect the first born to receive a larger portion of the mother’s attention in the early years of its life, before siblings arrive. However, it could be that this effect is channelled through the fact that being in the presence of older siblings since birth (i.e. if the child is not the first born) may encourage interactive play and thus stimulate the production of words.

Surprisingly though, the coefficient on the number of siblings a child has, whilst small, is negative, indicating that for every additional sibling a child has, 5.05 percentage points fewer words are produced by the child on average (p<0.05). This does not rule out the potential explanation for the negative sign on ‘First born’, since the ‘Number of siblings’ variable does not distinguish between siblings born before or after the child. In fact, this result is in line with findings from Harkness (2009, in Vogt, Mastin and Aussems, 2015: 11), who shows that children growing up in rural Kenya who socialise more with siblings have smaller vocabularies than those who socialise more with their mothers.

In this case, it may be that more siblings direct the mother’s attention away from the child in question, explaining the negative coefficient on number of siblings, although this does not account for the negative effect of being a firstborn. Alternatively, the effect of number of siblings could operate through similar channels as having a sibling secondary caregiver. The most plausible explanation for the negative coefficients on both first born and number of siblings, though, may arise from educational factors. Mothers are younger when they have their first child, so it is not improbable that they will have lower education too, and in fact many of the mothers we interviewed reported still being in school.[53] Additionally, if the child spends a large amount of time with siblings, who also have a low education by virtue of their age, the effects of education are likely to be similar.

Finally, pointing towards the critical importance of early childhood development centres in South Africa is the large and statistically significant (at the 1% level, p<0.01) coefficient on crèche attendance. Holding all else constant, crèche attendance is associated with a 36.8 percentage point increase in vocabulary production on average. After gender (the effect of which is shown to be diminished by the age-gender interaction), the variable for crèche attendance has the largest effect on vocabulary production, even surpassing the evident importance of maternal education (particularly indicative is the attainment of post-schooling education, to be confirmed by future studies). Despite the lack of resources and educational toys in crèches in rural South Africa, it is nevertheless likely that children will be engaged in more stimulating and educational play and activities that enrich their vocabularies and their communicative development, as shown by these results.
