
Review

Statistics in language research: Analysis of variance


Tony Rietveld and Roeland van Hout (2005)

New York: Mouton de Gruyter. viii + 265 pp. ISBN 978-3-11-018580-5

Reviewed by Jenifer Larson-Hall

With so many books on statistics in existence, I believe that any new book on statistics should have a clear raison d’être. In my opinion, Rietveld and van Hout’s book lacks a clear target audience, and I would recommend other books on statistics instead. For example, this book, at only 265 pages, is not one of the large, rather weighty, authoritative surveys on statistics like Howell (2006), Tabachnick and Fidell (2006) or Kirk (1995). These books give detailed explanations of a large variety of statistical methods using examples from the behavioral sciences and provide both mathematical and conceptual explanations of statistical procedures. In more recent editions they also give information about how to calculate statistics with programs like SPSS, SAS, SYSTAT and MINITAB. (I confess an especial fondness for Howell, whose explanative style I find very clear. I stay awake while walking through an example of how to calculate sums of squares with Howell, which I consider a high compliment!)

Rietveld and van Hout is compact and provides some explanation of how to use SPSS, which is clearly a widely used statistical program in language research. However, it does not provide as much detail as one would really need to understand the statistical procedures and results well. For example, although the book gives instructions about which menus to follow in SPSS to conduct statistical procedures such as t-tests, one-way ANOVA, and MANOVA, it is skimpy on procedures to assess the assumptions of the tests and never tells how to call for any graphs (a practice which is recommended by Wilkinson and the APA task force on statistical inference, 1999, in conjunction with looking at numerical results). Rietveld and van Hout’s book is certainly no SPSS Survival Manual (Pallant 2007), which at 352 pages is a functional introduction to the essentials of SPSS statistical analysis without being overly simplistic.

One might think, then, that the value of this book is that it uses pertinent examples to illustrate how statistics works in practice in the field of linguistics. One would be mistaken here as well. The book contains relatively few concrete examples of linguistic analysis. As an example, in discussing the four scales of measurement, the nominal scale is illustrated by referring to language background, but the ratio scale is explained by talking about the price of cars and the weight of objects. I found it highly annoying that, in a book that would seem to have been written for a linguistic audience, the paired-sample t-test was illustrated by referring only to the variables ‘Group’ and ‘Dependent [variable]’. Although some tests are exemplified with a specific linguistic research design, such as repeated measures over time on patients with dysfunctional velums by a language pathologist, this turns out to be less common than I had hoped, and the data are always intentionally fabricated. The linguistic examples which are given range over a variety of subdisciplines of linguistics:

• Phonetics: Was a standard or non-standard allophone used?

• Psycholinguistics: What were the reaction times on different word types?

• Applied Linguistics: Which of three methods of vocabulary learning is best?

• Clinical Linguistics: What are intelligibility scores of children with cleft palates?

• Sociolinguistics: Two Dutch-speaking language communities in Belgium are studied, each one containing two dialect groups.

However, these examples continue to be repeated in a vague form (they never refer to any specific studies) with little difference from chapter to chapter. For example, no other type of psycholinguistic research is referred to except for reaction times on words.


an appendix. It is hard to see why linguists would need to understand matrix algebra in order to use SPSS for their own data analysis, and the mathematical examples are not explained in the kind of step-by-step fashion that most linguists could follow. The book is certainly not one I would use for an introductory statistics class, as I found myself, no statistical novice, baffled by some explanations, and turned glassy-eyed in most of the mathematical excursuses. Another problem with using this book for introductory statistics is that most exercises are formal, technical questions (‘What are the p values associated with z > 1.64 and with t29 > 1.699’, p. 30) rather than conceptual work with situations that would lead beginners to understand what they need to do to analyze their own data.
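To give a sense of how mechanical the quoted exercise is, it can be checked in a few lines. The following is a minimal Python sketch, not taken from the book. It assumes the exercise asks for one-tailed (upper-tail) probabilities and that ‘t29’ means a t distribution with 29 degrees of freedom. The helper names `t_pdf` and `t_upper_tail` are hypothetical, and the t tail is obtained by simple numerical integration rather than by any method the book describes.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The distribution is symmetric, so P(T > 0) = 0.5.
    return 0.5 - total * h / 3


# Assumption: one-tailed p values, as the ">" in the exercise suggests.
p_z = 1 - NormalDist().cdf(1.64)
p_t = t_upper_tail(1.699, 29)
print(round(p_z, 4), round(p_t, 4))
```

Both probabilities come out very close to .05, which suggests the exercise is essentially a table look-up of familiar critical values.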

On the other hand, there is little here that is enlightening for those who are more familiar with statistics. One reason is that the statistical topics are not as wide-ranging as in Rietveld and van Hout’s previous book (1993), as the focus here is mostly on ANOVA techniques (with two chapters on one-sample and two-sample t-tests). I did think that the illustration of various research designs for ANOVA, which can be confusing, was handled quite well in Chapter 1. The design boxes help clarify how many variables and which kinds are necessary for data set-up.


126). Statistical simulation studies have found that this is true when the actual situation is that there are no group differences, but when group differences do exist, even small violations of assumptions can lead researchers to a Type II error (finding no differences when they really exist) (Hampel 1973; Tukey 1960; Wilcox 2003). It would of course be wonderful if researchers determined sample sizes using power analysis, but robust methods of analysis could also help to improve the power and accuracy of statistical analysis.
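The point about Type II errors can be illustrated with a small simulation. The sketch below is my own illustration, not a reproduction of any of the cited studies, and every setting in it is an assumption: two groups of 25, a true difference of one unit, and a ‘contaminated’ normal distribution in which 10% of observations come from a normal distribution with a standard deviation of 10. It estimates how often an ordinary pooled-variance t-test detects the true difference with and without contamination.

```python
import math
import random
from statistics import mean, variance

# Two-sided 5% critical value of t with 48 degrees of freedom (n = 25 per group).
T_CRIT = 2.011


def draw(rng, shift, contaminated):
    """One observation: N(shift, 1), or, 10% of the time if contaminated, N(shift, 10^2)."""
    sd = 10.0 if contaminated and rng.random() < 0.1 else 1.0
    return shift + rng.gauss(0.0, sd)


def rejects(a, b):
    """Pooled-variance two-sample t-test at the 5% level (equal group sizes)."""
    n = len(a)
    pooled = (variance(a) + variance(b)) / 2
    t = (mean(a) - mean(b)) / math.sqrt(pooled * 2 / n)
    return abs(t) > T_CRIT


def power(contaminated, shift=1.0, n=25, reps=2000, seed=1):
    """Proportion of simulated experiments in which the true difference is detected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        a = [draw(rng, shift, contaminated) for _ in range(n)]
        b = [draw(rng, 0.0, contaminated) for _ in range(n)]
        hits += rejects(a, b)
    return hits / reps


power_normal = power(contaminated=False)
power_contaminated = power(contaminated=True)
print(power_normal, power_contaminated)
```

Under these assumed settings the estimated power is high for normal data and drops markedly once a small proportion of extreme values is mixed in, which is the kind of loss that robust methods are intended to reduce.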

In short, I do not know where this book fits in the panoply of statistical works that already exist. It does not seem appropriate for beginners but may baffle even those who feel they have a firm understanding of which test to choose for which occasion. It is not comprehensive enough to be an authoritative text but it is not simple enough to be used to quickly get up and going with using SPSS. It seems to be trying to fill a niche for those who want to understand statistics by seeing examples in their own field, but I believe it does not succeed as well as a few quite recent books that have appeared or will appear: Baayen (2008) successfully explains how to use advanced techniques such as principal components analysis, factor analysis and linear mixed models that can be used in studying language processing; Johnson (2008) provides a book organized by linguistic topic including phonetics, psycholinguistics, sociolinguistics, historical linguistics and syntax, and which covers both elementary statistics such as t-tests and one-way ANOVA as well as more advanced techniques like cluster analysis, mixed-effects models and an interesting comparison between logistic regression and Varbrul for Sociolinguists; or even my own book (Larson-Hall forthcoming) which aims to be an introductory text for Applied Linguists. All these books are illustrated with real data sets from real experiments and provide many more opportunities for those working in their fields to see how real, messy data sets can be analyzed with statistical software.

References

Baayen, R. H. (2008) Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.

Banks, D. (1995) Review: Statistical techniques for the study of language and language behavior. IRAL: International Review of Applied Linguistics in Language Teaching 33(1): 76–77.

Hampel, F. R. (1973) Robust estimation: A condensed partial survey. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 27: 87–104.

Howell, D. C. (2006) Statistical methods for psychology. (6th ed.) Pacific Grove, CA: Duxbury/Thomson Learning.

Kirk, R. E. (1995) Experimental design: Procedures for the behavioral sciences. Boston: Brooks/Cole Publishing Company.

Larson-Hall, J. (forthcoming) A guide to doing statistical analysis in second language acquisition. Mahwah, NJ: Lawrence Erlbaum Associates.

Meara, P. (1995) Review: Statistical techniques for the study of language and language behavior. Language Learning 54(2): 341–343.

Pallant, J. (2007) SPSS Survival manual. (3rd ed.) Philadelphia: Open University Press.

Rietveld, T. and Van Hout, R. (1993) Statistical techniques for the study of language and language behaviour. Berlin: Mouton de Gruyter.

Tabachnick, B. G. and Fidell, L. S. (2006) Using multivariate statistics. (5th ed.) Boston: Allyn and Bacon.

Tukey, J. W. (1960) A survey of sampling from contaminated distributions. In I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow and H. B. Mann (eds) Contributions to probability and statistics: Essays in honour of Harold Hotelling 448–485. Stanford: Stanford University Press.

Wilcox, R. (2001) Fundamentals of modern statistical methods: Substantially improving power and accuracy. New York: Springer.

Wilcox, R. (2003) Applying contemporary statistical techniques. San Diego: Elsevier Science.

Wilkinson, L. and Task force on statistical inference, APA, Science Directorate,
