• shot placement; and
• net strategy.
Each of these can be further decomposed through the process of abstraction. For example, service strategy can be based on serve type (slice, flat or kick), pace of serve and placement of serve. Timing factors, such as the duration of a rally, can also be used as indications of strategy. The percentage of points where an ace is served is a variable that gives an abstract representation of serving ability, abstracting away from details of when aces were served, precise service technique, ball placement and receiver positioning. The author (O’Donoghue and Ingram, 2001) used five timing factors, two outcome indicators and nine other variables as a fingerprint representing the concept of strategy:
• mean rally duration;
• mean time between serves played within points where more than one serve was required;
• mean time between points within games;
• mean time between games when the players had to change ends of the court;
• mean time between games when the players did not have to change ends of the court;
• the percentage of points where the first serve was played in;
• the percentage of points won given that the first serve was played in;
• the percentage of points won when a second service was required;
• the percentage of points where an ace was served;
• the percentage of points where a double fault was played;
• the percentage of points where a serve winner was played;
• the percentage of points where a return winner was played;
• the percentage of points where the server approached the net first;
• the percentage of points where the receiver approached the net first;
• the percentage of points that were baseline rallies;
• the percentage of net points won by the player who went to the net first.
In other sports such as soccer, different teams use different formations and systems of play, making positional role a complex factor. The concept of ‘total football’, where players perform multiple roles, can make the concept of positional role rather fluid. Positional role can also vary from situation to situation within the same match. The author once analysed the on-field activity of a soccer player who seemed to perform as a left centre back, a left wing back and a left of centre holding midfielder at different points of the same game. On another occasion, the author watched a central defender play as a centre forward when his team required an equalising goal in the last five minutes of a match. Is the switch of position from centre back to centre forward part of the role of the centre back?
In order to be objective, it is necessary to produce an operational definition of positional role so that the classification of player position is independent of the researcher’s personal opinion of the player’s positional role. Objectivity has other disadvantages and the definition of an elite soccer player used by O’Donoghue (1998) is a good example of this. The definition was that an elite player was an English FA Premier League player with at least one full international cap for his country. The problem with this definition is that some players who have not made any appearances for highly ranked national teams may be better soccer players than some players who have been capped many times by national squads that are not ranked as highly.
Despite the widely held view that notational analysis is a quantitative observational analysis method, a great deal of notational analysis activity involves subjective evaluation of player behaviour. For example, the classification of locomotive movement during time-motion analysis involves observer perception of movement and largely subjective judgements when classifying movement. The point at which a player makes the transition from jogging to running and from running to sprinting will vary from observer to observer. The frequencies of movement instances are simply numerical counts of subjective classifications.
There are, however, occasions where it is possible to define variables in a precise and objective manner. Point type in tennis is used as an example of producing operational definitions. It is often useful to specify some basic terms first. For example, if we start by trying to define a double fault, we might define it as follows:
A double fault is served when both the first and second services fail to land in the correct service box.
The problem with this definition is that it is incomplete, and a smart tennis lawyer could argue that a table tennis-type serve could be counted as being in because we did not specify that the first contact of the ball with the court had to be in the correct court on the other side of the net from the server.
Specifying all of the elements of a double fault (foot faulting, the exclusion of lets, the ball having to pass over the net to be a good serve, and regulations regarding the service action and multiple contacts of the ball with the racket) can become very complicated because at least two serves are involved. So initially, it is a good idea to define a ‘good serve’ and a ‘service fault’. These terms can then be used within the definitions of an ace and a double fault.
The definition of a net point presents similar problems. How far does the player have to travel towards the net before the point is considered to be a net point? O’Donoghue and Ingram (2001) counted a player as having approached the net if the player had crossed the service line (the back of the service boxes) and the player or the opponent still had to play one or more shots in the point. So if the player hit a winner from behind the service line and ran towards the net after playing this shot, it was not a net point. If the opponent reached this shot but played an error (another term we need to define) before the player crossed the service line, it was not a net point. However, if the opponent reached the shot after the player had crossed the service line, whether successfully returning the ball or not, the point counted as a net point. If the player played any shots after crossing the service line, then the point was a net point, irrespective of the outcome of these shots.
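The net point rule described above can be sketched as a small function. This is only an illustration under stated assumptions: the function name, the idea of recording a timestamp for the moment the player crosses the service line, and the list of shot timestamps are all hypothetical and are not taken from O’Donoghue and Ingram (2001).

```python
from typing import Optional, Sequence


def is_net_point(crossing_time: Optional[float],
                 shot_times: Sequence[float]) -> bool:
    """Classify a point as a net point for one player.

    crossing_time: time (s) at which the player crossed the service
        line, or None if the player never crossed it (assumed input).
    shot_times: times (s) of every shot played in the point by either
        player, including errors (assumed input).
    """
    if crossing_time is None:
        return False
    # A net point requires at least one shot, by either player,
    # to be played after the service line was crossed.
    return any(t > crossing_time for t in shot_times)


# Winner hit at 3.0 s, player crosses the line afterwards: not a net point.
winner_then_run = is_net_point(5.0, [0.0, 1.5, 3.0])
# Opponent plays a shot (good or not) at 4.2 s, after the crossing at 3.5 s.
opponent_after_crossing = is_net_point(3.5, [0.0, 1.5, 3.0, 4.2])
```

Expressing the definition this way makes it clear that the outcome of the later shot is irrelevant; only its timing relative to the crossing matters.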
There are occasions where a single variable may not be enough to characterise the concept of interest. If we have operationally defined the duration of a rally, the mean rally duration for the match can be computed but this will not tell the reader if all of the rallies in matches are consistently around the mean duration or if there is a wide spread of rally durations within matches. The standard deviation of the mean rally duration will simply represent the spread of mean rally durations across matches rather than within matches. A match may have a mean rally duration of 4s with some rallies being longer than 10s.
Rally durations cannot be negative, and so rally duration within matches has a skewed distribution of values. Therefore, there is a case for using the median rally duration of a match as a measure of average rally duration. However, the median will not provide an indication of the spread of rally durations within matches. O’Donoghue and Liddle (1998) used a series of variables to represent the percentage of points where the rallies were in different 2s duration bands (less than 2s, 2s to under 4s, 4s to under 6s, 6s to under 8s, 8s to under 10s and so on). This not only allowed the modal rally duration band to be determined but also allowed the distribution of rally durations within matches to be studied.
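The difference between the mean, the median and the banded distribution can be illustrated with a short sketch. The rally durations below are invented for illustration, and the function name `band_percentages` is an assumption rather than anything used in the cited studies.

```python
from collections import Counter
from statistics import mean, median


def band_percentages(durations, width_s=2.0):
    """Percentage of rallies in each duration band.

    Keys are the lower bound of each band in seconds, so with the
    default width 0.0 means 'less than 2s', 2.0 means '2s to under 4s'.
    Bands containing no rallies are omitted.
    """
    counts = Counter(int(d // width_s) for d in durations)
    n = len(durations)
    return {i * width_s: 100.0 * c / n for i, c in sorted(counts.items())}


# Hypothetical rally durations (s) for one match.
rallies = [1.5, 3.0, 3.5, 2.5, 12.0, 4.5, 1.0, 3.8]

match_mean = mean(rallies)      # pulled upwards by the 12s rally
match_median = median(rallies)  # less affected by the long rally
bands = band_percentages(rallies)
modal_band = max(bands, key=bands.get)  # lower bound of the modal band
```

In this invented match the mean is just under 4s, yet half of the rallies fall in the 2s to under 4s band and one rally lasts 12s, which neither the mean nor the median reveals on its own.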
FORMING HYPOTHESES
To use hypotheses or not?
The use of operational definitions and formally expressed hypotheses can add considerably to the word count of the introduction of a dissertation.
When one considers the number of words in a mobile phone contract, it becomes apparent that a large number of words are required to specify anything completely and unambiguously. Therefore, some research projects will use a less precise set of aims and a statement of the purpose within the introduction. Another reason for not expressing formal hypotheses is that the student may be producing an initial proposal for a project and part of the research project is to identify performance indicators and develop a system to allow these to be determined. When such a student eventually completes their dissertation, the operational definitions may appear in the methods section rather than in the introduction. Other students may specify some fundamental terms in the introduction and define other terms within the methods chapter.
There is a more fundamental reason why formal hypotheses may not be specified in some projects. There are some research projects that simply describe the performance of a group of interest using a single sample and descriptive statistics. Inferential statistics are not relevant to the purpose of such studies unless values are being compared with some benchmark standard. Therefore, research projects can be done using a single sample that is described without any inferential testing and hence without any need to specify formal hypotheses.
The remaining sections on hypotheses are relevant when formal hypotheses are to be used. They describe how to specify hypotheses when different types of variable are being used. A hypothesis is a precisely specified outcome of a study or experiment. In Figure 4.1, the formulation of hypotheses is done once an important research question of interest has been decided, structured and operationalised. The hypotheses specify precisely the research question that the research project seeks to answer. When undertaking a quantitative study, the hypotheses are the possible outcomes of the study, and the research question becomes one of determining which outcome to accept and which to reject.
Typically there are two hypotheses: a null hypothesis (H0) and an alternative hypothesis (HA), which are mutually exclusive as well as being the only possible outcomes of the study. The alternative hypothesis is sometimes referred to as the ‘research hypothesis’ because it is the link or association between concepts suggested during the structuring of the research problem. The null hypothesis is a potential outcome reflecting no association or link between the concepts of interest. Taken together, these two properties mean that no outcome of the study can be interpreted as agreeing with both hypotheses or with neither hypothesis.
The hypotheses must be observable or testable so that they can be accepted or refuted based on the evidence of the data collected. Well-formed hypotheses are also an excellent starting point when designing the methods to be used.
Hypotheses involving only nominal variables
If we wished to determine if the court surface had an influence on the chances of an upset in Grand Slam singles tennis, we would be dealing with two categorical variables that just happen to be nominal variables. The specification of the research question in this case must not attempt to impose tests of order on either variable. Logically we do have a hypothesised dependent variable, match outcome, which is a dichotomous variable with values being ‘a win for the higher ranked player’ or ‘an upset’ with respect to the World rankings used in professional tennis. This variable is hypothesised to be the dependent variable rather than court surface because the court will not turn from cement to grass even if the World number 1 is defeated by the World number 500! The study may be surveying matches played on two or more court surfaces. If more than two court surfaces are included in the study, the researcher may wish to know which pairs of court surfaces are different if there is a surface effect. However, this is best described in the methods and the eventual results. The hypotheses will be whether there is a surface effect on match outcome or not.
In the following examples of hypotheses, it is assumed that the scope of the investigation has been restricted to singles tennis matches played at Grand Slam tennis tournaments. The following are examples of poorly presented hypotheses:
H0 – Court surface has a small influence on match outcome.
HA – Court surface has a large influence on match outcome.
These null and alternative hypotheses do not cover every possible outcome of the study. This is because there may be no influence at all of court surface on match outcome, with exactly the same proportion of upsets being observed on all court surfaces. Furthermore, the distinction between a small and large influence may not have been defined. The following hypotheses are vague if match outcome has not been defined:
H0 – Court surface has no influence on match outcome.
HA – Court surface has an influence on match outcome.
Assuming that an ‘upset’ has been defined as any tennis match won by the lower ranked player according to the World rankings, then the hypotheses can be expressed satisfactorily as follows:
H0 – Court surface has no influence on the proportion of matches that are upsets.
HA – Court surface has an influence on the proportion of matches that are upsets.
If there were only two court surfaces included in the study (for example clay and grass), the hypotheses might be expressed as shown below, but this wording would be problematic. The problem is that the alternative hypothesis is one-tailed, assuming a direction of any difference: that if there is a difference, there is a greater proportion of upsets on grass. One-tailed hypotheses can be used, but only when there are strong theoretical grounds for assuming the direction of any difference found. The hypotheses below introduce an additional problem in that they do not include the possible outcome that there may be a higher proportion of upsets on clay:
H0 – The proportion of matches that are upsets is similar between matches played on grass and clay courts.
HA – A greater proportion of matches are upsets on grass surfaces than on clay surfaces.
A two-tailed version of the alternative hypothesis is shown below, where the outcome is a difference in the proportion of upsets observed, without any assumption about which surface is associated with more upsets if there is a surface effect:
H0 – The proportion of matches that are upsets is similar between matches played on grass and clay courts.
HA – The proportion of matches that are upsets varies between matches played on grass and clay courts.
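The practical difference between the one-tailed and two-tailed wordings above can be sketched with a two-proportion z-test. The choice of test is an assumption made for illustration (the text does not prescribe one), the match counts are invented, and the function names are hypothetical.

```python
import math


def two_proportion_z(upsets_a, matches_a, upsets_b, matches_b):
    """z statistic for the difference between two upset proportions,
    using the pooled proportion under H0 (no surface effect)."""
    p_a = upsets_a / matches_a
    p_b = upsets_b / matches_b
    pooled = (upsets_a + upsets_b) / (matches_a + matches_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / matches_a + 1 / matches_b))
    return (p_a - p_b) / se


def p_two_tailed(z):
    """P(|Z| >= |z|): a difference in either direction counts."""
    return math.erfc(abs(z) / math.sqrt(2))


def p_one_tailed_greater(z):
    """P(Z >= z): only 'more upsets on surface A' counts."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical counts: 30 upsets in 100 grass matches, 20 in 100 clay matches.
z = two_proportion_z(30, 100, 20, 100)
p_two = p_two_tailed(z)
p_one = p_one_tailed_greater(z)
```

With these invented counts the one-tailed p-value is half the two-tailed one, which is why a one-tailed hypothesis needs to be justified on theoretical grounds before the data are seen.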
At the moment, these hypotheses could be used for a study of just men’s singles matches, or a study of just women’s singles matches, or a study that combines both. However, with both men’s and women’s matches involved, it would be a good idea to include gender as another categorical factor that might have an influence on the number of upsets. This could be done by having two sets of hypotheses as follows:
H01 – Court surface has no influence on the proportion of matches that are upsets.
H02 – The proportion of matches that are upsets is similar between men’s and women’s singles matches.
HA1 – Court surface has an influence on the proportion of matches that are upsets.
HA2 – The proportion of matches that are upsets differs between men’s and women’s singles matches.
This example looks at the influence of gender on match outcome and the influence of court surface on match outcome separately. The question about the effect of court surface involves both men’s and women’s singles matches pooled together. Similarly, the question about the effect of gender on match outcome involves matches played on different surfaces. The readers of the eventual study might like to know whether the surface effect occurs in men’s singles and women’s singles when considered separately. This would require the hypotheses to be worded as follows:
H0 – Court surface has no influence on the proportion of matches that are upsets in men’s or women’s singles tennis.
HA – Court surface has an influence on the proportion of matches that are upsets in men’s singles tennis and/or women’s singles tennis.
Note that the use of the ‘and/or’ in the alternative hypothesis is necessitated by the need to ensure that the alternative hypothesis is the exact opposite outcome to the null hypothesis. Therefore, if the eventual results show a significant influence of court surface on match outcome in men’s singles, women’s singles, or both men’s and women’s singles tennis, then the null hypothesis can be rejected.
Hypotheses involving only numerical variables
There are occasions where the research question is about the association or relationship between two or more numerical scale variables. In such cases the hypotheses are not expressed in terms of samples, groups or conditions unless some additional categorical variable is included. Instead, the alternative (research) hypothesis is that there is some relationship between the numerical variables.
Consider the example of serving in Grand Slam singles tennis where a researcher anticipates that if the first serve is played fast and close to the lines of the service box, it will be out more often than a slow first serve aimed well inside the target service box. However, on those occasions where such a fast serve is successfully played into the court, it will be more likely to lead to a winning point for the server than if a slow service aimed well inside the target service box were used. This reasoning could justify the use of a one-tailed hypothesis that assumes that any association between the percentage of first serves that are played in and the percentage of points won by the server must be a negative relationship. A negative relationship between two numerical variables X and Y is one where as X increases, Y decreases. The hypotheses in this tennis serving example might be expressed as follows:
H0 – There is a positive relationship or no relationship between the percentage of points where the first serve is in and the percentage of points that are won by the server, given that the first serve is played in.
HA – There is a negative relationship between the percentage of points where the first serve is in and the percentage of points that are won by the server, given that the first serve is played in.
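The direction of a relationship between two numerical variables can be illustrated with a correlation coefficient. The sketch below uses Pearson's r, which is an assumption made for illustration rather than a method named in the text, and the player values are invented. It only reports the direction and strength of the relationship; deciding between H0 and HA would additionally require a significance test, which is not shown.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for five players:
# X = percentage of points where the first serve was in,
# Y = percentage of points won given that the first serve was in.
first_serve_in = [55.0, 60.0, 65.0, 70.0, 75.0]
won_when_in = [78.0, 76.0, 73.0, 71.0, 68.0]

r = pearson_r(first_serve_in, won_when_in)
# r < 0 is in the direction predicted by HA; r >= 0 is in the
# direction covered by H0 (positive relationship or none).
direction_matches_ha = r < 0
```

A negative r here corresponds to the situation described above: as X increases, Y decreases.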
A more cautious researcher might express these hypotheses in a two-tailed manner, not assuming a particular direction of any relationship that might be found between the variables: