Technical Background
2.1.1 The Display Pipeline
The display pipeline refers to the transformation of object data from the space in which it is originally defined, through a series of intermediate spaces, to its final mapping onto the screen.
The object data are transformed into different spaces in order to efficiently compute illumination, clip the data to the view volume, and perform the perspective transformation. After reviewing these spaces and their properties and the transformations that map data from one space to the next, this section defines useful terminology. The names used for these spaces vary from text to text, so they will be reviewed here to establish a consistent naming convention for the rest of the book. Clipping, while an important process, is not relevant to motion control and therefore is not covered.
The space in which an object is originally defined is referred to as object space. The data in object space are usually centered around the origin and often are created to lie within some limited standard range such as minus one to plus one. The object, as defined by its data points (which are also referred to as its vertices), is transformed, usually by a series of rotations, translations, and scales, into world space, the space in which objects are assembled to create the environment to be viewed.
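As a rough sketch of this object- to world-space mapping, the following applies a uniform scale, a rotation about the y-axis, and a translation to a single vertex. The function name and parameters here are illustrative only; a real pipeline composes these operations as 4 × 4 homogeneous matrices applied to all vertices at once.

```python
import math

def transform_vertex(p, scale, angle_y, translate):
    """Map an object-space vertex into world space by applying a
    uniform scale, a rotation about the y-axis, and a translation,
    in that order.  (Illustrative sketch, not a full matrix pipeline.)"""
    x, y, z = (scale * c for c in p)
    c, s = math.cos(angle_y), math.sin(angle_y)
    x, z = c * x + s * z, -s * x + c * z   # rotate about the y-axis
    return (x + translate[0], y + translate[1], z + translate[2])
```

For example, scaling the vertex (1, 0, 0) by 2 and translating by (5, 0, 0) places it at (7, 0, 0) in world space.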
In addition, world space is the space in which light sources and the observer are placed. For purposes of this discussion, observer position is used synonymously and interchangeably with camera position and eye position. The observer parameters include its position and its orientation, consisting of the view direction and the up vector. There are various ways to specify these orientation vectors. Sometimes the view direction is specified by giving a center of interest (COI), in which case the view direction is the vector from the observer or eye position (EYE), also known as the look-from point, to its center of interest, also known as the look-to point. The default orientation “straight up” defines the observer’s up vector as the vector that is perpendicular to the view direction and in the plane defined by the view direction and the global y-axis. This vector can be computed by first taking the cross product of the view direction vector and the y-axis and then taking the cross product of this vector with the view direction vector (Equation 2.1).

Head tilt information can be provided in one of two ways. It can be given by specifying an angle deviation from the straight-up direction, in which case a head tilt rotation matrix can be incorporated in the world- to eye-space transformation. Alternatively, head tilt information can be given by specifying an up direction vector. The user-supplied up direction vector is typically not required to be perpendicular to the view direction as that would require too much work on the part of the user. Instead, the vector supplied by the user, together with the view direction vector, defines the plane in which the up vector lies. The difference between the user-supplied up direction vector and the up vector is that the up vector by definition is perpendicular to the view direction vector. The computation of the up vector is the same as that outlined in Equation 2.1, with the user-supplied up direction vector replacing the y-axis.
w = COI – EYE        (view direction vector)
u = w × (0, 1, 0)    (cross product with y-axis)
v = u × w            (up vector)                                  (Eq. 2.1)
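The computation of Equation 2.1 can be written directly in terms of three-dimensional cross products. The point names EYE and COI follow the text; the helper functions are a minimal pure-Python sketch.

```python
def sub(a, b):
    """Componentwise vector difference a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    """Standard three-dimensional cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def view_basis(EYE, COI):
    """Equation 2.1: view direction w, side vector u, up vector v."""
    w = sub(COI, EYE)               # view direction vector
    u = cross(w, (0.0, 1.0, 0.0))   # cross product with the y-axis
    v = cross(u, w)                 # up vector, perpendicular to w
    return w, u, v
```

For an observer at the origin looking down the positive z-axis, this yields an up vector of (0, 1, 0), as expected.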
Care must be taken when using a default up vector. Defined as perpendicular to the view vector and in the plane of the view vector and global y-axis, it is undefined for straight up and straight down views. These situations must be dealt with as special cases or simply avoided. In addition to the undefined cases, some observer motions can result in unanticipated effects. For example, the default head-up orientation means that if the observer has a fixed center of interest and the observer’s position arcs directly over the center of interest, then just before and just after being directly overhead, the observer’s up vector will instantaneously rotate by up to 180 degrees (see Figure 2.1).
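The undefined straight up and straight down cases can be detected by checking whether the first cross product of Equation 2.1 vanishes, i.e., whether the view direction is parallel to the global y-axis. A small sketch (the function name and tolerance are of my choosing):

```python
def default_up_is_defined(w, eps=1e-9):
    """Return False when the view direction w is (nearly) parallel to
    the global y-axis, in which case u = w x (0, 1, 0) vanishes and
    the default up vector of Equation 2.1 is undefined."""
    # w x (0, 1, 0) reduces to (-w_z, 0, w_x), so only two components matter.
    ux, uz = -w[2], w[0]
    return ux * ux + uz * uz > eps
```

A caller would fall back to a special-case up vector, or simply avoid such views, whenever this check fails.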
In addition to the observer position and orientation, the field of view has to be specified, as is standard in the display pipeline. This includes an angle of view (or the equally useful half angle of view), hither clipping distance, and yon clipping distance. Sometimes the terms near and far are used instead of hither and yon. The field of view information is used to set up the perspective projection.
The view specification discussed above is somewhat simplified. Other view specifications use an additional vector to indicate the orientation of the projection plane, allow an arbitrary viewport to be specified on the plane of projection that is not symmetrical about the view direction to allow for off-center projections, and allow for a parallel projection. The reader should refer to standard graphics texts such as the one by Foley et al. [2] for in-depth discussion of such view specifications.
The visible area of world space is formed by the observer position, view direction, angle of view, hither clipping distance, and yon clipping distance (Figure 2.2). These define the view frustum, the six-sided volume of world space containing data that need to be considered for display.
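Although the frustum is defined in world space, membership is most conveniently tested once points have been transformed to eye space, where the view direction coincides with the z-axis. A hedged sketch of such a containment test, assuming the left-handed eye space used in this text and a symmetric frustum:

```python
import math

def in_frustum(p_eye, fov_y, aspect, hither, yon):
    """True if an eye-space point lies inside the six-sided view
    frustum (observer at the origin looking down the positive z-axis).
    fov_y is the vertical angle of view; aspect is width over height."""
    x, y, z = p_eye
    if not (hither <= z <= yon):
        return False                 # outside the hither/yon planes
    t = z * math.tan(fov_y / 2.0)    # half-extent of the frustum at depth z
    return abs(y) <= t and abs(x) <= t * aspect
```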
Figure 2.1 Demonstrating the up vector flip as the observer’s position passes straight over the center of interest

In preparation for the perspective transformation, the data points defining the objects are usually transformed from world space to eye space. In eye space, the observer is positioned along the z-axis with the line of sight made to coincide with the z-axis. This allows the depth of a point, and therefore perspective scaling, to be dependent only on the point’s z-coordinate. The exact position of the observer along the z-axis and whether the eye space coordinate system is left-handed or right-handed vary from text to text. For this discussion, the observer is positioned at the origin looking down the positive z-axis in left-handed space. In eye space as in world space, lines of sight emanate from the observer position and diverge as they expand into the visible view frustum, whose shape is often referred to as a truncated pyramid.
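Under the convention just described, the world- to eye-space transformation can be sketched as a translation by −EYE followed by projection onto the normalized u, v, w basis of Equation 2.1, so that the view direction w becomes the positive z-axis. The function names are of my choosing:

```python
import math

def normalize(a):
    m = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    return (a[0] / m, a[1] / m, a[2] / m)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def world_to_eye(p, EYE, u, v, w):
    """Express a world-space point p in eye space: move the observer to
    the origin, then take components along the normalized u, v, w basis
    so the line of sight coincides with the positive z-axis."""
    u, v, w = normalize(u), normalize(v), normalize(w)
    d = (p[0] - EYE[0], p[1] - EYE[1], p[2] - EYE[2])
    return (dot(d, u), dot(d, v), dot(d, w))
```

Note that with u = w × (0, 1, 0), this basis is left-handed, matching the eye-space convention adopted above.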
Figure 2.2 Object- to world-space transformation and the view frustum in world space

The perspective transformation transforms the objects’ data points from eye space to image space. The perspective transformation can be considered as taking the observer back to negative infinity in z and, in doing so, making the lines of sight parallel to each other and to the z-axis. The pyramid-shaped view frustum becomes a rectangular solid, or cuboid, whose opposite sides are parallel. Thus, points that are farther away from the observer in eye space have their x- and y-coordinates scaled down more than points closer to the observer. This is sometimes referred to as perspective foreshortening. Visible extents in image space are usually standardized into the minus one to plus one range in x and y and from zero to one in z (although in some texts visible z is mapped into the minus one to positive one range). Image space points are then scaled and translated (and possibly rotated) into screen space by mapping the visible ranges in x and y (minus one to plus one) into ranges that coincide with the viewing area defined in the coordinate system of the window or screen; the z-coordinates can be left alone. The resulting series of spaces is shown in Figure 2.3.
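The foreshortening and the standardized image-space ranges can be sketched as follows. For clarity this version maps visible z linearly into [0, 1]; an actual pipeline performs the mapping with a homogeneous matrix, which makes the z mapping nonlinear. The parameter names are my own.

```python
import math

def eye_to_image(p, fov_y, aspect, hither, yon):
    """Sketch of the eye- to image-space mapping: scale x and y down in
    proportion to depth (perspective foreshortening) and map visible z
    from [hither, yon] into [0, 1].  Assumes the left-handed eye space
    of the text and a point inside the view frustum (z > 0)."""
    x, y, z = p
    t = math.tan(fov_y / 2.0)            # tangent of the half angle of view
    xi = x / (z * t * aspect)            # in [-1, 1] inside the frustum
    yi = y / (z * t)                     # in [-1, 1] inside the frustum
    zi = (z - hither) / (yon - hither)   # 0 at the hither plane, 1 at yon
    return (xi, yi, zi)
```

Doubling a point’s depth halves its projected x- and y-coordinates, which is exactly the foreshortening described above.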
Ray casting (ray tracing without generating secondary rays) differs from the above sequence of transformations in that the act of tracing rays through world space implicitly accomplishes the perspective transformation. If the rays are constructed in world space based on pixel coordinates of a virtual frame buffer positioned in front of the observer, then the progression through spaces for ray casting reduces to the transformations shown in Figure 2.4. Alternatively, data can be transformed to eye space and, through a virtual frame buffer, the rays can be formed in eye space.
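A minimal sketch of forming such rays in eye space, assuming the observer at the origin looking down the positive z-axis and a virtual frame buffer one unit in front of the observer (function and parameter names are illustrative):

```python
import math

def pixel_ray(ix, iy, width, height, fov_y):
    """Ray through the center of pixel (ix, iy) of a width x height
    virtual frame buffer placed in front of the observer in eye space.
    Returns (origin, direction); the direction is left unnormalized."""
    aspect = width / height
    t = math.tan(fov_y / 2.0)
    # Map pixel centers into [-1, 1] in x and y, with y increasing upward.
    x = (2.0 * (ix + 0.5) / width - 1.0) * t * aspect
    y = (1.0 - 2.0 * (iy + 0.5) / height) * t
    return (0.0, 0.0, 0.0), (x, y, 1.0)
```

To form the rays in world space instead, the same directions would be rotated through the observer’s u, v, w basis and the origin replaced by the eye position.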
In any case, animation is typically produced by the following: modifying the position and orientation of objects in world space over time; modifying the shape of objects over time; modifying display attributes of objects over time; transforming the observer position and orientation in world space over time; or some combination of these transformations.