GIS DATA MODELS - GIS-Based Technologies for Problem-Solving and Decision-Making

A model is a simplified representation of a phenomenon or a system. GIS modelling involves the symbolic representation of the location properties (where), as well as the thematic (what) and temporal (when) attributes describing the characteristics and conditions of space and time. A GIS model attempts to emulate processes of the real world at some point of time or for a limited time period. It allows the testing of a hypothesis with different data sets related to a geographical scenario. A model can be embedded into a GIS application for easier reproduction of data. A GIS model can be exported as a flow chart or modelling data structure.

Figure 1.8 (a) Raster data and (b) Vector data

There are different types of GIS models with some fundamental characteristics such as scale, extent, purpose, approach, technique, association, and aggregation. A large number and variety of data models are used in GIS, some of which are as follows (John 1997).¹

• Vector data models
  ◦ Spaghetti data model
  ◦ Topological data model

• Raster data models (more specifically, tessellation model)

• Surface models
  ◦ Triangular irregular network (TIN) model
  ◦ Digital elevation model (DEM)

• Conceptual models
  ◦ Entity-relationship model
  ◦ Enhanced entity-relationship model

• Network models

• Relational models

• Object-oriented models

• Hierarchical models

• Semantic data models

Vector Data Models

Vector data models use points, lines, and polygons to represent geographical features. In vector representation, boundaries are defined as a series of points, and each point is uniquely mapped to the x–y coordinates of a geo-referenced coordinate system. The non-spatial attributes of these locations are stored in conventional database management systems. Two very common types of vector data models are the spaghetti data model and the topological data model.

Spaghetti data model

A vector-based data model in which each element on the map becomes a logical record in a digital file and is defined as a string of x–y coordinates is called a spaghetti data model. This is the simplest data model, in which every object is stored independently. Objects in the spaghetti model are stored as a set of two elements: the name of the object and the x–y coordinate values of the object location. A spaghetti model is illustrated in Figure 1.9.

¹ Details available at <www.gisca.adelaide.edu.au/kea/gisrs/courses/postgrad/introgis/chapter6/chi6.html>

Some properties of the spaghetti model are as follows.

• A common boundary between two polygons is recorded twice. Hence, redundant data exist in a spaghetti model.

• Lines are encoded as strings of x–y coordinates, while polygons are encoded as closed loops of x–y coordinates.

• No spatial relationships are stored in the spaghetti model.
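The properties above can be illustrated with a minimal sketch. The feature names, the coordinates, and the helpers `is_closed` and `shared_points` are hypothetical and chosen only for illustration; they are not taken from any particular GIS package.

```python
# Spaghetti model sketch: each feature is a record of (name, coordinate string).
# All names and coordinates below are made up for illustration.
spaghetti = {
    "road_1": [(0, 0), (2, 1), (5, 1)],                     # line: open string
    "parcel_A": [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)],   # polygon: closed loop
    "parcel_B": [(4, 0), (8, 0), (8, 3), (4, 3), (4, 0)],   # polygon: closed loop
}


def is_closed(coords):
    """A polygon is stored as a loop whose first and last coordinates match."""
    return len(coords) > 3 and coords[0] == coords[-1]


def shared_points(a, b):
    """Coordinates present in both features.

    The model stores no spatial relationships, so adjacency can only be
    discovered by comparing the raw coordinates of two features.
    """
    return sorted(set(a) & set(b))


# The common boundary of the two parcels is recorded in both records.
duplicated = shared_points(spaghetti["parcel_A"], spaghetti["parcel_B"])
```

Here the boundary between the two parcels, from (4, 0) to (4, 3), appears once in each polygon record, which is the redundancy described above.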

Topological data model

A vector-based data model that encodes the spatial relationships of points, lines, and polygons and defines how they share geometry is called a topological data model. A topological model introduces two elements from discrete mathematics: the node and the edge. A node is a uniquely defined point that joins several arcs. An edge is an arc that has a defined starting node and ending node. This model stores geometry as a series of nodes and arcs. Shared geometry, such as a common boundary between two polygons, is stored only once in a topological model; hence, redundancy is eliminated. A topological data model is illustrated in Figure 1.10.

Some properties of a topological model are as follows.

• The node is the basic entity in this kind of model. It is the point where several arcs meet.

• An arc is a series of points that begins at a starting node and ends at an ending node.

• A point is a single x–y coordinate and is considered a polygon with no area.

• A polygon is a closed loop of arcs that represents the boundary of the polygon.

• Every object in the model is composed of less complex structures.

• Topological models provide opportunities for geometric analysis of location without actual access to the location.

Figure 1.9 A spaghetti data model
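As a rough sketch of the node and arc structure described above, the snippet below stores two adjacent polygons whose common boundary is held once, as a single arc. The identifiers, the `left` and `right` polygon fields, and the helpers `arcs_of` and `neighbours` are assumptions made for illustration; real topological formats differ in detail.

```python
# Topological model sketch: nodes, arcs, and polygons derived from arcs.
# Identifiers and coordinates are hypothetical.
nodes = {"n1": (4, 0), "n2": (4, 3)}

# Each arc has a start node, an end node, optional intermediate shape points,
# and the polygon on its left and right side (None means "outside").
arcs = {
    "a1": {"start": "n1", "end": "n2", "shape": [],
           "left": "A", "right": "B"},                     # shared boundary
    "a2": {"start": "n2", "end": "n1", "shape": [(0, 3), (0, 0)],
           "left": "A", "right": None},
    "a3": {"start": "n1", "end": "n2", "shape": [(8, 0), (8, 3)],
           "left": "B", "right": None},
}


def arcs_of(polygon):
    """Arcs that bound the given polygon."""
    return sorted(k for k, a in arcs.items()
                  if polygon in (a["left"], a["right"]))


def neighbours(polygon):
    """Polygons adjacent to the given one, found without any coordinates."""
    result = set()
    for a in arcs.values():
        sides = {a["left"], a["right"]}
        if polygon in sides:
            result |= sides - {polygon, None}
    return result
```

Adjacency is answered from the stored relationships alone: `neighbours("A")` inspects the arc table and never touches the coordinates.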

Raster Data Models

Raster data models represent geographical location as a series of interconnected cells, where each cell is bounded and represents an equal area of the earth’s surface. Raster data models use the raster data type to encode spatial data of the area of interest. The matrix (row–column structure) of cells is called a grid. In raster data models, the accuracy of data depends on the cell size, since the cell is the smallest unit that contains spatial information about a location.

Each cell of a raster data model contains an associated data value. For a 1-bit raster file, there are only two possible values for each cell, 0 or 1, while for an 8-bit raster file, there are 256 possible values for each cell (pixel).

Figure 1.11 shows a 4-bit raster file. A data value can represent a colour or grey value, depth or height, a measurement, or any other thematic value. The area covered by each pixel is known as the spatial resolution. An important property of a raster model is that all 0-dimensional (point) and 1-dimensional (line) features are represented by whole cells, so their positions are effectively shifted towards the centre of the cell. There are several raster-based models, and the common ones include ESRI grid files, digital orthophotos, and satellite imagery.

Some properties of a raster data model are as follows.

• It is often used for biological and physical subsystems of the geosphere, such as temperature, elevation, and vegetation cover.

• It focuses on analysis and modelling of images.

• Lines and points move towards the centre of cells in a raster model.

• The spatial position of each cell in a raster model can easily be calculated from the origin of the raster and the spatial resolution (cell size).

• TIFF, JPEG, and BMP are data formats based on the raster data model.

• Landsat TM satellite imagery data are raster data with a spatial resolution of approximately 30 m, that is, each cell measures about 30 m on a side.

Figure 1.10 A topological data model
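The relationship between bit depth and the number of cell values, and the calculation of a cell position from the raster origin and cell size, can be sketched as follows. The sketch assumes a north-up grid whose origin is the upper-left corner, with rows increasing downward; the function names and the sample coordinates are hypothetical.

```python
def value_count(bits):
    """Number of distinct values a cell can hold at the given bit depth."""
    return 2 ** bits


def cell_centre(row, col, origin_x, origin_y, cell_size):
    """Map coordinates of the centre of cell (row, col).

    Assumes the origin is the upper-left corner of the grid and that
    rows increase downward (towards smaller y).
    """
    x = origin_x + (col + 0.5) * cell_size
    y = origin_y - (row + 0.5) * cell_size
    return x, y


def cell_index(x, y, origin_x, origin_y, cell_size):
    """Row and column of the cell that contains map position (x, y)."""
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    return row, col


# A 30 m grid with a made-up origin at (0, 300).
centre = cell_centre(2, 3, 0.0, 300.0, 30.0)
index = cell_index(centre[0], centre[1], 0.0, 300.0, 30.0)
```

Any point or line vertex falling inside a cell maps to the same `(row, col)` index, which is why such features are effectively snapped to the cell.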

Surface Models

Triangulated irregular network model

A triangulated irregular network (TIN) model uses contiguous, non-overlapping triangles to represent a three-dimensional surface (length, width, and height). A geographical region can be divided into either regular (raster) or irregular non-overlapping polygons for modelling and analysis. A TIN model allows surface models to be generated efficiently to analyse and display terrain and other types of surfaces. The elevation value of a specific point on the earth’s surface is modelled as the vertex of a triangle, whereas arcs represent the estimation of elevations between two vertices (two points on the earth’s surface). To maintain accuracy in drawing the triangles, that is, to maintain accuracy in elevation modelling, the Delaunay construction rule is applied. According to the Delaunay construction rule, “three points form a Delaunay triangulation if and only if (iff) a circle which passes through all three points contains no other points in the set.” This rule can be used to divide areas of similar slope into irregular triangles. For example, a rectangular region can be divided into two triangles by joining the north-east and south-west corners of the rectangle. By placing a point at the centroid of each triangle, six more non-overlapping triangles can be constructed (Figure 1.12). This process continues until a predefined threshold value is reached.

Figure 1.11 A 4-bit raster file
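The Delaunay construction rule quoted above amounts to an empty-circumcircle test. The following is a minimal sketch of that test using the standard determinant form; the function names are hypothetical, and the sketch checks a single triangle rather than building a full triangulation.

```python
def orientation(a, b, c):
    """Positive if a, b, c are counter-clockwise, negative if clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b, and c."""
    rows = []
    for q in (a, b, c):
        dx, dy = q[0] - p[0], q[1] - p[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    # The sign of det depends on vertex order, so normalise by orientation.
    return det * orientation(a, b, c) > 0


def is_delaunay_triangle(triangle, points):
    """True if no other point of the set lies inside the circumcircle."""
    a, b, c = triangle
    return not any(in_circumcircle(a, b, c, p)
                   for p in points if p not in triangle)
```

Note that the four corners of a rectangle all lie on one circle, so the test reports the fourth corner as not strictly inside; in that degenerate case either diagonal gives an acceptable split.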

Nodes are the elementary building blocks of TIN data. They are connected to their nearest neighbours by edges, according to a set of rules. The user is not responsible for selecting the nodes; all the nodes are added according to a set of rules. The TIN creates triangles from a set of points called mass points, which always become nodes. Mass points can be located anywhere, but the accuracy of the model depends on the proper selection of mass points. Every triangle is assigned a unique identifier and is defined by its three nodes and its two or three neighbouring triangles.

Some properties of a TIN model are as follows.

• The model was developed in the early 1970s as a simple way to build a surface from a set of irregularly spaced points.

• It is a vector-based model (in the form of lines, points, and polygons) dividing a surface into polygons having the attributes of slope, aspect, and area, with three vertices having elevation attributes and three edges with slope and direction attributes.

• Fewer points are required to model the surface; hence, it has a smaller file size.

• It is an irregular model because vertices are scattered in an ad hoc fashion.

• It is simple and economical.

• A TIN can be created using contours [lines through all contiguous points with equal height (or other values)] and breaklines (linear features that define and control the surface behaviour in terms of smoothness and continuity).

Figure 1.12 Triangulated irregular network model
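The slope, aspect, and area attributes of a TIN triangle follow from the elevations at its three vertices. The sketch below derives them from the normal of the plane through the vertices. It assumes x points east and y points north, reports aspect as the downslope direction in degrees clockwise from north, and reports the plan (horizontal) area; the function name is hypothetical.

```python
import math


def triangle_facet(p1, p2, p3):
    """Return (plan_area, slope_degrees, aspect_degrees) for (x, y, z) vertices.

    Aspect is None for a horizontal triangle, where it is undefined.
    """
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # Normal vector of the plane: cross product of two edge vectors.
    nx = uy * vz - uz * vy
    ny = uz * vx - ux * vz
    nz = ux * vy - uy * vx
    if nz < 0:                      # make the normal point upward
        nx, ny, nz = -nx, -ny, -nz
    if nz == 0:
        raise ValueError("vertices are collinear in plan view")
    plan_area = nz / 2.0
    horizontal = math.hypot(nx, ny)
    slope = math.degrees(math.atan2(horizontal, nz))
    # The horizontal part of the upward normal points downslope.
    aspect = math.degrees(math.atan2(nx, ny)) % 360.0 if horizontal > 0 else None
    return plan_area, slope, aspect


# A facet of the plane z = x: it rises towards the east, so it faces west.
area, slope, aspect = triangle_facet((0, 0, 0), (10, 0, 10), (0, 10, 0))
```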

Digital elevation model

A digital elevation model (DEM) is an array of spot heights sampled at regular intervals over a surface. Spot heights, such as the height of the highest point in a given area marked on topographical charts, are expressed in feet or metres above sea level. In a DEM, digital information about surface elevations is presented in raster format. Each pixel value in the grid structure represents the spot height on the surface. Surfaces like the earth’s surface are continuous phenomena; hence, an infinite number of points would be required to represent them exactly, and a DEM approximates them with a finite data set. Specific computer software interprets DEMs by converting them into a three-dimensional depiction of the surface (Figure 1.13). A DEM is the most common and simplest form of digital representation of topography. It is called a digital terrain model when it represents the earth’s surface without objects on it (the bare earth’s surface). It is called a digital surface model when it also represents the heights of landscape features such as trees and buildings. Elevation and height are technically different. Elevation is the height above a given level, especially that of the sea, whereas height is the measurement from base to top.

Figure 1.13 A DEM diagram

Source <http://grass.osgeo.org/uploads/images/Gallery/3D/landsat_RGB_nviz_trento.png>

Some properties of DEM are as follows.

• The accuracy of a DEM depends on its horizontal resolution and on the accuracy of the recorded heights.

• A DEM contains only the specific elevation values at specific grid point locations.

• Elevation contours are specified in DEM representation.

• A DEM is specifically used for many geo-analysis processes such as landslide study and topographical feature extraction.

• A DEM is widely popular for terrain analysis due to its simplicity and extensive software support.

• Resolution (the distance between two adjacent grid points) is the most critical parameter to be decided in a DEM.

• A DEM is used to find features on the terrain, such as drainage basins and watersheds, drainage networks and channels, peaks and pits, and other landforms.
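As a small illustration of feature extraction from a DEM, the sketch below scans a grid of heights for peaks and pits, taken here to mean interior cells that are strictly higher or strictly lower than all eight neighbours. The sample grid and the function names are hypothetical, and practical terrain analysis uses more robust methods than this simple neighbourhood test.

```python
# A made-up 4 x 4 DEM; each value is a spot height in metres.
dem = [
    [10, 11, 12, 11],
    [11, 15, 12, 10],
    [12, 12,  9, 10],
    [11, 10, 10, 11],
]


def neighbour_heights(grid, r, c):
    """Heights of the (up to eight) cells surrounding cell (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                yield grid[rr][cc]


def peaks_and_pits(grid):
    """Interior cells strictly above (peaks) or below (pits) all neighbours."""
    peaks, pits = [], []
    for r in range(1, len(grid) - 1):
        for c in range(1, len(grid[0]) - 1):
            h = grid[r][c]
            around = list(neighbour_heights(grid, r, c))
            if all(v < h for v in around):
                peaks.append((r, c))
            elif all(v > h for v in around):
                pits.append((r, c))
    return peaks, pits


peaks, pits = peaks_and_pits(dem)
```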

Network Models

Network models are graphs consisting of arcs, which represent linear flows, and nodes, which represent the interconnections between the arcs. Nodes can be junctions, and edges can be roads in a network model (Figure 1.14). A network can also be considered a system of vertices and edges, mathematically defined as a graph G = (N, E), where N is the set of nodes and E is the set of edges in the network. Networks are used to store connectivity of source features. Because of their node–arc structure, network models preserve topology and are widely used for allocation, path finding, and tracing. The geometry or topology of a network model should be close to the real-world scenario.

Network models find a connected path through a network; they then analyse and manage the parts and assets associated with it. Arcs in a network model can be broadly classified into two types:

• Directed links are straight lines that connect two nodes (Figure 1.15a).

• Directed chains are arcs with intermediate shape points between two nodes (Figure 1.15b).

Two important aspects of a network model are network topology and feature connectivity. Network models are widely used to analyse vehicle traffic over transportation systems, loads over an electric network, or the spread of pollution along a river.
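Path finding over a node and arc structure can be sketched with Dijkstra's algorithm, one common choice for weighted networks; the text does not prescribe a particular algorithm. The junction names, the arc lengths, and the function names below are hypothetical, and arcs are treated as two-way.

```python
import heapq

# Arcs as (from_node, to_node, length); values are made up for illustration.
edges = [("A", "B", 4.0), ("A", "C", 2.0), ("C", "B", 1.0),
         ("B", "D", 5.0), ("C", "D", 8.0)]


def build_graph(edge_list):
    """Adjacency list for an undirected network."""
    graph = {}
    for u, v, w in edge_list:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def shortest_path(graph, source, target):
    """Return (total_length, [nodes]) or None if target is unreachable."""
    inf = float("inf")
    dist, prev = {source: 0.0}, {}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            break
        if d > dist.get(node, inf):
            continue                      # stale queue entry
        for nxt, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nxt, inf):
                dist[nxt], prev[nxt] = nd, node
                heapq.heappush(queue, (nd, nxt))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


network = build_graph(edges)
route = shortest_path(network, "A", "D")
```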

Relational Models

A model that organizes data into a tabular format is called a relational data model. Relational data models store data in tables. Each table has a unique name and identity. The table has two aspects—a set of columns representing field names and rows containing information.

Rows are known as tuples, and the order in which they occur in a table is immaterial. No two rows can contain the same values for all columns in the table. In a GIS, each row is usually linked to a separate spatial feature. Accordingly, each row consists of several columns, each column containing a specific value for that geographic feature. Data are often stored in several tables (Figure 1.16). Tables can be joined or referenced to each other by common columns (relational fields). The possibility of join operations is what makes relational data models so commonly used in GIS.

Figure 1.14 A network model

Source <http://grass.osgeo.org/screenshots/vector/>

Figure 1.15 Arcs in network chains: (a) directed link and (b) directed chain

The relational database model is the most widely accepted model for managing non-spatial attributional data. It has emerged as the dominant commercial data management tool in GIS implementation and application. A relational data model has the following properties.

• It is simple to organize information into tables and model it.

• Data can be manipulated in an ad hoc manner by joining tables.

• It reduces data redundancy by a proper storage of data tables.

• There is no need to take into account the internal organization of data.
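Joining tables on a common column can be sketched with Python's built-in `sqlite3` module and an in-memory database. The table names, column names, and sample rows are hypothetical and stand in for the attribute tables of spatial features.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two attribute tables linked by the relational field owner_id.
conn.executescript("""
CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY,
                      land_use  TEXT,
                      owner_id  INTEGER);
CREATE TABLE owners  (owner_id  INTEGER PRIMARY KEY,
                      name      TEXT);
INSERT INTO parcels VALUES (1, 'residential', 10),
                           (2, 'farm', 11),
                           (3, 'residential', 10);
INSERT INTO owners  VALUES (10, 'Asha'), (11, 'Ben');
""")

# An ad hoc join: each parcel row is matched to its owner row.
rows = conn.execute("""
SELECT p.parcel_id, p.land_use, o.name
FROM parcels AS p
JOIN owners AS o ON p.owner_id = o.owner_id
ORDER BY p.parcel_id
""").fetchall()
```

Storing each owner once and referring to it by `owner_id` is what keeps redundancy low; the query never needs to know how the tables are laid out internally.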

Object-oriented Models

Object-oriented models store data in objects. These objects can be accessed only by the methods specified by their class (a group of objects with similar attributes and methods) (Figure 1.17). An object-oriented model incorporates the following fundamental concepts.

• Any real-world entity can be modelled as an object. Every object has a unique identification.

• Every object possesses a state (values of different variables at an instance of time) and behaviour (set of methods that operate on the state of the object). The state and methods of an object can be accessed by another object only by passing a message.

• A class is a group of all objects that share the same attributes and methods.

• A class can have a superclass from which it inherits attributes, methods, or both.

Figure 1.16 Relational database

Source <http://grass.osgeo.org/screenshots/vector/>

The essence of an object-oriented model lies in its properties, which are explained as follows.

Encapsulation: Encapsulation is an attribute of object design by virtue of which all the data related to an object are contained by and hidden in the object. The data can only be accessed by members of the object’s class.

Polymorphism: Polymorphism is the occurrence of something in many forms. It is a characteristic that allows an object to have more than one form.

Inheritance: Inheritance is an attribute that allows a superclass to pass its attributes and methods on to its children (subclasses).

In GIS, object-oriented modelling not only allows the data to be held as objects (for example, elements on a map) but also allows these objects to be operated on by their methods and establishes relationships between these objects through message transfer. In this approach, querying is very natural, as features can be bundled together with attributes if the application requires. Object-oriented modelling thus holds many operational benefits with respect to geographic data processing.

Figure 1.17 Object data model
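Encapsulation, inheritance, and polymorphism can be shown together in a short sketch of map features. The class names, attributes, and sample data are hypothetical. Note that Python marks internal state with a leading underscore by convention only and does not enforce hiding.

```python
class Feature:
    """Base class: every feature has an identifier and a name."""

    def __init__(self, feature_id, name):
        self._feature_id = feature_id   # state kept inside the object
        self._name = name

    def area(self):
        return 0.0

    def describe(self):
        # Calls whichever area() the concrete class provides.
        return f"{self._name}: area {self.area():.1f}"


class PointFeature(Feature):
    """Inherits everything, including the zero area, from Feature."""


class PolygonFeature(Feature):
    """Adds a ring of (x, y) vertices and overrides area()."""

    def __init__(self, feature_id, name, ring):
        super().__init__(feature_id, name)
        self._ring = list(ring)

    def area(self):
        # Shoelace formula over consecutive vertex pairs.
        total = 0.0
        shifted = self._ring[1:] + self._ring[:1]
        for (x1, y1), (x2, y2) in zip(self._ring, shifted):
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


features = [
    PointFeature(1, "Well 7"),
    PolygonFeature(2, "Parcel A", [(0, 0), (4, 0), (4, 3), (0, 3)]),
]
descriptions = [f.describe() for f in features]
```

The same `describe()` message produces different results depending on the class of the receiving object, which is the polymorphic behaviour described above.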

Hierarchical Models

Hierarchical models present data as a family tree in which each record has only one parent. Figure 1.18 presents a hierarchical data model representing an animal family. In this classical data model, data sets and their subsets are organized in layers in a parent–child structure.

Hierarchical models are similar to the classic file structure of data in computers. They are the oldest type of data model. They support only a one-to-many relationship among data items. Actual geographical phenomena may not allow the number of parents to be limited to one; thus, this model has very limited scope in GIS applications.
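The single-parent restriction can be sketched with a mapping from each record to its parent. The record names below are hypothetical and only loosely follow the animal family example.

```python
# Each record maps to exactly one parent; the root maps to None.
parent = {
    "Animal": None,
    "Mammal": "Animal",
    "Bird": "Animal",
    "Dog": "Mammal",
    "Cat": "Mammal",
    "Eagle": "Bird",
}


def children(record):
    """One-to-many: a record may have any number of children."""
    return sorted(child for child, p in parent.items() if p == record)


def path_to_root(record):
    """Follow the single parent link upward until the root is reached."""
    path = [record]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path
```

Because a key can hold only one parent, a feature that belongs to two groups at once, such as a road segment crossing two districts, cannot be represented without duplicating it.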

Semantic Data Models

Semantic data models (SDMs) represent data in logical structures. They focus on providing the meaning of data along with attributes and interrelationships with other data. In semantic data models, an entity represents an aspect or a phenomenon of the real world. They support dynamic schema evolution to capture new or evolving types of semantic information. Semantic models are widely used in natural language processing to define the semantic context of entities (words) used at any instance. They follow an arc–node structure where a node represents a basic entity and an arc represents the relationship between entities (Figure 1.19). SDMs incorporate two types of relationship between entities: the “is-a” (inheritance) relationship and the “has-a” (membership) relationship.
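The arc and node structure can be sketched as a list of (entity, relationship, entity) triples. The entity names and the helpers `objects` and `is_a_chain` are hypothetical and serve only to show how the two relationship types are stored and queried.

```python
# Each arc links two entity nodes with a named relationship.
triples = [
    ("river", "is-a", "watercourse"),
    ("canal", "is-a", "watercourse"),
    ("watercourse", "is-a", "hydrographic_feature"),
    ("river", "has-a", "source"),
    ("river", "has-a", "mouth"),
]


def objects(subject, relation):
    """Entities reached from subject along arcs of the given relationship."""
    return sorted(o for s, r, o in triples if s == subject and r == relation)


def is_a_chain(entity):
    """Follow "is-a" arcs upward from entity, most specific first."""
    chain = []
    current = entity
    while True:
        parents = objects(current, "is-a")
        if not parents:
            return chain
        current = parents[0]
        chain.append(current)
```

New kinds of relationship can be added simply by appending triples with a new label, which hints at how such a model accommodates an evolving schema.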

Conceptual Models

Conceptual models are a type of abstraction that uses logical concepts and hides the details of implementation and data storage. Conceptual models are the most abstract form of data model. Detailed information, such as data types, is omitted from conceptual data models. There are two standard ways in which spatial information is modelled conceptually: object-based and field-based models.

Figure 1.18 A hierarchical model

Object-based models

Object-based models represent information as discrete geo-referenced entities. Each entity has an x–y coordinate pair associated with it, defining its location in the real world. Because this conceptual model is focused on objects, its implementation yields data models and structures that are also focused on objects.

Field-based models

Field-based models represent information as collections of spatial distributions, where each distribution is formalized as a mathematical function from a spatial framework to an attribute domain. The spatial framework is a division of the area into a finite tessellation of spatial units.
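The contrast between the two conceptual views can be sketched in a few lines. The well records, the 2 × 3 grid, and the temperature values are hypothetical; the point is only that the object view lists discrete entities, while the field view assigns a value to every unit of a finite tessellation.

```python
# Object-based view: discrete geo-referenced entities with x-y coordinates.
wells = [
    {"name": "well_1", "x": 12.5, "y": 40.0},
    {"name": "well_2", "x": 30.0, "y": 18.5},
]

# Field-based view: a spatial framework of 2 rows x 3 columns of cells...
framework = [(row, col) for row in range(2) for col in range(3)]

# ...and a function from every cell of the framework to an attribute value.
temperature = {
    (0, 0): 21.5, (0, 1): 22.0, (0, 2): 22.4,
    (1, 0): 20.9, (1, 1): 21.3, (1, 2): 21.8,
}


def field_value(cell):
    """Value of the temperature field at the given cell."""
    return temperature[cell]


# A field is defined everywhere on its framework; objects exist only
# where an entity has been recorded.
covered = all(cell in temperature for cell in framework)
```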
