Rhodes University

Department of Computer Science

Computer Science Honours

Literature Review

A Comparative Investigation and Evaluation of Oracle 9i and SQL Server 2000 with Respect to Integrity and SQL Language Standards

By: Paul Tarwireyi, g05t5139

Supervisor: John Ebden

1. Abstract

The constantly evolving nature of RDBMSs has led to database wars among the various vendors in the market. Each vendor claims that its product is superior, which makes choosing an RDBMS a difficult task for a DBA. It is therefore imperative to research not only how to use these tools properly, but also how well they conform to expectations and specifications. This review presents a comparative overview of Oracle 9i and SQL Server 2000 with respect to integrity and conformity to SQL standards.

2. Introduction

For the modern business, information is the most valuable asset at its disposal in its endeavor to drive competitiveness, and only those who make better use of that information will achieve better results. This, however, depends largely on the RDBMS in use. Data is just bits and bytes on a file system; only a database can turn these bits and bytes into business information [Oracle: 2002]. [Rob and Coronel, 2002:342] and [Mullins C S: 2002] state that selecting the right RDBMS is critical to the smooth flow of information and that it requires skill, knowledge, and consideration.


However, [Mullins C S: 2002] is of the opinion that the selection task is not as difficult as it used to be, mainly because the number of major DBMS vendors has dwindled through industry consolidation and the market is now dominated by a few very large players.

This selection depends on a number of factors [Rob and Coronel, 2002:342], which include maintaining data consistency, cost, application requirements, personal preferences, office politics and others. From this checklist we see that it is crucial to try to break the systems in every possible way in order to increase the probability of detecting shortfalls [Phadke M S: 2005].

3. DBMS theory

A wide variety of sources are available on RDBMSs and their related concepts, but it is crucial to first know the basic principles and issues behind them. Only then can one examine how the different vendors have implemented these principles in their tools. Thus, knowledge of cases is considered an important goal of comparative research, independent of any other goal [Ragin C: 1994].

4. Integrity

One of the major drivers behind the development of RDBMSs was the need to ensure data consistency, yet integrity is not an obvious topic for administrators to address directly, and it has been totally ignored by database benchmarks.

According to [Mullins C S: 2002], a database is of little use if the data it contains is inaccurate. A general dictionary definition [Houghton Mifflin Company, 2000] describes integrity as steadfast adherence to a strict moral or ethical code. [Mullins C S: 2002] goes on to say that, for databases, integrity can be classified as database structure integrity and semantic data integrity.

4.1. Database Structure Integrity

This refers to the architectural, internal structures and pointers used to keep database objects in the proper order. If these are disturbed in any way, database access will be compromised. Transact-SQL has a set of DBCC statements used to verify this kind of integrity [Microsoft: 2005].


4.2. Semantic Data Integrity

This deals with the DBMS features and processes that can be used to ensure data consistency. RDBMSs automatically enforce integrity up to a point; beyond that, DBAs have to ask themselves how best to enforce data integrity, because the RDBMS will not protect them from inept handling of transactions. According to [Rankins R, Bertucci P, Jensen P. 2002], [Oracle: 2005] and [Microsoft: 2005], semantic integrity can be classified as follows:

4.2.1 Entity Integrity

This type of integrity ensures that each record in a database is uniquely identified. This is usually enforced by making use of the PRIMARY KEY (PK). The SQL-92 standard requires that all values in a primary key be unique and that the column not allow null values. Both Oracle [Oracle: 2005] and Microsoft SQL Server [Microsoft: 2003] enforce uniqueness by automatically creating unique indexes whenever a PRIMARY KEY or UNIQUE constraint is defined. Additionally, primary key columns are automatically defined as NOT NULL.
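The behaviour described above can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in engine, not Oracle 9i or SQL Server 2000, and the table and column names are hypothetical. NOT NULL is written out explicitly because SQLite, unlike the two products discussed here, does not imply it for every primary key.

```python
import sqlite3


def rejected(conn, sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


def demo_entity_integrity():
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; the primary key must be unique and non-null.
    conn.execute(
        "CREATE TABLE student (student_no TEXT NOT NULL PRIMARY KEY, name TEXT)"
    )
    conn.execute("INSERT INTO student VALUES ('s001', 'first row')")
    duplicate = rejected(conn, "INSERT INTO student VALUES ('s001', 'second row')")
    null_key = rejected(conn, "INSERT INTO student VALUES (NULL, 'no key')")
    conn.close()
    return duplicate, null_key
```

Calling demo_entity_integrity() reports whether the duplicate key and the null key were each rejected.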

4.2.2 Domain Integrity

This operates at field level and concerns the permissible entries that a column can have. According to [Oracle: 2005] and [Microsoft: 2005], domain integrity can be enforced by restricting the type (through data types), the format (through CHECK constraints and rules), or the range of possible values (through FOREIGN KEY constraints, CHECK constraints, DEFAULT definitions, NOT NULL definitions, and rules). Oracle treats a default as a column property, whereas Microsoft SQL Server treats a default as a constraint. The SQL Server DEFAULT constraint can contain constant values or NULL. The syntax used to define CHECK constraints is the same in Oracle and SQL Server, and both create column constraints to enforce nullability.

In both products, columns default to allowing NULL unless NOT NULL is specified in the CREATE TABLE or ALTER TABLE statement [Sheldon R, Wilansky E. 2001].
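As a rough illustration of the three mechanisms named above (DEFAULT, CHECK and NOT NULL), the following sketch again uses SQLite through Python's sqlite3 module as a stand-in. The table, columns and value range are assumptions made for the example, and vendor syntax for defaults differs in detail.

```python
import sqlite3


def rejected(conn, sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


def demo_domain_integrity():
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: the mark must lie in 0..100 and defaults to 0.
    conn.execute(
        """
        CREATE TABLE module_mark (
            module_code TEXT NOT NULL,
            mark INTEGER DEFAULT 0 CHECK (mark BETWEEN 0 AND 100)
        )
        """
    )
    # Omitting the mark lets the DEFAULT definition supply it.
    conn.execute("INSERT INTO module_mark (module_code) VALUES ('CS401')")
    default_mark = conn.execute("SELECT mark FROM module_mark").fetchone()[0]
    # A value outside the permitted range fails the CHECK constraint.
    out_of_range = rejected(conn, "INSERT INTO module_mark VALUES ('CS402', 150)")
    # Omitting the module code violates NOT NULL.
    missing_code = rejected(conn, "INSERT INTO module_mark (mark) VALUES (50)")
    conn.close()
    return default_mark, out_of_range, missing_code
```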


4.2.3 Referential Integrity

This is concerned with keeping the relationships between tables synchronized. For this type of integrity to be maintained, a FOREIGN KEY (FK) in a "child table" should only accept values that exist in the "parent table". In SQL Server 2000 and Oracle 9i, referential integrity is based on relationships between foreign keys and primary keys or between foreign keys and unique keys (through FOREIGN KEY and CHECK constraints). This ensures that values are consistent across the tables. The rules for defining foreign keys are broadly similar in each RDBMS.

Threats to Referential Integrity

Update threat

This can produce orphans when either the PK in the parent table or the FK in the child table is updated without any synchronization mechanism. The ON DELETE and ON UPDATE clauses of the FOREIGN KEY constraint are used to address this [Microsoft: 2005].

Insert threat

This occurs when we allow insertion of records in the child table, with no associated records in the parent table.

Delete threat

This occurs when we delete records from the parent table and do nothing about the corresponding records in the child table.
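The three threats can be exercised in a small sketch. SQLite (via Python's sqlite3 module) stands in for the two products, and the department and employee tables are hypothetical. Support for the ON DELETE and ON UPDATE actions differs between vendors, so the clauses below should be checked against the documentation of the product in use.

```python
import sqlite3


def demo_referential_integrity():
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys once this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        """
        CREATE TABLE employee (
            emp_id INTEGER PRIMARY KEY,
            dept_id INTEGER NOT NULL
                REFERENCES department (dept_id)
                ON DELETE CASCADE ON UPDATE CASCADE
        )
        """
    )
    conn.execute("INSERT INTO department VALUES (1, 'Computer Science')")
    conn.execute("INSERT INTO employee VALUES (10, 1)")

    # Insert threat: a child row without a parent row is rejected.
    try:
        conn.execute("INSERT INTO employee VALUES (11, 99)")
        orphan_rejected = False
    except sqlite3.IntegrityError:
        orphan_rejected = True

    # Update threat: changing the parent key is propagated to the child.
    conn.execute("UPDATE department SET dept_id = 2 WHERE dept_id = 1")
    child_fk = conn.execute(
        "SELECT dept_id FROM employee WHERE emp_id = 10"
    ).fetchone()[0]

    # Delete threat: deleting the parent removes its child rows too.
    conn.execute("DELETE FROM department WHERE dept_id = 2")
    remaining = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
    conn.close()
    return orphan_rejected, child_fk, remaining
```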

4.2.4 User-defined integrity

This refers to the specific business rules not covered by the other types of integrity. In both DBMSs it is usually implemented with triggers and stored procedures, though the two products implement these differently.

Also of utmost importance is the normalization of tables. The integrity rules will be useless unless your tables are normalized [Microsoft: 2005].


4.2.5 Transactions

According to [Greenwald R, Kreines D C. 2002], transactions are the cornerstone of data integrity in multi-user databases and the foundation of all concurrency schemes. A transaction is defined as a single indivisible piece of work that affects some data. All changes made to data within a transaction are either uniformly applied to a relational database with a COMMIT statement, or the data affected by the changes is uniformly returned to its initial state with a ROLLBACK statement [Rob and Coronel, 2002]. Once a transaction is committed, the changes made by that transaction become permanent and are made visible to other transactions and other users.

By default, SQL Server runs in autocommit mode: it automatically performs a COMMIT after every insert, update, or delete operation, so the change cannot be rolled back unless the statement was issued inside an explicit transaction. In Oracle, by contrast, a transaction is started automatically when an insert, update, or delete operation is performed, and the application must issue a COMMIT command to save the changes to the database [Greenwald R, Kreines D C. 2002].
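The all-or-nothing behaviour of COMMIT and ROLLBACK can be sketched as follows. SQLite is used as a stand-in through Python's sqlite3 module, with transactions opened explicitly; the account table and amounts are hypothetical, and the sketch does not reproduce the default transaction modes of Oracle or SQL Server.

```python
import sqlite3


def balances(conn):
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()


def demo_transactions():
    # isolation_level=None leaves transaction control to explicit statements.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (1, 100), (2, 0)")

    # Both updates become permanent together at COMMIT.
    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
    conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")
    conn.execute("COMMIT")
    after_commit = balances(conn)

    # ROLLBACK returns the data to its state before the transaction began.
    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = balance - 500 WHERE id = 1")
    conn.execute("ROLLBACK")
    after_rollback = balances(conn)
    conn.close()
    return after_commit, after_rollback
```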

4.2.6 Locking and Concurrency

As [Greenwald R, Kreines D C. 2002] note, one of the key functions of an RDBMS is to enable multiple users to read and/or write records in a database concurrently without compromising the consistency of the data. Oracle and SQL Server use different approaches to solve this problem.

Oracle uses multi-version read consistency (MVRC), which guarantees that a user sees a consistent view of the data he requests. If another user changes the underlying data during the query execution, Oracle maintains a version of the data as it existed at the time the query began. If transactions were underway but uncommitted at the time the query began, Oracle will ensure that the query does not see the changes made by those transactions. The data returned to the query will reflect all transactions committed at the time the query started [Greenwald R, Kreines D C. 2002] and [Oracle: 2005]. Thus with Oracle, "readers don't block writers and writers don't block readers". This technique has its merits and demerits; one merit, for example, is improved performance, since readers never wait for writers.
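The idea of reading data "as it existed at the time the query began" can be pictured with a toy in-memory model. This is only a conceptual sketch and not Oracle's actual mechanism; the class and method names are invented for the illustration, and every committed version is simply kept in a list.

```python
class VersionedStore:
    """Toy multi-version store: each commit appends a new version per key."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_no, value)
        self._commit_no = 0

    def commit(self, changes):
        """Apply a dict of changes as one committed transaction."""
        self._commit_no += 1
        for key, value in changes.items():
            self._versions.setdefault(key, []).append((self._commit_no, value))

    def snapshot(self):
        """Mark the point in time at which a query begins."""
        return self._commit_no

    def read(self, key, snapshot):
        """Return the latest value committed at or before the snapshot."""
        visible = [v for n, v in self._versions.get(key, []) if n <= snapshot]
        return visible[-1] if visible else None
```

A query that took its snapshot before a later commit keeps seeing the older value, and the writer was never blocked by the reader:

```python
store = VersionedStore()
store.commit({"x": 1})
query_start = store.snapshot()
store.commit({"x": 2})  # a writer commits while the query is running
old_value = store.read("x", query_start)       # value as of query start
new_value = store.read("x", store.snapshot())  # value seen by a new query
```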


[Greenwald R, Kreines D C. 2002] and [Microsoft: 2003] point out that SQL Server, in contrast, uses shared locks to ensure that readers only see committed data. These shared locks do not affect other readers. A reader waits for a writer to commit its changes before reading a record, and a reader holding shared locks blocks a writer trying to update the same data. Thus with SQL Server "writers block readers and readers block writers" to ensure data integrity. This technique also has its merits and demerits; for example, it can cause many waits in heavy OLTP environments.

4.2.7. Handling Deadlocks

A deadlock is a scenario in which process A locks a resource x that is needed by another process B, while process B locks a resource y that is needed by A, so that neither can proceed. This can be illustrated by the example shown in the diagram below. [Microsoft: 2005]

Both Oracle and SQL Server have mechanisms for detecting and resolving deadlocks, thus enabling smooth concurrent access to database resources.
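One textbook way to detect the situation described above is to look for a cycle in a "wait-for" graph, where an edge from A to B means that A is waiting for a lock held by B. The sketch below is a generic illustration under that assumption; it is not taken from either vendor's documentation, and the function name is hypothetical.

```python
def find_deadlock(waits_for):
    """Return the processes forming a wait-for cycle, or None if there is none.

    waits_for maps each process to the processes whose locks it is waiting on.
    """
    visited = set()
    on_path = []

    def visit(process):
        if process in on_path:
            # We came back to a process already on the current path: a cycle.
            return on_path[on_path.index(process):]
        if process in visited:
            return None
        visited.add(process)
        on_path.append(process)
        for blocker in waits_for.get(process, ()):
            cycle = visit(blocker)
            if cycle:
                return cycle
        on_path.pop()
        return None

    for process in waits_for:
        cycle = visit(process)
        if cycle:
            return cycle
    return None
```

For the two-process example in the text, find_deadlock({"A": ["B"], "B": ["A"]}) reports the cycle between A and B, after which a DBMS would typically resolve it by rolling back one of the participants.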


5 SQL Standards

According to [Rosenzweig B, Silvestrova E. 2003], [Kline K E. 2004] and [Beaulieu A, Mishra S. 2002], in the early 1970s the work of the IBM research fellow Dr. E. F. Codd on applying mathematical relational theory to databases led to the emergence of a relational data model product called SEQUEL, or Structured English Query Language. SEQUEL ultimately became SQL, or Structured Query Language.

Due to the rapid increase in the number of vendors in the market, SQL dialects began to proliferate. Over time, SQL proved popular enough in the marketplace to attract the attention of the American National Standards Institute (ANSI), which released standards for SQL in 1986, 1989, 1992, 1999, and 2003. The aim was to make migration to third-party applications easier, without the need to modify the SQL code, thus reducing vendor dependency. But even though there are now standards, dialects continue to persist. According to [Kline K E. 2004], this is mainly because the user community of a given database vendor often requires capabilities in the database before the ANSI committee has created a standard, and because some of the earliest vendors from the 1980s have variances in the most elementary commands, such as SELECT, since their implementations predate the standards.

5.1. SQL2003

SQL99 had two main parts, Foundation: 1999 and Bindings: 1999. The mandatory portion of SQL: 2003 is known as Core SQL: 2003 and is found in SQL: 2003 Part 2 (Foundation) and Part 11 (Schemata), [Oracle: 2003]. The SQL2003 Foundation section includes all of the Foundation and Bindings standards from SQL99, but a new section called Schemata was created. [Kline K E. 2004]

ANSI standards specify, among other things, the syntax of commands. For example, they specify datatypes. Each column in a table must be defined with a datatype that provides a general classification of the data that the column will store. This enables better, more understandable queries and helps control the integrity of data. Although vendors specify "datatypes" that correspond to the SQL2003 datatypes, these are not always true SQL2003 datatypes. For example, MySQL's implementation of a BIT datatype is actually identical to a CHAR(1) datatype value.

5.2. Conformity

Database vendors and third-party software companies have had varying levels of conformance to these standards. Most major database vendors support the SQL-92 standard, but generally they have their own extensions to SQL; Oracle uses PL/SQL whilst Microsoft uses T-SQL.

Oracle has made efforts to conform to the ANSI standard. In Oracle 8i, Oracle 9i, and Oracle 10g, Oracle introduced a number of enhancements to conform to the SQL-99 standard [Rosenzweig B and Silvestrova E, 2003], and according to [Mishra S, Beaulieu A: 2002] Oracle9i introduced the ANSI standard join syntax. The new join syntax is not only SQL92 compliant but also elegant, and it makes the outer join syntax more intuitive.

According to [Microsoft: 2005], SQL Server 2000 supports the Entry Level of SQL-92. This suggests that these DBMSs are still trying to comply with SQL-92.

SQL has many commands, and the standards specify their use and syntax. For example, [Gulutzan P: 2005] and [Troels A: 2005] looked at how views are implemented in DB2, SQL Server, and Oracle and found that all three RDBMSs handle views according to SQL:99 without any problem. There was, however, a problem with Oracle 9i and SQL Server when the table from which a view was created is dropped: there is no cascading deletion of the view, so the view remains dangling and invalid. If another table with the same name is then created, the view on the old table becomes valid again. Apart from being a potential security flaw and a violation of the SQL standard, this shows that there may be a lot of mess internally even though conformance appears to be acceptable.
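The dangling-view effect can be reproduced in a small experiment. The sketch below uses SQLite through Python's sqlite3 module, which happens to leave views in place when their base table is dropped; it is a stand-in only and says nothing about the exact error messages or behaviour of Oracle 9i or SQL Server. The table and view names are hypothetical.

```python
import sqlite3


def demo_dangling_view():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staff (name TEXT)")
    conn.execute("CREATE VIEW staff_names AS SELECT name FROM staff")

    # Dropping the base table does not cascade to the view.
    conn.execute("DROP TABLE staff")
    view_survives = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master "
        "WHERE type = 'view' AND name = 'staff_names'"
    ).fetchone()[0] == 1

    # The surviving view is invalid: querying it fails.
    try:
        conn.execute("SELECT * FROM staff_names")
        broken = False
    except sqlite3.OperationalError:
        broken = True

    # A new table with the same name silently revalidates the old view.
    conn.execute("CREATE TABLE staff (name TEXT)")
    conn.execute("INSERT INTO staff VALUES ('new row')")
    rows = conn.execute("SELECT * FROM staff_names").fetchall()
    conn.close()
    return view_survives, broken, rows
```

The last step is the security concern raised above: the old view now exposes whatever is placed in the new table.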

[Troels A: 2005] summarises the conformity to the views standards as is shown in the table below.


Views

Standard: SQL:2003 has a rather complicated set of rules governing when a view is updatable, basically saying that a view is updatable as long as the update operation translates into an unambiguous change. SQL-92 was more restrictive, specifying that updatable views cannot be derived from more than one base table.

PostgreSQL: Has views, but breaks the standard by not allowing updates to views; offers the non-standard 'rules' system as a work-around.

DB2: Conforms to at least SQL-92.

MSSQL: Conforms to at least SQL-92.

MySQL: Breaks the standard by not offering views.

Oracle: Conforms to at least SQL-92.

He also looked at various other features, including inserting several rows at a time.

Inserting several rows at a time

Standard: An optional SQL feature is row value constructors (feature ID F641). One handy use of row value constructors is when inserting several rows at a time, such as:

INSERT INTO tablename

VALUES (0,'foo') , (1,'bar') , (2,'baz');

which can be seen as shorthand for

INSERT INTO tablename VALUES (0,'foo');

INSERT INTO tablename VALUES (1,'bar');

INSERT INTO tablename VALUES (2,'baz');

PostgreSQL: Not supported.

DB2: Supported.

MSSQL: Not supported.

MySQL: Supported.

Oracle: Not supported.
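The claimed equivalence between the multi-row form and the three single-row statements can be checked with a short sketch. SQLite is used here as a stand-in because recent versions accept the multi-row VALUES syntax; the support list above, taken from [Troels A: 2005], describes other products as they stood at the time. The table names t_multi and t_single are hypothetical.

```python
import sqlite3


def demo_multirow_insert():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t_multi (id INTEGER, label TEXT)")
    conn.execute("CREATE TABLE t_single (id INTEGER, label TEXT)")

    # One statement using row value constructors.
    conn.execute("INSERT INTO t_multi VALUES (0,'foo'), (1,'bar'), (2,'baz')")

    # The longhand form: one INSERT per row.
    for row in [(0, "foo"), (1, "bar"), (2, "baz")]:
        conn.execute("INSERT INTO t_single VALUES (?, ?)", row)

    multi = conn.execute("SELECT id, label FROM t_multi ORDER BY id").fetchall()
    single = conn.execute("SELECT id, label FROM t_single ORDER BY id").fetchall()
    conn.close()
    return multi, single
```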

He also went on to evaluate the datatypes and found that the DBMSs generally conform to the standards.

6. Conclusion

Generally, both RDBMSs have made great strides towards enforcing data integrity, and at times they even go beyond the call of duty. They are, however, struggling to keep pace with the SQL standards, as evidenced by the fact that they can only confidently claim conformity to SQL-92. A possible reason is that vendors are reluctant to embrace standards.

According to [Troels A: 2005] and [Rosenzweig B and Silvestrova E, 2003], there are still many aspects that ANSI has not yet addressed. These include the use of programming languages to augment SQL: Microsoft uses Visual Studio and Oracle uses Java, and there is no standard specifying the use of these programming languages.


7. References

Coronel C & Rob P. 2002, Database systems design, implementation and management 5th Ed, Course Technology, USA

Kline K E. 2004, SQL in a Nutshell, 2nd Edition, O'Reilly

Houghton Mifflin Company. 2000, The American Heritage Dictionary of the English Language, Fourth Edition

Mullins C S. 2002, Database Administration: The Complete Guide to Practices and Procedures, Addison Wesley

Beaulieu A, Mishra S. 2002, Mastering Oracle SQL, O'Reilly

Rankins R, Bertucci P, Jensen P. 2002, Microsoft® SQL Server™ 2000 Unleashed, Second Edition, Sams Publishing

Greenwald R, Kreines D C. 2002, Oracle in a Nutshell, O'Reilly

Rosenzweig B, Silvestrova E. 2003, Oracle® PL/SQL by Example, Third Edition, Prentice Hall PTR

Sheldon R, Wilansky E. 2001, MCSE Exam 70-229: Microsoft® SQL Server 2000 Database Design and Implementation - Training Kit, Microsoft Press

Ragin, Charles: 1994, Constructing Social Research: The Unity and Diversity of Method, Northwestern University, Pine Forge, Thousand Oaks

Peter Gulutzan: 2005, http://www.dbazine.com/db2/db2-disarticles/gulutzan9 last updated 18/04/05 accessed 21/06/05

Mishra S, Beaulieu A: 2002, http://www.oreillynet.com/pub/a/network/2002/04/23/fulljoin.html accessed 10/03/05

Troels A: 2005, Comparison of different SQL implementations http://troels.arvin.dk/db/rdbms/ accessed 21/06/05

Phadke M S: 2005, http://www.isixsigma.com/library/content/c030106a.asp#about accessed 03/06/05

Microsoft: 2003, http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_sa-ses_9sfo.asp accessed 16/03/05

Microsoft: 2005, http://msdn.microsoft.com/library/sql/ accessed 12/03/05

Oracle: 2005, http://download-west.oracle.com/docs/cd/B12037_01/server.101/b10759/toc.htm accessed 10/04/05
