Intro to Database
Prof. Ali Eldesouki
Dr. Mohamed Abd Elfattah
Course Objectives
• Describe the fundamental elements of relational database management systems
• Explain the basic concepts of relational data model, entity-relationship model, relational database design, relational algebra and SQL.
• Design ER-models to represent simple database application scenarios
• Convert the ER-model to relational tables, populate relational database and formulate SQL queries on data.
• Improve the database design by normalization.
• Create real projects based on Windows applications (C#).
Some Definitions
What is a database?
A database is information that is set up for easy access, management and updating. Computer databases typically store aggregations of data records or files that contain information, such as sales transactions, customer data, financials and product information.
Databases are used for storing, maintaining and accessing any sort of data. They collect information on people, places or things. That information is gathered in one place so that it can be observed and analyzed. Databases can be thought of as an organized collection of information.
What are databases used for?
Businesses use data stored in databases to make informed business decisions. Some of the ways organizations use databases include the following:
• Improve business processes. Companies collect data about business processes, such as sales, order processing and customer service. They analyze that data to improve these processes, expand their business and grow revenue.
• Keep track of customers. Databases often store information about people, such as customers or users. For example, social media platforms use databases to store user information, such as names, email addresses and user behavior. The data is used to recommend content to users and improve the user experience.
• Secure personal health information. Healthcare providers use databases to securely store personal health data to inform and improve patient care.
• Store personal data. Databases can also be used to store personal information. For example, personal cloud storage is available for individual users to store media, such as photos, in a managed cloud.
Types of databases
There are many types of databases. They may be classified according to content type: bibliographic, full text, numeric and images. In computing, databases are often classified based on the organizational approach they use.
Relational. This tabular approach defines data so it can be reorganized and accessed in many ways. Relational databases are composed of tables.
Distributed. This database stores records or files in several physical locations. Data processing is also spread out and replicated across different parts of the network.
Cloud. These databases are built in a public, private or hybrid cloud for a virtualized environment.
NoSQL. NoSQL databases are good when dealing with large collections of distributed data. They can often address big data performance issues better than relational databases.
Object-oriented. These databases hold data created using object-oriented programming languages. They focus on organizing objects rather than actions and data rather than logic.
Graph. These databases are a type of NoSQL database. They store, map and query relationships using concepts from graph theory. Graph databases are made up of nodes and edges. Nodes are entities, and edges connect the nodes.
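The node-and-edge idea behind graph databases can be sketched in a few lines of plain Python. This is only an in-memory illustration, not how any particular graph database product works; the node names, the relationship labels and the `neighbors` helper are all hypothetical.

```python
# Nodes are entities; each one carries a few properties.
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "db101": {"kind": "course"},
}

# Edges connect nodes: (source node, relationship label, target node).
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("alice", "ENROLLED_IN", "db101"),
    ("bob", "ENROLLED_IN", "db101"),
]


def neighbors(node, relationship):
    """Return the nodes reached from `node` by edges with the given label."""
    return sorted(
        target
        for source, label, target in edges
        if source == node and label == relationship
    )


# Who is enrolled in db101? Follow ENROLLED_IN edges backwards.
enrolled = sorted(
    source
    for source, label, target in edges
    if label == "ENROLLED_IN" and target == "db101"
)

print(neighbors("alice", "FOLLOWS"))
print(enrolled)
```

Queries in a graph database are essentially traversals of this kind: start at a node and follow edges of a given type.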
Relational Database Model
Concepts
Tables − In the relational data model, relations are saved in the format of tables. This format stores the relation among entities. A table has rows and columns, where rows represent records and columns represent the attributes.
Tuple − A single row of a table, which contains a single record for that relation, is called a tuple.
Relation instance − A finite set of tuples in the relational database system represents a relation instance. Relation instances do not have duplicate tuples.
Relation schema − A relation schema describes the relation name (table name), attributes, and their names.
Relation key − Each row has one or more attributes, known as the relation key, which can identify the row in the relation (table) uniquely.
Attribute domain − Every attribute has some pre-defined value scope, known as attribute domain.
The main highlights of this model are −
• Data is stored in tables called relations.
• Relations can be normalized.
• In normalized relations, values saved are atomic values.
• Each row in a relation is unique; no two rows hold exactly the same values.
• Each column in a relation contains values from the same domain.
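The concepts above can be tried out with a small script. The sketch below uses Python's built-in `sqlite3` module and a hypothetical `student` table; the table, its columns and the age range are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Relation schema: relation name, attribute names and attribute domains.
# student_id is the relation key; the CHECK narrows the domain of age.
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT    NOT NULL,
        age        INTEGER CHECK (age BETWEEN 16 AND 100)
    )
""")

# Relation instance: the finite set of tuples currently in the table.
conn.executemany(
    "INSERT INTO student (student_id, name, age) VALUES (?, ?, ?)",
    [(1, "Mona", 20), (2, "Omar", 22)],
)

# A second tuple with the same key is rejected, so rows stay identifiable.
try:
    conn.execute("INSERT INTO student VALUES (1, 'Sara', 21)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A value outside the attribute domain is rejected as well.
try:
    conn.execute("INSERT INTO student VALUES (3, 'Ali', 5)")
    out_of_domain_rejected = False
except sqlite3.IntegrityError:
    out_of_domain_rejected = True

rows = conn.execute(
    "SELECT student_id, name, age FROM student ORDER BY student_id"
).fetchall()
print(rows, duplicate_rejected, out_of_domain_rejected)
```

Each element of `rows` is one tuple of the relation, and both rejected inserts leave the relation instance unchanged.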
Database Schema
A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among them are associated. It formulates all the constraints that are to be applied on the data.
A database schema can be divided broadly into two categories:
Physical Database Schema − This schema pertains to the actual storage of data and its form of storage, such as files and indices. It defines how the data will be stored in secondary storage.
Logical Database Schema − This schema defines all the logical constraints that need to be applied on the data stored. It defines tables, views, and integrity constraints.
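As a rough illustration of what a logical schema contains, the sketch below defines two tables, one view and a few integrity constraints using Python's built-in `sqlite3` module. The `department` and `employee` tables and the `employee_directory` view are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Logical schema: tables, a view and integrity constraints.
conn.executescript("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL UNIQUE
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        salary  REAL CHECK (salary >= 0),
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    CREATE VIEW employee_directory AS
        SELECT e.name AS employee, d.name AS department
        FROM employee AS e
        JOIN department AS d ON d.dept_id = e.dept_id;
""")

conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Hana', 5000.0, 10)")

# Referential integrity: an employee cannot point at a missing department.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Karim', 4000.0, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

directory = conn.execute(
    "SELECT employee, department FROM employee_directory"
).fetchall()
print(directory, orphan_rejected)
```

Nothing in this script says where or how the rows are stored on disk; that is the concern of the physical schema.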
Data Independence
A database system normally contains a lot of data in addition to users’ data. For example, it stores data about data, known as metadata, to locate and retrieve data easily. It is rather difficult to modify or update a set of metadata once it is stored in the database.
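Logical Data Independence
Logical data is data about the database; that is, it stores information about how data is managed inside. An example is a table (relation) stored in the database, together with all the constraints applied on that relation. Logical data independence is the ability to change this logical schema without having to change the data stored on disk or the applications that use it.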
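One common textbook reading of logical data independence is that the logical schema can evolve without breaking the programs that already use it. The sketch below, using Python's built-in `sqlite3` module, adds a column to a hypothetical `customer` table and checks that an existing query still returns the same rows. The table, the `email` column and the `customer_names` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Laila')")


def customer_names(connection):
    """Application query: depends only on the columns it names."""
    return connection.execute(
        "SELECT customer_id, name FROM customer ORDER BY customer_id"
    ).fetchall()


before = customer_names(conn)

# Logical schema change: the relation gains a new attribute.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")

after = customer_names(conn)
print(before, after)
```

The query keeps working because it names the columns it needs; a query written as `SELECT *` would see its result shape change after the same alteration.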
Physical Data Independence
All the schemas are logical, and the actual data is stored in bit format on the disk. Physical data independence is the power to change the physical data without impacting the schema or logical data.
For example, suppose we want to change or upgrade the storage system itself by replacing hard disks with SSDs. This should not have any impact on the logical data or schemas.
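Swapping disks cannot be shown in a script, but a smaller physical change can: adding an index alters how the data is stored and looked up without touching the logical schema. The sketch below uses Python's built-in `sqlite3` module; the `orders` table and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Laila", 120.0), (2, "Omar", 75.5), (3, "Laila", 30.0)],
)

QUERY = "SELECT order_id, total FROM orders WHERE customer = ? ORDER BY order_id"

before = conn.execute(QUERY, ("Laila",)).fetchall()

# Physical change only: an extra index structure for lookups by customer.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

after = conn.execute(QUERY, ("Laila",)).fetchall()
print(before, after)
```

The query text and its result are identical before and after the index is created; only the way the engine finds the rows may differ.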
People Who Work With Databases
▪ There are five classes of people associated with databases:
1. End users
▪ Store and use data in DBMSs
▪ Usually not computer professionals
2. Application programmers
▪ Develop applications that facilitate the usage of DBMSs for end-users
▪ Computer professionals who know how to leverage host languages, query languages and DBMSs together
3. Database Administrators (DBAs)
▪ Design the conceptual and physical schemas
▪ Ensure security and authorization
▪ Ensure data availability and recovery from failures
▪ Perform database tuning
4. Implementers
▪ Build DBMS software for vendors like IBM and Oracle
▪ Computer professionals who know how to build DBMS internals
5. Researchers
▪ Innovate new ideas which address evolving and new challenges/problems
Structured Query Language (SQL) is the database language we use to perform operations on an existing database and also to create a new one. SQL uses commands such as CREATE, DROP and INSERT to carry out the required tasks.
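A minimal sketch of the commands named above, run through Python's built-in `sqlite3` module against an in-memory database. The `product` table and its columns are hypothetical, and the exact SQL dialect varies a little between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define a new table.
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL)"
)

# INSERT: add a row; the ? placeholders keep values separate from the SQL text.
conn.execute(
    "INSERT INTO product (product_id, name, price) VALUES (?, ?, ?)",
    (1, "Keyboard", 25.0),
)

# SELECT: read the data back.
products = conn.execute("SELECT name, price FROM product").fetchall()

# DROP: remove the table and everything stored in it.
conn.execute("DROP TABLE product")

# sqlite_master is SQLite's own catalog of schema objects.
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(products, remaining_tables)
```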
Currently, the most common way of describing Big Data is the five Vs:
•Volume − Big Data is big. Enterprises are confronted with an avalanche of ever-growing data of all types, easily accumulating terabytes, or even petabytes, of information.
•Velocity − Big Data can quickly lose its validity or may call for urgent action. Sometimes 2 minutes is too late. For time-sensitive processes such as catching fraud, big data must be used as it streams into your enterprise in order to maximize its value.
•Variety − Big Data is diverse. It can appear in different formats and types such as text, sensor data, audio, video, click streams, log files and more. New insights are found while analyzing different data types together.
•Veracity − Big Data varies in how valid its information is. Filtering out the relevant data is key in a world where the variety and number of sources grow continuously.
And the fifth, and most important, V is
•Value − It is all very well having access to big data, but unless we can turn it into value it is useless. Businesses need to make a business case for any effort to collect and leverage big data to get a clear understanding of costs and benefits.