Designing a Relational Database
In this chapter, I’m going to show you the way I design a database. While I’ll use SQL Server tools in this chapter, the same concepts can be applied to any database system. The key to using this approach is to understand the techniques I’m going to show you and adapt them so that they work for you.
Overview of the Design Process
Designing a relational database can be as easy or as hard as you choose to make it. I generally use a five-step approach, as outlined below:
1. Stating the problem
2. Brainstorming for ideas
3. Modeling entities and relationships
4. Building the database
5. Creating the application
I’m going to discuss the first four steps in this chapter. The rest of this book will focus on step five.
Cross-Reference: See Chapter 2 for information about the relational database model.
In This Chapter
✦ Stating the problem
✦ Brainstorming
✦ Modeling entities and relationships
✦ Building the database
Stating the Problem
While stating the problem may seem easy, it’s a lot harder than it looks. The problem statement should present an understanding of what the organization is trying to accomplish, while at the same time emphasizing the most critical business needs. If you want to replace an existing application, you can use that application as a basis for the answer. But if you are building an application for the first time, it is very important to understand what problem you’re trying to solve. After all, if you can’t identify the problem, how are you going to know if you really solved it?
The problem should be stated as simply as possible. For example, if the people at Amazon.com were to state their problem, it might look like this: we want to establish Amazon.com as a brand name by selling books to consumers via the Internet at the lowest possible cost to the consumer and with the best possible service, while building market share that will ensure the long-term stability of the company. A specialty mail-order catalog company might want to solve this problem: we want to improve how we take customer orders over the telephone to reduce mistakes and improve service to our customers. A small electronics supplier might state their problem in this way: we need to improve our inventory control to increase
Taking this information into consideration, I can state my problem like this: I want to update Car Collector to include support for other types of toys and collectibles and add support for trading toys with other collectors over the Internet. With the new features I’m going to add, I’m going to change the name of the application to Toy Collector.
It’s almost realistic: Toy Collector as used in this book isn’t meant to be a complete application, but rather a framework for showing you different techniques for building database applications in Visual Basic. Therefore, the database design may not be as complete as a commercial application, nor will all of the features found in the commercial application be present in Toy Collector. Some of the data structures I use, as well as the features included in the application, may seem like overkill, but they are necessary to illustrate various points along the way. So, focus on the techniques that I’m going to show you and try to understand why and how I do things, rather than focusing on why this feature was added or why that information is missing.
Brainstorming
Once you have a basic understanding of the problem your application needs to solve, you need to design the database to accommodate the information related to your application. The first step in this process is to take a look at your problem and try to determine all of the information and functions that might be needed to solve the problem. I call this step brainstorming.
Brainstorming is the act of discussing and recording ideas without regard to how feasible they are to implement. This helps you identify all of the information you need to keep and all of the tasks your application will need to perform.
I like to conduct brainstorming sessions that include everyone who will be involved with the project in a single room with a whiteboard. It helps to have as wide a range of people present as possible. Everyone from end users, to programmers, to management should be involved in this process.
Every idea that is raised should be listed on the whiteboard, even if it’s similar to an idea that’s already listed. After the meeting, the information should be organized, and similar ideas can be combined as a single item. The ideas should be classified as either a task that the application should perform or a piece of information that will need to be kept.
It’s important to understand that some of the ideas that come out of the brainstorming session may not be practical. At this stage of the process, you shouldn’t worry about practicality. It is far more important to be complete. Sometimes things that seem impractical at this stage may prove easy to implement later, while other ideas that seem easy to implement at first may not prove to be worth the time and effort.
Also, it’s important not to make fun of any ideas, no matter how bad they seem. This is especially true for ideas coming from the less technical attendees. Quite often, their ideas and comments may lead to a better understanding of how the application should work.
Brainstorming Toy Collector
Since you can’t actively participate in the brainstorming session, I sat down and held one myself, and came up with the following list of functions that need to be performed by the Toy Collector application:
✦ Track items currently in the collection
✦ Create reports of items in the collection
✦ Keep a mailing list of current and potential customers
✦ Create a Web page with a list of items currently for sale
✦ Create a Web page with a list of items wanted
✦ Create an HTML listing for eBay
✦ Evaluate the condition of a toy
✦ Track purchases and sales
✦ Process an order that sells one or more toys to a customer
✦ Process an order that purchases a toy from a customer
The functions will be used to maintain a database containing the following data elements:
Examining the functions to be performed
Looking at the above results, you can see a few common threads. First, you need an inventory system that tracks all of the toys in the collection. This is a fairly common application, along with maintaining a mailing list.
The inventory part of an application usually requires a unique identifier for an item in inventory. This wasn’t included as part of the original brainstorming session. In a traditional inventory system, the quantity of the item is also included. However, because different toys may have different characteristics based on their condition, I chose to ignore the quantity issue and require that each item in the inventory have its own unique record.
Processing orders isn’t difficult. However, a few more items need to be added to what was already identified above. For instance, in order to compute sales tax, you need to know the sales tax rate. For the purposes of this book, I’m going to assume that the sales tax rate is uniform across a state, even though this isn’t necessarily true. You also need to capture the customer’s name on the credit card, since it may be different from the name they entered into your database. There should also be an option to ship to an alternate destination, rather than a regular mailing address.
The mailing list is pretty straightforward, except that you should give the customer the option to not receive mail. This is important, because even in today’s marketplace, people will complain about unwanted mail. Also, you may want to include some additional comments about the customer that would let the user record problems that they may have had during previous transactions.
The customer’s name is actually listed twice: once as simply name, and another time as first, middle, and last names. Which way is best really depends on you. Having a separate last name field makes it easy to search on someone’s last name. However, using a single field lets you format a name more naturally, which is important when you have people that have suffixes such as Ph.D., Junior, Senior, or III. It also allows someone to enter a title such as Mr., Ms., Dr., etc. with fewer problems. In the long run, it really doesn’t matter which method you pick as long as you’re consistent throughout your database.
Mapping the results to data types
The last step of the brainstorming session is to map the data elements onto a series of data types. After reviewing the brainstorming information, I like to assemble a list of the data elements that were derived from the session, along with the Visual Basic data type and a short description of each element, such as those shown in Table 3-1.
Table 3-1
Data Elements
Data Element      Data Type  Description
CustomerName      String     The name of the customer.
Street            String     The street on which the customer lives.
City              String     The city where the customer lives.
State             String     The state where the customer lives.
Zip               String     The proper ZIP code for the customer’s address.
Phone             String     The customer’s telephone number.
EMailAddress      String     The customer’s e-mail address.
MailingList       Boolean    True if the customer wants to receive periodic notices.
OrderNumber       Long       A number that uniquely identifies the order.
OrderStatus       String     The current status of the order.
DateOrdered       Date       The date the order was placed.
DateShipped       Date       The date the order was mailed.
ShippingCost      Currency   The cost to ship the order.
SalesTax          Currency   The amount of sales tax collected.
CreditCardNumber  String     The credit card number used to purchase a toy.
ExpirationDate    String     The expiration date on the credit card.
InventoryId       Long       A unique identifier for an item in the collection.
ToyName           String     The name of the collectible toy.
MintValue         Currency   The value of the toy if it was in mint condition.
Condition         Long       A numeric description of the toy’s condition.
Question          String     A question used to evaluate a toy’s condition.
TrueValue         Currency   The true value of the toy based on its condition.
DatePurchased     Date       The date the toy was purchased.
Image             Picture    A picture of the toy.
US first: When building an e-commerce application, one of the first things you need to plan for is how to handle international issues. For the most part, the only way these issues would affect your database design is that additional data elements, such as Country and CurrencyType, would need to be included, plus you would need to allow additional space for other fields such as ZIP code. Just because I don’t include these fields in Toy Collector isn’t a good reason for you not to include them in your application.
Modeling Entities and Relationships
The next step in the process is translating the information from the brainstorming session into a database design.
Entity/relationship modeling
Entity/Relationship modeling (also known as E/R modeling) is a way of describing the relationship between entities. An entity is a thing that can be uniquely identified, such as a toy, a customer, or an order. Associated with the entity is a set of attributes, which helps to describe the entity. Each customer has a name and an address. Each toy has a name and a manufacturer. An order has an order number and a date ordered. Relationships are formed between two entities, such as customers and orders, where a customer places an order for a toy.
When drawing an E/R model, I use rectangles for entities, ellipses for attributes, and diamonds for relationships. In Figure 3-1, you can see a simple E/R model that has two entities (Customers and Orders), with each entity having two attributes (Customers: Address and Name; Orders: Order Number and Date Ordered) and a single relationship (Customer-Order).
Figure 3-1: Designing a simple database using an E/R model.
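The same two-entity model can also be written down as plain data, which is sometimes handy before any drawing exists. The following is only a sketch in Python: the Entity and Relationship classes and the describe function are hypothetical names invented for this illustration, not part of any modeling tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """A uniquely identifiable thing, drawn as a rectangle."""
    name: str
    attributes: List[str] = field(default_factory=list)  # drawn as ellipses


@dataclass
class Relationship:
    """A link between two entities, drawn as a diamond."""
    name: str
    left: Entity
    right: Entity


# The entities, attributes, and relationship named in the E/R example.
customers = Entity("Customers", ["Address", "Name"])
orders = Entity("Orders", ["Order Number", "Date Ordered"])
customer_order = Relationship("Customer-Order", customers, orders)


def describe(rel: Relationship) -> str:
    """Render a relationship as a one-line text diagram."""
    return f"{rel.left.name} --<{rel.name}>-- {rel.right.name}"


summary = describe(customer_order)
print(summary)  # Customers --<Customer-Order>-- Orders
```

Keeping the model as data like this makes it easy to list entities and their attributes when reviewing the design with other people.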
Identifying entities and attributes
The first step in this process is to review the list of data elements found in Table 3-1 and look for common groupings. As you scan through the list of data elements, three main groupings jump out almost immediately: customer information, inventory information, and order information. Each of these groupings represents a major entity in the Toy Collector database.
At the same time, you need to look at the various entities and their attributes from an implementation point of view. You may find that a few other attributes are easy to include and will add value to the application from the user’s point of view.
Dirt cheap disk drives: If you designed a database years ago, you will remember that disk space was very expensive, and you always tried to use the least amount of space possible. In today’s marketplace, you can purchase a high-performance 9-gigabyte SCSI disk drive for less than $500. If you allow 2,000 bytes for each customer (which is very generous), you can store over 4 million customers on a single disk drive. Since most applications won’t store this much data, don’t let the cost of disk space drive your database decisions.
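The arithmetic behind that estimate can be checked in a few lines. This is only a rough sketch: it assumes a gigabyte of 10^9 bytes and ignores index and file-system overhead, and the constant names are made up for illustration.

```python
# Rough capacity estimate for the 9-gigabyte drive described above.
DISK_BYTES = 9 * 10**9        # assumption: 1 gigabyte = 10**9 bytes
BYTES_PER_CUSTOMER = 2_000    # the generous per-customer allowance

customers_per_disk = DISK_BYTES // BYTES_PER_CUSTOMER
print(f"{customers_per_disk:,} customers")  # 4,500,000 customers
```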
Customer information
Table 3-2 contains the list of data elements that are related to a customer, plus a few more that popped up while assembling the list. Finding some additional data elements at this stage is quite normal, since we now have a better understanding of the application’s needs. In this case, I added fields to identify when the customer was originally added to the database (DateAdded) and the last time the information was updated (DateUpdated). I also added a field called Comments that allows the user to record any comments they may have about this particular customer.
Table 3-2
Customer Information
Column Name       Data Type     VB Type   Description
CustomerId        Int           Long      A unique identifier for the customer.
Name              Varchar(64)   String    The customer’s name.
Street            Varchar(64)   String    The street address where the customer lives.
City              Varchar(64)   String    The name of the city where the customer lives.
State             Char(2)       String    The name of the state where the customer lives.
Zip               Int           Long      The ZIP code for the customer’s address.
Phone             Varchar(32)   String    The customer’s phone number.
EmailAddress      Varchar(128)  String    The customer’s e-mail address.
DateAdded         Datetime      Date      The date the customer was added to the database.
DateUpdated       Datetime      Date      The date the customer’s information was last updated.
MailingList       Bit           Boolean   When true, the customer wishes to receive periodic mailings.
Comments          Varchar(256)  String    Comments about the customer.
Inventory information
Table 3-3
Inventory Items
Column Name       Data Type     VB Type   Description
InventoryId       Int           Long      A unique identifier for the item in the collection.
ToyTypeId         Int           Long      A unique identifier for the type of toy in the collection.
Name              Varchar(64)   String    The name of the toy.
ManufacturerId    Int           Long      The identifier of the manufacturer who made the toy.
YearIssued        Datetime      Date      The date the toy was first manufactured.
Description       Varchar(256)  String    A description of the toy.
MintValue         Money         Currency  The value of the toy if it is in mint condition.
Condition         Int           Long      The condition of the toy using a numeric scale.
ConditionMask     Varchar(64)   String    Answers to the condition questions for this type of toy.
TrueValue         Money         Currency  The true value of the toy based on its current condition.
DatePurchased     Datetime      Date      The date the toy was added to the inventory.
PurchasePrice     Money         Currency  The amount of money paid for the toy.
AskingPrice       Money         Currency  The amount of money you are willing to sell the toy for. A value of zero means that you aren’t willing to sell the toy at this time.
BuyingPrice       Money         Currency  The amount of money you are willing to pay for a similar toy.
Wanted            Bit           Boolean   When true, you want to buy the toy.
Comments          Varchar(256)  String    Any comments that would be displayed along with the toy.
DateUpdated       Datetime      Date      The most recent time this information was updated.
Table 3-4
Toy Types
Column Name       Data Type     VB Type   Description
ToyTypeId         Int           Long      A unique identifier for the type of toy in the collection.
Description       Varchar(64)   String    A description of the type of toy.
Table 3-5
Condition Questions
Column Name       Data Type     VB Type   Description
TypeId            Int           Long      A unique identifier for the type of toy in the collection.
Seq               Int           Long      A sequence number that is used to distinguish between multiple questions for a specific type of toy.
Question          Varchar(64)   String    A question used to evaluate the condition of the toy.
Weight            Int           Long      The relative importance of the question when determining the toy’s condition.
Responses         Int           Long      The highest possible value of the response.
store the numeric value in the database. Then you need to add a translation table that can be used to translate the codified value into a text string. This is what the Manufacturers table accomplishes (see Table 3-6).
Table 3-6
Manufacturers
Column Name       Data Type     VB Type   Description
ManufacturerId    Int           Long      A unique identifier for the name of the manufacturer.
Name              Varchar(64)   String    The name of the manufacturer.
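To see how a translation table is used, the sketch below stores only the numeric ManufacturerId with each toy and joins to Manufacturers to recover the name. It uses Python’s built-in sqlite3 module as a stand-in for SQL Server, only a few of the Inventory columns are included, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database; an assumption standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Manufacturers (
    ManufacturerId INTEGER PRIMARY KEY,
    Name           VARCHAR(64)
);
CREATE TABLE Inventory (
    InventoryId    INTEGER PRIMARY KEY,
    Name           VARCHAR(64),
    ManufacturerId INTEGER REFERENCES Manufacturers(ManufacturerId)
);
-- Hypothetical sample data.
INSERT INTO Manufacturers VALUES (1, 'Example Toy Co.');
INSERT INTO Inventory VALUES (100, 'Sample racer', 1);
""")

# Translate the codified value back into a text string with a join.
rows = conn.execute("""
    SELECT i.Name, m.Name
    FROM Inventory AS i
    JOIN Manufacturers AS m ON m.ManufacturerId = i.ManufacturerId
""").fetchall()
print(rows)  # [('Sample racer', 'Example Toy Co.')]
```

Because the manufacturer’s name lives in one row of Manufacturers, correcting a misspelling there fixes it for every toy that refers to it.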
Another area of concern is that we need to store images for the toys. While I believe it is acceptable for you to store images in a database, I also believe that they should be stored in a separate table. Since I need to use a separate table for the image, I decided to add a sequence number column that will let me store multiple images for a single toy (see Table 3-7).
Table 3-7
Images
Column Name       Data Type     VB Type   Description
InventoryId       Int           Long      A unique identifier for the item in the collection.
Seq               Int           Long      A sequence number that is used to distinguish between multiple images for a single toy.
Image             Image         Picture   A large binary field that holds the actual image of the toy.
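One way to picture the sequence number at work is shown below. This sketch again uses Python’s sqlite3 module as a stand-in for SQL Server; the composite primary key on (InventoryId, Seq), the BLOB column type, and the add_image helper are assumptions made for the example rather than details taken from the Toy Collector design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server
conn.executescript("""
CREATE TABLE Images (
    InventoryId INTEGER NOT NULL,
    Seq         INTEGER NOT NULL,
    Image       BLOB,
    PRIMARY KEY (InventoryId, Seq)   -- assumed key: one row per toy image
);
""")


def add_image(conn, inventory_id, data):
    """Store an image under the next free sequence number for the toy."""
    next_seq = conn.execute(
        "SELECT COALESCE(MAX(Seq), 0) + 1 FROM Images WHERE InventoryId = ?",
        (inventory_id,),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO Images VALUES (?, ?, ?)", (inventory_id, next_seq, data)
    )
    return next_seq


# Placeholder bytes stand in for real picture data.
first = add_image(conn, 100, b"front-view-bytes")
second = add_image(conn, 100, b"side-view-bytes")
other = add_image(conn, 200, b"box-bytes")

image_keys = conn.execute(
    "SELECT InventoryId, Seq FROM Images ORDER BY InventoryId, Seq"
).fetchall()
print(image_keys)  # [(100, 1), (100, 2), (200, 1)]
```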
Order information
Table 3-8
Orders
Column Name       Data Type     VB Type   Description
OrderId           Int           Long      A unique identifier for the order.
CustomerId        Int           Long      A unique identifier for a customer.
OrderType         Int           Long      1 = sale, 2 = purchase.
ShippingCost      Money         Currency  The total cost of shipping.
SalesTax          Money         Currency  The total cost of sales tax.
HowPaid           Int           Long      1 = credit card, 2 = check.
CreditCardNumber  Varchar(32)   String    The customer’s credit card number.
ExpDate           Varchar(16)   String    The customer’s credit card expiration date.
OrderStatus       Int           Long      1 = order placed, 2 = order shipped, 3 = order received.
DateOrdered       Datetime      Date      The date and time the order was placed.
DateShipped       Datetime      Date      The date and time the order was shipped.
DateReceived      Datetime      Date      The date and time the order was received.
Storing Images in a Database
While many people recommend against storing an image in your database, I believe otherwise. By storing images in the database, it is much easier to secure and access them. Storing images outside the database means that you have to maintain a separate security system to protect the images. This can become very complicated if you permit the images to be accessed both by Web browser-based applications and traditional client/server applications.
While I believe in storing images in the database, I also believe that the images should be stored in their own table, away from any related data. Database performance is based mostly on how much information you can retrieve with a single disk I/O. The more rows you can retrieve, the better.
Table 3-9
OrderDetails
Column Name       Data Type     VB Type   Description
OrderId           Int           Long      A unique identifier for the order.
Seq               Int           Long      A sequence number that is used to distinguish between multiple items in a single order.
InventoryId       Int           Long      A unique identifier for the item in the collection.
PurchasePrice     Money         Currency  The amount paid for the toy.
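The split between Orders and OrderDetails is what lets a single order cover one or more toys. The sketch below shows the idea with Python’s sqlite3 module standing in for SQL Server; only a few columns from Tables 3-8 and 3-9 are included, the composite key on (OrderId, Seq) is an assumption, and the sample order is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server
conn.executescript("""
CREATE TABLE Orders (
    OrderId      INTEGER PRIMARY KEY,
    CustomerId   INTEGER,
    OrderType    INTEGER,              -- 1 = sale, 2 = purchase
    ShippingCost NUMERIC
);
CREATE TABLE OrderDetails (
    OrderId       INTEGER REFERENCES Orders(OrderId),
    Seq           INTEGER,             -- distinguishes items within one order
    InventoryId   INTEGER,
    PurchasePrice NUMERIC,
    PRIMARY KEY (OrderId, Seq)         -- assumed key
);
-- Hypothetical sale of two toys to customer 10.
INSERT INTO Orders VALUES (1, 10, 1, 5.00);
INSERT INTO OrderDetails VALUES (1, 1, 100, 12.50);
INSERT INTO OrderDetails VALUES (1, 2, 101, 7.25);
""")

# One order row, many detail rows: count the items and total their prices.
item_count, merchandise_total = conn.execute(
    "SELECT COUNT(*), SUM(PurchasePrice) FROM OrderDetails WHERE OrderId = ?",
    (1,),
).fetchone()
print(item_count, merchandise_total)  # 2 19.75
```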
The last entity I want to talk about is the States entity. This entity exists mostly to translate the two-character state abbreviation into a sales tax rate, which is used to compute the amount of sales tax that must be collected for an order. At the same time, I decided to add the StateName field to translate State into a more meaningful value.
Table 3-10
States
Column Name       Data Type     VB Type   Description
State             Char(2)       String    The two-character abbreviation for a state.
StateName         Varchar(64)   String    The proper state name.
SalesTaxRate      Decimal       Currency  The sales tax rate for the state.
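Looking up the rate by state abbreviation and applying it to an order might look like the sketch below. It uses Python’s sqlite3 and decimal modules as stand-ins for SQL Server and the Currency type. The state 'XX', its 5 percent rate, the sales_tax helper, and the round-half-up rule are all assumptions chosen for illustration; the rate is stored as text here only so that it converts to an exact Decimal.

```python
import sqlite3
from decimal import Decimal, ROUND_HALF_UP

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server
conn.executescript("""
CREATE TABLE States (
    State        CHAR(2) PRIMARY KEY,
    StateName    VARCHAR(64),
    SalesTaxRate TEXT              -- kept as text to preserve exact decimals
);
-- Hypothetical state and rate.
INSERT INTO States VALUES ('XX', 'Example State', '0.05');
""")


def sales_tax(conn, state, amount):
    """Return the tax owed on amount, using the uniform rate for state."""
    row = conn.execute(
        "SELECT SalesTaxRate FROM States WHERE State = ?", (state,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown state: {state}")
    rate = Decimal(row[0])
    # Assumed rounding rule: nearest cent, halves rounded up.
    return (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


tax = sales_tax(conn, "XX", Decimal("19.75"))
print(tax)  # 0.99
```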
Identifying Relationships
There are three basic types of relationships: one-to-one, one-to-many, and many-to-many. These relationships refer to the number of instances of data in one entity that are related to instances of data in another entity. In a one-to-one relationship, there is only one instance of data in one entity that is related to a single instance of data in another entity. For instance, assume that you have two entities: stores and managers. Each store has a single manager, while each manager has a single store. Thus each store has a unique manager and each manager has a unique store.
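A one-to-one relationship like the stores and managers example can be enforced by the database itself. The sketch below shows one common way to do it, a UNIQUE constraint on the foreign key, using Python’s sqlite3 module as a stand-in for SQL Server. The table layouts and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server
conn.executescript("""
CREATE TABLE Managers (ManagerId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Stores (
    StoreId   INTEGER PRIMARY KEY,
    Name      TEXT,
    -- NOT NULL: every store has a manager; UNIQUE: no manager runs two stores.
    ManagerId INTEGER NOT NULL UNIQUE REFERENCES Managers(ManagerId)
);
INSERT INTO Managers VALUES (1, 'Manager A');
INSERT INTO Stores VALUES (1, 'Store A', 1);
""")

# Giving the same manager a second store violates the one-to-one rule.
try:
    conn.execute("INSERT INTO Stores VALUES (2, 'Store B', 1)")
    second_store_allowed = True
except sqlite3.IntegrityError:
    second_store_allowed = False

print(second_store_allowed)  # False
```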
In a one-to-many relationship, one instance of data in the first entity is related to zero or more instances of data in the second entity. For example, assume that you have an entity for customers and an entity for orders. Each customer may place as many orders as they desire. They need not have placed any orders if they have signed up to be on a mailing list. For each order, there is exactly one customer who placed the order. Thus for each order there is only one customer and for each customer there may be zero or more orders.
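The customers and orders example can be sketched as two tables in which each order carries exactly one CustomerId, while a customer may appear in any number of orders, including none. This uses Python’s sqlite3 module as a stand-in for SQL Server, and the sample customers are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server
conn.executescript("""
CREATE TABLE Customers (CustomerId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Orders (
    OrderId    INTEGER PRIMARY KEY,
    -- Exactly one customer per order.
    CustomerId INTEGER NOT NULL REFERENCES Customers(CustomerId)
);
-- Hypothetical data: one buyer and one mailing-list-only customer.
INSERT INTO Customers VALUES (1, 'Buyer'), (2, 'Mailing-list only');
INSERT INTO Orders VALUES (10, 1), (11, 1);
""")

# LEFT JOIN keeps customers who have placed zero orders.
order_counts = conn.execute("""
    SELECT c.Name, COUNT(o.OrderId)
    FROM Customers AS c
    LEFT JOIN Orders AS o ON o.CustomerId = c.CustomerId
    GROUP BY c.CustomerId
    ORDER BY c.CustomerId
""").fetchall()
print(order_counts)  # [('Buyer', 2), ('Mailing-list only', 0)]
```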
In a many-to-many relationship, multiple instances of data in the first entity are related to multiple instances of data in the second entity. This can be illustrated by having an entity for parents and an entity for children. Each parent may have zero or more children, while each child may have multiple parents. (Remember, an orphan child has no parents, while a child with divorced parents may have a mother, a father, a stepmother, and a stepfather.)
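In a relational database, a many-to-many relationship is commonly stored in a third table that pairs keys from the two entities. The sketch below applies that technique to the parents and children example, using Python’s sqlite3 module as a stand-in for SQL Server. The ParentChild table and all of the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server
conn.executescript("""
CREATE TABLE Parents  (ParentId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Children (ChildId  INTEGER PRIMARY KEY, Name TEXT);
-- Assumed linking table: one row per parent/child pair.
CREATE TABLE ParentChild (
    ParentId INTEGER REFERENCES Parents(ParentId),
    ChildId  INTEGER REFERENCES Children(ChildId),
    PRIMARY KEY (ParentId, ChildId)
);
INSERT INTO Parents VALUES (1, 'Mother'), (2, 'Father'), (3, 'Stepfather');
INSERT INTO Children VALUES (1, 'First child'), (2, 'Second child');
INSERT INTO ParentChild VALUES (1, 1), (2, 1), (3, 1), (1, 2);
""")

# Each child may have several parents...
parents_per_child = conn.execute("""
    SELECT c.Name, COUNT(pc.ParentId)
    FROM Children AS c
    LEFT JOIN ParentChild AS pc ON pc.ChildId = c.ChildId
    GROUP BY c.ChildId
    ORDER BY c.ChildId
""").fetchall()

# ...and each parent may have several children.
children_of_mother = [
    name
    for (name,) in conn.execute("""
        SELECT c.Name
        FROM Children AS c
        JOIN ParentChild AS pc ON pc.ChildId = c.ChildId
        WHERE pc.ParentId = 1
        ORDER BY c.ChildId
    """)
]

print(parents_per_child)   # [('First child', 3), ('Second child', 1)]
print(children_of_mother)  # ['First child', 'Second child']
```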
Drawing the E/R model
Drawing the E/R model is a fairly simple task (see Figure 3-2) with the information found in Tables 3-2 to 3-10. I didn’t list the attributes for each entity because doing so would render the small drawing nearly unreadable, but adding them is a fairly easy task. Of course, comparing the above tables to the diagram is probably even more meaningful.
When drawing an E/R model, I suggest using a tool like Visio rather than creating a drawing with a paper and pencil. Visio allows you to easily edit the drawing to accommodate the inevitable changes that will occur as various people review and comment on your document. Of course, there are some very expensive database design tools that offer similar capabilities, but I find Visio works nearly as well for most database designs.
Figure 3-2: Viewing the final Entity/Relationship model for Toy Collector.
Building the Database
Translating an E/R model into a database is a pretty straightforward process. Each of the entities becomes a table and their attributes become columns in the table. You can see the final product in Figure 3-3 using the SQL Server database diagram facility.
Figure 3-3: Looking at a database diagram of the Toy Collector database.
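The rule that each entity becomes a table and each attribute becomes a column can be sketched mechanically. The example below takes the Customers columns from Table 3-2 and generates a CREATE TABLE statement from them. It runs against SQLite through Python’s sqlite3 module rather than SQL Server, so INTEGER PRIMARY KEY stands in for the Int identifier, and the create_table_sql helper is a hypothetical name invented for this illustration.

```python
import sqlite3

# Attributes of the Customers entity, taken from Table 3-2.
customers_entity = {
    "CustomerId": "INTEGER PRIMARY KEY",  # SQLite spelling of the Int key
    "Name": "VARCHAR(64)",
    "Street": "VARCHAR(64)",
    "City": "VARCHAR(64)",
    "State": "CHAR(2)",
    "Zip": "INTEGER",
    "Phone": "VARCHAR(32)",
    "EmailAddress": "VARCHAR(128)",
    "DateAdded": "DATETIME",
    "DateUpdated": "DATETIME",
    "MailingList": "BIT",
    "Comments": "VARCHAR(256)",
}


def create_table_sql(table, columns):
    """Build a CREATE TABLE statement: one column per attribute."""
    column_lines = ",\n".join(
        f"    {name} {sql_type}" for name, sql_type in columns.items()
    )
    return f"CREATE TABLE {table} (\n{column_lines}\n)"


ddl = create_table_sql("Customers", customers_entity)

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server
conn.execute(ddl)

# Read the column names back to confirm the table matches the entity.
created_columns = [row[1] for row in conn.execute("PRAGMA table_info(Customers)")]
print(created_columns)
```

The other entities in Tables 3-3 through 3-10 would follow the same pattern, one dictionary and one generated statement per table.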
Thoughts on Database Design
Just because you have a valid database design doesn’t mean that you will get the best performance from it. There are a number of factors that will affect performance, such as the number of tables in the database, the size of the columns, and the number of indexes you are using. However, the biggest single factor that affects your database’s performance is the hardware you’re using.
Believe it or not, having a faster CPU will not necessarily make your database server run faster. A database server is very I/O intensive. Anything that allows the database to retrieve data faster from disk will help the database server’s overall performance.
Adding memory to your server allows the database server to cache more data in memory. After all, retrieving data from memory is much faster than retrieving it from disk. This is the biggest change you can make to improve database performance.
After adding memory to your system, using SCSI disk drives in place of IDE drives is the next place you should look for performance gains. Not only can you manage up to 15 disk drives on a single SCSI card, SCSI also allows you to perform concurrent operations on each drive. Thus you can have multiple disk drives performing seeks, while other drives are transferring data. SCSI-III can transfer data faster than SCSI-II or SCSI-I and should be used for best performance.
Summary
In this chapter you learned:
✦ The five steps in an application design process.
✦ Why stating the problem helps you clarify goals and objectives for the entire design process.
✦ How to use brainstorming to determine the data elements and functions required in your application.
✦ How to use Entity/Relationship modeling to design your database.