PostgreSQL, Second Edition
by Korry Douglas; Susan Douglas
Publisher: Sams
Pub Date: July 26, 2005
Print ISBN-10: 0-672-32756-2
Print ISBN-13: 978-0-672-32756-8
The Real Value in Free Software
These days, it seems that most discussion of open-source software centers around the idea that you should not have to tie your future to the whim of some giant corporation. People say that open-source software is better than proprietary software because it is developed and maintained by the users instead of a faceless company out to lighten your wallet.
I think that the real value in free software is education. I have never learned anything by reading my own code[1]. On the other hand, it's a rare occasion when I've looked at code written by someone else and haven't come away with another tool in my toolkit. People don't think alike. I don't mean that people disagree with each other; I mean that people solve problems in different ways. Each person brings a unique set of experiences to the table. Each person has his own set of goals and biases. Each person has his own interests. All of these things will shape the way you think about a problem. Often, I'll find myself in a heated disagreement with a colleague only to realize that we are each correct in our approach. Just because I'm right, doesn't mean that my colleague can't be right as well.
[1] Maybe I should say that I have never learned anything new by reading my own code. I've certainly looked at code that I've written and wondered what I was thinking at the time, learning that I'm not nearly as clever as I had remembered. Oddly enough, those who have read my code have reached a similar conclusion.
Open-source software is a great way to learn. You can learn about programming. You can learn about design. You can learn about debugging. Sometimes, you'll learn how not to design, code, or debug; but that's a valuable lesson, too. You can learn small things, like how to cache file descriptors on systems where file descriptors are a scarce and expensive resource, or how to use the select() function to implement fine-grained timers. You can learn big things, like how a query optimizer works or how to write a parser, or how to develop a good memory-management strategy.
PostgreSQL is a great example. I've been using databases for the last two decades. I've used most of the major commercial databases: Oracle, Sybase, DB2, and MS SQL Server. With each commercial database, there is a wall of knowledge between my needs and the vendor's need to protect his intellectual property. Until I started exploring open-source databases, I had an incomplete understanding of how a database works. Why was this particular feature implemented that way? Why am I getting poor performance when I try this? That's a neat feature; I wonder how they did that? Every commercial database tries to expose a small piece of its inner workings. The explain statement will show you why the database makes its optimization decisions. But, you only get to see what the vendor wants you to see. The vendor isn't trying to hide things from you (in most cases), but without complete access to the source code, they have to pick and choose how to expose information in a meaningful way. With open-source software, you can dive deep into the source code and pull out all the information you need. While writing this book, I've spent a lot of time reading through the PostgreSQL source code. I've added a lot of my own code to reveal more information so that I could explain things more clearly. I can't do that with a commercial database.
There are gems of brilliance in most open-source projects. In a well-designed, well-factored project, you will find designs and code that you can use in your own projects. Many open-source projects are starting to split their code into reusable libraries. The Apache Portable Runtime is a good example. The Apache Web server runs on many diverse platforms. The Apache development team saw the need for a layer of abstraction that would provide a portable interface to system functions such as shared memory and network access. They decided to factor the portability layer into a library separate from their main project. The result is the Apache Portable Runtime—a library of code that can be used in other open-source projects (such as PostgreSQL).
Some developers hate to work on someone else's code. I love working on code written by another developer—I always learn something from the experience. I strongly encourage you to dive into the PostgreSQL source code. You will learn from it. You might even decide to contribute to the project.
Introduction
PostgreSQL is a relational database with a long history. In the late 1970s, the University of California at Berkeley began development of PostgreSQL's ancestor—a relational database known as Ingres. Relational Technologies turned Ingres into a commercial product. Relational Technologies became Ingres Corporation and was later acquired by Computer Associates. Around 1986, Michael Stonebraker from UC Berkeley led a team that added object-oriented features to the core of Ingres; the new version became known as Postgres. Postgres was again commercialized; this time by a company named Illustra, which became part of the Informix Corporation. Andrew Yu and Jolly Chen added SQL support to Postgres in the mid-'90s. Prior versions had used a different, Postgres-specific query language known as Postquel. In 1996, many new features were added, including the MVCC transaction model, more adherence to the SQL92 standard, and many performance improvements. Postgres once again took on a new name: PostgreSQL.
Today, PostgreSQL is developed by an international group of open-source software proponents known as the PostgreSQL Global Development group. PostgreSQL is an open-source product—it is not proprietary in any way. Red Hat has recently commercialized PostgreSQL, creating the Red Hat Database, but PostgreSQL itself will remain free and open source.
PostgreSQL Features
PostgreSQL has benefited well from its long history. Today, PostgreSQL is one of the most advanced database servers available. Here are a few of the features found in a standard PostgreSQL distribution:
• Object-relational—In PostgreSQL, every table defines a class. PostgreSQL implements inheritance between tables (or, if you like, between classes). Functions and operators are polymorphic.
• Standards compliant—PostgreSQL syntax implements most of the SQL92 standard and many features of SQL99. Where differences in syntax occur, they are most often related to features unique to PostgreSQL.
• Open source—An international team of developers maintains PostgreSQL. Team members come and go, but the core members have been enhancing PostgreSQL's performance and feature set since at least 1996. One advantage to PostgreSQL's open-source nature is that talent and knowledge can be recruited as needed. The fact that this team is international ensures that PostgreSQL is a product that can be used productively in any natural language, not just English.
• Transaction processing—PostgreSQL protects data and coordinates multiple concurrent users through full transaction processing. The transaction model used by PostgreSQL is based on multi-version concurrency control (MVCC). MVCC provides much better performance than you would find with other products that coordinate multiple users through table-, page-, or row-level locking.
• Referential integrity—PostgreSQL implements complete referential integrity by supporting foreign and primary key relationships as well as triggers. Business rules can be expressed within the database rather than relying on an external tool.
• Multiple procedural languages—Triggers and other procedures can be written in any of several procedural languages. Server-side code is most commonly written in PL/pgSQL, a procedural language similar to Oracle's PL/SQL. You can also develop server-side code in Tcl, Perl, even bash (the open-source Linux/Unix shell).
• Multiple-client APIs—PostgreSQL supports the development of client applications in many languages. This book describes how to interface to PostgreSQL from C, C++, ODBC, Perl, PHP, Tcl/Tk, and Python.
• Unique data types—PostgreSQL provides a variety of data types. Besides the usual numeric, string, and date types, you will also find geometric types, a Boolean data type, and data types designed specifically to deal with network addresses.
• Extensibility—One of the most important features of PostgreSQL is that it can be extended. If you don't find something that you need, you can usually add it yourself. For example, you can add new data types, new functions and operators, and even new procedural and client languages. There are many contributed packages available on the Internet. For example, Refractions Research, Inc. has developed a set of geographic data types that can be used to efficiently model spatial (GIS) data.
What Versions Does This Book Cover?
The first edition of this book covered versions 7.1 through 7.3. In this edition, we've updated the basics and added coverage for the new features introduced in versions 7.4 and 8.0. Throughout the book, I'll be sure to let you know which features work only in new releases, and, in a few cases, I'll explain features that have been deprecated (that is, features that are obsolete). You can use this book to install, configure, tune, program, and manage PostgreSQL versions 7.1 through 8.0.
Fortunately, the PostgreSQL developers try very hard to maintain forward compatibility—new features tend not to break existing applications. This means that all the features discussed in this book should still be available and substantially similar in later versions of PostgreSQL. I have tried to avoid talking about features that have not been released at the time of writing—where I have mentioned future developments, I will point them out.
If you are already using PostgreSQL, you should find this book a useful guide to some of the features that you might be less familiar with. The first part of the book provides an introduction to SQL and PostgreSQL for the new user. You'll also find information that shows how to obtain and install PostgreSQL on a Unix/Linux host, as well as on Microsoft Windows.
If you are developing an application that will store data in PostgreSQL, the second part of this book will provide you with a great deal of information relating to PostgreSQL programming. You'll find information on both server-side and client-side programming in a variety of languages.
Every database needs occasional administrative work. The final part of the book should be of help if you are a PostgreSQL administrator, or a developer or user that needs to do occasional administration. You will also find information on how to secure your data against inappropriate use.
Finally, if you are trying to decide which database to use for your current project (or for future projects), this book should provide all the information you need to evaluate whether PostgreSQL will fit your needs.
What Topics Does This Book Cover?
PostgreSQL is a huge product. It's not easy to find the right mix of topics when you are trying to fit everything into a single book. This book is divided into three parts.
The first part, "General PostgreSQL Use," is an introduction and user's guide for PostgreSQL. Chapter 1, "Introduction to PostgreSQL and SQL," covers the basics—how to obtain and install PostgreSQL (if you are running Linux, chances are you already have PostgreSQL and it may be installed). The first chapter also provides a gentle introduction to SQL and discusses the sample database we'll be using throughout the book. Chapter 2, "Working with Data in PostgreSQL," describes the many data types supported by a standard PostgreSQL distribution; you'll learn how to enter values (literals) for each data type, what kind of data you can store with each type, and how those data types are combined into expressions. Chapter 3, "PostgreSQL SQL Syntax and Use," fills in some of the details we glossed over in the first two chapters. You'll learn how to create new databases, new tables and indexes, and how PostgreSQL keeps your data safe through the use of transactions. Chapter 4, "Performance," describes the PostgreSQL optimizer. I'll show you how to get information about the decisions made by the optimizer, how to decipher that information, and how to influence those decisions.
Part II, "Programming with PostgreSQL," is all about PostgreSQL programming. In Chapter 5, "Introduction to PostgreSQL Programming," we start off by describing the options you have when developing a database application that works with PostgreSQL (and there are a lot of options). Chapter 6, "Extending PostgreSQL," briefly describes how to extend PostgreSQL by adding new functions, data types, and operators. Chapter 7, "PL/pgSQL," describes the PL/pgSQL language. PL/pgSQL is a server-based procedural language. Code that you write in PL/pgSQL executes within the PostgreSQL server and has very fast access to data. Each chapter in the remainder of the programming section deals with a client-based API. You can connect to a PostgreSQL server using a number of languages. I show you how to interface to PostgreSQL using C, C++, ecpg, ODBC, JDBC, Perl, PHP, Tcl/Tk, Python, and Microsoft's .NET. Chapters 8 through 18 all follow the same pattern: you develop a series of client applications in a given language. The first client application shows you how to establish a connection to the database (and how that connection is represented by the language in question). The next client adds error checking so that you can intercept and react to unusual conditions. The third client in each chapter demonstrates how to process SQL commands from within the client. The final client wraps everything together and shows you how to build an interactive query processor using the language being discussed. Even if you program in only one or two languages, I would encourage you to study the other chapters in this section. I think you'll find that looking at the same application written in a variety of languages will help you understand the philosophy followed by the PostgreSQL development team, and it's a great way to start learning a new language.
Chapter 19, "Other Useful Programming Tools," introduces you to a few programming tools (and interfaces) that you might find useful: PL/Java and PL/Perl. I'll also show you how to use PostgreSQL inside of bash shell scripts.
The final part of this book (Part III, "PostgreSQL Administration") deals with administrative issues. The final six chapters of this book show you how to perform the occasional duties required of a PostgreSQL administrator. In the first two chapters, Chapter 20, "Introduction to PostgreSQL Administration," and Chapter 21, "PostgreSQL Administration," you'll learn how to start up, shut down, back up, and restore a server. In Chapter 22, "Internationalization and Localization," you will learn how PostgreSQL supports internationalization and localization. PostgreSQL understands how to store and process a variety of single-byte and multi-byte character sets including Unicode, ASCII, and Japanese, Chinese, Korean, and Taiwan EUC. In Chapter 23, "Security," I'll show you how to secure your data against unauthorized uses (and unauthorized users). In Chapter 24, "Replicating PostgreSQL with Slony," you'll learn how to replicate data with PostgreSQL's Slony replication system. Chapter 25, "Contributed Modules," introduces a few open-source projects that work well with PostgreSQL. I'll show you how to query a PostgreSQL database using XML, how to configure and use TSEARCH2 (a full-text indexing and search system), and how to install and use PgAdmin III, a graphical user interface specifically designed for PostgreSQL.
What's New in the Second Edition?
The first edition of this book hit the shelves in February 2003—at that time, the PostgreSQL developers had just released version 7.3.2. Release 7.4 was unleashed in November 2003. In January 2005, the PostgreSQL developers released version 8.0—a major release full of new features. We timed the second edition of this book to coincide with the release of version 8.0 (the book will appear in bookstores a few months after 8.0 hits the streets). In this edition, we've added coverage for all of the (major) new features in 7.3, 7.4, and 8.0, including
• Installing, securing, and managing PostgreSQL on Windows hosts
• Schemas
• New quoting mechanisms for string values
• New data types (ANYARRAY, ANYELEMENT, VOID)
• The standards-conforming INFORMATION_SCHEMA
• Nested transactions (SAVEPOINTs)
• The new PostgreSQL buffer manager
• Auto-vacuum
• Prepared-statement execution (the PREPARE/EXECUTE model)
• Set-returning functions
• Exception handling in PL/pgSQL
• libpqxx, the new PostgreSQL interface for C++ clients
• New features in ecpg (the embedded SQL processor for C)
• New features in the ODBC, JDBC (Java), Perl, Python, PHP, and Tcl/Tk client interfaces
• npgsql—the PostgreSQL .NET data provider
• Other useful programming tools (PL/Java, pgpash, pgcurl, etc.)
• Point-in-time recovery
• Replication
• Using PostgreSQL with XML
• Full-text search
Part I: General PostgreSQL Use
Chapter 1. Introduction to PostgreSQL and SQL
PostgreSQL is an open-source, client/server, relational database. PostgreSQL offers a unique mix of features that compare well to the major commercial databases such as Sybase, Oracle, and DB2. One of the major advantages to PostgreSQL is that it is open source—you can see the source code for PostgreSQL. PostgreSQL is not owned by any single company. It is developed, maintained, broken, and fixed by a group of volunteer developers around the world. You don't have to buy PostgreSQL—it's free. You won't have to pay any maintenance fees (although you can certainly find commercial sources for technical support).
PostgreSQL offers all the usual features of a relational database plus quite a few unique features. PostgreSQL offers inheritance (for you object-oriented readers). You can add your own data types to PostgreSQL. (I know, some of you are probably thinking that you can do that in your favorite database.) Most database systems allow you to give a new name to an existing type. Some systems allow you to define composite types. With PostgreSQL, you can add new fundamental data types. PostgreSQL includes support for geometric data types such as point, line segment, box, polygon, and circle. PostgreSQL uses indexing structures that make geometric data types fast. PostgreSQL can be extended—you can build new functions, new operators, and new data types in the language of your choice. PostgreSQL is built around client/server architecture. You can build client applications in a number of different languages, including C, C++, Java, Python, Perl, TCL/Tk, and others. On the server side, PostgreSQL sports a powerful procedural language, PL/pgSQL (okay, the language is sportier than the name). You can add procedural languages to the server. You will find procedural languages supporting Perl, TCL/Tk, and even the bash shell.
A Sample Database
Throughout this book, I'll use a simple example database to help explain some of the more complex concepts. The sample database represents some of the data storage and retrieval requirements that you might encounter when running a video rental store. I won't pretend that the sample database is useful for any real-world scenarios; instead, this database will help us explore how PostgreSQL works and should illustrate many PostgreSQL features.
To begin with, the sample database (which is called movies) contains three kinds of records: customers, tapes, and rentals.
Whenever a customer walks into our imaginary video store, you will consult your database to determine whether you already know this customer. If not, you'll add a new record. What items of information should you store for each customer? At the very least, you will want to record the customer's name. You will want to ensure that each customer has a unique identifier—you might have two customers named "Danny Johnson," and you'll want to keep them straight. A name is a poor choice for a unique identifier—names might not be unique, and they can often be spelled in different ways. ("Was that Danny, Dan, or Daniel?") You'll assign each customer a unique customer ID. You might also want to store the customer's birth date so that you know whether he should be allowed to rent certain movies. If you find that a customer has an overdue tape rental, you'll probably want to phone him, so you better store the customer's phone number. In a real-world business, you would probably want to know much more information about each customer (such as his home address), but for these purposes, you'll keep your storage requirements to a minimum.
Next, you will need to keep track of the videos that you stock. Each video has a title and a duration—you'll store those. You might own several copies of the same movie and you will certainly have many movies with the same duration, so you can't use either one for a unique identifier. Instead, you'll assign a unique ID to each video.
Finally, you will need to track rentals. When a customer rents a tape, you will store the customer ID, tape ID, and rental date.
Notice that you won't store the customer name with each rental. As long as you store the customer ID, you can always retrieve the customer name. You won't store the movie title with each rental, either—you can find the movie title by its unique identifier.
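The three kinds of records described above can be sketched as three tables joined by their IDs. This is only an illustration under stated assumptions: the column names, the ID formats, and the sample rows are made up for this sketch and are not the book's definitions, and Python's built-in sqlite3 module stands in for a PostgreSQL server so that the example runs without installing anything.

```python
import sqlite3

# In-memory SQLite database standing in for the "movies" sample database.
# Table layouts and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        phone       TEXT,
        birth_date  TEXT
    );
    CREATE TABLE tapes (
        tape_id  TEXT PRIMARY KEY,
        title    TEXT NOT NULL,
        duration TEXT
    );
    CREATE TABLE rentals (
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        tape_id     TEXT    NOT NULL REFERENCES tapes(tape_id),
        rental_date TEXT    NOT NULL
    );
""")

# Two customers may share a name; the customer ID keeps them straight.
conn.execute("INSERT INTO customers VALUES (1, 'Danny Johnson', '555-1212', '1970-01-01')")
conn.execute("INSERT INTO customers VALUES (2, 'Danny Johnson', '555-3434', '1985-06-15')")
conn.execute("INSERT INTO tapes VALUES ('T-001', 'Example Movie', '2:10')")

# A rental stores only the two IDs and the date, never the name or title.
conn.execute("INSERT INTO rentals VALUES (2, 'T-001', '2005-07-26')")


def rental_report(connection):
    """Recover customer names and movie titles from the IDs stored in rentals."""
    return connection.execute("""
        SELECT c.name, c.phone, t.title, r.rental_date
          FROM rentals r
          JOIN customers c ON c.customer_id = r.customer_id
          JOIN tapes t     ON t.tape_id = r.tape_id
    """).fetchall()


report = rental_report(conn)
print(report)
```

Because the rental row carries customer ID 2, the join picks out the second "Danny Johnson" and his phone number, even though the name alone would be ambiguous.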
At a few points in this book, we might make changes to the layout of the sample database, but the basic shape will remain the same.
Basic Database Terminology
Before we get into the interesting stuff, it might be useful to get acquainted with a few of the terms that you will encounter in your PostgreSQL life. PostgreSQL has a long history—you can trace its history back to 1977 and a program known as Ingres. A lot has changed in the relational database world since 1977. When you are breaking ground with a new product (as the Ingres developers were), you don't have the luxury of using standard, well-understood, and well-accepted terminology—you have to make it up as you go along. Many of the terms used by PostgreSQL have synonyms (or at least close analogies) in today's relational marketplace. In this section, I'll show you a few of the terms that you'll encounter in this book and try to explain how they relate to similar concepts in other database products.
• Schema
A schema is a named collection of tables (see table). A schema can also contain views, indexes, sequences, data types, operators, and functions. Other relational database products use the term catalog.
• Database
A database is a named collection of schemas. When a client application connects to a PostgreSQL server, it specifies the name of the database that it wants to access. A client cannot interact with more than one database per connection but it can open any number of connections in order to access multiple databases simultaneously.
• Command
A command is a string that you send to the server in hopes of having the server do something useful. Some people use the word statement to mean command. The two words are very similar in meaning and, in practice, are interchangeable.
• Query
A query is a type of command that retrieves data from the server.
• Table (relation, file, class)
A table is a collection of rows. A table usually has a name, although some tables are temporary and exist only to carry out a command. All the rows in a table have the same shape (in other words, every row in a table contains the same set of columns). In other database systems, you may see the terms relation, file, or even class—these are all equivalent to a table.
• Column (field, attribute)
A column is the smallest unit of storage in a relational database. A column represents one piece of information about an object. Every column has a name and a data type. Columns are grouped into rows, and rows are grouped into tables. In Figure 1.1, the shaded area depicts a single column.
Figure 1.1. A column (highlighted).
The terms field and attribute have similar meanings.
• Row (record, tuple)
A row is a collection of column values. Every row in a table has the same shape (in other words, every row is composed of the same set of columns). If you are trying to model a real-world application, a row represents a real-world object. For example, if you are running an auto dealership, you might have a vehicles table. Each row in the vehicles table represents a car (or truck, or motorcycle, and so on). The kinds of information that you store are the same for all vehicles (that is, every car has a color, a vehicle ID, an engine, and so on). In Figure 1.2, the shaded area depicts a row.
You may also see the terms record or tuple—these are equivalent to a row.
• Composite type
Starting with PostgreSQL version 8, you can create new data types that are composed of multiple values. For example, you could create a composite type named address that holds a street address, city, state/province, and postal code. When you create a table that contains a column of type address, you can store all four components in a single field. We discuss composite types in more detail in Chapter 2, "Working with Data in PostgreSQL."
• Domain
A domain defines a named specialization of another data type. Domains are useful when you need to ensure that a single data type is used in several tables. For example, you might define a domain named accountNumber that contains a single letter followed by four digits. Then you can create columns of type accountNumber in a general ledger accounts table, an accounts receivable customer table, and so on.
• View
A view is an alternative way to present a table (or tables). You might think of a view as a "virtual" table. A view is (usually) defined in terms of one or more tables. When you create a view, you are not storing more data, you are instead creating a different way of looking at existing data. A view is a useful way to give a name to a complex query that you may have to use repeatedly.
• Client/server
PostgreSQL is built around a client/server architecture. In a client/server product, there are at least two programs involved. One is a client and the other is a server. These programs may exist on the same host or on different hosts that are connected by some sort of network. The server offers a service; in the case of PostgreSQL, the server offers to store, retrieve, and change data. The client asks a server to perform work; a PostgreSQL client asks a PostgreSQL server to serve up relational data.
• Client
A client is an application that makes requests of the PostgreSQL server. Before a client application can talk to a server, it must connect to a postmaster (see postmaster) and establish its identity. Client applications provide a user interface and can be written in many languages. Chapters 8 through 19 will show you how to write a client application.
• Server
The PostgreSQL server is a program that services commands coming from client applications. The PostgreSQL server has no user interface—you can't talk to the server directly, you must use a client application.
• Postmaster
Because PostgreSQL is a client/server database, something has to listen for connection requests coming from a client application. That's what the postmaster does. When a connection request arrives, the postmaster creates a new server process in the host operating system.
• Transaction
A transaction is a collection of database operations that are treated as a unit. PostgreSQL guarantees that all the operations within a transaction complete or that none of them complete. This is an important property—it ensures that if something goes wrong in the middle of a transaction, changes made before the point of failure will not be reflected in the database. A transaction usually starts with a BEGIN command and ends with a COMMIT or ROLLBACK (see the next entries).
• Commit
A commit marks the successful end of a transaction. When you perform a commit, you are telling PostgreSQL that you have completed a unit of operation and that all the changes that you made to the database should become permanent.
• Rollback
A rollback marks the unsuccessful end of a transaction. When you roll back a transaction, you are telling PostgreSQL to discard any changes that you have made to the database (since the beginning of the transaction).
• Index
An index is a data structure that a database uses to reduce the amount of time it takes to perform certain operations. An index can also be used to ensure that duplicate values don't appear where they aren't wanted. I'll talk about indexes in Chapter 4, "Performance."
• Tablespace
A tablespace defines an alternative storage location where you can create tables and indexes. When you create a table (or index), you can specify the name of a tablespace; if you don't specify a tablespace, PostgreSQL creates all objects in the same directory tree. You can use tablespaces to distribute the workload across multiple disk drives.
• Result set
Prerequisites
Before I go much further, let's talk about installing PostgreSQL. Chapters 21, "PostgreSQL Administration," and 23, "Security," discuss PostgreSQL installation in detail, but I'll show you a typical installation procedure here.
When you install PostgreSQL, you can start with prebuilt binaries or you can compile PostgreSQL from source code. In this chapter, I'll show you how to install PostgreSQL on a Linux host starting from prebuilt binaries. If you decide to install PostgreSQL from source code, many of the steps are the same. I'll show you how to build PostgreSQL from source code in Chapter 21.
In older versions of PostgreSQL, you could run the PostgreSQL server on a Windows host but you had to install a Unix-like infrastructure (Cygwin) first: PostgreSQL wasn't a native Windows application. Starting with PostgreSQL version 8.0, the PostgreSQL server has been ported to the Windows environment as a native Windows application. Installing PostgreSQL on a Windows server is very simple; simply download and run the installer program. You do have a few choices to make, and we cover the entire procedure in Chapter 21.
Installing PostgreSQL Using an RPM
The easiest way to install PostgreSQL is to use a prebuilt RPM package. RPM is the Red Hat Package Manager. It's a software package designed to install (and manage) other software packages. If you choose to install using some method other than RPM, consult the documentation that comes with the distribution you are using.
PostgreSQL is distributed as a collection of RPM packages; you don't have to install all the packages to use PostgreSQL. Table 1.1 lists the RPM packages available as of release 7.4.5.
Don't worry if you don't know which of these you need; I'll explain most of the packages in later chapters. You can start working with PostgreSQL by downloading the postgresql, postgresql-libs, and postgresql-server packages. The actual files (at the www.postgresql.org website) have names that include a version number: postgresql-7.4.5-2PGDG.i686.rpm, for example.
I strongly recommend creating an empty directory, and then downloading the PostgreSQL packages into that directory. That way you can install all the PostgreSQL packages with a single command.
After you have downloaded the desired packages, use the rpm command to perform the installation procedure. You must have superuser privileges to install PostgreSQL.
To install the PostgreSQL packages, cd into the directory that contains the package files and issue the following command:
# rpm -ihv *.rpm
The rpm command installs all the packages in your current directory. You should see results similar to what is shown in Figure 1.3.
Figure 1.3. Using the rpm command to install PostgreSQL.
Table 1.1. PostgreSQL RPM Packages as of Release 7.4.5
Package            Description
postgresql         Clients, libraries, and documentation
postgresql-server  Programs (and data files) required to run a server
postgresql-devel   Files required to create new client applications
postgresql-jdbc    JDBC driver for PostgreSQL
postgresql-tcl     Tcl client and PL/Tcl
postgresql-python  PostgreSQL's Python library
postgresql-test    Regression test suite for PostgreSQL
postgresql-libs    Shared libraries for client applications
postgresql-docs    Extra documentation not included in the postgresql base package
The RPM installer should have created a new user (named postgres) for your system. This user ID exists so that all database files accessed by PostgreSQL can be owned by a single user.
Each RPM package is composed of many files. You can view the list of files installed for a given package using the rpm -ql command:
# rpm -ql postgresql-server
/etc/rc.d/init.d/postgresql
/usr/bin/initdb
/usr/bin/initlocation
...
/var/lib/pgsql/data
# rpm -ql postgresql-libs
/usr/lib/libecpg.so.3
/usr/lib/libecpg.so.3.2.0
/usr/lib/libpgeasy.so.2
...
/usr/lib/libpq.so.2.1
At this point (assuming that everything worked), you have installed PostgreSQL on your system. Now it's time to create a database to play, er, work in.
While you have superuser privileges, issue the following commands:
# su - postgres
bash-2.04$ echo $PGDATA
/var/lib/pgsql/data
bash-2.04$ initdb
The first command (su - postgres) changes your identity from the OS superuser (root) to the PostgreSQL superuser (postgres). The second command (echo $PGDATA) shows you where the PostgreSQL data files will be created. The final command creates the two prototype databases (template0 and template1).
You should get output that looks like that shown in Figure 1.4.
Figure 1.4. Creating the prototype databases using initdb.
You now have two empty databases named template0 and template1. You really should not create new tables in either of these databases, because a template database contains all the data required to create other databases. In other words, template0 and template1 act as prototypes for creating other databases. Instead, let's create a database that you can play in. First, start the postmaster process. The postmaster is a program that listens for connection requests coming from client applications. When a connection request arrives, the postmaster starts a new server process. You can't do anything in PostgreSQL without a postmaster. Figure 1.5 shows you how to get the postmaster started.
Figure 1.5. Creating a new database with createdb.
After starting the postmaster, use the createdb command to create the movies database (this is also shown in Figure 1.5). Most of the examples in this book take place in the movies database.
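The createdb program is a thin wrapper around the SQL CREATE DATABASE command, so the same step can be sketched in SQL. The sketch below assumes that the postmaster is already running and that you are connected to template1 as the postgres superuser; the final query only lists the databases that now exist.

```sql
-- Create the sample database; by default the new database is copied from template1.
CREATE DATABASE movies;

-- List every database known to this server. After initdb and the command
-- above, the list should include template0, template1, and movies.
SELECT datname FROM pg_database ORDER BY datname;
```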
Notice that I used the pg_ctl command to start the postmaster[1].
[1] You can also arrange for the postmaster to start whenever you boot your computer, but the exact instructions vary depending on which operating system you are using. See the section titled "Arranging for PostgreSQL Startup and Shutdown" in Chapter 21.
Figure 1.6. pg_ctl options.
If you use a recent RPM file to install PostgreSQL, the two previous steps (initdb and pg_ctl start) can be automated. If you find a file named postgresql in the /etc/rc.d/init.d directory, you can use that shell script to initialize the database and start the postmaster. The /etc/rc.d/init.d/postgresql script can be invoked with any of the command-line options shown in Table 1.2.
At this point, you should use the createuser command to tell PostgreSQL which users are allowed to access your database. Let's allow the user 'bruce' into our system (see Figure 1.7).
Figure 1.7. Creating a new PostgreSQL user.
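Like createdb, the createuser program wraps an SQL command. As a rough equivalent of the step above, and assuming you are connected as a PostgreSQL superuser, you could create the same user from within psql:

```sql
-- Create a PostgreSQL user named bruce (the name comes from the example above).
CREATE USER bruce;

-- Confirm that the server now knows about the new user.
SELECT usename FROM pg_user WHERE usename = 'bruce';
```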
Table 1.2. /etc/rc.d/init.d/postgresql Options
Option   Description
start    Start the postmaster
stop     Stop the postmaster
status   Display the process ID of the postmaster if it is running
restart  Stop and then start the postmaster
reload   Force the postmaster to reread its configuration
Connecting to a Database
  -c <query>      Run only single query (or slash command) and exit
  -d <dbname>     Specify database name to connect to (default: korry)
  -e              Echo queries sent to backend
  -E              Display queries that internal commands generate
  -f <filename>   Execute queries from file, then exit
  -F <string>     Set field separator (default: "|") (-P fieldsep=)
  -h <host>       Specify database server host (default: domain socket)
  -H              HTML table output mode (-P format=html)
  -l              List available databases, then exit
  -n              Disable readline
  -o <filename>   Send query output to filename (or |pipe)
  -p <port>       Specify database server port (default: hardwired)
  -P var[=arg]    Set printing option 'var' to 'arg' (see \pset command)
  -q              Run quietly (no messages, only query output)
You use the -d option to specify to which database you want to connect. If you don't specify a database, PostgreSQL will assume that you want to connect to a database whose name is your username. For example, if you are logged in as user bruce, PostgreSQL will assume that you want to connect to a database named bruce.
The -d and -U options are not strictly required. The command line for psql should be of the following form:
psql [options] [dbname [username]]
If you are connecting to a PostgreSQL server that is running on the host that you are logged in to, you probably don't have to worry about the -h and -p options. If, on the other hand, you are connecting to a PostgreSQL server running on a different host, use the -h option to tell psql which host to connect to. You can also use the -p option to specify a TCP/IP port number; you only have to do that if you are connecting to a server that uses a nonstandard port (PostgreSQL usually listens for client connections on TCP/IP port number 5432). Here are a few examples:
$ # connect to a server waiting on the default port on host 192.168.0.1
$ psql -h 192.168.0.1

$ # connect to a server waiting on port 2000 on host arturo
$ psql -h arturo -p 2000
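Once you are connected, you can ask the server which database, user, and port you ended up with. This is a minimal sketch for checking that the options above had the effect you expected; it assumes nothing beyond a working psql connection.

```sql
-- Which database and which user is this session using?
SELECT current_database(), current_user;

-- Which TCP/IP port is the server listening on?
SHOW port;
```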
If you prefer, you can specify the database name, host name, and TCP/IP port number using environment variables rather than using the command-line options. Table 1.3 lists some of the psql command-line options and the corresponding environment variables.
A (Very) Simple Query
At this point, you should be running the psql client application. Let's try a very simple query:
$ psql -d movies
Welcome to psql, the PostgreSQL interactive terminal.
Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit

movies=# SELECT user;
 current_user
--------------
 korry
(1 row)
movies=# \q
$
Let's take a close look at this session. First, you can see that I started the psql program with the -d movies option; this tells psql that I want to connect to the movies database.
After greeting me and providing me with a few crucial hints, psql issues a prompt: movies=#. psql encodes some useful information into the prompt, starting with the name of the database that I am currently connected to (movies in this case). The character that follows the database name can vary. A = character means that psql is waiting for me to start a command. A - character means that psql is waiting for me to complete a command (psql allows you to split a single command over multiple lines; the first line is prompted by a = character and subsequent lines are prompted by a - character). If the prompt ends with a ( character, you have entered more opening parentheses than closing parentheses.
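To see the prompt change for yourself, you might split the earlier query across two lines. The sketch below shows only what you would type; the comments describe the prompt that psql displays before each line, assuming you are connected to movies as a superuser, as in the session above.

```sql
SELECT user   -- typed at the movies=# prompt: psql is waiting for a new command
;             -- typed at the movies-# prompt: psql is waiting for the command to end
```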
You can see the command that I entered following the prompt: SELECT user;. Each SQL command starts with a verb; in this case, SELECT. The verb tells PostgreSQL what you want to do and the rest of the command provides information specific to that command. I am executing a SELECT command. SELECT is used to retrieve information from the database. When you execute a SELECT command, you have to tell PostgreSQL what information you are interested in. I want to retrieve my PostgreSQL user ID so I SELECT user. The final part of this command is the semicolon (;); each SQL command must end with a semicolon.
you won't be allowed to impersonate her. (Chapter 23 discusses security in greater detail.) If you don't provide psql with a username, it will assume the username that you used when you logged in to your host.
Table 1.3. psql Environment Variables
Command-Line Option  Environment Variable  Meaning
-d <dbname>          PGDATABASE            Name of database to connect to
-h <host>            PGHOST                Name of host to connect to
-p <port>            PGPORT                Port number to connect to
Creating Tables
Now that you have seen how to connect to a database and issue a simple query, it's time to create some sample data to work with.
Because you are pretending to model a movie-rental business (that is, a video store), you will create tables that model the data that you might need in a video store. Start by creating three tables: tapes, customers, and rentals.
The tapes table is simple: For each videotape, you want to store the name of the movie, the duration, and a unique identifier (remember that you may have more than one copy of any given movie, so the movie name is not sufficient to uniquely identify a specific tape).
Here is the command you should use to create the tapes table:
CREATE TABLE tapes (
  tape_id   CHARACTER(8) UNIQUE,
  title     CHARACTER VARYING(80),
  duration  INTERVAL
);
Let's take a close look at this command.
The verb in this command is CREATE TABLE, and its meaning should be obvious: you want to create a table. Following the CREATE TABLE verb is the name of the table (tapes) and then a comma-separated list of column definitions, enclosed within parentheses.
Each column in a table is defined by a name and a data type. The first column in tapes is named tape_id. Column names (and table names) must begin with a letter or an underscore character[2] and should be 63 characters or fewer[3]. The tape_id column is created with a data type of CHARACTER(8). The data type you define for a column determines the set of values that you can put into that column. For example, if you want a column to hold numeric values, you should use a numeric data type; if you want a column to hold date (or time) values, you should use a date/time data type. tape_id holds alphanumeric values (a mixture of numbers and letters), so I chose a character data type, with a length of eight characters.
[2] You can begin a column or table name with nonalphabetic characters, but you must enclose the name in double quotes. You have to quote the name not only when you create it, but each time you reference it.
[3] You can increase the maximum identifier length beyond 63 characters if you build PostgreSQL from a source distribution (the limit was 31 characters in releases before 7.3). If you do so, you'll have to remember to increase the identifier length each time you upgrade your server, or whenever you migrate to a different server.
The tape_id column is defined as UNIQUE. The word UNIQUE is not a part of the data type; the data type is CHARACTER(8). The keyword UNIQUE specifies a column constraint. A column constraint is a condition that must be met by a column. In this case, each row in the tapes table must have a unique tape_id. PostgreSQL supports a variety of column constraints (and table constraints). I'll cover constraints in Chapter 2.
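To watch the UNIQUE constraint at work, you can try inserting the same identifier twice. This sketch uses a scratch table (unique_demo is a hypothetical name) so that it does not disturb the tapes table:

```sql
-- A temporary table with the same first two columns as tapes;
-- it is dropped automatically when the session ends.
CREATE TEMPORARY TABLE unique_demo (
  tape_id  CHARACTER(8) UNIQUE,
  title    CHARACTER VARYING(80)
);

INSERT INTO unique_demo VALUES ('AB-12345', 'The Godfather');  -- accepted
INSERT INTO unique_demo VALUES ('AB-67472', 'The Godfather');  -- accepted: a title may repeat
INSERT INTO unique_demo VALUES ('AB-12345', 'Casablanca');     -- rejected: duplicate tape_id
```

Only the third command fails, because the constraint applies to tape_id and not to title.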
The title is defined as CHARACTER VARYING(80). The difference between CHARACTER(n) and CHARACTER VARYING(n) is that a CHARACTER(n) column is fixed length: it will always contain a fixed number of characters (namely, n characters). A CHARACTER VARYING(n) column can contain a maximum of n characters. I'll mention here that CHARACTER(n) can be abbreviated as CHAR(n), and CHARACTER VARYING(n) can be abbreviated as VARCHAR(n). I chose CHAR(8) as the data type for tape_id because I know that a tape_id will always contain exactly eight characters, never more and never less. Movie titles, on the other hand, are not all the same length, so I chose VARCHAR(80) for those columns. A fixed length data type is a good choice when the data that you store is in fact fixed length; and in some cases, fixed length data types can give you a performance boost. A variable length data type saves space (and often gives you better performance) when the data that you are storing is not all the same length and can vary widely.
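The difference between the two character types is easy to observe. The following sketch stores the same two-character string in a CHAR(8) column and a VARCHAR(8) column of a scratch table (width_demo is a hypothetical name) and then compares the stored sizes:

```sql
CREATE TEMPORARY TABLE width_demo (
  fixed     CHAR(8),
  flexible  VARCHAR(8)
);

INSERT INTO width_demo VALUES ('ab', 'ab');

-- The CHAR(8) value is padded with blanks to eight characters;
-- the VARCHAR(8) value keeps only the two characters that were stored.
SELECT octet_length(fixed)    AS fixed_bytes,
       octet_length(flexible) AS flexible_bytes
  FROM width_demo;

-- A value that is too long for the declared width is rejected.
INSERT INTO width_demo VALUES ('ab', 'far too long');
```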
The duration column is defined as an INTERVAL. An INTERVAL stores a period of time such as 2 weeks, 1 hour 45 minutes, and so on.
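As a brief illustration, interval values can be written much as they are spoken and can be combined with dates. The three-day rental period below is an assumption made up for this sketch, not something the video store example defines.

```sql
-- Interval literals use a readable, unit-based form.
SELECT INTERVAL '2 weeks'           AS two_weeks,
       INTERVAL '1 hour 45 minutes' AS running_time;

-- Intervals combine with dates and times; here, a hypothetical due date
-- three days after a rental.
SELECT DATE '2005-07-26' + INTERVAL '3 days' AS due_date;
```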
I'll be discussing PostgreSQL data types in detail in Chapter 2. Let's move on to creating the other tables in this example database.
The customers table is used to record information about each customer for the video store.
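CREATE TABLE customers (
  customer_id    INTEGER UNIQUE,
  customer_name  VARCHAR(50),
  phone          CHAR(8),
  birth_date     DATE,
  balance        NUMERIC(7,2)
);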
Each customer will be assigned a unique customer_id. Notice that customer_id is defined as an INTEGER, whereas the identifier for a tape was defined as a CHAR(8). A tape_id can contain alphabetic characters, but a customer_id is entirely numeric[4].
[4] The decision to define customer_id as an INTEGER was arbitrary. I simply wanted to show a few more data types here.
I've used two other data types here that you may not have seen before: DATE and NUMERIC. A DATE column can hold date values (century, year, month, and day). PostgreSQL offers other date/time data types that can store different date/time components. For example, a TIME column can store time values (hours, minutes, seconds, and microseconds). A TIMESTAMP column gives you both date and time components: centuries through microseconds.
A NUMERIC column, obviously, holds numeric values. When you create a NUMERIC column, you have to tell PostgreSQL the total number of digits that you want to store and the number of fractional digits (that is, the number of digits to the right of the decimal point). The balance column contains a total of seven digits, with two digits to the right of the decimal point.
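A few casts make these definitions concrete. The literal values below are invented for illustration; only the type names come from the text.

```sql
-- DATE, TIME, and TIMESTAMP hold different date/time components.
SELECT CAST('1970-12-31' AS DATE),
       CAST('14:30:00' AS TIME),
       CAST('1970-12-31 14:30:00' AS TIMESTAMP);

-- NUMERIC(7,2): at most seven digits in total, two of them after the decimal point.
SELECT CAST(12345.678 AS NUMERIC(7,2));   -- rounded to two fractional digits: 12345.68

-- This value needs eight digits in total, so the cast fails with an overflow error.
SELECT CAST(123456.78 AS NUMERIC(7,2));
```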
Now, let's create the rentals table:
CREATE TABLE rentals (
  tape_id      CHARACTER(8),
  customer_id  INTEGER,
  rental_date  DATE
);
When a customer comes in to rent a tape, you will add a row to the rentals table to record the transaction. There are three pieces of information that you need to record for each rental: the tape_id, the customer_id, and the date that the rental occurred. Notice that each row in the rentals table refers to a customer (customer_id) and a tape (tape_id). In most cases, when one row refers to another row, you want to use the same data type for both columns.
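Matching data types pay off when you combine the tables. As a preview only, and assuming that all three tables already contain sample rows, a query along these lines would list who rented which tape:

```sql
SELECT c.customer_name, t.title, r.rental_date
  FROM rentals r, customers c, tapes t
 WHERE r.customer_id = c.customer_id   -- INTEGER compared with INTEGER
   AND r.tape_id     = t.tape_id;      -- CHARACTER(8) compared with CHARACTER(8)
```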
What Makes a Relational Database Relational?
Viewing Table Descriptions
You might have noticed that the listing for the tapes and customers tables shows that an index has been created. PostgreSQL automatically creates an index for you when you define UNIQUE columns. An index is a data structure that PostgreSQL can use to ensure uniqueness. Indexes are also used to increase performance. I'll cover indexes in more detail in Chapter 3, "PostgreSQL SQL Syntax and Use."
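If you would rather query for these automatically created indexes than read them off a table listing, the pg_indexes system view has one row per index. A minimal sketch, assuming the tapes and customers tables exist in the current database:

```sql
-- Each UNIQUE column should appear here with an index of its own,
-- such as customers_customer_id_key for the customers table.
SELECT tablename, indexname
  FROM pg_indexes
 WHERE tablename IN ('tapes', 'customers');
```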
Adding New Records to a Table
The two previous sections showed you how to create some simple tables and how to view the table definitions. Now let's see how to insert data into these tables.
Using the INSERT Command
The most common method to get data into a table is by using the INSERT command. Like most SQL commands, there are a number of different formats for the INSERT command. Let's look at the simplest form first:
INSERT INTO table VALUES ( expression [,...] );
When you use an INSERT statement, you have to provide the name of the table and the values that you want to include in the new row. The following command inserts a new row into the customers table:
INSERT INTO customers VALUES
(
  1,
  'William Rubin',
  '555-1212',
  '1970-12-31',
  0.00
);
This command creates a single row in the customers table. Notice that you did not have to tell PostgreSQL how to match up
A Quick Introduction to Syntax Diagrams
In many books that describe a computer language (such as SQL), you will see syntax diagrams. A syntax diagram is a precise way to describe the syntax for a command. Here is an example of a simple syntax diagram:
INSERT INTO table VALUES ( expression [,...] );
In this book, I'll use the following conventions:
• Words that are presented in uppercase must be entered literally, as shown, except for the case. When you enter these words, it doesn't matter if you enter them in uppercase, lowercase, or mixed case, but the spelling must be the same. SQL keywords are traditionally typed in uppercase to improve readability, but the case does not really matter otherwise.
• A lowercase italic word is a placeholder for user-provided text. For example, the table placeholder shows where you would enter a table name, and expression shows where you would enter an expression.
• Optional text is shown inside a pair of square brackets ([]). If you include optional text, don't include the square brackets.
• Finally, ,... means that you can repeat the previous component one or more times, separating multiple occurrences with commas.
So, the following INSERT commands are (syntactically) correct:
INSERT INTO states VALUES ( 'WA', 'Washington' );
INSERT INTO states VALUES ( 'OR' );
This command would not be legal:
INSERT states VALUES ( 'WA' 'Washington' );
0.00 );
There are two other forms for the INSERT command. If you want to create a row that contains only default values, you can use the following form:
INSERT INTO table DEFAULT VALUES;
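Of course, if any of the columns in your table are unique and have a default value other than NULL, you can only insert a single row with default values; a second row would duplicate that default. (A unique column whose default is NULL is not restricted this way, because PostgreSQL does not treat NULL values as duplicates of one another.)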
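Here is a small sketch of that situation. The table defaults_demo and its default values are hypothetical; they exist only to show the second INSERT failing.

```sql
CREATE TEMPORARY TABLE defaults_demo (
  id      INTEGER UNIQUE DEFAULT 1,
  status  VARCHAR(10) DEFAULT 'new'
);

INSERT INTO defaults_demo DEFAULT VALUES;  -- creates the row (1, 'new')
INSERT INTO defaults_demo DEFAULT VALUES;  -- rejected: id 1 already exists
```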
The final form for the INSERT statement allows you to insert one or more rows based on the results of a query:
INSERT INTO table ( column [,...] ) SELECT query;
I haven't really talked extensively about the SELECT statement yet (that's in the next section), but I'll show you a simple example here:
INSERT INTO customer_backup SELECT * from customers;
This INSERT command copies every row in the customers table into the customer_backup table. It's unusual to use INSERT...SELECT... to make an exact copy of a table (in fact, there are easier ways to do that). In most cases, you will use the INSERT...SELECT... command to make an altered version of a table; you might add or remove columns or change the data using expressions.
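For instance, an altered copy might keep only some of the columns and transform one of them along the way. In this sketch, customer_names is a hypothetical table created just for the example:

```sql
-- Hypothetical target table holding only two of the customers columns.
CREATE TABLE customer_names (
  customer_id    INTEGER,
  customer_name  VARCHAR(50)
);

-- Copy a subset of the columns, converting each name to uppercase as it is copied.
INSERT INTO customer_names ( customer_id, customer_name )
  SELECT customer_id, upper( customer_name ) FROM customers;
```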
Using the COPY Command
If you need to load a lot of data into a table, you might want to use the COPY command. The COPY command comes in two forms. COPY ... TO writes the contents of a table into an external file. COPY ... FROM reads data from an external file into a table.
Let's start by exporting the customers table:
COPY customers TO '/tmp/customers.txt';
This command copies every row in the customers table into a file named '/tmp/customers.txt'. Take a look at the customers.txt file:
1   Jones, Henry        555-1212  1970-10-10  0.00
2   Rubin, William      555-2211  1972-07-10  15.00
3   Panky, Henry        555-1221  1968-01-21  0.00
4   Wonderland, Alison  555-1122  1980-03-05  3.00
If you compare the file contents with the definition of the customers table:
movies=# \d customers
                Table "customers"
   Attribute   |         Type          | Modifier
---------------+-----------------------+----------
 customer_id   | integer               |
 customer_name | character varying(50) |
 phone         | character(8)          |
 birth_date    | date                  |
 balance       | numeric(7,2)          |
Index: customers_customer_id_key
You can see that the columns in the text form match (left to right) with the columns defined in the table: The leftmost column is the customer_id, followed by customer_name, phone, and so on. Each column is separated from the next by a tab character and each row ends with an invisible newline character. You can choose a different column separator (with the DELIMITERS 'delimiter' option), but you can't change the line terminator. That means that you have to be careful editing a COPY file using a DOS (or Windows) text editor because most of these editors terminate each line with a carriage-return/newline combination. That will confuse the COPY ... FROM command when you try to import the text file.
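A round trip with a non-default separator might look like the sketch below. It rests on several assumptions: it uses the WITH DELIMITER spelling of the separator option, it writes to a file name chosen for this example, it loads into a hypothetical customers_copy table, and it is run by a database superuser (the server, not psql, reads and writes the file).

```sql
-- Export with a vertical bar between columns instead of a tab.
COPY customers TO '/tmp/customers.bar' WITH DELIMITER '|';

-- Create an empty table with the same column layout as customers.
CREATE TABLE customers_copy AS SELECT * FROM customers WHERE false;

-- Load the file back; the delimiter must match the one used to write it.
COPY customers_copy FROM '/tmp/customers.bar' WITH DELIMITER '|';
```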
Installing the Sample Database
If you want, you can download a sample database from this book's website: http://www.conjectrix.com/pgbook.
After you have downloaded the bookdata.tar.gz file, you can unpack it with either of the following commands:
$ tar -zxvf bookdata.tar.gz
or
$ gunzip -c bookdata.tar.gz | tar -xvf -
The bookdata.tar.gz file contains a number of files and will unpack into your current directory. After unpacking, you will see a subdirectory for each chapter (okay, for most chapters; not all chapters include sample code or sample data).
You can use the chapter1/load_sample.sql file to create and populate the three tables that I have discussed (tapes, customers, and rentals). To use the load_sample.sql file, execute the following command:
$ psql -d movies -f chapter1/load_sample.sql
Retrieving Data from the Sample Database
At this point, you should have a sample database (movies) that contains three tables (tapes, customers, and rentals) and a few rows in each table. You know how to get data into a table; now let's see how to view that data.
The SELECT statement is used to retrieve data from a database. SELECT is the most complex statement in the SQL language, and the most powerful. Using SELECT, you can retrieve entire tables, single rows, a group of rows that meet a set of constraints, combinations of multiple tables, expressions, and more. To help you understand the basics of the SELECT statement, I'll try to break it down into each of its forms and move from the simple to the more complex.
SELECT Expression
In its simplest form, you can use the SELECT statement to retrieve one or more values from a set of predefined functions. You've already seen how to retrieve your PostgreSQL user id:
movies=# select user;
 current_user
--------------
 korry
(1 row)
movies=# \q
Other values that you might want to see are
select 5;            -- returns the number 5 (whoopee)
select sqrt(2.0);    -- returns the square root of 2
select timeofday();  -- returns current date/time
select now();        -- returns time of start of transaction
select version();    -- returns the version of PostgreSQL you are using
select now(), timeofday();
The previous example shows how to SELECT more than one piece of information: just list all the values that you want, separated by commas.
The PostgreSQL User's Guide contains a list of all the functions that are distributed with PostgreSQL. In Chapter 2, I'll show you how to combine columns, functions, operators, and literal values into more complex expressions.
SELECT * FROM Table
You probably won't use the first form of the SELECT statement very often; it just isn't very exciting. Moving to the next level of complexity, let's see how to retrieve data from one of the tables that you created earlier:
movies=# SELECT * FROM customers;
 customer_id |    customer_name     |  phone   | birth_date | balance
-------------+----------------------+----------+------------+---------
           3 | Panky, Henry         | 555-1221 | 1968-01-21 |    0.00
           1 | Jones, Henry         | 555-1212 | 1970-10-10 |    0.00
           4 | Wonderland, Alice N. | 555-1122 | 1969-03-05 |    3.00
           2 | Rubin, William       | 555-2211 | 1972-07-10 |   15.00
(4 rows)
When you write a SELECT statement, you have to tell PostgreSQL what information you are trying to retrieve. Let's take a closer look at the components of this SELECT statement.
Following the SELECT keyword, you specify a list of the columns that you want to retrieve. I used an asterisk (*) here to tell PostgreSQL that we want to see all the columns in the customers table.
Next, you have to tell PostgreSQL which table you want to view; in this case, you want to see the customers table.
Now let's look at the results of this query. A SELECT statement returns a result set. A result set is a table composed of all the rows and columns (or fields) that you request. A result set may be empty.
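An empty result set is still a result set: it has columns but no rows. As a sketch, assuming the four sample customers shown above (none of whom owes anywhere near this much), the following query returns the five column headings and zero rows:

```sql
-- No customer in the sample data has a balance this large,
-- so psql prints the column headings followed by "(0 rows)".
SELECT * FROM customers WHERE balance > 1000.00;
```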
Commenting
I don't mean to scare you away from the NULL value (it's very useful and often necessary), but you do have to understand the complications that it introduces.
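One such complication is three-valued logic: a comparison or logical expression that involves NULL can evaluate to NULL rather than TRUE or FALSE, as summarized in the truth tables for AND, OR, and NOT in this chapter (Tables 1.4 through 1.6). The following sketch checks a few of those entries; the last query assumes the tapes table with its duration column.

```sql
-- Comparing anything with NULL yields NULL, not TRUE or FALSE.
SELECT NULL = NULL;                          -- NULL (displayed as an empty field)

-- AND and OR give a definite answer only when NULL cannot change the outcome.
SELECT (FALSE AND NULL) AS false_and_null,   -- FALSE
       (TRUE  AND NULL) AS true_and_null,    -- NULL
       (TRUE  OR  NULL) AS true_or_null,     -- TRUE
       (FALSE OR  NULL) AS false_or_null;    -- NULL

-- Use IS NULL, not "= NULL", to find rows with a missing value.
SELECT title FROM tapes WHERE duration IS NULL;
```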
NULLIF() and COALESCE()
PostgreSQL offers two operators that can convert a NULL value to some other value or convert a specific value into NULL.
The COALESCE() operator will substitute a default value whenever it encounters a NULL. For example, pretend that you've added two more columns, male_lead and female_lead, to the tapes table so that it looks like this:
movies=# SELECT * from tapes;
 tape_id  |     title     |    male_lead    |  female_lead   | duration
----------+---------------+-----------------+----------------+----------
 AB-12345 | The Godfather | Marlon Brando   |                | 02:55:00
 AB-67472 | The Godfather | Marlon Brando   |                | 02:55:00
 MC-68873 | Casablanca    | Humphrey Bogart | Ingrid Bergman | 01:42:00
 OW-41221 | Citizen Kane  |                 |                | 01:59:00
 AH-54706 | Rear Window   | James Stewart   | Grace Kelly    |
 AH-44289 | The Birds     |                 | Tippi Hedren   | 01:59:00
(6 rows)
You can use the COALESCE() operator to transform a NULL male_lead into the word 'Unknown':
movies=# SELECT title, COALESCE( male_lead, 'Unknown' ) FROM tapes;
     title     |    coalesce
---------------+-----------------
 The Godfather | Marlon Brando
 The Godfather | Marlon Brando
 Casablanca    | Humphrey Bogart
 Citizen Kane  | Unknown
 Rear Window   | James Stewart
 The Birds     | Unknown
(6 rows)
Table 1.4. Truth Table for Three-Valued AND Operator
 a     | b     | a AND b
-------+-------+---------
 TRUE  | TRUE  | TRUE
 TRUE  | FALSE | FALSE
 TRUE  | NULL  | NULL
 FALSE | FALSE | FALSE
 FALSE | NULL  | FALSE
 NULL  | NULL  | NULL
Source: PostgreSQL User's Guide
Table 1.5. Truth Table for Three-Valued OR Operator
 a     | b     | a OR b
-------+-------+--------
 TRUE  | TRUE  | TRUE
 TRUE  | FALSE | TRUE
 TRUE  | NULL  | TRUE
 FALSE | FALSE | FALSE
 FALSE | NULL  | NULL
 NULL  | NULL  | NULL
Source: PostgreSQL User's Guide
Table 1.6. Truth Table for Three-Valued NOT Operator
 a     | NOT a
-------+-------
 TRUE  | FALSE
 FALSE | TRUE
 NULL  | NULL
The COALESCE() operator is more talented than we've shown here: it can search through a list of values, returning the first non-NULL value it finds. For example, the following query prints the male_lead, or, if male_lead is NULL, the female_lead, or, if both are NULL, 'Unknown':
movies=# SELECT title, COALESCE( male_lead, female_lead, 'Unknown' )
movies-#   AS "Starring"
movies-#   FROM TAPES;
     title     |    Starring
---------------+-----------------
 The Godfather | Marlon Brando
 The Godfather | Marlon Brando
 Casablanca    | Humphrey Bogart
 Citizen Kane  | Unknown
 Rear Window   | James Stewart
 The Birds     | Tippi Hedren
(6 rows)
You can string together any number of expressions inside of the COALESCE() operator (as long as all expressions evaluate to the same type) and COALESCE() will evaluate to the leftmost non-NULL value in the list. If all of the expressions inside COALESCE() are NULL, the entire expression evaluates to NULL.
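If you want to try these rules without touching the tapes table, the following sketch uses literal values only. It is an illustration rather than part of the original text, and the column aliases are made up. It also shows NULLIF(), the operator that goes the other way: NULLIF(a, b) evaluates to NULL when a equals b, and to a otherwise.

```sql
SELECT
    -- The leftmost non-NULL value wins, so this evaluates to 'second'.
    COALESCE( NULL, 'second', 'third' )          AS leftmost_non_null,
    -- Every argument is NULL, so the whole expression is NULL.
    COALESCE( NULL::text, NULL::text ) IS NULL   AS all_null_gives_null,
    -- The two arguments are equal, so NULLIF() evaluates to NULL.
    NULLIF( 'Unknown', 'Unknown' ) IS NULL       AS equal_gives_null,
    -- The arguments differ, so NULLIF() evaluates to its first argument.
    NULLIF( 'Grace Kelly', 'Unknown' )           AS unequal_gives_first;
```

Taken together, the two operators are rough inverses: COALESCE() replaces a NULL with a placeholder, and NULLIF() turns a placeholder back into NULL.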
The CASE Expression
movies-# WHEN male_lead = 'James Stewart' THEN 'great movie'
movies-# WHEN duration > '2 hours' THEN 'long movie'
customers who have a balance over $10:
 customer_id | customer_name | phone | birth_date | balance
-------------+---------------+-------+------------+---------
expression, PostgreSQL uses "?column?" for the field header[10].