-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Databases - database systems dbms persistence storage programming interface transaction management operations DDL data definition languages used to define database schema DML data manipulation language used for querys, really for CRUD create INSERT retreive SELECT update UPDATE delete DELETE applications airline reservations banking systems corporate records filesytems don't typically support efficient search larger numbers of small updates complex queries "controlled buffering"? file systems do cache... atomic/independant transactions relational database systems terms relations are tables attributes are the columns tuples are the rows trends sizing small to desktops applications big to tera and peta buytes tertiary storage parallel computing systems client-server n-tier multimedia data information integration components ddl compiler query compiler transaction manager execution manager logging and recovery concurrency control (lock tables) index/file/record manager buffer manager (buffers) storage manager (storage) ACID atomicity - all or nothing execution consistency - consistency constraints must be observed isolation - each much appear to be executed without interaction from others durability - the effect must never be lost once completed - Entity-Relationship Data Model entity-relationship model (aka E/R model) entity sets attributes relationships E/R diagrams notational symbols entity sets are rectangle (double border for weak entity sets, see below) attributes are ovals (underlines for key members, see below) relationships are diamonds (double diamond for supporting relationships, see weak entity sets below) arrow from movie to studio implies only one studio can own a movie (one-many) rounded arrow for referential integrity constraint (see below) diagram is for schema of database, that is structure or design a database with particular data is refered to as an instance multiplicity of binary e/r relationships R is many-one from E to F: E can be connected by R to a most one F R is one-one: E is many-one to F and F is many-one to E example Studios is one-one with Presidents R is many-many: in other cases multiway relationships an arrow pointing to E means that if we select one from each of the othr sets, those are rleated to a most one in E contract: stars, movies, ->studios. for the pair of any star and any movie, there is a contract with just one studio not as powerful as functional dependencies roles in relationships if a relationship includes the same set twice examples movies sequel-of or user supervisor-of label edges with role in the relationship examples origina/sequel employee/supervisor attributes on relationshives example attach salary to contract relationship unnecessary, could create new relation to hold atributes, but useful sugar convert multiway relationships to binary turn multiway relationship into an a new entity set then put binary relationships along old arrows note multiplicity might change (especially if functional dependencies were hidden) subclasses in E/R model isa relationship (triangle around word "isa", which is short for "is a") assume root set of the hierarchy this root has a key for every representative of the hierarchy an entity may have components in any subtree, as long as root is included roger rabbit would be in cartons, murder-mysteries, and movies, not quite OO model design issues be faithful to what you are modeling avoid redundancy simplicity - modeling of constraints classification keys: uniquely identify an entity single-value constraints: for example for many-to-one relationships referential integrity constraints: avoid dangling references domain constraints: attribute must be from specific set or range general constraints: arbitrary, only 10 stars per movie, general constraint expression language to come keys are represented by underlined attribute names only one key set per diagram with this syntax single value constraints either comments on attributes (non-null) or arrows for many-one, one-one relationships referential integrity use rounded arrow head to indicate referential integrity required non-null, one-many for example need non-null since one-many can mean at most one. either forbid deletes or cascade them updates another issue weak entity sets notation entity set has double rectangle supporting relationships have double diamond weak if some keys part of another set E's key consists of zero or more of its E's attributes key attributes from sets reached be certain many-one "supporting relationships" must have binary many-one relationship R from E to F R must have referential integrity from E to F the attributes from F must be key attributes of F if F as weak, recurse... if there are multiple relationships from E to F, might have same attributes with different values for the different "roles" example studios (pk=name) crews (pk=number+studio's name) - relational data model definitions by example relation Movies attributes title, year, length, filmType schema Movies(title, year, length, filmType) set of schemas relational database schema, database schema, design tuples (Star Wars, 1977, 124, color) domain domain of attributes must be atomic, no compound structs or nested tuples color blackAndWhite equivalent relations relations are sets, so order of tuples does not matter relation instances a set of tuples for a given relation at a specific time "current instance" is all that is kept in commerical rdbms from e/r diagram to relational designs entity set -> relation relationship -> relation whose attributes are the keys for the connected entity sets sometimes want to combine relations into one relation for simplicity many-one relation from E to F Movies - Owns -> Studio title year length filmType studioName new relation all the attributes of E all the key attributes of F and relationship belonging to relationship R weak entity relation for weak enity W must include attributes for W and all key attributes from W from other sets all the keys of W plus from the other sets are the new key however, the relationship R need not create a new relation, collapse as in Movie-Owns->Studio example isa cases: follow the e/r view for each set create a relation treat entities as objects for each subtree including root, create one relation this even means multiple diamond inheritence like Movies alone Movies and Cartoons only Movies and Murder-Mysteries only Movies and Cartoons and Murder-Mysteries roger rabbit would be in last case use null values create one relation for all the entity sets in the hierarchy Move(title, year, length, filmType, weapon) NULL weapon for non-murder-mysteries tradeoffs queries expensive to answer queries over multiple relations null approach works well here superclass queries queries just on movie attributes work well with e/r view, object view has to hit all relations subclass queries work best on object-oriented approach, worst in e/r view we would like not to use too many relations null -> one relation e/r -> n relations oo -> n^2 relations for root + n children case (but who does this that way...) we would like to minimze space and avoid repeating information oo -> one tuple per entity, no extra attributes nulls -> one tuple per entity but unneed null attributes er -> several tuples, but only keys duplicated functional dependencies (FDs) definition if two tuples agree on a set of A attribute, then they must agree on B A1, A2, ..., An -> B or if they imply more than one A1, A2, ..., An -> B1, B2, ..., Bn example Movies(title, year, length, filmType, studioName, starName) title year -> length title year -> filmType title year -> studioName or concisely title year -> length filmType studioName key redefition set of attributes is a key for R if 1.) those attributes functionally determine all the other attributes 2.) no proper subset functionally determines all other attributes a key must be minimal by 2.) however minimal does not mean minimum key size, just that you can't remove any attributes ABC and DE could both be minimal keys, even if DE is the minimum size superkeys only holds to clause 1, not 2. all keys are also superkeys discovering keys rule one: if a relation comes from an entity set, then the key is they key attributes rule two: for binary relationship R many-many: keys of both connected entity sets are key of R many-one from E1 to E2: keys of E1 one-one: keys of either rule three: multiway relationship R with arrow to E, no general answer, but one guarantee there is at least one key for the corresponding relation that excludes the key of E rules for functional dependencies trivial FD's A1, A2, ..., An -> B1 A1, A2, ..., An -> A1 (everything on the left can imply itself) actually true for any subset of the left on the right splitting/combining combining A1, A2, ..., An -> B1 A1, A2, ..., An -> B1 A1, A2, ..., An -> B1, B2 splitting A1, A2, ..., An -> B1, B2 A1, A2, ..., An -> B1 A1, A2, ..., An -> B1 augmentation (more for proofs) A1, A2, ..., An -> B1, ..., Bm A1, A2, ..., An, C1...Ck -> B1, ..., Bm, C1...Ck closure 1.) let X bet a set of attributes A that will become the closure initialize this with {A1, ..., An} the trivial FD's 2.) find some FD B1..Bm -> such that B1...Bm are in A, but not C. Add C to X 3.) repeat 2 until no more attributes can be added to X 4.) X = {A1...An}+ (the plus exponent sign is the closure operator for a set of attributes) transitive rule if A1..An->B1...Bm and B1..Bm->C1...Ck, then A1..An->C1...Ck also holds for R title year -> studioName studioName -> studioAddr title year -> studioAddr Armstrongs's axioms can be used to derive FD's w/o closure reflexivity (trivial) + augmentation + transitive closing sets distinguish between given set of FD's and derived FD's basis set of FD's can derived all others, can be mimimal projecting dependencies relation R with FD's f is projected by eliminating attributes, creating S what FD's of R still apply to S? compute all FD's that follow from F involve only attributes of S simplifying new FD's worse case exponential in the number of attributes of S however closing the empty set and thse set of all attributes cannot yield a set of non-attributes if we already know that the closure of some set X is all attributes, then we can't discover any more with supersets example R(A,B,C,D) S(A,C,D) FD's of R A->B B->C C->D A->C C->D design of relational database schemas anomalies redundancy update anomalies deletion anomalies examples Movies title year length starName Star Wars 1977 124 Harrison Ford Star Wars 1977 124 James Earl Jones Mighty Dux 1991 104 Emilio Estevez redundancy length twice for star wars update anomalies update length on only one tuple deletion anomalies delete emilio as star, deletes two record for dux decomposing relations decompose R A... to S B... and TC... such that {A...} = {B...} union {C...} tuples in S are projected from R, droping duplicates tuples in T, same as S example Movies1 title year length Star Wars 1977 124 Mighty Dux 1991 104 Movies2 title year starName Star Wars 1977 Harrison Ford Star Wars 1977 James Earl Jones Mighty Dux 1991 Emilio Estevez redundancy only in keys update anomaly gone delete anomaly gone Boyce-Codd Normal Form (BCNF) A relation R is in BCNF iff whenever there is a nontrivial FD A... -> B for R, then it is the case that A... is a superkey for R. that is the left side of every FD must contain a key example Movies(title year length starName) key is title year starName FD title year -> length however, FD left side does NOT contain full key, therefore NOT BCNF Movies1(title year length) key is title year FD title year -> length FD left side does contain key, therefore BCNF decomposing into BCNF for any violating FD, decompose into two tables one of both the left and right side of the FD the other with all the attributes not on the right of the FD repeat until no more violating FD's example Movies(title year length starName) key is title year starName FD title year -> length so we have BCNF conflict Movies1(title year length) (left+right) Movies2(title year starName) (all-right) note left+right union all-right = all attributes, we don't lose data note that we can recover information by joining relations back to original Third Normal Form (3NF) definition a relation r is in 3NF if: whenever A...->B is a nontrivial FD, either A... is a superkey (BCNF) or B is a member of some key example Bookings(title, theater, city) FD theater -> city title city -> theater Keys title city theater title Multivalued Dependencies (MVDs) an assertion that two attributes or sets of attributes are independent of one another FD's are not enough, we might be in BCNF and still have redundancy can happen if we put two many-many relationships in one relation example Stars-in name street city title year C. Fisher 123 Maple St. Hollywood Star Wars 1977 C. Fisher 123 Maple St. Hollywood ESB 1980 C. Fisher 5 Locust Ln. Malibu Star Wars 1977 C. Fisher 5 Locust Ln. Malibu ESB 1980 there are no FD's here, but lots of redundancy MVD A... ->> B... for all tuples that agree on A's, there is some that agree on B's more precisely, for each pair of tuples t and u that agree on A's, we can find a tuple v that agrees: 1.) with both t and u on A's, 2.) with t on the B's, and 3.) with u on the attributes of R that are not among the A's or B's example name ->> street city rules trivial A... ->> B... A... ->> B...+{some of the A's} transative no splitting/combining every FD is a MVD complement A... ->> B... A... ->> C... (where C... are all the attributs that aren't in A... or B...) Fourth Normal Form (4NF) whenever A...->>B..., is a nontrivial MVD, then A... is a superkey nontrivial is the extensions of MVD's, but also the MVD of all attributes example Stars-in(name street city title year) key is name street city title year) one FD is name ->> street city violation of 4NF decompose as for BCNF Stars-in1(name street city) MVD "name ->> street city" is now trivial, it involves all keys Stars-in2(name title year) MVD "name ->> title year" likewise relations between normal forms 4NF -> BCNF -> 3NF Property 3NF BCNF 4NF ------------------------------------------------------------------------ Eliminate redundancy due to FDs Most Yes Yes Eliminate redundancy due to MVDs No No Yes Preserve FDs Yes Maybe Maybe Preserve MVDs Maybe Maybe Maybe - Other Data Models object orient concepts types record structures collection types reference types classes and objects object identity methods class hierarchies odl types atomic types integer floag character, string, boolean, enumerations class names type constuctors colletions Set Bag List Array (i is length) Dictionary Struct N { T1 F1, ..., Tn Fn } class declarations class Movie { attibute string title; attibute integer year; attibute integer length; // note enumeration attibute enum Film {color, blackAndWhite} filmType; // many-many relationship Set stars inverse Start::starredIn; // many-one relationship Studio ownedBy inverse Studio::owns; // method examples float lengthInHours() raises(noLengthFound); void starNames(out Set); void otherMovies(in Star, out Set) raises(noSuchStar); } class Stars { attibute string name; // note struct attibute Struct Addr {string streer, string city} address; relationship Set starredIn inverse Movies::stars; } class Studio { attibute string name; attibute string address; // one-many relationship Studio ownedBy inverse Studio::owns; } multiway relationship in ODL subclasses in ODL examples class Cartoon extends Movie { relationship Set voices; } class MurderMystery extends Movie { attribute string weapon; } multiple inheritence example class CartoonMurderMystery extends MurderMystery : Cartoon; standard doesn't dictate resolution disallow just pick one by declaration order give a new name in the conflicting property name extents class Movie (extent Movies) { ... } OQL uses extent name (entity set) not class name (schema) keys class Movie (extent Movies key (title, year)) { ... } class Star (extent Stars key name) { ... } class N (extent Stars key foo, (bar, baz), qux) { ... } commas between key sets from ODL to relations issues may need to generate keys ODL attributes are not always atomic, need to be split collections often lead to unnormalized relations methods (punt) bags need to add count attribute to tuples list need to add count a count attribute to tuples array need to add to dupliate attribute names street1 ciry1 street2 city2 dictionary set, attribute are container key attributs, domain key attributs, range key attributes struct each field becomes attribute (with possible renaming) object-relational model extensions to relational structured types for attributes (like ODL, including collections) nested relations methods identifers for tuples references comparison to ODL objects and tuples extents and relations methods type systems references and object-id's backwards compatibility conversion from ODL to object-relational semistructured data self-describing, schema attached to data itself roles integration of databases similar data with different schemas document model in notations such as XML lots notes apparently collection of nodes leaf: attributes interior: arcs out arcs: labeled, both for collection/containment and relationships root node information integration vs semistructured data legacy database problem XML and its data model Extensible Markup Language document element = well-formed XML - semistructured data without DTD followed by root tag valid xml - DTD dtd = document type definition example with all tags ]> example with attributes ]> - relational algebra sets vs bags an algebra of relational operations atomic operands 1.) variables that stand for relations 2.) constants, which are finite relations operations a.) set operations: union, intersection, difference need to have the same attributes need to be sorted syntax sets b.) operations that remove parts of a relation projection - eliminates some columns (attributes) syntax pi a1...an(r) pi title,year,movie(Movie) selection - eliminates some rows (tuples) syntax sigma C(r) sigma lenth>100(Movie) c.) operations that combine the tuples of two relations A B 1 2 3 4 B C D 2 5 6 4 7 8 9 10 11 Cartesian product (aka cross product, product) syntax R x S A R.B S.B C D 1 2 3 5 6 1 2 4 7 8 1 2 9 10 11 3 4 3 5 6 3 4 4 7 8 3 4 9 10 11 join operations natural joins syntax R>< S sigma C(R X S) C = R.A1 = S.A1 AND ... AND R.An = S.An linear notation M(MA...) := sigma C1 Movies S(MA...) := sigma C2 Studios I(MA...) := M union S Answer (title, year) = rho (title,year)(piA...)(I) relational operations on bags union, intersection, difference based on counts projection leaves duplicates selection leaves duplicates product includes input dups joinws leaves duplicates algebraic laws can be different commutative still true but (R union S) - T = R-T union S-T is false extended operations duplicate-elimination operator delta(R) turns a bag into a set aggregation operators sum avg min/max count (not necessarily distinct) grouping operator syntax gammaList List elements can be attributes of the relation R (the grouping attribute) aggregation operators applied to an attribute of the relation construction partition the tuples into groups for each group, produce on etuble consisting of the grouping attributes' value for that group the aggregration over all tuples of that group, for the aggregration attributs in L delta is a special case of gamma delta(R) = gammaA...(R), that is group on all attributes extended projection elements can be a single attribute of R x -> y, renaming x to y E -> z where E is an expression containing attributes of R constants arithmetic operators string operators z is a cnew name a+b -> z c||d -> z concatenation sorting operator taoList(R) List of attributes funny because it returns a list of tuples, not a set if another operator occurs after, then it's a noop outerjoins including in join result any dangling tuples, pad missing attributes with nulls syntax R >o< S natural full outerjoin syntax R >ooo address sigma (MS1.name = MS2.name AND MS1.address != MS2.address) (MS1xMS2) = {} that is, if two stars have the same name, they better have the same address - SQL SELECT attributes FROM relations WHERE conditions SELECT L FROM R WHERE C => pi L(sigma C(R)) more general in this order SELECT, FROM, WHERE. GROUP BY, HAVING, ORDER BY only first two required projection SELECT a FROM b attributes SELECT a AS c FROM b renaming SELECT count(a) FROM b aggregration SELECT a+c FROM b expressions expression operators + - * / || concatenation selection WHERE operations = <> < > <= >= AND OR NOT LIKE % _ LIKE 'x%%x%' ESCAPE 'x' IS NULL IS NOT NULL set operators UNION, INTERSECT, EXCEPT set comparisions operators EXISTS, IN, > ALL, > ANY types strings 'Disney' 'Michael''s' CHAR(n), VARCHAR(n) bit string B'011' X'DEADBEEF' BIT(n), BIT VARYING(n) boolean TRUE FALSE BOOLEAN numbers 1970 -12.34 1.23E45 INT, INTEGER(always synonym for INT), SHORT INT, FLOAT, REAL, DOUBLE PRECISION, DECIMAL(n digits, d decimal), NUMERIC(n,d) (usually synonym for DECIMAL) date DATE '1948-05-14' DATE can be used as string time TIME '15:00:02.5' TIME can be used as string timestamp TIMESTAMP '1948-05-14 12:00:00' TIMESTAMP can be used as string null (aka NULL) uses value unknown unknown birthdate value inapplicable no spouse value withheld unlisted phone rules 1.) NULL with any arithmetic operator and any operand, including NULL, equals NULL 2.) NULL with any comparison operator and any operand, including NULL, equals UNKNOWN UNKNOWN rules TRUE = 1 FALSE = 0 UNKNOWN = 0.5 AND = the AND of two truth values is the minimum OR = the OR of two truth values is the maximum NOT = 1-v ordering output ORDER BY ORDER BY filmType, length DESC ORDER BY filmType ASC products and joins product = listing multiple relations in FROM clause join = WHERE clause condition conflicting attributes, prefixed with relation name "dot" attribute SELECT MovieStar.name, MovieExec.name FROM MovieStar, MovieExec WHERE MoveieStar.address = MoveieExec.address conflicting attributes from same table, prefixed with tuple variable "dot" attribute SELECT Star1.name, Star2.name, FROM MovieStar Star1, MovieStar Star2 WHERE Star1.name = Star2.name AND Star1.name < Star2.name ; NOTE: the ordering AND clause prevents duplicate results of X,Y Y,X interpretations nested loops parallel assignment conversion to relational algebra union, intersection, difference parenthesize sub-relations when using UNION, INTERSECT, EXCEPT subqueries that produce scalars, can compare to other values in where clause ... WHERE x = (SELECT ...) can return relations used in where clause ... WHERE EXISTS (SELECT ...) test for set existance (emptyness) ... WHERE x in (SELECT ...) scalar in set ... WHERE (x,y) in (SELECT ...) tuple in set ... WHERE x > ALL (SELECT ...) all x greater than set (can negate use with other 5 operators) ... WHERE x > ANY (SELECT ...) some x greater than set (can negate use with other 5 operators) correlated subquery SELECT title FROM Movie Old WHERE year < ANY (SELECT year FROM Movie WHERE title = Old.title); note that the subquery must be evalued once for every value of the outer loops title finds the older movies with the same title (original versions of remakes) can return relations used in from clauses like a stored relation SELECT ... FROM (SELECT ...) Alias... WHERE ... Alias is required joins products Movie CROSS JOIN StarsIn natural joins Movie NATURAL JOIN MovieExec joins on commonly named attributes of name and address in this example theta joins Movie JOIN StarsIn ON title = movieTitle AND year = movieYear remove duplicate columns SELECT ... FROM Movie JOIN StarsIn ON title = movieTitle AND year = movieYear outer joins MovieStar NATURAL FULL OUTER JOIN MovieExec MovieStar NATURAL LEFT OUTER JOIN MovieExec MovieStar NATURAL RIGHT OUTER JOIN MovieExec full relation operations duplicate removal SELECT DISTINCT keep duplicates UNION ALL, INTERSECT ALL, EXCEPT ALL aggregation SUM AVG MIN MAX COUNT SELECT COUNT(starName) FROM StarsIn; SELECT COUNT(DISTINCT starName) FROM StarsIn; SELECT AVG(netWorth) FROM MovieExec; grouping SELECT studioName, SUM(length) FROM Movie GROUP BY studioName; only can include grouped attributes or aggregation in result set HAVING apply condition to group SELECT name, SUM(length) FROM MovieExec, Movie WHERE producerC# = cert# GROUP by name HAVING MIN(year) < 1930; modifications insertion INSERT INTO R(A...) VALUES (V...) missing values defaulted per schema or NULL INSERT INTO R(A...) subquery deletion DELETE FROM R WHERE ... updates UPDATE R SET name = 'value' WHERE ... DDL see types above tables CREATE TABLE MovieStar ( name CHAR(3), address VARCHAR(255), gender CHAR(1) DEFAULT '?, note default birthdate DATE DEFAULT DATE '0000-00-00', ssn INTEGER UNIQUE, note unique constraint (see next line and UNIQUE INDEX below) UNIQUE(ssn), another way to make an attribute or set of attribute unique PRIMARY KEY (name)); note primary key declaration DROP TABLE MovieStar; ALTER TABLE ADD phone CHAR(16) DEFAULT 'unlisted'; ALTER TABLE DROP birthdate; indexes CREATE INDEX YearIndex ON Movie(year); CREATE INDEX KeyIndex ON Movie(title, year); CREATE UNIQUE INDEX SSNIndex ON MovieStar(ssn); DROP INDEX YearIndex; views CREATE VIEW ParamountMovie AS SELECT title, year FROM Movie WHERE studioName = 'Paramount' SELECT title FROM Paramount WHERE year = 1979; rename attributes CREATE VIEW ParamountMovie (title, year) AS modifying views (aka updateable views) update/insert/delete insert is hardest WHERE clause must not involve a subquery list in the select must include enough attributes that an tuple inserted includes all non-defaultable attributes fix this example with studioName, otherwise we can't insert a tuple with required attribute studioName all results always have studioName='Paramount' CREATE VIEW ParamountMovie AS SELECT studioName, title, year FROM Movie WHERE studioName = 'Paramount' DROP VIEW ParamountMovie; - constraints and triggers primary keys and unique keys added to create table above unique indexes added to create index above foreign keys referenced attribute must be primary or unique key values in first relation must exist in referenced releation CREATE TABLE Studio ( ... presC# INT REFERENCES MovieExec(cert#), ...OR... presC# INT, FOREIGN KEY (presC#) REFERENCES MovieExec(cert#) ) maintaining referential integrity default policy reject violating modifications cascade policy delete row and references update row and references Set-Null policy cascade update, but replace references with null example presC# INT REFERENCES MovieExec(cert#) ON DELETE SET NULL ON UPDATE CASCADE deferring constraints defer constraint checks until transaction commit by default not deferrable even for deferrable, can choose if we want default to be deferrable, or need transaction to override DEFERRABLE or NOT DEFERRABLE DEFERRABLE INITIALLY DEFERRED or DEFERRABLE INITIALLY IMMEDIATE presC# INT UNIQUE REFERENCES MovieExec(cert#) DEFERRABLE INITIALLY DEFERRABLE if named, toggle status with (naming covered later) SET CONSTRAINT MyConstraint DEFERRED; SET CONSTRAINT MyConstraint IMMEDIATE; attribute and tuple constraints not-null constraints presC# INT REFERENCES MovieExec(cert#) NOT NULL attribute based check constraints presC# INT REFERENCES MovieExec(cert#) CHECK (pres#>= 100000) gender CHAR(1) CHECK gender IN ('F', 'M)) tuple based check constraints CHECK (gender = 'F' OR name NOT LIKE 'Ms.%') altering constraints ALTER TABLE MovieStart DROP CONSTRAINT NameIsKey; ALTER TABLE MovieStart DROP CONSTRAINT NoAndro; ALTER TABLE MovieStart DROP CONSTRAINT RightTitle; ALTER TABLE MovieStart ADD CONSTRAINT NameIsKey PRIMARY KEY (name); ALTER TABLE MovieStart ADD CONSTRAINT NoAndro CHECK gender IN ('F', 'M)); ALTER TABLE MovieStart ADD CONSTRAINT RightTitle CHECK (gender = 'F' OR name NOT LIKE 'Ms.%'); schema level constraints and triggers CREATE ASSERTION name CHECK condition; DROP ASSERTION name; triggers: event-condition-action rules example CREATE TRIGGER name AFTER UPDATE OF attribute ON relation (or BEFORE) (or INSERT or DELETE, but then no OF clause) REFERENCING OLD ROW AS oldName (for INSERT, then no OLD clause) (or if FOR EACH STATEMENT and not ROW, then OLD TABLE) NEW ROW AS newName (for DELETE, then no NEW clause) (or if FOR EACH STATEMENT and not ROW, then NEW TABLE) FOR EACH ROW (or default FOR EACH STATEMENT, then we'd drop OLD ROW and NEW ROW) WHEN condition (optional) UPDATE ... (or INSERT or DELETE or BEGIN...END) instead of replace BEFORE/AFTER with INSTEAD OF, good for updatable VIEWs - System aspect of SQL SQL in programming environment impedance mismatch examples C no sets SQL no pointers SQL/Host Language interface EXEC SQL ... SQLSTATE returns 5 digit error code. 00000 OK, 02000 not tuple found, 21000 too many declares EXEC SQL BEGIN DECLARE SECTION char studioName[50], studioAddr[256] char SQLSTATE[6] EXEC SQL END DECLARE SECTION reference these variables with :studioName etc in EXEC SQL statement single-row select SELECT ... INTO :presNetWorth cursors general, reading EXEC SQL DECLARE cursor CURSOR FOR EXEC SQL OPEN cursor EXEC SQL FETCH FROM cursor INTO :worth #define NO_MORE_TUPLES !(strcmp(SQLSTATE,"02000")) EXEC SQL CLOSE cursor updating and deleting EXEC SQL UPDATE relation WHERE CURRENT OF cursor EXEC SQL DELETE FROM relationWHERE CURRENT OF cursor concurrent update INSENSITIVE keyword to make isolated EXEC SQL DECLARE cursor INSENSITIVE CURSOR FOR scrolling SCROLL keyword EXEC SQL DECLARE cursor SCROLL CURSOR FOR NEXT, PRIOR, FIRST, LAST, RELATIVE n, ABSOLUTE n NEXT = RELATIVE 1 PRIOR = RELATIVE -1 FIRST = ABSOLUTE 1 LAST = ABSOLUTE -1 EXEC SQL FETCH PRIOR FROM cursor INTO :worth dynamic SQL EXEC SQL BEGIN DECLARE SECTION char *query; EXEC SQL END DECLARE SECTION EXEC SQL PREPARE name FROM :query; EXEC SQL EXECUTE name; or EXEC SQL EXECUTE IMMEDIATE :query; Persistent Stored Modules (aka SQL/PSM PSM PSM-96 stored procedures) CREATE PROCEDURE name (parameters) local declarations; procedure body; CREATE FUNCTION name (parameters) RETURNS type local declarations; function body; CREATE PROCEDURE MOVE (IN oldAddr VARCHAR(255), IN newAddr VARCHAR(255)) UPDATE MovieStar SET address = newAddr WHERE address = oldAddr; statements (page 372 for big example) CALL name (arglist) EXEC SQL call Foo(:x, 3); RETURN expression; DECLARE name type; SET variable = expression; BEGIN ... END label: IF ... THEN ... ELSEIF ... THEN .. ELSE ... END IF; SELECT ... INTO local variable without : ... LOOP ... END LOOP; LEAVE looplabel; DECLARE NotFound CONDITION FOR SQLSTATE '02000'; FOR loopname AS cursorname CURSOR FOR query DO ... END FOR; use attribute names as variables names for each tuple exceptions DECLARE wheretogo=(CONTINUE|EXIT|UNDO) HANDLE FOR conditionList handlerStatement scoped by BEGIN...END block, but DECLARES at beginning of block CONTINUE = continue to next statement after the one that raise exception (after running handler) EXIT = leave block after handler UNDO = EXIT and roll back database AND local variable changes can use functions in SQL as if build in select GetYear(title) from Movies SQL Environment structures schemas - collection of tables, views assertsions, triggers, psm also stuff not in book domains, char sets, collations, grant statements) catalogs - collection of schemas, including special INFORMATION_SCHEMA clusters - collection of catalogs - each user has a cluster CREATE SCHEMA name declarations (like create tables) SET SCHEMA name; CREATE CATALOG name; // not standard SET CATALOG name; connections (client/server) CONNECT TO server AS connectionName AUTHORIZATION name and password SET CONNECTION conn1; DISCONECT conn1; session for connection keeps current schema, catalog modules Generic SQL Interface (interactive command line) Embedded SQL (EXEC SQL crap) True Modules (with PSM and native code modules) SQL Call Level Interface (SQL CLI aka ODBC) SQLRETURN = SQLAllocHandle(hType, hIn, hOut) 0 on success hType SQL_HANDLE_ENV SQL_HANDLE_DBC SQL_HANDLE_STMT hIn SQL_NULL_HANDLE hOut handle result alloc SQLHENV environment, then SQLHDBC db connection, then SQLHSTMT statement SQLPrepare(stmthandle, char* statement, int length (or SQL_NTS for null terminated)) SQLExecute(stmthandle); SQLFetch(stmthandle); SQLBindCol(stmthandle, colNo0, colType, pVar, varSize, varInfo) colType SQL_CHAR SQL_INTEGER SQLBindParameter(stmthandle, 1, ..., studioName, ...); JDBC connection Connection c = DriverManager.getConnection(url, name, password); statement Statement s = c.createStatement(); ResultSet rs = s.executeQuery("..."); s.executeUpdate("..."); preparedstatement PreparedStatement ps = c.prepareStatement("..."); ResultSet rs = ps.executeQuery(); ps.executeUpdate(); ps.setString(i, v); ps.setInt(i, v); resultset boolean more = rs.next(); rs.getString(i) rs.getInt(i) rs.getFloat(i) Transactions in SQL Serializability chooseSeat example locking Atomicity bank transfer example SQL the basics START TRANSACTION COMMIT check SQLSTATE of COMMIT, in case deferred constraints failed ROLLBACK read only SET TRANSACTION READ ONLY; SET TRANSACTION READ WRITE; usually the default, unless we specific READ UNCOMMITTED dirty reads SET TRANSACTION READ WRITE ISOLATION LEVEL READ UNCOMMITTED; set not in read only above isolation levels READ UNCOMMITTED as above READ COMMITTED repeated queries might get different results READ REPEATABLE READ repeated queries get the same results for the tuples seen but each seperate query might see results of two commits also will see phantom tuples, new tuples inserted since first read READ SERIALIZABLE (SQL default) security and user authorization privileges relations: SELECT, INSERT, DELETE, UPDATE first three can include attributes, such as SELECT(name) right to reference for referential integrity: REFERENCES right to use in declarations USAGE right to define triggers: TRIGGER right to execute code PSM ...: EXECUTE right to create subtypes of type: UNDER granting GRANT ON TO [WITH GRANT OPTION] revoking REVOKE ON FROM - Object-Orientation in Query Languages (not just OQL) OQL a.b.c or a->b->c are the same " quotes, not ' SQL like SELECT m.year from Movies [AS] m WHERE m.title = "Gone With the Wind" automatically correlated subquery (implicit) . note the from clause SELECT s.name from Movies m, m.stars s WHERE m.title = "Casablanca" complex outptu types SELECT DISTINCY Struct(star1: s1, star2: s2) (star1 is struct field name. s1 is local name) this is a result of Set addition OQL expressions quantifiers boolean result depending if x's match condition FOR ALL x in S : C(x) EXISTS x in S : C(x) object assignment and creation element object = ELEMENT(BAG) converts one element bag to a single object result set can be treated in C++ with array notation constants can include Set/Bag/List/Array/Struct creating new objects gwtw = Movie(title: "GWtW", year: 1939, length: 239, ownedBy: mgm); User-Defined Types in SQL (aka UDTs) CREATE TYPE AddressType AS ( street CHAR(50), city CHAR(20)) METHOD houseNumber() RETURNS CHAR(10; can use AddressType in create type CREATE TABLE MovieStar of StarType; references CREATE TYPE StarType AS ( name CHAR(30), address AddressType, // note addresstype bestMovie REF(MovieType) SCOPE Movie ); then then we create Movie table, we need to say REF IS Operations on Object-Relational Data following references x = REF(T) x->a to reference attribute a of T DEREF(x) returns referenced tuple user defined types select m.year() from Movie m where m.title() = 'King Kong' note the need for the () to call the getters generators and mutators T() for UDT is generator, (aka constructor) for each attribute, there is mutaor method x(b) (aka setter) ordering CREATE ORDERING FOR type EQUALS ONLY BY STATE (no <) CREATE ORDERING FOR type ORDERING FULL BY RELATIVE WITH F; F returns < 0, = 0, > 0 as most compare functions if F is used with EQUALS only, = 0 is equals, everything else unequal - joins inner - sql isolation levels - serializable - read committed - read uncommitted - repeatable read? syntax - create table - create index - insert into - select from where - delete from where - update set where grouping and aggregate operators - sql3 recursion - odl/oql