Integrity

Ensure that the data is correct.

Constraints

State Constraints

Type, Attribute
- STATUS >= 1 and STATUS <= 100
- WEIGHT > 0.0
Relation
- unique SID
- no supplier with CITY = 'London' and STATUS != 20
Database
- no supplier with STATUS < 20 and QTY > 500

Relation Predicate: logical AND of all relation constraints

Golden Rule: no update operation may leave any relation in a state violating its predicate

Transition Constraints

Valid transitions, e.g.

Single to Married
Married to Divorced
Married to Widowed

Not valid:

Single to Widowed
Single to Divorced
Widowed to Divorced

Keys

Candidate Key K: set of attributes with

Uniqueness: no distinct tuples with same value for K
Irreducibility: no subset of K has Uniqueness

Every relation has at least one candidate key.
If we cannot find any subset of attributes that form a candidate key we can use the complete set of attributes.

Superkey: unique, but not necessarily irreducible

In case of more than one candidate key: choose one as primary key, the others are alternate keys

Foreign Key: some set of attributes F in relation B, where

relation A with candidate key C
each value of F in B is identical to the value of C in some tuple in A

The value of F is a reference to the tuple containing C.
Example: value of ORDR.CUST must be equal to an existing value of CUST.ID

Referential Integrity: ensuring valid foreign key values

Referential Actions:

ON DELETE

CASCADE: delete matching tuples
RESTRICT: to cases without matching tuples

Trigger: call procedure on event

SQL Facilities

Domain Constraints

CREATE DOMAIN COLOR CHAR(6) DEFAULT '???'
  CONSTRAINT VALID_COLORS
    CHECK ( VALUE IN ('Red', 'Yellow', 'Blue', 'Green', '???'));

Base Table Constraints

Candidate Keys:

PRIMARY KEY (...)
implies NOT NULL
UNIQUE (...)

Foreign Keys:

CREATE TABLE SP ...
FOREIGN KEY (SID) REFERENCES S (SID) ON DELETE

CASCADE
SET DEFAULT
SET NULL
NO ACTION (default)

Check Constraint: add to column definition, e.g.

.. CHECK (QTY > 0 AND QTY <= 5000)

limitations varying according to DBMS, e.g. Postgres 11 no subqueries

Assertions

CREATE ASSERTION A1 CHECK ((SELECT MIN(STATUS) FROM S) > 4);

Widely unsupported; e.g. Postgres: ERROR: CREATE ASSERTION is not yet implemented

Related concept: TRIGGER

Call procedure on certain events, such as update on given table

Supported e.g. in Postgres, but sacrifices elegance of declarative SQL, need to write procedural code

Views

Named relational expression

Examples

 > create view GS as select sid, status, city from S where status > 15;

> select * from GS;

  sid  | status |      city       
-------+--------+-----------------
 S1    |     20 | London         
 S3    |     30 | Paris          
 S4    |     20 | London         
 S5    |     30 | Athens         

> select * from S;

  sid  |        sname         | status |      city       
-------+----------------------+--------+-----------------
 S1    | Smith                |     20 | London         
 S2    | Jones                |     10 | Paris          
 S3    | Blake                |     30 | Paris          
 S4    | Clark                |     20 | London         
 S5    | Adams                |     30 | Athens

Update:

Postgres 8.3:

 > insert into GS values ('S6', 20, 'Vienna');
ERROR:  cannot insert into a view
HINT:  You need an unconditional ON INSERT DO INSTEAD rule.

☆ Postgres 9.3: no problem

Mysql:

 > create view GS as select SID, STATUS, CITY from S where STATUS > 15;
Query OK, 0 rows affected (0.03 sec)

> insert into GS values ('S6', 20, 'Vienna');
Query OK, 1 row affected (0.01 sec)

> select * from GS;
+-----+--------+--------+
| SID | STATUS | CITY   |
+-----+--------+--------+
| S1  |     20 | London |
| S3  |     30 | Paris  |
| S4  |     20 | London |
| S5  |     30 | Athens |
| S6  |     20 | Vienna |
+-----+--------+--------+
5 rows in set (0.00 sec)

> select * from S;
+-----+-------+--------+--------+
| SID | SNAME | STATUS | CITY   |
+-----+-------+--------+--------+
| S1  | Smith |     20 | London |
| S2  | Jones |     10 | Paris  |
| S3  | Blake |     30 | Paris  |
| S4  | Clark |     20 | London |
| S5  | Adams |     30 | Athens |
| S6  | NULL  |     20 | Vienna |
+-----+-------+--------+--------+
6 rows in set (0.00 sec)

Advantages of Views

Security: force access through view, hide data
Shorthand, macro
Different users see same data in different ways
Logical data independence

Logical data independence

Immunity of users and programs to changes in the logical structure of the database (conceptual level)

Growth
- new attributes
- new relations
Restructuring
- E.g. split relation in two and provide view instead
- Select should work
- Expect problems with updates

View Retrieval

materialize (simple)
substitute (efficient)

View Update

can be regarded as shorthand for DELETE-INSERT

Snapshot

Data as it was at given point in time. Interesting for

Accounting
Decision Support

SQL

E.g. Part-Quantity, no suppliers for P7:

    insert into P (pid, pname) values ('P7', 'Hammer');

 > create view PQ as select PID, sum(QTY) as TOTQTY from SP group by PID;

> select * from PQ;

  pid   | totqty 
--------+--------
 P1     |    600
 P3     |    400
 P5     |    600
 P4     |    500
 P2     |   1000
 P6     |    100

No entry for P7.

Nested subquery representing scalar value:

 > select P.PID, (select sum(SP.QTY) from SP where SP.PID = P.PID) as TOTQTY from P;

  pid   | totqty 
--------+--------
 P1     |    600
 P2     |   1000
 P3     |    400
 P4     |    500
 P5     |    600
 P6     |    100
 P7     |

Left join:

select P.PID, sum(QTY) from P left join SP on SP.PID = P.PID group by P.PID;
  pid   | sum  
--------+------
 P1     |  600
 P2     | 1000
 P3     |  400
 P4     |  500
 P5     |  600
 P6     |  100
 P7     |

☆ select P.PID, coalesce(sum(QTY),0) from P left join SP on SP.PID = P.PID group by P.PID;

  pid   | coalesce 
--------+----------
 P4     |      500
 P3     |      400
 P1     |      600
 P2     |     1000
 P6     |      100
 P7     |        0
 P5     |      600

Keeping Track of Everything: the Catalog

Metadata i.e. data about data
- information about tables and their columns
- optimizer information (indexes, physical storage structures, ..)
- security constraints
in relational systems: itself in relations, e.g. TABLE, COLUMN
therefore can be queried in the same form as user tables (SQL)

Postgres:

SELECT table_name,column_name
FROM information_schema.columns WHERE data_type = 'numeric';

 table_name | column_name 
------------+-------------
 p          | weight      
 s          | status      
 sp         | qty