-
Notifications
You must be signed in to change notification settings - Fork 1
/
cs348.txt
1258 lines (755 loc) · 33.7 KB
/
cs348.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
# Introduction to database management
* What does a database management system provide?
Reliable, efficient, convenient, and safe multi-user storage of and access to
massive amounts of persistent data.
# Overview of Database Management
* What is the ANSI definition of data?
A representation of facts, concepts, or instructions in a formalized
manner suitable for communication, interpretation, or processing by humans
or by automatic means.
* How was data stored during the early beginnings of computers?
One data set per program, inputted as part of the program.
* What is a database?
A large and persistent collection of pieces of information organized in a way
that facilitates efficient retrieval and modification.
* What is a database management system?
A program that manages details related to storage and access for a database.
* What are some examples of a database?
- WWW
- Spreadsheets
* What is a data model?
A formalism that defines the schema.
* What is a schema?
A description of the data interface to the database.
* What is a database instance?
A database that conforms to a given schema.
* What were some of the original applications of databases?
- Inventory control
- Payroll
- Banking and financial systems
- Reservation systems
* What are the common circumstances that databases are applied in?
- Lots of data
- Data is formatted
- Requires persistence and reliability
- Requires efficient and concurrent access
* What are some of the issues that indicate one should use a database?
- Too many files with different structure.
- Shared files or replicated data.
- Need to exchange data.
When the data maintained by the system outscales the system itself.
* Why should one invest in database technology?
- Less redundancy
- Less inconsistencies
- Data integrity
- Shareability
- Security
* How does database technology improve programmer productivity?
They do not need to deal with data organization.
* What are some of the issues with just using files to store data?
- Redundancy: multiple copies of files
- Inconsistency: independent updates
- Inaccuracy: concurrent updates
- Incompatibility: multiple formats
- Insecurity: proliferation
- In-auditability: poor chain of responsibility
- Inflexibility: changes are difficult to apply
* What are the 5 parts of a database management system?
1. Data model
2. Access control
3. Concurrency control
4. Database recovery
5. Database maintenance
* What are the parts of the three-level schema architecture?
1. External schema
2. Conceptual schema
3. Physical schema
* What is the external schema w.r.t the three-level schema architecture?
What the application programs and users see.
* What is the conceptual schema w.r.t the three-level schema architecture?
The description of the logical structure of all data in the database.
* What is the physical schema w.r.t the three-level schema architecture?
The description of physical aspects, such as files, devices, and storage
algorithms.
* How does a DBMS commonly follow the 3-level schema architecture?
- Different applications can see different sets of tables.
- The physical design can be controlled with different implementations.
* What are the advantages of the 3-level schema architecture?
There needs to be only one repository of information.
This avoids redundancy and inconsistency, and allows one to manage access
control and performance.
* What is data independence?
Applications do not access data directly, but rather through an abstract
data model provided by the DBMS.
* What are the two types of data independence?
Physical and logical.
* What is physical data independence?
Applications are immune to changes in storage structures.
* What is logical data independence?
Applications are immune to changes in data organization.
* What is a data definition language used for?
For specifying schemas.
* Where is the information specified by a data definition language stored?
The data dictionary, or catalog.
* What is a data manipulation language used for?
For specifying queries and updates.
* What are the two kinds of data manipulation languages?
- Navigational (procedural)
- Non-navigational (declarative)
* What are the 3 types of database users?
- End user
- Application developer
- Database administrator
* What is an end user's involvement with the database?
- Using forms or query-generating applications.
- Generating ad-hoc queries specified in the data manipulation language.
* What is an application developer's involvement with the database?
Designing applications or services that accesses the database.
* What is a database administrator's involvement with the database?
- Manages conceptual schema
- Assists with application view integration
- Monitors and tunes DBMS performance
- Defines the internal schema
- Loads and reformats the database
- Responsible for security and reliability
* What is the core idea that transactions try to achieve?
Every application may think that it is the sole application accessing the
data.
* What is a transaction?
An application-specified atomic and durable unit of work.
* What properties of transactions does the DBMS ensure?
The ACID properties.
* What is ACID?
Atomic: A transaction occurs entirely, or not at all.
Consistency: Each transaction preserves the consistency of the database.
Isolated: Concurrent transactions do not interfere with each other.
Durablece completed, a transaction's changes are permanent.
# Relational model
* What is the main idea of the relational model?
All information is organized in flat relations.
* What are the features of the relational model?
- Simple and clean data model
- Powerful and declarative query/update languages
- Semantic integrity constraints
- Data independence
* What is a database in the relational model?
A set of named relations/tables.
* What is a relation in the relational model?
A set of named attributes/columns.
* What is an attribute in the relational model?
A type or domain.
* What does a tuple or row have in the relational model?
A value for each attribute.
* What are the 4 properties of the relational model?
1. Based on set theory. Therefore no ordering or duplicates.
2. All attribute values are atomic
3. degree(arity) = # of attributes in schema
4. cardinality = # of tuples in instance.
* What is the degree of a schema?
The number of attributes in the schema.
* What is the cardinality of an instance?
The number of tuples in the instance.
* What is the difference between relational models and SQL w.r.t semantics of instances?
Relations and sets of tuples, while tables are multisets of tuples.
* What is the difference between relational models and SQL w.r.t unknown values?
There is no concept of unknown values in the relational model, but SQL defines
the NULL value.
* What is the main idea of integrity constraints?
Extend schema with rules called "constraints".
An instance is only valid if it satisfies all schema constraints.
* What are two reasons for using constraints?
1. Ensure data modification respects database design.
2. Protect data from bugs in applications.
* What are the three levels of database constraints?
1. Tuple-level
2. Relation-level
3. Database-level
* What are the two types of tuple-level database constraints?
1. Domain restrictions
2. Attribute comparisons
* What is a domain restriction constraint?
Values can only be of a certain range.
* What is an attribute comparison constraint?
The relationship between two values is enforced.
* What are two types of relation-level constraints?
1. Key constraints
2. Functional dependencies
* What is a superkey?
A set of attributes for which no two tuples will have the same value.
* What is a candidate key?
A minimal superkey.
* What is a primary key?
A designated candidate key.
* What are functional dependencies?
A constraint between two sets of attributes.
* What are two types of database-level constraints?
1. Referential integrity
2. Inclusion dependencies
* What is referential integrity?
All instances of foreign keys must exist as a primary key instance.
* What is a foreign key?
A primary key of one relation appearing as attributes of another relation.
* What are inclusion dependencies?
If it exists in one table, it must exist in another.
e.g. "Every manager is an employee."
* What is an invariant?
A condition that must always be held true.
# Relational algebra
* What is the selection operator?
\sigma_{condition}(E)
Returns the subset of E's tuples that satisfy the condition.
* What is the projection operator?
\pi_{attributes}(E)
Returns all the tuples in E with only the attributes specified.
Eliminates duplicates.
* What is the rename operator?
\rho(R(F), E)
R = new relation name
F = list of attributes -> new attribute names
E = old relation name
Renames relation names or attributes.
* What is the product operator?
E1 \times E2
The result schema contains all the attributes of E1 and E2.
The result instance contains one tuple for every pair of tuples in E1 and E2.
Renaming might be needed if E1 and E2 share attribute names.
* What is the conditional join operator?
E1 \Join_{condition} E2
The result schema contains all the attributes of E1 and E2.
The result instance contains one tuple for every pair of tuples in E1 and E2
that meets the condition.
Renaming might be needed if E1 and E2 share attribute names.
* What is the conditional join operator equivalent to?
\sigma_{condition}(E1 \times E2)
* What is the natural join operator?
E1 \Join E2
Joins the two relations on the common attributes.
* What are the steps for performing a natural join?
1. Rename duplicate attributes (on the right relation)
2. Form the cross-product of the two relations.
3. Eliminate any tuples that do not have matching values for all common attributes.
4. Project out the duplicate attributes (keeping the left relation's names)
* What is the natural join operator equivalent to?
\pi_{remove duplicate attributes}(\sigma_{common attributes are equal}(E1 \times E2))
* What happens when you natural join two relations that have no common attribute?
Equivalent to the cross product.
* What is union compatibility?
Two relations share the exact same attributes and in the same order.
* What is the union operator?
R \cup S
Returns all tuples that appear in R or S or both.
R and S must be union compatible.
* What is the difference operator?
R - S
Returns all tuples that appear in R and not S.
R and S must be union compatible.
* What is the intersection operator?
R \cap S
Returns all tuples that appear in R and S.
R and S must be union compatible.
* What is set intersection equivalent to?
R \cap S = R \cup S - (R - S) - (S - R)
* What is set union equivalent to?
R \cup S = R \cap S + (R - S) + (S - R)
* What is relational division equivalent to?
X / S = \pi_{A}(X) - \pi_{A}((\pi_{A}(X) \times S) - X)
where A = set of attributes in X but not in S.
* Are cross product and division inverse operators?
Not perfectly, because
(R \times S) / S = R; but
(R / S) \times S != R
for all appropriate R, S.
* When is a language relationally complete?
A language is relationally complete when it is at least as expressive as
relational algebra.
* What are 3 languages that are relationally complete?
1. Safe relational calculus
2. Relational algebra
3. SQL
# Relational calculus and query by example
* What is the main idea of relational calculus?
Use first-order-logic formulae to specify properties of the query answer.
* What is in the vocabulary of relational calculus?
- Set of constants
- Set of variables
- Set of comparison predicates (e.g. =, <)
- Set of n-ary predicates
- Set of n-ary functions
- Logical connectives (and, or, not, if)
- Quantifiers (exists, for all)
- Parentheses
* What is a valuation w.r.t relational calculus?
A mapping from variable names to values in the universe.
* What are the two types of relational calculus?
1. Tuple relational calculus
2. Domain relational calculus
The only difference is really the primitive variable used in specifying the
queries.
* What is query safety?
A query is safe if, for all database instances conforming to the schema,
the query result can be computed using only constants appearing in the
database instance or in the query itself.
* What is query by example?
A graphical language heavily influenced by domain relational calculus.
# SQL Introduction
* Where was SQL first developed?
At IBM
* When was SQL first standardized?
During the 1980s.
* What are the major version names of SQL?
SQL-86, SQL-89, SQL-92, SQL:1999, SQL:2003, SQL:2006, SQL:2008, SQL:2011.
* What did SQL-86 offer?
A basic relationally-complete language.
Had bag semantics, NULLs, ordering, grouping/aggregations, and outer joins.
* What did SQL-92 offer?
- CASE and nesting.
- New syntax for OUTER JOIN
* What did SQL:1999 offer?
- Divided language into CoreSQL + packages
- CoreSQL = SQL-92 with recursion
- Packages include stored procedures, OO, triggers.
* Are indexes part of the SQL standard?
No, they are instead vendor-specific physical structures.
* What can you use to eliminate duplicates from a query in SQL?
SELECT DISTINCT
* What is the SQL basic query block?
SELECT attribute-expression-list
FROM relation-list
[WHERE condition]
* How can you perform multi-set set operations in SQL?
UNION ALL, INTERSECT ALL, EXCEPT ALL
* What is the 3-valued logic table for the AND operator?
and | TRUE FALSE NULL
-----------------------
TRUE | TRUE FALSE NULL
FALSE | FALSE FALSE FALSE
NULL | NULL FALSE NULL
* What is the 3-valued logic table for the OR operator?
or | TRUE FALSE NULL
-----------------------
TRUE | TRUE TRUE TRUE
FALSE | TRUE FALSE NULL
NULL | TRUE NULL NULL
* What is the 3-valued logic table for the NOT operator?
not | TRUE FALSE NULL
-----------------------
| FALSE TRUE NULL
* What does the WHERE statement filter out w.r.t 3-valued logic?
Any tuple for which the condition evaluates to FALSE or NULL.
* What condition makes two rows be considered equal?
Each corresponding field value is not NULL and equal, or both NULL.
* What is a correlated subquery?
A subquery that references an attribute from the outer query.
* What is a scalar subquery?
A subquery that returns an atomic value. One row and one column.
* When can you assume ordering in the results of SQL queries?
Only the result of a query with an ORDER BY clause.
Otherwise, you cannot assume ordering.
* How is the result of a query containing grouping and aggregation determined?
1. Cross product all relations in from clause
2. Filter tuples using the where clause.
3. Form groups based on the group clause.
4. Filter groups using the having clause.
5. Generate one tuple per group.
* What does the HAVING clause do?
Allows one to filter on groups, instead of just tuples.
* What is a trigger?
A procedure executed by the database in response to a change to the database
instance.
* What are the 3 basic components of a trigger description?
1. Event: Type of change that should cause trigger to fire
2. Condition: Test performed by the trigger.
3. Action: Procedure executed if condition is met.
# 7-SQL APP - Xiaofei Zhang.pdf
* How do SQL client programs typically interact with a SQL server?
By using "bindings", available in various programming languages.
* What is the main problem with SQL bindings?
Mismatch between SQL data types and the applications'.
* What are the 3 levels of SQL interfaces?
Statement-level (Embedded SQL)
Library-level (JDBC)
Application inside the DBMS (stored procedures)
* What is embedded SQL and how does it work?
Embedded SQL is a language extension to make SQL queries directly with host
variables (write host variables using "INTO var1, ...").
It uses a preprocessor to translate the statements to C.
* How can SQL DML and DLL be embedded inside a C program?
Prefixing statements with EXEC SQL and suffixing with ;.
* What are host variables with respect to static embedded SQL?
Variables used to send and receive values from the database system.
* How can you send data with host variables in embedded SQL?
Using the host variables in place of constants.
* How can you receive data with host variables in embedded SQL?
Writing an INTO clause to load the data into the variable.
* What is an indicator variable w.r.t embedded SQL?
A variable that signals if its respect host variable is NULL.
* What is the type of an indicator variable in C/C++ w.r.t embedded SQL?
short int
* What is a cursor?
A cursor is like a pointer that refers to some row of the result.
* When do we want to use cursors?
When a query could return more than one row.
* What are the three places that a cursor can be in?
- before first tuple
- on a tuple
- after last tuple
* How do you use a cursor? (4 steps) (WRT embedded SQL)
1. Declare the cursor (associates with query).
2. Open the cursor (evaluates the query, conceptually).
3. Fetch tuple(s) using the cursor.
4. Close the cursor.
* How to use dynamic embedded SQL? (embedded SQL with dynamic queries)
Prepare with PREPARE, then EXECUTE (possibly multiple times).
* What is a placeholder?
A place in a query that will be replaced by a literal.
* What is the equivalent of metadata in embedded SQL?
SQLDA (descriptor area)
* What are SQL descriptors used for w.r.t embedded SQL?
Determine count and types of input and output variables at runtime.
* What is SQLJ?
A framework that allows embedding of SQL into Java.
* What are the similiarities and differences between SQLJ and Embedded SQL?
They both use a preprocessing step.
However, SQLJ uses JDBC to establish the connection.
* What are JDBC, ODBC, and CLI?
Java/Open Database Connectivity (Java or Microsoft).
CLI: call-level interface
They allow a single application to access multiple DBMS without recompiling.
* What are the 3 main ideas of both dynamic SQL and CLI/ODBC/JDBC?
1. Queries are represented as strings in the application.
2. Queries are prepared, then executed.
3. Generally, applications will not know the number and type of input or
output parameters.
* What is the architecture of JDBC?
There is a common JDBC API that the application talks to.
The API interfaces with a JDBC Driver manager, which picks the appropriate
JDBC driver to use.
* What do stored procedures do?
Execute application logic directly inside the DBMS process.
* What are 3 possible advantages of stored procedures?
1. Minimize data transfer costs
2. Centralize application code
3. Logical independence
# 6-views - Xiaofei Zhang.pdf
* What is a view?
A view is a relation in the external schema whose instance is
determined by the instances of the relations in the conceptual schema.
* What is a virtual view?
A view defined as a query, not executed until the view itself is queried.
* What is a materialized view?
A view defined as a table which remembers the query that created it.
* What do views and base relations in the conceptual schema have in common?
- their schema information appears in the database schema
- access controls can be applied to them
- other views can be defined in terms of them
* What is the difference between virtual and materialized views?
- Virtual: Views are used only for querying; they are not stored in
the database
- Materialized: The query that makes up the view is executed, the
view constructed and stored in the database.
* What happens when you query a virtual view?
The view definition is subbed into the query before optimizing/executing.
* What 5 conditions are necessary for updatable views, according to SQL-92?
1. The query references exactly one table
2. The query only outputs simple attributes (no expressions)
3. There is no grouping/aggregation/distinct
4. There are no nested queries
5. There are no set operations
* What is the primary downside of materialized views?
When the base table changes, the materialized view may also change.
Must periodically reconstruct or incrementally update the materialized view.
# 8 ER Model ###################################################################
* What is an entity and entity set?
A distinguishable object, and group of entities of the same type.
# 10 Schema Refinement #########################################################
* What are the relational design principles?
- Relations should have semantic unity.
- Information should not be repeated.
- Null values should be avoided.
- Spurious joins should be avoided.
* What is a functional dependency?
If you know the values for the attributes on the left side, then you know the
values of the tuples on the right side.
* What is the closure of a set of functional dependencies?
The set of all functional dependencies that are satisfied by every
relational instance that satisfies a given (sub)set of functional dependencies.
* What are the properties of a good schema decomposition?
- Does not lose information.
- Does not complicate checking of constraints.
- Does not contain anomalies. (Or fewer).
* How can we determine if two relations are a lossless decomposition?
Iff the common attributes of two relations form a superkey for either schema.
* What is a interrelational dependency?
A dependency that can only be tested by joining tables.
* What is a dependency preserving decomposition?
A decomposition where no original functional dependencies are interrelational.
* When is a relation schema in Boyce-Codd Normal Form?
For all functional dependencies (e.g the closure):
Either the dependency is trivial (e.g. the RHS is a subset of the LHS); or
the LHS is a superkey of the relation schema.
* How does BCNF help reduce redundancy?
- Each LHS value of a FD only appears in one row; and
- All values determined by the FDs are not repeated.
* What is the algorithm for performing a lossless-join BCNF decomposition?
1. Select a relation which currently violates the BCNF rules.
2. Split that relation into two, based on a FD:
a. One that has all the attributes in the FD.
b. One that has all the attributes in the original relation except for ones on
the RHS of the FD.
3. Repeat until all relations satisfy the BCNF rules.
* When is a relation schema in Third Normal Form?
All function dependencies (e.g. the closure) are either:
Trivial (e.g. the RHS is a subset of the LHS); or
The LHS is a superkey of the relation schema; or
Each attribute in RHS - LHS is contained in a candidate key of the relation.
* When are two sets of functional dependencies equivalent?
When they have the same closures.
* When is a set of functional dependencies minimal?
When:
Every RHS of a dependency is a single attribute; and
No X -> A is the set F - {X -> A} equivalent to F; and
No X -> A and Z (a proper subset of X) is the set F - {X -> A} U {Z -> A}
equivalent to F.
* What are the three steps for finding a minimal cover?
1. Replace X -> YZ with X -> Y and X -> Z.
2. Remove A from the LHS of X -> B if A is not needed to determine B.
3. Remove X -> A from F in A can be determine without this FD.
* What are the steps for making a lossless-join and dependency preserving 3NF decomposition?
1. Find a minimal cover of functional dependencies.
2. For each functional dependency, add that relation to the result.
3. Add relations for candidate keys, if no relation always covers them.
# Transactions #################################################################
* Why do we need transactions?
Databases are shared resources, and not managing concurrent access will cause
problems.
* What are the two causes of problems that transactions try to solve?
Concurrency. Concurrent access and updating of information.
Failure. Updates crashing halfway through.
* What is a transaction?
An application-specified atomic and durable unit of work.
* What is an active transaction?
A transaction that has been started but not yet aborted or committed.
* What does the COMMIT command do w.r.t transactions?
Any updates that a transaction has made become permanent.
Other transactions can see the updates.
* What does the ABORT command do w.r.t transactions?
Any updates that the transaction may have made are erased.
* How do DBMSs guarantee isolation?
Via concurrency control algorithms.
* How do DBMSs guarantee atomicity and durability?
Every action is logged securely so that it can be consulted later to determine
what to do.
* What is an operating set?
A union of operations.
* What are the three "transaction modes"?
1. Diagnostic size.
2. Access mode.
3. Isolation level.
* What does the diagnostic size of a transaction determine?
How many error conditions can be recorded.
* What is the access mode of a transaction?
An indicator about whether the transaction is read only, or it can write.
* What does the isolation level of transaction determine?
How the interactions of transactions are to be managed.
* What are the 4 possible isolation levels of a transaction?
0 (Read Uncommitted): Transactions may see uncommitted updates.
1 (Read Committed): Transactions see only committed changes, but non-repeatable
reads are possible.
2 (Repeatable Read): Reads are repeatable, but "phantoms" are possible.
3 (Serializability): Transactions are serialized.
* What is the serializability isolation level? Can it always work?
Concurrent transactions must appear to have been executed sequentially.
Note that not all transactions can be serialized, as in the case of
w1[x] r2[x] r2[y] w1[y].
* When do two operations conflict w.r.t transactions?
When:
They belong to different transactions; and
They operate on the same object; and
At least one of the operations is a write.
* What are the two types of conflicts w.r.t transactions?
1. Read-Write
2. Write-Write
* What is an execution history w.r.t transactions?
An interleaving of the operations in a set of transactions in which the
operation imposed by each transaction is preserved.
* What are the two assumptions we make when dealing with an execution history?
1. Transactions interact only with reads and writes.
2. A database is a fixed set of independent objects.
* When are two histories conflict equivalent?
When:
They are over the same set of transactions; and
The ordering of each pair of conflicting operations is the same in each history.
* When is a history conflict serializable?
If there exists some serial history that is conflict equivalent to it.
* What is a serialization graph?
A graph of the transactions, which edges point from T1 to T2 if T1 must
run before T2.
* When is a history serializable w.r.t serialization graphs?
When its serialization graph is acyclic.
* What is a dirty read? What's a problem that could occur?
When a transaction (may) read the update of another uncommitted transaction.
If the other transaction is aborted, the value read is incorrect.
* What is an example of a non-repeatable read?
T1 reads value 1. T2 updates and commits value 1 to 2. T1 reads value 2 now.
* What is an example of a phantom tuple?
T1 reads a row from a table, based on a where clause.
T2 inserts new rows into the table.
T1 does the same read, and gets more rows than before.
* What is the concept of snapshot isolation?