1. What are the differences between the characteristics of an operational database and a dimensional database?
2. What is XML, and why is it useful?
3. What is an XML Schema document?
4. How can XML be used with a DBMS?© BrainMass Inc. brainmass.com October 17, 2018, 10:19 am ad1c9bdddf
1. What is the difference between the characteristics of an operational database and a dimensional database?
An operational database contains enterprise data that are up to date and modifiable. This type of database usually uses an OLTP database which is optimized for faster transaction processing such as insertion, deletion and update of data. It is the database that is currently and progressive in use capturing real time data and supplying data for real time computations and other analyzing processes.
A dimensional database is the one that rather than representing data in multiple relations, represents key data entities as different dimensions. Which means that, multidimensional database systems offer an extension to the relational system to provide a multi-dimensional view of the data. A dimensional database needs to be designed to support queries that retrieve a large number of records and that summarize data in different ways.
The difference between the characteristics of an operational database and a dimensional database are illustrated in the following table:
Data is atomized
Data is current
Processes one record at a time
Designed for highly structured repetitive processing
Data is summarized
Data is historical
Processes many records at a time
Designed for highly unstructured analytical processing
2. What is XML, and why is it useful?
XML stands for eXtensible Markup Language. XML is a meta-markup language developed by the World Wide Web Consortium (W3C) to deal with a number of the shortcomings of HTML. It defines a set of rules for encoding documents in a format that is readable by both human and machine. The design goals of XML emphasize ...
Key benefit provided by the relational model over previous data
1. What is the key benefit provided by the relational model over previous data models?
2. What are three types of physical data dependence described by Codd in his paper?
3. What are the good/bad choices the first two RDB systems made?
4. How the two systems finally united to the current relational database systems. Think about query language, system model, choices on storage, index, query evaluation, etc. What do you think are the choices made in commercial database, why? (Consider 80/20 rule).
5. Use the cost formula to explain why blocked access or pre-fetching is good exercise.
6. Considering a join operation between two tables, one M pages, another N pages, running on a buffer pool with K pages. What is the number of I/Os using nested-loop join, block-nested loop join. What if a hot set algorithm is used or the DBMIN algorithm is used? What if the replacement policy is LRU or MRU?
7. Consider a 3-level tree structure index with maximum 64 entries per internal node. On a range query that retrieve 1000-1200 entries. What is the number of I/Os if the index is B-tree. What is the number of I/Os if the index is B+ tree. What is the number of I/Os if the index is a cluster index?
8. We say that the basic idea behind index is partition and labeling. What is the partition and labeling of a trie structure?
9. write the exhaustive node-split algorithm for R-tree, then analyze the complexity of the algorithm
10. Analyze the complexity of the node-split algorithms in the R*-tree paper. Compare the algorithms to the corresponding ones in the R-tree paper, illustrate under what circumstance the algorithms in the R*-tree paper will out-perform the ones in the R-tree paper, and under what circumstance they will not.
11. In a Z-curve on a 1024 x 1024 grid, the point with coordinates (1,1) is first, the point (1,2) is second, the point (2,1) is third, and so on. What are the coordinates of the 45th point? Of the 17,945th point?
12. Given you a set of nodes (in 2D space), that belongs to one R-tree node that is to be splitted, what is the results of the splitting, using the three algorithms (as presented in the R-tree paper) respectively?
13. Consider a scenario in which we want to use GIST to model the string matching operation, where containment relationship is a fussy sub-string (with gaps). Please specify what the key functions should be defined.
14. Given a table that has 10 columns, 100K tuples expended across 1000 pages, if we project on two columns, with duplicate elimination, and assume that there are 50 unique values on the combination of the two columns, what is number of I/Os in the best/worse cases?
15. Given two tables R1 and R2, one has 1000 tuples, 100 pages, the other has 2000 tuples, 20 pages. Perform join on the two tables with predicate "R1.a > 2 and R1.a = R2.b", what join algorithm will you use, if (1) there is no index at all. (2) there is B+ tree index on R1.a (3) there is B+ tree index on R2.b. (4) there is cluster B+ tree index on R1.a. (5) there is join index on R1.a and R2.b.
16. What does an operator do in the "open" phase? Consider operator sort, nested-loop join, block-nested-loop join, file scan, index scan.
17. What are the possible physical operator(s) that implement a join? Which are blocked, which are pipelined?
18. Give an example of ((A join B) join C) where the output of a sort-merge join for (A join B) can be fed as input to a simple merge join with C, with no intermediate sorting required. A,B,C are relations -- you pick the attributes and the join conditions to make the above example work out.
19. You have a database with a primary index on ID for Employee and Dept, and secondary index on Name for the Employee relation only. You have access methods available for a Nested Loops Join, Nested Indexed Loop Join, and a Grace Hash Join. Generate alternative plans for the following query:
FROM Employee E, Dept D
WHERE E.Dept = D.ID
AND D.Name = "Toys"
20. If you are to design an application-style benchmark for a text-oriented XML database, how would you configure the data set and what are the queries you may ask?