Part 1
Question 1
1.
2.
RAID50 is faster than RAID5, safer than RAID0, and offers better bandwidth than RAID5. It combines RAID5 and RAID0. As shown in the picture above, RAID50 is divided into three parts, and each part is built as a RAID5 array. Because RAID5 stores parity, the data is much safer: even if one disk in a part fails, its data can be recovered from the remaining disks.
RAID50 is built on RAID5, so it inherits this ability.
As shown above, the three parts are then combined in a RAID0 structure. RAID0 keeps no parity and concentrates only on transfer speed, so reads and writes are striped across the three parts with no extra parity work at this level and can use the combined bandwidth of the parts.
In short, RAID50 takes advantage of both speed and safety. Its trade-offs are that it is slower than RAID0, because parity must still be maintained inside each part, and that its protection is limited to that of RAID5: data is lost if two disks in the same part fail.
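The recovery property described above can be illustrated with a minimal sketch. It assumes that RAID5 parity is the bytewise XOR of the data blocks in a stripe and that the RAID0 layer distributes blocks round-robin over three groups. The block contents and the names xor_blocks and group_for_block are hypothetical, chosen only for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def group_for_block(block_no, groups=3):
    """RAID0 layer (assumed round-robin): pick the RAID5 group for a block."""
    return block_no % groups

# One stripe inside a single RAID5 group: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the disk that held data[1]; rebuild it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Here rebuilt equals the lost block because XOR-ing the parity with the surviving data blocks cancels them out. If two blocks of the same stripe were lost, the single parity block would not be enough to rebuild them.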
Question 2
If we use a variable-length format, it is very difficult to reuse the space that a deleted record once occupied, so the disk easily becomes fragmented.
If a large space is used to store each record, records may have to be moved regularly, and each move is costly. On the other hand, if a small space is used, it causes disk fragmentation.
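The fragmentation problem can be sketched with a small example. The record lengths and the layout below are hypothetical, and the sketch assumes records are stored back to back so that each deleted record leaves a hole of exactly its own length.

```python
# Hypothetical page: variable-length records stored back to back (lengths in bytes).
records = [40, 25, 60, 25, 50]

# Records 1 and 3 are deleted; they are not adjacent, so they leave two separate holes.
deleted = {1, 3}
holes = [length for i, length in enumerate(records) if i in deleted]

new_record = 45
total_free = sum(holes)                              # bytes free in total
fits_somewhere = any(h >= new_record for h in holes)  # does any single hole fit it?
```

In this layout the total free space is larger than the new record, yet no single hole can hold it, so the space cannot be reused without moving records.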
Question 3
LRU: 16284
MRU: 16830
IOs of LRU: 9
IOs of MRU: 9
Question 4
B-Tree
Because a hash index computes a value from the contents of the key, the hash values carry no notion of larger or smaller. They are uncorrelated with the order of the original data and cannot serve as a basis for ordering, so a hash index cannot be used to match a pattern such as “abc%”.
A B-Tree is a balanced tree sorted by key, so all keys that start with “abc” are stored next to each other. A prefix match such as “abc%” therefore costs little and is very fast.
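The difference can be sketched in a few lines. A sorted Python list stands in for the ordered leaf level of a B-Tree and a dict stands in for a hash index; both are simplifying assumptions, and prefix_scan is a hypothetical helper name.

```python
import bisect

def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix, using the sort order to stop early."""
    i = bisect.bisect_left(sorted_keys, prefix)  # first key >= prefix
    matches = []
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        matches.append(sorted_keys[i])
        i += 1
    return matches

keys = ["abd", "abc", "abcde", "abc1", "xyz", "ab"]

# Ordered structure: one binary search, then a short sequential scan.
ordered = sorted(keys)
prefix_matches = prefix_scan(ordered, "abc")

# Hash structure: only exact-match lookups are supported directly.
hash_index = {key: position for position, key in enumerate(keys)}
exact_hit = "abc" in hash_index
# Finding every key that starts with "abc" would require checking all entries.
full_scan = [key for key in hash_index if key.startswith("abc")]
```

The ordered version touches only the matching keys plus one more, while the hash version has to visit every entry to answer the same pattern.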
Question 5
1) A simple nested-loop join (NLJ) algorithm reads rows from the first table in a loop one at a time, passing each row to a nested loop that processes the next table in the join. This process is repeated as many times as there remain tables to be joined.
2) A two-pass join algorithm based on sorting first reads the two relations to be joined into main memory and sorts each relation by the join attribute. Then, for each value of the join attribute, it joins the matching tuples and writes the result out to disk. Because of the memory space limit, each of these steps is carried out on one portion of the data at a time. Finally, the result list on disk is read to complete the operation.
3) The existence of an index on one or more attributes of a relation makes available some algorithms that would not be feasible without the index. Index-based algorithms are especially useful for the selection operator, but algorithms for join and other binary operators also use indexes to very good advantage.
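The three approaches above can be compared in a minimal in-memory sketch. It assumes relations are lists of tuples joined on equality of one column. The sort-based version leaves out the disk passes of the real two-pass algorithm, and a dict stands in for the index. All function names and the sample data are hypothetical.

```python
from collections import defaultdict

def nested_loop_join(r, s, key_r, key_s):
    """For each row of r, scan all of s and keep the matching pairs."""
    return [(a, b) for a in r for b in s if a[key_r] == b[key_s]]

def sort_merge_join(r, s, key_r, key_s):
    """Sort both inputs on the join attribute, then merge them in one pass."""
    r = sorted(r, key=lambda t: t[key_r])
    s = sorted(s, key=lambda t: t[key_s])
    out, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        if r[i][key_r] < s[j][key_s]:
            i += 1
        elif r[i][key_r] > s[j][key_s]:
            j += 1
        else:
            value = r[i][key_r]
            # Find the run of s rows that share this join value.
            j_end = j
            while j_end < len(s) and s[j_end][key_s] == value:
                j_end += 1
            # Pair every r row with this value against the whole run.
            while i < len(r) and r[i][key_r] == value:
                out.extend((r[i], b) for b in s[j:j_end])
                i += 1
            j = j_end
    return out

def index_join(r, s, key_r, key_s):
    """Build an index on s once, then probe it for each row of r."""
    index = defaultdict(list)  # stand-in for an existing index on s
    for b in s:
        index[b[key_s]].append(b)
    return [(a, b) for a in r for b in index.get(a[key_r], [])]

r = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
s = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]

nlj = nested_loop_join(r, s, 0, 0)
smj = sort_merge_join(r, s, 0, 0)
idx = index_join(r, s, 0, 0)
```

All three return the same set of pairs; they differ in how much of s is examined for each row of r, which is what determines the number of disk I/Os in a real system.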
Part 2