Lecture 8

 

Chapter 5 (continued)

 

·       The Master File Update Problem (continued)

 

The pseudocode (not completely correct) created during class was:

 

            Open files
            Priming reads
            #files_at_end = 0
            While #files_at_end < 2
                  If Mkey < Tkey
                        Write NewMaster from OldMaster
                        Read OldMaster
                  Else
                        If Mkey > Tkey
                              If Trans_ID <> “A”
                                    Error in Trans_ID
                              Else
                                    Write NewMaster from Transaction
                              Endif
                              Read TransactionFile
                        Else
                              Case Trans_ID of
                                    “C”:
                                          Move Transaction Data to OldMaster
                                          Write NewMaster from OldMaster
                                          Read OldMaster
                                    “D”:
                                          Read OldMaster
                                    Default:
                                          Error in Trans_ID
                              EndCase
                        Endif
                        Read TransactionFile
                  Endif
            Wend
            Close Files

 

What happens if adds, changes or deletes for the same master record are mixed in the transaction file?
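
The class pseudocode above assumes at most one transaction per master key. One way to cope with mixed adds, changes and deletes for the same key is to hold the record for the current key in a buffer, apply every transaction for that key in file order, and only then write the result. The Python sketch below illustrates that idea; it is not the version developed in class. The in-memory lists, the tuple layouts and the name update_master are assumptions made for the example.

```python
def update_master(old_master, transactions):
    """Merge sorted transactions into a sorted master list (illustrative sketch).

    old_master:   list of (key, data), sorted by key, keys unique (assumed)
    transactions: list of (key, trans_id, data), sorted by key; several
                  transactions may share one key and are applied in order
    Returns (new_master, errors).
    """
    new_master, errors = [], []
    masters, trans = iter(old_master), iter(transactions)
    m = next(masters, None)   # priming reads
    t = next(trans, None)

    while m is not None or t is not None:
        # Master key is lower (or transactions are exhausted): copy unchanged.
        if t is None or (m is not None and m[0] < t[0]):
            new_master.append(m)
            m = next(masters, None)
            continue

        # Otherwise work on the transaction key, buffering its record.
        key = t[0]
        exists, current = False, None
        if m is not None and m[0] == key:
            exists, current = True, m[1]
            m = next(masters, None)

        # Apply every transaction for this key before writing anything.
        while t is not None and t[0] == key:
            _, trans_id, data = t
            if trans_id == "A" and not exists:
                exists, current = True, data
            elif trans_id == "C" and exists:
                current = data
            elif trans_id == "D" and exists:
                exists, current = False, None
            elif trans_id in ("A", "C", "D"):
                errors.append((key, trans_id, "invalid for current record state"))
            else:
                errors.append((key, trans_id, "unknown Trans_ID"))
            t = next(trans, None)

        if exists:
            new_master.append((key, current))

    return new_master, errors
```

With this arrangement an add followed by a change for the same new key works, while a change for a key that does not exist is reported as an error instead of being written.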

 

 

·       Reclaiming Space in Files

·                 The problem revolves around what to do if we want to add, update or delete records in an existing file

·                           Record adds don’t cause problems if we are allowed to add at the end of an existing file

·       Record updates can be viewed as a record deletion followed by a record addition

·       Record deletions are the real problem!

·       Records can be recognized as deleted by placing a special character in a known position of the record (perhaps the first byte)

·       Programs must have special logic to understand that certain records are deleted (unless objects manage the I/O, in which case the object method would worry about this problem)

·       Space can be reclaimed periodically (when more space is needed or on a fixed schedule) by simply copying the file to a new location and not writing the deleted records out.

·       Space can also be reclaimed by compressing the file in place.

·       Which of the two techniques above is faster?
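
A minimal sketch of the mark-then-copy approach described above, in Python. The 16-byte record size, the "*" marker in the first byte and the function names are assumptions made for the example, and io.BytesIO stands in for real files.

```python
import io

RECORD_SIZE = 16   # assumed fixed record length
DELETED = b"*"     # assumed delete marker, stored in the first byte


def mark_deleted(f, rrn):
    """Flag record number rrn as deleted by overwriting its first byte."""
    f.seek(rrn * RECORD_SIZE)
    f.write(DELETED)


def compact_by_copy(src, dst):
    """Copy every live record from src to dst; return how many were kept."""
    src.seek(0)
    kept = 0
    while True:
        record = src.read(RECORD_SIZE)
        if len(record) < RECORD_SIZE:   # end of file
            break
        if record[:1] != DELETED:       # skip records marked as deleted
            dst.write(record)
            kept += 1
    return kept
```

Any program that reads the file directly needs the same first-byte test that compact_by_copy uses, which is the "special logic" mentioned above.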

 

·       Deleting fixed-length Records for Reclaiming Space Dynamically

·       Used when we want to use the deleted space as soon as possible

·       Mark deleted records as above

·       How do we find the deleted records so we can reuse the space?

·       Search sequentially through the file looking for a marked (deleted) record? NOT a good idea: in the worst case this costs one read for every record in the file

·       Linked lists of deleted records with a sentinel on the last record (Available record list)

·       Stacks are generally simpler to use

·       Pointers are usually the Relative Record Number (RRN) in the file

·       One can quickly determine if a file contains empty space and locate the space with one I/O

·       The stack can contain the actual data records with embedded pointers (RRNs) or can be a separate file itself (what are the tradeoffs?)

·       The pointer to the first deleted record in the file can be kept in a header record in the file.  If null, then there are no deleted records
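
The available list for fixed-length records can be sketched as a stack threaded through the deleted slots themselves. In the Python sketch below, the 4-byte header, the 16-byte record size, the "*" marker, the use of -1 as the null pointer and all function names are assumptions for the example; io.BytesIO stands in for a real file.

```python
import io
import struct

HEADER_SIZE = 4    # header holds the RRN of the first available slot
RECORD_SIZE = 16   # assumed fixed record length
NO_SLOT = -1       # assumed null value: no deleted records


def _offset(rrn):
    return HEADER_SIZE + rrn * RECORD_SIZE


def create(f):
    f.seek(0)
    f.write(struct.pack("<i", NO_SLOT))


def read_head(f):
    f.seek(0)
    return struct.unpack("<i", f.read(HEADER_SIZE))[0]


def _write_head(f, rrn):
    f.seek(0)
    f.write(struct.pack("<i", rrn))


def delete_record(f, rrn):
    """Push slot rrn onto the available stack."""
    head = read_head(f)
    f.seek(_offset(rrn))
    f.write(b"*" + struct.pack("<i", head))  # marker, then link to old top
    _write_head(f, rrn)


def add_record(f, data):
    """Reuse the slot on top of the stack, or append if the stack is empty."""
    record = data.ljust(RECORD_SIZE)[:RECORD_SIZE]
    head = read_head(f)
    if head == NO_SLOT:
        f.seek(0, io.SEEK_END)
        rrn = (f.tell() - HEADER_SIZE) // RECORD_SIZE
    else:
        rrn = head
        f.seek(_offset(rrn) + 1)              # skip the delete marker
        next_free = struct.unpack("<i", f.read(4))[0]
        _write_head(f, next_free)             # pop the stack
        f.seek(_offset(rrn))
    f.write(record)
    return rrn
```

Reading the header is the single I/O that tells us whether any free slot exists and where it is. The sketch assumes live data never begins with "*".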

·       Deleting variable-length records

·       We can still use an Available list of deleted records

·       Since records are of differing lengths, we cannot use a RRN as a link between records

·       We must instead use a relative byte offset in the file as a link address

·       Re-using space reclaimed from deleted variable-length records

·       We can’t simply take the first record on the stack since the deleted record slot may be too small to contain the new record to be added

·       We can’t use a simple stack (LIFO) data structure to manage the deleted records; the list has to be searched for a slot that is large enough

·       We may not have a slot in the stack large enough to contain a new record

·       We may need to add especially large records to the end of the file
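
One way to picture the search this requires is a first-fit walk along the available list, using byte offsets as links. The Python sketch below is only an illustration: the 8-byte header, the 4-byte length prefix on every slot, the "*" marker, the -1 null link and the function names are all assumptions, and payloads are assumed to be at least 9 bytes long so that a deleted slot can hold the marker and the link.

```python
import io
import struct

HEADER = struct.Struct("<q")  # byte offset of first available slot
LENGTH = struct.Struct("<i")  # payload size, stored in front of every slot
LINK = struct.Struct("<q")    # next available slot, kept inside a deleted slot
NONE = -1                     # assumed null link


def create(f):
    f.seek(0)
    f.write(HEADER.pack(NONE))


def append_record(f, payload):
    f.seek(0, io.SEEK_END)
    offset = f.tell()
    f.write(LENGTH.pack(len(payload)) + payload)
    return offset


def delete_record(f, offset):
    """Mark the slot at offset and link it in at the front of the list."""
    f.seek(0)
    head = HEADER.unpack(f.read(HEADER.size))[0]
    f.seek(offset + LENGTH.size)
    f.write(b"*" + LINK.pack(head))
    f.seek(0)
    f.write(HEADER.pack(offset))


def first_fit(f, needed):
    """Return (prev, cur, next, size) of the first slot that fits, or None."""
    f.seek(0)
    prev, cur = NONE, HEADER.unpack(f.read(HEADER.size))[0]
    while cur != NONE:
        f.seek(cur)
        size = LENGTH.unpack(f.read(LENGTH.size))[0]
        f.seek(1, io.SEEK_CUR)                      # skip the "*" marker
        nxt = LINK.unpack(f.read(LINK.size))[0]
        if size >= needed:
            return prev, cur, nxt, size
        prev, cur = cur, nxt
    return None


def add_record(f, payload):
    found = first_fit(f, len(payload))
    if found is None:                 # nothing large enough: add at the end
        return append_record(f, payload)
    prev, cur, nxt, size = found
    if prev == NONE:                  # unlink the slot from the list
        f.seek(0)
        f.write(HEADER.pack(nxt))
    else:
        f.seek(prev + LENGTH.size + 1)
        f.write(LINK.pack(nxt))
    f.seek(cur + LENGTH.size)
    f.write(payload.ljust(size))      # leftover bytes in the slot are padded
    return cur
```

Because the reused slot keeps its original size, any leftover bytes are simply padded with spaces in this sketch.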

·       Storage Fragmentation

·       If we force all records to be fixed length (i.e. to be of maximum possible length) then we often waste space at the end of records that are less than maximum length.  We must pad the unused space in these records

·       Wasted space within a record is called internal fragmentation

·       Variable length records eliminate internal fragmentation

·       Deleting a longer variable-length record and replacing it with a shorter one still leaves unused space inside the slot.  The author describes this as internal fragmentation, although strictly speaking it is not quite the same thing

·       The part of the record not used should go on the Available list

·       If we choose to use a slot on the Available list that is too large for the record to be added, we can put the unused space back on the Available list

·       After considerable file activity (deletes and adds) many of the slots on the Available list will be too small to be reused.  We could simply compact the file (as described earlier) or we could attempt to coalesce the holes (consolidate available slots) that are physically adjacent.

·       The problem with coalescing holes is that they are not necessarily adjacent on the Available list!

·       Smart placement strategies can help reduce this problem
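
Both ideas can be sketched on a simplified in-memory model in which each hole is an (offset, size) pair. Sorting the holes by offset is what makes physically adjacent holes easy to find, regardless of their order on the Available list. The three strategies shown are the ones commonly called first fit, best fit and worst fit; the function names and the list-of-tuples representation are assumptions for the example.

```python
def coalesce(holes):
    """Merge holes that are physically adjacent in the file.

    holes: iterable of (offset, size) pairs in any order.
    Returns a new list sorted by offset with adjacent holes combined.
    """
    merged = []
    for offset, size in sorted(holes):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            last_offset, last_size = merged[-1]
            merged[-1] = (last_offset, last_size + size)   # extend previous hole
        else:
            merged.append((offset, size))
    return merged


def choose_slot(holes, needed, strategy="first"):
    """Pick a hole of at least `needed` bytes, or None if none fits."""
    fits = [hole for hole in holes if hole[1] >= needed]
    if not fits:
        return None
    if strategy == "first":    # first hole in list order that is big enough
        return fits[0]
    if strategy == "best":     # smallest hole that is big enough
        return min(fits, key=lambda hole: hole[1])
    if strategy == "worst":    # largest hole, so the leftover stays usable
        return max(fits, key=lambda hole: hole[1])
    raise ValueError("unknown strategy: " + strategy)
```

For example, holes at offsets 40 (10 bytes) and 50 (30 bytes) touch each other, so coalesce combines them into one 40-byte hole at offset 40.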