Computer Science 360

Peripherals and File Design

Lecture 2

 

Chapter 3 – continued

 

 

·        Disk Capacity

·        Sector capacity = # bytes (e.g. 512)

·        Track capacity = #sectors/track (e.g. 40) * sector capacity

·        Cylinder capacity = #tracks/cylinder (e.g. 11) * track capacity

·        Drive capacity = #cylinders/drive (e.g. 1331) * cylinder capacity
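
The capacity hierarchy above multiplies out directly. The following is a minimal Python sketch using the example figures from the bullets (512 bytes per sector, 40 sectors per track, 11 tracks per cylinder, 1331 cylinders per drive). The function and variable names are illustrative only and are not part of the lecture material.

```python
def drive_capacity(bytes_per_sector, sectors_per_track,
                   tracks_per_cylinder, cylinders_per_drive):
    """Return (track, cylinder, drive) capacities in bytes."""
    track = sectors_per_track * bytes_per_sector
    cylinder = tracks_per_cylinder * track
    drive = cylinders_per_drive * cylinder
    return track, cylinder, drive


track, cylinder, drive = drive_capacity(512, 40, 11, 1331)
print(track)     # 20480
print(cylinder)  # 225280
print(drive)     # 299847680
```

With these example figures the drive holds roughly 300 million bytes.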

 

·        Sector Interleaving

·        Logical sectors are not stored physically adjacent on track

·        The goal is to improve performance by placing the next logical sector far enough along the track that the disk controller or main processor has time to process the data just read before that sector arrives under the read/write head

·        Newer, faster controllers often need nothing more than 1:1 interleaving (logical sectors stored in physical order)
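
One way to picture interleaving is to compute where each logical sector lands on the track for a given interleave factor. The sketch below is only an illustration under assumptions of my own: sectors are numbered from 0, the factor is at least 1, and when the target slot is already taken the next free slot is used. Real controllers and formatting utilities may lay sectors out differently.

```python
def interleave_layout(sectors_per_track, factor):
    """Return a list whose index is the physical slot on the track
    and whose value is the logical sector number stored there."""
    layout = [None] * sectors_per_track
    pos = 0
    for logical in range(sectors_per_track):
        # If the target slot is occupied, slide forward to the next free one.
        while layout[pos] is not None:
            pos = (pos + 1) % sectors_per_track
        layout[pos] = logical
        # Skip ahead so the next logical sector sits `factor` slots later.
        pos = (pos + factor) % sectors_per_track
    return layout


print(interleave_layout(8, 1))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(interleave_layout(8, 2))  # [0, 4, 1, 5, 2, 6, 3, 7]
```

With a factor of 2, one physical sector passes under the head between consecutive logical sectors, which is the processing time the interleave is meant to provide.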

 

·        Cluster

·        When the file manager must allocate disk storage for a file, it does so in groups of contiguous sectors called clusters

·        All sectors in a cluster can be read with at most one seek (to the 1st sector in the cluster)

·        The file manager maintains a file allocation table (FAT) that lists all the clusters in a file and the physical location of each cluster

·        The more sectors in a cluster, the fewer seeks required to read all the data in the file when reading sequentially

·        Logical records may at times span sectors
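
Because a cluster can be read with at most one seek, the number of clusters in a file gives an upper bound on the seeks needed to read it sequentially. The sketch below assumes the worst case, in which no two clusters of the file are adjacent on disk. The function name and the file size used are hypothetical.

```python
def max_seeks(file_sectors, sectors_per_cluster):
    """Upper bound on seeks for a sequential read: one per cluster."""
    # Ceiling division: a partly filled last cluster still needs a seek.
    return -(-file_sectors // sectors_per_cluster)


print(max_seeks(2400, 8))   # 300
print(max_seeks(2400, 64))  # 38
```

Growing the cluster from 8 to 64 sectors cuts the worst-case seek count for this 2400-sector file from 300 to 38.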

 

·        Extents

·        An extent is a collection of physically and logically contiguous sectors

·        In a perfect world, every file would have only one extent and thus require minimal seeking for sequential processing

·        Many older systems (e.g. IBM’s MVS) allowed programmers to control extents through Job Control Language (JCL)

·        As the number of extents increases, the file becomes spread out over the disk and performance suffers

·        Many companies sell disk Extent Consolidation software (e.g. Norton’s Speed Disk) to re-arrange data on disk into single extents per file
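
The number of extents in a file can be counted from the list of clusters it occupies. The sketch below assumes the clusters are given in logical (file) order and that consecutive cluster numbers are physically adjacent on disk. Both the representation and the function name are assumptions made for illustration.

```python
def count_extents(clusters):
    """Count runs of consecutive cluster numbers in a file's cluster list."""
    if not clusters:
        return 0
    extents = 1
    for prev, cur in zip(clusters, clusters[1:]):
        # A break in the numbering starts a new extent.
        if cur != prev + 1:
            extents += 1
    return extents


print(count_extents([10, 11, 12, 13]))        # 1
print(count_extents([10, 11, 12, 40, 41, 7])) # 3
```

Extent consolidation software aims to rearrange the second file so that, like the first, it occupies a single run of clusters.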

 

·        Internal Fragmentation

·        The last sector in a file may not be filled – thus resulting in wasted space

·        If a cluster contains N bytes, then the last cluster in a file may contain from 0 to N-1 wasted bytes, since it may not be completely filled

·        This wasted space becomes significant with many small files

·        There is a trade-off between performance (which favors large cluster sizes) and disk utilization (which favors small cluster sizes)
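
The trade-off can be made concrete by computing the unused bytes in a file's last cluster. This is a minimal sketch; the file size, file count and cluster sizes are made-up illustrative values.

```python
def wasted_bytes(file_size, cluster_size):
    """Bytes left unused in the last cluster of a file."""
    if file_size == 0:
        return 0
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size - file_size


# 1000 small files of 600 bytes each, under two cluster sizes.
for cluster_size in (512, 4096):
    total = 1000 * wasted_bytes(600, cluster_size)
    print(cluster_size, total)
# 512 424000
# 4096 3496000
```

For a non-empty file the result always falls between 0 and N-1 for a cluster of N bytes, and the total waste grows quickly when many small files are stored in large clusters.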


·        Variable Capacity Disk Blocks

·        Some systems (e.g. IBM 309x mainframe computers) use variable physical record (block) sizes on tracks rather than sectors

·        Each record contains three physical parts (sub-blocks):  Count, Key, Data

·        Hardware support is available for direct addressing of a block and for searching for a block with a specific key value (or key > range)

·        Searching is often much faster with this scheme since Channel Programming (low level I/O) can be used to off-load much of the processing from the CPU to a special processor concerned only with I/O

·        The record spanning (of sectors) problem and sector fragmentation problem don’t exist

·        A user-program supplied blocking factor indicates the number of records to be stored in each block in a file

·        Space can be wasted at the end of each track if there is insufficient room for the next block (track fragmentation)

·        Question:  How do we compute the track number of a specific block?

·        No record requires two I/O operations for retrieval

·        The programmer or operating system has much greater flexibility at the cost of extra work to organize and manage data in files
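
One possible approach to the track-number question above is sketched here under several assumptions that the notes do not state: all blocks in the file are the same size, blocks are numbered from 0, the file starts at the beginning of its first track, and no block spans two tracks. The per-block overhead (count and key sub-blocks plus gaps) and all numeric values are hypothetical.

```python
def blocks_per_track(track_capacity, record_size, blocking_factor,
                     overhead_per_block):
    """Whole blocks that fit on one track."""
    block_size = record_size * blocking_factor + overhead_per_block
    return track_capacity // block_size


def track_of_block(block_number, per_track):
    """Relative track (0-based) holding the given block."""
    return block_number // per_track


per_track = blocks_per_track(20000, 100, 10, 300)
print(per_track)                      # 15
print(20000 - per_track * 1300)       # 500 bytes unused at end of track
print(track_of_block(47, per_track))  # 3
```

The 500 unused bytes per track in this example are the track fragmentation mentioned above: the remaining space is too small for another 1300-byte block.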

 

·        Pre-formatting (low-level formatting)

·        On sector-addressable disks, low-level information must be stored before each data sector to identify the track address, sector address and condition (usable or defective)

·        Gaps must be placed between these pieces of information to allow the read/write mechanism to distinguish between them

 

·        Disk Access Costs

·        Seek time

·        Most costly deterrent to performance

·        Placing independent files on separate drives often helps

·        A single-user system eliminates arm movement caused by other programs

·        The average seek distance is normally taken to be 1/3 of the total number of cylinders that the read/write head ranges over; the average seek time is the time needed to move the arm that far

·        Rotational delay (latency)

·        Typical speeds on hard disk are 3,600 rpm (16.7 msec per revolution)

·        Floppy disk speeds are typically 360 rpm (166.7 msec per revolution, giving an average rotational delay of 83.3 msec)

·        Interleaving may help reduce the negative effects of sequential access timing problems

·        Transfer time

·        Can be computed as:

 

Transfer time = (# bytes transferred / # bytes per track) * rotation time
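
The three access costs can be explored numerically. The sketch below makes assumptions of its own: the average seek distance is computed for start and target cylinders chosen uniformly and independently at random, the average rotational delay is taken as half a revolution, and the transfer covers data on a single track. The function names are illustrative.

```python
def average_seek_distance(cylinders):
    """Mean of |i - j| over all ordered pairs of cylinders."""
    total = sum(abs(i - j)
                for i in range(cylinders)
                for j in range(cylinders))
    return total / cylinders ** 2


def rotation_time_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60000 / rpm


def average_rotational_delay_ms(rpm):
    """On average the wanted sector is half a revolution away."""
    return rotation_time_ms(rpm) / 2


def transfer_time_ms(bytes_transferred, bytes_per_track, rpm):
    """(# bytes transferred / # bytes per track) * rotation time."""
    return bytes_transferred / bytes_per_track * rotation_time_ms(rpm)


print(round(average_seek_distance(300), 1))         # 100.0
print(round(rotation_time_ms(3600), 1))             # 16.7
print(round(average_rotational_delay_ms(360), 1))   # 83.3
print(round(transfer_time_ms(512, 20480, 3600), 3)) # 0.417
```

For 300 cylinders the average seek distance comes out very close to 100 cylinders, which is the 1/3 rule of thumb. Reading one 512-byte sector from a 40-sector track at 3,600 rpm takes about 0.4 msec, far less than the rotational delay that precedes it.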