Peripherals and File Design
· Disk Capacity
· Sector capacity = # bytes (e.g. 512)
· Track capacity = #sectors/track (e.g. 40) * sector capacity
· Cylinder capacity = #tracks/cylinder (e.g. 11) * track capacity
· Drive capacity = #cylinders/drive (e.g. 1331) * cylinder capacity
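The capacity chain above can be worked through with the example figures from the list. This is only a sketch: the constant names are illustrative, and real drives will have different geometry.

```python
# Example figures taken from the list above; real drives will differ.
BYTES_PER_SECTOR = 512
SECTORS_PER_TRACK = 40
TRACKS_PER_CYLINDER = 11
CYLINDERS_PER_DRIVE = 1331

track_capacity = SECTORS_PER_TRACK * BYTES_PER_SECTOR
cylinder_capacity = TRACKS_PER_CYLINDER * track_capacity
drive_capacity = CYLINDERS_PER_DRIVE * cylinder_capacity

print(track_capacity)     # 20480 bytes
print(cylinder_capacity)  # 225280 bytes
print(drive_capacity)     # 299847680 bytes
```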
· Sector Interleaving
· Logically consecutive sectors are not stored physically adjacent on the track
· The goal is to improve performance by placing the next sector to be read far enough along the track that the disk controller or main processor has time to process the data just read before that sector arrives under the head
· Newer, faster controllers often need nothing more than 1:1 interleaving (that is, no interleaving)
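One common way to lay out an interleaved track can be sketched as follows. The function name and the slot-skipping rule are assumptions made for illustration; actual controllers and formatting utilities may assign sectors differently.

```python
def interleaved_layout(sectors_per_track, interleave):
    """Return a list where index = physical slot and value = logical sector.

    interleave = 1 means logical sectors are physically adjacent (1:1);
    interleave = 2 leaves one physical slot between consecutive logical
    sectors, and so on.
    """
    layout = [None] * sectors_per_track
    slot = 0
    for logical in range(sectors_per_track):
        # Skip forward past slots that are already occupied.
        while layout[slot] is not None:
            slot = (slot + 1) % sectors_per_track
        layout[slot] = logical
        slot = (slot + interleave) % sectors_per_track
    return layout

print(interleaved_layout(8, 1))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(interleaved_layout(8, 2))  # [0, 4, 1, 5, 2, 6, 3, 7]
```

With 2:1 interleaving the controller has one sector time to process logical sector 0 before logical sector 1 passes under the head.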
· Cluster
· When the file manager must allocate disk storage for a file, it does so in groups of contiguous sectors called clusters
· All sectors in a cluster can be read with at most one seek (to the 1st sector in the cluster)
· The file manager maintains a file allocation table (FAT) that lists all the clusters in a file and the physical location of each cluster
· The more sectors in a cluster, the fewer seeks required to read all the data in the file when reading sequentially
· Logical records may at times span sectors
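A minimal sketch of how a cluster list could be used to find the physical sector holding a given byte of a file. The table layout, file name, cluster size and sector numbers here are all hypothetical; a real FAT is organized differently on disk.

```python
SECTORS_PER_CLUSTER = 4   # assumed cluster size, in sectors
BYTES_PER_SECTOR = 512    # example sector size used in these notes

# Hypothetical table: file name -> starting physical sector of each cluster,
# listed in the file's logical order.
fat = {"payroll.dat": [200, 204, 880]}

def locate(filename, byte_offset):
    """Map a byte offset in a file to (physical sector, offset in sector)."""
    logical_sector, offset_in_sector = divmod(byte_offset, BYTES_PER_SECTOR)
    cluster_index, sector_in_cluster = divmod(logical_sector, SECTORS_PER_CLUSTER)
    cluster_start = fat[filename][cluster_index]
    return cluster_start + sector_in_cluster, offset_in_sector

print(locate("payroll.dat", 0))     # (200, 0)
print(locate("payroll.dat", 5000))  # (881, 392)
```

In this made-up table the first two clusters are physically adjacent, so only the jump to the third cluster would need a new seek during a sequential read.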
· Extents
· An extent is a collection of physically and logically contiguous sectors
· In a perfect world, every file would have only one extent and thus require minimal seeking for sequential processing
· Many older systems (e.g. IBM’s MVS) allowed programmers to control extents through Job Control Language (JCL)
· As the number of extents increases, the file becomes spread out over the disk and performance suffers
· Many companies sell disk Extent Consolidation software (e.g. Norton’s Speed Disk) to re-arrange data on disk into single extents per file
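To make the idea of an extent concrete, the sketch below counts the extents in a file, given the physical cluster numbers it occupies in logical order. The function name and the sample cluster numbers are invented for illustration.

```python
def count_extents(cluster_numbers):
    """Count runs of physically consecutive clusters, taken in file order."""
    if not cluster_numbers:
        return 0
    extents = 1
    for previous, current in zip(cluster_numbers, cluster_numbers[1:]):
        # A break in the numbering starts a new extent (and costs a seek).
        if current != previous + 1:
            extents += 1
    return extents

print(count_extents([10, 11, 12, 13]))         # 1
print(count_extents([10, 11, 40, 41, 42, 7]))  # 3
```

Consolidation software aims to rearrange the second file so that it looks like the first.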
· Internal Fragmentation
· The last sector in a file may not be filled – thus resulting in wasted space
· If a cluster contains N bytes, then the last cluster in a file may contain from 0 to N-1 wasted bytes, since it may not be completely filled
· This wasted space becomes significant with many small files
· There is a trade-off between performance (which favors large cluster sizes) and disk utilization (which favors small cluster sizes)
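The trade-off can be seen with a small calculation. The file sizes and cluster sizes below are made-up examples, and the helper name is an assumption.

```python
def wasted_bytes(file_size, cluster_bytes):
    """Unused bytes in the last cluster of a file of the given size."""
    return -file_size % cluster_bytes

file_sizes = [100, 700, 3000, 5000]   # hypothetical small files, in bytes

for cluster_bytes in (512, 4096):
    total = sum(wasted_bytes(size, cluster_bytes) for size in file_sizes)
    print(cluster_bytes, total)
# 512 928
# 4096 11680
```

For these four files, the larger cluster wastes more than twelve times as much space, even though it would need fewer seeks when each file is read sequentially.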
· Variable Capacity Disk Blocks
· Some systems (e.g. IBM 309x mainframe computers) use variable physical record (block) sizes on tracks rather than sectors
· Each record contains three physical parts (sub-blocks): Count, Key, Data
· Hardware support is available for direct addressing of a block and for searching for a block with a specific key value (or with a key greater than a given value)
· Searching is often much faster with this scheme since Channel Programming (low level I/O) can be used to off-load much of the processing from the CPU to a special processor concerned only with I/O
· The record spanning (of sectors) problem and sector fragmentation problem don’t exist
· A blocking factor supplied by the user program indicates the number of records to be stored in each block in a file
· Space can be wasted at the end of each track if there is insufficient room for the next block (track fragmentation)
· Question: How do we compute the track number of a specific block?
· No record requires two I/O operations for retrieval
· The programmer or operating system has much greater flexibility at the cost of extra work to organize and manage data in files
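One possible answer to the track-number question above, under simplifying assumptions: every block in the file has the same size, each block carries a fixed overhead (count and key sub-blocks plus gaps), blocks never span tracks, and blocks and tracks are numbered from 0 relative to the start of the file. All names and figures are hypothetical.

```python
def blocks_per_track(track_bytes, block_bytes, overhead_bytes):
    """Whole blocks that fit on one track, including per-block overhead."""
    return track_bytes // (block_bytes + overhead_bytes)

def track_of_block(block_number, track_bytes, block_bytes, overhead_bytes):
    """Relative track number (from 0) that holds the given block."""
    per_track = blocks_per_track(track_bytes, block_bytes, overhead_bytes)
    return block_number // per_track

def track_fragmentation(track_bytes, block_bytes, overhead_bytes):
    """Bytes left unused at the end of each track."""
    per_track = blocks_per_track(track_bytes, block_bytes, overhead_bytes)
    return track_bytes - per_track * (block_bytes + overhead_bytes)

# Made-up figures: 20,480-byte track, 3,000-byte blocks, 200 bytes overhead.
print(blocks_per_track(20480, 3000, 200))     # 6
print(track_of_block(15, 20480, 3000, 200))   # 2
print(track_fragmentation(20480, 3000, 200))  # 1280
```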
· Pre-formatting (low-level formatting)
· On sector-addressable disks, low-level information must be stored before each data sector to identify the track address, sector address and condition (usable or defective)
· Gaps must be placed between these pieces of information to allow the read/write mechanism to distinguish between them
· Disk Access Costs
· Seek time
· Most costly deterrent to performance
· Multiple drives to hold independent files often helps
· Single user system eliminates arm movement by other programs
· The average seek distance is normally taken to be 1/3 of the total number of cylinders that the read/write head ranges over, so the average seek time is the time needed to cross that many cylinders
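The 1/3 figure can be checked by brute force. The sketch assumes that the starting and target cylinders are independent and uniformly random, which is the usual assumption behind the rule of thumb; the cylinder count of 200 is arbitrary.

```python
def average_seek_distance(cylinders):
    """Mean |start - target| over all equally likely cylinder pairs."""
    total = sum(abs(start - target)
                for start in range(cylinders)
                for target in range(cylinders))
    return total / cylinders ** 2

n = 200
ratio = average_seek_distance(n) / n
print(ratio)  # close to 1/3
```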
· Rotational delay (latency)
· Typical speeds on hard disks are 3,600 rpm (16.7 msec per revolution)
· Floppy disk speeds are typically 360 rpm (166.7 msec per revolution, or about 83.3 msec average latency)
· Interleaving may help reduce the negative effects of sequential access timing problems
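Rotation time follows directly from the spindle speed. The average latency is taken here to be half a revolution, which assumes the wanted sector is equally likely to be anywhere on the track. The function names are illustrative.

```python
def rotation_time_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000 / rpm

def average_latency_ms(rpm):
    """Expected rotational delay: half a revolution on average."""
    return rotation_time_ms(rpm) / 2

print(round(rotation_time_ms(3600), 1))    # 16.7
print(round(average_latency_ms(3600), 1))  # 8.3
```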
· Transfer time
· Can be computed as:
Transfer time = (# bytes transferred / # bytes per track) * rotation time
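A worked instance of the transfer-time formula, using the example geometry from these notes (40 sectors of 512 bytes per track, 3,600 rpm). The function name is an assumption.

```python
def transfer_time_ms(bytes_transferred, bytes_per_track, rotation_ms):
    """Transfer time = fraction of a track read * time for one revolution."""
    return bytes_transferred / bytes_per_track * rotation_ms

BYTES_PER_TRACK = 40 * 512     # 20480 bytes, from the example geometry
ROTATION_MS = 60_000 / 3600    # about 16.7 msec per revolution

# One 512-byte sector, then four sectors (2048 bytes).
print(round(transfer_time_ms(512, BYTES_PER_TRACK, ROTATION_MS), 3))   # 0.417
print(round(transfer_time_ms(2048, BYTES_PER_TRACK, ROTATION_MS), 3))  # 1.667
```

Both figures are small compared with a seek or with a full revolution, which is why seek time and rotational delay dominate the cost of a random access.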