5.5. Making the Storage Usable

Once a mass storage device is in place, there is little that it can be used for. True, data can be written to it and read back from it, but without any underlying structure data access is only possible by using sector addresses (either geometrical or logical).
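To make the two addressing styles concrete, the following minimal sketch converts a geometrical (cylinder, head, sector) address into a logical block address and back, using the conventional conversion formula. The geometry used here (16 heads per cylinder, 63 sectors per track) is an illustrative assumption, not a value taken from this text.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder=16, sectors_per_track=63):
    """Convert a geometrical address to a logical block address.

    Cylinders and heads count from 0; sectors count from 1.
    The default geometry is an assumption chosen for illustration.
    """
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range for this geometry")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range for this geometry")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder=16, sectors_per_track=63):
    """Convert a logical block address back to (cylinder, head, sector)."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1
```

Either form identifies a single sector; neither says anything about which file, if any, the sector belongs to.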

What is needed are methods of making the raw storage a hard drive provides more easily usable. The following sections explore some commonly-used techniques for doing just that.

5.5.1. Partitions/Slices

The first thing that often strikes a system administrator is that the size of a hard drive may be much larger than necessary for the task at hand. As a result, many operating systems have the capability of dividing a hard drive's space into various partitions or slices.

Because they are separate from each other, partitions can have different amounts of space utilized, and that space in no way impacts the space utilized by other partitions. For example, the partition holding the files comprising the operating system is not affected even if the partition holding the users' files becomes full. The operating system still has free space for its own use.

Although it is somewhat simplistic, you can think of partitions as being similar to individual disk drives. In fact, some operating systems actually refer to partitions as "drives". However, this viewpoint is not entirely accurate; therefore, it is important that we look at partitions more closely.

5.5.1.1. Partition Attributes

Partitions are defined by the following attributes:

  • Partition geometry

  • Partition type

  • Partition type field

These attributes are explored in more detail in the following sections.

5.5.1.1.1. Geometry

A partition's geometry refers to its physical placement on a disk drive. The geometry can be specified in terms of starting and ending cylinders, heads, and sectors, although most often partitions start and end on cylinder boundaries. A partition's size is then defined as the amount of storage between the starting and ending cylinders.
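As a rough illustration of how size follows from geometry, the sketch below computes the capacity of a partition that starts and ends on cylinder boundaries. The default head count, sectors per track, and 512-byte sector size are assumptions chosen for the example.

```python
def partition_size_bytes(start_cylinder, end_cylinder,
                         heads=16, sectors_per_track=63, sector_size=512):
    """Return the size of a cylinder-aligned partition in bytes.

    Both cylinder numbers are inclusive. The default geometry and
    sector size are illustrative assumptions.
    """
    if end_cylinder < start_cylinder:
        raise ValueError("end cylinder precedes start cylinder")
    cylinders = end_cylinder - start_cylinder + 1
    return cylinders * heads * sectors_per_track * sector_size
```

With the assumed geometry, a partition occupying cylinders 0 through 99 holds 100 cylinders of 1008 sectors each.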

5.5.1.1.2. Partition Type

The partition type refers to the partition's relationship with the other partitions on the disk drive. There are three different partition types:

  • Primary partitions

  • Extended partitions

  • Logical partitions

The following sections describe each partition type.

5.5.1.1.2.1. Primary Partitions

Primary partitions are partitions that take up one of the four primary partition slots in the disk drive's partition table.

5.5.1.1.2.2. Extended Partitions

Extended partitions were developed in response to the need for more than four partitions per disk drive. An extended partition can itself contain multiple partitions, greatly extending the number of partitions possible on a single drive. The introduction of extended partitions was driven by the ever-increasing capacities of new disk drives.

5.5.1.1.2.3. Logical Partitions

Logical partitions are those partitions contained within an extended partition; in terms of use they are no different from a non-extended primary partition.

5.5.1.1.3. Partition Type Field

Each partition has a type field that contains a code indicating the partition's anticipated usage. The type field may or may not reflect the computer's operating system. Instead, it may reflect how data is to be stored within the partition. The following section contains more information on this important point.
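The text does not name a particular partition table format. As one concrete possibility, the sketch below assumes the classic PC master boot record layout: four 16-byte primary slots starting at byte offset 446 of the first sector, each carrying a one-byte type field, followed by a two-byte signature. The lookup table of type codes is deliberately abbreviated, and the function and key names are hypothetical.

```python
import struct

# Abbreviated table of well-known type codes (assumption: classic MBR scheme).
TYPE_NAMES = {
    0x05: "extended",
    0x0F: "extended (LBA)",
    0x07: "NTFS/HPFS",
    0x0B: "FAT32",
    0x82: "Linux swap",
    0x83: "Linux",
}
EXTENDED_TYPES = {0x05, 0x0F}


def parse_mbr(sector):
    """Return the non-empty primary slots found in a 512-byte MBR sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("missing MBR signature")
    partitions = []
    for slot in range(4):
        raw = sector[446 + 16 * slot: 446 + 16 * (slot + 1)]
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped),
        # first sector (LBA), sector count; all little-endian.
        status, type_code, first_lba, sectors = struct.unpack("<B3xB3xII", raw)
        if type_code == 0x00:
            continue  # unused slot
        partitions.append({
            "slot": slot + 1,
            "bootable": status == 0x80,
            "type_code": type_code,
            "type_name": TYPE_NAMES.get(type_code, "unknown"),
            "kind": "extended" if type_code in EXTENDED_TYPES else "primary",
            "first_lba": first_lba,
            "sectors": sectors,
        })
    return partitions
```

Note that in this layout an extended partition is simply a primary slot whose type field marks it as a container; the logical partitions inside it are described elsewhere on the disk and are not decoded by this sketch.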

5.5.2. File Systems

Even with the proper mass storage device, properly configured, and appropriately partitioned, we would still be unable to store and retrieve information easily — we are missing a way of structuring and organizing that information. What we need is a file system.

The concept of a file system is so fundamental to the use of mass storage devices that the average computer user often does not even make the distinction between the two. However, system administrators cannot afford to ignore file systems and their impact on day-to-day work.

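A file system is a method of representing data on a mass storage device. File systems usually include the following features:

  • File-based data storage

  • Hierarchical directory structure

  • Tracking of file creation, access, and modification times

  • Access control

  • Some concept of file ownership

  • Accounting of space utilized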

Not all file systems possess every one of these features. For example, a file system constructed for a single-user operating system could easily use a simpler method of access control and could conceivably do away with support for file ownership altogether.

One point to keep in mind is that the file system used can have a large impact on the nature of your daily workload. By ensuring that the file system you use in your organization closely matches your organization's functional requirements, you can ensure that not only is the file system up to the task, but that it is more easily and efficiently maintainable.

With this in mind, the following sections explore these features in more detail.

5.5.2.1. File-Based Storage

While file systems that use the file metaphor for data storage are so nearly universal as to be considered a given, there are still some aspects that should be considered here.

The first is to be aware of any restrictions on file names. For instance, what characters are permitted in a file name? What is the maximum file name length? These questions are important, as the answers dictate which file names can be used and which cannot. Older operating systems with more primitive file systems often allowed only alphanumeric characters (and only uppercase at that), and only traditional 8.3 file names (meaning an eight-character file name, followed by a three-character file extension).
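The 8.3 restriction described above can be expressed as a short check. This is a simplified sketch: it accepts only uppercase letters and digits, as in the description, and treats the extension as optional. The function name is hypothetical.

```python
import re

# Up to eight uppercase alphanumeric characters, optionally followed by a
# dot and up to three more. Simplified to match the restriction described
# in the text; real file systems may differ in detail.
_EIGHT_DOT_THREE = re.compile(r"[A-Z0-9]{1,8}(\.[A-Z0-9]{1,3})?")


def is_valid_8_3(name):
    """Return True if name fits the simplified 8.3 naming rule."""
    return _EIGHT_DOT_THREE.fullmatch(name) is not None
```

Under this rule REPORT.TXT is acceptable, while report.txt (lowercase) and QUARTERLY_REPORT.TXT (too long, and containing an underscore) are not.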

5.5.2.2. Hierarchical Directory Structure

While the file systems used in some very old operating systems did not include the concept of directories, all commonly-used file systems today include this feature. Directories are themselves usually implemented as files, meaning that no special utilities are required to maintain them.

Furthermore, because directories are themselves files, and directories contain files, directories can therefore contain other directories, making a multi-level directory hierarchy possible. This is a powerful concept with which all system administrators should be thoroughly familiar. Using multi-level directory hierarchies can make file management much easier for you and for your users.
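Because directories nest, a whole hierarchy can be processed with a single recursive walk. The sketch below builds a small two-level tree and lists every file beneath its root; the directory and file names are invented for illustration.

```python
import os


def make_sample_tree(root):
    """Create a small two-level hierarchy under root (names are illustrative)."""
    for department, user in [("accounting", "alice"), ("engineering", "bob")]:
        user_dir = os.path.join(root, department, user)
        os.makedirs(user_dir, exist_ok=True)
        with open(os.path.join(user_dir, "notes.txt"), "w") as handle:
            handle.write("example\n")


def list_tree(root):
    """Return every file under root as a sorted list of root-relative paths."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)
```

The same walk works unchanged no matter how many levels of subdirectories are added later.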

5.5.2.3. Tracking of File Creation, Access, Modification Times

Most file systems keep track of the time at which a file was created; some also track modification and access times. Over and above the convenience of being able to determine when a given file was created, accessed, or modified, these dates are vital for the proper operation of incremental backups.
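To show why these times matter for incremental backups, the sketch below selects only those files modified after the previous backup ran. It assumes the file system exposes a modification time through the operating system's usual file status call; the function name and the idea of storing the last backup time as seconds since the epoch are assumptions made for the example.

```python
import os


def modified_since(root, last_backup_time):
    """Return files under root modified after last_backup_time.

    last_backup_time is expressed in seconds since the epoch, the same
    unit reported in st_mtime.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

A file system that did not record modification times would leave a backup program no cheap way to make this selection.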

More information on how backups make use of these file system features can be found in Section 8.2 Backups.

5.5.2.4. Access Control

Access control is one area where file systems differ dramatically. Some file systems have no clear-cut access control model, while others are much more sophisticated. In general terms, most modern day file systems combine two components into a cohesive access control methodology:

  • User identification

  • Permitted action list

User identification means that the file system (and the underlying operating system) must first be capable of uniquely identifying individual users. This makes it possible to have full accountability with respect to any operations on the file system level. Another often-helpful feature is that of user groups — creating ad-hoc collections of users. Groups are most often used by organizations where users may be members of one or more projects. Another feature that some file systems support is the creation of generic identifiers that can be assigned to one or more users.

Next, the file system must be capable of maintaining lists of actions that are permitted (or not permitted) against each file. The most commonly-tracked actions are:

  • Reading the file

  • Writing the file

  • Executing the file

Various file systems may extend the list to include other actions such as deleting, or even the ability to make changes related to a file's access control.
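One widely used way to record a permitted action list is the POSIX-style mode, which stores read, write, and execute bits separately for a file's owner, its group, and everyone else. The text does not commit to any particular scheme, so treat the following as one example among several; the function name is hypothetical.

```python
import stat


def permitted_actions(mode, who="owner"):
    """Return the actions a POSIX-style mode grants to owner, group, or other."""
    bits = {
        "owner": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        "other": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    }[who]
    names = ("read", "write", "execute")
    return {name for name, bit in zip(names, bits) if mode & bit}
```

For a mode of 0o754, the owner may read, write, and execute the file, the group may read and execute it, and all other users may only read it.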

5.5.2.5. Accounting of Space Utilized

One constant in a system administrator's life is that there is never enough free space, and even if there is, it will not remain free for long. Therefore, a system administrator should at least be able to easily determine the level of free space available for each file system. In addition, file systems with well-defined user identification capabilities often include the capability to display the amount of space a particular user has consumed.

This feature is vital in large multi-user environments, as it is an unfortunate fact of life that the 80/20 rule often applies to disk space — 20 percent of your users will be responsible for consuming 80 percent of your available disk space. By making it easy to determine which users are in that 20 percent, you can more effectively manage your storage-related assets.
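Where the file system records an owner for each file, finding the heaviest consumers is a matter of summing file sizes per owner. The sketch below does this with a directory walk and then picks the smallest group of owners that accounts for a given share of the total. Both function names are hypothetical, and owners are reported as numeric user IDs rather than names.

```python
import os
from collections import defaultdict


def usage_by_owner(root):
    """Return a mapping of owner user ID to total bytes in files under root."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            info = os.lstat(os.path.join(dirpath, name))
            totals[info.st_uid] += info.st_size
    return dict(totals)


def top_consumers(totals, share=0.8):
    """Return owners, largest first, until they cover `share` of all usage."""
    grand_total = sum(totals.values())
    picked, running = [], 0
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    for owner, used in ranked:
        if running >= share * grand_total:
            break
        picked.append(owner)
        running += used
    return picked
```

Walking a large file system this way is slow; file systems with built-in accounting can answer the same question without reading every directory.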

Taking this a step further, some file systems include the ability to set per-user limits (often known as disk quotas) on the amount of disk space that can be consumed. The specifics vary from file system to file system, but in general each user can be assigned a specific amount of storage that the user can use. Beyond that, various file systems differ. Some file systems permit the user to exceed their limit one time only, while others implement a "grace period" during which a second, higher limit is applied.
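The grace period variant can be modeled as a pair of limits: a lower limit that may be exceeded temporarily and a higher limit that may never be exceeded. The sketch below is a simplified model of that decision, not the behavior of any particular file system; every name and parameter in it is hypothetical.

```python
def write_allowed(used, requested, soft_limit, hard_limit,
                  seconds_over_soft, grace_seconds):
    """Decide whether a write fits within a two-level quota.

    used / requested   -- bytes already consumed and bytes to be added
    soft_limit         -- the user's normal limit
    hard_limit         -- the second, higher limit applied during the grace period
    seconds_over_soft  -- how long the user has already been over soft_limit
    grace_seconds      -- length of the grace period
    """
    new_total = used + requested
    if new_total <= soft_limit:
        return True   # within the normal limit
    if new_total > hard_limit:
        return False  # the higher limit is never exceeded
    # Between the two limits: permitted only while the grace period lasts.
    return seconds_over_soft <= grace_seconds
```

Once the grace period has expired, the user in this model must drop back below the lower limit before further writes succeed.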

5.5.3. Directory Structure

Many system administrators give little thought to how the storage they make available to users today is actually going to be used tomorrow. However, a bit of thought spent on this matter before handing over the storage to users can save a great deal of unnecessary effort later on.

The main thing that system administrators can do is to use directories and subdirectories to structure the storage available in an understandable way. There are several benefits to this approach:

  • More easily understood

  • More flexibility

Enforcing some level of structure on your storage makes it easier to understand. For example, consider a large multi-user system. Instead of placing all user directories in one large directory, it might make sense to use subdirectories that mirror your organization's structure. In this way, people who work in accounting have their directories under a directory named accounting, people who work in engineering have their directories under engineering, and so on.

The benefits of such an approach are that it would be easier on a day-to-day basis to keep track of the storage needs (and usage) for each part of your organization. Obtaining a listing of the files used by everyone in human resources is straightforward. Backing up all the files used by the legal department is easy.

With the appropriate structure, flexibility is increased. To continue using the previous example, assume for a moment that the engineering department is due to take on several large new projects. Because of this, many new engineers are to be hired in the near future. However, there is currently not enough free storage available to support the expected additions to engineering.

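Since every person in engineering has their files stored under the engineering directory, it would be a straightforward process to:

  • Procure the additional storage necessary to support engineering

  • Back up everything under the engineering directory

  • Restore the backup onto the new storage

  • Make the necessary changes so that all engineering personnel can access their files on the new storage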

Of course, such an approach does have its shortcomings. For example, if people frequently move between departments, you must have a way of being informed of such transfers, and you must modify the directory structure appropriately. Otherwise, the structure no longer reflects reality, which makes more work — not less — for you in the long run.

5.5.4. Enabling Storage Access

Once a mass storage device has been properly partitioned, and a file system written to it, the storage is available for general use.

For some operating systems, this is true; as soon as the operating system detects the new mass storage device, it can be formatted by the system administrator and may be accessed immediately with no additional effort.

Other operating systems require an additional step. This step, often referred to as mounting, tells the operating system how the storage may be accessed. Mounting storage is normally done via a special utility program or command, and requires that the mass storage device (and possibly the partition as well) be explicitly identified.
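On systems that mount storage, the operating system usually keeps a table recording which device is attached at which point in the directory hierarchy. As an illustration, the sketch below parses text in the whitespace-separated format that Linux exposes in /proc/mounts (device, mount point, file system type, options). The choice of Linux is an assumption, since the text names no operating system, and the function name and sample entries are hypothetical.

```python
def parse_mount_table(text):
    """Parse /proc/mounts-style text into a list of mount descriptions."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue  # skip blank, malformed, or comment lines
        device, mount_point, fs_type, options = fields[:4]
        mounts.append({
            "device": device,
            "mount_point": mount_point,
            "fs_type": fs_type,
            "options": options.split(","),
        })
    return mounts


# Invented sample entries: one device and partition per line.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /home ext4 rw,nosuid 0 0
"""
```

Each entry pairs an explicitly identified device and partition with the directory under which its file system becomes accessible.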