Releasing and Staging

 

Preceding modules discussed archiving, the most important and most configurable of the SAM functions. This module covers releasing and staging. Configuring archiving is crucial; configuring releasing and staging is less essential, and many sites operate with little or no deviation from the defaults.

 

Releasing and staging are parallel in many ways. You can configure a log file for each in the files releaser.cmd and stager.cmd, respectively, both located in /etc/opt/SUNWsamfs. Other general directives can also be set in these files, which are similar in format to the archiver.cmd file. All three consist of single-line entries with fields separated by white space; comments follow the pound sign (#), and an entry can be continued onto the next line with a backslash. Whenever one of these files is edited, the sam-fsd daemon must be forced to reread it with samd config.

 

File attributes for all three functions can also be configured per archive set in the archiver.cmd file. Releasing and staging use parameters appended to archive set declarations. Releasing and staging file attributes can also be set on individual files or on the files in individual directories using the release and stage commands. These commands are similar in format and function to the archive command discussed previously.

 

The Releaser

 

One of the jobs of the SAM software is to make space available in the disk cache when file systems fill. When the disk cache of a SAM-FS or SAM-QFS file system fills, some files may be released to make space for other files. Releasing a file does not delete it from the file system. In the inode of a released file, the pointers to the data blocks containing the file’s data are zeroed. The inode remains, and any references to the released file in directories and symbolic links remain valid. The file remains part of the file system, but its data is available only on archive media.

 

This section describes how the releaser works, how partial releasing works, how releasing can be performed manually using the release command, and how releaser operations such as logging can be customized in the /etc/opt/SUNWsamfs/releaser.cmd and archiver.cmd files.

 

The daemon sam-releaserd is referred to as “the releaser.” It is started by sam-fsd when utilization of the disk cache assigned to a file system exceeds a specified percentage called the high water mark. The sam-releaserd daemon scans the file system and weighs the access, modification, and residence age of each file, along with its size, to determine which files should be released. Based on age and size, it selects a single file and releases it, then checks whether disk cache utilization has fallen below the low water mark. If it has not, sam-releaserd repeats the process until utilization falls below the low water mark or until it cannot find any more files that it can release. At that point sam-releaserd exits.

 

The releaser cannot release files that do not have at least one archive copy (which may or may not be Copy 1). It also cannot release directories and other metadata, files that have been marked for no release, files marked as damaged (damaged files are discussed in the Disaster Recovery module), or files that have not been resident on disk for at least the minimum residence time, which is 10 minutes by default.

 

The high water mark is 80% by default, but can be configured per file system. Use the mount option high=n, where n is the percentage of the disk space assigned to a file system that must be filled before releasing commences. The default low water mark is 70%; it can be set with the mount option low=n.

 

Selection of Files for Releasing

 

The releaser generally chooses the largest and least commonly accessed files for release. Obviously, files that are infrequently or never accessed are good choices for releasing, since few users will be inconvenienced by having to wait until they stage. There is little point to releasing frequently used files no matter how large they are. They will repeatedly stage, using as much disk space as before and additionally using system overhead. Such files should be marked “no release” using the methods discussed later, so they will never be released from the disk. Releasing small files is equally valueless. Their removal from the disk frees little space. The releaser thus targets large, infrequently accessed files.

 

The weights assigned to size and age in determining releasing priority can be (but rarely are) configured in the releaser.cmd file. Check the documentation for details.
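
The selection and water-mark behavior described above can be sketched as a small simulation. This is an illustrative model only, not the sam-releaserd implementation: the CachedFile fields, the priority formula, and its weights are assumptions made for the example. The real weighting is defined by the releaser.cmd directives described in the product documentation.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size_kb: int              # disk cache space used by the file
    age_s: int                # seconds since last access (stand-in for the age terms)
    resident_s: int           # seconds the file has been resident on disk
    archive_copies: int = 1
    no_release: bool = False
    damaged: bool = False


def releasable(f, min_residence_s=600):
    """Eligibility rules from the text; 600 s is the default minimum residence time."""
    return (f.archive_copies >= 1 and not f.no_release
            and not f.damaged and f.resident_s >= min_residence_s)


def priority(f, weight_size=1.0, weight_age=1.0):
    # Hypothetical scoring: larger and older files get a higher release priority.
    return weight_size * f.size_kb + weight_age * f.age_s


def run_releaser(files, capacity_kb, high=80, low=70):
    """Return the names of the files released, in release order."""
    used = sum(f.size_kb for f in files)
    if used * 100 <= high * capacity_kb:
        return []                       # high water mark not exceeded
    released = []
    for f in sorted(filter(releasable, files), key=priority, reverse=True):
        if used * 100 < low * capacity_kb:
            break                       # utilization fell below the low water mark
        used -= f.size_kb
        released.append(f.name)
    return released


files = [
    CachedFile("scan.tif", 400, 90000, 90000),
    CachedFile("log.old", 300, 50000, 50000),
    CachedFile("notes.txt", 100, 10000, 10000),
    CachedFile("fresh.dat", 100, 60, 120),                       # resident < 10 minutes
    CachedFile("pinned.db", 50, 80000, 80000, no_release=True),  # marked no release
]
print(run_releaser(files, capacity_kb=1000))   # ['scan.tif']
```

In this example the 1000-kbyte cache starts at 95% utilization. Releasing scan.tif alone brings it to 55%, which is below the low water mark, so the loop stops without touching the other candidates.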

 

Partial Releasing

 

By default, when a file is released, none of the file’s data remains on disk. It is also possible, however, to leave a small stub of the file on disk. The process of releasing all but a stub is called partial releasing; the default stub, assuming no configuration of partial releasing, is 16 kbytes. Partial releasing can speed file access for users and eliminates the need to stage files for processes that read only a small portion of the file.

 

You can set the size of the stub that will be left on disk when a file is released, within limits. The minimum stub size is always 8 kbytes. Unless configured otherwise, the default and maximum stub size is 16 kbytes.

 

Configuring Releasing

 

Releasing attributes such as partial releasing can be configured on individual files or all files in a directory using the release command.  They can also be set per-archive-set by appending parameters to archive set assignment directives in the archiver.cmd file. Configuration of the minimum residence time, log files and other releaser characteristics is done in the /etc/opt/SUNWsamfs/releaser.cmd file.

 

 


 

Configuring the releaser.cmd File

 

The releaser configuration file, /etc/opt/SUNWsamfs/releaser.cmd, specifies the directives used to control releaser operations globally for all files in all file systems or per file system. The directives in this file allow you to set the minimum residence time of a file and to configure log files, among other things. When the releaser.cmd file is configured or changed, you must force sam-fsd to reread it using samd config.

 

The releaser.cmd file has the same overall format as the archiver.cmd file: Global directives are at the top of the file, while file-system specific directives follow a line of the form fs = file_system_name. Each directive is placed on a separate line, any text that appears after the pound sign character (#) is treated as a comment, etc.

 

The min_residence_age directive specifies the minimum amount of time that a file must reside in a file system prior to its being a candidate for release. The default is 600 seconds. The format of the line in the releaser.cmd file is:

 

min_residence_age = time

 

Where time is a time in seconds.

*PERFORMANCE ISSUE* If most of your work is done during the day, create a script that stages, in batches during the night, the files that will be needed the next day, and set min_residence_age to 12 hours (43200 seconds). The files will stage at night when usage is low, reducing the bottleneck at the drives, and will remain available to users all day.

 

The logfile directive allows you to configure a log file of releasing operations. Its format in the releaser.cmd file is:

 

logfile = path_to_log

 

Where path_to_log is the absolute path to the log file. This file is created automatically if configured. It logs only the activities of sam-releaserd. Manual releasing is not logged.
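
The file format described above (global directives first, then sections introduced by fs = name, with # comments and backslash continuation) can be illustrated with a small parser. This is a sketch for reading such a file in your own tooling, not how sam-fsd parses it. The file system name samfs1, the log path, and the directive values are hypothetical.

```python
def parse_cmd(text):
    """Parse releaser.cmd-style text into {"global": {...}, fs_name: {...}}."""
    logical, pending = [], ""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()      # strip comments
        if line.endswith("\\"):                   # entry continues on the next line
            pending += line[:-1] + " "
            continue
        logical.append(pending + line)
        pending = ""
    if pending:
        logical.append(pending)

    config = {"global": {}}
    section = "global"
    for line in logical:
        if not line.strip():
            continue
        key, _, value = (part.strip() for part in line.partition("="))
        if key == "fs":                           # start of a file-system section
            section = value
            config.setdefault(section, {})
        else:
            config[section][key] = value
    return config


sample = """
# Global directives (values are illustrative)
min_residence_age = 1800        # seconds
logfile = /var/adm/releaser.log

fs = samfs1
min_residence_age = 43200
"""
print(parse_cmd(sample))
```

Directives that appear before the first fs = line land under the "global" key; everything after an fs = line is stored under that file system's name.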

 

The release Command

 

The release command allows a user or administrator to set releasing attributes on files and to release the disk space associated with files. One or more files, up to all the files in a directory, can be released using the following syntax:

 

release filename...

release -r dirname...

 

Unarchived files cannot be released, nor can directories. Releasing attributes can be set on a file or on the files in a directory using the following syntax and options:

release -[adnp] [-s partial_size] filename...

release -[adnp] [-s partial_size] -r dirname...

The release command used with the -a, -d, -n, -p, and -s options sets attributes on the specified file or directory, but does not release the file.

 


 

Release Command Options

| Option | Attribute |
| --- | --- |
| -a | Specifies that the file will be released as soon as one archive copy of the file is made. It can be useful to set this attribute on very large files under some circumstances. The file will have to be staged back to disk before other archive copies are made. This option cannot be used with the -n option. |
| -d | Resets the release attributes on the file to the default values. If used with other options that set attributes, all releasing attributes are first reset to the default, and then the other attribute-setting options are processed. If this option is used on a file that has already been partially released, the partial attribute is reset, and the stub is released. |
| -n | Specifies that the disk space for this file never be released. Only a superuser can set this attribute on a file. This option cannot be used on the command line with the -a or -p options. |
| -p | Sets the partial release attribute on the file. When the file is released, a stub is retained on the disk. This option cannot be used with the -n option. |
| -s partial_size | Sets the size, in kbytes, of the stub that remains on disk when the file is partially released. If this option is used to set a smaller stub on a file that has already been partially released, the partial attribute is reset, and the extra blocks are released. The smallest partial stub that can be set is 8 kbytes. The largest is 16 kbytes by default, but may be reset to be as large as 2 Gbytes using mount options. |
| -r dirname | Recursively releases disk space or sets release attributes for files contained in the specified directory and its subdirectories. More than one directory can be specified. |
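
The option rules in the table can be checked before a command is run. The sketch below only builds an argument list and does not invoke release. The function name and keyword arguments are invented for the example, and only the conflicts stated in the table are enforced.

```python
def release_argv(paths, *, a=False, d=False, n=False, p=False,
                 partial_kb=None, recursive=False):
    """Build an argument list for the release command, rejecting bad combinations."""
    if n and (a or p):
        raise ValueError("-n cannot be combined with -a or -p")
    argv = ["release"]
    for flag, enabled in (("-a", a), ("-d", d), ("-n", n), ("-p", p)):
        if enabled:
            argv.append(flag)
    if partial_kb is not None:
        argv += ["-s", str(partial_kb)]
    if recursive:
        argv.append("-r")
    return argv + list(paths)


print(" ".join(release_argv(["file_name"], p=True, partial_kb=8)))
# release -p -s 8 file_name
```

Returning a list rather than a string keeps file names with spaces intact if the result is later passed to a process-launching call.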

 

Configuring partial releasing with the release command

 

When a file is partially released using the following sequence of commands, the default stub remains on disk:

 

# release -p file_name

# release file_name

 

The -s option to the release command allows you to set the size of the stub that will be left on disk when a file is released, between 8 and 16 kbytes by default. The following sequence of commands would release all but 8 kbytes of a file:

 

# release -p -s 8 file_name

# release file_name

 

 

 

 


 

Configuring Partial Releasing Using Options to mount

 

The 16 kbyte default stub size can be changed using the partial=n option to mount, where n is the size of the default stub in kbytes. The maximum size of the stub that can be left on disk can be configured using the maxpartial=n option to mount, where n is the maximum size of the stub in kbytes. The maxpartial option can be set as high as 2097152 kbytes (2 Gbytes). If maxpartial is set to 0, partial releasing is disabled.
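
The interaction of these limits can be expressed as a short check. The function below is an illustrative sketch, not part of the SAM software; its name and parameters are assumptions. It returns the stub size that would apply to a file, given the two mount settings and an optional per-file size such as one set with release -s.

```python
MIN_STUB_KB = 8                      # minimum stub size, not configurable
MAX_LIMIT_KB = 2 * 1024 * 1024       # 2 Gbytes expressed in kbytes (2097152)


def effective_stub_kb(requested_kb=None, partial_kb=16, max_partial_kb=16):
    """Return the stub size in kbytes, or None if partial releasing is disabled."""
    if not 0 <= max_partial_kb <= MAX_LIMIT_KB:
        raise ValueError("maximum stub size must be between 0 and 2097152 kbytes")
    if max_partial_kb == 0:
        return None                  # a maximum of 0 disables partial releasing
    size = partial_kb if requested_kb is None else requested_kb
    if not MIN_STUB_KB <= size <= max_partial_kb:
        raise ValueError(
            f"stub must be between {MIN_STUB_KB} and {max_partial_kb} kbytes")
    return size


print(effective_stub_kb())                        # 16
print(effective_stub_kb(8))                       # 8
print(effective_stub_kb(32, max_partial_kb=32))   # 32
```

With the default settings, a request for a 32 kbyte stub raises an error, because the maximum must first be raised with the corresponding mount option.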

 

Configuring Releasing Attributes in the archiver.cmd File

 

Releasing attributes can also be applied to all files in an archive set by configuring the archiver.cmd file. One or more releasing directives can be appended to an archive set assignment directive using the following format:

 

archive_set_name path [search criteria …] releasing_directives …

 

Where releasing_directives may be:

-release a to release files in the archive set as soon as the first archive copy is made (equivalent to release -a filename).

-release n to never release the files in the archive set (equivalent to release -n filename).

-release p to partially release files in the archive set (equivalent to release -p filename). Only the default stub will be left.

-release sXX, where XX is the size in kbytes of the partial stub to remain on disk.

 

The following example shows an archive set directive from the archiver.cmd file. The archive set is called programs and contains any file in the directory /sam1/development. Files in this archive set are never released:

 

programs  development -release n

 

Releasing can also be configured on individual archive set copies with the -release and -norelease options. These parameters immediately follow the archive set copy number. If -release is placed after the copy number, the file is forcibly released as soon as that copy is made. If -norelease is placed after the copy number, files in that archive set will not be released until that copy has been made, and perhaps not even then if the high water mark is not exceeded. Use -release when you need to reclaim disk space as soon as possible; use -norelease when you want to make sure a file is not released until a specified copy has been made.

 

arset1 .
    1 10m
    2 -release 20m

 

In the example above, any file archived with the archive set arset1 will be released once copy 2 has been made.

 

The following configuration prevents release of any file in the archive set arset1 until all archive copies are made. Once all archive copies are made, the file is immediately released from the disk.

 

 


 

arset1 .
    1 -release -norelease 10m
    2 -release -norelease 20m

 

This is a very popular configuration for file systems with very large files, or where the main use of the file system is to send data to tape. It is economical of tape drive use, because files will not have to be staged in order to make archive copies. You cannot get the same effect by placing -release after copy 2 alone, as in the previous example. Even though copy 1 has a shorter archive age than copy 2, it may still be archived later because of archiving queues, or because it takes more time to archive to the media used by copy 1. Once copy 2 is archived, the file may be released, and it would then have to be staged in order to archive copy 1. Placing -norelease next to every archive copy prevents the file from being released until every copy has been made, no matter how long that takes.
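
One way to reason about the two examples is to model the copy-level flags as a small decision function. This is a reading of the behavior described above, not the archiver's actual logic. The function name, the flag representation, and the three result strings are assumptions made for illustration.

```python
def release_decision(copy_flags, copies_made):
    """copy_flags maps copy number -> set of flags; copies_made is a set of copy numbers.

    Returns "hold" (must stay on disk), "release_now" (forced release),
    or "eligible" (left to the releaser and the water marks).
    """
    if not copies_made:
        return "hold"                      # at least one archive copy is required
    for copy, flags in copy_flags.items():
        if "-norelease" in flags and copy not in copies_made:
            return "hold"                  # a protected copy is still missing
    for copy, flags in copy_flags.items():
        if "-release" in flags and copy in copies_made:
            return "release_now"
    return "eligible"


first = {1: set(), 2: {"-release"}}
both = {1: {"-release", "-norelease"}, 2: {"-release", "-norelease"}}

print(release_decision(first, {2}))      # release_now, although copy 1 is missing
print(release_decision(both, {2}))       # hold
print(release_decision(both, {1, 2}))    # release_now
```

The first call shows the pitfall discussed in the text: with -release on copy 2 only, the file can be released before copy 1 exists and must then be staged to make that copy.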

 

Summary of partial releasing:

| Setting | Set with | Applies to | Default | Range |
| --- | --- | --- | --- | --- |
| Partial releasing | release -p on file (or directory with -r); -release p on archive set | Files and directories; archive sets | Not used | |
| Default stub | mount -o partial=n | Entire file systems | 16 kbytes | 8 kbytes to maxpartial |
| Specified stub | release -p -s partial_size on file (or directory with -r); -release sXX on archive set | Files and directories; archive sets | Not used | 8 kbytes to maxpartial |
| Minimum stub | Not configurable | | 8 kbytes | Not configurable |
| Maximum stub | mount -o maxpartial=n | Entire file systems | 16 kbytes | 16 kbytes to 2 Gbytes |

 

 

Viewing Releasing Attributes

 

Releasing attributes are displayed in the output from sls -D. The example below shows a file that has been configured to release all but a 32 kbyte stub, although the file has not yet been released. If it had been released, the “offline” flag would appear in the output. This output implies that the maxpartial=n option to mount has been set to at least 32 for the file system in which this file resides:

 

memo1:
  mode: -r-xr-xr-x    links: 1    owner: root    group: other
  length: 10276    admin id: 0    inode: 1083.1
  release -p partial = 32k
  copy 1: ----    Nov 26 13:59    14f.11fb    dt    TAPE03
  copy 2: ----    Nov 26 14:18    2d.1        dk    backup1 f45
  access:   Nov 26 13:57    modification: Nov 26 13:57
  changed:  Nov 26 13:57    attributes:   Feb 27 16:17
  creation: Nov 26 13:57    residence:    Feb 27 16:17

 


 

Releasing Defaults

 

The table below lists the releasing defaults discussed and summarizes the configuration methods of each.

 

Releasing Defaults

| Parameter | Default | How Configured |
| --- | --- | --- |
| Utilization of the disk cache assigned to one file system required to trigger releasing (high water mark) | 80% | high=n mount option |
| Disk cache utilization below which releasing terminates (low water mark) | 70% | low=n mount option |
| Number of archive copies that must have been made before a file can be released | One | Not configurable |
| Minimum amount of time a file must be resident on disk before it can be released (minimum residence time) | 10 minutes | min_residence_age directive in releaser.cmd file |
| Partial releasing enabled | No | -p option to release command, or "-release p" parameter appended to archive set declaration in archiver.cmd file |
| Default size of the stub left on disk when a file is partially released | 16 kbytes | partial=n mount option |
| Actual size of the stub left on disk when a file is partially released | Default stub | -s partial_size option to release |
| Maximum size of the stub that can be left on disk when a file is partially released | 16 kbytes | maxpartial=n mount option |
| Logging enabled | No | logfile directive in releaser.cmd file |

 

 


 

Staging

 

When a released file is accessed, it is normally returned to disk by a process called staging. In Release 4.6 of the software, a file may also be read directly from an archive copy without being written to disk.

 

The two stager daemons, collectively referred to as “the stager,” are sam-stagerd, which handles most staging requests, and sam-stagealld, which handles associative staging. Associative staging occurs when staging a single file in a directory or archive set causes all the other files in that directory or archive set to be staged as well. It saves time when applied to files that are usually accessed as a set, such as medical images of a single injury.

 

For example, associative staging might be used when a physician accesses one X-ray of a group of X-rays taken of the same injury. Because the physician will probably look at the other images of the injury as well, it saves time if they are all staged when the first image is staged.
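
The effect of associative staging can be sketched with a toy model of one directory. The data structure, file names, and function below are hypothetical and only illustrate the behavior; they do not reflect how sam-stagealld is implemented.

```python
def stage(name, directory, associative=False):
    """Stage `name`; directory maps file name -> "offline" or "online".

    Returns the names brought back to disk, the requested file first.
    """
    targets = [name]
    if associative:
        # Associative staging pulls in every other file in the same directory.
        targets += sorted(n for n in directory if n != name)
    staged = [n for n in targets if directory[n] == "offline"]
    for n in staged:
        directory[n] = "online"
    return staged


xrays = {
    "knee_front.img": "offline",
    "knee_side.img": "offline",
    "knee_top.img": "online",     # already on disk, so never staged again
}
print(stage("knee_side.img", dict(xrays)))
# ['knee_side.img']
print(stage("knee_side.img", dict(xrays), associative=True))
# ['knee_side.img', 'knee_front.img']
```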

 

You can modify the default behavior of staging daemons by creating and configuring the /etc/opt/SUNWsamfs/stager.cmd file, although staging is usually straightforward and configuration is rarely useful.

 

The staging behavior of files may be modified using the stage command or by setting directives on archive sets in the archiver.cmd file. The stage command allows you to force all files in a directory to stage associatively when any one of the files in the directory stages. Directives in the archiver.cmd file allow you to force all files in an archive set to stage associatively.

 

File Access and the “never stage” Attribute

 

Files can also be configured never to stage. If a file is configured never to stage and is subsequently released, it will not return to disk when it is accessed, but will instead be read directly from the archive copy. Unless the archive copy is on an online disk archive, this requires the use of a tape drive.

 

In the single-writer, multiple-reader file system, a released file set never to stage cannot be accessed by a reader system at all; access to the file is denied. In the shared QFS file system, a released file set never to stage is read directly from the archive copy if it is accessed by the server host. If it is accessed by a client host, the file stages even though the “never stage” attribute has been set on it. Because of this inconsistent behavior, Sun Microsystems does not support the stage -n attribute on the shared QFS file system (Jonathan Kennedy and Spencer McEwen, Harvard University Library).

 

This section of the module discusses the use of the stage command to force individual files or all files in a directory to stage. It also discusses configuration of staging file attributes using the stage command and using entries in the archiver.cmd file.

 

The stage Command

 

When a user or process accesses a released file, that file will automatically stage. It is also possible to stage one file or all the files in a directory directly with the stage command. The stage command can also be used to set staging attributes on a directory or file.

 

To stage a file or all files in a directory, use the following syntax:

stage [-c copy_number] filename...

stage [-c copy_number] -r dirname...

To set staging attributes on a file or the files in a directory, use the following syntax:

stage [-d] [-n] filename...

stage [-d] [-n] [-a] -r dirname...

 


 

| Option | Attribute |
| --- | --- |
| -a | Must be used with -r dirname. Specifies that all files in the directory will be staged associatively; that is, if one file is staged, the rest will be staged as well. This option cannot be used with the -n or -c options. |
| -c copy_number | Stages a file or all files in a directory from the archive copy specified as copy_number. Copy 1 of a file is staged by default. |
| -d | Resets staging attributes on a file or directory to the defaults. This option cannot be used with the -c option. |
| -n | Specifies that the file never be staged. This option cannot be used with the -c or -a options. |
| -r dirname | Recursively sets staging attributes on files in the directory if used with -n or -a, or recursively stages all offline files in a directory if used with no options or with the -c option. Any attribute set on a directory using this option is inherited by files added to the directory after the attribute is set. |

 

Staging can be configured in the archiver.cmd file by appending staging parameters to archive set assignments. Such parameters apply to all files in the archive set. The format of the archive set assignment directive is:

 

archive_set_name       path     [search criteria …]     staging_directives…

Where staging_directives may be:

 

-stage n to never stage files in the archive set.

-stage a to stage files in the archive set associatively.

 

The following example shows an archive set directive from the archiver.cmd file. The archive set is called xrays and contains any file in the directory /sam1/images. Files in this directory will stage associatively.

 

xrays   images   -stage   a

 

Viewing Staging Attributes

 

Staging attributes appear in the output of sls -D. The following example shows a file configured for associative staging. If associative staging was set by issuing the stage -a -r dirname command, this file will stage when any other file in its directory is staged. If associative staging is set by appending -stage a to an archive set assignment directive, the file will stage when any file in its archive set is staged.

 


 

memo1:
  mode: -r-xr-xr-x    links: 1    owner: root    group: other
  length: 10276    admin id: 0    inode: 1083.1
  stage -a
  copy 1: ----    Nov 26 13:59    14f.11fb    dt    TAPE03
  copy 2: ----    Nov 26 14:18    2d.1        dk    backup1 f45
  access:   Nov 26 13:57    modification: Nov 26 13:57
  changed:  Nov 26 13:57    attributes:   Feb 27 16:17
  creation: Nov 26 13:57    residence:    Feb 27 16:17

 
