Resource management

 

Solaris 10 enables you to configure containers, which consist of an instance of Solaris running in a zone, along with a set of resources assigned to that instance. This paper discusses the assignment of memory and CPU priority to processes running in zones.

 

Overview of resource management

 

Resource management in Solaris has been around since System V Unix, was fully implemented in Solaris 9, and is a very large topic. Originally resource management controlled the use of memory and CPU on a per-process basis. Later the concept of projects was added to Solaris specifically to allow easier implementation of resource management. A project is an entry in the project database (defined in /etc/project or an equivalent name service) to which processes can be assigned and for which resource controls can be configured. The project ID is attached to a process the same way a user ID, group ID and process ID are, and allows the kernel to control the process based on its project ID the way it controls processes based on the UID and GID. A given project can be set up to include all of a user’s processes, all the processes started by a group, or all the processes started by an application. A project can also be defined with resource controls which specify how much CPU, memory and other resources can be used by processes running under the project.

 

Default Projects

 

Every process in the Solaris Operating System is assigned to a project. By default these are:

 

system - all system processes

user.root - all processes belonging to user root

group.staff - all processes belonging to members of the group “staff”

default - all other processes belonging to users

noproject - any process that does not fit into a defined project

 

Managing projects

 

A project can be created with:

 

# projadd <project name>

 

for example:

 

# projadd maryann

 

Look in /etc/project to see the newly created project and the default projects:

 

# cat /etc/project
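Each line of /etc/project describes one project in six colon-separated fields: name, project ID, comment, user list, group list and attributes. As a rough illustration of that layout, the Python sketch below splits one such line into its fields. The sample line (a project maryann with ID 100 and a cpu-shares attribute) and the function name parse_project_line are made up for the example; they are not output from a real system.

```python
def parse_project_line(line):
    """Split one /etc/project entry into a dictionary of its six fields."""
    name, projid, comment, users, groups, attrs = line.rstrip("\n").split(":", 5)
    return {
        "name": name,
        "projid": int(projid),
        "comment": comment,
        "users": [u for u in users.split(",") if u],
        "groups": [g for g in groups.split(",") if g],
        # Attributes are separated by semicolons, e.g. "a=1;b=2".
        "attrs": [a for a in attrs.split(";") if a],
    }

# Hypothetical entry, not taken from a real system.
sample = "maryann:100:demo project:maryann::project.cpu-shares=(privileged,10,none)"
entry = parse_project_line(sample)
print(entry["name"], entry["projid"], entry["attrs"])
```

An empty user or group field, as in the sample, simply yields an empty list.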

 

The project name assigned to a process can be seen in the output of ps:

 

# ps -eo project,pid,user

 

Users and groups are associated with projects in the /etc/project database. Processes may be attached to projects through an application if that application is “project-aware,” which means that it has been written to use Solaris projects. Applications that are not project-aware must be started manually with the command newtask:

 

# newtask -p <project_name> <command>

 

for example:

# newtask -p maryann sleep 100&

 

Any project defined in /etc/project can be attached to a process this way:

 

# newtask -p user.root sleep 200&

 

An application can also be attached to a project by a manifest in the Service Management Facility (SMF) that starts the application with newtask. The project ID is then propagated to all processes in the process tree started by the application.

 

View the “sleep” processes with their associated projects:

 

# ps -eo project,pid,user | grep <name of project>

 

for example:

 

# ps -eo project,pid,user | grep maryann

 

All child processes inherit the parent’s project:

 

# newtask -p maryann sh &

 

(In new terminal window)

 

# sleep 300 &

 

# ps -eo project,pid,user | grep maryann

 

A zone is a process tree under a single process, zsched. Resource controls can therefore be applied to zsched by the zoneadmd daemon, and thence propagated to every other process in the zone.

 

Resource Control

 

The resource control associated with a process or a zone can be any of those listed in the man page for resource_controls(5). For example, project.cpu-shares configures Fair Share Scheduling (discussed later) for a project, rcap.max-rss sets the total memory allowed to a project, and project.max-shm-memory sets shared memory for a project. Only three types of resource control are currently available for zones: zone.max-lwps, which controls the number of lightweight processes a single zone can run; zone.cpu-shares, which configures Fair Share Scheduling for a zone; and CPU pool assignment, which designates the number of processors a zone can use.

 

Resource Management in Zones

 

Resource management in Solaris 10 can easily be implemented through zones since all processes in the zone are child processes of the zone’s zsched process. Zones therefore constitute a very convenient way to implement resource management with or without projects. Any application started in the zone is automatically subject to the resource usage limitations configured for the zone. No project configuration is required. For zones, Solaris 10 so far only allows resource management of lightweight processes and of processor time through processor pools and Fair Share Scheduling. Memory and other resources can still be managed through projects inside the zone, as they would be in the global zone. Most forms of resource management still apply exclusively to projects; for example, control of memory usage must still be implemented through projects, as in older versions of Solaris.

 

Setting Resource Controls on Zones and Projects

 

Setting a resource control requires three parameters to be set for the zone in the zone configuration file or for the project in the projects database: 1) the “privilege,” which specifies the user allowed to change the value of the resource control, 2) the action to take if the process exceeds the limit, and 3) the actual resource type to be configured and its value. The possible settings for the privilege and action are:

 

privilege=basic (users can change the controls), privileged (privileged users such as root can change controls) or system (cannot ever be changed)

 

action=deny (don’t let the process ever have more than the specified resources even if no other processes need the resource), signal=signal_number (send the signal to the process if it exceeds the limit) or none (do nothing if the process exceeds the limit specified). The action is meaningless for Fair Share Scheduling and is always set to “none.”

 

The types of limits vary depending on the resource being configured. Consult the man page for the specific resource type for options.
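To make the three parameters concrete, here is a small Python sketch that takes a value written the way zonecfg expects it, for example (priv=privileged,limit=40,action=none), and checks the privilege and action against the settings listed above. The function name parse_rctl_value is invented for illustration, and the check covers only the values named in this paper, not everything the system may accept.

```python
PRIVILEGES = {"basic", "privileged", "system"}

def parse_rctl_value(text):
    """Parse '(priv=...,limit=...,action=...)' into a dict and sanity-check it."""
    inner = text.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError("value must be enclosed in parentheses")
    fields = dict(item.split("=", 1) for item in inner[1:-1].split(","))
    if fields["priv"] not in PRIVILEGES:
        raise ValueError("unknown privilege: " + fields["priv"])
    action = fields["action"]
    # The paper lists deny, none and signal=<signal_number> as actions.
    if action not in ("deny", "none") and not action.startswith("signal"):
        raise ValueError("unknown action: " + action)
    fields["limit"] = int(fields["limit"])
    return fields

print(parse_rctl_value("(priv=privileged,limit=40,action=none)"))
```

A value with a privilege other than basic, privileged or system is rejected with a ValueError.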

 

For most resource control configuration the only reasonable action is “none.” The action “deny” makes sense when a process might run away. Usually the application running such a process will include the resource control information as part of the setup instructions. For example, the recommended project shared memory for an Oracle database is usually ¼ of the available physical memory and the action specified is “deny.”

To figure out a reasonable level of processor use for a project, configure resource control for the project with the action “none”. Configure logging with rctladm -e syslog zone.max-lwps (or other resource control). The only action triggered when the project exceeds its share is then a message sent to syslog at level debug. Increase the share until you stop seeing the message, then change the action to “deny” to ensure the project’s CPU use does not exceed the normal use you established during the testing phase.

 

Memory Control in Zones

 

Control of memory is not enabled specifically for zones and must be implemented through a project configured in the zone, the same way it would be for a project configured in the global zone.

 

Enable memory caps control with

 

# rcapadm -E

 

Check that the daemon rcapd is running. You may need to enable or restart it:

 

# svcadm restart rcap

 

Within the zone you can then add memory caps to a new project using a command something like the following:

 

# projadd -K 'rcap.max-rss=32Mb' project1 

 

This command creates a project called “project1” and allows processes associated with project1 to use at most 32 Mbytes of RAM. (Note: Do not play with memory caps casually. You can make it impossible for Solaris to run and then you will have to reboot.)

 

Continuously updated usage and memory caps can be displayed with:

 

# rcapstat

 

To see resource management for configured projects:

 

# prstat -J

 

Run rcapstat in one window and, in another, open a file such as /etc/magic with vi. You can see the usage change in the rcapstat output.

 

Processor Access Management

 

Solaris 9 implemented a new form of processor access control with the utility known as the Fair Share Scheduler. The Fair Share Scheduler allows control of processor use for any type of process. Solaris 9 also implemented the management of processor sets and processor pools as part of its Resource Management architecture. It allowed the exclusive assignment of all the cycles of one or more processors to a project.

 

Fair Share Scheduling (FSS)

 

By default Solaris provides first-come first-served system resource access to all processes. Resource management in Solaris 10 allows you to instead distribute CPU and memory on a flexible basis to processes. The Fair Share Scheduler is a type of resource management that allows the administrator to determine the share of an available pool of CPUs that a zone may have. Instead of assigning a set of processors to a zone, as is done using pools, each zone is assigned a share of all processing available. As zones become active, the share available to other zones decreases, and as zones become inactive the share available increases. The shares assigned to each zone are entirely at the discretion of the administrator. For example, you might assign the following shares to the zones zonea, zoneb and zonec:

 

zonea               20

zoneb               10

zonec               10

 

If all three zones are active, zonea has half the processing power available to it (twenty out of the forty available shares) while zoneb and zonec each have one quarter of the processing power available (ten out of the forty available shares). If zoneb becomes inactive, zonea will have two thirds of the processing power (still twenty shares, but only thirty are in demand), and zonec will have one third (ten out of thirty). If zonea later becomes inactive, zonec will get all the processing power. A share is NOT a percentage. You can allocate shares any way you want using any numbering that makes sense to you. Two shares for zonea and one share for zoneb have the same effect as 14 shares for zonea and 7 shares for zoneb.
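The arithmetic in the example above can be checked with a few lines of Python. This is only a sketch of the share calculation, assuming every active zone is busy enough to use its full entitlement; it is not how the scheduler itself is implemented, and the function name cpu_fractions is invented for the illustration.

```python
from fractions import Fraction

def cpu_fractions(shares, active):
    """Return each active zone's fraction of CPU under share-based scheduling."""
    total = sum(shares[z] for z in active)
    return {z: Fraction(shares[z], total) for z in active}

shares = {"zonea": 20, "zoneb": 10, "zonec": 10}
print(cpu_fractions(shares, ["zonea", "zoneb", "zonec"]))  # all three busy
print(cpu_fractions(shares, ["zonea", "zonec"]))           # zoneb idle
print(cpu_fractions(shares, ["zonec"]))                    # only zonec busy
```

Because only the ratio between shares matters, calling the function with 2 and 1 shares gives the same fractions as calling it with 14 and 7.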

 

Fair Share Scheduling is the most flexible way to assign processing power to zones in Solaris 10. It means that processing power is never wasted. If one zone is not using its whole share, that processing power is distributed to other zones. If a zone is not active, the processing power assigned to it can be transferred to an active zone. Processor sets enforce a minimum and maximum CPU assignment and are not as flexible.

 

FSS is configured for projects using the parameter project.cpu-shares and for zones using zone.cpu-shares. You can set up FSS for a zone, then use projects within the zone to distribute processing further. Any project or zone not assigned an FSS value gets the default of one share. The exception is the system project, which has unlimited shares for the needs of the OS.

 

By default processes are placed into scheduling classes which are managed by the TS or Time Sharing scheduler, by the IA or Interactive scheduler and by SYS, which manages system processes. Descriptions of scheduling classes can be found in the man page for priocntl(1). Processes managed by TS and IA may be placed under the control of the Fair Share Scheduler instead. System processes (scheduler SYS) can never be placed into any other scheduling class, and have unlimited use of the CPUs.

 

The scheduler that manages each process can be seen in the output of ps:

 

# ps -efc

 

The available scheduling classes can be seen in the output of:

 

# dispadmin -l

 

The Fair Share Scheduler is not configured by default and will not show up in the output of this command until it is configured:

 

# dispadmin -d FSS

 

This command configures the file /etc/dispadmin.conf with the line

 

DEFAULT_SCHEDULER=FSS

 

so that all processes except those in scheduling class SYS, including those in the global zone, will be controlled by the Fair Share Scheduler after the next reboot. It is also possible to configure FSS only on a CPU pool, which limits the use of Fair Share Scheduling to the projects or zones that use that pool. Fair Share Scheduling can also be implemented on a running system with the following command, issued in the global zone:

 

# priocntl -s -c FSS -i class TS

 

This command temporarily changes the scheduler to Fair Share Scheduling for the class specified (TS). For a permanent change that will persist over reboot, run dispadmin -d FSS in addition.

 

After changing to the Fair Share Scheduler, the output of ps -efc shows that the scheduling class for all non-system processes is now FSS.

 

To change the scheduler for one running process and any child processes started later:

 

# priocntl -s -c FSS -i pid 1

 

The foregoing command will change process 1 to the Fair Share Scheduler and also any processes started by init after the priocntl command is run. It will not affect any processes previously started by init.

 

Fair Share Scheduling can be set up for a zone when it is configured:

 

# zonecfg -z zonea

zonecfg:zonea> set pool=pool1

zonecfg:zonea> add rctl

zonecfg:zonea:rctl> set name=zone.cpu-shares

zonecfg:zonea:rctl> add value (priv=privileged,limit=40,action=none)

zonecfg:zonea:rctl> end

 

The same resource control settings used in the projects database are used here for a zone.

This sets 40 shares for the entire zone. No action is taken if the zone exceeds its shares, and only the superuser can change the value. If the superuser wants to change this value on a running zone, that can be done in the global zone using the command line and prctl.

 

# prctl -n zone.cpu-shares -v 30 -r -i zone zonea

 

This command replaces (-r) the value of the resource control named zone.cpu-shares (-n zone.cpu-shares) with 30 shares (-v 30) for the zone called zonea (-i zone zonea). The type remains privileged and the action “none”, so only the privileged user can change it. Settings changed with prctl are not persistent.

 

The zone administrator cannot set this value, because that would allow the zone control over its share of the entire system’s resources. The zone administrator CAN set shares for projects inside the zone, but those are shares of the share allowed by the global administrator.
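The idea of a “share of a share” can be worked through numerically. The Python sketch below assumes that every zone and every project is busy, and multiplies the zone’s fraction of the system by the project’s fraction of the zone. The project names db and web and their share values are hypothetical, as is the function name project_fraction.

```python
from fractions import Fraction

def project_fraction(zone_shares, zone, project_shares, project):
    """Fraction of the whole system a project gets when everything is busy."""
    zone_part = Fraction(zone_shares[zone], sum(zone_shares.values()))
    project_part = Fraction(project_shares[project], sum(project_shares.values()))
    return zone_part * project_part

zone_shares = {"zonea": 20, "zoneb": 10, "zonec": 10}   # set by the global administrator
project_shares = {"db": 3, "web": 1}                    # hypothetical projects inside zonea
print(project_fraction(zone_shares, "zonea", project_shares, "db"))
```

However the zone administrator divides shares among projects, the fractions for all projects in zonea still add up to the half of the system that the global administrator granted to the zone.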

 

You can check resource control settings for a particular zone with

 

# prctl -n zone.cpu-shares -i zone zonea

 

You can see the status of projects for the global zone and subzones with

 

# prstat -J

 

(look at the bottom of the output)

 

or to see resource use by zone from the global zone (look at the bottom of the display):

 

# prstat -Z

 

Processor sets and pools in zones

 

Solaris 10 defines processor sets and resource pools which can then be assigned to zones. A processor set, commonly shortened to “pset,” is an organizational construct consisting of a minimum and maximum number of CPUs, and given a name. A pool is associated with one processor set and can be assigned directly to a zone. The daemon poold arbitrarily selects the physical CPUs that will belong to the processor set from all CPUs on the system, and shifts processors from one processor set into another as demand requires.

 

A processor set named “pset1” might have a minimum of two and a maximum of four CPUs. In that case, the processes in a zone that is assigned the use of the processor set “pset1” never get less than two CPUs, but may be able to use as many as four. If the zone needs four CPUs, but no more than two are available, it will get two. Those two CPUs are always reserved for the zone and no other zone may use them. If the zone needs six CPUs and six are available, it will still get just four. The total of the minimum assignments cannot exceed the number of processors on the system. You cannot configure three processor sets each with a minimum of two CPUs if you have only four CPUs! When you attempt to configure your third processor set, NO CPUs will be assigned to it. The processor set also works with Dynamic Reconfiguration (DR) on servers. You cannot DR a board out of a domain if its processors are needed to meet the minimum of a processor set. When a board is added to a server and DR’ed into a domain, the processors are distributed among existing processor sets automatically.
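The rules in the preceding paragraph amount to clamping demand between the minimum and the maximum. The following Python sketch restates them; it is a simplified model of the outcome described in this paper, not of how poold decides, and the function names are invented. The second function also assumes that one CPU must stay in the default processor set.

```python
def cpus_granted(pset_min, pset_max, needed, available):
    """CPUs a processor set ends up with, following the rules in the text.

    'available' counts every CPU the set could hold right now, including
    the pset_min CPUs that are always reserved for it.
    """
    wanted = min(needed, available, pset_max)
    return max(pset_min, wanted)

def minimums_fit(minimums, total_cpus):
    """True if all pset minimums fit while one CPU stays in the default set."""
    return sum(minimums) <= total_cpus - 1

print(cpus_granted(2, 4, needed=4, available=2))
print(cpus_granted(2, 4, needed=6, available=6))
print(minimums_fit([2, 2, 2], total_cpus=4))
```

The three calls correspond to the three cases in the text: demand limited by availability, demand limited by the maximum, and a set of minimums that does not fit on a four-CPU system.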

 

Processor set assignments differ from Fair Share Scheduling in that they do not allow zones to have unlimited use of resources not used by other zones or processes. FSS ensures processors are never underutilized as long as there are processes that need to use them. Processor set assignments allow for the possibility that some processors may be idle while other processes are starved for processing power.

 

Each system has a default processor set called “pset_default” that consists of all processors on the system not assigned to another processor set. There must be one processor assigned to this set at all times. That processor is reserved for system processes.

 

A processor set may be configured using the command lines below, either as input to poolcfg -dc, or in a script read using poolcfg -f <scriptname>. The command poolcfg -dc creates the processor set in memory only, so the set will not survive reboot. The command pooladm -s writes the configuration in memory to the file /etc/pooladm.conf, which is read at reboot. The following lines create a processor set named pset1. This processor set contains a minimum of two processors (pset.min = 2), but may get as many as four (pset.max = 4) depending on availability and demand. The daemon poold will determine how many more than two are assigned to the set at any time.

 

create pset pset1

modify pset pset1 ( uint pset.min = 2; uint pset.max = 4 )

 

For example, at the command line you can create a processor set called pset1 with a minimum of two and a maximum of four processors:

 

# poolcfg -dc 'create pset pset1'

 

# poolcfg -dc 'modify pset pset1 ( uint pset.min = 2; uint pset.max = 4 )'

 

If you do not have enough free processors to meet the minimum assignment, your command will fail. One processor must always remain in the default processor set for the use of system processes. If you have only one processor on your computer, commands to set up processor sets will fail.

 

You can check processor set configuration in memory with:

 

# poolcfg -dc info

 

If the lines are used as input to poolcfg -dc, they must be enclosed by single quotes, and they will change the current configuration in memory only. To save that configuration to the pool configuration file /etc/pooladm.conf, run pooladm -s. If the option “-c” is used alone with poolcfg, processor set and pool configurations will be written to /etc/pooladm.conf, but not instantiated.

 

Processor set assignment can be checked using the command psrset. This command shows all processor sets and which processors are assigned to which sets.

 

Pools

 

Processor sets cannot be bound to projects or zones directly. Instead processor sets are associated with pools which are then bound with poolbind. A processor set can belong to more than one pool but a pool can contain only one processor set.
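One way to picture these relationships is as two lookup tables: each pool names exactly one processor set, and each zone names exactly one pool. The Python sketch below is only a mental model under that assumption. The names pool2, pset_default and the zone assignments are placeholders for the example, and check_bindings is an invented helper, not a Solaris interface.

```python
def check_bindings(pool_to_pset, zone_to_pool):
    """Verify every zone names a known pool; return the pset each zone uses."""
    zone_to_pset = {}
    for zone, pool in zone_to_pool.items():
        if pool not in pool_to_pset:
            raise KeyError("zone %s refers to unknown pool %s" % (zone, pool))
        zone_to_pset[zone] = pool_to_pset[pool]
    return zone_to_pset

# A dict gives each pool exactly one pset, while two pools may share a pset.
pool_to_pset = {"pool1": "pset1", "pool2": "pset1", "pool_default": "pset_default"}
# Likewise each zone is bound to exactly one pool; pools may serve several zones.
zone_to_pool = {"zonea": "pool1", "zoneb": "pool1", "zonec": "pool_default"}
print(check_bindings(pool_to_pset, zone_to_pool))
```

Using dictionaries keyed by pool and by zone makes the “only one” side of each relationship impossible to violate, while leaving the “more than one” side unrestricted.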

 

The processors can either be assigned on a first-come first-served basis to the zone’s processes, or can be managed with the Fair Share Scheduler in the global zone. A pool bound to a zone can also be distributed among projects within the zone using FSS in that zone. When a pool is created, the desired scheduler can be added as a property to the pool.

 

A zone can be configured to use only one pool, which is associated with one processor set, which in turn contains any number of CPUs. The default pool is named “pool_default” and is associated with the default processor set, which holds all CPUs not assigned to another processor set. A zone not assigned a pool will use the default pool.

 

At first glance pools may not seem to make sense. Why not just use processor sets directly and skip this second layer of administration? Part of the reason is historical. Processor sets go back to Solaris 2.6, but were difficult for the administrator to manage. The processor set still exists as it did then for backwards compatibility, but is now attached to a pool that can be automatically managed by poold. Secondly, Sun expects that pools will eventually contain multiple types of resources, though that is not yet available in Solaris 10.

 

The facility to administer pools must be enabled. This can be done with the command:

 

# pooladm -e

or

 

# svcadm enable pools

 

Once either command has been run, the daemon poold starts automatically and will start again if the system is rebooted. The command pooladm -e also creates a default pool and puts all processors into the default processor set. The number of CPUs available in the default pool can be determined with the command

 

# poolstat

 

The following command creates the configuration file /etc/pooladm.conf and saves the current configuration to it. Anything already in the file will be overwritten:

 

# pooladm -s

 

Any subsequent time the command pooladm -s is run, whatever configuration is in memory will be flushed to /etc/pooladm.conf. You can make changes to your pool configuration temporarily at the command line using poolcfg -dc…and make them permanent by running pooladm -s.

 

Once processor sets are created, pools can be created and processor sets bound to the pools. The same commands are used in a similar way. For example, to create a pool called “pool1” and associate a pre-existing processor set “pset1” with it, you can use:

 

# poolcfg -c 'create pool pool1'

# poolcfg -c 'associate pool pool1 (pset pset1)'

 

These commands write the configuration to pooladm.conf, but do not instantiate it. To instantiate pools configured in pooladm.conf:

 

# pooladm -c

 

You could also place the above create and associate commands, without the poolcfg prefix and the surrounding quotes, into a file and execute the following command to get the same result.

 

# poolcfg -f <file_name>

 

This file can also contain the commands to create processor sets, as mentioned above.

 

To simultaneously create configurations and instantiate pools and processor sets without writing to pooladm.conf, add the option -d to the poolcfg commands described above:

 

# poolcfg -dc 'create pool pool1'

# poolcfg -dc 'associate pool pool1 (pset pset1)'

 

If you make changes to your current kernel configuration, for example, using the commands immediately above, they can be saved to pooladm.conf using:

 

# pooladm -s

 

Otherwise changes are only made in memory and will not persist across reboots.

 

Each pool can have its processors automatically controlled by the Fair Share Scheduler. If a pool is assigned to two different zones and placed under the control of the Fair Share Scheduler, the zones can then share the processing power of the processors in the pool according to whatever FSS values are assigned to them.

 

The following command creates a pool, instantiates it and assigns its processes to the Fair Share Scheduler. It does not write to any configuration file. This way the processor sets in the pool can be shared across multiple zones without allowing one zone to hog CPUs. When each zone is set up, it can be assigned FSS shares.

 

# poolcfg -dc 'create pool <pool-name> (string pool.scheduler="FSS")'

 

 

You can check your current pool configuration using

 

# pooladm -n

OR

# poolcfg -dc info

 

If you want to remove your pool and processor set configuration from memory:

 

# pooladm -x

 

This command returns the processor set and pool configuration to the default processor set and the default pool. If you want to write that default configuration to pooladm.conf:

 

# pooladm -s

 

The combination of pooladm -x and pooladm -s will cause your configuration to revert to the default. Use it only if you are sure you do not want to keep any part of your configuration.

 

Zones and Pools

 

A zone may be bound to one and only one pool. The pool may be bound to multiple zones. Zones are bound to pools when the zone is configured:

 

# zonecfg -z zonea

zonecfg:zonea> set pool=pool1   

zonecfg:zonea> verify

zonecfg:zonea> exit

# zoneadm -z zonea reboot

 

Note that configuring FSS on a zone does NOT require any pools to be assigned to that zone if pools are not enabled. Once pools are enabled, a pool must be assigned to the zone, if only the default pool: pool_default. If FSS is designated the scheduler for a pool, you may want to specify the zone’s shares as well or the zone will get the default of one share. The following configuration assumes that the pool “pool1” has already been configured to use the Fair Share Scheduler:

 

# zonecfg -z zonea

zonecfg:zonea> set pool=pool1

zonecfg:zonea> add rctl

zonecfg:zonea:rctl> set name=zone.cpu-shares

zonecfg:zonea:rctl> add value (priv=privileged,limit=20,action=none)

zonecfg:zonea:rctl> end

zonecfg:zonea> exit

# zoneadm -z zonea reboot

 

The effects of this command can be viewed in the global zone after the zone reboots:

 

# ps -ef | grep init

 

If a pool should be bound to a running zone without reboot:

 

# poolbind -p pool1 -i zoneid zonea

 

 

Summary of pool and processor set commands:

 

Create a pool or processor set in memory only:

 

# poolcfg -dc…

 

Write the configuration in memory to /etc/pooladm.conf:

 

# pooladm -s

 

Create a pool or processor set in /etc/pooladm.conf only:

 

# poolcfg -c…

 

Instantiate the pool or processor set in /etc/pooladm.conf:

 

# pooladm -c

 

So the following two sets of commands both have the effect of creating a configuration in memory that will survive reboot because it was written to /etc/pooladm.conf:

 

# poolcfg -dc…

# pooladm -s

 

OR

 

# poolcfg -c…

# pooladm -c

 

Remove the configuration from memory and revert to default:

 

# pooladm -x

 

Remove the configuration from memory and from /etc/pooladm.conf and revert to default:

 

# pooladm -x

# pooladm -s

 
