Solaris Zones

Brendan Gregg's excellent zones resource

http://www.sun.com/bigadmin/content/zones/ - BigAdmin web page for zones

The technological nature of zones

Solaris zones are described in Sun’s documentation as multiple virtual instances of the Solaris OS running on a single set of hardware. They are a software analogy to hardware domains, in which a single server runs multiple actual instances of Solaris, each assigned to a set of CPUs and I/O boards of which it has exclusive and inflexible use. In zones, virtual instances of Solaris share all hardware resources, so zones have great flexibility in assigning resources to applications.

In reality it is impossible for multiple instances of Solaris to share hardware. A UNIX kernel is a set of hardware drivers. Multiple instances of Solaris sharing hardware therefore imply multiple instances of hardware drivers, all contending for the same hardware. Unless there is some kind of software manager (such as VMware) intervening between the OS and the firmware, multiple Solaris kernels cannot share the same resources. There cannot be two or more tape drivers on a system, each generating logical device names for tape drives. There cannot be two or more of the same kind of disk driver, each trying to pass data to a disk. When it comes to kernels, there can be only one. This reality underlies the technical nature of zones: unlike hardware domains, they are not true instances of Solaris, because all zones on a single platform must share a kernel.

The Solaris boot process is divided into four parts: the PROM phase, the boot program phase, the kernel phase and the init phase (which actually begins when the kernel spawns the scheduling daemon sched, which in turn spawns the init process). Since zones must share a single kernel, a zone cannot be differentiated until at least the conclusion of the kernel phase of boot, and therefore a zone can only include the parts of Solaris that begin during the init phase of boot. A zone is only a part of a true instance of Solaris. In fact, a zone is an isolated instance of a Solaris process tree under the main instance of Solaris.

The uses of zones

Zones were designed for installations that have replaced multiple servers, each dedicated to an application, with one large server on which multiple applications may need to run. This practice is called “server consolidation.” Some applications, such as databases and file systems, can be competitive for resources and quite predatory in acquiring them. Running different applications or even different versions of the same application in different zones allows the administrator to segregate application processes and provides a method of controlling resource use from the operating system on which the application runs.

Zones also provide security since each appears to external users as an independent server with its own root password. Each zone is assigned one or more IP addresses which are implemented on physical interfaces as virtual interfaces. Any connection to the IP address is directed to resources running on the zone, which appears to the connection to be the only instance of Solaris running on the system. Connections to the system can therefore be isolated from the global zone and therefore from control of the hardware and from applications and data in other zones. IPMP can be implemented in the global zone on the sub-zone’s IP address. Failover of the zone’s IP addresses will be performed automatically.

Not all software available for Solaris can run in a sub-zone. NFS services can run only from the global zone, for example, and routing may only be performed in the global zone. Oracle RAC will be very difficult to set up securely in a sub-zone because it requires so many modifications to the kernel. Check with your software manufacturer or Sun’s online resources to see if your software package can be run in a sub-zone.

Zone definitions and characteristics

So far the term “zone” has been used to describe isolated process trees under an original instance of Solaris. The process tree of this main instance is also referred to as a zone: it is called the “global” zone, and it is the only zone that always runs in Solaris 10. The global zone consists of the original process tree under the daemon sched, and it is perfectly possible to run Solaris with no zones but the global zone and to thereby ignore the entire issue of zones, as many sites do. The global zone is a privileged zone. It is the only zone allowed to configure the kernel, so it is also the only zone in which hardware can be configured or IPMP groups set up or ipfilter configured. In addition, zones can only be configured, installed and booted from the global zone.

You can have up to 8191 sub-zones below the global zone. These are also called “local zones” as well as “sub-zones” or just “zones.” In this document we will use the term “zones” or “sub-zones” for all isolated process trees under the global zone, and “global zone” for the main process tree. Each requires a minimum of 100 Mbytes of disk space and should have 40 Mbytes of RAM above the minimum necessary to run the global zone, though this is a fairly arm-waving number. Swap is shared among all zones and is not configured for local zones, so your system’s swap must be adequate to the needs of all applications. It is easy, however, to add swap space in Solaris if applications in zones need it. Each zone requires a unique IP address connected to a virtual interface on an adapter. Each is given an ephemeral numeric identifier at boot time and each has a permanent name assigned by the system administrator.

All sub-zones share all the system’s hardware resources and the kernel, and may also share some or most of their OS software with the global zone. They will not share processes with each other or with the global zone, and no sub-zone is aware of the processes running in other zones. They can only communicate with each other through the same system protocols used to communicate over the network. The processes of the sub-zones, however, are all visible to the global zone, so the administrator can monitor and control sub-zones.

Each zone runs a process tree starting with the daemon “zsched” which communicates with sched in the global zone. Each zone has its own zsched, which is started by the global instance of init. The daemon zsched starts init in the sub-zone, and init starts svc.startd which starts configured services. Each zone also has one instance of the management daemon zoneadmd which runs only in the global zone and is not visible in the sub-zone.

The nature of zones dictates their functionality and has enormous implications for their administration and use. You can add a tape drive in the global zone by running devfsadm. That tape drive can then be made available to all sub-zones, but it cannot be configured again in any sub-zone. A disk drive can be partitioned only in the global zone, but can have a file system added in any zone. A zone can run applications that require shutdown scripts because it has its own instance of init. It cannot configure a new network adapter because the adapter’s driver is part of the kernel, and so it can never use virtual instance zero of an adapter; it must always use instance 1 or higher. It can listen for telnet requests over the adapter because it has its own inetd and telnetd daemons, but it cannot have its own IPMP groups because IPMP is implemented as a kernel module. Firewalling each zone independently with ipfilter is impossible because the kernel module ipf can have only one configuration for each physical adapter, while zones use virtual interfaces. Nor can a zone configure panic dumps: those occur when the kernel is dumped, so there can be only one configuration. A zone can, however, run cron jobs and add users that exist only in that zone, because the cron daemon and the authentication processes are all started under init. Each zone will have a nearly complete set of OS files, including /usr, /etc and /sbin, because those are used by the Solaris process tree. All these characteristics of zones are the result of their nature as process trees sharing a kernel.

Zone access and security

You access a sub-zone by logging in from the main zone using the command zlogin <zone-name> from a shell in the global zone, or by using any enabled network protocol to log in to the zone’s IP address, which may be included in any name service in the usual way. Run level on a zone may be changed using the command zlogin <zone-name> init <run level> from the global zone; for example, zlogin zonea init 0 will shut down the zone named “zonea.” It is obvious to the user logging into a zone that they are working in a sub-zone: the process tree shown in the output of ps -ef starts with zsched rather than sched, and a quick survey of the /dev directory will likely show few or no configured physical devices. No zone may communicate with another in any way except through network protocols, so it should be difficult to hack into the global zone from a sub-zone even if you know that you are logged in to a sub-zone. Solaris implements the Internet and Transport layers of the TCP/IP network model in the kernel, so any firewalls specific to a zone must function only at the Application layer. Although you cannot generally firewall zones independently of each other, the entire system, including sub-zones, may be firewalled. The global zone can see all processes in the sub-zones, so any hacker will be working blind while root in the global zone can see all the hacker’s activities.

Designing zones

Each zone will consist of a nearly complete copy of the Solaris OS under a directory called “root” located in a directory designated by the administrator. In this document the directory “/export/zonea/root” will be used as the location of the zone, but any directory under / is an acceptable zone location. Many or most of the OS files under the mount point will actually be accessed through a virtual file system called a “loopback file system (lofs).” Loopback file systems are read-only file systems that reference another file system through a specified mount point.

Zones may be set up on a continuum between two extremes: a sparse zone and a whole zone. In a sparse zone, the directories /lib, /platform, /sbin and /usr are loopback file systems, referencing those of the global zone. (Directories that will be mounted as loopback file systems are referred to as “inherit-pkg-dir” in command output and input.) The file system /sbin would therefore be referenced through a loopback file system on a mount point such as /export/zonea/root/sbin, /usr would be referenced through a loopback file system at /export/zonea/root/usr, etc. As these are large, fairly static file systems, this allows you to set up a zone that has everything necessary to run an application without using a lot of disk space.

If you plan to completely gut and customize your libraries and binaries, you may want to have a whole zone, where your zone includes unique, dedicated copies of all files in /lib, /platform, /sbin and /usr. This will take a lot more disk space: as much as is used by /lib, /platform, /sbin and /usr. If you start with a sparse zone you can configure a zone somewhere between a sparse zone and a whole zone by adding local versions of some of these file systems. If you want to share /lib, /sbin and /platform with the global zone, but your application will install libraries in /usr, you can install just /usr in the sub-zone on top of a sparse zone installation. You can also share additional directories in the global zone as loopback file systems. The more operating system files that are unique to the zone, the more you can customize them for the application or other uses of the zone.

Other directories are present but empty. The directory /opt exists in the sub-zone, but no files installed under /opt in the global zone will be included in the sub-zone unless it is specified as an inherit-pkg-dir. The directory /dev exists and all virtual devices, such as /dev/random and /dev/null are present, as are subdirectories such as dsk and rmt that contain physical devices. Unless a physical device is specifically added to the zone during configuration however, directories containing physical devices will be empty.

Configuring zones

The command zonecfg is used to actually configure the zone. It creates the file /etc/zones/zonea.xml (or other zone name). This command is complex, but if you make a mistake, it can be completely undone by deleting this file. At minimum zonecfg requires you to specify:

1. The zone’s IP address. A netmask can be specified in slash notation after the IP address, but you can also allow the netmask to default to the classful netmask in the usual way.

2. The name of the zone.

3. The location of the zone: the “zonepath.”

The following lines form a minimum zone:

# zonecfg -z zonea

zonecfg:zonea> create

zonecfg:zonea> set zonepath=/zones/zonea

zonecfg:zonea> add net

zonecfg:zonea:net> set physical=eri0

zonecfg:zonea:net> set address=10.3.4.15

zonecfg:zonea:net> end

zonecfg:zonea> commit

zonecfg:zonea> exit
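
The same configuration can also be kept in a command file and passed to zonecfg with its -f option, which is convenient when several similar zones are needed. The following is a minimal sketch rather than a tested procedure: the helper name minimal_zone_commands and the file /tmp/zonea.cfg are illustrative assumptions, and the values are the ones used in the session above.

```shell
#!/bin/sh
# Sketch: print a zonecfg command file for a minimal zone.
# minimal_zone_commands is a hypothetical helper, not part of Solaris.
# Usage: minimal_zone_commands <zonepath> <physical-interface> <ip-address>
minimal_zone_commands() {
    cat <<EOF
create
set zonepath=$1
add net
set physical=$2
set address=$3
end
commit
EOF
}

# Print the commands for the zone configured above.
minimal_zone_commands /zones/zonea eri0 10.3.4.15

# On a Solaris 10 global zone the output could be saved and applied as root:
#   minimal_zone_commands /zones/zonea eri0 10.3.4.15 > /tmp/zonea.cfg
#   zonecfg -z zonea -f /tmp/zonea.cfg
```

Keeping the commands in a file also leaves a record of how the zone was defined, separate from the generated xml file.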

The following optional items are commonly added to the zone configuration:

1. The file systems you want to create for the exclusive use of the zone.

2. The devices that will be available to the zone, from those known to the global zone.

3. Automatic boot when the global zone boots:

zonecfg:zonea> set autoboot=true

4. One resource pool of CPUs.

5. Fair Share Scheduling for the zone. These last two properties are what form a “container” and are discussed in the resource management document.

All these properties, and any others added to a zone with zonecfg can be changed only in the global zone. The zone administrator cannot change the IP address of the zone, unmount a file system added to the zone as a file system, or configure resources. These activities are restricted to the administrator of the global zone. In addition, for security reasons, zone administrators cannot run the command snoop because it would allow the zone direct access to other zones’ packets.

The zone configuration created with zonecfg is saved as an xml file named for the zone in the directory /etc/zones in the global zone. A zone named zonea would therefore have a configuration file called /etc/zones/zonea.xml. The directory /etc/zones also has a file called “index” which lists installed zones. Once all zones are installed, they will boot in the order specified in the index file. This order can be changed simply by editing this file, though it is unsupported to do this on a production system. A configured, uninstalled zone is an xml file in /etc/zones and nothing else. It serves as a template for the zone’s installation, and later, for alterations to the zone. Once the zone is installed it can be booted and used.

The simplest possible zone configuration gives you a sparse zone. Additional loopback file systems can be added to the zone using the add inherit-pkg-dir utility of zonecfg. If you want a whole zone, use the command create -b to create the zone, then add any loopback file systems you want with inherit-pkg-dir. You can also create a whole zone using the plain create command in zonecfg, then use the command remove inherit-pkg-dir to get rid of any loopback file systems.
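
As a sketch of the remove inherit-pkg-dir approach, the commands below start from the default template and remove each inherited directory, which should leave a whole zone. The helper name whole_zone_commands is hypothetical, and the four directories are assumed to be the ones the default template inherits, as listed earlier for a sparse zone.

```shell
#!/bin/sh
# Sketch: print zonecfg commands that turn the default (sparse) template
# into a whole zone by removing every inherited directory.
# whole_zone_commands is a hypothetical helper, not part of Solaris.
# Usage: whole_zone_commands <zonepath>
whole_zone_commands() {
    echo "create"
    echo "set zonepath=$1"
    # Assumed default inherit-pkg-dir entries of a sparse zone.
    for dir in /lib /platform /sbin /usr; do
        echo "remove inherit-pkg-dir dir=$dir"
    done
    echo "commit"
}

whole_zone_commands /zones/zonea
```

To keep some directories shared, drop them from the loop; the result is a zone somewhere between sparse and whole.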

If you want to make any devices available to the zone you must add a device resource and use the command set match=/dev/<path to device>. For example, if you want to make a tape drive available to the zone:

zonecfg:zonea> add device

zonecfg:zonea:device> set match=/dev/rmt/*

zonecfg:zonea:device> end

To add one Solaris partition to a zone:

zonecfg:zonea> add device

zonecfg:zonea:device> set match=/dev/dsk/c1t1d0s0

zonecfg:zonea:device> end

zonecfg:zonea> add device

zonecfg:zonea:device> set match=/dev/rdsk/c1t1d0s0

zonecfg:zonea:device> end

The raw and block devices must each be added in its own add device sequence, as shown above.
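
Because the block and raw entries always come in pairs, the two sequences can be generated from the slice name. This is only a sketch: slice_device_commands is a hypothetical helper, and its output is meant to be pasted into a zonecfg session or command file for the zone.

```shell
#!/bin/sh
# Sketch: print the two zonecfg device sequences (block, then raw)
# for one disk slice.
# slice_device_commands is a hypothetical helper, not part of Solaris.
# Usage: slice_device_commands <slice>    e.g. c1t1d0s0
slice_device_commands() {
    for devdir in dsk rdsk; do
        echo "add device"
        echo "set match=/dev/$devdir/$1"
        echo "end"
    done
}

slice_device_commands c1t1d0s0
```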

Memory-related devices and adapters cannot be added to a zone for security reasons.

Installing a zone

A configured zone is installed with the command zoneadm -z <zone name> install. The installation process creates a copy of /etc, /var and any other indicated directories under the zone’s mount point, and creates loopback file systems for others. It may take an hour or more depending on how many files have to be copied and the processing power of the system. If something goes wrong during the zone installation, you must run zoneadm -z <zone-name> uninstall.

Installation copies over existing directories from the global zone but also essentially runs the sys-unconfig command, so that the zone’s system identification is mostly blank. You will have to set a superuser password and create users, configure files in /etc/default, set up any name services files, etc. The base system configuration must be performed once using the command zlogin -C <zone-name>. This is most easily done at the console, so for servers connected to a terminal concentrator, simply run the command. In the CDE, open a terminal window, run the zlogin command in the window, and select the device type dtterm. In the GNOME desktop, open a window, run # dtterm, then run the zlogin command in the resulting window, selecting the device type dtterm.

Installed zones can be booted with the command zoneadm -z <zone name> boot. The zone may be configured to auto-boot at the time the system boots, or to wait for manual boot by the administrator of the global zone.

Sub-zones have seven possible states that describe their condition: undefined, configured, incomplete, installed, ready, running and “shutting down”. The only states routinely encountered by an administrator are installed, ready, running and shutting down. The undefined state merely means the zone doesn’t exist. In a configured zone, the file /etc/zones/<zonename>.xml has been created but the zone has not been installed. An incomplete zone is one whose installation failed partway through. It cannot be repaired and must be removed with zoneadm -z <zone name> uninstall. As an installed zone boots, it passes through the ready state and eventually ends up running. As it shuts down, it passes through the “shutting down” state. Once the zone is shut down it returns to the installed state.
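
The state descriptions above amount to a small decision table for the global-zone administrator. The sketch below encodes it; next_step is a hypothetical helper, the suggested commands are the ones named in this document, and the spelling shutting_down is an assumption about how the state is printed by zoneadm.

```shell
#!/bin/sh
# Sketch: suggest the next global-zone command for a zone in a given state.
# next_step is a hypothetical helper; state names follow the list above.
# Usage: next_step <zone-name> <state>
next_step() {
    case "$2" in
        configured)  echo "zoneadm -z $1 install" ;;
        incomplete)  echo "zoneadm -z $1 uninstall" ;;
        installed)   echo "zoneadm -z $1 boot" ;;
        running)     echo "zlogin $1" ;;
        # Transitional states: nothing to do but wait.
        ready|"shutting down"|shutting_down)
                     echo "wait: $1 is changing state" ;;
        # Anything else is treated as undefined: the zone must be configured.
        *)           echo "zonecfg -z $1" ;;
    esac
}

next_step zonea installed
```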

File Systems and Zones

A zone may be installed under the root file system or may be placed on a file system of its own mounted on the zone directory. Which you would use depends on the amount of space in the root file system and on the expected use of the zone. It is not a good idea to place a zone running an I/O intensive application on the same partition or even the same spindle as the root file system. Your performance will degrade because there are two sets of disk-intensive processes attempting to access the same disk.

Zones can be used to control and improve disk I/O performance. For example, if you are running DNS services in the global zone, those services will contend for the partition hosting /var with any other applications on the server. If you put DNS services in a zone on its own partition, the global /var partition won’t be affected by DNS requests.

Zones are allowed to use any amount of disk space assigned by root on the global zone. The administrator controls the zone’s use of space by assigning partitions or soft partitions of the selected size to the mount point file system of the zone, and to the partitions that make up the zone file systems. Each zone itself must have at least 100 Mbytes for a sparse installation. Any additional files must go into additional space. Applications may also require space.

A file system can be installed in the zone by using the add fs utility of zonecfg and specifying the raw and block device and the file system type. The file system can be a ufs file system or a loopback file system, or any other type.

zonecfg:zonea> add fs

zonecfg:zonea:fs> set special=/dev/dsk/c1t1d0s4

zonecfg:zonea:fs> set raw=/dev/rdsk/c1t1d0s4

zonecfg:zonea:fs> set dir=/home

zonecfg:zonea:fs> set type=ufs

zonecfg:zonea:fs> end

This file system will be part of the zone’s configuration and will not appear in the zone’s /etc/vfstab file. The initialized file system cannot be unmounted or maintained in the zone.

If the disk devices /dev/dsk/.. and /dev/rdsk… are added to the zone with zonecfg, using add device and set match, file systems can also be initialized, mounted and maintained within the zone on those devices. This is risky and not recommended because it allows the zone’s local administrator to panic the system. It is also possible for the root administrator to make a file system available to a zone by mounting it on a mount point under the zone’s root directory:

# mount /dev/dsk/c0t1d0s4 /export/zonea/root/mnt

makes the file system on /dev/dsk/c0t1d0s4 available in the zone under the zone’s /mnt directory.

NFS shared file systems can be accessed by the zone as a client, but a sub-zone cannot be an NFS server.

Zone administration

The administrator in the global zone can administer the local zones using the command zoneadm.

zoneadm list -cv lists all configured zones, whether or not they have been installed (zonecfg -z zonea HAS been run; zoneadm -z zonea install may or may not have been run)

zoneadm list -iv lists all installed zones (zonecfg -z zonea HAS been run, zoneadm -z zonea install HAS been run)

zoneadm -z zonea boot boots the zone

zoneadm -z zonea reboot and zoneadm -z zonea halt respectively reboot and halt the zone. These act like reboot and halt in the global zone and do not run the shutdown scripts. To shut down in an orderly fashion, use init 0 in the zone.

Running init 0 in the global zone triggers init 0 in all the local zones.
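
For scripting, zoneadm list also accepts a -p option that prints colon-separated fields, beginning with zone ID, zone name, state and zone path. A minimal sketch of picking one zone's state out of that output follows. The helper name zone_state and the sample lines are made up for illustration, and any further fields that later Solaris updates append are simply ignored.

```shell
#!/bin/sh
# Sketch: read `zoneadm list -cp` style lines on stdin and print the
# state (third field) of the named zone.
# zone_state is a hypothetical helper, not part of Solaris.
# Usage: zoneadm list -cp | zone_state <zone-name>
zone_state() {
    awk -F: -v zone="$1" '$2 == zone { print $3 }'
}

# Made-up sample input, so the sketch runs without a Solaris host.
sample='0:global:running:/
-:zonea:installed:/zones/zonea'

printf '%s\n' "$sample" | zone_state zonea
```

On a real global zone the sample would be replaced by the pipeline shown in the usage comment.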

Packages in zones

Note: Since patches are really packages, the following discussion implies both patches and packages when the word “package” is used.

Zones have their own /etc directory and therefore their own package and patch databases, so it is certainly possible to install a package in a zone without affecting other zones. If a package affects the kernel, it may not be possible to install a package in only one zone. The kernel is shared by all zones, so modifications to the kernel are also shared among all zones.

The behavior of packages installed in zones is indicated by a set of variables associated with the package and visible in the output of # pkgparam -v <package name>. These parameters indicate the installation behavior of packages with respect to the global zone and sub-zones. These parameters are:

SUNW_PKG_ALLZONES

SUNW_PKG_THISZONE

SUNW_PKG_HOLLOW.

The default value of all these parameters is false, and the absence of a parameter in the output of the pkgparam command means its value is false. They were invented for Solaris 10, and have no value or meaning in earlier releases of Solaris.

If the parameter SUNW_PKG_ALLZONES is set equal to true, a package can only be installed in the global zone and will be applied to all zones identically. A package that affects the kernel would have this variable set to true. If the ALLZONES parameter is set to true, there is only one way to install the package - in all zones at once. If SUNW_PKG_ALLZONES is set to true and SUNW_PKG_HOLLOW also set to true the package database in each sub-zone WILL indicate that it has been installed in that sub-zone, even though it has not. This is rare and done only for dependency checking. Normally a package with the ALLZONES parameter set to true updates only the global package databases and not the sub-zone databases.

If the ALLZONES parameter on a package is set to false, the package can be installed in any zone separately depending on the setting of SUNW_PKG_THISZONE. If the value of THISZONE is also false, the package can be installed in the global zone only, in all zones at once, or in any sub-zone only. The package is installed in the global zone only by running the command pkgadd -G <package name> as root in the global zone. Running the command pkgadd <package name> in the global zone will install such a package in all zones at once. It can also be installed in any single sub-zone using the command pkgadd <package-name> in the sub-zone.

A package in which SUNW_PKG_THISZONE is set to true can be installed in only one zone at a time. In that case, simply add the package with pkgadd <package name> as root in whichever zone you want to install it in, including the global zone.

If:	Install as root in all zones at once	Install as root in the global zone only	Install as root in a sub-zone only
ALLZONES=true THISZONE=false HOLLOW=false	pkgadd <pkg_name>	N/A	N/A
ALLZONES=true THISZONE=false HOLLOW=true	pkgadd <pkg_name>	N/A	The package will appear to have been installed in all sub-zones, because the package databases will include installation information.
ALLZONES=false THISZONE=false HOLLOW=false	pkgadd <pkg_name>	pkgadd -G <pkg_name>	pkgadd <pkg_name>
ALLZONES=false THISZONE=true HOLLOW=false	N/A	pkgadd <pkg_name>	pkgadd <pkg_name>
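
The table can be read as a simple decision on the two parameters that control where a package may go. The sketch below prints the permitted installation methods for a given pair of values; pkg_install_scope is a hypothetical helper, and HOLLOW is left out because it changes only what the sub-zone package databases record.

```shell
#!/bin/sh
# Sketch: print where a package may be installed, following the table above.
# pkg_install_scope is a hypothetical helper, not part of Solaris.
# Usage: pkg_install_scope <ALLZONES> <THISZONE>   (each true or false)
pkg_install_scope() {
    if [ "$1" = "true" ]; then
        echo "all zones at once: pkgadd <pkg_name> in the global zone"
    elif [ "$2" = "true" ]; then
        echo "one zone at a time: pkgadd <pkg_name> in the chosen zone"
    else
        echo "all zones at once: pkgadd <pkg_name> in the global zone"
        echo "global zone only: pkgadd -G <pkg_name> in the global zone"
        echo "one sub-zone only: pkgadd <pkg_name> in that sub-zone"
    fi
}

# On a real system the two values would come from pkgparam -v <package name>,
# where a missing parameter counts as false.
pkg_install_scope false false
```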

Jumpstart and Zones

According to Sun’s documentation, zones cannot be included in flash archives. Experiments conducted by Lawrence Lee indicate that they can and that they will install as long as the zone is modestly configured after installation and prior to boot. Use the following guidelines:

1) In the file /etc/zones/<zonename>.xml, “autoboot” must be set to false.

2) Set the IP address of the zone in /etc/zones/<zonename>.xml to some unused IP address prior to making the flash archive. After the flash archive installs, configure the zone’s IP address to the correct value in /etc/zones/<zonename>.xml.

3) Logical device names may need to be configured after installation if the flash archive is used on clone systems whose device paths differ from those of the master system. Any file systems included in the zone must have pathnames for raw and block devices set to those available on the host. In addition, disks must be partitioned and file systems initialized on the partitions before the zone is booted. This may require too many system-specific configurations to be practical.