Changes to IP Features
IPQoS
http://www.supertechnetworks.com/library/dscpconversion.htm
IPQoS allows the administrator to control and monitor network traffic by type. It can be used to charge for network services and to determine what kind of traffic gets preferential treatment. You can control traffic by its originating IP address, so that certain systems get preference over others, by port number, so that certain applications are treated preferentially, and so on. It implements Differentiated Services, or “DiffServ” for short, which is described in RFC 2474 (the DS field and the default Best Effort behavior), RFC 2475 (the architecture), RFC 2597 (AF) and RFC 2598 (EF). It was first implemented in Solaris 9. It uses what used to be called the Type of Service (ToS) field in the IP header. That 8 bit field has been renamed the DS field. Only the first 6 bits, called the DS Code Point or DSCP, are used for DiffServ, which means you can classify traffic into 64 possible groupings, 0 (000000) to 63 (111111). Any system in the network that is DiffServ aware can then treat packets differently depending on their DSCP.
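The relationship between the 8 bit field and the 6 bit DSCP is plain bit arithmetic. The following is a minimal Python sketch of it, not part of IPQoS itself; the function names split_ds_field and ds_byte_for are made up for illustration.

```python
from typing import Tuple


def split_ds_field(ds_byte: int) -> Tuple[int, int]:
    """Split the 8 bit DS field (the old ToS byte) into (dscp, low_bits).

    The DSCP is the upper six bits; the lower two bits are not used
    for DiffServ classification.
    """
    if not 0 <= ds_byte <= 0xFF:
        raise ValueError("the DS field is a single byte (0-255)")
    dscp = ds_byte >> 2          # upper six bits
    low_bits = ds_byte & 0b11    # lower two bits, not part of the DSCP
    return dscp, low_bits


def ds_byte_for(dscp: int) -> int:
    """Return the DS field byte that carries the given DSCP (low bits zero)."""
    if not 0 <= dscp <= 63:
        raise ValueError("a DSCP is a six bit value (0-63)")
    return dscp << 2


if __name__ == "__main__":
    # Six bits give 64 possible code points, 0 through 63.
    print(len(range(0, 64)))
    print(split_ds_field(0xB8))
```

A byte of 0xB8 (10111000) splits into DSCP 46 with both low bits clear, and DSCP 63 occupies the byte 11111100.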
There are some major limitations to using this protocol. Only systems on the network that are DiffServ aware will serve packets differentially, and only those that handle the IP header will have any effect at all. (There is also a Network Interface layer protocol handled by the NIC, IEEE 802.1D/P, that can add a User Priority Mapping header. Sun has only one interface type, ce (Gigabit Ethernet), that is IEEE 802.1D/P compliant.) That means that only originating systems, receiving systems, and routers can control packet flow depending on DiffServ status, and then only if they are configured to do so. In addition, each system must be configured the same way to get the same handling all along the packet’s route. Since every vendor gets to decide what they do with packets marked in a particular way, you have no assurance of how your packets will be handled outside your own network or if you route packets across multiple operating systems. Q: If a router is NOT DiffServ enabled, will it preserve the DS field when it forwards the packet? Yes. It won’t act on it, but the specification for the internet layer header requires the forwarding system to regenerate all fields it has not changed.
The service afforded each packet will depend on the configuration of the router forwarding that packet. Each router, as well as the originating system and the receiving system, exhibits a “per-hop behavior” because each hop of the packet can be differently configured. There are currently three defined types of “per-hop behavior” in the RFCs: EF or Expedited Forwarding, AF or Assured Forwarding, and BE or Best Effort. Best Effort is the default and has the DSCP marker of 0. Any packets not otherwise classified get Best Effort, which implies no priority at all. AF and EF are not mutually exclusive, but you would normally implement one or the other, not both.
Assured forwarding (AF) defines four classes, one through four. The class takes up the first three bits of the DSCP field, and the class number corresponds to the value in those bits. Thus if the first three bits of the DSCP field are 001, that is class 1. If they are 011, that is class 3. These classes have no priority implied, and class one is not higher priority than class four. The classes exist to allow administrators to assign sets of network resources to packets. You can also use them to organize packets according to the resources they are likely to use. For example, packets sent between systems on the LAN and the printer, scanner, or tape library might be assigned to one class, because they use only the cabling on the LAN. Packets that go out to the internet might be assigned to another class because they go directly to the router, then out to your main network. If all your packets are sent over your LAN, you would use only one class, and which class you use is your choice. It is possible to assign resources such as buffers and bandwidth to a particular class. Which resources go to which class, and the reliability associated with those resources, are also your choice (or would be if Sun implemented that capability of RFC 2597).
Each AF class can be marked for high, medium or low drop precedence, which is 3, 2 or 1 in decimal, represented by 110, 100 and 010 respectively in the last three bits of the DSCP field.
Packets that have low drop precedence are protected from being dropped at the expense of higher drop precedence packets. This is somewhat counter-intuitive, but according to RFC 2597 “A congested DS node tries to protect packets with a lower drop precedence value from being lost by preferably discarding packets with a higher drop precedence value.”
If congestion occurs, the packets might be queued, or even dropped if the router’s buffer doesn’t have enough memory for queuing. The low drop precedence packets would be the last to be dropped, and the medium drop precedence packets would go out after the low drop precedence packets were handled. Packets assigned to AF QoS are named AFxy, where x is the class and y is the drop precedence. A low drop precedence packet in class one would be marked AF11, and a medium drop precedence packet in class 3 would be AF32. In the DSCP field AF32 translates into 011100 (011 is 3, 10 is 2, ignore the last 0), or 28 in decimal. Such a marking is therefore sometimes called DSCP 28. You would use AF when you have congestion on your network. If there is no congestion on your network, it just wastes resources to evaluate packets and mark them. AF lets you prioritize traffic.
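The AFxy naming maps onto the DSCP bits mechanically: the class fills the first three bits and the drop precedence the last three. The following Python sketch shows the arithmetic; the names DROP_BITS and af_dscp are made up for illustration and are not part of any Solaris tool.

```python
# Last three DSCP bits for each drop precedence (1 = low, 2 = medium, 3 = high).
DROP_BITS = {1: 0b010, 2: 0b100, 3: 0b110}


def af_dscp(af_class: int, drop_precedence: int) -> int:
    """Return the DSCP value for AFxy, where x is the class and y the drop precedence."""
    if af_class not in (1, 2, 3, 4):
        raise ValueError("AF defines classes 1 through 4")
    if drop_precedence not in DROP_BITS:
        raise ValueError("drop precedence is 1 (low), 2 (medium) or 3 (high)")
    # Class in the first three bits, drop precedence in the last three.
    return (af_class << 3) | DROP_BITS[drop_precedence]


if __name__ == "__main__":
    for x, y in [(1, 1), (3, 2), (4, 3)]:
        dscp = af_dscp(x, y)
        print("AF%d%d = %s = %d" % (x, y, format(dscp, "06b"), dscp))
```

For AF32 this gives 011100, or 28 decimal, matching the example worked by hand above.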
Expedited forwarding (EF) allows you to mark some packets so they have first demand on the router’s resources and are put through first. All unmarked traffic is second priority, so traffic on the network is then either expedited or not. The expedited packets get through first because they are handled first, but all other packets are sent out as well. You might use EF when you have events, like a broadcast over the network, that need to go as smoothly as possible. VoIP is an example of an application that would be marked EF. This kind of forwarding isn’t suitable for networks that have congestion. You would use AF for networks where packets queue and might even be dropped. The DSCP recommended by the IETF for EF packets is 101110 (46). In all cases EF packets have priority over all AF packets.
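The "expedited first, everything else afterwards" behavior can be pictured as a two-level priority queue. The Python sketch below is only a toy model of that idea, not how any router or the Solaris kernel actually schedules packets; the function name drain_in_priority_order and the packet names are invented for illustration.

```python
import heapq
from itertools import count

EF_DSCP = 0b101110  # 46, the code point recommended for EF


def drain_in_priority_order(packets):
    """Return packet names in the order a toy EF-first queue would send them.

    packets is an iterable of (name, dscp) pairs in arrival order.
    EF packets are sent first; all others follow, and arrival order is
    preserved within each group. Nothing is dropped.
    """
    arrival = count()
    heap = []
    for name, dscp in packets:
        rank = 0 if dscp == EF_DSCP else 1   # 0 = expedited, 1 = everything else
        heapq.heappush(heap, (rank, next(arrival), name))
    sent = []
    while heap:
        _, _, name = heapq.heappop(heap)
        sent.append(name)
    return sent


if __name__ == "__main__":
    queue = [("web", 0), ("voip-1", 46), ("mail", 0), ("voip-2", 46)]
    print(drain_in_priority_order(queue))
```

Every packet in the queue is still sent; only the order changes, which is why EF on its own does not help once packets have to be dropped.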
DiffServ’s architecture requires a “classifier”, which has the job of grouping packets into “classes” on which different QoS markers will be placed. In Solaris, as usual, the classifier is a kernel module, in this case called ipgpc. The architecture also includes a “marker” - the kernel modules dscpmk and dlcosmk (the latter used with VLANs) - which places the correct code into the packet. There may also be a “meter” - the kernel modules tokenmt and tswtclmt - which measures the flow rate of any class defined by the classifier, and an accounting module, flowacct, which records statistics on the flows and number of packets transmitted.
IPQoS is turned off by default when the OS is installed. It is configured by entering configuration parameters into /etc/inet/ipqosinit.conf, and applying the contents of the file with the command ipqosconf -a followed by the file name. It can be turned off with ipqosconf -f, which flushes the configuration.
The first line of the file gives the format version:
fmt_version 1.0
Following lines specify QoS kernel modules and actions for those modules to take. An example is on p 7-9.
You define classes by giving them names and specifying the action for the next module to take with those classes. You then apply filters to each class, which define what packets are in that class - for example, all packets destined for port 515 and headed out of the system.
You can collect statistics on your network traffic and view them with the command
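The class-and-filter idea can be modeled in a few lines. The Python sketch below is a toy illustration of how a filter selects packets for a class; it is not the ipqosinit.conf syntax and not how ipgpc is implemented, and the names Packet, Filter, classify and the class name "printing" are all invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    dport: int       # destination port
    outbound: bool   # True if the packet is leaving this system


@dataclass(frozen=True)
class Filter:
    class_name: str                   # class the matching packets belong to
    dport: Optional[int] = None       # None means "any port"
    outbound: Optional[bool] = None   # None means "either direction"

    def matches(self, pkt: Packet) -> bool:
        port_ok = self.dport is None or self.dport == pkt.dport
        direction_ok = self.outbound is None or self.outbound == pkt.outbound
        return port_ok and direction_ok


def classify(pkt: Packet, filters: List[Filter], default: str = "default") -> str:
    """Return the class of the first filter that matches, else the default class."""
    for flt in filters:
        if flt.matches(pkt):
            return flt.class_name
    return default


if __name__ == "__main__":
    # Hypothetical class: all packets destined for port 515 and headed out of the system.
    filters = [Filter("printing", dport=515, outbound=True)]
    print(classify(Packet(dport=515, outbound=True), filters))
    print(classify(Packet(dport=80, outbound=True), filters))
```

Packets that match no filter fall into a default class, which corresponds to traffic that is not otherwise classified and gets Best Effort.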
kstat -m (module). kstat also allows you to view statistics on the tcp and ip protocols and on MDT (below).
The FireEngine Project
This project redesigned the TCP/IP stack to improve performance on a single CPU by controlling and organizing multiple connection threads. It also allows the networking load to be spread across multiple CPUs. Previously a single NIC was bound to a single CPU. Now a single NIC can spread its processing load over multiple CPUs. The stack also supports 10 and 100 Gbit Ethernet, which must spread their load over multiple CPUs because one CPU cannot handle the incoming load of bits otherwise.
There are no changes in the way applications are used or in administration.
TCP Multi-Data Transmission (MDT) Enhancement
MDT allows the IP module to send multiple TCP packets to the NIC simultaneously. It is only available with NICs that support the transfer. This speeds up transmission when applications send large volumes of packets out in a short time. It is enabled by default in Solaris 10. You can turn it off by running ndd -set /dev/ip ip_multidata_outbound 0, but it isn’t used anyway if large volumes of data are not being sent. The main reason to disable it is that QoS cannot be used if MDT is active. Also disable it if you use an application that employs a non-MDT aware STREAMS module such as ipfilter.
Changes to IP
The command routeadm can now be used in place of “route” to configure routing more easily. By itself, it reports the routing configuration. With -e and one of ipv4-forwarding, ipv4-routing, ipv6-forwarding and ipv6-routing, it enables the specified service. With -d and the arguments listed above, it disables the service. This simply writes to the file /etc/inet/routing.conf. The command routeadm -r reverts to the default - whatever the system booted with. You can configure multiple items at once, like routing AND forwarding, but must preface each with -e.
Once a configuration has been set, routeadm -u applies it to the running system. You need to run routeadm -u only if you are enabling routing.
The arguments: ipvx-forwarding enables packet forwarding, and ipvx-routing starts the routing daemon.
Other arguments, listed on p 7-25, allow you to stop the daemons and to specify a different daemon from in.routed, like in.gated, with a path to the daemon.
The settings created are global and apply to all interfaces configured in the future.
Routing and DHCP
In the past routing was always disabled if any interface on a system was a DHCP client. While an interface used in routing cannot be configured with DHCP, it is now possible to configure an interface on a router with DHCP as long as it is not used for routing. There is NOTHING about this in the release notes or the admin notes. THEY say Do NOT make a router a DHCP client.
IPoIB (IP over InfiniBand)
InfiniBand is supported in Solaris 10. The driver is ibd, so ifconfig ibd0 provides information on the first instance of an InfiniBand interface. It has a 20 byte MAC instead of a 6 byte MAC, so it looks pretty complicated in the output from commands. Right now most commands work the same as they do with conventional adapters. They have been updated to display the InfiniBand 20 byte MAC correctly.
You can only plumb and unplumb a virtual interface with decimal numbers. It used to be that you could use hex and octal, but it didn’t work right and it was never documented. Now you can’t use it at all.