Ethernet

 

urls:

http://docs.sun.com/app/docs/doc/806-4015/6jd4gh8fj?a=view  - the tunable parameters manual

 

Ethernet is an IEEE standard (IEEE 802.3) used for networking hardware and by the network interface layer of the TCP/IP protocol suite. Ethernet may be packet-switched and half duplex, or switched and full duplex. It is a broadcast technology, which means a packet is put out onto the local link and the network interface on every host attached to that link sees it (a switch narrows this by repeating a packet only to the port of the destination system). If an Ethernet-based network is half duplex, it uses a contention method of access called CSMA/CD. This access method distinguishes it from ATM or telephone lines, in which there is dedicated bandwidth and guaranteed quality of service and which use a "reservation" method of access, and from token ring, which uses a round-robin method of access.

 

Ethernet transmission protocols: CSMA/CD

 

 Within an Ethernet LAN, when a computer wants to send a packet, it simply checks that the line is clear and then transmits.  Every computer on a network has equal access to the physical network, and each sends out packets at will.  A network that employs this method of access is engaged in packet switching.  The specific packet-switching protocol used by half-duplex Ethernet to control transmissions and avoid packet chaos is CSMA/CD – Carrier Sense Multiple Access/Collision Detect.  On networks using this protocol, multiple hosts are connected to a single cable or set of cables (Multiple Access). Hosts check the wire and transmit when it is not busy (Carrier Sense).  Packets are sent out by the originating host down the wire to the hub and the hub retransmits those packets back down the receiving wires of each system attached to it, including the originating system.  Every system's interface receives every packet sent out.  Since a system nearly simultaneously receives and sends its own packet, CSMA/CD is, by definition, only half duplex, as the send and receive wires are both involved in sending.

 

On a half-duplex Ethernet LAN, any two systems may send out a packet at the same time, resulting in a collision and the loss of both packets. The entire LAN is said to be a single collision domain. Each system sending a packet out on the network must check for collisions and resend the packet if one occurs (Collision Detect). If a host doesn't receive its own packet intact, and instead senses the abnormal signal level that indicates a collision, it waits for a while, then retransmits. This is known as backing off. The backoff delay is a whole number of slot times (one slot time is 512 bit times, or 51.2 microseconds at 10 Mbps), randomly selected from the range 0 to 2^k - 1, where k is the smaller of N and 10 (N is the number of the retransmission attempt). If a collision occurs on retransmission, a further backoff occurs, and the maximum backoff time doubles with each attempt up to the tenth; after 16 failed attempts the frame is discarded. The firmware in the adaptor checks arriving packets for MAC addresses; if a packet's destination MAC address does not match that of the receiving adaptor (and is not a broadcast or multicast address the adaptor is listening for), the packet is ignored.
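The backoff rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard 802.3 figures (a slot time of 512 bit times, 51.2 microseconds at 10 Mbps, an exponent capped at 10, and a limit of 16 attempts), and the function names are made up for this example.

```python
import random

SLOT_TIME_US = 51.2   # assumed: 512 bit times at 10 Mbps
BACKOFF_LIMIT = 10    # the exponent stops growing after the 10th attempt
MAX_ATTEMPTS = 16     # assumed: the frame is dropped after 16 failed attempts


def backoff_slots(attempt, rng=random):
    """Return how many slot times to wait before retransmission number `attempt`."""
    if not 1 <= attempt <= MAX_ATTEMPTS:
        raise ValueError("attempt must be between 1 and %d" % MAX_ATTEMPTS)
    k = min(attempt, BACKOFF_LIMIT)
    # Pick uniformly from 0 .. 2**k - 1 slot times.
    return rng.randrange(2 ** k)


def backoff_delay_us(attempt, rng=random):
    """Return the backoff delay in microseconds for a 10 Mbps link."""
    return backoff_slots(attempt, rng) * SLOT_TIME_US


if __name__ == "__main__":
    for attempt in (1, 2, 3, 10, 16):
        print("attempt", attempt, "wait", backoff_delay_us(attempt), "microseconds")
```

Because the range doubles with each attempt, two hosts that collided once are increasingly unlikely to pick the same delay again.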

 

For full-duplex Ethernet, you must have a point-to-point connection between 2 hosts or between a host and a switch capable of supporting full duplex (such as a NetGear Ethernet switch), as well as an adaptor (such as hme or later) and cabling that supports full duplex. The point-to-point connection may exist between two systems' adaptors, using a crossover cable, or between a host and a switch port, using a straight-through cable. The switch must repeat packets only to the port attached to the destination system, so that each system is in its own collision domain. CSMA/CD is not required for full duplex, as collisions cannot occur in the single-host collision domain between the switch and the host. A system with a full-duplex connection can put out a packet at any time and need not wait for a clear line or watch for collisions before continuing to transmit.

 

The biggest problem with half-duplex Ethernet is that collisions happen, and the more traffic on a network, the more collisions occur. The command netstat -i can be used to check on collisions. If total collisions on the network are more than 5% of total output packets (for 10 Mbps) or 10% (for 100 Mbps), it is time to subdivide the network with bridges or routers, or to replace hubs with switches.
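As a rough illustration, the check can be written as a small calculation on the output-packet and collision counters that netstat -i reports. The function names and the sample counter values below are hypothetical; the 5% and 10% thresholds are the ones given above.

```python
def collision_rate(output_packets, collisions):
    """Return collisions as a percentage of output packets."""
    if output_packets <= 0:
        return 0.0
    return 100.0 * collisions / output_packets


def needs_segmenting(output_packets, collisions, link_mbps=10):
    """Apply the rule of thumb: over 5% at 10 Mbps, or over 10% at 100 Mbps."""
    threshold = 5.0 if link_mbps == 10 else 10.0
    return collision_rate(output_packets, collisions) > threshold


if __name__ == "__main__":
    # Hypothetical counters, as might be read from the Opkts and Collis columns.
    opkts, collis = 1000, 60
    print("collision rate: %.1f%%" % collision_rate(opkts, collis))
    print("subdivide at 10 Mbps? ", needs_segmenting(opkts, collis, 10))
    print("subdivide at 100 Mbps?", needs_segmenting(opkts, collis, 100))
```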

 

Ethernet frames

 

Ethernet packets are called frames. A frame consists of a header, which is a section of information ahead of the actual data payload, the payload itself, and a trailer, which is the section of information after the data payload. A frame starts with a 64-bit "preamble" which tells the system that a frame is coming in. It consists of alternating 1s and 0s (10101010 ... 10101010) and ends with 10101011; these last 8 bits are the "start-of-frame byte."
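The preamble pattern is easy to spell out. The short Python sketch below builds it as a string of bits and checks the start-of-frame byte; the function names are invented for the illustration, and the bits are written in the order they appear on the wire.

```python
SYNC_BYTE = "10101010"       # repeated 7 times
START_OF_FRAME = "10101011"  # the last 8 bits of the preamble


def preamble_bits():
    """Return the 64-bit preamble as a string of '0' and '1' characters."""
    return SYNC_BYTE * 7 + START_OF_FRAME


def is_valid_preamble(bits):
    """Check the length, the alternating pattern and the start-of-frame byte."""
    return bits == preamble_bits()


if __name__ == "__main__":
    bits = preamble_bits()
    print(len(bits), "bits, ending in", bits[-8:])
```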

 

The next section of the frame gives the destination MAC address of the packet. Each adaptor is assigned a MAC address. The MAC address is composed of 6 octets (48 bits), written in hexadecimal and separated by colons, and is stored in the system's NVRAM or in the interface card's own firmware. The first 3 octets, called the CID (company ID), identify the manufacturer and are assigned by IEEE, while the last 3 (VID – vendor ID) are unique within that manufacturer's output. The whole MAC address is therefore intended to be unique in the world. If a Sun system has more than one interface, it may use its NVRAM MAC address for all, or it may have a built-in address for each interface. The le (lance ethernet, used in Sparc 5s) and qe interfaces must use the NVRAM MAC address. Sun's other interface cards (including the hme interface) have a built-in local MAC address. To allow a system to use the MAC address built in to the interface card, the PROM parameter local-mac-address? must be set to true. Root can check the MAC address of an interface with ifconfig -a, and set it using ifconfig <interface> ether <MAC>, for example:

 

# ifconfig hme0 ether 8:0:20:ef:7:b6
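Solaris prints MAC addresses without leading zeros, as in the example above, which can make addresses harder to compare. A minimal Python sketch for normalizing such an address and splitting it into its company and vendor halves might look like this (the function names are made up for the example):

```python
def parse_mac(text):
    """Convert 'x:x:x:x:x:x' (leading zeros optional) into 6 raw bytes."""
    parts = text.strip().split(":")
    if len(parts) != 6:
        raise ValueError("expected 6 colon-separated octets: %r" % text)
    octets = []
    for part in parts:
        if not 1 <= len(part) <= 2:
            raise ValueError("bad octet %r in %r" % (part, text))
        octets.append(int(part, 16))  # raises ValueError on non-hex input
    return bytes(octets)


def format_mac(raw):
    """Format 6 raw bytes as two-digit, colon-separated hexadecimal."""
    return ":".join("%02x" % octet for octet in raw)


def split_mac(text):
    """Return (company ID, vendor ID), the first and last 3 octets."""
    full = format_mac(parse_mac(text))
    return full[:8], full[9:]


if __name__ == "__main__":
    print(format_mac(parse_mac("8:0:20:ef:7:b6")))
    print(split_mac("8:0:20:ef:7:b6"))
```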

 

To set a MAC address automatically at boot time, an ifconfig command can be added  to /etc/init.d/inetsvc.  The state of an interface can also be checked with netstat  -i.

 

You might want to use the same MAC address for more than one interface if each interface is connected to a different network.  If multiple interfaces are connected to the same network, and you have a manageable switch, you can use different MAC addresses with the same IP address, and thereby do load balancing. Otherwise use the same MAC address for all interfaces.

 

The third section of a frame is the source MAC address, and it is followed by a 2-byte (16-bit) "type" field which indicates the protocol carried in the payload, for example IPv4 (0x0800), ARP (0x0806), RARP (0x8035) or IPv6 (0x86DD). Each frame carries exactly one type value; protocols such as ICMP do not have a type of their own because they travel inside IP packets. Then the data payload begins, which includes header information for the other layers of the TCP/IP protocol, and which cannot exceed the MTU, which is 1500 bytes for Ethernet. Finally there is the CRC or Cyclic Redundancy Check, a 32-bit checksum used by the receiving interface to make sure the frame has not been corrupted. It is generated by dividing the bit pattern of the frame by a fixed polynomial, so it verifies the exact pattern of bits in the frame. If the CRC generated at the receiving host does not match that in the frame, the frame is discarded. Frames are also discarded automatically if their payload is < 46 bytes (runts) or > 1500 bytes (jabbers).
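The CRC check can be imitated in software. The sketch below is an illustration only: it assumes that the CRC-32 in Python's zlib module uses the same polynomial as the Ethernet CRC, and that the 4 check bytes are stored least significant byte first, which is how they usually appear in captures. The function names are invented for the example; a real interface does this in hardware.

```python
import struct
import zlib


def append_fcs(frame_without_crc):
    """Return the frame with a 4-byte CRC-32 trailer appended."""
    crc = zlib.crc32(frame_without_crc) & 0xFFFFFFFF
    return frame_without_crc + struct.pack("<I", crc)


def fcs_ok(frame_with_crc):
    """Recompute the CRC over the frame body and compare it with the trailer."""
    if len(frame_with_crc) < 4:
        return False
    body, trailer = frame_with_crc[:-4], frame_with_crc[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == trailer


if __name__ == "__main__":
    # 14-byte header plus a minimum-size 46-byte payload of zeros.
    frame = bytes.fromhex("ffffffffffff080020ef07b60806") + bytes(46)
    sent = append_fcs(frame)
    print("intact frame passes:", fcs_ok(sent))

    damaged = bytearray(sent)
    damaged[20] ^= 0x01  # flip a single bit in the payload
    print("damaged frame passes:", fcs_ok(bytes(damaged)))
```

A single flipped bit anywhere in the frame changes the computed CRC, which is why the receiver can discard corrupted frames without inspecting the payload.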

 

The data payload includes the application data and the headers attached to that data by the transport and internet layers.  We will discuss these in detail later.

 

MTUs: An MTU (Maximum Transmission Unit) is the largest amount of data, excluding the frame header and trailer, that can be transferred in one frame across a network's hardware. The MTU for Ethernet is 1500 bytes. You will also see the lo0 device when you do ifconfig -a. It has an MTU of 8232 bytes. That is the loopback device, which is simply the circuitry the host uses when it uses network protocols to talk to itself.

 

Types of addresses: When a packet goes out on the network, it may be destined for a specific MAC address, or for more than one host. A unicast packet goes to one destination, and has that destination's MAC address encoded in the destination address field of the frame. Broadcast packets go out to all machines on the local link, and have the destination MAC address ff:ff:ff:ff:ff:ff. Multicast packets go out to a subset of all machines on the network; for IP multicast they all get a destination MAC address starting with 01:00:5e and ending with a set of 3 octets that designates the particular subset. For example, a packet sent to all routers gets the MAC address 01:00:5e:0:0:2.
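The three address types can be told apart programmatically. The Python sketch below is a hedged illustration with invented function names. It relies on two facts not spelled out above: the low-order bit of the first octet marks any group (multicast) address, and an IPv4 multicast group is mapped onto 01:00:5e by copying the low 23 bits of the IP address.

```python
import ipaddress

BROADCAST = b"\xff" * 6


def mac_kind(mac):
    """Classify 6 raw bytes as 'broadcast', 'multicast' or 'unicast'."""
    if len(mac) != 6:
        raise ValueError("a MAC address has exactly 6 octets")
    if mac == BROADCAST:
        return "broadcast"
    if mac[0] & 0x01:  # group bit set in the first octet
        return "multicast"
    return "unicast"


def ipv4_multicast_mac(group):
    """Map an IPv4 multicast group such as '224.0.0.2' to its MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError("%s is not a multicast address" % group)
    low23 = int(addr) & 0x7FFFFF
    return bytes([0x01, 0x00, 0x5E]) + low23.to_bytes(3, "big")


if __name__ == "__main__":
    all_routers = ipv4_multicast_mac("224.0.0.2")
    print(all_routers.hex(), mac_kind(all_routers))
    print(mac_kind(bytes.fromhex("080020ef07b6")))
    print(mac_kind(BROADCAST))
```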

 

Commands:  All of a frame except the preamble and the CRC may be captured and displayed with the useful "snoop" command. The snoop command runs on any interface, and can capture and display the top level (usually application layer) activity, or complete packets line by line (snoop -v). It can send packets to a file for later analysis (snoop -o filename, followed by snoop -i filename to read), or do an intermediate verbose snoop (snoop -V). Snoop can filter by a large number of parameters, including a single hostname (snoop host1) or two hostnames, which captures only traffic between those systems (snoop host1 host2). If you are snooping hosts on a switch, you must snoop one or more hostnames. You can specify just one type of traffic (snoop broadcast, snoop dhcp, snoop rarp, snoop udp) or ask for snoop on a specific port (snoop udp port 67). You can use the -d option, followed by an interface name, to display only traffic on that interface, for example snoop -d qfe2.

 

To display or change tunable kernel parameters involved with the network interfaces, the command ndd is used. ndd <driver> <parameter> displays the value of the parameter specified (\? in place of the parameter lists them all), while ndd -set <driver> <parameter> <value> sets the parameter. If you have multiple instances of a device associated with the driver (for example, hme0 and hme1, associated with /dev/hme), first select the instance with ndd -set <driver> instance <instance number>; otherwise commands apply to the first instance. These ndd commands can be included in the /etc/init.d/inetinit script, or driver parameters can be set in /etc/system using the "set" command. In general the ndd parameters are 1 for enabled and 0 for disabled. For example, to check the speed of an hme interface, use ndd /dev/hme link_speed, which reports 0 for 10 Mbps or 1 for 100 Mbps. Only parameters listed as read/write can be set at the command line; link_speed and link_mode are read only on hme, so speed and duplex are forced through the adv_*_cap parameters instead. Driver parameters are set in /etc/system in the form

set <driver>:<driver>_<parameter>=<value>; for example: set hme:hme_adv_100fdx_cap=1. If you want to get or set the value of a particular instance of a driver, such as qfe2, select the instance first and then issue the command for the parameter:

ndd -set <driver> instance <instance number>, for example ndd -set /dev/qfe instance 2 followed by ndd /dev/qfe link_speed.

 

Sun does not want you to set these parameters yourself; however, online documentation is available (see the URL above).

 

Acronyms:

CID – Company ID – the first 24 bits of the MAC address (also known as the OUI – Organizationally Unique Identifier).

CRC – Cyclic Redundancy Check – a checksum calculated by the sending system and recalculated by the receiver to check the integrity of a frame.

CSMA/CD - Carrier Sense Multiple Access/Collision Detect

DSAP – Destination Service Access Point – part of the IEEE 802.2 (LLC) header in a frame.

FCS – Frame Check Sequence – a trailer added to a packet which checks the integrity of the packet.  The type of FCS used in Ethernet frames is the CRC.

FEP – Front End Processor

MII – Media Independent Interface

MTU – Maximum Transmission Unit – the largest packet of data that can pass through an interface.

ndd – network device driver/diddler – a command that changes how the kernel communicates with a device.

SNAP – Subnetwork Access Protocol – an extension of the IEEE 802.2 (LLC) header in a frame.

SSAP – Source Service Access Point – part of the IEEE 802.2 (LLC) header in a frame.

TDMA – Time Division Multiple Access – each sender is allowed a time slot during which it controls the entire bandwidth.

VID – Vendor ID – the last 24 bits of the MAC address which identify one interface uniquely.

 

Definitions:

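back-off algorithm – determines the length of time a system waits before retransmitting after a collision on an Ethernet network: a random number of slot times (51.2 microseconds each at 10 Mbps), chosen from a range that doubles with each retransmission attempt.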

broadcast technology – packets sent out on network that go to all systems. Ethernet is a broadcast technology.

circuit switching network – one where there is dedicated bandwidth between 2 hosts.

contention – a network access method in which multiple hosts all use the same medium for transmission.

ethernet – a high speed, inexpensive LAN technology, consisting of cables, hubs, repeaters, bridges, switches and routers.  Standards are set by IEEE.

frame – a packet sent out on the network, from the preamble to the end of the CRC. A frame is the fundamental Ethernet packet.

full duplex – it is possible to send and receive over the cable at the same time, because the cable has separate wire pairs for sending and receiving, the interface is capable of handling both directions at once, and no other system shares the link. In Ethernet, full duplex is possible only with point-to-point connections or between a workstation and a switch port, such that an isolated collision domain is formed between only 2 devices.

half duplex – it is possible to send and receive over the cable, but at different times. All CSMA/CD transmissions must be half duplex.

header – in a frame: the preamble, destination address, source address and type. In general, any information in front of the data.

packet – a generic term for a collection of bits sent out on a network.

packet-switching – a network used by all systems alternately.

start-of-frame-byte – the last 8 bits of the preamble

trailer – in a frame: the CRC. In general, any information placed after the data in a packet.

 

Commands:

 

banner (at ok prompt) - display PROM MAC address

eeprom local-mac-address? true  - cause the system to use the MAC address on the NIC rather than the system's MAC address.

netstat   - provides information about the configuration and function of network interfaces.

            -i [<interval>]   gives information on input and output, optionally repeated every <interval> seconds.

                        Output of command: gives the number of input packets, output packets and collisions for each interface for the interval specified, and for all interfaces on the system (includes loopback, which is not listed under interfaces). The exact calculation of input and output errors is determined by the interface boards, but in general, if the output error numbers are large, you have a hardware problem. If the input errors are large, you may have a hardware problem, or there may be a duplicate IP address. Without the optional interval, the output includes the name, nodename and MTU of the interface, along with input and output packets and collisions for each interface.

            -n            uses IP number instead of hostname

            -s            display protocol statistics

ifconfig  - configures network interfaces and gets information on their configuration.

            -a            show configuration for all interfaces.

            <interface>   show configuration for only the interface listed.

            <interface> ether <MAC address>   sets the MAC address.

uname -S <new name>  - sets the host name.

hostname <new name>  - sets the host name.

snoop    - read packets off the network.

            -v            verbose – include information from headers and trailers

            -V            summary verbose – include summary header and trailer information.

            -i <filename>   read snoop info from a file created with snoop -o. You may also use the -v or -V options with this command.

            -o <filename>   write snoop info to file. This file is binary and must be read with snoop -i.

            -d <interface name, such as hme0>   display only traffic on this interface. This option is useful for routers.

            <filter>      snoop only the type of packets given, where filter can be: broadcast, arp, rarp, multicast, a hostname, an IP or MAC address, udp, port <#>, and many others. Boolean expressions may be used in the filter, for example: snoop host1 and udp.

ndd  <driver> <parameter>  - where the driver is /dev/<device>, as in /dev/hme, /dev/arp, /dev/ip, /dev/icmp or /dev/tcp, and parameter is a tunable kernel parameter. All parameters may be listed using \? in place of a specific parameter. If a parameter is specified without a value, the setting of that parameter is displayed.

                        Link status parameters (read only on hme):

                                                link_mode:  0 is half duplex, 1 is full duplex

                                                link_speed: 0 is 10 Mbps, 1 is 100 Mbps

                        Drivers:  /dev/arp   /dev/icmp   /dev/ip   /dev/tcp

ndd  -set  <driver>  <parameter>  <value>  - set the given parameter to the given value. Parameters may also be set in /etc/system or inetinit. Values: 0 is disabled, 1 is enabled.

ndd  -set  <driver>  instance <instance_number>  - set the instance of the driver PRIOR to setting any parameters. Default is 0, or hme0.

                        ndd  -set  /dev/hme  instance   2

                        sets the instance to hme2. Subsequent commands will then be applied to this particular interface.

 

Misc:

 

MTUs (in bytes)

16 Mbps token ring (IBM)            17914

4 Mbps token ring (IEEE)            4464

FDDI                                4352

Ethernet 2                          1500  (1492 for IEEE 802.3 frames with LLC/SNAP headers, which you still sometimes see)

X.25 (internet)                     576

Point to point                      296

loopback                            8232

 

special packet names:

runt      - data section is < 46 bytes

jabber    - data section is > 1500 bytes

            Subsets of jabbers:

            long            more than 1500, up to 6000 bytes

            giant           > 6000 bytes
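The size classes above amount to a simple range test on the length of the data section. A small Python sketch, using a made-up function name and the limits listed here, could be:

```python
MIN_PAYLOAD = 46     # anything shorter is a runt
MAX_PAYLOAD = 1500   # the Ethernet MTU; anything longer is a jabber
LONG_LIMIT = 6000    # jabbers up to this size are "long", beyond it "giant"


def classify_payload(length):
    """Name the size class of a frame from the length of its data section."""
    if length < 0:
        raise ValueError("length cannot be negative")
    if length < MIN_PAYLOAD:
        return "runt"
    if length <= MAX_PAYLOAD:
        return "normal"
    if length <= LONG_LIMIT:
        return "long"   # a jabber
    return "giant"      # also a jabber


if __name__ == "__main__":
    for size in (20, 46, 1500, 1501, 6000, 9000):
        print(size, classify_payload(size))
```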

 

CIDs for Sun:

tadpole                     0a:0:20

E10k, SF 12k/15k            0:0:be

Sunblade                    00:03:ba

most others                 8:0:20

interfaces that will use the PROM MAC only: qe, le

interfaces that can use the local MAC: hme, vge, ge, FDDI, ATM, qfe

 

 

 

Frame layout:

part of frame          size

preamble               64 bits

Destination MAC        48 bits

Source MAC             48 bits

type                   16 bits

data                   up to 1500 bytes

CRC                    32 bits
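To make the layout concrete, here is a minimal Python sketch that splits a captured frame into its header fields. It assumes the frame is in the form a capture tool normally delivers it, without the preamble and without the CRC, so the header is the first 14 bytes. The function names and the small type table are illustrative, not a complete list.

```python
import struct

# A few well-known type values; this table is illustrative, not complete.
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x8035: "RARP", 0x86DD: "IPv6"}


def format_mac(raw):
    """Format 6 raw bytes as colon-separated hexadecimal."""
    return ":".join("%02x" % octet for octet in raw)


def parse_header(frame):
    """Split a captured frame (no preamble, no CRC) into its header fields."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than the 14-byte header")
    # 6-byte destination, 6-byte source, 2-byte type, all in network byte order.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "destination": format_mac(dst),
        "source": format_mac(src),
        "type": ETHERTYPES.get(ethertype, hex(ethertype)),
        "payload_length": len(frame) - 14,
    }


if __name__ == "__main__":
    # A broadcast ARP frame with a minimum-size payload of zeros.
    sample = bytes.fromhex("ffffffffffff" "080020ef07b6" "0806") + bytes(46)
    print(parse_header(sample))
```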

 

To bypass autonegotiation, do the following (this example chooses 100 Mbps half duplex):

 

   # ndd -set /dev/hme adv_autoneg_cap 0

   # ndd -set /dev/hme adv_100T4_cap 0

   # ndd -set /dev/hme adv_100fdx_cap 0

   # ndd -set /dev/hme adv_100hdx_cap 1

   # ndd -set /dev/hme adv_10fdx_cap 0

   # ndd -set /dev/hme adv_10hdx_cap 0

 

You can also do the same through the /etc/system kernel config file:

       set hme:hme_adv_xxx = 1

   ...with adv_xxx being one of the ndd parameter names above (for example, hme_adv_100hdx_cap).

 

To set up 100 Mbps full-duplex Ethernet (assuming an hme interface):

 

in /etc/system:

set hme:hme_adv_autoneg_cap=0

set hme:hme_adv_100hdx_cap=0

set hme:hme_adv_100fdx_cap=1

 

or with ndd at the command line:

 

ndd -set /dev/hme adv_100hdx_cap 0

ndd -set /dev/hme adv_100fdx_cap 1

ndd -set /dev/hme adv_autoneg_cap 0
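Both recipes follow the same pattern: enable exactly one capability, disable the others, and turn autonegotiation off. A hedged Python sketch that generates the command lines for any of the four speed and duplex combinations is shown below. The helper names are invented for this example, the parameter names are the hme ones used above, and the script only prints text; it does not run ndd or edit /etc/system.

```python
# Capability parameters of the hme driver, as used in the commands above.
HME_CAPS = [
    "adv_100T4_cap",
    "adv_100fdx_cap",
    "adv_100hdx_cap",
    "adv_10fdx_cap",
    "adv_10hdx_cap",
]


def forced_settings(speed, duplex):
    """Return (parameter, value) pairs that force one speed/duplex mode."""
    if duplex not in ("full", "half"):
        raise ValueError("duplex must be 'full' or 'half'")
    wanted = "adv_%d%s_cap" % (speed, "fdx" if duplex == "full" else "hdx")
    if wanted not in HME_CAPS:
        raise ValueError("unsupported speed: %r" % speed)
    settings = [(cap, 1 if cap == wanted else 0) for cap in HME_CAPS]
    # Autonegotiation is switched off last, once the capabilities are in place.
    settings.append(("adv_autoneg_cap", 0))
    return settings


def ndd_commands(speed, duplex, driver="/dev/hme"):
    """Return the ndd command lines for the requested mode."""
    return ["ndd -set %s %s %d" % (driver, name, value)
            for name, value in forced_settings(speed, duplex)]


def etc_system_lines(speed, duplex, module="hme"):
    """Return the equivalent 'set' lines for /etc/system."""
    return ["set %s:%s_%s=%d" % (module, module, name, value)
            for name, value in forced_settings(speed, duplex)]


if __name__ == "__main__":
    print("\n".join(ndd_commands(100, "full")))
    print()
    print("\n".join(etc_system_lines(100, "full")))
```

Generating the lines from one table keeps the ndd and /etc/system versions in step, which is where hand-edited configurations tend to drift apart.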

 
