ppp linux kernel module
notes

These are some brief notes on the structure and function of the PPP linux kernel module.
This is work in progress.
The purposes of this web page include:
  1. Discover and understand the execution paths for host-to-network (TX) packet transmission.
  2. Discover and understand the TX queueing structures.
  3. Determine how to modify the code, if necessary, to:
    • increase the TX buffer capacity from 1 to many packets;
    • control the TX buffer so as to prioritize UDP over TCP packets

The PPP module has a user-space interface which is used by pppd.
The main question is whether this interface can be used to control TX packet queueing tightly (e.g. by preempting the current queue) while still keeping a fairly large number of TX packets queued, so that transmission bandwidth is not wasted on idle time when the queue runs empty.
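To make the goal concrete, here is a minimal user-space sketch of the kind of queue being asked for: several packets deep, with UDP served ahead of TCP. It is not taken from the ppp source. The names `struct txq`, `txq_enqueue()`, `txq_dequeue()` and the capacity `BAND_CAPACITY` are all hypothetical, and the classifier assumes plain IPv4 datagrams, where byte 9 of the header is the protocol number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROTO_UDP      17  /* IANA protocol number for UDP */
#define BAND_CAPACITY  8   /* hypothetical per-band queue depth */

/* Hypothetical packet descriptor: a pointer to an IPv4 datagram. */
struct pkt {
    const uint8_t *data;
    size_t len;
};

/* A fixed-size FIFO ring of packets. */
struct band {
    struct pkt slot[BAND_CAPACITY];
    size_t head, count;
};

/* Two bands: band 0 is always served before band 1. */
struct txq {
    struct band band[2];
};

/* Byte 9 of an IPv4 header is the protocol field. */
int pkt_band(const struct pkt *p)
{
    if (p->len >= 20 && (p->data[0] >> 4) == 4 && p->data[9] == PROTO_UDP)
        return 0;   /* UDP: high priority */
    return 1;       /* TCP and everything else */
}

/* Returns 0 on success, -1 if the chosen band is full. */
int txq_enqueue(struct txq *q, struct pkt p)
{
    assert(q != NULL);
    struct band *b = &q->band[pkt_band(&p)];
    if (b->count == BAND_CAPACITY)
        return -1;
    b->slot[(b->head + b->count) % BAND_CAPACITY] = p;
    b->count++;
    return 0;
}

/* Returns 1 and fills *out if a packet was available, else 0. */
int txq_dequeue(struct txq *q, struct pkt *out)
{
    assert(q != NULL && out != NULL);
    for (int i = 0; i < 2; i++) {
        struct band *b = &q->band[i];
        if (b->count > 0) {
            *out = b->slot[b->head];
            b->head = (b->head + 1) % BAND_CAPACITY;
            b->count--;
            return 1;
        }
    }
    return 0;
}
```

A TCP packet queued before a UDP packet comes out after it, while packets within one band keep their arrival order. Strict priority like this can starve the low band, which may or may not be acceptable on a slow link.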

2003-9-19: I created this web page as a sketchpad for discussing a query I had from a colleague in November 2002 about how to modify the PPP kernel software for TX packet queueing, in particular to implement PPP suspend/resume. Later I used this understanding to implement ROHC/PPP. So this web page has already fulfilled its objective. In the unlikely event that anyone else finds this web page useful, you're welcome.

Structure of the linux `ppp' kernel module.

The place to start is the directory /usr/src/linux/.
In this case, I'm looking at a version 2.4.6 linux kernel.
The ppp kernel module version is 2.4.1.
The directory /usr/src/linux contains these directories:

arch                CPU-dependent stuff
drivers             device drivers
fs                  file systems
include             header files
init                boot-time code
ipc                 inter-process communication (shared memory etc.)
kernel              kernel core software
lib                 [could be anything]
mm                  memory (RAM) management
net                 network protocols
scripts             [don't ask me]

From the point of view of ppp, only the directories "drivers", "include", and "net" are directly relevant, although there are many interactions with the kernel core of course.

These files in the drivers/net directory are relevant to PPP:

drivers/net/bsd_comp.c
drivers/net/ppp_async.c
drivers/net/ppp_deflate.c
drivers/net/ppp_generic.c
drivers/net/ppp_synctty.c
drivers/net/pppoe.c
drivers/net/pppox.c
drivers/net/zlib.c

These header files are directly imported by the ppp*.c files:

"zlib.c"
asm/atomic.h
asm/uaccess.h
linux/config.h
linux/devfs_fs_kernel.h
linux/errno.h
linux/etherdevice.h
linux/file.h
linux/filter.h
linux/if_arp.h
linux/if_ether.h
linux/if_ppp.h
linux/if_pppox.h
linux/if_pppvar.h
linux/inetdevice.h
linux/init.h
linux/ip.h
linux/kernel.h
linux/kmod.h
linux/list.h
linux/module.h
linux/net.h
linux/netdevice.h
linux/notifier.h
linux/poll.h
linux/ppp-comp.h
linux/ppp_channel.h
linux/ppp_defs.h
linux/proc_fs.h
linux/rtnetlink.h
linux/sched.h
linux/skbuff.h
linux/slab.h
linux/smp_lock.h
linux/spinlock.h
linux/string.h
linux/tcp.h
linux/tty.h
linux/vmalloc.h
net/slhc_vj.h
net/sock.h

The above listing is obtained with this command in the /usr/src/linux directory:

fgrep -h '#include' drivers/net/ppp*.c | sort | uniq | awk '{ print $2 }' | sed 's/[<>]//g'

Apart from the zlib.c file, all of the above headers are imported from the `include' directory.
Most are from `include/linux', which is the kernel directory made available to application software as an API to the kernel.
The include/linux directory also provides a kind of kernel-to-kernel interface, because the header files are used to import functions from one part of the kernel into another.
These kernel-to-kernel interfaces are delimited by ``#ifdef __KERNEL__'' lines to discourage applications developers from trying to use them.

The directory net/ipv4 contains most of the Internet stack (TCP/IP and UDP/IP etc.)

Arrangement of ppp-related kernel modules when loaded.

This is what I see on one of my computers for the "lsmod" command.

akenning@dog> lsmod
Module                  Size  Used by
ppp_deflate            39040   0 (autoclean)
bsd_comp                3952   0 (autoclean)
ppp_async               6032   1 (autoclean)
ppp_generic            13520   3 (autoclean) [ppp_deflate bsd_comp ppp_async]
slhc                    4304   1 (autoclean) [ppp_generic]
ipt_REDIRECT             720  30 (autoclean)
ipt_multiport            592  13 (autoclean)
iptable_nat            12336   0 (autoclean) [ipt_REDIRECT]
ip_conntrack           12416   1 (autoclean) [ipt_REDIRECT iptable_nat]
ipv6                  123056  -1 (autoclean)
mousedev                3808   0 (unused)
hid                    12448   0 (unused)
input                   3072   0 [mousedev hid]
usb-uhci               20608   0 (unused)
usbcore                46432   1 [hid usb-uhci]
ne2k-pci                4704   1 (autoclean)
8390                    5792   0 (autoclean) [ne2k-pci]

This means that the kernel modules `ppp_deflate', `bsd_comp' and `ppp_async' are client modules of `ppp_generic', which is in turn a client of `slhc'.
Server modules cannot be unloaded unless the client modules which require them are unloaded first.
In my case, `ppp_deflate' and `bsd_comp' are not in use.
But the kernel module `ppp_async' is in use.
Therefore `ppp_async' cannot be unloaded, and the ppp_generic and slhc modules cannot be unloaded because they are being used by other modules.

The reverse considerations apply when loading the modules.
Module `slhc' must be loaded first.
Then module `ppp_generic' must be loaded.
Then modules `ppp_async', `ppp_deflate' and `bsd_comp' may be loaded.
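The load and unload ordering above can be modelled with simple reference counts. This is only a toy model: `struct mod`, `mod_load()` and `mod_unload()` are hypothetical names, not the kernel's module loader, and the two counters merely mimic the "Used by" column of lsmod.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical model of a loadable module. */
struct mod {
    const char *name;
    int loaded;
    int users;                    /* open devices, ttys and so on */
    int ref_by_modules;           /* number of loaded client modules */
    struct mod *deps[MAX_DEPS];   /* server modules this one needs */
};

/* Load succeeds only when every server module is already loaded. */
int mod_load(struct mod *m)
{
    assert(m != NULL);
    if (m->loaded)
        return 0;
    for (size_t i = 0; i < MAX_DEPS && m->deps[i]; i++)
        if (!m->deps[i]->loaded)
            return -1;
    for (size_t i = 0; i < MAX_DEPS && m->deps[i]; i++)
        m->deps[i]->ref_by_modules++;
    m->loaded = 1;
    return 0;
}

/* Unload succeeds only when nothing uses the module any more. */
int mod_unload(struct mod *m)
{
    assert(m != NULL);
    if (!m->loaded || m->users > 0 || m->ref_by_modules > 0)
        return -1;
    for (size_t i = 0; i < MAX_DEPS && m->deps[i]; i++)
        m->deps[i]->ref_by_modules--;
    m->loaded = 0;
    return 0;
}
```

With slhc, ppp_generic and ppp_async chained through `deps`, loading only works bottom-up and unloading only works top-down, which is the behaviour described above.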

List of functions in drivers/net/ppp_generic.c.

The following functions are found in the file `ppp_generic.c':

static inline int proto_to_npindex(int proto)
static inline int ethertype_to_npindex(int ethertype)
static int ppp_open(struct inode *inode, struct file *file)
static int ppp_release(struct inode *inode, struct file *file)
static ssize_t ppp_read(struct file *file, char *buf, size_t count, loff_t *ppos)
static ssize_t ppp_file_read(struct ppp_file *pf, struct file *file, char *buf, size_t count)
static ssize_t ppp_write(struct file *file, const char *buf, size_t count, loff_t *ppos)
static ssize_t ppp_file_write(struct ppp_file *pf, const char *buf, size_t count)
static unsigned int ppp_poll(struct file *file, poll_table *wait)
static int ppp_ioctl(struct inode *inode, struct file *file, unsigned int cmd, unsigned long arg)
static int ppp_unattached_ioctl(struct ppp_file *pf, struct file *file, unsigned int cmd, unsigned long arg)

int __init ppp_init(void)

static int ppp_start_xmit(struct sk_buff *skb, struct net_device *dev)
static struct net_device_stats *ppp_net_stats(struct net_device *dev)
static int ppp_net_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
int ppp_net_init(struct net_device *dev)
static void ppp_xmit_process(struct ppp *ppp)
static void ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
static void ppp_push(struct ppp *ppp)
static int ppp_mp_explode(struct ppp *ppp, struct sk_buff *skb)
static void ppp_channel_push(struct channel *pch)
static inline void ppp_do_recv(struct ppp *ppp, struct sk_buff *skb, struct channel *pch)
void ppp_input(struct ppp_channel *chan, struct sk_buff *skb)
void ppp_input_error(struct ppp_channel *chan, int code)
static void ppp_receive_frame(struct ppp *ppp, struct sk_buff *skb, struct channel *pch)
static void ppp_receive_error(struct ppp *ppp)
static void ppp_receive_nonmp_frame(struct ppp *ppp, struct sk_buff *skb)
static struct sk_buff *ppp_decompress_frame(struct ppp *ppp, struct sk_buff *skb)
static void ppp_receive_mp_frame(struct ppp *ppp, struct sk_buff *skb, struct channel *pch)
static void ppp_mp_insert(struct ppp *ppp, struct sk_buff *skb)
struct sk_buff *ppp_mp_reconstruct(struct ppp *ppp)

int ppp_register_channel(struct ppp_channel *chan)
int ppp_channel_index(struct ppp_channel *chan)
int ppp_unit_number(struct ppp_channel *chan)
void ppp_unregister_channel(struct ppp_channel *chan)

void ppp_output_wakeup(struct ppp_channel *chan)
static int ppp_set_compress(struct ppp *ppp, unsigned long arg)
static void ppp_ccp_peek(struct ppp *ppp, struct sk_buff *skb, int inbound)
static void ppp_ccp_closed(struct ppp *ppp)
static struct compressor_entry *find_comp_entry(int proto)
int ppp_register_compressor(struct compressor *cp)
void ppp_unregister_compressor(struct compressor *cp)
static struct compressor *find_compressor(int type)
static void ppp_get_stats(struct ppp *ppp, struct ppp_stats *st)

static struct ppp *ppp_create_interface(int unit, int *retp)
static void init_ppp_file(struct ppp_file *pf, int kind)
static void ppp_destroy_interface(struct ppp *ppp)
static struct ppp *ppp_find_unit(int unit)
static struct channel *ppp_find_channel(int unit)
static int ppp_connect_channel(struct channel *pch, int unit)
static int ppp_disconnect_channel(struct channel *pch)
static void ppp_destroy_channel(struct channel *pch)

static void __exit ppp_cleanup(void)

module_init(ppp_init);
module_exit(ppp_cleanup);

EXPORT_SYMBOL(ppp_register_channel);
EXPORT_SYMBOL(ppp_unregister_channel);
EXPORT_SYMBOL(ppp_channel_index);
EXPORT_SYMBOL(ppp_unit_number);
EXPORT_SYMBOL(ppp_input);
EXPORT_SYMBOL(ppp_input_error);
EXPORT_SYMBOL(ppp_output_wakeup);
EXPORT_SYMBOL(ppp_register_compressor);
EXPORT_SYMBOL(ppp_unregister_compressor);
EXPORT_SYMBOL(all_ppp_units); /* for debugging */
EXPORT_SYMBOL(all_channels); /* for debugging */

Groups of functions in drivers/net/ppp_generic.c.

Below is a set of categories for the functions in ppp_generic.c.
All functions are shown here in the same order as in the source file except ppp_init() and ppp_cleanup().

  1. The basic init/cleanup functions which are called at module load/unload time.
    int __init ppp_init(void)
    static void __exit ppp_cleanup(void)
    
  2. The /dev/ppp char-device event-handler functions - plus some functions which are closely associated with these handlers.
    static inline int proto_to_npindex(int proto)
    static inline int ethertype_to_npindex(int ethertype)
    static int ppp_open(struct inode *inode, struct file *file)
    static int ppp_release(struct inode *inode, struct file *file)
    static ssize_t ppp_read(struct file *file, char *buf, size_t count, loff_t *ppos)
    static ssize_t ppp_file_read(struct ppp_file *pf, struct file *file, char *buf, size_t count)
    static ssize_t ppp_write(struct file *file, const char *buf, size_t count, loff_t *ppos)
    static ssize_t ppp_file_write(struct ppp_file *pf, const char *buf, size_t count)
    static unsigned int ppp_poll(struct file *file, poll_table *wait)
    static int ppp_ioctl(struct inode *inode, struct file *file, unsigned int cmd, unsigned long arg)
    static int ppp_unattached_ioctl(struct ppp_file *pf, struct file *file, unsigned int cmd, unsigned long arg)
    

    These are entered via the ppp_device_fops jump-table from the kernel code for character devices.

    static struct file_operations ppp_device_fops = {
            owner:          THIS_MODULE,
            read:           ppp_read,
            write:          ppp_write,
            poll:           ppp_poll,
            ioctl:          ppp_ioctl,
            open:           ppp_open,
            release:        ppp_release
    };
    
  3. Functions which handle network events for IP packets which are transmitted via network interfaces ppp0, ppp1 etc.
    static int ppp_start_xmit(struct sk_buff *skb, struct net_device *dev)
    static struct net_device_stats *ppp_net_stats(struct net_device *dev)
    static int ppp_net_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
    int ppp_net_init(struct net_device *dev)
    

    These functions are entered via a net_device jump-table, which is passed between the ppp_generic.c module and the IP network layer, usually in a pointer called `dev'.

    static int
    ppp_net_init(struct net_device *dev)
    {
            dev->hard_header_len = PPP_HDRLEN;
            dev->mtu = PPP_MTU;
            dev->hard_start_xmit = ppp_start_xmit;
            dev->get_stats = ppp_net_stats;
            dev->do_ioctl = ppp_net_ioctl;
            dev->addr_len = 0;
            dev->tx_queue_len = 3;
            dev->type = ARPHRD_PPP;
            dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;
            return 0;
    }
    

    The ppp_net_init handler is set in ppp_create_interface() as follows:

            dev->init = ppp_net_init;
    
  4. Lower level functions for TX packet stream.
    It seems like only the ppp_xmit_process() function is called by event handlers, and that the other functions in this group are called basically only via ppp_xmit_process().
    These are the functions where the TX queueing must be happening.
    static void ppp_xmit_process(struct ppp *ppp)
    static void ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
    static void ppp_push(struct ppp *ppp)
    static int ppp_mp_explode(struct ppp *ppp, struct sk_buff *skb)
    static void ppp_channel_push(struct channel *pch)
    
  5. Lower level functions for the RX packet stream.
    The _mp_ functions are only compiled with the CONFIG_PPP_MULTILINK option.
    The functions ppp_input() and ppp_input_error() are called by other kernel modules, e.g. by the ppp_async.c module.
    The functions ppp_input() and ppp_input_error() are advertised to other kernel modules by the header file include/linux/ppp_channel.h.
    static inline void ppp_do_recv(struct ppp *ppp, struct sk_buff *skb, struct channel *pch)
    void ppp_input(struct ppp_channel *chan, struct sk_buff *skb)
    void ppp_input_error(struct ppp_channel *chan, int code)
    static void ppp_receive_frame(struct ppp *ppp, struct sk_buff *skb, struct channel *pch)
    static void ppp_receive_error(struct ppp *ppp)
    static void ppp_receive_nonmp_frame(struct ppp *ppp, struct sk_buff *skb)
    static struct sk_buff *ppp_decompress_frame(struct ppp *ppp, struct sk_buff *skb)
    
    static void ppp_receive_mp_frame(struct ppp *ppp, struct sk_buff *skb, struct channel *pch)
    static void ppp_mp_insert(struct ppp *ppp, struct sk_buff *skb)
    struct sk_buff *ppp_mp_reconstruct(struct ppp *ppp)
    
  6. A mini-API for access by other kernel modules.
    These are advertised to other kernel modules by include/linux/ppp_channel.h
    int ppp_register_channel(struct ppp_channel *chan)
    int ppp_channel_index(struct ppp_channel *chan)
    int ppp_unit_number(struct ppp_channel *chan)
    void ppp_unregister_channel(struct ppp_channel *chan)
    void ppp_output_wakeup(struct ppp_channel *chan)
    
  7. Compression-related functions.
    static int ppp_set_compress(struct ppp *ppp, unsigned long arg)
    static void ppp_ccp_peek(struct ppp *ppp, struct sk_buff *skb, int inbound)
    static void ppp_ccp_closed(struct ppp *ppp)
    static struct compressor_entry *find_comp_entry(int proto)
    int ppp_register_compressor(struct compressor *cp)
    void ppp_unregister_compressor(struct compressor *cp)
    static struct compressor *find_compressor(int type)
    
  8. Statistics.
    ppp_get_stats() is used by an ioctl call from the IP layer.
    static void ppp_get_stats(struct ppp *ppp, struct ppp_stats *st)
    
  9. Interface functions.
    ppp_create_interface() is called by a /dev/ppp ioctl.
    init_ppp_file() is called by ppp_create_interface() and by ppp_register_channel().
    static struct ppp *ppp_create_interface(int unit, int *retp)
    static void init_ppp_file(struct ppp_file *pf, int kind)
    static void ppp_destroy_interface(struct ppp *ppp)
    
  10. Channel functions.
    ppp_find_unit() and ppp_find_channel() are used by char-dev ioctl calls.
    static struct ppp *ppp_find_unit(int unit)
    static struct channel *ppp_find_channel(int unit)
    static int ppp_connect_channel(struct channel *pch, int unit)
    static int ppp_disconnect_channel(struct channel *pch)
    static void ppp_destroy_channel(struct channel *pch)
    

Load-time entry points in source files drivers/net/*.c.

These are the module init/cleanup points for drivers/net/ppp_generic.c:

module_init(ppp_init);
module_exit(ppp_cleanup);

The file ppp_async.c contains similar module entry points:

module_init(ppp_async_init);
module_exit(ppp_async_cleanup);

The file ppp_deflate.c has this:

module_init(deflate_init);
module_exit(deflate_cleanup);

The file ppp_synctty.c contains this:

module_init(ppp_sync_init);
module_exit(ppp_sync_cleanup);

The module_init(...) lines declare entry points for boot/modload time.
If you type this command line in /usr/src/linux:

fgrep module_ drivers/net/ppp*.c

You get these entry points:

drivers/net/ppp_async.c:module_init(ppp_async_init);
drivers/net/ppp_async.c:module_exit(ppp_async_cleanup);
drivers/net/ppp_deflate.c:module_init(deflate_init);
drivers/net/ppp_deflate.c:module_exit(deflate_cleanup);
drivers/net/ppp_generic.c:module_init(ppp_init);
drivers/net/ppp_generic.c:module_exit(ppp_cleanup);
drivers/net/ppp_synctty.c:module_init(ppp_sync_init);
drivers/net/ppp_synctty.c:module_exit(ppp_sync_cleanup);
drivers/net/pppoe.c:module_init(pppoe_init);
drivers/net/pppoe.c:module_exit(pppoe_exit);
drivers/net/pppox.c:module_init(pppox_init);
drivers/net/pppox.c:module_exit(pppox_exit);

All of the module_init(...) declarations result in boot/load-time invocations.
It seems like `ppp_generic' must be loaded before `ppp_async', because the latter is a client for the former.

Initialization of the `ppp' kernel module.

Initial calling sequence - boot/modload - file ppp_generic.c.

The following is a summary of the initial module-load sequence which occurs either at boot time or when "insmod" or "modprobe" loads the module into memory.

kernel core                     ppp module
===========                     ==========
[modload]                   ==> ppp_init()
devfs_register_chrdev()     <--
devfs_register()            <--


The first entry point of the ppp module is the function `ppp_init', which contains the following lines.

devfs_register_chrdev(PPP_MAJOR, "ppp", &ppp_device_fops);
devfs_handle = devfs_register(NULL, "ppp", DEVFS_FL_DEFAULT,
                              PPP_MAJOR, 0,
                              S_IFCHR | S_IRUSR | S_IWUSR,
                              &ppp_device_fops, NULL);

These functions are imported from `include/linux/devfs_fs_kernel.h'.
`devfs_register_chrdev' and `devfs_register' are defined in `fs/devfs/base.c'.
`devfs_register_chrdev' is defined as follows:

/**
 *      devfs_register_chrdev - Optionally register a conventional character driver.
 *      @major: The major number for the driver.
 *      @name: The name of the driver (as seen in /proc/devices).
 *      @fops: The &file_operations structure pointer.
 *
 *      This function will register a character driver provided the "devfs=only"
 *      option was not provided at boot time.
 *      Returns 0 on success, else a negative error code on failure.
 */

int devfs_register_chrdev (unsigned int major, const char *name,
                           struct file_operations *fops)
{
    if (boot_options & OPTION_ONLY) return 0;
    return register_chrdev (major, name, fops);
}   /*  End Function devfs_register_chrdev  */

An interesting little function!
The `devfs_register' function is too long to quote in full here, so this is just its header comment and prototype:

/**
 *      devfs_register - Register a device entry.
 *      @dir: The handle to the parent devfs directory entry. If this is %NULL the
 *              new name is relative to the root of the devfs.
 *      @name: The name of the entry.
 *      @flags: A set of bitwise-ORed flags (DEVFS_FL_*).
 *      @major: The major number. Not needed for regular files.
 *      @minor: The minor number. Not needed for regular files.
 *      @mode: The default file mode.
 *      @ops: The &file_operations or &block_device_operations structure.
 *              This must not be externally deallocated.
 *      @info: An arbitrary pointer which will be written to the @private_data
 *              field of the &file structure passed to the device driver. You can set
 *              this to whatever you like, and change it once the file is opened (the next
 *              file opened will not see this change).
 *
 *      Returns a handle which may later be used in a call to devfs_unregister().
 *      On failure %NULL is returned.
 */

devfs_handle_t devfs_register (devfs_handle_t dir, const char *name,
                               unsigned int flags,
                               unsigned int major, unsigned int minor,
                               umode_t mode, void *ops, void *info)
/* .... */

Anyway, what this registration does is to register a character device driver with major number 108, and the following `jump table' is registered:

static struct file_operations ppp_device_fops = {
        owner:          THIS_MODULE,
        read:           ppp_read,
        write:          ppp_write,
        poll:           ppp_poll,
        ioctl:          ppp_ioctl,
        open:           ppp_open,
        release:        ppp_release
};

This module will be called back through the `ppp_open' function, which is invoked by the application open() function called for the path `/dev/ppp'.
The above 6 entry points correspond to the application calls read(), write(), select(), ioctl(), open() and close().
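The jump-table idea can be tried out in user space. The sketch below is a made-up miniature, not the kernel's `struct file_operations`: `struct demo_fops`, `generic_open()` and `generic_read()` are hypothetical names. It uses the C99 `.field = value` spelling of designated initializers; the `field: value` form in the kernel excerpt is the older GCC-specific spelling of the same thing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a file-operations table. */
struct demo_fops {
    int (*open)(void);
    int (*release)(void);
    long (*read)(char *buf, size_t count);
};

static int open_calls;   /* counts opens that have not been released */

static int demo_open(void)    { open_calls++; return 0; }
static int demo_release(void) { open_calls--; return 0; }

/* Designated initializers; 'read' is left unset (NULL) on purpose. */
static const struct demo_fops demo_device_fops = {
    .open    = demo_open,
    .release = demo_release,
};

/* What a generic open() path might do: dispatch through the table. */
int generic_open(const struct demo_fops *fops)
{
    assert(fops != NULL);
    return fops->open ? fops->open() : -1;
}

/* Unset entries are NULL, so the caller must check before calling. */
long generic_read(const struct demo_fops *fops, char *buf, size_t count)
{
    assert(fops != NULL);
    return fops->read ? fops->read(buf, count) : -1;
}
```

The caller never names `demo_open()` directly. It only knows the table, which is what lets one generic character-device layer serve many drivers.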

Initial calling sequence - boot/modload - file ppp_async.c

The following is a summary of the initial module-load sequence which occurs either at boot time or when "insmod" or "modprobe" loads the module into memory.

kernel core                     ppp module
===========                     ==========
[modload]                   ==> ppp_async_init()
                                --> tty_register_ldisc(N_PPP, &ppp_ldisc)


The function ppp_async_init() is invoked by the kernel boot/load procedure.
The TTY function tty_register_ldisc() is invoked by ppp_async_init().
The function tty_register_ldisc() is defined in file drivers/char/tty_io.c, which belongs to the kernel's tty layer rather than to the ppp modules.
This function is as follows:

int tty_register_ldisc(int disc, struct tty_ldisc *new_ldisc)
{
        if (disc < N_TTY || disc >= NR_LDISCS)
                return -EINVAL;

        if (new_ldisc) {
                ldiscs[disc] = *new_ldisc;
                ldiscs[disc].flags |= LDISC_FLAG_DEFINED;
                ldiscs[disc].num = disc;
        } else
                memset(&ldiscs[disc], 0, sizeof(struct tty_ldisc));

        return 0;
}
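One detail of the registration function above is easy to miss: the table is copied by value into the slot, so the registered entry does not change if the caller later modifies its own structure. The following user-space sketch shows just that behaviour. `struct demo_ldisc`, `demo_register_ldisc()`, `demo_get_ldisc()` and the constants are hypothetical stand-ins, and the lower bound of 0 is a simplification.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define DEMO_NR_LDISCS     16   /* hypothetical table size */
#define DEMO_FLAG_DEFINED  0x1  /* hypothetical "slot in use" flag */

struct demo_ldisc {
    const char *name;
    int num;
    int flags;
    int (*open)(void);
};

static struct demo_ldisc demo_ldiscs[DEMO_NR_LDISCS];

/* Copy the caller's table into slot 'disc'; NULL clears the slot. */
int demo_register_ldisc(int disc, const struct demo_ldisc *new_ldisc)
{
    if (disc < 0 || disc >= DEMO_NR_LDISCS)
        return -EINVAL;

    if (new_ldisc) {
        demo_ldiscs[disc] = *new_ldisc;               /* struct copy */
        demo_ldiscs[disc].flags |= DEMO_FLAG_DEFINED;
        demo_ldiscs[disc].num = disc;
    } else {
        memset(&demo_ldiscs[disc], 0, sizeof demo_ldiscs[disc]);
    }
    return 0;
}

/* Return the registered entry, or NULL if the slot is empty. */
const struct demo_ldisc *demo_get_ldisc(int disc)
{
    assert(disc >= 0 && disc < DEMO_NR_LDISCS);
    if (demo_ldiscs[disc].flags & DEMO_FLAG_DEFINED)
        return &demo_ldiscs[disc];
    return NULL;
}
```

Registering with a NULL pointer doubles as the unregister operation, which is the same convention as in the kernel function quoted above.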

The function tty_register_ldisc() registers a jump-table of type "tty_ldisc", which is defined in include/linux/tty_ldisc.h as follows:

struct tty_ldisc {
        int     magic;
        char    *name;
        int     num;
        int     flags;
        /*
         * The following routines are called from above.
         */
        int     (*open)(struct tty_struct *);
        void    (*close)(struct tty_struct *);
        void    (*flush_buffer)(struct tty_struct *tty);
        ssize_t (*chars_in_buffer)(struct tty_struct *tty);
        ssize_t (*read)(struct tty_struct * tty, struct file * file,
                        unsigned char * buf, size_t nr);
        ssize_t (*write)(struct tty_struct * tty, struct file * file,
                         const unsigned char * buf, size_t nr);
        int     (*ioctl)(struct tty_struct * tty, struct file * file,
                         unsigned int cmd, unsigned long arg);
        void    (*set_termios)(struct tty_struct *tty, struct termios * old);
        unsigned int (*poll)(struct tty_struct *, struct file *,
                             struct poll_table_struct *);

        /*
         * The following routines are called from below.
         */
        void    (*receive_buf)(struct tty_struct *, const unsigned char *cp,
                               char *fp, int count);
        int     (*receive_room)(struct tty_struct *);
        void    (*write_wakeup)(struct tty_struct *);
};

This has function-pointers for various events including an "open" event, which causes the async channel definition to be registered.
Here are the actual contents of the ldisc jump table in ppp_async.c:

static struct tty_ldisc ppp_ldisc = {
        magic:  TTY_LDISC_MAGIC,
        name:   "ppp",
        open:   ppp_asynctty_open,
        close:  ppp_asynctty_close,
        read:   ppp_asynctty_read,
        write:  ppp_asynctty_write,
        ioctl:  ppp_asynctty_ioctl,
        poll:   ppp_asynctty_poll,
        receive_room: ppp_asynctty_room,
        receive_buf: ppp_asynctty_receive,
        write_wakeup: ppp_asynctty_wakeup,
};


Initialising the asynctty line discipline: ppp_asynctty_open()

The first function in the ppp_async.c module to be called after the boot/load procedure is done must be ppp_asynctty_open() as follows.

/*
 * Called when a tty is put into PPP line discipline.
 */
static int
ppp_asynctty_open(struct tty_struct *tty)
{
        struct asyncppp *ap;
        int err;

        MOD_INC_USE_COUNT;
        err = -ENOMEM;
        ap = kmalloc(sizeof(*ap), GFP_KERNEL);
        if (ap == 0)
                goto out;

        /* initialize the asyncppp structure */
        memset(ap, 0, sizeof(*ap));
        ap->tty = tty;
        ap->mru = PPP_MRU;
        spin_lock_init(&ap->xmit_lock);
        spin_lock_init(&ap->recv_lock);
        ap->xaccm[0] = ~0U;
        ap->xaccm[3] = 0x60000000U;
        ap->raccm = ~0U;
        ap->optr = ap->obuf;
        ap->olim = ap->obuf;
        ap->lcp_fcs = -1;

        ap->chan.private = ap;
        ap->chan.ops = &async_ops;
        ap->chan.mtu = PPP_MRU;
        err = ppp_register_channel(&ap->chan);
        if (err)
                goto out_free;

        tty->disc_data = ap;

        return 0;

 out_free:
        kfree(ap);
 out:
        MOD_DEC_USE_COUNT;
        return err;
}

The function ppp_asynctty_open() increments the module use count, initialises an `asyncppp' structure and then registers the `chan' field of this structure with the `ppp_generic' module by calling the ppp_register_channel() function.
If this succeeds, the use count stays incremented, and this shows up in the `lsmod' command output; on failure the count is decremented again at the `out' label.
Now it just happens that this line occurs in the ppp_generic.c file:

EXPORT_SYMBOL(ppp_register_channel);

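EXPORT_SYMBOL only publishes ppp_register_channel() in the kernel symbol table so that other modules can link against it; it does not load anything by itself.
So ppp_generic must already be loaded when ppp_async is loaded, otherwise the symbol cannot be resolved.
(modprobe takes care of this ordering using the dependency list built by depmod.)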
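The ppp_asynctty_open() function quoted earlier uses the usual kernel idiom of taking a reference first and unwinding with goto labels in reverse order on failure. Here is a small user-space sketch of that idiom only. `demo_use_count` stands in for the module use count, and `demo_register()` with its `should_fail` switch is a hypothetical replacement for the real registration call.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int demo_use_count;        /* stands in for the module use count */

struct demo_state { int registered; };

/* Hypothetical registration step that can be made to fail. */
static int demo_register(struct demo_state *st, int should_fail)
{
    if (should_fail)
        return -EBUSY;
    st->registered = 1;
    return 0;
}

/* Take the reference first, then undo everything in reverse on failure. */
int demo_open(struct demo_state **out, int should_fail)
{
    struct demo_state *st;
    int err;

    assert(out != NULL);
    demo_use_count++;
    err = -ENOMEM;
    st = calloc(1, sizeof(*st));
    if (st == NULL)
        goto out;

    err = demo_register(st, should_fail);
    if (err)
        goto out_free;

    *out = st;
    return 0;

out_free:
    free(st);
out:
    demo_use_count--;
    return err;
}
```

On the failure path the count ends up where it started and nothing is leaked. On success the count stays raised until a matching close, which is not shown here.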

The ppp_register_channel() function in ppp_generic.c, which ppp_asynctty_open() in ppp_async.c calls, is as follows:

/*
 * Create a new, unattached ppp channel.
 */
int
ppp_register_channel(struct ppp_channel *chan)
{
        struct channel *pch;

        pch = kmalloc(sizeof(struct channel), GFP_ATOMIC);
        if (pch == 0)
                return -ENOMEM;
        memset(pch, 0, sizeof(struct channel));
        pch->ppp = NULL;
        pch->chan = chan;
        chan->ppp = pch;
        init_ppp_file(&pch->file, CHANNEL);
        pch->file.hdrlen = chan->hdrlen;
#ifdef CONFIG_PPP_MULTILINK
        pch->lastseq = -1;
#endif /* CONFIG_PPP_MULTILINK */
        spin_lock_init(&pch->downl);
        pch->upl = RW_LOCK_UNLOCKED;
        spin_lock_bh(&all_channels_lock);
        pch->file.index = ++last_channel_index;
        list_add(&pch->file.list, &all_channels);
        spin_unlock_bh(&all_channels_lock);
        MOD_INC_USE_COUNT;
        return 0;
}

This function calls init_ppp_file(), which doesn't do anything really exciting, as follows.

/*
 * Initialize a ppp_file structure.
 */
static void
init_ppp_file(struct ppp_file *pf, int kind)
{
        pf->kind = kind;
        skb_queue_head_init(&pf->xq);
        skb_queue_head_init(&pf->rq);
        atomic_set(&pf->refcnt, 1);
        init_waitqueue_head(&pf->rwait);
}

The main thing to observe here is that the ppp_register_channel() function has set up links between the async channel's `ppp_channel' structure and the ppp_generic.c module's `channel' structure with these lines:

        pch->chan = chan;
        chan->ppp = pch;

The async channel's data is then added to the list `all_channels' for future access.
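The two-way linking can be shown in isolation. In this sketch `struct demo_public` plays the part of `ppp_channel` and `struct demo_private` the part of `channel`; both, and the two functions, are hypothetical. The kernel version uses a doubly linked `list_head` under a spinlock, which is replaced here by a plain singly linked list with no locking.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_private;

/* Public half, owned by the channel driver. */
struct demo_public {
    struct demo_private *ppp;     /* back-pointer to the private half */
};

/* Private half, owned by the generic layer. */
struct demo_private {
    struct demo_public *chan;
    int index;
    struct demo_private *next;    /* simple singly linked list */
};

static struct demo_private *demo_all_channels;
static int demo_last_index;

/* Allocate the private half, cross-link it, and push it on the list. */
int demo_register_channel(struct demo_public *chan)
{
    struct demo_private *pch;

    assert(chan != NULL);
    pch = calloc(1, sizeof(*pch));
    if (pch == NULL)
        return -1;
    pch->chan = chan;
    chan->ppp = pch;
    pch->index = ++demo_last_index;
    pch->next = demo_all_channels;
    demo_all_channels = pch;
    return 0;
}

/* Look a channel up again by its index, as a later ioctl might. */
struct demo_private *demo_find_channel(int index)
{
    for (struct demo_private *p = demo_all_channels; p; p = p->next)
        if (p->index == index)
            return p;
    return NULL;
}
```

Either side can now reach the other through one pointer, and the generic layer can find the channel later by number without the driver being involved.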


Now it seems that there will be 3 jump-tables, each with read(), write(), ioctl() etc. functions.
Consider the write-handlers:

  • The ppp_write() function in structure `ppp_device_fops' handles write-events to the device /dev/ppp.
  • The ppp_start_xmit() function in the net_device structure `dev' handles host-to-network transmission events.
  • The ppp_asynctty_write() function in structure `ppp_ldisc' in the ppp_async.c module handles write-events for the serial (tty) device.

The ppp_write() function accepts raw packets from the pppd program (via /dev/ppp) for transmission on the PPP link.
The ppp_start_xmit() function accepts packets from the network (IP) layer.
The ppp_asynctty_write() function does absolutely nothing.
Here's the proof in ppp_async.c:

/*
 * Read does nothing - no data is ever available this way.
 * Pppd reads and writes packets via /dev/ppp instead.
 */
static ssize_t
ppp_asynctty_read(struct tty_struct *tty, struct file *file,
                  unsigned char *buf, size_t count)
{
        return -EAGAIN;
}

/*
 * Write on the tty does nothing, the packets all come in
 * from the ppp generic stuff.
 */
static ssize_t
ppp_asynctty_write(struct tty_struct *tty, struct file *file,
                   const unsigned char *buf, size_t count)
{
        return -EAGAIN;
}

So it seems that there are really only two ways to send data `down' through the PPP kernel module: either as a raw packet via /dev/ppp or as standard IP traffic through the ppp network interface.
Since the ppp_write() function is only a raw packet interface, the function to concentrate on is ppp_start_xmit().

Opening the /dev/ppp device by pppd.


Initialising an interface: ppp_open() and ppp_ioctl()

The following calls occur when opening an interface.

[pppd]                       => ppp_device_fops->open   [function-pointer]
                            ==> ppp_open()

[pppd]                       => ppp_device_fops->ioctl  [function-pointer]
                            ==> ppp_ioctl(PPPIOCNEWUNIT, unit)
                                --> ppp_unattached_ioctl(PPPIOCNEWUNIT, unit)
                                    --> ppp_create_interface(unit)
                                        --> init_ppp_file()
register_netdevice()                    <--

[pppd]                       => ppp_device_fops->ioctl  [function-pointer]
                            ==> ppp_ioctl(PPPIOCATTCHAN, unit)
                                --> ppp_find_channel(unit)


The ppp_open() function just tests the caller permissions as follows:

        if (!capable(CAP_NET_ADMIN))
                return -EPERM;
        return 0;

The ppp_ioctl() function - the initialisation sequence.

The ppp_ioctl() function is where the ppp module is controlled by the pppd program.
This function is too large to quote in full here.

static int ppp_ioctl(struct inode *inode, struct file *file,
                     unsigned int cmd, unsigned long arg)
{
        struct ppp_file *pf = (struct ppp_file *) file->private_data;
        /* ... */

        if (pf == 0)
                return ppp_unattached_ioctl(pf, file, cmd, arg);
        /* ... */

As above, the first thing to happen in ppp_ioctl() is to check whether this open file has been attached to a ppp unit or channel yet, i.e. whether file->private_data is still null.
If it has not, the function ppp_unattached_ioctl() is invoked.
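This two-stage dispatch, first on whether private_data is set and then on the command, can be modelled in a few lines of user-space C. Everything here is hypothetical: the command numbers `DEMO_NEWUNIT` and `DEMO_GETUNIT`, the four-entry unit table, and the error codes chosen for unknown commands.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum { DEMO_NEWUNIT = 1, DEMO_GETUNIT = 2 };   /* hypothetical commands */

struct demo_unit { int index; };

/* Stands in for struct file: private_data is NULL until attached. */
struct demo_file { void *private_data; };

static struct demo_unit demo_units[4];

/* Commands that are valid before the file refers to any unit. */
static int demo_unattached_ioctl(struct demo_file *f, int cmd, int *arg)
{
    if (cmd != DEMO_NEWUNIT)
        return -ENOTTY;
    if (*arg < 0 || *arg >= 4)
        return -EINVAL;
    demo_units[*arg].index = *arg;
    f->private_data = &demo_units[*arg];
    return 0;
}

/* First check attachment, then dispatch on the command. */
int demo_ioctl(struct demo_file *f, int cmd, int *arg)
{
    struct demo_unit *u;

    assert(f != NULL && arg != NULL);
    u = f->private_data;
    if (u == NULL)
        return demo_unattached_ioctl(f, cmd, arg);

    switch (cmd) {
    case DEMO_GETUNIT:
        *arg = u->index;
        return 0;
    default:
        return -ENOTTY;
    }
}
```

The same file descriptor therefore understands a different set of commands before and after it has been bound to a unit.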
Function ppp_unattached_ioctl() contains the following switch statement:

        switch (cmd) {
        case PPPIOCNEWUNIT:
                /* Create a new ppp unit */
                /* ... */
                break;
        case PPPIOCATTACH:
                /* Attach to an existing ppp unit */
                /* ... */
                break;
        case PPPIOCATTCHAN:
                /* ... */
                break;

The first case is as follows:

        case PPPIOCNEWUNIT:
                /* Create a new ppp unit */
                if (get_user(unit, (int *) arg))
                        break;
                ppp = ppp_create_interface(unit, &err);
                if (ppp == 0)
                        break;
                file->private_data = &ppp->file;
                err = -EFAULT;
                if (put_user(ppp->file.index, (int *) arg))
                        break;
                err = 0;
                break;

In the above code, the integer unit number `unit' is fetched from the caller-indicated address `arg'.
This is used in a call to ppp_create_interface(), which is summarised as follows [abbreviated code, like most in this web page]:

/*
 * Create a new ppp interface unit.  Fails if it can't allocate memory
 * or if there is already a unit with the requested number.
 * unit == -1 means allocate a new number.
 */
static struct ppp *
ppp_create_interface(int unit, int *retp)
{
        struct ppp *ppp;
        struct net_device *dev;

        /* Check to see if the unit is allocated. If it is, return it. */
        /* ... */

        /* Create a new ppp structure and link it before `list'. */
        ppp = kmalloc(sizeof(struct ppp), GFP_ATOMIC);
        memset(ppp, 0, sizeof(struct ppp));
        dev = kmalloc(sizeof(struct net_device), GFP_ATOMIC);
        memset(dev, 0, sizeof(struct net_device));

        ppp->file.index = unit;
        ppp->mru = PPP_MRU;
        init_ppp_file(&ppp->file, INTERFACE);
        ppp->file.hdrlen = PPP_HDRLEN - 2;      /* don't count proto bytes */
        for (i = 0; i < NUM_NP; ++i)
                ppp->npmode[i] = NPMODE_PASS;
        INIT_LIST_HEAD(&ppp->channels);
        spin_lock_init(&ppp->rlock);
        spin_lock_init(&ppp->wlock);

        ppp->dev = dev;
        dev->init = ppp_net_init;
        sprintf(dev->name, "ppp%d", unit);
        dev->priv = ppp;
        dev->features |= NETIF_F_DYNALLOC;

        ret = register_netdevice(dev);
        list_add(&ppp->file.list, list->prev);

        *retp = ret;
        return ppp;
}

The above (abbreviated) code means that two structures are allocated: a `struct ppp' and a `struct net_device'.
The `struct ppp' is for use by the ppp kernel module.
The `struct net_device *dev' is stored and used by the network layer.
Both structures are initialised to zero.
Various pieces of data are set in the `ppp' structure.
The initialisation function-pointer `init' of `dev' is set to `ppp_net_init', which will be called by the network layer to initialise the device.
Then both `ppp' and `dev' are linked to each other so that they can be correlated later.
The `dev' structure is registered with the internet network layer.
The `ppp' structure is linked into a list by the list_add() call on ppp->file.list, it is returned by the function, and it is linked by `dev'.

The ppp_net_init() function is called by the network layer to initialise the net_device structure which was registered by the ppp module.

static int
ppp_net_init(struct net_device *dev)
{
        dev->hard_header_len = PPP_HDRLEN;
        dev->mtu = PPP_MTU;
        dev->hard_start_xmit = ppp_start_xmit;
        dev->get_stats = ppp_net_stats;
        dev->do_ioctl = ppp_net_ioctl;
        dev->addr_len = 0;
        dev->tx_queue_len = 3;
        dev->type = ARPHRD_PPP;
        dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;
        return 0;
}

The above event-handler functions and data are used by the network layer to know what to do with packets etc.
For example, the above fields are used by `ifconfig' for configuring the interface.
The ppp_start_xmit() function is called for each individual IP packet which must be transmitted by the ppp module.
The ppp_net_ioctl() function is called by network device ioctl() calls, which are different to the character device ioctl() calls.
The ppp_net_stats() function is called by the network layer to fetch stats on the ppp link from the ppp driver.

Outgoing packet handling.

To understand how the TX packet stream works, it is useful to trace all of the TX functions to their callers as follows.

Citation trees for TX packet stream functions.

The TX stream functions in the module ppp_generic.c include the following:

static void ppp_xmit_process(struct ppp *ppp)
static void ppp_send_frame(struct ppp *ppp, struct sk_buff *skb)
static void ppp_push(struct ppp *ppp)
static int ppp_mp_explode(struct ppp *ppp, struct sk_buff *skb)
static void ppp_channel_push(struct channel *pch)

Here are some citation (reverse invocation) trees.

ppp_send_frame
    ppp_xmit_process
        ppp_start_xmit
            IP-layer write
        ppp_file_write(INTERFACE)
            ppp_write(INTERFACE)
                /dev/ppp
                    pppd
        ppp_channel_push
            ppp_file_write(CHANNEL)
                ppp_write(CHANNEL)
                    /dev/ppp
                        pppd
            ppp_output_wakeup
                [ppp_async.c] ppp_asynctty_wakeup
                    [ppp_async.c] ppp_ldisc->write_wakeup
                [ppp_async.c] ppp_async_flush_output
                    [ppp_async.c] ppp_asynctty_ioctl
                        [ppp_async.c] ppp_ldisc->ioctl
ppp_push
    ppp_xmit_process
        =as above=
    ppp_send_frame
        =as above=

ppp_mp_explode
    ppp_push
        =as above=

To really understand how the TX queueing structures operate, it is best to do the citation trees for the TX queueing objects rather than functions.
But the functions give the clue as to where the queueing objects are.
One of the TX buffer structures is the pointer ppp->xmit_pending in ppp_send_frame(), which points to a single TX packet.
In ppp_channel_push(), there is a TX queueing structure called pch->file.xq.
This is a packet queue associated with the pppd process, which can send packets by writing to /dev/ppp; such a write invokes the ppp_file_write() function via ppp_write().
The interface unit has an `xq' queue of its own (ppp->file.xq), which is accessed by ppp_xmit_process() and ppp_start_xmit() on behalf of the IP layer.


Here is an invocation tree for ppp_start_xmit(), ignoring standard system calls, resource locking procedures and compression routines.
(It is assumed here that the channel module is ppp_async.c.)

ppp_start_xmit
    ppp_xmit_process
        ppp_push
            pch->chan->ops->start_xmit
                [ppp_async.c] ppp_async_send
                    [ppp_async.c] ppp_async_push
                        [tty] tty->driver.write
                        [ppp_async.c] ppp_async_encode
                            [ppp_async.c] async_lcp_peek
            ppp_mp_explode (only for MULTILINK)
        ppp_send_frame
            ppp_push
                =as above=

The IP layer invokes ppp_start_xmit() to send a single IP packet, and this function sends the packet to the ppp_async.c module, which in turn sends the packet to the TTY device driver via the function tty->driver.write().


Next, the accesses of the TX queueing structures must be traced for the functions invoked via ppp_start_xmit().
Each IP packet sent via a PPP link must pass through the ppp_start_xmit() function in an sk_buff structure as parameter `skb' of ppp_start_xmit().
The important 3 lines of ppp_start_xmit() are the following:

        netif_stop_queue(dev);
        skb_queue_tail(&ppp->file.xq, skb);
        ppp_xmit_process(ppp);

This means that:

  1. the IP layer is (presumably) stopped from sending anything further on the PPP link,
  2. the packet is appended to the TX packet queue ppp->file.xq, and
  3. the function ppp_xmit_process() is invoked.

The ppp_xmit_process() function is as follows:

static void
ppp_xmit_process(struct ppp *ppp)
{
        struct sk_buff *skb;

        ppp_xmit_lock(ppp);
        ppp_push(ppp);
        while (ppp->xmit_pending == 0
               && (skb = skb_dequeue(&ppp->file.xq)) != 0)
                ppp_send_frame(ppp, skb);
        /* If there's no work left to do, tell the core net
           code that we can accept some more. */
        if (ppp->xmit_pending == 0 && skb_peek(&ppp->file.xq) == 0
            && ppp->dev != 0)
                netif_wake_queue(ppp->dev);
        ppp_xmit_unlock(ppp);
}

This does the following:

  1. invokes a ppp_xmit_lock() function (presumably to lock out other CPUs on an SMP system),
  2. invokes ppp_push(),
  3. while no TX packet is "pending" and the `xq' queue is non-empty, invoke ppp_send_frame() to send the next packet in `xq',
  4. if no TX packet is pending and the `xq' queue is empty, wake up the IP layer (presumably from the netif_stop_queue() call),
  5. undo the ppp_xmit_lock().
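
The steps above can be sketched as a small user-space model.
This is not kernel code: packets are plain integers, the `xq' queue is a fixed array, locking is left out, and all names (model_ppp, channel_ready and so on) are made up for illustration.
The channel is assumed to hold one packet at a time.

```c
#include <assert.h>
#include <stddef.h>

#define XQ_CAP 8                 /* hypothetical capacity of the model queue */

struct model_ppp {
    int xq[XQ_CAP];              /* stand-in for ppp->file.xq (packet ids) */
    size_t xq_head, xq_len;
    int xmit_pending;            /* 0 = empty slot, else a packet id */
    int queue_stopped;           /* stand-in for __LINK_STATE_XOFF */
    int channel_ready;           /* 1 if the channel accepts a packet now */
    int sent[XQ_CAP];            /* packets handed to the channel, in order */
    size_t nsent;
};

/* Model of ppp_push(): offer the pending packet to the channel. */
static void model_push(struct model_ppp *p)
{
    if (p->xmit_pending == 0)
        return;
    if (p->channel_ready) {
        assert(p->nsent < XQ_CAP);
        p->sent[p->nsent++] = p->xmit_pending;
        p->xmit_pending = 0;
        p->channel_ready = 0;    /* the channel holds one packet at a time */
    }
}

/* Model of ppp_send_frame(): park the packet in the slot, then push. */
static void model_send_frame(struct model_ppp *p, int pkt)
{
    p->xmit_pending = pkt;
    model_push(p);
}

/* Model of ppp_xmit_process(), without the locking. */
static void model_xmit_process(struct model_ppp *p)
{
    model_push(p);
    while (p->xmit_pending == 0 && p->xq_len > 0) {
        int pkt = p->xq[p->xq_head];
        p->xq_head = (p->xq_head + 1) % XQ_CAP;
        p->xq_len--;
        model_send_frame(p, pkt);
    }
    if (p->xmit_pending == 0 && p->xq_len == 0)
        p->queue_stopped = 0;    /* like netif_wake_queue() */
}

/* Model of the last three lines of ppp_start_xmit(). */
static void model_start_xmit(struct model_ppp *p, int pkt)
{
    assert(pkt != 0 && p->xq_len < XQ_CAP);
    p->queue_stopped = 1;        /* like netif_stop_queue() */
    p->xq[(p->xq_head + p->xq_len) % XQ_CAP] = pkt;
    p->xq_len++;
    model_xmit_process(p);
}
```

In this model, a packet offered while the channel is busy stays in xmit_pending and the queue stays stopped until model_xmit_process() is run again after the channel becomes ready.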

The ppp_push() function essentially just calls the function ppp_async_send() in ppp_async.c (in the case that this is the kind of channel being used).
The ppp_async_send() function essentially just calls ppp_async_push() once or twice.
ppp_async_push() contains the following loop:

        for (;;) {
                if (test_and_clear_bit(XMIT_WAKEUP, &ap->xmit_flags))
                        tty_stuffed = 0;
                if (!tty_stuffed && ap->optr < ap->olim) {
                        avail = ap->olim - ap->optr;
                        set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
                        sent = tty->driver.write(tty, 0, ap->optr, avail);
                        if (sent < 0)
                                goto flush;     /* error, e.g. loss of CD */
                        ap->optr += sent;
                        if (sent < avail)
                                tty_stuffed = 1;
                        continue;
                }
                if (ap->optr >= ap->olim && ap->tpkt != 0) {
                        if (ppp_async_encode(ap)) {
                                /* finished processing ap->tpkt */
                                clear_bit(XMIT_FULL, &ap->xmit_flags);
                                done = 1;
                        }
                        continue;
                }
                /*
                 * We haven't made any progress this time around.
                 * Clear XMIT_BUSY to let other callers in, but
                 * after doing so we have to check if anyone set
                 * XMIT_WAKEUP since we last checked it.  If they
                 * did, we should try again to set XMIT_BUSY and go
                 * around again in case XMIT_BUSY was still set when
                 * the other caller tried.
                 */
                clear_bit(XMIT_BUSY, &ap->xmit_flags);
                /* any more work to do? if not, exit the loop */
                if (!(test_bit(XMIT_WAKEUP, &ap->xmit_flags)
                      || (!tty_stuffed && ap->tpkt != 0)))
                        break;
                /* more work to do, see if we can do it now */
                if (test_and_set_bit(XMIT_BUSY, &ap->xmit_flags))
                        break;
        }

This loop sends sequences of bytes to the TTY kernel module.
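The loop exits when one of the following holds:

  • an error is encountered when sending bytes to tty->driver.write() (the jump to `flush'),
  • a pass makes no progress and there is no more work to do, i.e. XMIT_WAKEUP has not been set and either the TTY module can't accept any more bytes right now or there is no packet `tpkt' waiting, or
  • there is more work to do, but another caller has set XMIT_BUSY in the meantime and will do that work instead.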

(I don't really understand this loop.)
The variable `done' which is returned by ppp_async_push() equals 1 if the current packet to be sent (`tpkt') has been fully encoded into bytes for transmission.
So `done' does not indicate that all bytes have been sent to the TTY module, but only that the `tpkt' packet has been fully encoded into a byte buffer for transmission, which implies that another such packet may be queued for encoding.
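
To get a feel for the XMIT_BUSY and XMIT_WAKEUP handshake, here is a simplified user-space model of the loop using C11 atomics.
It is only a sketch under assumptions: the encoding step is left out, the model TTY is a byte counter, all names (model_async, m_push and so on) are made up, and the initial test-and-set of the busy flag on entry is my assumption about what happens before the quoted loop.

```c
#include <assert.h>
#include <stdatomic.h>

enum { M_WAKEUP = 1u << 0, M_BUSY = 1u << 1 };   /* model flag bits */

struct model_async {
    atomic_uint flags;      /* stand-in for ap->xmit_flags */
    int buffered;           /* encoded bytes waiting, like olim - optr */
    int tty_room;           /* bytes the model TTY will still accept */
    int passes;             /* loop iterations, for illustration only */
};

static int m_test_and_set(struct model_async *a, unsigned bit)
{
    return (atomic_fetch_or(&a->flags, bit) & bit) != 0;
}

static int m_test_and_clear(struct model_async *a, unsigned bit)
{
    return (atomic_fetch_and(&a->flags, ~bit) & bit) != 0;
}

/* Model TTY write: accepts at most tty_room bytes and never blocks. */
static int m_tty_write(struct model_async *a, int count)
{
    int n = count < a->tty_room ? count : a->tty_room;
    a->tty_room -= n;
    return n;
}

/* Returns 1 if this caller ran the loop, 0 if another caller was busy. */
static int m_push(struct model_async *a)
{
    int tty_stuffed = 0;

    if (m_test_and_set(a, M_BUSY))      /* assumed entry check */
        return 0;
    for (;;) {
        a->passes++;
        if (m_test_and_clear(a, M_WAKEUP))
            tty_stuffed = 0;
        if (!tty_stuffed && a->buffered > 0) {
            int sent = m_tty_write(a, a->buffered);
            a->buffered -= sent;
            if (a->buffered > 0)
                tty_stuffed = 1;        /* partial write: TTY is full */
            continue;
        }
        /* No progress: release the loop, then re-check for wakeups. */
        atomic_fetch_and(&a->flags, ~(unsigned)M_BUSY);
        if (!((atomic_load(&a->flags) & M_WAKEUP)
              || (!tty_stuffed && a->buffered > 0)))
            break;
        if (m_test_and_set(a, M_BUSY))
            break;
    }
    return 1;
}
```

The point of the model is that a partial write leaves the remaining bytes buffered without waiting, and a later wakeup (setting M_WAKEUP and calling m_push() again) is what gets them moving.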

The function call:

sent = tty->driver.write(tty, 0, ap->optr, avail);

in function ppp_async_push() is declared in include/linux/tty_driver.h as part of the tty_driver struct declaration as follows:

        int  (*write)(struct tty_struct * tty, int from_user,
                      const unsigned char *buf, int count);

It's not clear whether ppp_async_push() could block while waiting to send bytes to the TTY module or not.
But certainly the buffer capacity implied here is only a single packet.


Now returning to the netif_stop_queue() call in ppp_start_xmit(), here are the relevant definitions from include/linux/netdevice.h:

static inline void netif_stop_queue(struct net_device *dev)
{
        set_bit(__LINK_STATE_XOFF, &dev->state);
}

static inline int netif_queue_stopped(struct net_device *dev)
{
        return test_bit(__LINK_STATE_XOFF, &dev->state);
}

Here is the relevant section of net/core/dev.c

if (dev->flags&IFF_UP) {
        int cpu = smp_processor_id();

        if (dev->xmit_lock_owner != cpu) {
                spin_unlock(&dev->queue_lock);
                spin_lock(&dev->xmit_lock);
                dev->xmit_lock_owner = cpu;

                if (!netif_queue_stopped(dev)) {
                        if (netdev_nit)
                                dev_queue_xmit_nit(skb,dev);

                        if (dev->hard_start_xmit(skb, dev) == 0) {
                                dev->xmit_lock_owner = -1;
                                spin_unlock_bh(&dev->xmit_lock);
                                return 0;
                        }
                }
                dev->xmit_lock_owner = -1;
                spin_unlock_bh(&dev->xmit_lock);
                if (net_ratelimit())
                        printk(KERN_DEBUG "Virtual device %s asks to queue packet!\n", dev->name);
                kfree_skb(skb);
                return -ENETDOWN;
        } else {
                /* Recursion is detected! It is possible, unfortunately */
                if (net_ratelimit())
                        printk(KERN_DEBUG "Dead loop on virtual device %s, fix it urgently!\n", dev->name);
        }
}

What this shows is that, yes indeed, all packets are discarded by the network layer if the network device is stopped.
Therefore all packets apart from the 1-packet buffer, and a few bytes in the TTY module, are discarded under overload conditions.
Actually there seem to be two 1-packet buffers, namely the ppp_async.c buffer called `tpkt' and the ppp_generic.c buffer called ppp->xmit_pending.
However, it seems like the /dev/ppp access for `pppd' puts packets into the `xq' buffer, which has a much larger capacity.
Maybe the `xq' buffer could be made available for network packets also, rather than creating a new buffer for increased queueing capacity.


The full ppp_start_xmit() function is as follows:

static int
ppp_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
        struct ppp *ppp = (struct ppp *) dev->priv;
        int npi, proto;
        unsigned char *pp;

        npi = ethertype_to_npindex(ntohs(skb->protocol));
        if (npi < 0)
                goto outf;

        /* Drop, accept or reject the packet */
        switch (ppp->npmode[npi]) {
        case NPMODE_PASS:
                break;
        case NPMODE_QUEUE:
                /* it would be nice to have a way to tell the network
                   system to queue this one up for later. */
                goto outf;
        case NPMODE_DROP:
        case NPMODE_ERROR:
                goto outf;
        }

        /* Put the 2-byte PPP protocol number on the front,
           making sure there is room for the address and control fields. */
        if (skb_headroom(skb) < PPP_HDRLEN) {
                struct sk_buff *ns;

                ns = alloc_skb(skb->len + dev->hard_header_len, GFP_ATOMIC);
                if (ns == 0)
                        goto outf;
                skb_reserve(ns, dev->hard_header_len);
                memcpy(skb_put(ns, skb->len), skb->data, skb->len);
                kfree_skb(skb);
                skb = ns;
        }
        pp = skb_push(skb, 2);
        proto = npindex_to_proto[npi];
        pp[0] = proto >> 8;
        pp[1] = proto;

        netif_stop_queue(dev);
        skb_queue_tail(&ppp->file.xq, skb);
        ppp_xmit_process(ppp);
        return 0;

 outf:
        kfree_skb(skb);
        ++ppp->stats.tx_dropped;
        return 0;
}

The above code does the following things:
Retrieve the `ppp' structure from the given `dev' structure.
Determine the network layer protocol `npi' of the offered outgoing packet from the network layer.
The protocol can be IP (IP version 4), IPV6, IPX or AT (Appletalk).
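Each protocol has a mode, which is one of NPMODE_PASS, NPMODE_QUEUE, NPMODE_DROP and NPMODE_ERROR; in this function only NPMODE_PASS lets the packet through, and all of the other modes drop it.
(These modes are set or fetched with the PPPIOCSNPMODE and PPPIOCGNPMODE ioctls in ppp_ioctl(), which is the char device driver ioctl() handler.)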
Then two bytes are prepended to the IP packet for the PPP link layer.
These two bytes indicate to the other side of the link which network protocol is being sent.
The network layer is blocked from accessing the device `dev'.
The outgoing packet-buffer `skb' is enqueued in the queue ppp->file.xq - a transmission queue.
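
The prepending of the two protocol bytes can be illustrated with a small user-space sketch.
The buffer type and the demo_* names are made up for illustration (this is not the sk_buff API), and the 4 bytes of headroom are an assumption chosen to leave room for address, control and protocol fields.
The protocol numbers are the standard PPP values for the four network protocols mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard PPP protocol numbers for IP, Appletalk, IPX and IPv6. */
enum { DEMO_PPP_IP = 0x0021, DEMO_PPP_AT = 0x0029,
       DEMO_PPP_IPX = 0x002b, DEMO_PPP_IPV6 = 0x0057 };

#define DEMO_HEADROOM 4   /* hypothetical headroom in front of the payload */

struct demo_buf {
    uint8_t storage[DEMO_HEADROOM + 64];
    uint8_t *data;        /* start of valid data, like skb->data */
    size_t len;           /* valid length, like skb->len */
};

/* Place a payload after the reserved headroom. */
static void demo_fill(struct demo_buf *b, const void *payload, size_t n)
{
    assert(n <= sizeof b->storage - DEMO_HEADROOM);
    b->data = b->storage + DEMO_HEADROOM;
    b->len = n;
    memcpy(b->data, payload, n);
}

/* Grow the valid data at the front by n bytes, in the manner of skb_push(). */
static uint8_t *demo_push(struct demo_buf *b, size_t n)
{
    assert((size_t)(b->data - b->storage) >= n);
    b->data -= n;
    b->len += n;
    return b->data;
}

/* Prepend the 2-byte protocol number, most significant byte first. */
static void demo_add_proto(struct demo_buf *b, unsigned proto)
{
    uint8_t *pp = demo_push(b, 2);
    pp[0] = (uint8_t)(proto >> 8);
    pp[1] = (uint8_t)(proto & 0xff);
}
```

For an IP packet, the frame handed on therefore starts with the bytes 0x00 0x21, followed by the original IP header.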
Here is the ppp_file structure declaration:

/*
 * An instance of /dev/ppp can be associated with either a ppp
 * interface unit or a ppp channel.  In both cases, file->private_data
 * points to one of these.
 */
struct ppp_file {
        enum {
                INTERFACE=1, CHANNEL
        }               kind;
        struct sk_buff_head xq;         /* pppd transmit queue */
        struct sk_buff_head rq;         /* receive queue for pppd */
        wait_queue_head_t rwait;        /* for poll on reading /dev/ppp */
        atomic_t        refcnt;         /* # refs (incl /dev/ppp attached) */
        int             hdrlen;         /* space to leave for headers */
        struct list_head list;          /* link in all_* list */
        int             index;          /* interface unit / channel number */
};

Then the function ppp_xmit_process() is called to deal with the packet as follows:

/*
 * Called to do any work queued up on the transmit side
 * that can now be done.
 */
static void
ppp_xmit_process(struct ppp *ppp)
{
        struct sk_buff *skb;

        ppp_xmit_lock(ppp);
        ppp_push(ppp);
        while (ppp->xmit_pending == 0
               && (skb = skb_dequeue(&ppp->file.xq)) != 0)
                ppp_send_frame(ppp, skb);
        /* If there's no work left to do, tell the core net
           code that we can accept some more. */
        if (ppp->xmit_pending == 0 && skb_peek(&ppp->file.xq) == 0
            && ppp->dev != 0)
                netif_wake_queue(ppp->dev);
        ppp_xmit_unlock(ppp);
}

The above function invokes the following ppp_push() function.

ppp_push() function.
/*
 * Try to send the frame in xmit_pending.
 * The caller should have the xmit path locked.
 */
static void
ppp_push(struct ppp *ppp)
{
        struct list_head *list;
        struct channel *pch;
        struct sk_buff *skb = ppp->xmit_pending;

        if (skb == 0)
                return;

        list = &ppp->channels;
        if (list_empty(list)) {
                /* nowhere to send the packet, just drop it */
                ppp->xmit_pending = 0;
                kfree_skb(skb);
                return;
        }

        if ((ppp->flags & SC_MULTILINK) == 0) {
                /* not doing multilink: send it down the first channel */
                list = list->next;
                pch = list_entry(list, struct channel, clist);

                spin_lock_bh(&pch->downl);
                if (pch->chan) {
                        if (pch->chan->ops->start_xmit(pch->chan, skb))
                                ppp->xmit_pending = 0;
                } else {
                        /* channel got unregistered */
                        kfree_skb(skb);
                        ppp->xmit_pending = 0;
                }
                spin_unlock_bh(&pch->downl);
                return;
        }

#ifdef CONFIG_PPP_MULTILINK
        /* Multilink: fragment the packet over as many links
           as can take the packet at the moment. */
        if (!ppp_mp_explode(ppp, skb))
                return;
#endif /* CONFIG_PPP_MULTILINK */

        ppp->xmit_pending = 0;
        kfree_skb(skb);
}

The ppp_push() function checks to see if a packet is `pending' by looking at the field ppp->xmit_pending, which is a packet-buffer-pointer.
If there is no packet pending, then ppp_push() does nothing.
If no channels are open, then ppp_push() drops the pending packet.
If the pending packet is sent at all, it is sent via the function pch->chan->ops->start_xmit() which is provided by the kernel module for the channel.
For example, the start_xmit function for the ppp_async.c module is ppp_async_send().
The `ops' field is a pointer to a ppp_channel_ops structure which has different versions according to the kind of PPP channel being used.
Here are the relevant declarations in include/linux/ppp_channel.h:

struct ppp_channel;

struct ppp_channel_ops {
        /* Send a packet (or multilink fragment) on this channel.
           Returns 1 if it was accepted, 0 if not. */
        int     (*start_xmit)(struct ppp_channel *, struct sk_buff *);
        /* Handle an ioctl call that has come in via /dev/ppp. */
        int     (*ioctl)(struct ppp_channel *, unsigned int, unsigned long);
};

struct ppp_channel {
        void            *private;       /* channel private data */
        struct ppp_channel_ops *ops;    /* operations for this channel */
        int             mtu;            /* max transmit packet size */
        int             hdrlen;         /* amount of headroom channel needs */
        void            *ppp;           /* opaque to channel */
        /* the following are not used at present */
        int             speed;          /* transfer rate (bytes/second) */
        int             latency;        /* overhead time in milliseconds */
};


The actual function pointers of ppp_channel_ops structures are defined in ppp_async.c (structure `async_ops'), ppp_synctty.c (structure `sync_ops') and pppoe.c (structure `pppoe_ops').
As an example, the async_ops structure is as follows:

struct ppp_channel_ops async_ops = {
        ppp_async_send,
        ppp_async_ioctl
};
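
The way the generic code reaches a channel-specific function through `ops' can be sketched in user space as follows.
The demo_* types and the two channel kinds are invented for illustration; only the shape (a private pointer plus a table of function pointers, and a start_xmit that returns 1 for accepted and 0 for refused) follows the declarations above.

```c
#include <assert.h>
#include <stddef.h>

struct demo_channel;

struct demo_channel_ops {
    /* Returns 1 if the packet was accepted, 0 if not. */
    int (*start_xmit)(struct demo_channel *chan, int pkt);
};

struct demo_channel {
    void *priv;                          /* channel private data */
    const struct demo_channel_ops *ops;  /* operations for this channel */
};

/* A channel kind with a one-packet slot, loosely like ppp_async.c. */
struct demo_slot { int pkt; };           /* 0 means the slot is empty */

static int demo_slot_send(struct demo_channel *chan, int pkt)
{
    struct demo_slot *s = chan->priv;
    if (s->pkt != 0)
        return 0;                        /* slot taken: refuse */
    s->pkt = pkt;
    return 1;
}

/* A channel kind that accepts everything and just counts packets. */
static int demo_count_send(struct demo_channel *chan, int pkt)
{
    (void)pkt;
    ++*(int *)chan->priv;
    return 1;
}

static const struct demo_channel_ops demo_slot_ops  = { demo_slot_send };
static const struct demo_channel_ops demo_count_ops = { demo_count_send };

/* Generic caller, like the non-multilink branch of ppp_push(). */
static void demo_push(struct demo_channel *chan, int *xmit_pending)
{
    assert(chan != NULL && chan->ops != NULL);
    if (*xmit_pending == 0)
        return;
    if (chan->ops->start_xmit(chan, *xmit_pending))
        *xmit_pending = 0;               /* accepted: slot is free again */
}
```

The caller never needs to know which kind of channel it is talking to; it only looks at the return value to decide whether the packet is still pending.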

So the pch->chan->ops->start_xmit() call would then invoke the function ppp_async_send(), which is as follows:

ppp_async_send() function.
/*
 * Send a packet to the peer over an async tty line.
 * Returns 1 iff the packet was accepted.
 * If the packet was not accepted, we will call ppp_output_wakeup
 * at some later time.
 */
static int
ppp_async_send(struct ppp_channel *chan, struct sk_buff *skb)
{
        struct asyncppp *ap = chan->private;

        ppp_async_push(ap);

        if (test_and_set_bit(XMIT_FULL, &ap->xmit_flags))
                return 0;       /* already full */
        ap->tpkt = skb;
        ap->tpkt_pos = 0;

        ppp_async_push(ap);
        return 1;
}

The ppp_async_push() function tries to send data via the serial device driver.
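
The one-packet admission logic of ppp_async_send() can be modelled with a C11 atomic_flag in place of the XMIT_FULL bit.
This is a sketch only: packets are integers, the slot_* names are invented, and slot_encoded() stands for the point in ppp_async_push() where the packet has been fully encoded and XMIT_FULL is cleared.

```c
#include <assert.h>
#include <stdatomic.h>

struct slot_async {
    atomic_flag xmit_full;   /* stand-in for the XMIT_FULL bit */
    int tpkt;                /* packet id in the slot; 0 means none */
};

static void slot_init(struct slot_async *ap)
{
    atomic_flag_clear(&ap->xmit_full);
    ap->tpkt = 0;
}

/* Model of ppp_async_send(): 1 if accepted, 0 if the slot is taken. */
static int slot_send(struct slot_async *ap, int pkt)
{
    assert(pkt != 0);
    if (atomic_flag_test_and_set(&ap->xmit_full))
        return 0;            /* already full */
    ap->tpkt = pkt;
    return 1;
}

/* Release the slot once the packet has been fully encoded. */
static int slot_encoded(struct slot_async *ap)
{
    int pkt = ap->tpkt;
    ap->tpkt = 0;
    atomic_flag_clear(&ap->xmit_full);
    return pkt;
}
```

A second slot_send() before slot_encoded() is refused, which corresponds to ppp_async_send() returning 0 and the packet staying in ppp->xmit_pending.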


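When ppp_push() returns, the xmit_pending packet has been cleared in every case except two: the channel refused the packet (start_xmit() returned 0), or, with multilink, ppp_mp_explode() returned 0.
In those two cases the packet stays in ppp->xmit_pending and is offered again by the next ppp_push() call.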

Locking, blocking and timing of the TX packet stream

As noted above, an outgoing (host->network) packet passes through multiple stretches of code on its way to the network.
The portion of the TX packet path of interest here has these 3 interfaces:

  1. IP layer offers packet to PPP module (ppp_generic.c).
  2. ppp_generic.c module offers packet to ppp_async.c module.
  3. ppp_async.c module feeds packet to TTY module.

At each interface, as in the case of applications sending data or packets via sockets, a blocking or non-blocking style of interaction may be used.
The main question here is whether the interaction style is blocking or non-blocking for each of the above 3 interfaces.
Also of interest are the styles of signalling, locking, unlocking, wake-up calls and timing of each interface.
When these aspects have been determined, it should be possible to redesign one or more of the interfaces to achieve tighter control over queueing.

First, the packet offer by the IP layer to the PPP module (in function dev_queue_xmit() in net/core/dev.c) expects a speedy return because it is done with BH entries disabled and `xmit_lock' locked.
An inspection of the code in ppp_generic.c reveals that there is no obvious potential for any sleep etc. in between the invocation of ppp_start_xmit() by the IP layer and the invocation by ppp_push() of ppp_async_send() in ppp_async.c.
It seems logical that this call in ppp_async.c should return speedily.

            sent = tty->driver.write(tty, 0, ap->optr, avail);

In other words, it should not block, because that would cause a late return to the IP layer call.
But it's probably a good idea to check what happens here just in case.


The tty->driver.write() call is sent to the driver for the character device which PPP is being run on.
There seem to be dozens of possible such char device drivers which have tty_driver structures in them in directory drivers/char.
For example, in drivers/char/serial.c, there is this declaration:

static struct tty_driver serial_driver, callout_driver;

and this bit of code in the module load routine rs_init():

        memset(&serial_driver, 0, sizeof(struct tty_driver));
        serial_driver.magic = TTY_DRIVER_MAGIC;
#if (LINUX_VERSION_CODE > 0x20100)
        serial_driver.driver_name = "serial";
#endif
#if (LINUX_VERSION_CODE > 0x2032D && defined(CONFIG_DEVFS_FS))
        serial_driver.name = "tts/%d";
#else
        serial_driver.name = "ttyS";
#endif
        serial_driver.major = TTY_MAJOR;
        serial_driver.minor_start = 64 + SERIAL_DEV_OFFSET;
        serial_driver.num = NR_PORTS;
        serial_driver.type = TTY_DRIVER_TYPE_SERIAL;
        serial_driver.subtype = SERIAL_TYPE_NORMAL;
        serial_driver.init_termios = tty_std_termios;
        serial_driver.init_termios.c_cflag =
                B9600 | CS8 | CREAD | HUPCL | CLOCAL;
        serial_driver.flags = TTY_DRIVER_REAL_RAW | TTY_DRIVER_NO_DEVFS;
        serial_driver.refcount = &serial_refcount;
        serial_driver.table = serial_table;
        serial_driver.termios = serial_termios;
        serial_driver.termios_locked = serial_termios_locked;

        serial_driver.open = rs_open;
        serial_driver.close = rs_close;
        serial_driver.write = rs_write;
        serial_driver.put_char = rs_put_char;
        serial_driver.flush_chars = rs_flush_chars;
        serial_driver.write_room = rs_write_room;
        serial_driver.chars_in_buffer = rs_chars_in_buffer;
        serial_driver.flush_buffer = rs_flush_buffer;
        serial_driver.ioctl = rs_ioctl;
        serial_driver.throttle = rs_throttle;
        serial_driver.unthrottle = rs_unthrottle;
        serial_driver.set_termios = rs_set_termios;
        serial_driver.stop = rs_stop;
        serial_driver.start = rs_start;
        serial_driver.hangup = rs_hangup;
#if (LINUX_VERSION_CODE >= 131394) /* Linux 2.1.66 */
        serial_driver.break_ctl = rs_break;
#endif
#if (LINUX_VERSION_CODE >= 131343)
        serial_driver.send_xchar = rs_send_xchar;
        serial_driver.wait_until_sent = rs_wait_until_sent;
        serial_driver.read_proc = rs_read_proc;
#endif

The only important line here is

        serial_driver.write = rs_write;

Here is the rs_write() function:

static int rs_write(struct tty_struct * tty, int from_user,
                    const unsigned char *buf, int count)
{
        int     c, ret = 0;
        struct async_struct *info = (struct async_struct *)tty->driver_data;
        unsigned long flags;

        if (serial_paranoia_check(info, tty->device, "rs_write"))
                return 0;

        if (!tty || !info->xmit.buf || !tmp_buf)
                return 0;

        save_flags(flags);
        if (from_user) {
                down(&tmp_buf_sem);
                while (1) {
                        int c1;
                        c = CIRC_SPACE_TO_END(info->xmit.head,
                                              info->xmit.tail,
                                              SERIAL_XMIT_SIZE);
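                        if (count < c)
                                c = count;
                        if (c <= 0)
                                break;

                        c -= copy_from_user(tmp_buf, buf, c);
                        if (!c) {
                                if (!ret)
                                        ret = -EFAULT;
                                break;
                        }
                        cli();
                        c1 = CIRC_SPACE_TO_END(info->xmit.head,
                                               info->xmit.tail,
                                               SERIAL_XMIT_SIZE);
                        if (c1 < c)
                                c = c1;
                        memcpy(info->xmit.buf + info->xmit.head, tmp_buf, c);
                        info->xmit.head = ((info->xmit.head + c) &
                                           (SERIAL_XMIT_SIZE-1));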
                        restore_flags(flags);
                        buf += c;
                        count -= c;
                        ret += c;
                }
                up(&tmp_buf_sem);
        } else {
                cli();
                while (1) {
                        c = CIRC_SPACE_TO_END(info->xmit.head,
                                              info->xmit.tail,
                                              SERIAL_XMIT_SIZE);
                        if (count < c)
                                c = count;
                        if (c <= 0) {
                                break;
                        }
                        memcpy(info->xmit.buf + info->xmit.head, buf, c);
                        info->xmit.head = ((info->xmit.head + c) &
                                           (SERIAL_XMIT_SIZE-1));
                        buf += c;
                        count -= c;
                        ret += c;
                }
                restore_flags(flags);
        }
        if (info->xmit.head != info->xmit.tail
            && !tty->stopped
            && !tty->hw_stopped
            && !(info->IER & UART_IER_THRI)) {
                info->IER |= UART_IER_THRI;
                serial_out(info, UART_IER, info->IER);
        }
        return ret;
}

In include/linux/serial_reg.h are the UART declarations:

#define UART_IER        1       /* Out: Interrupt Enable Register */

and

/*
 * These are the definitions for the Interrupt Enable Register
 */
#define UART_IER_MSI    0x08    /* Enable Modem status interrupt */
#define UART_IER_RLSI   0x04    /* Enable receiver line status interrupt */
#define UART_IER_THRI   0x02    /* Enable Transmitter holding register int. */
#define UART_IER_RDI    0x01    /* Enable receiver data interrupt */



Two things to note here are the size of the buffer, which is SERIAL_XMIT_SIZE bytes, and the call of serial_out(), which is the only way in which this function could possibly suffer a long delay when it is called with from_user equal to 0, as it is by ppp_async_push().
The value of SERIAL_XMIT_SIZE is set in include/linux/serial.h as follows:

#define SERIAL_XMIT_SIZE PAGE_SIZE

The value of PAGE_SIZE is set in include/asm/page.h:

#define PAGE_SHIFT      12
#define PAGE_SIZE       (1UL << PAGE_SHIFT)

In other words, the serial device buffer size is fixed at 4096 bytes.
This is a rather large amount of buffer if your serial line is slow!!!
(Actually it's 4095 bytes because the last byte is never used.)

Here is the serial_out() function in drivers/char/serial.c:

static _INLINE_ void serial_out(struct async_struct *info, int offset,
                                int value)
{
        switch (info->io_type) {
#ifdef CONFIG_HUB6
        case SERIAL_IO_HUB6:
                outb(info->hub6 - 1 + offset, info->port);
                outb(value, info->port+1);
                break;
#endif
        case SERIAL_IO_MEM:
                writeb(value, (unsigned long) info->iomem_base +
                              (offset<<info->iomem_reg_shift));
                break;
#ifdef CONFIG_SERIAL_GSC
        case SERIAL_IO_GSC:
                gsc_writeb(value, info->iomem_base + offset);
                break;
#endif
        default:
                outb(value, info->port+offset);
        }
}

Generally the default case will be taken (I guess).
So the write command results only in outb() calls, which are fast, non-blocking writes to I/O ports (or, in the SERIAL_IO_MEM case, writeb() writes to memory-mapped registers).

Conclusions.

The good news is that the delay through the whole path is very low and there is no blocking anywhere.
The bad news is that the buffer is 4095 bytes, which is rather high if you want to be able to pre-empt a PPP link to clobber the current packet transmission over a slow line and send something else instead.
On the other hand, there is this function in drivers/char/serial.c:

static void rs_flush_buffer(struct tty_struct *tty)
{
        struct async_struct *info = (struct async_struct *)tty->driver_data;
        unsigned long flags;

        if (serial_paranoia_check(info, tty->device, "rs_flush_buffer"))
                return;
        save_flags(flags); cli();
        info->xmit.head = info->xmit.tail = 0;
        restore_flags(flags);
        wake_up_interruptible(&tty->write_wait);
#ifdef SERIAL_HAVE_POLL_WAIT
        wake_up_interruptible(&tty->poll_wait);
#endif
        if ((tty->flags & (1 << TTY_DO_WRITE_WAKEUP)) && tty->ldisc.write_wakeup)
                (tty->ldisc.write_wakeup)(tty);
}

With this function, all data in the serial device kernel module buffer can be flushed without being transmitted.
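
The 4095-byte capacity, the effect of a flush, and the time needed to drain a full buffer can be checked with a small user-space sketch.
The DEMO_* macros follow the usual head/tail definitions of the kernel's CIRC_CNT and CIRC_SPACE macros; the demo_* functions are invented for illustration, and the drain time assumes 10 bits on the wire per byte (start bit, 8 data bits, stop bit).

```c
#include <assert.h>

#define DEMO_XMIT_SIZE 4096   /* must be a power of two */

/* Bytes stored, and free space, in a head/tail circular buffer. */
#define DEMO_CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define DEMO_CIRC_SPACE(head, tail, size) \
        DEMO_CIRC_CNT((tail), (head) + 1, (size))

struct demo_xmit { int head, tail; };

/* Append up to count bytes (contents ignored); returns bytes accepted. */
static int demo_write(struct demo_xmit *x, int count)
{
    int room = DEMO_CIRC_SPACE(x->head, x->tail, DEMO_XMIT_SIZE);
    int n = count < room ? count : room;
    x->head = (x->head + n) & (DEMO_XMIT_SIZE - 1);
    return n;
}

/* Model of rs_flush_buffer(): forget everything not yet transmitted. */
static void demo_flush(struct demo_xmit *x)
{
    x->head = x->tail = 0;
}

/* Time in milliseconds to drain nbytes at a given line rate. */
static long demo_drain_ms(long nbytes, long bits_per_second)
{
    assert(bits_per_second > 0);
    return nbytes * 10 * 1000 / bits_per_second;
}
```

With these assumptions, a full buffer takes a bit over 4 seconds to drain at 9600 bits per second, which gives an idea of how long a newly queued packet could wait behind data already handed to the serial driver if the buffer is not flushed.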


The actual transmission of bytes occurs in the rs_interrupt() function, which is called whenever the serial hardware device interrupts the CPU, e.g. to notify that the serial chip's TX buffer has some spare room.
The bottom half function do_serial_bh() is initiated by the rs_interrupt() function.
Very indirectly via run_task_queue(), do_serial_bh() invokes do_softint().

/*
 * This routine is used to handle the "bottom half" processing for the
 * serial driver, known also the "software interrupt" processing.
 * This processing is done at the kernel interrupt level, after the
 * rs_interrupt() has returned, BUT WITH INTERRUPTS TURNED ON.  This
 * is where time-consuming activities which can not be done in the
 * interrupt driver proper are done; the interrupt driver schedules
 * them using rs_sched_event(), and they get done here.
 */
static void do_serial_bh(void)
{
        run_task_queue(&tq_serial);
}

static void do_softint(void *private_)
{
        struct async_struct     *info = (struct async_struct *) private_;
        struct tty_struct       *tty;

        tty = info->tty;
        if (!tty)
                return;

        if (test_and_clear_bit(RS_EVENT_WRITE_WAKEUP, &info->event)) {
                if ((tty->flags & (1 << TTY_DO_WRITE_WAKEUP)) && tty->ldisc.write_wakeup)
                        (tty->ldisc.write_wakeup)(tty);
                wake_up_interruptible(&tty->write_wait);
#ifdef SERIAL_HAVE_POLL_WAIT
                wake_up_interruptible(&tty->poll_wait);
#endif
        }
}

Function do_softint() just wakes up the PPP async software to send more data.


Go to Alan Kennington's home page.
