Nuff - The Language

This document is a description of the nuff programming language. Nuff is a collection of embedded languages built on top of scheme, designed to make it easy to create, parse, analyse, and inject network packets in an asynchronous, efficient, and secure environment. This document serves as the definitive reference for programming in nuff.

For most of this language description I will assume the reader is a competent scheme programmer and I intend to go fast. That said, since nuff was designed to make certain difficult tasks easy, you just might find you can keep up even without a deep understanding of scheme or symbol processing.

I hope that this document might provide an example of how scheme/lisp can be used in relevant network security applications for people who may not have been exposed to lisp in the past. In my opinion it is a shame that lisp has been under-represented in the security community so far. I think lisp has a lot to offer in almost every possible problem domain - even the traditionally C-dominated world of network security.

I also hope that experienced lisp programmers will enjoy this document since it describes an interesting application and, if I may humbly say so, is an example of using lisp as it is meant to be used - designing a language that would nicely fit a class of problems, then layering an interpreter/compiler for that language on top of lisp.

Nuff's APIs are, by design, very simple. Default values can be used almost everywhere, making it easy to run advanced network experiments that would otherwise be major programming undertakings. In nuff there are no byte orders, no synchronisation issues, no painstaking manual packet creation and parsing, and no worries about buffer overflows in the usual sense.

Unless they are obviously required, nuff removes the majority of your script's privileges before it runs. This makes security flaws in your script less serious. Unlike most other languages, unless you explicitly tell it otherwise, the compiler assumes every line of code you write is untrusted. With constructs like the privileged macro, you have to go out of your way to write insecure nuff code.

Nuff is interactive. You can use nuff as a tested, contained command line tool or as a user-centred environment for directly experimenting with the network. Packets can be captured, saved, modified, and replayed all from the interactive read-eval-print-loop (REPL). Nuff makes it easy to interact simultaneously with multiple network interfaces and sockets.

I hope you find nuff useful for your applications. I have. I enjoyed designing and creating nuff and hope that comes through in the documentation and code.

-Doug Hoyte, Hardcore Software

  1. Getting Started
  2. The Core
  3. Advanced Topics
  4. Misc Details, Notes, Examples


Getting Started

  1. Definition of a nuff script
  2. Arguments to nuff scripts
  3. Describe


Definition of a nuff script

A nuff script is a sequence of s-expressions to be read by a special nuff interpreter/compiler. These s-expressions are stored in special files each with a suffix ".nuff". They are normally stored in the directory NUFFDIR/code/ where NUFFDIR is your chosen nuff installation directory.

The list of s-expressions has no specific restrictions except that the first expression must be a valid module form. The module form has the syntax:

(module <module-symbol>
        <Short Description>
        <Long Description>
        <unix-bind-list for required args>
        <unix-bind-list for optional args>)

module-symbol is the scheme symbol used to refer to your nuff script inside scheme. The short description should be a string that gives a single-line overview and title for the script. The long description should be a list of strings representing paragraphs of documentation for your code. You can also use a formatting system called "nuffdoc" in the long description. This formatting system is described elsewhere.

The unix-bind-lists are lists of argument bindings that nuff will use to offer a familiar getopt(3)-like interface for users of your script. You can then access the parameters passed to the script with the param macro. We will discuss the specifics of unix-bind-lists in the next section.

Here is an example, the module form for the nuff help command:

(module help

"Displays information on different nuff commands"

("help gives you a human-readable display of all the options provided
  by a nuff command."

 "It uses the information in the module information at the top of the
  specified nuff command to construct the information. Since this is also
  a part of the code nuff uses, it is guaranteed to be up-to-date and
  accurate (except for bugs).")

((command "Command to get help with" (string? x)))

())

The remainder of the .nuff file should be scheme s-expressions. This is the code that you want your module to execute. Subject to a few restrictions detailed below, these scheme expressions are executed in order and in a top-level, null lexical environment. Macros are pre-expanded during compilation by the define macro but not elsewhere, so code that needs to be efficient should be inside a define form.

Nuff provides a full R5RS scheme system - probably the most flexible programming language ever designed - and a nearly complete common lisp format macro. What your nuff script does is limited only by your imagination.

For the remainder of this document, I will assume that you understood and are comfortable with everything you just read in this definition section. If not, you might want to refresh your scheme knowledge - particularly regarding macros and closures - before proceeding.

Ready?


Arguments to nuff scripts

Nuff provides a convenience feature called unix-bind-lists. In the many places nuff incorporates them, these "arguments" can be thought of as controlling programs through a unix shell with "switch" parameters. For instance, in unix, to grep for a pattern with 3 lines of text (-A)fter and case (-i)nsensitive matching, you can execute:

$ grep -A 3 -i pattern file

Nuff's unix-opts-get-bindlist macro was designed to mimic this and is roughly the nuff equivalent of getopt(3). For example in nuff to execute a ping command with an (-i)nterval of 0.5 seconds and without performing reverse DNS resolution (that is, giving (-n)umeric output):

# nuff ping -i 0.5 -n target.com

The specifics of unix-opts-get-bindlist are described elsewhere, but in general it takes two parameters: a list, args, representing the arguments passed by the user, and a schema list, arg-schema, that specifies the possible arguments to accept, their default values, verifiers, and so on. The nuff arg-schemas for the different network protocol layers, for instance, are stored in NUFFDIR/ffun/layers.scm.
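
To make this concrete, here is a sketch of what a small required-args unix-bind-list might look like, following the pattern of the help module form shown earlier. The entries are illustrative, and the exact notation for defaults and optional arguments is part of the unix-opts-get-bindlist documentation:

((interval "Seconds to wait between probes" (number? x))
 (target   "Host to probe"                  (string? x)))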

Note that since nuff doesn't use getopt(3) at all, it doesn't behave exactly the way you might expect an ordinary unix command to. The syntax is more orthogonal and lispy. For example, although you can group the two arguments in the grep command together like so:

$ grep -iA 3 pattern file

this sort of grouping is forbidden in nuff.

Exercise: What are common lisp keyword arguments? Are they really just another syntax for our unix "switches"?


Describe

In common lisp there is a wonderful function called "describe" that will give you a detailed description of any object or form or function or macro or ... in the system. This function - which is enabled by lisp's powerful dynamic type system - is very comprehensive and it is perhaps a little deceiving to call nuff's much more limited version "describe" as well.

In nuff, describe gives detailed information on the protocols and interfaces nuff supports and insights into nuff forms via partial compilation. In a sense we can think of describe as one of nuff's online documentation systems - similar to unix's man command. If we want quick information on, say, the ethernet protocol:

$ nuff describe eth

For now don't worry about all the details the above command displays, just remember that the describe system is dynamically generated from the layer specifications themselves and is a handy resource for accurate, complete information. Well as accurate and complete as the nuff layer specifications are anyways. :]

I mention describe so early in this tutorial because of how useful it is. Learning what capabilities nuff provides will be a lot easier if describe is mastered early on. For more information on describe, see its nuff help information:

$ nuff -help describe

Exercise: How does describe work? What data does it read? Is that data really data or is it actually code for some specialised interpreter? Can it be both?

Exercise: Get a decent common lisp system and try describing things. Wow!


The Core

  1. Layers
  2. Parsepaq
  3. Time
  4. Nuff interfaces
  5. Order of evaluation of nuff scripts


Layers

The concept of a "layer" is the cornerstone of nuff's functionality. In networking it is very handy to think of data as being layered on top of other layers in an unraveling onion of network protocols. DNS is (usually) layered on top of UDP which is (again, usually) layered on top of IP which might be layered on top of Ethernet, and so on. Nuff tries to be, as the name suggests, a universal frame forge. This means that nuff tries to make it easy to craft and decode packets - any kind of packets.

As with almost all aspects of the nuff system, layers are implemented as macro-expansions involving our handy unix-bind-lists. Packets are treated as s-expressions which are eventually compiled down to strings of binary data to be sent out over the network. If you consider this for a moment you will realise that any packet can have many different possible representations. This:

(ip4 -evil-bit -src "1.2.3.4")

and this:

"E\x00\x00\x14\xFF\xFF\x10\x00@\x06f\xDF\x01\x02\x03\x04\x00\x00\x00\x00"

are equivalent s-expressions - just at different levels of compilation. Given a form like the (ip4 ...) one above - which contains no free variables - we can compile it immediately down to the above string, ready to send over the network any time we need it.

What's more, the macro expansions are designed to be smart about variable capture. Instead of compiling immediately down to a string as above, sometimes we will want to partially compile a packet. The form:

(ip4 -src x)

compiles to:

$ nuff describe '(ip4 -src x)'
----nuff description of (ip4 -src x)----
The nuff expansion of the form (ip4 -src x) is
 
(ip4-post
  (let ((-src (verify '-src x #<CLOSURE>)))
    (strcat "E\x00\x00\x14\xFF\xFF\x00\x00@\x06\x00\x00"
            (inet-aton -src)
            "\x00\x00\x00\x00")))
----

There are a number of things the compiler had to account for. The ip4 layer requires some post-processing in order to compute checksums and routing, so the compiler added the ip4-post function. Don't worry about the exact syntax of the internal verify function except that it double-checks x for validity against a pre-compiled closure and returns the correct value, which is then bound to the symbol -src in the body. The body itself is just a string concatenation of some precompiled binary chunks and our verified variable -src.

This chunk of code can be inserted anywhere we have a variable x to efficiently create packets with different source IP addresses.
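
For example, a small packet-forging helper might look like the following sketch (the -dst value is purely illustrative). Because the (ip4 ...) form expands inside a define, each call only verifies src and concatenates the precompiled binary chunks:

(define (forge-from src)
  (ip4 -src src
       -dst "10.0.0.1"))

(map forge-from (list "1.2.3.4" "5.6.7.8"))   ; two ready-to-send packets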

If you are wondering where those binary chunks came from, well, they were pre-compiled too! The "mother s-expression" for any given layer is

 <layer>-proto-schema

where <layer> is any nuff layer. In this case our symbol was ip4-proto-schema.

Layering is done by providing data for a layer's -data field and sometimes setting a protocol type. See the describe output on any layer for the specifics. The macro-expansions are designed to be smart about compiling across multiple layers and will generally make very efficient macro-expansions even with several nested levels of layers.

For example, here is a possible dns query:

(ip4 -dst *my-nameserver*
     -proto IP-PROTO-UDP
     -data (udp -dp 53
                -data (dns -rd
                           -question (list "hcsw.org" DNS-TYPE-A))))

Exercise: Try evaluating ip4-proto-schema in the nuff REPL. What are the uint8, uint16, uint32, bcomb, inet-aton, and inet-ntoa functions for? Experiment with those functions in the REPL.


Parsepaq

Parsepaq is a macro for parsing packets. Loosely stated, parsepaq is a process for taking packets as binary data and converting them back to their more verbose s-expression representations. In reality it's more complex because, as we saw earlier, there are many possible s-expression representations for any given packet. Parsing and expanding them in their entirety could be ambiguous and inefficient.

The mechanism parsepaq uses is subtle but demonstrably better. It involves analysing the forms passed to parsepaq - your code - then figuring out exactly what attributes are required out of the packet and expanding just those sections. Parsepaq can then make an efficient set of lexical bindings around your code when it gets packets you are interested in.

In nearly all cases the attributes that can be parsed are the same as those that can be created with the previously discussed nuff layers. Unfortunately, due to the nature of certain protocols, the process isn't perfectly orthogonal, but in the general case you should only have to learn one set of layer namings for both creators and accessors.

The syntax is designed to be similar to scheme/lisp's cond form. Given a "parse sequence" as a list of layers and a series of forms to execute, parsepaq will create a set of bindings around your code.

The first argument to parsepaq is a prefixing symbol for nuff to bind lexical variables around your code with. The second argument should evaluate to your packet or () for a timeout. The following arguments are cond-style forms except that the conditions are replaced with "parse sequences". Like cond, the tests will be evaluated in order until one is found to match.

Here is an example:

(parsepaq mypaq (get an ethernet packet from somewhere)
  ((eth ip4 icmp4)
     (format #t "Got an icmp4 packet from ~a, code is ~a"
                mypaq-ip4-src
                mypaq-icmp4-code))
  ((eth ip4)
     (format #t "Got some other ip4 packet from ~a. Proto is ~a"
                mypaq-ip4-src
                mypaq-ip4-proto))
  (timeout
     (format #t "We timed out - the packet we are parsing is null")))

Again, nuff will compile this form into a fairly efficient form for evaluation. For instance, in the first case (eth ip4 icmp4) nuff will verify that the packet is a valid Ethernet, IPv4, and ICMPv4 packet, but the only data it will pick out of and bind around your code are the IPv4 source address and the ICMPv4 code.


Time

A crucial part of any packet sniffing framework is time. Time is a central component of nuff and we have tried to make recording, processing, and calculating with it convenient and easy.

Nuff has a data-type called a "timeval" which is represented by the cons of 2 integers: a number of seconds and the microseconds past that second. Timevals can be grouped into 2 categories:

  • Absolute timevals - The (cons seconds useconds) past the epoch as defined by your unix system's gettimeofday(2) system call.

  • Relative timevals - The (cons seconds useconds) past some arbitrary point in time.

As we will see below, the routines that nuff uses to send and receive packets over the network provide timevals of when they were actually sent/received. This data is suitable for measuring the time taken by packets to traverse the network and other fine-grained I/O measurements. The data is as accurate as our C kernel allows.

While nuff provides fairly precise measurements of packet departure and arrival times, it doesn't make any guarantees as to how long it will take to parse or create packets. In particular, since nuff is a memory managed language we can't make any guarantees about when the garbage collector will run. If you have especially strict timing requirements, extra care might have to be taken so that the garbage collector won't run at inopportune moments. Try to minimise consing and periodically run (gc) when you are outside time-critical sections (like when you plan to block on I/O).

Here are some functions/macros you can use with timevals:

  • gettimeofday returns a timeval corresponding to the current moment in time.

    (gettimeofday)
    
  • reltime is a useful macro for finding timevals in the future relative to "now".

    (reltime <value> <unit>)
    

    Unit can be one of the symbols: ms, s, m, h, or d for milliseconds, seconds, minutes, hours, or days respectively. The timeval returned represents the point in time <value> <unit>s from now. For example, the following returns the absolute timeval for 1/2 an hour in the future:

    (reltime .5 h)
    
  • seconds->timeval converts a number of seconds to a relative timeval. The number can be an integer or a real.

    (seconds->timeval <seconds>)
    
  • timeval->seconds converts a relative timeval to a number of seconds.

    (timeval->seconds <timeval>)
    
  • timeval< returns true if timeval-a is before timeval-b and false otherwise. They should both be relative to the same point (or both absolute).

    (timeval< <timeval-a> <timeval-b>)
    
  • timeval-add sums 2 timevals. Summing 2 absolute timevals doesn't make sense.

    (timeval-add <timeval-a> <timeval-b>)
    
  • timeval-sub subtracts 2 timevals.

    (timeval-sub <timeval-a> <timeval-b>)
    
  • creation-time tells us the moment in time when a string was created so we can know when a packet was captured by the network interface.

    (creation-time <string>)
    

    creation-time only works with "shared strings" which are discussed as a note below. Shared strings allow us a convenient, efficient style of programming and provide the practical property of creation times on packet data. The creation time generally corresponds to the exact time a packet was captured by the network interface according to libpcap. Since all substrings are really just different components of the same shared string, they all share the same creation times.

    So in other words, if you parse a packet with parsepaq using mypaq as a bind symbol, mypaq-ip4, mypaq-udp, mypaq etc will all share the same memory and have the same creation times.
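
As a small illustration of timeval arithmetic using only the functions above (mypaq here stands for any shared string captured earlier):

;; age of a captured packet, in seconds: subtract two absolute timevals
;; to get a relative timeval, then convert it to a number
(timeval->seconds
  (timeval-sub (gettimeofday)
               (creation-time mypaq)))

;; set a deadline half an hour from now, then later test whether it has passed
(define deadline (reltime .5 h))
(timeval< deadline (gettimeofday))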


Nuff interfaces

In modern operating systems, there is usually a diverse and rich set of APIs for network programming. Depending on the exact type of network connection you want, you might perform some sort of operation and receive a socket or other kind of descriptor to use for reading or writing data. Each may have its own quirks and rules, and its own (de)multiplexing requirements and specifications.

Most programming environments provide this concept of a socket by wrapping their own functions directly around the underlying operating system's socket calls. Because of the non-orthogonality of these APIs, and the fact that most programming environments aren't really designed with multi-programming in mind, (de)multiplexing of multiple protocols over multiple sockets and devices is often difficult and error-prone. This is the major focus of design in nuff - creating a convenient, scalable solution to (de)multiplexing.

  • Nuff abstracts away the device and/or API in favour of the nuff interface

  • Nuff abstracts away the socket in favour of the Packet Transport Descriptor (often shortened to PTD)

The interface/PTD combination is a more-general abstraction than that of the device/socket. No matter what type of communication there is to be done, you use interfaces to create PTDs.

The macro we use to create PTDs from interfaces is nuffopen:

(nuffopen <interface> [arguments])

As you might expect, nuffopen uses our unix-bind-list macros. Here is an example of opening a pcap PTD to sniff icmp traffic on the default ethernet device:

(nuffopen pcap -filter "icmp")

And here is an example of opening a dnet IP PTD for sending raw IP packets and having them routed:

(nuffopen dnet-ip)

Note that you generally want to bind the PTD to some variable so you can use it later:

(define my-dnet (nuffopen dnet-ip))

For the full list of interfaces available, see "nuff doc interfaces". The most basic functions available to send and receive data over PTDs are send and recv:

  • The send macro sends a packet over a Packet Transport Descriptor (PTD):

    (send <Packet Transport Descriptor> <data>)
    

    Returns:

     <absolute timeval packet was sent> 
    
  • The recv macro receives a packet from a PTD and returns it as a string or returns () if the time reaches <absolute timeval to expire>. If <absolute timeval to expire> is (), the call will not return until the PTD has data:

    (recv <Packet Transport Descriptor> <absolute timeval to expire>)
    

    Returns:

     <data or () for a timeout> 
    

Even if you haven't seen a data-structure exactly like a PTD in the past, PTDs were designed to be convenient and intuitive. Either it is a pipe you pump information into, or a pipe you suck information out of, or both at the same time.

The send and recv macros are "blocking" calls: The continuation that calls them will stay frozen until a packet arrives/is sent or the timeout occurs. These macros may not sound much like the foundational building blocks of a multi-programming environment but that is because we haven't explained nuff asynchronicity yet. The trick is that recv and send aren't implemented directly as OS system calls. Send and recv are actually macros whose expansions create a snapshot of your lexical environment, put it into a time-ordered data structure, register you as interested in a certain type of network I/O, and then jump to some other task. When there are no other tasks to jump to, a nuff kernel routine, super-select, is invoked to concurrently handle all of our I/O. We explain this "continuation-passing style" further, and provide details on how to use it, in the continuations section of this document.
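
Putting nuffopen, recv, reltime, and parsepaq together, here is a minimal sniffing sketch (the filter and timeout values are illustrative):

(define sniff-ptd (nuffopen pcap -filter "icmp"))

(parsepaq p (recv sniff-ptd (reltime 10 s))
  ((eth ip4 icmp4)
     (format #t "Got an icmp4 packet from ~a, code is ~a"
                p-ip4-src
                p-icmp4-code))
  (timeout
     (format #t "No icmp traffic within 10 seconds")))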

Here are some important notes regarding PTDs:

  • Multiple continuations can all receive data from a PTD concurrently. All of them will receive a copy of the packet (except that it isn't really a copy - they share memory - see the note on shared strings below). This behaviour is different from sockets where you are simply not supposed to do this.

    Packets are buffered in the PTD until somebody starts recving from them, but once a packet has been recved by a continuation any continuation that wasn't registered for that data has no way to get it from the PTD. For correctly written nuff programs this does not cause "lost packets" because continuations are required to make sure they are registered for the data they are interested in when they block. A continuation should not perform both sends and recvs, only one or the other. See the sections on queues and dispatchers.

  • PTDs are opaque data structures and you aren't allowed to use them in any way except to send to and recv from them. In particular, you don't close PTDs. When the garbage collector has determined a PTD is no longer needed, it will automatically close it for you.

  • PTDs are generalisable in ways that sockets aren't. Because of the specific design of PTDs, and the power and transparency of scheme continuations that will become more apparent after reading the continuations section, we can create higher-order interfaces to the raw PTD operations. An analogy in other languages might be some sort of procedure that can create new types of sockets for us. These sockets would have to be orthogonal with the language in all its usual respects, including (de)multiplexing, thread-behaviour, select(2)-ability, etc. Difficult elsewhere, but transparent in nuff. See the dispatcher and queue sections for examples of higher-order interfaces to PTDs.

  • Nuff provides an interface called a "loopback" interface. When you send to a loopback PTD, that data will become available to the next continuation(s) that recv from that PTD. Loopbacks are simple yet powerful constructs. They can be used as buffers (send data to them without worrying about blocking and collect it later as necessary) or as resources for forking the processing of a packet into different, concurrent streams of execution without the sender even worrying about how many streams there are or when they start. They are also useful for simulating a network device for testing and experimenting.


Order of evaluation of nuff scripts

As mentioned above, nuff scripts generally execute sequentially, form by form, through your .nuff file. There are a few exceptions to this:

  • The module form is parsed and processed before any of your other s-expressions are read. It sets up the various parameters required by your script and binds the necessary macros.

  • (param ...) forms in the nuff script are pre-evaluated and spliced into the nuff code so that other forms of optimisation are more effective.

  • Forms inside the privileged macro are evaluated before any other code so that nuff can drop potentially dangerous root privileges (see "nuff doc security" for more info). You can have as many privileged macros as you want, and they can be at any location in the file, but they must all be at the top-level and must not depend on any code outside the privileged macro (since it hasn't been executed yet). A small sketch follows this list.

  • Nuff parses your code and the user-supplied arguments for DNS queries that your program is likely to execute and starts resolving them immediately. Nuff continues to load your script while the resolution continues "in the background", only blocking once the results are actually needed.
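
Here is a small sketch of the privileged macro mentioned above. Storing the PTD with a top-level define inside the form is an assumption of this sketch; the point is that only this form runs with root privileges, before the rest of your code:

(privileged
  ;; opening a pcap PTD is the kind of operation that needs root;
  ;; this runs before everything else and before privileges are dropped
  (define *sniff-ptd* (nuffopen pcap -filter "icmp")))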


Advanced Topics

  1. Continuations, super-select, and asynchronicity
  2. Queues
  3. Dispatchers
  4. DNS Resolver
  5. mapasync


Continuations, super-select, and asynchronicity

Nuff always runs as a single process (never calls fork(2)) and doesn't use any sort of kernel or userland thread systems. Nuff instead uses a scheme continuation-based scheduler and tries to simulate a responsive, input/output oriented multi-programming environment.

Nuff does this by using scheme continuations and the unix select(2) call. Nuff never uses read(2), write(2), or any other potentially blocking system call until it has confirmed via select() that the operation won't block the process. Nuff has been designed such that polling and non-blocking sockets should never be necessary.

A scheme continuation can be thought of as a process or a thread but really it is just a reference to something representing the frozen state of a scheme evaluation. The details of continuations - and the amazing things that can be done with them - won't be discussed further here except to mention that they can be used to implement every type of programming control structure currently known to exist (and some that can't be represented without them).

Nuff makes extensive use of continuations in its core scheduler routines and elsewhere. The scheme scheduler code is defined in NUFFDIR/ffun/sched.scm and our so-called super-select function - which actually interfaces our continuations with the OS's select() call - is compiled into the nuff kernel. Its definition is in the file NUFFDIR/src/ifaces.c .

These are the scheme functions nuff scripts can use to interact with the scheduler and super-select:

  • The sleep macro will pause the continuation for <value> <unit>s. Unit can be ms, s, m, h, or d for milliseconds, seconds, minutes, hours, or days respectively. This is a macro that uses the reltime macro discussed above.

    (sleep <value> <time unit>)
    

    Example (sleeps for 5 seconds):

    (sleep 5 s)
    
  • The scheduler function is actually a continuation that is used to decide if and when other continuations should run. As with all continuations, invoking scheduler has the effect of replacing your current continuation, so unless you have stored a copy of your current continuation somewhere, scheduler will "exit" your current continuation and your whole thread of execution will be garbage collected. This is roughly analogous to "joining" a thread. Make sure this is what you want!

    (scheduler)
    

    Returns:

    Does not return!
    
  • The async macro creates a continuation which evaluates form1, form2, ... in sequence and then terminates (its continuation is removed by invoking (scheduler) - see above). We can think of these forms as executing "in the background". Async starts the continuation up in a frozen state - it doesn't actually evaluate the forms until the calling continuation blocks for I/O or invokes the scheduler.

    (async <form1 form2 ...>)
    

Although it is convenient to think about nuff's continuations as separate threads, it is important to remember that they are not. Nuff implements a "co-operative" multi-threading system. So unlike, for instance, pthreads, our continuations will never be pre-empted. Running a continuation that performs lengthy computations could have a negative performance impact on your other I/O bound continuations.

On the positive side, nuff's lack of pre-emption means fewer opportunities for race-conditions: Because a running continuation is always certain that all other continuations are blocked waiting for I/O there is no need for semaphores, locks, etc on most data structures. See the section on determinism in "nuff doc security" for more information.

When experimenting in the REPL, it is important to remember that continuations will not actually run until scheduler is invoked. When you are at a regular REPL prompt, no continuations are running "in the background". The REPL is similar to a GDB debugging prompt in this sense. Instead of typing "run" to resume your program, however, you type "(scheduler)". Imagine being able to code and start up threads in GDB though!
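
For example, a background task can be set up like this (a sketch; remember the forms only start running once the calling continuation blocks on I/O or invokes the scheduler):

(async
  (sleep 30 s)                                  ; freeze this continuation for 30 seconds
  (format #t "still alive after 30 seconds"))   ; then print and terminate

;; the calling continuation carries on immediately; blocking for I/O
;; or evaluating (scheduler) is what lets the async body actually run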

Exercise: Download the freely available e-book "On Lisp" by Paul Graham and read his excellent treatment of continuations.


Queues

Although nuff provides a macro to directly send to Packet Transport Descriptors, send, the recommended way to handle the transmission of packets is a nuff queue. A queue is an extensible, efficient interface for throttling data transmission speeds and scheduling packets by priority.

Queues are bound to a Packet Transport Descriptor upon creation. Queues then provide a method for adding packets to be sent over the PTD at some indeterminate time in the future. When you add packets to a queue your continuation never blocks. Instead, a queue will run a single continuation whenever there is data to be transmitted and transmit it according to capacity restrictions you can extend as you like.

The queue will determine how long an interval to put between packets before sending them over the PTD and also will determine which packet to send based on priority. The packet with the highest priority in the queue (according to scheme's <) is always sent first. Where packets have the same priority, service is on a first-come, first-served basis.

Creating a queue is easy. You use the make-queue function which returns a closure you will want to store:

(define MY-QUEUE
  (make-queue MY-PTD))

This creates an empty queue with no throttling restrictions. We can now enqueue data on MY-PTD by invoking the closure in the following way (which doesn't block):

(MY-QUEUE 'queue "some data")

Using the queue command with the closure enqueues a packet with priority 0. You can specify a different priority by using the priority command instead:

(MY-QUEUE 'priority 10 "some data")
(MY-QUEUE 'priority -1.2 "some data")

Using queues in this way is perfectly legal in nuff but rather pointless. We haven't put any capacity limits in place yet so the queue just spits the packets out on the PTD as fast as possible. Queues are given capacity limits by passing them "capacity restricting closures". There are a few macros for creating capacity restricting closures available. Here are some examples:

  • Always have a spacing (gap) of at least 10 ms between each transmission:

    (MY-QUEUE 'cap (queue-gap 10 ms))
    
  • No more than 100 packets per minute:

    (MY-QUEUE 'cap (queue-limit 100 packets per m))
    
  • No more than 50kbps:

    (MY-QUEUE 'cap (queue-limit 50000 bytes per s))
    

You can place as many capacity restricting closures on a queue as you like - the queue will treat the closure that returns the longest wait-time at any given moment in time as the bottleneck and will use that as the throttle.

With the queue-limit macro, you can optionally specify a Moving Average parameter like so:

(MY-QUEUE 'cap (queue-limit 0.5 packets per s ma 2))

The moving average, which defaults to 5, is the number of packets "back in time" to look when calculating the delay statistics. The higher the ma, the smoother the rate will be in the face of traffic irregularities but the longer it will take to converge to new rates.

Some other useful commands that you can pass to queues are:

  • Returns the number of packets currently in the queue:

    (MY-QUEUE 'buffer-packets)
    
  • Returns the number of bytes currently in the queue:

    (MY-QUEUE 'buffer-bytes)
    

Another powerful feature of queues - and part of the reason for their design - is that it is possible to have one or more queues share one or more capacity restricting closures. For example:

(define MY-CAP (queue-limit 100 packets per s))
(MY-QUEUE-1 'cap MY-CAP)
(MY-QUEUE-2 'cap MY-CAP)

In this example, both MY-QUEUE-1 and MY-QUEUE-2 share the same closure, so the *combined* output of MY-QUEUE-1 and MY-QUEUE-2 will not exceed 100 packets per second.
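
Here is a slightly larger sketch tying the queue pieces together (the dnet-ip interface is as documented above; the packet itself is illustrative):

(define MY-PTD   (nuffopen dnet-ip))
(define MY-QUEUE (make-queue MY-PTD))

;; combined throttle: at most 20 packets per second,
;; and never less than 10 ms between transmissions
(MY-QUEUE 'cap (queue-limit 20 packets per s))
(MY-QUEUE 'cap (queue-gap 10 ms))

;; enqueue without blocking; the queue's own continuation transmits
;; the packet once the capacity restricting closures allow it
(MY-QUEUE 'queue (ip4 -dst "10.0.0.1"))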

Exercise: What happens if you add the same capacity restricting closure twice to the same queue? Bonus points: Why?


Dispatchers

Although nuff provides a macro to directly read from Packet Transport Descriptors, recv, the recommended way to handle packet demultiplexing is a nuff dispatcher. Dispatchers provide an efficient, general interface for filtering and distributing packets.

Given a PTD, a dispatcher will run a single continuation that listens for packets on that interface. When it receives a packet, it checks if there are any other continuations interested in this packet (with a constant-time hash-table lookup) and passes it to them. It does this using a closure-based dispatch interface described here.

The first step to using a dispatcher is a call to the make-dispatcher macro. It returns a closure which you will want to store:

(define MY-DISP
    (make-dispatcher
      <PTD>
      <Layer List>
      <Parameters>
      <Desired field>))

  • <PTD> is the PTD to receive packets from

  • <Layer List> is a list of layer symbols as used in a parsepaq "parse sequence"

  • <Parameters> is a list of layer symbol accessors of the form <layer>-<accessor> to be verified for each dispatch client

  • <Desired field> is a layer symbol or a layer accessor symbol to return from inside parsepaq that the dispatch client is interested in

Example:

(define MY-DISP
  (make-dispatcher
    MY-PTD
    (eth ip4 udp)
    (ip4-src udp-dp)
    udp-data))

The closure that make-dispatcher returns, bound in our example to MY-DISP, can be used to retrieve packets from MY-PTD satisfying ip4 source and udp destination port restrictions. Its first argument is a timeout value and the remaining arguments are the filter values corresponding to the <Parameters> list. We can use it like so:

(MY-DISP (reltime 5 s) "192.168.0.1" 161)

This returns the udp-data payload of the next packet on MY-PTD from "192.168.0.1" destined to port 161, unless it times out after 5 seconds, in which case it returns ().

A convenience feature of dispatchers is that they can return the PTD they were created with by passing the dispatcher closure a single symbol, get-ptd:

(MY-DISP 'get-ptd)

From the above description it may not sound as though dispatchers buy us much (after all, you can just have multiple continuations recv from a PTD to achieve the same semantic effect), but dispatchers provide an efficient, scalable solution to packet demultiplexing. All packets that are received have their parameters minimally extracted and then hashed for lookup in the dispatcher's hash-table. Only when a continuation has registered a particular hash value is it woken up and given the desired data.

Notes:

  • As with all operations on PTDs, more than one dispatcher can receive packets from a PTD at once: they will both receive and process a copy of the packet (except see the note on shared strings - the packets, of course, share memory).

  • Similarly, multiple continuations can be listening on a dispatcher for the same set of parameters. Again, each continuation will get a copy (shared string) of the same packet.
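
As a sketch of several continuations demultiplexing through one dispatcher (MY-DISP is the dispatcher from the example above; the addresses are illustrative):

(async
  (let ((reply (MY-DISP (reltime 5 s) "192.168.0.1" 161)))
    (format #t "192.168.0.1 answered with ~a" reply)))

(async
  (let ((reply (MY-DISP (reltime 5 s) "192.168.0.2" 161)))
    (format #t "192.168.0.2 answered with ~a" reply)))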


DNS Resolver

Nuff doesn't use the system's supplied resolver functions at all. Instead, it implements an advanced DNS system from the ground up using the nuff features we've already discussed: layers, parsepaq, interfaces, queues, and dispatchers.

The nuff resolver is different from the system resolver - and from other common resolution systems - in several ways:

  • The nuff resolver is completely asynchronous. Any and all operations can be done in parallel. Even following chains of CNAMEs will not interfere with any other name resolutions.

    For instance, in the nuff traceroute program, this lets us perform the traceroute function and the reverse DNS resolution simultaneously for all hosts. The DNS resolution is started as soon as we have an IP. In fact, nuff's continuations make it easy for us to pipeline many operations even across multiple stages of processing.

  • Looking up a record that already has an outstanding query will not cause another DNS query to be issued. Both resolutions will use the data from the same request when it arrives - even if the TTL is 0. While this isn't noticeable in most cases, for certain kinds of bulk resolution we can save a large proportion of queries, and we are also allowed to program DNS resolution in a very convenient style.

  • DNS resolution is very closely tied into the nuff system. Nuff is generally fairly smart about determining what DNS names will need to be looked up in advance and starting queries early.

Here are a number of common functions for using nuff's resolver:

  • The host function is the most important nuff DNS function. It takes a string argument, the hostname, and returns the corresponding IP address by using nuff's asynchronous DNS system. If hostname is an IP address, host returns it immediately.

    (host <hostname>)
    
  • The resolve function is the next most common nuff DNS function. It takes 2 arguments, a DNS domain name to search for and the DNS type which is an integer. Nuff has a collection of constants called DNS-TYPE-X constants, where X is one of A, NS, CNAME, or PTR depending on the DNS type:

    (resolve <dns domain name> <dns type>)
    

    The domain name parameter should be a period (.) separated DNS name of the form "what.ever.com" except for PTR records which are normal nuff IPv4 addresses that nuff converts to the in-addr.arpa domain. Conversely, Nuff will normally return a period separated DNS name except for A records which are converted to their normal IPv4 addresses. IPv6 AAAA records are simply returned as 16 byte strings for now.

    * (resolve "google.com" DNS-TYPE-A)
    "64.233.167.99"
    
    * (resolve (resolve "google.com" DNS-TYPE-A) DNS-TYPE-PTR)
    "py-in-f99.google.com"
    
    * (resolve (resolve (resolve "google.com" DNS-TYPE-A) DNS-TYPE-PTR) DNS-TYPE-A)
    "64.233.167.99"
    

    If nuff is unable to look up a domain, resolve will return 1 of 2 symbols: NXDOMAIN if we were able to confirm that the record doesn't exist or SERVFAIL if we're still not sure.

    Note that since nuff has logically separated the DNS resolver from host-lookups, resolve doesn't support /etc/hosts or the search paths in /etc/resolv.conf .

  • Sometimes we would like to start a DNS resolution early so the result will be ready as soon as possible. asyncresolve is a function that has the exact same effect as (async (resolve ...)) except that it is more efficient if there are already outstanding queries. asyncresolve always returns () (see also the sketch after this list). Example:

    (asyncresolve "slashdot.org" DNS-TYPE-A)
    
  • When nuff makes its first DNS resolution, it will parse the /etc/resolv.conf file for nameserver directives which it will use to make DNS requests. You can add more DNS nameservers with the function add-nameserver :

    (add-nameserver "127.0.0.1" 53)
    
  • You can remove all the nameservers currently in use with the clear-nameservers function. Make sure you use the add-nameserver function above to add new nameservers before you do any new resolves!

    (clear-nameservers)
    
  • Nuff will remember that it was unable to resolve a record (because of an NXDOMAIN or SERVFAIL) for *neg-cache-time* seconds which you can change if you like, for instance, to 100 ms:

    (set! *neg-cache-time* 0.1)
    
  • Nuff will, by default, always cache a record for 60 seconds even if the TTL is below that. This is because sometimes we make DNS pre-fetch operations and super-low TTLs would make most of our effort useless. You can, however, choose to honour super-low TTLs by changing the *min-cache-time* variable:

    (set! *min-cache-time* 0)
    
  • Nuff maintains a hash-table of DNS records that were queried for. It can be viewed with the dump-name-cache function:

    * (dump-name-cache)
    Bucket #12: ((("99.167.233.64.in-addr.arpa" 12) "py-in-f99.google.com" (1173142150 . 472283) ()))
    Bucket #304: ((("py-in-f99.google.com" 1) "64.233.167.99" (1173151184 . 556989) ()))
    Bucket #336: ((("google.com" 1) "64.233.167.99" (1173064182 . 135167) ()))
    

    Each line is a list of records stored in its hash bucket. There is no meaningful ordering for the records. Each record is of the form:

    ((<query name> <dns query type>) <answer or ()> <absolute expiry> <list of waiting continuations>)
    
    • <query name> is the record to be searched for.

    • <query type> is a DNS-TYPE-X constant as described above.

    • <answer> is the resulting record or () if none has been found yet.

    • <absolute expiry> is a timeval as described in the "Time" section which represents the latest moment in time we will consider the record valid (from the DNS Time To Live value).

    • <list of waiting continuations> are all of the continuations who want this value - see the continuations section.

    One important note is that currently nuff never removes entries from the cache. Even when they expire, they remain in the cache indefinitely; nuff will re-issue the query if it's requested again after the absolute expiration time.
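
A sketch of the prefetch style that asyncresolve enables (the host list is illustrative):

(define *targets* (list "hcsw.org" "slashdot.org" "google.com"))

;; start all of the A lookups now; asyncresolve returns () immediately
(for-each (lambda (h) (asyncresolve h DNS-TYPE-A)) *targets*)

;; ... other work happens here ...

;; by the time we need the addresses, most answers are already cached
(map host *targets*)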

Exercise: Write a function (define (fully-qualified? ip) ...) that checks if an IP address's reverse record (PTR) has an A record that resolves back to the original IP. You will want to use equal? or string=? to compare IP addresses.


mapasync

In most sorts of processes where request latency is a bottleneck rather than bandwidth, operations can be done simultaneously to increase overall throughput. Of course in nuff whenever we say "simultaneous" we don't literally mean "happening at the same time". Just as a single CPU can't really be running 2 processes at once, and an ethernet can't have more than one packet on it at once, nuff can't ever perform more than one task at once. What nuff can do, however, is create smart programming constructs that figure out how to do as much in parallel as possible and do them close enough together that they feel as though they are happening concurrently. One macro that nuff includes to do this is called mapasync.

In general, what programmers mean by a map is a higher-order function that applies another - user supplied - function to every element in a list, creating a new list with the return values. Mapping is such a universally good idea that all modern languages support it in one sense or another. Scheme has a basic map function that nuff uses frequently. Not surprisingly, CL has the most powerful collection of generalised mapping primitives. In functional languages (like haskell, ocaml) mapping can enable some incredible efficiency optimisations. Parallel computing infrastructures (MPI, PVM) use mapping to distribute computation.

mapasync is a mapping primitive designed to make it possible to perform network tasks concurrently while still programming sequentially. It takes a function (actually, since mapasync is a macro, a lambda form) that we create to perform whatever network tasks, in whatever sequence, we need.

Here are two concepts vital to the understanding of concurrent network applications:

  • Pipelining. When things are done in parallel, we get a higher total throughput of operations. So how many should we do in parallel? If we do too many, we can saturate our bandwidth and start to get a lower average throughput due to congestion, dropped packets, etc. If we do too few, we aren't taking full advantage of our pipeline. We will sometimes call this number of parallel operations the "pipeline width", the "pipeline depth", or the "window size".

  • Temporal correlation. Don't be scared by the name, it's just the most concise way of describing the relation different parallel operations in a pipeline have with each other. Different types of operations will have different temporal correlations. Often this will depend on the destinations, the bandwidths they consume, the network routes they take, the destination host resources they occupy, and many other network factors. Tasks that have a low temporal correlation are the low-hanging fruit for concurrent optimisation. It's actually very simple: find out what they are, and do them in parallel. Scheduling network tasks depending on when it is optimal to run them is analogous to other locality of reference optimisations and we can use similar conceptual models.

mapasync itself is deceptively simple. Let's start out with a basic example.

$ nuff -eval '(mapasync 3 host (list "hcsw.org" "slashdot.org" "google.com"))'
("65.98.116.106" "66.35.250.150" "72.14.207.99")

This applies the host function (can actually even be a macro!) to every element in the supplied list. Since the initial window size was specified to be 3, mapasync will perform at most 3 resolutions concurrently. This example is overly simple in a couple ways. Often you will want to use a lambda form instead of host so you can dynamically control the window size as discussed later. You may also want to change windowing options instead of the default range of [1-1000]. Here is the general description of mapasync:

(mapasync
  <window-spec>
  <user-function>
  <input-list>)

So a general mapasync template might be:

(mapasync
  (list min-pipeline initial-pipeline max-pipeline)
  (lambda (inp)
    (let* ((tp1 (task-1 inp))
           (tp2 (task-2 tp1))
           (tp3 (task-3 tp2)))
      (if (is-valid? tp3)
          tp3
          'error)))
  input-data)

Where task-1, task-2, task-3 can all be blocking operations.

The initial pipeline depth is <initial-window>. The target depth will always stay between <bottom-window> and <top-window>. The window spec can be one of the following, depending on how much control you need over the depth of your mapasync pipeline.

  • A list of 3 integers:

    (<bottom-window> <initial-window> <top-window>)
    
  • A single integer (default bottom/top is [1,1000]):

    <initial-window>
    

<user-function> should be a lambda form that is applied to one argument (an item of input) and returns one value (the corresponding output). <user-function> does not need to worry about concurrency, scheduling, etc. Simply perform the necessary blocking I/O sequence to convert your given item of input into the corresponding output.

To dynamically control the pipeline, you can use the following macros INSIDE the <user-function> lambda form:

  • Incrementally increase the window:

    (mapasync-window-up)
    
  • Incrementally decrease the window:

    (mapasync-window-down)
    
  • Exponentially decrease the window (by cutting it in half):

    (mapasync-window-expt-down)
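
For instance, a user function might widen the window on success and back off sharply on timeouts. In this sketch, probe-host and targets are hypothetical; probe-host is assumed to return () on a timeout:

(mapasync
  (list 1 10 200)
  (lambda (target)
    (let ((result (probe-host target)))
      (if (null? result)
          (begin (mapasync-window-expt-down)   ; congestion? cut the window in half
                 'timeout)
          (begin (mapasync-window-up)          ; otherwise widen the pipeline a little
                 result))))
  targets)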
    

<input-list> is a list of input values. As with all mapping functions, you can manually construct this list however you want. But before you do, please look closely at the following convenience functions nuff provides for this:

  • collect returns a list of increasing integers from <start> to <end>:

    (collect <start> <end>)
    

    For example:

    * (collect 1 10)
    (1 2 3 4 5 6 7 8 9 10)
    

    Or:

    * (map (lambda (x) (* 0.1 x)) (collect 0 10))
    (0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1)
    
  • cartesian performs an operation that is similar to the "cartesian product" of sets and similar to the "join" (mu) of category theory. Given 2 lists of atoms, cartesian will concatenate all possible tuples and return a list of length (* (length <a>) (length <b>)) :

    (cartesian <a> <b>)
    

    For example:

    * (cartesian '("google.com" "yahoo.com") '(22 80))
    (("google.com" 22) ("yahoo.com" 22) ("google.com" 80) ("yahoo.com" 80))
    

    If <a> is a list of lists, cartesian will append the values in <b> onto the lists in <a>:

    * (cartesian (cartesian '(op1 op2) (collect 1 5)) '(a b))
    ((op1 1 a) (op2 1 a) (op1 2 a) (op2 2 a) (op1 3 a) (op2 3 a)
     (op1 4 a) (op2 4 a) (op1 5 a) (op2 5 a) (op1 1 b) (op2 1 b)
     (op1 2 b) (op2 2 b) (op1 3 b) (op2 3 b) (op1 4 b) (op2 4 b)
     (op1 5 b) (op2 5 b))
    

For an example use of mapasync, see the trace program included with nuff in NUFFDIR/code/trace.nuff .


Misc Details, Notes, Examples

These are small notes on parts of the nuff system that are too trivial to warrant their own section, but important enough to mention.

All material is © Doug Hoyte and/or HCSW Labs unless otherwise noted or implied.