In a way, the most impressive aspect of the condition system covered in the previous chapter is that if it wasn't already part of the language, it could be written entirely as a user-level library. This is possible because Common Lisp's special operators--while none touches directly on signaling or handling conditions--provide enough access to the underlying machinery of the language to be able to do things such as control the unwinding of the stack.
In previous chapters I've discussed the most frequently used special operators, but it's worth being familiar with the others for two reasons. First, some of the infrequently used special operators are used infrequently simply because whatever need they address doesn't arise that often. It's good to be familiar with these special operators so when one of them is called for, you'll at least know it exists. Second, because the 25 special operators--along with the basic rule for evaluating function calls and the built-in data types--provide the foundation for the rest of the language, a passing familiarity with them will help you understand how the language works.
In this chapter, I'll discuss all the special operators, some briefly and some at length, so you can see how they fit together. I'll point out which ones you can expect to use directly in your own code, which ones serve as the basis for other constructs that you use all the time, and which ones you'll rarely use directly but which can be handy in macro-generated code.
The first category of special operators contains the three operators
that provide basic control over the evaluation of forms. They're
QUOTE, IF, and PROGN, and I've discussed them all
already. However, it's worth noting how each of these special
operators provides one fundamental kind of control over the
evaluation of one or more forms. QUOTE prevents evaluation
altogether and allows you to get at s-expressions as data. IF
provides the fundamental boolean choice operation from which all
other conditional execution constructs can be built.1 And
PROGN provides the ability to sequence a number of forms.
The largest class of special operators contains the operators that
manipulate and access the lexical environment. LET and
LET*, which I've already discussed, are examples of special
operators that manipulate the lexical environment since they can
introduce new lexical bindings for variables. Any construct, such as
a DO or DOTIMES, that binds lexical variables will have to
expand into a LET or LET*.2 The SETQ special operator is one that
accesses the lexical environment since it can be used to set
variables whose bindings were created by LET and LET*.
Variables, however, aren't the only thing that can be named within a
lexical scope. While most functions are defined globally with
DEFUN, it's also possible to create local functions with the
special operators FLET and LABELS, local macros with
MACROLET, and a special kind of macro, called a symbol
macro, with SYMBOL-MACROLET.
Much like LET allows you to introduce a lexical variable whose
scope is the body of the LET, FLET and LABELS let you
define a function that can be referred to only within the scope of
the FLET or LABELS form. These special operators are handy
when you need a local function that's a bit too complex to define
inline as a LAMBDA expression or that you need to use more than
once. Both have the same basic form, which looks like this:
(flet (function-definition*) body-form*)
and like this:
(labels (function-definition*) body-form*)
where each function-definition has the following form:
(name (parameter*) form*)
The difference between FLET and LABELS is that the names of
the functions defined with FLET can be used only in the body of
the FLET, while the names introduced by LABELS can be used
immediately, including in the bodies of the functions defined by the
LABELS. Thus, LABELS can define recursive functions, while
FLET can't. It might seem limiting that FLET can't be used
to define recursive functions, but Common Lisp provides both
FLET and LABELS because sometimes it's useful to be able to
write local functions that can call another function of the same
name, either a globally defined function or a local function from an
enclosing scope.
Within the body of a FLET or LABELS, you can use the names
of the functions defined just like any other function, including with
the FUNCTION special operator. Since you can use FUNCTION
to get the function object representing a function defined with
FLET or LABELS, and since a FLET or LABELS can be
in the scope of other binding forms such as LETs, these
functions can be closures.
Because the local functions can refer to variables from the enclosing
scope, they can often be written to take fewer parameters than the
equivalent helper functions. This is particularly handy when you need
to pass a function that takes a single argument as a functional
parameter. For example, in the following function, which you'll see
again in Chapter 25, the FLETed function, count-version,
takes a single argument, as required by walk-directory, but
can also use the variable versions, introduced by the
enclosing LET:
(defun count-versions (dir)
(let ((versions (mapcar #'(lambda (x) (cons x 0)) '(2 3 4))))
(flet ((count-version (file)
(incf (cdr (assoc (major-version (read-id3 file)) versions)))))
(walk-directory dir #'count-version :test #'mp3-p))
versions))This function could also be written using an anonymous function in
the place of the FLETed count-version, but giving the
function a meaningful name makes it a bit easier to read.
And when a helper function needs to recurse, an anonymous function
just won't do.3 When
you don't want to define a recursive helper function as a global
function, you can use LABELS. For example, the following
function, collect-leaves, uses the recursive helper function
walk to walk a tree and gather all the atoms in the tree into
a list, which collect-leaves then returns (after reversing
it):
(defun collect-leaves (tree)
(let ((leaves ()))
(labels ((walk (tree)
(cond
((null tree))
((atom tree) (push tree leaves))
(t (walk (car tree))
(walk (cdr tree))))))
(walk tree))
(nreverse leaves)))Notice again how, within the walk function, you can refer to
the variable, leaves, introduced by the enclosing LET.
FLET and LABELS are also useful operations to use in macro
expansions--a macro can expand into code that contains a FLET or
LABELS to create functions that can be used within the body of
the macro. This technique can be used either to introduce functions
that the user of the macro will call or simply as a way of organizing
the code generated by the macro. This, for instance, is how a
function such as CALL-NEXT-METHOD, which can be used only within
a method definition, might be defined.
A near relative to FLET and LABELS is the special operator
MACROLET, which you can use to define local macros. Local macros
work just like global macros defined with DEFMACRO except
without cluttering the global namespace. When a MACROLET form is
evaluated, the body forms are evaluated with the local macro
definitions in effect and possibly shadowing global function and
macro definitions or local definitions from enclosing forms. Like
FLET and LABELS, MACROLET can be used directly, but
it's also a handy target for macro-generated code--by wrapping some
user-supplied code in a MACROLET, a macro can provide constructs
that can be used only within that code or can shadow a globally
defined macro. You'll see an example of this latter use of
MACROLET in Chapter 31.
Finally, one last macro-defining special operator is
SYMBOL-MACROLET, which defines a special kind of macro called,
appropriately enough, a symbol macro. Symbol macros are like
regular macros except they can't take arguments and are referred to
with a plain symbol rather than a list form. In other words, after
you've defined a symbol macro with a particular name, any use of that
symbol in a value position will be expanded and the resulting form
evaluated in its place. This is how macros such as WITH-SLOTS
and WITH-ACCESSORS are able to define "variables" that access
the state of a particular object under the covers. For instance, the
following WITH-SLOTS form:
(with-slots (x y z) foo (list x y z)))
might expand into this code that uses SYMBOL-MACROLET:
(let ((#:g149 foo))
(symbol-macrolet
((x (slot-value #:g149 'x))
(y (slot-value #:g149 'y))
(z (slot-value #:g149 'z)))
(list x y z)))When the expression (list x y z) is evaluated, the symbols
x, y, and z will be replaced with their
expansions, such as (slot-value #:g149 'x).4
Symbol macros are most often local, defined with
SYMBOL-MACROLET, but Common Lisp also provides a macro
DEFINE-SYMBOL-MACRO that defines a global symbol macro. A symbol
macro defined with SYMBOL-MACROLET shadows other symbol macros
of the same name defined with DEFINE-SYMBOL-MACRO or enclosing
SYMBOL-MACROLET forms.
The next four special operators I'll discuss also create and use
names in the lexical environment but for the purposes of altering the
flow of control rather than defining new functions and macros. I've
mentioned all four of these special operators in passing because they
provide the underlying mechanisms used by other language features.
They're BLOCK, RETURN-FROM, TAGBODY, and GO. The
first two, BLOCK and RETURN-FROM, are used together to
write code that returns immediately from a section of code--I
discussed RETURN-FROM in Chapter 5 as a way to return
immediately from a function, but it's more general than that. The
other two, TAGBODY and GO, provide a quite low-level goto
construct that's the basis for all the higher-level looping
constructs you've already seen.
The basic skeleton of a BLOCK form is this:
(block name form*)
The name is a symbol, and the forms are Lisp forms. The forms
are evaluated in order, and the value of the last form is returned as
the value of the BLOCK unless a RETURN-FROM is used to
return from the block early. A RETURN-FROM form, as you saw in
Chapter 5, consists of the name of the block to return from and,
optionally, a form that provides a value to return. When a
RETURN-FROM is evaluated, it causes the named BLOCK to
return immediately. If RETURN-FROM is called with a return value
form, the BLOCK will return the resulting value; otherwise, the
BLOCK evaluates to NIL.
A BLOCK name can be any symbol, which includes NIL. Many of
the standard control construct macros, such as DO, DOTIMES,
and DOLIST, generate an expansion consisting of a BLOCK
named NIL. This allows you to use the RETURN macro, which
is a bit of syntactic sugar for (return-from nil ...), to
break out of such loops. Thus, the following loop will print at most
ten random numbers, stopping as soon as it gets a number greater than
50:
(dotimes (i 10)
(let ((answer (random 100)))
(print answer)
(if (> answer 50) (return))))Function-defining macros such as DEFUN, FLET, and
LABELS, on the other hand, wrap their bodies in a BLOCK
with the same name as the function. That's why you can use
RETURN-FROM to return from a function.
TAGBODY and GO have a similar relationship to each other as
BLOCK and RETURN-FROM: a TAGBODY form defines a
context in which names are defined that can be used by GO. The
skeleton of a TAGBODY is as follows:
(tagbody tag-or-compound-form*)
where each tag-or-compound-form is either a symbol, called a
tag, or a nonempty list form. The list forms are evaluated in
order and the tags ignored, except as I'll discuss in a moment. After
the last form of the TAGBODY is evaluated, the TAGBODY
returns NIL. Anywhere within the lexical scope of the
TAGBODY you can use the GO special operator to jump
immediately to any of the tags, and evaluation will resume with the
form following the tag. For instance, you can write a trivial
infinite loop with TAGBODY and GO like this:
(tagbody top (print 'hello) (go top))
Note that while the tag names must appear at the top level of the
TAGBODY, not nested within other forms, the GO special
operator can appear anywhere within the scope of the TAGBODY.
This means you could write a loop that loops a random number of times
like this:
(tagbody top (print 'hello) (when (plusp (random 10)) (go top)))
An even sillier example of TAGBODY, which shows you can have
multiple tags in a single TAGBODY, looks like this:
(tagbody a (print 'a) (if (zerop (random 2)) (go c)) b (print 'b) (if (zerop (random 2)) (go a)) c (print 'c) (if (zerop (random 2)) (go b)))
This form will jump around randomly printing as, bs, and
cs until eventually the last RANDOM expression returns 1 and
the control falls off the end of the TAGBODY.
TAGBODY is rarely used directly since it's almost always easier
to write iterative constructs in terms of the existing looping
macros. It's handy, however, for translating algorithms written in
other languages into Common Lisp, either automatically or manually.
An example of an automatic translation tool is the FORTRAN-to-Common
Lisp translator, f2cl, that translates FORTRAN source code into
Common Lisp in order to make various FORTRAN libraries available to
Common Lisp programmers. Since many FORTRAN libraries were written
before the structured programming revolution, they're full of gotos.
The f2cl compiler can simply translate those gotos to GOs within
appropriate TAGBODYs.5
Similarly, TAGBODY and GO can be handy when translating
algorithms described in prose or by flowcharts--for instance, in
Donald Knuth's classic series The Art of Computer Programming, he
describes algorithms using a "recipe" format: step 1, do this; step
2, do that; step 3, go back to step 2; and so on. For example, on
page 142 of The Art of Computer Programming, Volume 2:
Seminumerical Algorithms, Third Edition (Addison-Wesley, 1998), he
describes Algorithm S, which you'll use in Chapter 27, in this form:
Algorithm S (Selection sampling technique). To select n records at random from a set of N, where 0 < n <= N.
S1. [Initialize.] Set t <-- 0, m <-- 0. (During this algorithm, m represents the number of records selected so far, and t is the total number of input records that we have dealt with.)
S2. [Generate U.] Generate a random number U, uniformly distributed between zero and one.
S3. [Test.] If (N - t)U >= n - m, go to step S5.
S4. [Select.] Select the next record for the sample, and increase m and t by 1. If m < n, go to step S2; otherwise the sample is complete and the algorithm terminates.
S5. [Skip.] Skip the next record (do not include it in the sample), increase t by 1, and go back to step S2.
This description can be easily translated into a Common Lisp function, after renaming a few variables, as follows:
(defun algorithm-s (n max) ; max is N in Knuth's algorithm
(let (seen ; t in Knuth's algorithm
selected ; m in Knuth's algorithm
u ; U in Knuth's algorithm
(records ())) ; the list where we save the records selected
(tagbody
s1
(setf seen 0)
(setf selected 0)
s2
(setf u (random 1.0))
s3
(when (>= (* (- max seen) u) (- n selected)) (go s5))
s4
(push seen records)
(incf selected)
(incf seen)
(if (< selected n)
(go s2)
(return-from algorithm-s (nreverse records)))
s5
(incf seen)
(go s2))))It's not the prettiest code, but it's easy to verify that it's a faithful translation of Knuth's algorithm. But, this code, unlike Knuth's prose description, can be run and tested. Then you can start refactoring, checking after each change that the function still works.6
After pushing the pieces around a bit, you might end up with something like this:
(defun algorithm-s (n max)
(loop for seen from 0
when (< (* (- max seen) (random 1.0)) n)
collect seen and do (decf n)
until (zerop n)))While it may not be immediately obvious that this code correctly implements Algorithm S, if you got here via a series of functions that all behave identically to the original literal translation of Knuth's recipe, you'd have good reason to believe it's correct.
Another aspect of the language that special operators give you
control over is the behavior of the call stack. For instance, while
you normally use BLOCK and TAGBODY to manage the flow of
control within a single function, you can also use them, in
conjunction with closures, to force an immediate nonlocal return from
a function further down on the stack. That's because BLOCK names
and TAGBODY tags can be closed over by any code within the
lexical scope of the BLOCK or TAGBODY. For example,
consider this function:
(defun foo ()
(format t "Entering foo~%")
(block a
(format t " Entering BLOCK~%")
(bar #'(lambda () (return-from a)))
(format t " Leaving BLOCK~%"))
(format t "Leaving foo~%"))The anonymous function passed to bar uses RETURN-FROM to
return from the BLOCK. But that RETURN-FROM doesn't get
evaluated until the anonymous function is invoked with FUNCALL
or APPLY. Now suppose bar looks like this:
(defun bar (fn) (format t " Entering bar~%") (baz fn) (format t " Leaving bar~%"))
Still, the anonymous function isn't invoked. Now look at baz.
(defun baz (fn) (format t " Entering baz~%") (funcall fn) (format t " Leaving baz~%"))
Finally the function is invoked. But what does it mean to
RETURN-FROM a block that's several layers up on the call stack?
Turns out it works fine--the stack is unwound back to the frame where
the BLOCK was established and control returns from the
BLOCK. The FORMAT expressions in foo, bar,
and baz show this:
CL-USER> (foo) Entering foo Entering BLOCK Entering bar Entering baz Leaving foo NIL
Note that the only "Leaving . . ." message that prints is the one
that appears after the BLOCK in foo.
Because the names of blocks are lexically scoped, a RETURN-FROM
always returns from the smallest enclosing BLOCK in the lexical
environment where the RETURN-FROM form appears even if the
RETURN-FROM is executed in a different dynamic context. For
instance, bar could also contain a BLOCK named a,
like this:
(defun bar (fn) (format t " Entering bar~%") (block a (baz fn)) (format t " Leaving bar~%"))
This extra BLOCK won't change the behavior of foo at
all--the name a is resolved lexically, at compile time, not
dynamically, so the intervening block has no effect on the
RETURN-FROM. Conversely, the name of a BLOCK can be used
only by RETURN-FROMs appearing within the lexical scope of the
BLOCK; there's no way for code outside the block to return from
the block except by invoking a closure that closes over a
RETURN-FROM from the lexical scope of the BLOCK.
TAGBODY and GO work the same way, in this regard, as
BLOCK and RETURN-FROM. When you invoke a closure that
contains a GO form, if the GO is evaluated, the stack will
unwind back to the appropriate TAGBODY and then jump to the
specified tag.
BLOCK names and TAGBODY tags, however, differ from lexical
variable bindings in one important way. As I discussed in Chapter 6,
lexical bindings have indefinite extent, meaning the bindings can
stick around even after the binding form has returned. BLOCKs
and TAGBODYs, on the other hand, have dynamic extent--you can
RETURN-FROM a BLOCK or GO to a TAGBODY tag only
while the BLOCK or TAGBODY is on the call stack. In other
words, a closure that captures a block name or TAGBODY tag can
be passed down the stack to be invoked later, but it can't be
returned up the stack. If you invoke a closure that tries to
RETURN-FROM a BLOCK, after the BLOCK itself has
returned, you'll get an error. Likewise, trying to GO to a
TAGBODY that no longer exists will cause an error.7
It's unlikely you'll need to use BLOCK and TAGBODY yourself
for this kind of stack unwinding. But you'll likely be using them
indirectly whenever you use the condition system, so understanding
how they work should help you understand better what exactly, for
instance, invoking a restart is doing.8
CATCH and THROW are another pair of special operators that
can force the stack to unwind. You'll use these operators even less
often than the others mentioned so far--they're holdovers from
earlier Lisp dialects that didn't have Common Lisp's condition
system. They definitely shouldn't be confused with
try/catch and try/except constructs from
languages such as Java and Python.
CATCH and THROW are the dynamic counterparts of BLOCK
and RETURN-FROM. That is, you wrap CATCH around a body of
code and then use THROW to cause the CATCH form to return
immediately with a specified value. The difference is that the
association between a CATCH and THROW is established
dynamically--instead of a lexically scoped name, the label for a
CATCH is an object, called a catch tag, and any THROW
evaluated within the dynamic extent of the CATCH that throws
that object will unwind the stack back to the CATCH form and
cause it to return immediately. Thus, you can write a version of the
foo, bar, and baz functions from before using
CATCH and THROW instead of BLOCK and RETURN-FROM
like this:
(defparameter *obj* (cons nil nil)) ; i.e. some arbitrary object
(defun foo ()
(format t "Entering foo~%")
(catch *obj*
(format t " Entering CATCH~%")
(bar)
(format t " Leaving CATCH~%"))
(format t "Leaving foo~%"))
(defun bar ()
(format t " Entering bar~%")
(baz)
(format t " Leaving bar~%"))
(defun baz ()
(format t " Entering baz~%")
(throw *obj* nil)
(format t " Leaving baz~%"))Notice how it isn't necessary to pass a closure down the
stack--baz can call THROW directly. The result is quite
similar to the earlier version.
CL-USER> (foo) Entering foo Entering CATCH Entering bar Entering baz Leaving foo NIL
However, CATCH and THROW are almost too dynamic. In
both the CATCH and the THROW, the tag form is evaluated,
which means their values are both determined at runtime. Thus, if
some code in bar reassigned or rebound *obj*, the
THROW in baz wouldn't throw to the same CATCH. This
makes CATCH and THROW much harder to reason about than
BLOCK and RETURN-FROM. The only advantage, which the
version of foo, bar, and baz that use CATCH
and THROW demonstrates, is there's no need to pass down a
closure in order for low-level code to return from a CATCH--any
code that runs within the dynamic extent of a CATCH can cause it
to return by throwing the right object.
In older Lisp dialects that didn't have anything like Common Lisp's
condition system, CATCH and THROW were used for error
handling. However, to keep them manageable, the catch tags were
usually just quoted symbols, so you could tell by looking at a
CATCH and a THROW whether they would hook up at runtime. In
Common Lisp you'll rarely have any call to use CATCH and
THROW since the condition system is so much more flexible.
The last special operator related to controlling the stack is another
one I've mentioned in passing before--UNWIND-PROTECT.
UNWIND-PROTECT lets you control what happens as the stack
unwinds--to make sure that certain code always runs regardless of how
control leaves the scope of the UNWIND-PROTECT, whether by a
normal return, by a restart being invoked, or by any of the ways
discussed in this section.9 The
basic skeleton of UNWIND-PROTECT looks like this:
(unwind-protect protected-form cleanup-form*)
The single protected-form is evaluated, and then, regardless of
how it returns, the cleanup-forms are evaluated. If the
protected-form returns normally, then whatever it returns is
returned from the UNWIND-PROTECT after the cleanup forms run.
The cleanup forms are evaluated in the same dynamic environment as
the UNWIND-PROTECT, so the same dynamic variable bindings,
restarts, and condition handlers will be visible to code in cleanup
forms as were visible just before the UNWIND-PROTECT.
You'll occasionally use UNWIND-PROTECT directly. More often
you'll use it as the basis for WITH- style macros, similar to
WITH-OPEN-FILE, that evaluate any number of body forms in a
context where they have access to some resource that needs to be
cleaned up after they're done, regardless of whether they return
normally or bail via a restart or other nonlocal exit. For example,
if you were writing a database library that defined functions
open-connection and close-connection, you might write a
macro like this:10
(defmacro with-database-connection ((var &rest open-args) &body body)
`(let ((,var (open-connection ,@open-args)))
(unwind-protect (progn ,@body)
(close-connection ,var))))which lets you write code like this:
(with-database-connection (conn :host "foo" :user "scott" :password "tiger") (do-stuff conn) (do-more-stuff conn))
and not have to worry about closing the database connection, since
the UNWIND-PROTECT will make sure it gets closed no matter what
happens in the body of the with-database-connection form.
Another feature of Common Lisp that I've mentioned in passing--in
Chapter 11, when I discussed GETHASH--is the ability for a
single form to return multiple values. I'll discuss it in greater
detail now. It is, however, slightly misplaced in a chapter on
special operators since the ability to return multiple values isn't
provided by just one or two special operators but is deeply
integrated into the language. The operators you'll most often use
when dealing with multiple values are macros and functions, not
special operators. But it is the case that the basic ability to get
at multiple return values is provided by a special operator,
MULTIPLE-VALUE-CALL, upon which the more commonly used
MULTIPLE-VALUE-BIND macro is built.
The key thing to understand about multiple values is that returning
multiple values is quite different from returning a list--if a form
returns multiple values, unless you do something specific to capture
the multiple values, all but the primary value will be silently
discarded. To see the distinction, consider the function
GETHASH, which returns two values: the value found in the hash
table and a boolean that's NIL when no value was found. If it
returned those two values in a list, every time you called
GETHASH you'd have to take apart the list to get at the actual
value, regardless of whether you cared about the second return value.
Suppose you have a hash table, *h*, that contains numeric
values. If GETHASH returned a list, you couldn't write something
like this:
(+ (gethash 'a *h*) (gethash 'b *h*))
because + expects its arguments to be numbers, not lists. But
because the multiple value mechanism silently discards the secondary
return value when it's not wanted, this form works fine.
There are two aspects to using multiple values--returning multiple
values and getting at the nonprimary values returned by forms that
return multiple values. The starting points for returning multiple
values are the functions VALUES and VALUES-LIST. These are
regular functions, not special operators, so their arguments are
passed in the normal way. VALUES takes a variable number of
arguments and returns them as multiple values; VALUES-LIST takes
a single list and returns its elements as multiple values. In other
words:
(values-list x) === (apply #'values x)
The mechanism by which multiple values are returned is implementation
dependent just like the mechanism for passing arguments into
functions is. Almost all language constructs that return the value of
some subform will "pass through" multiple values, returning all the
values returned by the subform. Thus, a function that returns the
result of calling VALUES or VALUES-LIST will itself return
multiple values--and so will another function whose result comes from
calling the first function. And so on.11
But when a form is evaluated in a value position, only the primary
value will be used, which is why the previous addition form works the
way you'd expect. The special operator MULTIPLE-VALUE-CALL
provides the mechanism for getting your hands on the multiple values
returned by a form. MULTIPLE-VALUE-CALL is similar to
FUNCALL except that while FUNCALL is a regular function
and, therefore, can see and pass on only the primary values passed to
it, MULTIPLE-VALUE-CALL passes, to the function returned by its
first subform, all the values returned by the remaining subforms.
(funcall #'+ (values 1 2) (values 3 4)) ==> 4 (multiple-value-call #'+ (values 1 2) (values 3 4)) ==> 10
However, it's fairly rare that you'll simply want to pass all the
values returned by a function onto another function. More likely,
you'll want to stash the multiple values in different variables and
then do something with them. The MULTIPLE-VALUE-BIND macro, which
you saw in Chapter 11, is the most frequently used operator for
accepting multiple return values. Its skeleton looks like this:
(multiple-value-bind (variable*) values-form body-form*)
The values-form is evaluated, and the multiple values it returns are bound to the variables. Then the body-forms are evaluated with those bindings in effect. Thus:
(multiple-value-bind (x y) (values 1 2) (+ x y)) ==> 3
Another macro, MULTIPLE-VALUE-LIST, is even simpler--it takes a
single form, evaluates it, and collects the resulting multiple values
into a list. In other words, it's the inverse of VALUES-LIST.
CL-USER> (multiple-value-list (values 1 2)) (1 2) CL-USER> (values-list (multiple-value-list (values 1 2))) 1 2
However, if you find yourself using MULTIPLE-VALUE-LIST a lot,
it may be a sign that some function should be returning a list to
start with rather than multiple values.
Finally, if you want to assign multiple values returned by a form to
existing variables, you can use VALUES as a SETFable place.
For example:
CL-USER> (defparameter *x* nil) *X* CL-USER> (defparameter *y* nil) *Y* CL-USER> (setf (values *x* *y*) (floor (/ 57 34))) 1 23/34 CL-USER> *x* 1 CL-USER> *y* 23/34
A special operator you'll need to understand in order to write
certain kinds of macros is EVAL-WHEN. For some reason, Lisp
books often treat EVAL-WHEN as a wizards-only topic. But the
only prerequisite to understanding EVAL-WHEN is an understanding
of how the two functions LOAD and COMPILE-FILE interact.
And understanding EVAL-WHEN will be important as you start
writing certain kinds of more sophisticated macros, such as the ones
you'll write in Chapters 24 and 31.
I've touched briefly on the relation between LOAD and
COMPILE-FILE in previous chapters, but it's worth reviewing
again here. The job of LOAD is to load a file and evaluate all
the top-level forms it contains. The job of COMPILE-FILE is to
compile a source file into a FASL file, which can then be loaded with
LOAD such that (load "foo.lisp") and (load
"foo.fasl") are essentially equivalent.
Because LOAD evaluates each form before reading the next, the
side effects of evaluating forms earlier in the file can affect how
forms later in the form are read and evaluated. For instance,
evaluating an IN-PACKAGE form changes the value of
*PACKAGE*, which will affect the way subsequent forms are
read.12 Similarly, a DEFMACRO form early in a file can define a
macro that can then be used by code later in the file.13
COMPILE-FILE, on the other hand, normally doesn't evaluate the
forms it's compiling; it's when the FASL is loaded that the forms--or
their compiled equivalents--will be evaluated. However,
COMPILE-FILE must evaluate some forms, such as IN-PACKAGE
and DEFMACRO forms, in order to keep the behavior of (load
"foo.lisp") and (load "foo.fasl") consistent.
So how do macros such as IN-PACKAGE and DEFMACRO work when
processed by COMPILE-FILE? In some pre-Common Lisp versions of
Lisp, the file compiler simply knew it should evaluate certain macros
in addition to compiling them. Common Lisp avoided the need for such
kludges by borrowing the EVAL-WHEN special operator from Maclisp.
This operator, as its name suggests, allows you to control when
specific bits of code are evaluated. The skeleton of an EVAL-WHEN
form looks like this:
(eval-when (situation*) body-form*)
There are three possible situations--:compile-toplevel,
:load-toplevel, and :execute--and which ones you specify
controls when the body-forms will be evaluated. An EVAL-WHEN
with multiple situations is equivalent to several EVAL-WHEN
forms, one per situation, each with the same body code. To explain the
meaning of the three situations, I'll need to explain a bit about how
COMPILE-FILE, which is also referred to as the file compiler,
goes about compiling a file.
To explain how COMPILE-FILE compiles EVAL-WHEN forms, I
need to introduce a distinction between compiling top-level forms
and compiling non-top-level forms. A top-level form is, roughly
speaking, one that will be compiled into code that will be run when
the FASL is loaded. Thus, all forms that appear directly at the top
level of a source file are compiled as top-level forms. Similarly,
any forms appearing directly in a top-level PROGN are compiled
as top-level forms since the PROGN itself doesn't do
anything--it just groups together its subforms, which will be run
when the FASL is loaded.14 Similarly, forms appearing
directly in a MACROLET or SYMBOL-MACROLET are compiled as
top-level forms because after the compiler has expanded the local
macros or symbol macros, there will be no remnant of the
MACROLET or SYMBOL-MACROLET in the compiled code. Finally,
the expansion of a top-level macro form will be compiled as a
top-level form.
Thus, a DEFUN appearing at the top level of a source file is a
top-level form--the code that defines the function and associates it
with its name will run when the FASL is loaded--but the forms within
the body of the function, which won't run until the function is
called, aren't top-level forms. Most forms are compiled the same when
compiled as top-level and non-top-level forms, but the semantics of an
EVAL-WHEN depend on whether it's being compiled as a top- level
form, compiled as a non-top-level form, or simply evaluated, combined
with what situations are listed in its situation list.
The situations :compile-toplevel and :load-toplevel
control the meaning of an EVAL-WHEN compiled as a top-level
form. When :compile-toplevel is present, the file compiler
will evaluate the subforms at compile time. When
:load-toplevel is present, it will compile the subforms as
top-level forms. If neither of these situations is present in a
top-level EVAL-WHEN, the compiler ignores it.
When an EVAL-WHEN is compiled as a non-top-level form, it's
either compiled like a PROGN, if the :execute situation
is specified, or ignored. Similarly, an evaluated
EVAL-WHEN--which includes top-level EVAL-WHENs in a source
file processed by LOAD and EVAL-WHENs evaluated at compile
time because they appear as subforms of a top-level EVAL-WHEN
with the :compile-toplevel situation--is treated like a
PROGN if :execute is present and ignored otherwise.
Thus, a macro such as IN-PACKAGE can have the necessary effect
at both compile time and when loading from source by expanding into
an EVAL-WHEN like the following:
(eval-when (:compile-toplevel :load-toplevel :execute) (setf *package* (find-package "PACKAGE-NAME")))
*PACKAGE* will be set at compile time because of the
:compile-toplevel situation, set when the FASL is loaded
because of :load-toplevel, and set when the source is loaded
because of the :execute.
There are two ways you're most likely to use EVAL-WHEN. One is
if you want to write macros that need to save some information at
compile time to be used when generating the expansion of other macro
forms in the same file. This typically arises with definitional
macros where a definition early in a file can affect the code
generated for a definition later in the same file. You'll write this
kind of macro in Chapter 24.
The other time you might need EVAL-WHEN is if you want to put
the definition of a macro and helper functions it uses in the same
file as code that uses the macro. DEFMACRO already includes an
EVAL-WHEN in its expansion so the macro definition is
immediately available to be used later in the file. But DEFUN
normally doesn't make function definitions available at compile time.
But if you use a macro in the same file as it's defined in, you need
the macro and any functions it uses to be defined. If you wrap
the DEFUNs of any helper functions used by the macro in an
EVAL-WHEN with :compile-toplevel, the definitions will be
available when the macro's expansion function runs. You'll probably
want to include :load-toplevel and :execute as well
since the macros will also need the function definitions after the
file is compiled and loaded or if you load the source instead of
compiling.
The four remaining special operators, LOCALLY, THE,
LOAD-TIME-VALUE, and PROGV, all allow you to get at parts
of the underlying language that can't be accessed any other way.
LOCALLY and THE are part of Common Lisp's declaration
system, which is used to communicate things to the compiler that don't
affect the meaning of your code but that may help the compiler
generate better code--faster, clearer error messages, and so
on.15 I'll
discuss declarations briefly in Chapter 32.
The other two, LOAD-TIME-VALUE and PROGV, are infrequently
used, and explaining the reason why you might ever want to use
them would take longer than explaining what they do. So I'll just
tell you what they do so you know they're there. Someday you'll hit
on one of those rare times when they're just the thing, and then
you'll be ready.
LOAD-TIME-VALUE is used, as its name suggests, to create a value
that's determined at load time. When the file compiler compiles code
that contains a LOAD-TIME-VALUE form, it arranges to evaluate
the first subform once, when the FASL is loaded, and for the code
containing the LOAD-TIME-VALUE form to refer to that value. In
other words, instead of writing this:
(defvar *loaded-at* (get-universal-time)) (defun when-loaded () *loaded-at*)
you can write the following:
(defun when-loaded () (load-time-value (get-universal-time)))
In code not processed by COMPILE-FILE, LOAD-TIME-VALUE is
evaluated once when the code is compiled, which may be when you
explicitly compile a function with COMPILE or earlier because of
implicit compilation performed by the implementation in the course of
evaluating the code. In uncompiled code, LOAD-TIME-VALUE
evaluates its form each time it's evaluated.
Finally, PROGV creates new dynamic bindings for variables whose
names are determined at runtime. This is mostly useful for
implementing embedded interpreters for languages with dynamically
scoped variables. The basic skeleton is as follows:
(progv symbols-list values-list body-form*)
where symbols-list is a form that evaluates to a list of symbols
and values-list is a form that evaluates to a list of values.
Each symbol is dynamically bound to the corresponding value, and then
the body-forms are evaluated. The difference between PROGV
and LET is that because symbols-list is evaluated at
runtime, the names of the variables to bind can be determined
dynamically. As I say, this isn't something you need to do often.
And that's it for special operators. In the next chapter, I'll get back to hard-nosed practical topics and show you how to use Common Lisp's package system to take control of your namespaces so you can write libraries and applications that can coexist without stomping on each other's names.
1Of course,
if IF wasn't a special operator but some other conditional form,
such as COND, was, you could build IF as a macro. Indeed,
in many Lisp dialects, starting with McCarthy's original Lisp,
COND was the primitive conditional evaluation operator.
2Well, technically those
constructs could also expand into a LAMBDA expression since, as
I mentioned in Chapter 6, LET could be defined--and was in some
earlier Lisps--as a macro that expands into an invocation of an
anonymous function.
3Surprising as it may seem, it actually is possible to make anonymous functions recurse. However, you must use a rather esoteric mechanism known as the Y combinator. But the Y combinator is an interesting theoretical result, not a practical programming tool, so is well outside the scope of this book.
4It's not
required that WITH-SLOTS be implemented with
SYMBOL-MACROLET--in some implementations, WITH-SLOTS may
walk the code provided and generate an expansion with x,
y, and z already replaced with the appropriate
SLOT-VALUE forms. You can see how your implementation does it by
evaluating this form:
(macroexpand-1 '(with-slots (x y z) obj (list x y z)))
However, walking the body is much easier for the Lisp implementation
to do than for user code; to replace x, y, and z
only when they appear in value positions requires a code walker that
understands the syntax of all special operators and that recursively
expands all macro forms in order to determine whether their expansions
include the symbols in value positions. The Lisp implementation
obviously has such a code walker at its disposal, but it's one of the
few parts of Lisp that's not exposed to users of the language.
5One version of f2cl is available as
part of the Common Lisp Open Code Collection (CLOCC):
http://clocc.sourceforge.net/. By contrast, consider the
tricks the authors of f2j, a FORTRAN-to-Java translator, have to
play. Although the Java Virtual Machine (JVM) has a goto instruction,
it's not directly exposed in Java. So to compile FORTRAN gotos, they
first compile the FORTRAN code into legal Java source with calls to a
dummy class to represent the labels and gotos. Then they compile the
source with a regular Java compiler and postprocess the byte codes to
translate the dummy calls into JVM-level byte codes. Clever, but what
a pain.
6Since this algorithm depends on values returned by
RANDOM, you may want to test it with a consistent random seed,
which you can get by binding *RANDOM-STATE* to the value of
(make-random-state nil) around each call to
algorithm-s. For instance, you can do a basic sanity check of
algorithm-s by evaluating this:
(let ((*random-state* (make-random-state nil))) (algorithm-s 10 200))
If your refactorings are all valid, this expression should evaluate to the same list each time.
7This is a pretty reasonable restriction--it's not entirely clear what it'd mean to return from a form that has already returned--unless, of course, you're a Scheme programmer. Scheme supports continuations, a language construct that makes it possible to return from the same function call more than once. But for a variety of reasons, few, if any, languages other than Scheme support this kind of continuation.
8If you're the kind of
person who likes to know how things work all the way down to the
bits, it may be instructive to think about how you might implement
the condition system's macros using BLOCK, TAGBODY,
closures, and dynamic variables.
9UNWIND-PROTECT is essentially
equivalent to try/finally constructs in Java and Python.
10And indeed, CLSQL, the multi-Lisp,
multidatabase SQL interface library, provides a similar macro called
with-database. CLSQL's home page is at
http://clsql.b9.com.
11A small handful of macros
don't pass through extra return values of the forms they evaluate. In
particular, the PROG1 macro, which evaluates a number of forms
like a PROGN before returning the value of the first form,
returns that form's primary value only. Likewise, PROG2, which
returns the value of the second of its subforms, returns only the
primary value. The special operator MULTIPLE-VALUE-PROG1 is a
variant of PROG1 that returns all the values returned by the
first form. It's a minor wart that PROG1 doesn't already behave
like MULTIPLE-VALUE-PROG1, but neither is used often enough that
it matters much. The OR and COND macros are also not always
transparent to multiple values, returning only the primary value of
certain subforms.
12The reason loading a file with an IN-PACKAGE form in
it has no effect on the value of *PACKAGE* after LOAD
returns is because LOAD binds *PACKAGE* to its current
value before doing anything else. In other words, something
equivalent to the following LET is wrapped around the rest of
the code in LOAD:
(let ((*package* *package*)) ...)
Any assignment to *PACKAGE* will be to the new binding, and the
old binding will be restored when LOAD returns. It also binds the
variable *READTABLE*, which I haven't discussed, in the same
way.
13In some
implementations, you may be able to get away with evaluating
DEFUNs that use undefined macros in the function body as long as
the macros are defined before the function is actually called. But
that works, if at all, only when LOADing the definitions from
source, not when compiling with COMPILE-FILE, so in general macro
definitions must be evaluated before they're used.
14By contrast, the subforms in a
top-level LET aren't compiled as top-level forms because they're
not run directly when the FASL is loaded. They will run, but it's in
the runtime context of the bindings established by the LET.
Theoretically, a LET that binds no variables could be treated
like a PROGN, but it's not--the forms appearing in a LET
are never treated as top-level forms.
15The one declaration that has an effect on the semantics of a
program is the SPECIAL declaration mentioned in Chapter 6.