\documentclass[11pt]{report} % -*- mode: latex; mode: font-lock; mode: auto-fill-mode -*- 
\usepackage{rcs} 
\usepackage{verbatim}
\usepackage{makeidx}\makeindex 
\newcommand{\tag}[1]{$\langle$#1$\rangle$} % an element tag 
\newcommand{\abbrev}[1]{{#1}\index{#1}} % An abbreviation (usually an acronym) 
\newcommand{\att}[1]{\texttt{#1}}      % an attribute name 
\newcommand{\attval}[1]{\texttt{"#1"}} % an attribute value 
\newcommand{\app}[1]{\textsc{#1}\index{#1}}      % an application name (e.g. Jade) 
\newcommand{\filename}[1]{\texttt{#1}\index{#1}}     % a file name (e.g. readme.txt) 
\newcommand{\dsssl}[1]{\textit{#1}}    % a DSSSL function name 
\newcommand{\cmd}[1]{\par\noindent\texttt{#1}\par} 
        % a command line 
\renewcommand{\Diamond}{\relax} 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% The following are used for NOTEs, (CAUTIONs, ...) 
\newenvironment{note}{\begin{trivlist}\item[NOTE:] }{\end{trivlist}} 
\newenvironment{caution}{\begin{trivlist}\item[CAUTION:] }{\end{trivlist}} 
\newenvironment{warning}{\begin{trivlist}\item[WARNING:] }{\end{trivlist}} 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
\RCS$Revision: 0.109 $ \RCS$Date: 1999/12/31 19:35:12 $ 
\newcommand{\release}{Revision~\RCSRevision} 
\newcommand{\acronym}[1]{\textsc{\lowercase{#1}}\index{#1}} 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
\iftrue \usepackage[pdftex]{hyperref}  
 \hypersetup{pdftitle={An SGML-based Literate Programming System},
             pdfkeywords={SGML, DSSSL, literate programming}, 
             pdfpagemode={UseOutlines} }
\else  
   \newcommand{\url}[1]{\texttt{#1}}  
 \fi 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
\title{An Experiment in Literate Programming Using \acronym{SGML} and 
       \acronym{DSSSL}\\ {\large \release}}  
\author{Mark B. Wroth} 
\date{\RCSDate\relax}
\begin{document}\maketitle 

\tableofcontents

\chapter{Purpose} 
 
\section{Background}

\textit{Literate programming} is a style of computer programming in
which priority is given to the exposition of the
program \emph{to the human reader}, rather than to the convenience of
the computer which will execute the program.  Since computers are
notoriously intolerant of changes in the way their inputs are
structured, and most computer programming languages have, at best,
limited facilities for any but the simplest comments, this style
requires a set of tools which allow the author to
both explain the program for the human audience, and give precise
instructions on how the program is to be read by the computer.

The original literate programming system, \app{WEB}, was developed by
Professor Donald Knuth, who used it in the production of \TeX\ and
other programs.  This system combined the \TeX\ typesetting system
with the \app{Pascal} programming language.  Subsequent developments
have largely been confined to adding to the choice of programming
languages.  Various ``language-aware'' \app{WEB} variants have
appeared since the original \app{WEB} system, covering a variety of
programming languages.  The difficulty of converting \app{WEB} systems
to other languages prompted Norman Ramsey to continue Silvio Levy's
work to develop a \app{SPIDER} system to assist in generating new
\app{WEB}s for different languages, but even this system does not make
it trivial to add a new language-aware \app{WEB}. Krommes, with his
development of \app{FWEB}, appears to have developed the concept of a
``current language'', allowing the same program web to contain program
text in multiple languages while retaining the language awareness and
specialized typesetting of the original \app{WEB}.

A second branch of literate programming development addressed the
difficulties of developing \app{WEB} systems in new languages by
ignoring the language characteristics entirely. This branch, typified
by such programs as \app{Noweb} and \app{Nuweb}, makes no attempt to
``pretty print'' the program text or take advantage of syntactic
knowledge of the program.  Instead, the program text is reproduced
verbatim in the typeset documentation. In addition to simplifying the
use of a new programming language---since there are no language
dependencies---some programmers prefer seeing the source code
reproduced more or less as they would in the editor in which it was
written.  Since a language independent \app{WEB} can be dramatically
simpler than a language aware version, this approach has some obvious
advantages which offset its inability to take advantage of the target
language.

But despite all of the activity centered around adapting \app{WEB} to
various programming languages, relatively little effort has been
devoted to changing the documentation language used.  With the
exception of one early effort using the \app{troff} typesetting
system (Thimbleby's system, confusingly called \app{Cweb} although
it is unrelated to the more commonly known literate programming system
named \app{cweb}, by Silvio Levy), almost all of the literate
programming systems use \TeX\ as their documentation language.  This
may be due to the difficulty of the typesetting task---Sewell reports
that Thimbleby (the author of the \app{troff} version of \app{Cweb})
estimated that 95 percent of the effort involved in that system was in
this area \cite[p.~144]{Sewell89}.  Recently, a few systems have
emerged which relax this restriction: \app{Nuweb} \cite{Nuweb}, for
example, can be used to produce \acronym{HTML} with some effort, and
\app{FunnelWEB} \cite{FunnelWeb} attempts to provide some formatter
independence.

I am not aware, however, of any released systems which have used
\acronym{SGML} (or its cousin \acronym{XML}) markup to define a
literate program, despite the apparent easy fit of the
concept\footnote{As I asked for (and got) assistance with the
  \acronym{DSSSL} code for the \attval{tangle} style sheet used in
  this paper from the \texttt{DSSSList}, an email list devoted to the
  \acronym{DSSSL} programming language, Christopher R. Madden
  (\texttt{chrism@@exemplary.net}) commented that he was in the
  process of such a project using the \acronym{XML} variant of
  \app{DocBook}.  Additionally, C.M. Sperberg-McQueen has written a
  tag set for literate programming called \app{Sweb}
  \cite{SperbergMcQueen96}.  While he appears to have implemented at
  least part of the necessary processing software using \app{Lex} and
  \app{YACC}, and has made papers discussing the system available on
  the web, he labels the work unfinished and unpublished.  It is
  nonetheless a very interesting system, and shows considerable
  thought.}.  This seems odd, since the structured nature of
\acronym{SGML} would seem to lend itself to the natural intermingling
of code and documentation that is at the heart of literate
programming.  Additionally, a variety of powerful tools to author and
manipulate \acronym{SGML}-marked up documents have emerged; such tools
would appear to greatly simplify the creation of an
\acronym{SGML}-based \app{WEB} system. Among the readily available tools
that would seem applicable are \app{Perl} (with freely available
\acronym{SGML} libraries), \app{Omnimark}, and \acronym{DSSSL}.

\acronym{DSSSL}, the Document Style Semantics and Specification
Language, is an \acronym{ISO} standard \cite{DSSSL} language aimed at
producing output documents from \acronym{SGML}-marked up input files.
Frustratingly, the release of the \acronym{DSSSL} standard in 1996
appears not to have been accompanied by any programs implementing the
defined language, nor have any complete implementations appeared
since.  James Clark's \acronym{DSSSL} implementation, \app{Jade}
\cite{JADE}, was used for this paper. It is free, readily available,
and implements a significant fraction of the \acronym{ISO}-defined
style language.  In addition, it implements a number of extensions for
\acronym{SGML}-to-\acronym{SGML} transformations which make it quite
effective for that purpose despite not implementing the
\acronym{ISO}-defined \acronym{DSSSL} transformation language.

This paper is an experiment in creating a ``proof of concept''
literate programming system using \acronym{SGML} markup for
documentation and code scrap delimitation, and
\acronym{DSSSL}\nocite{DSSSL}, in the form of James Clark's \app{Jade}
processor \cite{JADE}, for the implementation language.

\section{Design} 

A literate programming system has two basic processing branches, which
we will call the \attval{tangle} and \attval{weave} branches after the
original programs defined by Dr.~Knuth \cite{Knuth91}. The
\attval{weave} branch produces the form that is converted into a
human-readable program listing (traditionally in hard-copy but more
recently in on-line forms as well). The \attval{tangle} branch
produces the source files as they are used by the computer itself.
 
The \attval{weave} branch is straightforward, at least in principle.
It amounts to using \acronym{SGML} to mark up a document for printing,
and this is an area where a great deal of effort has been expended.
Including ``pretty printing'' of the source code appears
straightforward, if not necessarily trivial, if we are willing to mark
up the source code.  It is probably doable even if we are not.
However, for the first cut, we will simply assume that no pretty
printing is needed and very simple documentation is used.  Basically,
we are not going to spend much effort here because we think that this
branch is clearly within the capabilities of \acronym{SGML} and
\acronym{SGML}-based processing systems.  The simplified
\attval{weave} processing script is shown in
Section~\ref{sec:dssslweave}, and the complete script is reproduced in
Appendix~\ref{app:dsssl}.
 
The \attval{tangle} branch is more challenging. The goal of the 
experiment is to demonstrate: 
\begin{itemize} 
\item Assembly of code scraps; 
\item Insertion of assembled code scraps into other scraps; 
\item Output of assembled scraps to disk file. 
\end{itemize} 
The \attval{tangle} script is discussed in
Section~\ref{sec:dsssltangle}, and the complete script is in
Appendix~\ref{app:dsssl}. 
 
In essence, we have two kinds of \textit{header} scraps: scraps which
will be written to a file, and \textit{definition} scraps which are
included in other scraps as part of the definition of the top level
program. Either kind of scrap may be continued by other scrap
definitions, which shall be assembled in the order they appear in the
input file.

Other possible functionality to consider: 
\begin{itemize} 
\item Macro definition.  Deferred for future continuation.  Experience 
  with \app{Nuweb} indicates that this functionality may not be 
  necessary.  Additionally, \acronym{SGML} itself allows for a
  primitive macro facility in the form of entity definitions.  While
  this has some disadvantages from the perspective of clear
  elucidation of the concepts (the entity definitions are hidden from
  the reader), the fact that some literate programming systems omit
  the macro capability while retaining significant functionality,
  combined with the \acronym{SGML} entity facility, persuades me to
  defer this capability---perhaps forever.
\item Scrap numbering
\item Scrap usage listing
\item List of files output
\end{itemize} 
 
\chapter{The Source \acronym{SGML} Document} 
 
In order to test the concepts, we need a sample document.  This 
chapter provides a valid \acronym{SGML} document as a test case. 
 
\section{The Document Type Definition} 
 
The basic \acronym{DTD} is very simple. 
 
@D Test DTD @{<!DOCTYPE document [ 
<!ELEMENT document o o (p|scrap|continuation)*> 
<!ELEMENT p        - o (#PCDATA|scrapref)*> 
@<The `scrap' element@> 
@<The `continuation' element@>
@<The `scrapref' element@> 
@<The `literal' element@> 
]> 
@| document p #PCDATA @} 
 

\subsection{The `scrap' element} 

There are, in fact, a number of syntactic uses for code scrap 
elements:  
\begin{itemize} 
\item Beginning of an output file definition (the ``unnamed section'' 
  in the original \app{WEB} system); 
\item Continuation of an output file definition; 
\item Beginning of a ``defined'' section---one which will eventually 
  be inserted into an output file section; 
\item Continuation of a ``defined'' section; 
\item Reference to a scrap within a scrap, intended to result in 
  the referenced scrap being inserted in the code in place of the 
  reference; 
\item Reference to a scrap in documentation, where it should be 
  treated as a citation. 
\end{itemize} 
All of these might be handled with a single element type.  In our
initial implementation we used two types, a \tag{scrap} for all of the
code definitions, and a \tag{scrapref} for the references to a
scrap. The initial implementation ran into trouble with nested,
continued scraps, and so we split out \tag{continuation}s.

 
The \tag{scrap} is the key element of the literate programming 
setup. It contains program code, which may be either inserted into 
another scrap or output to a file.  Scraps are not necessarily defined 
at a single point in the literate program; following Knuth's 
convention, they may be arbitrarily continued over many parts of the 
input file, and are assembled in the order in which they appear.   

It's not clear at this point if this is the right approach, but for
now we will define the initial scrap to have an \att{id} attribute and
possibly a \att{file} attribute indicating the output file.
Continuations use the \tag{continuation} element, with the scrap being
continued identified by the \att{continues} attribute, with its value
equal to the \att{id} of the beginning scrap.
 
@D The `scrap' element @{ 
<!ELEMENT scrap    - o (title, code)> 
<!ATTLIST scrap    file      CDATA #IMPLIED  
                   id        ID    #REQUIRED 
> 
<!ELEMENT title    o o (#PCDATA) > 
<!ELEMENT code     o o (#PCDATA|scrapref|literal)* > 
@| scrap title code CDATA ID IDREF #IMPLIED @} 
 

\subsection{The `continuation' element}

The \tag{continuation} element continues a scrap previously opened.

Because of difficulties with mixing modes associated with having
continuation scraps and a desire to clarify the syntax, we add a
\tag{continuation} element. This also significantly simplifies the
handling of nested scraps and their continuations.

@D The `continuation' element @{
<!ELEMENT continuation - o (code)>
<!ATTLIST continuation
                   continues IDREF #REQUIRED
>
@}

\subsection{The `scrapref' element} 
 
The \tag{scrapref} element is to be used to insert a scrap into 
a code section; the \att{id} attribute specifies the (head of the) 
scrap to be inserted.  It will also be used in documentation in a 
similar manner, except that there only a cross reference will be 
used.  
 
@D The `scrapref' element @{ 
<!ELEMENT scrapref - o EMPTY> 
<!ATTLIST scrapref id IDREF #REQUIRED > 
@| scrapref #REQUIRED @} 
 
 
\subsection{The `literal' element} 
 
The following definitions are used to provide a workaround to get an 
actual ``less than'' character into the \acronym{SGML} output. Since the 
character has syntactic meaning to the \acronym{SGML} parser, by default  
it is `escaped' when placed in the \acronym{SGML} output as character data. 
 
By defining an element to contain the required information, we let the 
\acronym{DSSSL} processor have access to it.  Defining  entity 
references to it simplifies the actual data entry.  If particular 
combinations seem appropriate for a specific programming language (for 
example the \verb+&&+ used below, which acts like a logical and), it 
would make sense to define entities which make syntactic sense.  This 
would allow one to use, for example \texttt{\&and;} instead of 
\texttt{\&amp;\&amp;}\footnote{The basic suggestion to use a
  formatting-instruction to address the problem came from David
  Carlisle \texttt{davidc@@nag.co.uk} in a post to the DSSSList,
  Vol~3, Number~241.}. 
 
@D The `literal' element @{ 
<!ELEMENT literal  - o EMPTY  
       -- literal data, to be handled in the DSSSL --> 
<!ATTLIST literal data CDATA #REQUIRED> 
<!ENTITY  lt  "<literal data='&#60;'>"    
       -- ``less than'' sign--> 
<!ENTITY  gt  "<literal data='>'>"    
       -- ``greater than'' sign--> 
<!ENTITY  amp "<literal data='&#38;'>"    
       -- ``ampersand'' sign--> 
@| literal &gt; &lt; &amp;@} 
 
\section{The Document Instance} 
 
And here is the actual document.   
 
@O test.sgm @{@<Test DTD@> 
<document> 
<p>This is some sample documentation text. It is entirely
unremarkable. The included code conforms to no particular programming
language. It is chosen just to provide examples that can be examined
to see if it is being reproduced properly.  Because of this, it
includes punctuation marks that are likely to be syntactically
significant to the various processors. This particular scrap includes
a "less than" character, "<" which is the SGML element
start character.</p> 
<scrap file="scrap1.out" id="scrap1">The main code
<code>
-- scrap1 head 
  for i = 1 to 10 
    write i 
  rof 
  if a &lt; b fi 
-- include scrap2 by reference 
<scrapref id="scrap2"> 
 
</scrap> 
@}  
 
We split the desired output file into multiple scraps to test how the 
output entity is formed.  Unfortunately, if we just give all of the 
scraps the same file id, only one scrap is in the result.  While 
expected, this means we're going to have to be more canny in the 
\texttt{tangle} script so that we can get the desired concatenation. 
 
@O test.sgm @{  
<p>This is documentation of a continuation scrap, specifically the
first continuation of the first scrap. It is entirely unremarkable.</p>
<continuation continues="scrap1"> 
<code> 
-- first continuation of scrap1
  if (i &lt; 10) 
    call iout 
  fi 
 
</continuation>@} 
 

 
@O test.sgm @{  
<p>This scrap is another continuation.  It is unremarkable, except that
it contains two other characters likely to be an issue for the SGML
tools, specifically the "greater than" and ampersand characters (">"
and "&").</p>

<continuation continues="scrap1"> 
<code> 
  -- Second continuation of scrap1
  if (i &lt; 10) &amp;&amp; (j &gt; 12) 
    call iout 
  fi 
</continuation>@} 
 
Now we write another scrap which will be included in an 
output file.  Since this is a header scrap, it is defined with the
\tag{scrap} element.
@O test.sgm @{ 
<p>This is a header scrap, which is intended to be included in another
scrap in order to finally be included in an output file. The scrap
documentation is entirely unremarkable.</p>
<scrap id="scrap2"> 
<title>An included scrap (scrap2)
<code> 
-- included scrap2 head 
while a % b &lt; c 
  incr(a) 
end 
 
</scrap> 

<p>Some documentation of the next scrap. It is unremarkable in every way.</p>
<continuation continues="scrap2">
-- included scrap2 continuation 1
some more code

-- include scrap3 by reference
<scrapref id="scrap3">
</continuation>

<p> And finally, documentation of the third scrap. It is entirely
unremarkable, except that it includes a reference to the scrap that
it is included in, which is <scrapref id="scrap2">.</p>
<scrap id="scrap3">A nested scrap
<code>
-- contents of scrap3
-- scrap 3 should have continuation 1

</scrap>

<p>The third scrap is continued. The documentation is entirely
unremarkable, and is extended only to provide some reasonable text in
the woven file.</p>
<continuation continues="scrap3">
-- continuation 1 of scrap 3
</continuation>
</document>@} 
 
\chapter{Processing Scripts} 
\label{cha:dssslscripts}
 
We create the two desired processing scripts as simple shells 
to begin with. While in some ways it would be convenient to use a
third \dsssl{style-sheet} to contain common code, by using defined
scraps (in \app{Nuweb}, the literate programming tool being used for
this experiment) we can define the code once and use it as needed with
little trouble.
 
@O test.dsl @{ 
<!-- $Id: Experiment.w,v 0.109 1999/12/31 19:35:12 penny Exp penny $ --> 
<!DOCTYPE style-sheet  
  PUBLIC "-//James Clark//DTD DSSSL Style Sheet//EN"> 
<style-sheet> 
<style-specification 
 id = "tangle"> 
 @<DSSSL Tangle@> 
 @<Function to find the document element@> 
</style-specification> 
<style-specification 
 id = "weave"> 
 @<Weave declarations@>
 @<DSSSL Weave@> 
 @<Function to find the document element@> 
</style-specification> 
</style-sheet> 
@} 
 
\section{Supporting Functions} 
 
This section describes several \app{DSSSL} functions which are needed
at various spots in the \app{DSSSL} code.

\subsection{Finding the `document' Element}

The following function definition is from Norman Gray, 
\texttt{norman@@astro.gla.ac.uk}, and returns a singleton node list 
consisting of the root ``document-element'' of the current document 
(or \#f if there is no such element).   
 
@D Function to find the document element @{ 
(define (document-element #!optional (node (current-node))) 
  (let ((gr (node-property 'grove-root node))) 
   (if gr  ; gr is the grove root 
      (node-property 'document-element gr default: #f) 
      ;; else we're in the root rule now 
      (node-property 'document-element node default: #f)))) 
@} 
 


\subsection{Passing Literal Data}
\label{sec:literals}

This processing rule is used to pass literal data specified in a 
\tag{literal} element through to the \acronym{SGML} output.
 
@D Output literal data @{ 
(element literal 
 (make sequence 
  (make formatting-instruction 
    data: (attribute-string "data")))) 
 
@} 

This requires a non-standard flow object, which must be declared before use.
\index{non-standard flow objects}\index{flow objects, non-standard}% 
\index{Jade extensions} 
@D Tangle non-standard flow objects @{ 
(declare-flow-object-class formatting-instruction 
  "UNREGISTERED::James Clark//Flow Object Class::formatting-instruction") 
@| formatting-instruction @} 
 
\section{The \acronym{DSSSL} `Tangle' Script} 
\label{sec:dsssltangle}
 
Like the original \app{TANGLE}, the \attval{tangle}
style-specification is intended to output the code parts of the input
file in the order specified by the author. This is probably \emph{not}
the order they appear in the document; the document is organized for
human comprehension, while the output file must satisfy the needs of
the computer.
 
In essence, we have three major tasks to perform:
\begin{itemize}
\item Assemble a scrap from its defining pieces, which may include a
  header and any number of continuation pieces;
\item Insert an assembled scrap into another scrap;
\item Write an assembled scrap (including any inserted scraps) to a
  specified data file.
\end{itemize}

Both scraps which are to be directly written to file and those which
are used internally share the need to assemble all of their
continuation scraps.  This common need is discussed in
Section~\ref{sec:continuations}.  The insertion of defined scraps into
other scraps is covered in Section~\ref{sec:scrapref}, and the
top level output to file is the subject of Section~\ref{sec:fileoutput}.

@D DSSSL Tangle @{ 
@<Tangle non-standard flow objects@> 
@<Process a file output scrap@> 
@<Insert a scrap via `scrapref'@> 
@<Output literal data@> 
@} 
 
 
\subsection{File Output Scraps} 
\label{sec:fileoutput}
 
The file output scrap is in a sense the basic element of a literate 
program. It provides the ``top level'' output to  a file---which is 
the ultimate purpose of the \attval{tangle} routine! 
 
While we're processing the input file, we will ignore all scraps 
except those which produce file output.  There should be exactly one 
scrap with a \att{file} attribute referring to each output 
file. (Perhaps we could enforce this by defining the \att{file} 
attribute to be of type \texttt{ID}?) 

@D Process a file output scrap@{ 
(element scrap 
 (make sequence 
  (if (attribute-string "file") 
    (make entity 
      system-id: (attribute-string "file") 
      (make sequence 
        (process-matching-children 'code) 
        @<Find and process all scraps that refer to this one@>))
    (empty-sosofo))))
@} 
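
The rule that each output file be named by exactly one \tag{scrap} is
not checked by the script above.  One way such a check might be
written is sketched below; it is an illustration only and is not part
of \texttt{test.dsl}.  The names \dsssl{file-scrap-count} and
\dsssl{check-file-scrap} are hypothetical, and the sketch assumes that
\app{Jade} provides the \dsssl{general-name-normalize} and
\dsssl{error} procedures of the \acronym{DSSSL} expression language.

\begin{verbatim}
;; Hypothetical helpers, not part of test.dsl.
;; Count the scrap elements in node list nl whose file attribute
;; equals the string file.
(define (file-scrap-count file nl)
  (let loop ((rest nl) (n 0))
    (if (node-list-empty? rest)
        n
        (loop (node-list-rest rest)
              (let ((node (node-list-first rest)))
                (if (and (equal? (gi node)
                                 (general-name-normalize "scrap" node))
                         (equal? (attribute-string "file" node) file))
                    (+ n 1)
                    n))))))

;; Signal an error if more than one scrap names the given file.
(define (check-file-scrap file)
  (if (> (file-scrap-count file (descendants (document-element))) 1)
      (error (string-append "more than one scrap writes to " file))
      #t))
\end{verbatim}

A call such as \texttt{(check-file-scrap (attribute-string "file"))}
could be made in the \texttt{element scrap} rule, before the
\texttt{entity} flow object is created.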
 
This requires a non-standard flow object, the \texttt{entity}, which
must be declared before use.

\index{non-standard flow objects}\index{flow objects, non-standard}% 
\index{Jade extensions} 
@D Tangle non-standard flow objects @{ 
(declare-flow-object-class entity 
  "UNREGISTERED::James Clark//Flow Object Class::entity") 
@| entity @} 
 
\subsection{File Output Continuations} 
\label{sec:continuations} 
 
The point of this scrap is to find and process all of the
\tag{continuation} scraps of the current node.
We do this by selecting from all of the descendants of the document
node (i.e. the whole document instance) the nodes which have
\acronym{GI} ``continuation'' and a \att{continues} attribute with value
equal to the \att{id} attribute of the current node.
 

The implementation of this raises an interesting question regarding
the \acronym{DSSSL} language.  While the processing order of a
\texttt{process-node-list} is defined to be that of the list order
\cite[Section~12.4.3]{DSSSL}, it is less clear that the
\texttt{select-elements} and the \texttt{descendants} will provide the
node list in the correct order.  It appears from Chapter~10, and
specifically \cite[Section~10.2.5]{DSSSL} that
\texttt{select-elements} will preserve the order existing in the node
list that is its argument, although this is not explicitly
stated\footnote{My thanks to Brandon Ibach
  (\texttt{bibach@@infomansol.com}), in discussion on the DSSSList,
  for his assistance in clarifying this point.}.  The node list
provided to \texttt{select-elements} is created by the
\texttt{descendants} procedure \cite[10.2.3]{DSSSL}; here again the
implication is that the document order is preserved, but this is not
explicit.
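
If one preferred not to rely on the implied ordering of
\texttt{select-elements}, the selection could be written as an
explicit walk over the node list, which makes the preservation of list
order evident.  The following is a sketch only, not the code used in
this experiment; the name \dsssl{continuations-of} is hypothetical,
and the sketch assumes that \app{Jade} supports
\dsssl{general-name-normalize}.  It does not settle the separate
question of the order in which \texttt{descendants} delivers its
nodes.

\begin{verbatim}
;; Hypothetical alternative to select-elements, for illustration.
;; Return the continuation elements in nl that continue the scrap
;; whose id is the string id, in the order they occur in nl.
(define (continuations-of id nl)
  (if (node-list-empty? nl)
      (empty-node-list)
      (let ((node (node-list-first nl))
            (others (continuations-of id (node-list-rest nl))))
        (if (and (equal? (gi node)
                         (general-name-normalize "continuation" node))
                 (equal? (attribute-string "continues" node) id))
            (node-list node others)
            others))))
\end{verbatim}

It would be applied as \texttt{(process-node-list (continuations-of
(attribute-string "id") (descendants (document-element))))}.  Because
the procedure recurses once per node, it is suited to small test
documents rather than large ones.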
 
The expression needed to parse the attribute is somewhat tricky (at
least for the author, who found this to be an instructive example on
the difference between quotation and quasi-quotation in
\acronym{DSSSL}).  Quoting Brandon Ibach
(\texttt{bibach@@infomansol.com})\footnote{In the DSSSList Digest,
  Vol.~3, Number~242.}, who resolved the problem in a post
to the DSSSList:
\begin{quotation}
  The problem here is that the single quote in your version quoted
the \emph{entire} expression, meaning that the ``attribute-string'' symbol
and the ``id'' string got passed in as part of the pattern, rather than
being evaluated and replaced with the value of the \attval{ID} attribute.
The backquote, above, introduces a ``quasi-quote'' expression, which is
similar to a regular quoted expression, except that you can ``unquote''
certain parts of it, so that they will be evaluated.  In this case,
we're unquoting the (attribute-string) call, such that the final
result of this would be a structure like:

         \texttt{(scrap (continues "ABC"))}
         
\noindent if the current node was an element with an \att{ID} of
\attval{ABC}, that is. \texttt{:)}
\end{quotation}
 
We will reuse this code to process the continuation scraps for a scrap
reference, as well.

@D Find and process...@{ 
  (make sequence 
    (process-node-list 
      (select-elements 
        (descendants 
          (document-element (current-node))) 
          `(continuation 
             (continues ,(attribute-string "id"))))))
@} 
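
The difference between quotation and quasi-quotation can be seen by
writing the pattern three ways.  This is an illustrative fragment
rather than part of the script; the third form, built with
\texttt{list}, is offered only as an equivalent spelling of the
quasi-quoted form.

\begin{verbatim}
;; Quoted: nothing inside is evaluated, so the pattern contains the
;; symbol attribute-string and the string "id" themselves.
'(continuation (continues (attribute-string "id")))

;; Quasi-quoted: the comma unquotes the call, so the pattern contains
;; the value of the current node's id attribute.
`(continuation (continues ,(attribute-string "id")))

;; The same structure built explicitly with list.
(list 'continuation (list 'continues (attribute-string "id")))
\end{verbatim}

For a current node whose \att{id} is \attval{ABC}, the second and
third forms both yield \texttt{(continuation (continues "ABC"))}.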
 
\subsection{Scrap References} 
\label{sec:scrapref} 

A \tag{scrapref} in program code indicates that we should insert the 
complete scrap referenced at this point in the program.  The basic 
strategy is the same as with a file output scrap, except that we need 
to start by finding the scrap head, and we need the other processing
branch (when the \att{file} is \emph{not} specified).

@D Insert a scrap via `scrapref'@{ 
(element scrapref 
  (with-mode scrapreference
    (make sequence 
      (process-element-with-id 
        (attribute-string "id")))))
(mode scrapreference
  (element scrap
    (make sequence 
      (if (attribute-string "file") 
          (empty-sosofo)
          (make sequence 
            (process-matching-children 'code) 
            @<Find and process all scraps that refer to this one@>))))
)
@} 
 
\section{The \acronym{DSSSL} `Weave' Script} 
\label{sec:dssslweave}
 
In contrast to the \attval{tangle} specification, the \attval{weave} 
style-specification produces the human readable documentation. 

The only element we are using for general documentation is the \tag{p}
element for general paragraphs.
 
@D DSSSL Weave @{ 
(element p 
  (make paragraph 
    (process-children))) 
@}

A \tag{scrapref} appearing in running text is set using the
\dsssl{scraptitle} mode, which we will reuse at the beginning of each
defined scrap.
@D DSSSL Weave @{ 
(element scrapref
  (make sequence
    (with-mode scraptitle
      (process-element-with-id 
        (attribute-string "id")))))
@}

For header scraps, we show the name of the scrap followed by an
equivalence sign, followed by the text of the scrap itself.
@D DSSSL Weave @{ 
(element scrap 
  (make sequence 
    (make paragraph 
      (make sequence
        (with-mode scraptitle
          (process-matching-children 'title))
      (literal "\identical-to")))
    (make paragraph 
      lines: 'asis 
      font-family-name: "Courier New" 
      (process-matching-children 'code))))
@}

@D DSSSL Weave @{ 
(element (code scrapref)
  (make sequence
     lines: 'asis 
     font-family-name: "Courier New" 
     (process-children)
     @<Find and process ...@>
))
@}

For continuation scraps, we do the same as with header scraps, adding
a plus sign to indicate that this is a continuation.
@D DSSSL Weave @{ 
(element continuation
  (make sequence 
    (make paragraph 
      (make sequence
        (with-mode scraptitle
          (process-element-with-id 
            (attribute-string "continues")))
        (literal "\identical-to +")))
    (make paragraph 
      lines: 'asis 
      font-family-name: "Courier New" 
      (process-matching-children 'code))))
@}

The \dsssl{scraptitle} mode sets the title of the referenced scrap. We 
also include a section number indicating the section being written or
continued, and, in the case of a file output scrap, the name of the
file. 

This code refers to the \dsssl{current-node}, which is the
\emph{header} scrap, not the continuation.  This is not exactly the
desired behavior; we need a way to number each scrap.  The
``traditional'' way to do this is to number each scrap sequentially;
\app{Nuweb} numbers the scraps with the page number and a suffix if
there is more than one scrap on a page.
@D DSSSL Weave @{ 
(mode scraptitle
  (element scrap
    (process-matching-children 'title))
  (element title
    (make sequence
      (literal "\left-pointing-angle-bracket")
      (process-children-trim)
      (literal " (\section-sign")
      (literal 
        (format-number 
          (element-number 
            (parent (current-node))) "1"))
      (if (attribute-string "file" 
            (parent (current-node)))
          (make sequence
            font-family-name: "Courier New"
            (literal "'")
            (literal 
              (attribute-string "file" 
                (parent (current-node))))
            (literal "'"))
          (empty-sosofo))
      (literal ")")
      (literal "\right-pointing-angle-bracket")))
)
@} 
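
The sequential numbering described above is not implemented in the
script.  One possible approach is sketched here, under the assumption
(true of the test \acronym{DTD}) that every \tag{scrap} and
\tag{continuation} is a child of the \tag{document} element, so that
counting preceding siblings is sufficient.  The name
\dsssl{scrap-number} is hypothetical, and the sketch assumes that
\app{Jade} supports \dsssl{preced} and \dsssl{general-name-normalize}.

\begin{verbatim}
;; Hypothetical helper, not part of test.dsl.
;; Number a scrap or continuation by its position among all scrap
;; and continuation siblings, counting from 1.
(define (scrap-number #!optional (node (current-node)))
  (let ((names (list (general-name-normalize "scrap" node)
                     (general-name-normalize "continuation" node))))
    (let loop ((nl (preced node)) (n 1))
      (if (node-list-empty? nl)
          n
          (loop (node-list-rest nl)
                (if (member (gi (node-list-first nl)) names)
                    (+ n 1)
                    n))))))
\end{verbatim}

The result could stand in for the \dsssl{element-number} call, for
example as \texttt{(format-number (scrap-number) "1")}, provided it is
evaluated while the scrap or continuation being numbered is the
current node.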
 
Finally, we need a similar mechanism for passing literal data through
to the back end as in the \attval{tangle} script.
@D DSSSL Weave @{ 
(element literal 
 (make sequence 
  (make sequence
    (literal
      (attribute-string "data")))))
@}

The literal output requires the \app{Jade} extension
\dsssl{formatting-instruction}, which must be declared.

@D Weave declarations @{
(declare-flow-object-class formatting-instruction 
  "UNREGISTERED::James Clark//Flow Object Class::formatting-instruction") 
@}
\chapter{Results and Analysis}
\label{cha:results}

The processing scripts shown here appear to work as advertised:
\attval{weave} produces a (simplified) version of printed
documentation, and \attval{tangle} produces an output file which
concatenates the defined scraps as we expect.


\section{`Tangle' Output}
\label{sec:tangleoutput}

The test source file, \texttt{test.sgm}, is designed to provide
examples of the assembly of program texts from sequences of scraps,
and scraps inserted into other scraps.

The resulting file does in fact assemble the scraps in the intended
order, as shown below:
\verbatiminput{scrap1.out} 

The inserted scrap is assembled from the two scraps intended, and the
basic file is assembled from it and the designated file output scrap.

\appendix

\chapter{The Assembled SGML Input File}
\label{app:sgml}

The complete \acronym{SGML} input file (sample document) is included
here for reference.

\verbatiminput{test.sgm}

\chapter{The Assembled DSSSL Script File}
\label{app:dsssl}

The complete \acronym{DSSSL} script file is included here for
reference.

\verbatiminput{test.dsl}

\chapter*{Bibliography and Indices} 
 
\bibliographystyle{plain} 
\bibliography{E:/Household/Library} 
 
\section*{Cross References} 
 
\section*{Identifiers} 
 
@u 
 
\section*{Files} 
 
@f 
 
\section*{Scraps} 
 
@m 
 
\printindex 

\section*{Colophon}

This paper was written primarily with \app{Emacs} as the text editor,
and \app{Nuweb} as the literate programming system. \app{pdflatex}
(with the \app{hyperref} package) was the primary document compiler,
and \app{Jade} was the \acronym{DSSSL} processor.
\end{document}