\input avm.sty
\input eplain
\input xy
\xyoption{all}
\SelectTips{cm}{}
\CompileMatrices
\tolerance=2200
\font\sf = cmss10
\font\sfs = cmss8
\font\sfi = cmssi10
\font\sfis = cmssi8
\font\sssc = cmtt10
\font\ssscs = cmtt8
\font\rm = cmr10
\font\rmh = cmr10 scaled \magstephalf
\font\rmhh = cmr10 scaled \magstep 1
\font\lab = cmr8
\avmfont{\sssc}
\avmvalfont{\sf}
\avmsortfont{\sfi}
\avmoptions{center}
\listleftindent = 0.5 in
\def\numberedmarker{(\number\itemnumber)}
\def\numberedprintmarker#1{\llap{#1 \listmarkerspace}}
\def\heading#1{\rmhh\par#1\par\rm}
\def\sheading#1{\rmh\par#1\par\rm}
\parindent = 0 pt
\parskip = 12 pt
\rm
\centerline{Michael W. Daniels}
\centerline{May 14, 1999}
\medskip
\centerline{On The Extension of {\sssc UNICORN} for Lexical Rule Processing}
\baselineskip = 18 pt
\heading{1. Introduction}

Feature structure-based grammar formalisms have become increasingly
important in recent times. Many current linguistic theories can be
expressed in terms of feature structure-based grammars, and they are
particularly well-suited for computational implementation. In many of
these theories, the lexicon has two parts. The first is a set, often
structured in some way, of lexical entries.  These lexical entries
completely encapsulate the linguistic properties of the lexeme being
described -- for instance, its pronunciation, meaning, and valence
properties.  The second component of the lexicon is a set of lexical
rules that encapsulate regular correspondences within the lexicon.
For instance, a theory might posit one lexical entry for each verb and
then use a lexical rule to describe how inflected verb forms relate to
uninflected verb forms.  These lexical rules must therefore describe
the correspondences between all parts of a lexical entry: the
phonological descriptions, the orthographic descriptions, the semantic
descriptions, and the syntactic descriptions.

The {\sssc UNICORN} parser\cite{Gerde} was created by Dale Gerdemann
and Erhard Hinrichs at the University of Illinois as a tool for
working with feature-structure grammars and investigating their
properties.  It represents the lexicon as a simple list of lexical
entries, however, and therefore lacks the economy afforded by lexical
rules.  For instance, separate lexical entries are required for {\it
dog\/} and {\it dogs}, {\it cat\/} and {\it cats}, {\it bird\/} and
{\it birds}, and any other singular/plural noun pair.  For any verb,
separate entries are required for the third-person singular, the
simple past, the present participle, the past participle, and so on.
While this can make the task of parsing more efficient, it slows down
grammar development considerably and increases the likelihood of
error.  It is therefore advantageous for a parsing system to be able
to process lexical rules.

This paper describes the changes necessary to add such lexical rule
functionality to {\sssc UNICORN}. Section 2 begins by describing the
nature and foundations of context-free parsing, on which {\sssc
UNICORN} is based. Section 3 describes how Earley's algorithm was
adapted to handle feature-structure grammars. In section 4, I show how
the relationship between Earley's algorithm and {\sssc UNICORN}'s
parsing algorithm can be extended to allow {\sssc UNICORN} to process
lexical rules.  Finally, section 5 describes the architecture
necessary to handle morphological correspondences. Appendix A
summarizes the sample grammar used to illustrate the parsing
algorithms.

\heading{2. Context-free Parsing}

To understand the motivations behind the extensions of {\sssc UNICORN}
described here, it is necessary to understand the relationship between
{\sssc UNICORN}'s parsing algorithm and Earley's algorithm for
context-free parsing.

\sheading{2.1 Definitions}

A context-free grammar (CFG) is formally defined\cite{HopcroftUllman}
as a quadruple $G = \left<N, T, P, S\right>$, where $N$ is a set of
non-terminal symbols, $T$ a set of terminal symbols, $P$ the set of
rules in the grammar (represented as a relation from $N$ onto $(N \cup
T)^*$), and $S$ the distinct (non-terminal) start symbol (sometimes
referred to as the {\it axiom}).

We write $\alpha A\gamma \to \alpha\beta\gamma$ (that is, a given
string {\it directly derives\/} another) when $(A, \beta) \in P$
(i.e., when a rule $A \to \beta$ exists in the grammar).  If a
string $\alpha_1$ directly derives $\alpha_2$ and $\alpha_2$ directly
derives $\alpha_3$, then $\alpha_1$ is said to derive $\alpha_3$ in
two steps. To say that $\alpha_1 \Rightarrow \alpha_k$ (``derives'')
is to say that there exists some $k$ for which $\alpha_1$ derives
$\alpha_k$ in $k$ steps.  Then the {\it language\/} generated by $G$
is the set of all strings $w \in T^*$ for which $S \Rightarrow w$.

\sheading{2.2 Earley's Algorithm}

Earley's algorithm\cite{Earley} for context-free parsing (the task of
enumerating the rules involved in deriving $w$ from $S$) is based on
the manipulation of {\it states\/} and {\it statesets\/} (sets of
states). Throughout the parsing process, a set of statesets will be
maintained, each corresponding to a particular symbol in the input
string.  The algorithm will process each stateset in turn and add new
states to either the current stateset or the next stateset.

A state is an annotated rule that describes the current status of a
parse.  If $A \to \beta$ is a rule in the grammar, then its states
will be of the form $A \to \gamma . \delta, n$, where $\gamma$ refers
to the recognized part of the string and $\delta$ to the as-yet
unrecognized portion.  In all cases, $\beta$ must be the concatenation
of $\gamma$ with $\delta$.  The integer $n$ represents the {\it
origin\/} stateset: the stateset from which the state was
generated. If $\gamma$ is empty, we refer to the state as an {\it
initial state}; if $\delta$ is empty, we refer to the state as a {\it
final state}.

The input to Earley's algorithm consists of two parts: a grammar (as
described above) and an input string. To begin, the algorithm adds a
new rule $S' \to S$ (where $S$ is the original start symbol) to the
grammar and sets $S'$ as the start symbol for the grammar. This
ensures that the starting symbol appears on the left-hand side of only
one rule, avoiding the premature termination that would otherwise
occur upon recognizing an embedded $S$. The algorithm then initializes
the first stateset with the state $S' \to . S, 0$ and appends an
explicit end marker (here, \$) to the input string.

Then, starting with the newly-added state, the algorithm examines each
state in turn. Based on the nature of the symbol immediately to the
right of the dot (the {\it active\/} symbol), the algorithm applies
one of three procedures to the current state, adding additional states
to either the current or future statesets---at no point will states
ever be added to an already-processed stateset.  These three
procedures are the predictor, scanner, and completer. Each takes a
state as input and adds new states to various statesets according to
the nature of the input state.  The parser decides which procedure
applies by examining the active symbol.

If the active symbol is a non-terminal, the {\it predictor\/} applies.
For every rule whose left hand side is identical to the active symbol,
the predictor adds the initial state corresponding to that rule to the
current state set.  This new state has its origin set to the current
stateset.

If the active symbol is a terminal, the {\it scanner\/} applies. It
compares the current stateset's input token to the active symbol in
the state. If they are identical, a new state is created in the next
stateset with the dot advanced one position and the origin
unchanged. If they differ, we simply move to the next state.

If there is no active symbol (and this is therefore a final state),
the {\it completer\/} applies. For each state in the origin whose
active symbol is the left hand side of the current state, the
completer creates a new state in the current stateset with the dot
advanced one position and the origin unchanged.  If the left-hand
symbol in the rule is $S'$ (the new start symbol) and the current
stateset's input token is the end marker, then the completer has been
applied to the initial rule and the string is successfully parsed.
Otherwise, execution continues normally.

After states have been added, the next state in the current stateset
is processed.  This continues until all the states in the current
stateset have been processed, at which point the next stateset is
processed.  If at any point the states in the current stateset have
been exhausted, and the next stateset is empty, the parse fails.

The execution trace in (\xrefn{EPar}) summarizes the following
illustration of how Earley's algorithm parses the string $xyx$ with
the rules in (\xrefn{EGram}).

$$\vcenter{\offinterlineskip
\halign{\strut
\vrule $\>#\>$ \hfil & 
\vrule $\>#\>$ \hfil & 
\vrule $\>#\>$ \hfil & 
\vrule\ # \hfil\vrule\cr
\noalign{\hrule}
& I_o & \hbox{state} & created by \cr
\noalign{\hrule}
\multispan4\strut\vrule\hfil Stateset 0 --- Input: x \hfil\vrule\cr
\noalign{\hrule}
0 & 0 & S' \to .\,S & initial \cr
1 & 0 & S \to .\,S\,A & predict(0,1) \cr
2 & 0 & S \to .\,x & predict(0,2) \cr
\noalign{\hrule}
\multispan4\strut\vrule\hfil Stateset 1 --- Input: y \hfil\vrule\cr
\noalign{\hrule}
3 & 0 & S \to x\,. & scan(2) \cr
4 & 0 & S' \to S\,. & complete(3,0) \cr
5 & 0 & S \to S\,.\,A & complete(3,1) \cr
6 & 1 & A \to .\,y & predict(5,4) \cr
7 & 1 & A \to .\,A\,S & predict(5,3) \cr
\noalign{\hrule}
\multispan4\strut\vrule\hfil Stateset 2 --- Input: x \hfil\vrule\cr
\noalign{\hrule}
8 & 1 & A \to y\,. & scan(6) \cr
9 & 0 & S \to S\,A\,. & complete(8,5) \cr
10 & 1 & A \to A\,.\,S & complete(8,7) \cr
11 & 0 & S' \to S\,. & complete(9,0) \cr
12 & 0 & S \to S\,.\,A & complete(9,1) \cr
13 & 2 & S \to .\,x & predict(10,2) \cr
14 & 2 & S \to .\,S\,A & predict(10,1) \cr
15 & 2 & A \to .\,y & predict(12,4) \cr
16 & 2 & A \to .\,A\,S & predict(12,3) \cr
\noalign{\hrule}
\multispan4\strut\vrule\hfil Stateset 3 --- Input: \$ \hfil\vrule\cr
\noalign{\hrule}
17 & 2 & S \to x\,. & scan(13) \cr
18 & 1 & A \to A\,S\,. & complete(17,10) \cr
19 & 2 & S \to S\,.\,A & complete(17,14) \cr
20 & 0 & S \to S\,A\,. & complete(18,5) \cr
21 & 1 & A \to A\,.\,S & complete(18,7) \cr
22 & 3 & A \to .\,y & predict(19,4) \cr
23 & 3 & A \to .\,A\,S & predict(19,3) \cr
24 & 0 & S' \to S\,. & complete(20,0) \cr
25 & 0 & S \to S\,.\,A & complete(20,2) \cr
26 & 3 & S \to .\,x & predict(21,2) \cr
27 & 3 & S \to .\,S\,A & predict(21,1) \cr
\noalign{\hrule}
}}\eqdef{EPar}$$

$$\eqalign{ 0&: S' \to S \cr 1&: S \to S\,A \cr 2&: S \to x \cr 3&: A
\to A\,S \cr 4&: A \to y \cr}\eqdef{EGram}$$

Line 0 is the initial state.  Since the non-terminal $S$ is to the
right of the dot, the predictor applies, and lines 1 and 2
(corresponding to rules 1 and 2) are added to stateset 0 with origin
0. The predictor also applies to line 1, but the states it would add
are identical to lines 1 and 2, so no action is taken.  For line 2,
the terminal $x$ is active and the scanner applies.  Since the active
symbol $x$ matches the current stateset's input, line 3 is added to
stateset 1 with origin 0.

The completer applies to line 3, since there is no active symbol (the
dot is at the end of the rule).  Since line 3 has origin 0, the
completer examines stateset 0 for rules where the symbol $S$ (the
left-hand side of line 3) is the active symbol.  In this case, lines 0
and 1 match this condition, and therefore lines 4 and 5 are added to
the current stateset.  Line 4 represents a possible completion of the
initial state, as its left-hand symbol is $S'$.  Since the current
stateset's input is not the end marker \$, this state cannot be
completed. When line 5 is processed, the scanner adds lines 6 and 7.
Here, the origin is set to 1, the value of the current stateset. Lines
6 through 23 are processed similarly.

At line 24, the completer is again applied to the initial state.  At
this point, the current stateset's input is the end marker and the
algorithm terminates, indicating a successful recognition.

While Earley's algorithm has been presented as a mechanism for parsing
context-free grammars, it may also be extended to handle more complex
types of grammars.

\heading{3. Feature-structure Parsing}

\sheading{3.1 Definitions}

In general, unification grammars are based on relations between {\it
feature structures}.  A feature structure is a directed acyclic graph
(DAG) whose arcs are labelled with {\it attributes\/} and whose nodes
are labelled with {\it values}.  Figure \xrefn{DAG} depicts a sample
feature structure.

$$\xymatrix@-1pc{
&& \ar@/_/[dd]|*+\txt{\lab x$_1$} \ar@/^/[dr]|*+\txt{\lab x$_0$} \\
&&& \ar@/_/[drr]|*+\txt{\lab maj} \ar@/^1pc/[drrr]|*+\txt{\lab syn} \\
&& \ar@/_/[rrr]|*+\txt{\lab maj} \ar@/_2pc/[rrrr]|*+\txt{\lab syn} 
&&& {v} & {sg}}\eqdef{DAG}$$

Formally, such a DAG is defined\cite{Gerde} as a sextuple $\left< N,
L, A, \delta, \alpha, r \right>$ where $N$ is a set of {\it nodes},
$L$ a set of {\it attributes\/} (that is, arc labels), $A$ a set of
{\it atoms}, $\delta$ the connection function that maps node-attribute
pairs onto destination nodes, $\alpha$ the valuation function that
maps nodes onto atoms (representing the arcs in the DAG), and $r$ the
designated {\it root\/} node.  In addition, there must be a path from
the root node to every other node in the graph.

It is common to represent feature structures by an {\it
attribute-value matrix} (AVM).  Figure (\xrefn{AVM}) depicts the AVM
corresponding to (\xrefn{DAG}).

$$\avm \[ x0 & \[ maj & \@1 v \cr syn & \@2 sg \] \cr x1 & \[ maj &
\@1 \cr syn & \@2 \] \] \endavm \eqdef{AVM}$$

Here, the boxed 1 represents the fact that both {\tt maj} arcs point
to the same node; the boxed 2 similarly represents the sharing of a
node by both {\tt syn} arcs.  This notation can also be used to
describe classes of feature structures by leaving some or all of the
values unspecified.  For instance, (\xrefn{AVM2}) represents the
class of feature structures that include (\xrefn{DAG}) as well as
those where, for instance, both {\tt syn} attributes have the value
{\sf pl}.

$$\avm \[ x0 & \[ maj & \@1 v \cr syn & \@2 \] \cr x1 & \[ maj & \@1
\cr syn & \@2 \] \] \endavm \eqdef{AVM2}$$

\sheading{3.2 Augmenting Earley's Algorithm}

A feature structure grammar, then, has two parts.  The first is a set
of feature structures that form the rules of the grammar, describing
how a composite linguistic entity is related to the entities
corresponding to its components.  {\sssc UNICORN} represents these as
CFG rules augmented so that feature structures have taken the place of
terminals and non-terminals.  They have the general form of
(\xrefn{DAG}): the X$_0$ attribute's value is the parent feature
structure, corresponding to the left-hand side of a CFG rule, and the
remaining attributes' values are the feature structures for the
daughters, corresponding to the right-side of a CFG rule.

The second component of a feature structure grammar is a set of
feature structures representing lexical entries. These typically have
one attribute whose value can be matched against the input (i.e. some
sort of phonological or orthographical representation) and other
attributes that describe the lexical item itself, depending on the
intended use of the grammar and the underlying linguistic theory.

Since the feature-structure grammar is represented as an augmentation
of a context-free grammar, it is possible to construct a feature
structure parsing algorithm from a context-free parsing algorithm;
indeed, this is the approach Gerdemann and Hinrich took in adapting
Earley's algorithm.  The nature of this construction is a direct
consequence of the replacement of atomic symbols with complex feature
structures: for context-free parsing, the tests for state identity and
symbol identity are trivial operations, requiring a single
comparison. For feature-structure parsing, however, the analagous
comparisons are more complex.

The predictor uses a state identity test to avoid adding redundant
states to statesets. When an identical state already exists in that
stateset (as in line 1 in (\xrefn{EPar})), it does not add that state
to the stateset again. With a feature structure grammar, we need to
test for {\it subsumption\/} instead. In essence, feature structure
$A$ subsumes another feature structure $B$ when $A$ and $B$ do not
conflict in any way and $A$ is more general than $B$.  The
feature-structure predictor, then, will not add a state to a stateset
when a subsuming DAG is already present in that stateset.

Symbol identity tests are used by all three of the state-adding
procedures.  The predictor looks for rules in the grammar whose
left-hand symbol matches the active non-terminal in the current state.
The scanner checks to see that the active terminal matches the active
input symbol.  The completer looks for states in the origin stateset
whose active symbols match the left-hand symbol of the current state.
In all these cases, we need to replace this identity test with {\it
unification}.  In essence, the unification of two feature structures
$A$ and $B$ (written $A \sqcup B$) is the most general feature
structure subsumed by both $A$ and $B$. The feature-structure
predictor, scanner, and completer, then, will add new states when the
active feature structure sucessfully unifies with a feature structure
from other states or the set of rules in the grammar.

\sheading{3.3 Restriction}

By replacing state and symbol identity tests with the more complex
subsumption and unification tests, the execution time of the algorithm
increases considerably.  A technique known as {\it restriction}
optimizes the application of the subsumption and unification tests,
compensating for this increase.

Restriction uses a function (a {\it restrictor\/}) from DAGs onto
DAGs to reduce the amount of time spent performing such tests. The
tests are performed on the {\it restricted DAGs\/} (RDs) first; if the
test succeeds, then it is performed on the actual DAGs, but if it
fails, there is no need to consider the actual DAGs. 

The exact nature of the optimal restrictor depends heavily on the
nature of the grammar itself. In a grammar designed for {\sssc
UNICORN}, it will consist of a set of {\it paths}.  A path is a
connected series of attributes in the DAG (written as a list of
attribute labels separated with vertical bars).  For example, {\sssc
X0|MAJ} is a path to the node labelled $v$ in (\xrefn{DAG}). 

The output of this kind of restrictor is an RD consisting of the
restrictor's paths and their values.  For complex values, we use empty
brackets ({\sssc []}) to represent the complex wildcard, a symbol that
unifies with any complex value but not with any atomic value. For
example, if we apply the restrictor (\xrefn{REST}) to the sub-DAG
X$_0$ of rule (\xrefn{RL1}) in Appendix A (reproduced here as
(\xrefn{RESTI1})), we get (\xrefn{RESTR1}).

$$\left< \hbox{\sssc MAJ, SYN|FORM, SYN|COMPS, SYN|SPR} \right>
\eqdef{REST}$$

$$\avm \[ maj & v \cr syn & \[ form & finite \cr agr & \[ cls & non \]
\cr comps & \[$\,$\] \cr spr & \[$\,$\]\] \cr surf & plurv(\[$\,$\])
\] \endavm\eqdef{RESTI1}$$

$$\avm \[ maj & v \cr syn & \[ form & base \cr comps & \[$\,$\] \cr
spr & \[$\,$\]\] \] \endavm\eqdef{RESTR1}$$

Unifying two RDs is a linear-time operation, as each RD contains a
finite number of paths having atomic values.  Since there are a finite
number of RDs, restriction has the effect of dividing the infinite
feature structure space into a finite number of equivalence classes.
Then, as it is impossible for feature structures in different
equivalence classes to unify, the algorithm can recognize pairs of
DAGs that ``obviously'' wouldn't unify without actually performing the
unification test. 

\sheading{3.4 {\sssc UNICORN}'s parser}

The basic algorithm used by {\sssc UNICORN} follows the outline of
Earley's algorithm, maintaining a similar collection of statesets.
Each stateset contains, in addition to its states, a list of all RDs
that have been used to make predictions in this stateset (the {\it
predictors list\/}).  Each state contains a DAG representing an
instantiation of a rule in the grammar, the location of the dot, a
pointer to the state's origin, and two RDs: one acting as a
backpointer, set when the state is created, and one as a forward
pointer, set when the predictor is applied to the state.

The input to the {\sssc UNICORN\/} parser consists of a grammar and an
input string, as for Earley's algorithm.  The user also provides the
parser with a {\it goal\/} DAG, representing the desired root of the
final parse tree.  The initial state is constructed by embedding this
goal DAG under the X$_0$ and X$_1$ attributes of the initial DAG. For
instance, if the goal is \avm \[ maj & s \] \endavm, the initial DAG
would be as described in figure \xrefn{INIT}.

$$\avm \[ x0 & \@1 \[ maj & s \] \cr x1 & \@1 \] \endavm\eqdef{INIT}$$

The backpointer of the initial state is given the distinct value \$ to
mark it as initial; the dot is placed before X$_1$.

Unlike Earley's algorithm for context-free grammars, the predictor and
scanner will always both be applied to non-final states. For
prediction, the parser constructs an RD from the active feature
structure.  If this RD is subsumed by any RD on the predictors list,
processing continues to the next state.  Otherwise, this RD is unified
with the RD for each rule in the grammar.  If they unify, the actual
DAGs for the rule and the active feature structure are unified.  If
this operation succeeds, a state is added to the current stateset
whose DAG is the result of the unification.  The dot is set to the
initial position, the origin to the current stateset and the
backpointer to the RD that generated this state.  This RD is also
added to the prediction list for this stateset and becomes the forward
pointer of the state that generated this prediction.

When the scanner is applied, it attempts to unify the active feature
structure with each lexical item in the grammar matching the current
input word (using the RDs to filter possibilities as before).  If the
unification succeeds, a new state is created from the result of the
unification with the same origin and backpointer as the current state
but with the dot advanced one position.

To determine to which stateset this new state must be added, the
length of the scanned phrase is added to the current stateset's index.
Thus a three-word phrase scanned in stateset 2 would cause a new state
to be added to stateset 5; a zero-length phrase (a trace) would cause
a new state to be added to the current stateset.  This allows the
parser to use the same mechanisms to scan phrasal lexical items
(proper names, for example) as it uses to scan individual words.

Finally, the completer is applied whenever the current state is final.
For each state in the origin stateset whose forward pointers match the
current state's backpointer (the potential {\it ancestors\/}), the
completer unifies the X$_0$ attribute of the current state with the
active feature structure in the potential ancestor.  For each success,
a new state is created from the result of the unification with the dot
advanced one position.  The new state retains the ancestor's origin as
well as its back- and forward pointers, and is added to the current
stateset.

To illustrate this process, some of the states involved in the parse
of (\xrefn{ExSen}) in the grammar given in appendix A will be shown.

$$\hbox{The swimmer splashes the dogs.}\eqdef{ExSen}$$

Before the parse begins, the parser pre-processes the grammar by
applying the restrictor (\xrefn{REST}) to the X$_0$ sub-DAG of each
rule and to each lexical entry.  The resulting RDs are stored for
future reference.

Here, the goal has been provided as a verb phrase whose {\tt comps}
and {\tt spr} attributes are {\sf nil}.  Figure \xrefn{Par0} shows the
initial state (the backpointer is represented as \$ to encode the fact
that this is the initial state).

$$\left< 0, \avm \[ x0 & \@1 \[ maj & v \cr syn & \[ spr & nil \cr
comps & nil \] \] \cr x1 & \@1\] \endavm , 1, 0, \$, \avm \[ maj & v
\cr syn & \[ comps & nil \cr spr & nil \] \]
\endavm\right>\eqdef{Par0}$$

In this and the following figures, the individual states are
represented with the components in the following order: stateset
number, rule feature structure, dot position, originating stateset,
backpointer, forward pointer.

In applying the predictor to (\xrefn{Par0}), the parser begins by
applying the restrictor (\xrefn{REST}) to the active feature structure
in this state: the X$_1$ attribute. Since the parser has not yet
applied the predictor to any state, the list of predictors for this
stateset is empty.  The forward pointer of the current state is
therefore set to the resulting RD and added to the predictors list.
Next, the parser looks for rules in the grammar whose RDs unify with
the current forward pointer.  In this case, both rules (\xrefn{ParR1})
and (\xrefn{ParR2}) meet this criterion.  Both of these rules are then
unified with the active feature structure (by extracting it from the
state and embedding it under an X$_0$ attribute in a new DAG first) to
produce states (\xrefn{Par1}) and (\xrefn{Par2}).

$$\avm \[ x0 & \[ maj & \@4 \cr syn & \[ form & \@2 \cr comps & \@3
\cr agr & \@5 \cr spr & \@6 \] \cr surf & append(\@7, \@8) \] \cr
x1 & \[ maj & \@4 \cr syn & \[ form & \@2 \cr comps & \< \@1, \@3 \>
\cr agr & \@5 \cr spr & \@6 \] \cr surf & \@7 \] \cr x2 & \@1 \[
surf & \@8 \] \] \endavm\eqdef{ParR1}$$

$$\avm \[ x0 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr comps & nil
\cr agr & \@4 \cr spr & nil \] \cr surf & append(\@5, \@6) \] \cr
x1 & \@1 \[ syn & \[ agr & \@4 \cr spr & nil \] \cr surf & \@5 \]
\cr x2 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr comps & nil \cr agr
& \@4 \cr spr & \@1 \] \cr surf & \@6 \] \] \endavm\eqdef{ParR2}$$

$$\left< 0, \avm \[ x0 & \[ maj & \@4 v \cr syn & \[ form & \@2 \cr
comps & \@3 nil \cr agr & \@5 \cr spr & \@6 nil \] \cr surf &
append(\@7, \@8) \] \cr x1 & \[ maj & \@4 \cr syn & \[ form & \@2 \cr
comps & \< \@1, \@3 \> \cr agr & \@5 \cr spr & \@6 \] \cr surf &
\@7 \] \cr x2 & \@1 \[ surf \@8 \] \] \endavm , 1, 0, \avm \[ maj &
v \cr syn & \[ comps & nil \cr spr & nil\] \] \endavm , \avm \[ maj &
v \cr syn & \[ comps & \[\] \cr spr & nil\] \]
\endavm\right>\eqdef{Par1}$$

$$\left< 0, \avm \[ x0 & \[ maj & \@3 v \cr syn & \[ form & \@2 \cr
comps & nil \cr agr & \@4 \cr spr & nil \] \cr surf & append(\@5,
\@6) \] \cr x1 & \@1 \[ syn & \[ spr & nil \cr comps & nil \cr agr &
\@4 \] \cr surf & \@5 \] \cr x2 & \[ maj & \@3 \cr syn & \[ form &
\@2 \cr comps & nil \cr agr & \@4 \cr spr & \@1 \] \cr surf & \@6
\] \] \endavm , 1, 0, \avm \[ maj & v \cr syn & \[ comps & nil \cr spr
& nil\] \] \endavm , \avm \[ syn & \[ comps & nil \cr spr & nil\] \]
\endavm\right> \eqdef{Par2}$$

The rule instantiations in states (\xrefn{Par1}) and (\xrefn{Par2})
show that the information from the rule has been combined (through
unification) with the information provided by the goal DAG (i.e. that
the parent's major category ({\tt maj}) is {\sf v} and that its {\tt
spr} and {\tt comps} attributes have a value of {\sf nil}). The parser
next applies the scanner to state (\xrefn{Par0}), but none of the
lexical entries unify with the current forward pointer. Processing
moves on to state (\xrefn{Par1}); here, the predictor produces state
(\xrefn{Par3}) from rule (\xrefn{ParR1}).

$$\left< 0, \avm \[ x0 & \[ maj & \@4 v \cr syn & \[ form & \@2 \cr
comps & \@3 \< \[\], nil \> \cr agr & \@5 \cr spr & \@6 nil \] \cr
surf & append(\@7, \@8) \] \cr x1 & \[ maj & \@4 \cr syn & \[ form
& \@2 \cr comps & \< \@1, \@3 \> \cr agr & \@5 \cr spr & \@6 \] \cr
surf & \@7 \] \cr x2 & \@1 \[ surf & \@8 \] \] \endavm , 1, 0,
\avm \[ maj & v \cr syn & \[ comps & \[\] \cr spr & nil\] \] \endavm ,
\avm \[ maj & v \cr syn & \[ comps & \[\] \cr spr & nil\] \]
\endavm\right>\eqdef{Par3}$$

Note that (\xrefn{Par1}) differs from (\xrefn{Par3}) in that it is an
instantiation of (\xrefn{ParR1}) based on an earlier prediction,
rather than on the goal DAG; here, the only difference is in the value
of the parent's {\tt comps} attribute. 

The scanner produces no new states when applied to (\xrefn{Par1}).
When processing state (\xrefn{Par2}), the predictor uses rule
(\xrefn{ParR2}) to add state (\xrefn{Par4}) to the current stateset.

$$\left< 0, \avm \[ x0 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr
comps & nil \cr agr & \@4 \cr spr & nil \] \cr surf & append(\@5, \@6)
\] \cr x1 & \@1 \[ syn & \[ spr & nil \cr comps & nil \cr agr & \@4 \]
\cr surf & \@5 \] \cr x2 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr
comps & nil \cr agr & \@4 \cr spr & \@1 \] \cr surf & \@6
\] \] \endavm , 1, 0, \avm \[ syn & \[ comps & nil \cr spr & nil \] \]
\endavm , \avm \[ syn & \[ comps & nil \cr spr & nil \] \]
\endavm\right>\eqdef{Par4}$$

State (\xrefn{Par4}) differs from (\xrefn{Par2}) in that the parent's
major category is left unspecified.  When the scanner is applied to
state (\xrefn{Par2}), it successfully unifies the state's forward
pointer with the RD for {\it the} (the current input token).  The
scanner then unifies the lexical entry for {\it the\/} with the
current state (embedding the lexical entry under the X$_1$ attribute
of a new DAG first), producing state (\xrefn{Par5}), which is then
added to stateset 1.

$$\left< 1, \avm \[ x0 & \[ maj & \@3 v \cr syn & \[ form & \@2 \cr
comps & nil \cr agr & \@4 \cr spr & nil \] \cr surf & append(\@5,
\@6) \] \cr x1 & \@1 \[ maj & det \cr syn & \[ spr & nil \cr comps &
nil \cr agr & \@4 \] \cr surf & \@5 `the' \] \cr x2 & \[ maj & \@3
\cr syn & \[ form & \@2 \cr comps & nil \cr agr & \@4 \cr spr & \@1 \]
\cr surf & \@6 \] \] \endavm , 2, 0, \avm \[ maj & v \cr syn & \[
comps & nil \cr spr & nil \] \] \endavm , \avm \[ maj & v \cr syn & \[
comps & nil \cr spr & \[\] \] \] \endavm\right>\eqdef{Par5}$$

State (\xrefn{Par5}) therefore differs from (\xrefn{Par2}) in the
values for the first daughter's attributes: the major category and
surface form ({\tt surf}) are now recognized.

When the parser constructs the forward pointer for state
(\xrefn{Par3}) (in order to begin prediction), it recognizes that it
is subsumed by an RD that has previously been used to make
predictions: the forward pointer in (\xrefn{Par1}).  At this point,
the predictor simply halts; no further processing occurs.  The
scanner, however, is still applied to this state; it does not find
lexical items that unify with the active feature structure.

In state (\xrefn{Par4}), the forward pointer is also subsumed by an
element on the predictor list (in this case, that of state
(\xrefn{Par2})) and no further predictions are made.  The scanner,
however, successfully unifies the forward pointer with the RD of the
lexical item {\it the}. It therefore adds state (\xrefn{Par6}) to
stateset 1.

$$\left< 1, \avm \[ x0 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr
comps & nil \cr agr & \@4 \cr spr & nil \] \cr surf & append(\@5,
\@6) \] \cr x1 & \@1 \[ maj & det \cr syn & \[ spr & nil \cr comps &
nil \cr agr & \@4 \] \cr surf & \@5 `the' \] \cr x2 & \[ maj & \@3
\cr syn & \[ form & \@2 \cr comps & nil \cr agr & \@4 \cr spr & \@1 \]
\cr surf & \@6 \] \] \endavm , 2, 0, \avm \[ syn & \[ comps & nil
\cr spr & nil \] \] \endavm , \avm \[ syn & \[ comps & nil \cr spr &
\[\] \] \] \endavm\right>\eqdef{Par6}$$

State (\xrefn{Par6}) differs from (\xrefn{Par4}) in its values for the
first daughter and from (\xrefn{Par5}) in leaving the parent's major
category unspecified.

In state (\xrefn{Par5}), the dot is at X$_2$, reflecting the fact that
the lexical item for X$_1$ has already been scanned.  The predictor
will automatically apply here; as this is a new stateset, the
predictor list is empty.  (Both of these predictions lead to
dead-ends, however, and are omitted here.) The scanner applies to this
state, but fails to find any lexical items whose RDs unify with the
forward pointer.

The predictor similarly creates two dead-end states from state
(\xrefn{Par6}).  When the scanner is applied to (\xrefn{Par6}),
however, it finds the lexical item {\it swimmer}, whose RD unifies
with this state's forward pointer, and adds state (\xrefn{Par7}) to
stateset 2.

$$\left< 2, \avm \[ x0 & \[ maj & \@3 n \cr syn & \[ form & \@2 base
\cr comps & nil \cr agr & \@4 \[ cls & tsg \cr num & sg \] \cr spr &
nil \] \cr surf & append(\@5, \@6) \] \cr x1 & \@1 \[ maj & det \cr
syn & \[ spr & nil \cr comps & nil \cr agr & \@4 \] \cr surf & \@5
`the' \] \cr x2 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr comps & nil
\cr agr & \@4 \cr spr & \@1 \] \cr surf & \@6 `swimmer' \] \]
\endavm , 3, 0, \avm \[ syn & \[ comps & nil \cr spr & nil \] \]
\endavm , \avm \[ syn & \[ comps & nil \cr spr & \[\] \] \]
\endavm\right>\eqdef{Par7}$$

Notice that in state (\xrefn{Par7}), the unification of the lexical
information for the second daughter has added new information to the
description of the parent's attributes; in particular, values are now
present for the {\tt maj}, {\tt form}, and {\tt agr} attributes.

Since state (\xrefn{Par7}) is a final state, the predictor and scanner
do not apply.  Instead, the completer recognizes that this state
originated from stateset 0.  It examines that stateset and finds three
states whose forward pointers unify with (\xrefn{Par7})'s backpointer
(i.e. three potential ancestors): (\xrefn{Par0}), (\xrefn{Par2}), and
(\xrefn{Par4}).  For each of these, the parent node is extracted and
unified with the active feature structure in the ancestor states.  For
(\xrefn{Par0}), this unification fails.  With (\xrefn{Par2}) and
(\xrefn{Par4}), however, the unification succeeds.  The completer
constructs new states in the current stateset out of these states by
leaving the origin and backpointer the same and advancing the dot by
one position.  The results are (\xrefn{Par8}) and (\xrefn{Par9}),
respectively.

$$\left< 2, \avm \[ x0 & \[ maj & \@3 v \cr syn & \[ form & \@2 \cr
comps & nil \cr agr & \@4 \[ cls & tsg \cr num & sg \] \cr spr & nil
\] \cr surf & append(\@5, \@6) \] \cr x1 & \@1 \[ maj & n \cr syn &
\[ spr & nil \cr comps & nil \cr agr & \@4 \cr form & base \] \cr
surf & \@5 `the swimmer'\] \cr x2 & \[ maj & \@3 \cr syn & \[ form
& \@2 \cr comps & nil \cr agr & \@4 \cr spr & \@1 \] \cr surf & \@6
\] \] \endavm , 2, 0, \avm \[ maj & v \cr syn & \[ comps & nil \cr spr
& nil\] \] \endavm , \avm \[ maj & v \cr syn & \[ comps & nil \cr spr
& \[ \]\] \] \endavm\right> \eqdef{Par8}$$

$$\left< 2, \avm \[ x0 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr
comps & nil \cr agr & \@4 \[ cls & tsg \cr num & sg \] \cr spr & nil
\] \cr surf & append(\@5, \@6) \] \cr x1 & \@1 \[ maj & n \cr syn &
\[ spr & nil \cr comps & nil \cr agr & \@4 \cr form & base \] \cr
surf & \@5 `the swimmer'\] \cr x2 & \[ maj & \@3 \cr syn & \[ form
& \@2 \cr comps & nil \cr agr & \@4 \cr spr & \@1 \] \cr surf & \@6
\] \] \endavm , 2, 0, \avm \[ syn & \[ comps & nil \cr spr & nil\] \]
\endavm , \avm \[ syn & \[ comps & nil \cr spr & \[ \]\] \]
\endavm\right> \eqdef{Par9}$$

The rest of the parse proceeds similarly.  As before, once the
completer successfully applies to the initial state, the parse
succeeds.  If at any point there are no states left to process in any
future stateset, the parse fails.

\heading{4. Lexical Rule Implementation}

In a feature structure grammar, a lexical rule must completely
describe the correspondences between the two forms related by the
rule. For instance, a grammar might posit (\xrefn{PSV}) as the lexical
rule relating passive verbs to their base lexical entries.

$$\avm \[ x0 & \[ maj & \@1 v \cr syn & \[ form & ppar \cr comps & \@3
\cr spr & \@2 \] \cr surf & ppar(\@4) \] \cr x1 & \[ maj & \@1 \cr
syn & \[ form & base \cr comps & \< \@2, \@3 \> \] \cr surf & \@4
\] \] \endavm\eqdef{PSV}$$

Here, the X$_0$ attribute describes the passive form, while the X$_1$
attribute describes the base form (i.e. the form given in the
lexicon).  The rule states that the surface form (the value of the
{\tt surf} attribute) of the passive is related to the form of the
base verb through the {\sf ppar} function (as described in Section 5).
Similarly, the rule specifies the relationship between the {\tt spr}
and {\tt comps} attributes of each form: the specifier of the passive
form is the first complement of the base form, while the remaining
complements of the base form become the complements of the passive
form. The two forms share the same major category ({\tt maj}), and
each receives an appropriate {\tt form} value: {\sf ppar} for the
passive form and {\sf base} for the base form.

As described so far, {\sssc UNICORN} cannot accurately parse with a
grammar containing lexical rules: it expects to be able to directly
match lexical entries to input tokens. As some input tokens might only
be produced as the output of a lexical rule, they will never be found
in the lexicon, and they will be impossible to scan.

\sheading{4.1 Context-free Lexical Rules}

To see how this capability can be added to {\sssc UNICORN}, consider
the context-free equivalent to a lexical rule: a rule that has
terminal symbols on both sides of the rule, as in the set of rules in
(\xrefn{LGram}).

$$\eqalign{0&:S' \to S \cr 1&:S \to A\,A \cr 2&:A \to a \cr 3&:A \to b
\cr 4&:b \to c \cr}\eqdef{LGram}$$

The only change we need make to Earley's algorithm is to allow the
predictor to apply to states whose active symbols are terminals. A
trace of the parse for the string {\it bc\/} is given in
(\xrefn{LPar}).

$$\vcenter{\offinterlineskip
\halign{\strut
\vrule $\>#\>$ \hfil & 
\vrule $\>#\>$ \hfil & 
\vrule $\>#\>$ \hfil & 
\vrule\ # \hfil\vrule\cr
\noalign{\hrule}
& I_o & \hbox{state} & created by \cr
\noalign{\hrule}
\multispan4\strut\vrule\hfil Stateset 0 --- Input: b \hfil\vrule\cr
\noalign{\hrule}
0 & 0 & S' \to .\,S & initial \cr
1 & 0 & S \to .\,A\,A & predict(0,1) \cr
2 & 0 & A \to .\,a & predict(1,2) \cr
3 & 0 & A \to .\,b & predict(1,3) \cr
4 & 0 & b \to .\,c & predict(3,4) \cr
\noalign{\hrule}
\multispan4\strut\vrule\hfil Stateset 1 --- Input: c \hfil\vrule\cr
\noalign{\hrule}
5 & 0 & A \to b\,. & scan(3) \cr
6 & 0 & S \to A\,.\,A & complete(5,1) \cr
7 & 1 & A \to .\,a & predict(6,2) \cr
8 & 1 & A \to .\,b & predict(6,3) \cr
9 & 1 & b \to .\,c & predict(8,4) \cr
\noalign{\hrule}
\multispan4\strut\vrule\hfil Stateset 2 --- Input: \$ \hfil\vrule\cr
\noalign{\hrule}
10 & 1 & b \to c\,. & scan(9) \cr
11 & 1 & A \to b\,. & complete(10,8) \cr
12 & 0 & S \to A\,A\,. & complete(11,6) \cr
13 & 0 & S' \to S\,. & complete(12,1) \cr
\noalign{\hrule}
}}\eqdef{LPar}$$

Line 4 in this trace represents a lexical rule prediction. When
applying the scanner to line 3, the parser recognizes that the symbol
$b$ is a terminal.  Under the original Earley's algorithm, the
predictor would not apply to this state.  Here, the predictor
recognizes that a lexical rule exists whose left-hand side matches the
active symbol, and therefore adds line 4 to stateset 0. This also
occurs when the parser processes line 8, predicting line 9. The rest
of the parse proceeds exactly as in section 2.

\sheading{4.2 Feature Structure Lexical Rules}

As mentioned before, {\sssc UNICORN} expects to directly match input
symbols against the surface forms of the lexical entries, before that
lexical entry is unified with the active symbol.  To accurately parse
with lexical rules, however, we must defer this input scanning until
after the unifications, when the surface form can be accurately
matched to the output of the morphological functions.

To accomplish this, each of the three state-adding components of the
parser is modified. The modifications to the predictor are minor: any
state predicted from a lexical rule is now tagged as a lexical state;
any state predicted from a non-lexical rule is now tagged as a
non-lexical state.

The scanner is unchanged with respect to non-lexical states (scanning
preserves the lexical/non-lexical status of a state).  With lexical
states, however, the input scanning is omitted.  Instead, when the
scanner applies to a lexical state, we produce a new state from every
lexical item that successfully unifies with the active feature
structure.

Non-lexical states are also completed the same way as before, as are
lexical states with lexical ancestors.  When the completer encounters
a lexical state with a non-lexical ancestor, the surface form of the
parent node of the current state must match the current input token
for completion to succeed.  This has the effect of deferring input
matching until all lexical rule application has completed

To illustrate these changes, consider the parse presented in section
3. When the new predictor applies to state (\xrefn{Par9}) (repeated
here as (\xrefn{LPar1})), it will produce state (\xrefn{LPar2}) (among
others) from rule (\xrefn{LParR1}).

$$\left< 2, \avm \[ x0 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr
comps & nil \cr agr & \@4 \[ cls & tsg \cr num & sg \] \cr spr & nil
\] \cr surf & append(\@5, \@6) \] \cr x1 & \@1 \[ maj & n \cr syn &
\[ spr & nil \cr comps & nil \cr agr & \@4 \cr form & base \] \cr
surf & \@5 `the swimmer'\] \cr x2 & \[ maj & \@3 \cr syn & \[ form
& \@2 \cr comps & nil \cr agr & \@4 \cr spr & \@1 \] \cr surf & \@6
\] \] \endavm , 2, 0, nl, \avm \[ syn & \[ comps & nil \cr spr & nil\] \]
\endavm , \avm \[ syn & \[ comps & nil \cr spr & \[ \]\] \]
\endavm\right> \eqdef{LPar1}$$

\avmfont{\ssscs}
\avmvalfont{\sfs}
\avmsortfont{\sfis}
$$\left< 2, \avm \[ x0 & \[ maj & \@3 v \cr syn & \[ form & \@2 \cr
agr & \@4 \[ cls & tsg \cr num & sg \] \cr comps & nil \cr spr & \@5
\[ maj & n \cr syn & \[ spr & nil \cr comps & nil \cr agr & \@4 \cr
form & base \] \cr surf & `the swimmer' \] \] \cr surf &
append(\@6, \@7) \] \cr x1 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr
comps & \< \@1 \> \cr spr & \@5 \cr agr & \@4 \] \cr surf & \@6 \]
\cr x2 & \@1 \[ surf & \@7 \] \] \endavm , 1, 2, nl, \avm \[ syn &
\[ comps & nil \cr spr & \[ \] \] \] \endavm , \avm \[ syn & \[ comps
& \[ \] \cr spr & \[ \] \] \] \endavm\right>\eqdef{LPar2}$$

$$\avm \[ x0 & \[ maj & \@4 \cr syn & \[ form & \@2 \cr comps & \@3
\cr agr & \@5 \cr spr & \@6 \] \cr surf & append(\@7, \@8) \] \cr
x1 & \[ maj & \@4 \cr syn & \[ form & \@2 \cr comps & \< \@1, \@3 \>
\cr agr & \@5 \cr spr & \@6 \] \cr surf & \@7 \] \cr x2 & \@1 \[
surf & \@8 \] \] \endavm\eqdef{LParR1}$$

State (\xrefn{LPar2}) therefore represents an instantation of
(\xrefn{LParR1}) in which the parent's attributes are more fully
specified; values are now present for the {\tt maj}, {\tt agr}, {\tt
comps}, and {\tt spr} attributes.  When the predictor is applied to
(\xrefn{LPar2}), it uses rule (\xrefn{LParR2}) to produce state
(\xrefn{LPar3}).

\goodbreak

$$\avm \[ x0 & \[ maj & \@3 v \cr syn & \[ form & finite \cr agr & \[
cls & tsg \] \cr comps & \@1 \cr spr & \@2 \] \cr surf & sing(\@4)
\] \cr x1 & \[ maj & \@3 \cr syn & \[ form & base \cr comps & \@1 \cr
spr & \@2 \] \cr surf & \@4 \] \] \endavm\eqdef{LParR2}$$

$$\left< 2, \avm \[ x0 & \[ maj & \@3 v \cr syn & \[ form & finite \cr
agr & \@5 \[ cls & tsg \cr num & sg \] \cr comps & \< \@1 \> \cr spr &
\@2 \[ maj & n \cr syn & \[ spr & nil \cr comps & nil \cr agr & \@5
\cr form & base \] \cr surf & `the swimmer' \] \] \cr surf &
sing(\@4) \] \cr x1 & \[ maj & \@3 \cr syn & \[ form & base \cr comps
& \< \@1 \> \cr spr & \@2 \] \cr surf & \@4 \] \] \endavm , 1, 2,
lex, \avm \[ syn & \[ comps & \[ \] \cr spr & \[ \] \] \] \endavm ,
\avm \[ maj & v \cr syn & \[ form & base \cr comps & \[ \] \cr spr &
\[ \] \] \] \endavm\right>\eqdef{LPar3}$$ 

\nobreak

State (\xrefn{LPar3}) similarly represents a fuller instantation of
(\xrefn{LParR2}) in which values for {\tt agr|num}, {\tt comps}, and
{\tt spr} have been unified in. From (\xrefn{LPar3}), the scanner
unifies the X$_1$ node with the lexical entry for {\it splash},
producing state (\xrefn{LPar4}).

\goodbreak

$$\left< 2, \avm \[ x0 & \[ maj & \@3 v \cr syn & \[ form & finite \cr
agr & \@5 \[ cls & tsg \cr num & sg \] \cr comps & \< \@1 \> \cr spr &
\@2 \[ maj & n \cr syn & \[ spr & nil \cr comps & nil \cr agr & \@5
\cr form & base \cr cse & nom \] \cr surf & `the swimmer' \] \] \cr
surf & sing(\@4) \] \cr x1 & \[ maj & \@3 \cr syn & \[ form & base
\cr comps & \< \@1 \[ maj & n \cr syn & \[ comps & nil \cr spr & nil
\cr cse & acc \] \] \> \cr spr & \@2 \] \cr surf & \@4 `splash' \]
\] \endavm , 2, 2, lex, \avm \[ syn & \[ comps & \[ \] \cr spr & \[ \]
\] \] \endavm , \avm \[ maj & v \cr syn & \[ form & base \cr comps &
\[ \] \cr spr & \[ \] \] \] \endavm\right>\eqdef{LPar4}$$

This has the effect of adding values to the parent's {\tt
syn|spr|syn|case} attribute, as well as the daughter's {\tt surf} and
{\tt comps} attributes.

At this point, the completer is applied to state (\xrefn{LPar4}). The
X$_0$ node is extracted and unified with the active symbol of state
(\xrefn{LPar2}). Unification succeeds, and the surface form of the
active symbol in the resulting feature structure (here, the value of
its {\tt surf} attribute) is matched against the active input token.
Since they match, the completer adds state (\xrefn{LPar5}) to the
current stateset.

$$\left< 2, \avm \[ x0 & \[ maj & \@3 v \cr syn & \[ form & \@2 finite
\cr agr & \@4 \[ cls & tsg \cr num & sg \] \cr comps & nil \cr spr &
\@5 \[ maj & n \cr syn & \[ spr & nil \cr comps & nil \cr agr & \@4
\cr form & base \cr cse & nom \] \cr surf & `the swimmer' \] \] \cr
surf & append(\@6, \@7) \] \cr x1 & \[ maj & \@3 \cr syn & \[ form &
\@2 \cr comps & \< \@1 \> \cr spr & \@5 \cr agr & \@4 \] \cr surf &
\@6 `splashes'\] \cr x2 & \@1 \[ maj & n \cr syn & \[ comps & nil \cr
spr & nil \cr cse & acc \] \cr surf & \@7 \] \] \endavm , 2, 2, nl,
\avm \[ syn & \[ comps & nil \cr spr & \[ \] \] \] \endavm , \avm
\[ maj & n \cr syn & \[ comps & nil \cr spr & nil \] \]
\endavm\right>\eqdef{LPar5}$$

Here, the parent has received values for the {\tt form} and {\tt
syn|spr|syn|cse} attributes, the first daughter has its {\tt surf}
attribute specified, and the second daughter now has values for {\tt
maj}, and {\tt syn}.  The rest of the parse is constructed similarly.

\sheading{4.3 A Potential Problem}

The deferral of input checking when the scanner is applied to a
lexical state can be a potential problem.  As presented here, the
scanner must, in effect, attempt to unify the active feature structure
with every compatible lexical item.  In a lexicon with a thousand
transitive verbs, then, the scanner would have to create a thousand
states every time it prepared to scan a transitive verb, rather than
just those whose surface forms match the input (and would be
retrievable through a simple hash table query).

In some sense, this is simply an unavoidable consequence of the
top-down nature of Earley's algorithm.  The simplest top-down parser
for a grammar with $n$ words and an input of $k$ tokens simply
enumerates the $k^n$ possible sentences and checks to see which of
these matches the input, outputting the respective parse tree.  All
top-down parsing algorithms are refinements of this strategy in which
the possibilities are kept as few as possible, usually by collapsing
as many possibilities as possible into one prediction. It is therefore
natural to expect this explosion of states.

On the other hand, it is similarly natural to expect that we can
sufficiently collapse these states into a small number of
predictions. This can be done in two different (but not mutually
exclusive) ways. One method would be to design the grammar in such a
way that the chosen restrictor was more effective in selecting a
subrange of lexical items.  In some cases, simply adding more paths to
the restrictor would have this effect.

The other method would involve changing the way words are stored in
the lexicon.  Instead of a list of words with corresponding feature
structures, the lexicon would consist of several feature structures
for word classes and a list of lexical items belonging to each class.
For instance, ``singular noun'' and ``intransitive verb'' might be two
word classes.  Under this system, the scanner would simply find an
appropriate word class feature structure to unify with the active
symbol; the word lists would not be accessed until completion.  As
many linguistic theories posit some sort of structured lexicon, this is
a natural modification to make.

\heading{5. Finite-State Morphology}

The modifications described in section 4 successfully allow the
relationship between base and derived-form feature structures to be
represented in the grammar.  This section describes how the parser
interprets a value like {\sf ppar(splash)}.

Morphological relationships are described in terms of morphological
correspondence functions (i.e. {\sf sing} in (\xrefn{LPar4})). These
functions have two components.  The first is an exception list.  This
handles irregular forms (for instance, {\it be $\to$ is}).  The second
part specifies an input to the finite-state transducer (for instance,
{\sssc \_\_+s}) for regular cases.  Thus the input to the function is
first matched against the items on the exception list.  If it matches
any of them, the function simply returns the corresponding output
form.  Otherwise, the input to the transducer is formed by replacing
the {\sssc \_\_} in the output pattern with the input.

The transducer then processes this input according to a user-provided
set of morphological rules.  These rules are represented in the form
of finite state tables.  From these tables, a list of {\it feasible
pairs\/} (possible correspondences between surface and underlying
forms) is formed. Upon receiving an input string, the transducer runs
all possible surface forms for the given underlying form through each
rule.  Only those forms which every finite state machine accepts are
legitimate outputs.

\heading{6. Conclusion}

The ability to parse with lexical rules is crucial to a proper
implementation of many modern linguistic theories.  As
feature-structure grammars become increasingly popular for linguistic
modelling, it is important that they also gain this ability. Once a
mechanism for handling surface form correspondence has been
incorporated into the parser, only a simple set of changes to the
parser is required to accomplish this.

\vfill\eject

\heading{A. Sample Grammar}

\sheading{A.1 Lexicon}

(Here, (\xrefn{RN1}) is the version used in the Section 3 (non-lexical
rule) grammar; (\xrefn{RN1a}) is the version used in the Section 4
(lexical rule) grammar.)

\avmfont{\sssc}
\avmvalfont{\sf}
\avmsortfont{\sfi}
$$\avm \[ maj & v \cr surf & splashes \cr syn & \[ form & finite
\cr spr & \[ maj & n \cr syn & \[ comps & nil \cr spr & nil \cr cse &
nom \cr agr & \[ cls & tsg \] \] \] \cr comps & \< \[ maj & n \cr syn
& \[ comps & nil \cr spr & nil \cr cse & acc\]\], nil \> \] \]
\endavm\eqdef{RN1}$$

$$\avm \[ maj & v \cr surf & splash \cr syn & \[ form & base \cr
spr & \[ maj & n \cr syn & \[ comps & nil \cr spr & nil \cr cse & nom
\] \] \cr comps & \< \[ maj & n \cr syn & \[ comps & nil \cr spr & nil
\cr cse & acc\]\], nil \> \] \] \endavm\eqdef{RN1a}$$

$$\avm \[ maj & det \cr surf & the \cr syn & \[ spr & nil \cr comps
& nil \] \] \endavm\eqdef{RN2}$$

$$\avm \[ maj & n \cr surf & swimmer \cr syn & \[ spr & \[ maj &
det \] \cr agr & \[ cls & tsg \cr num & sg \] \cr comps & nil \cr form
& base \] \] \endavm\eqdef{RN3}$$

\sheading{A.2 Non-Lexical Rules}

$$\avm \[ x0 & \[ maj & \@4 \cr syn & \[ form & \@2 \cr comps & \@3
\cr agr & \@5 \cr spr & \@6 \] \cr surf & append(\@7, \@8) \] \cr
x1 & \[ maj & \@4 \cr syn & \[ form & \@2 \cr comps & \< \@1, \@3 \>
\cr agr & \@5 \cr spr & \@6 \] \cr surf & \@7 \] \cr x2 & \@1 \[
surf & \@8 \] \] \endavm\eqdef{RN4}$$

$$\avm \[ x0 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr comps & nil
\cr agr & \@4 \cr spr & nil \] \cr surf & append(\@5, \@6) \] \cr
x1 & \@1 \[ syn & \[ agr & \@4 \cr spr & nil \] \cr surf & \@5 \]
\cr x2 & \[ maj & \@3 \cr syn & \[ form & \@2 \cr comps & nil \cr agr
& \@4 \cr spr & \@1 \] \cr surf & \@6 \] \] \endavm\eqdef{RN5}$$

\sheading{A.3 Lexical Rules}

$$\avm \[ x0 & \[ maj & \@3 v \cr syn & \[ form & finite \cr agr & \[
cls & non \] \cr comps & \@1 \cr spr & \@2 \] \cr surf & plurv(\@4)
\] \cr x1 & \[ maj & \@3 \cr syn & \[ form & base \cr comps & \@1 \cr
spr & \@2 \] \cr surf & \@4 \] \] \endavm\eqdef{RL1}$$

$$\avm \[ x0 & \[ maj & \@3 v \cr syn & \[ form & finite \cr agr & \[
cls & tsg \] \cr comps & \@1 \cr spr & \@2 \] \cr surf & sing(\@4)
\] \cr x1 & \[ maj & \@3 \cr syn & \[ form & base \cr comps & \@1 \cr
spr & \@2 \] \cr surf & \@4 \] \] \endavm\eqdef{RL2}$$

$$\avm \[ x0 & \[ maj & \@1 n \cr syn & \[ form & \@3 \cr agr & \[
cls & non \cr num & pl \] \cr comps & \@4 \cr spr & \@5 \] \cr surf
& plurn(\@2) \] \cr x1 & \[ maj & \@1 \cr syn & \[ form & \@3 \cr agr
& \[ num & sg \cr cls & tsg \] \cr comps & \@4 \cr spr & \@5 \] \cr
surf & \@2 \] \] \endavm\eqdef{RL3}$$

\vfill\eject
\heading{Works Cited}
\bibliography{990514-seniorthesis}
\bibliographystyle{plain}
\bye