Editing trees and AVMs in LaTeX using Emacs (or, more generally, editing bracketed objects)


In many LaTeX packages, trees and AVMs are usually written as recursively embedded structures of bracketed objects. In qtree.sty, for example, [.XP YP ZP ] produces the following tree:
      XP
     / \
   YP   ZP 
In a more complicated tree, YP and ZP could themselves be trees (bracketed objects in the source file). So, in real life, you have things like:
 \Tree [.{\bf S}  [.NP {buta ga} ]
                  [.NP {kuma_i ni} ]
                  [.{\bf S} [.NP PRO_i ]
                            [.NP {boosi o} ]
                            [.V kabur- ] ] [.V -ase-ta ] ]
 
If you indent it properly, the code is relatively readable. (By default, Emacs doesn't do this indentation for you; if you know of a good way of doing it, please let me know.) But as the tree gets more complicated, it becomes increasingly difficult to figure out which brackets match and to edit part of a tree without messing up the whole thing. You then spend your whole night reconstructing the single most complicated tree in your paper, which always compiled happily before you made that slight modification.

Okay. But the solution is simple (since the data structure is simple). You only need some way of selecting the portion of text that is enclosed by matching parentheses (or brackets). And a package for identifying matching parentheses is already out there: mic-paren.el automatically highlights the matching parenthesis (or the whole text enclosed by the matching parentheses) when the cursor is on a parenthesis.
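To get that highlighting, mic-paren has to be loaded and activated in your init file. A minimal setup might look like the following. It assumes that mic-paren.el is already installed somewhere on your load-path, and setting paren-sexp-mode is optional (it asks for the whole enclosed text to be highlighted rather than just the matching bracket).

```elisp
;; Assumption: mic-paren.el is installed somewhere on `load-path'.
(require 'mic-paren)
(paren-activate)

;; Optional: highlight the whole bracketed expression,
;; not only the matching bracket.
(setq paren-sexp-mode t)
```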



So, let's just use this package to define a function that marks the region enclosed by parentheses. You then no longer need to worry about which opening parenthesis matches which closing parenthesis, and editing (copying, deleting, etc.) the chunks of text that correspond to subtrees or substructures of AVMs becomes a trivial task. (After all, computers are better at doing this kind of thing than humans, and that's one of the reasons you decided to use Emacs, right?)

The function copy-match-paren does this: it marks and copies the text enclosed by the pair of brackets at the cursor. For example, if you want to copy the embedded S in the above tree to somewhere else, you can put the cursor on the opening bracket of this node and call the function (by typing M-x copy-match-paren) to mark and copy the relevant portion. Then move the cursor to wherever you want to paste the text and type C-y to yank it.
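A command along these lines could be sketched as follows. This is not the original definition: it is a minimal illustration that relies only on Emacs's built-in forward-list and backward-list rather than on mic-paren, and it assumes that the current mode's syntax table treats [ and ] (or whichever brackets you use) as a matching pair. The key binding at the end is a hypothetical choice.

```elisp
;; Sketch only: not the original copy-match-paren definition.
(defun copy-match-paren ()
  "Mark and copy the bracketed text that starts or ends at point.
Point must be on an opening or a closing bracket, as defined by
the syntax table of the current buffer."
  (interactive)
  (let ((start (point))
        (syntax (and (char-after) (char-syntax (char-after)))))
    (cond
     ;; On an opening bracket: jump past the matching closing bracket.
     ((eq syntax ?\()
      (forward-list 1))
     ;; On a closing bracket: include it, then jump back to its opener.
     ((eq syntax ?\))
      (setq start (1+ start))
      (goto-char start)
      (backward-list 1))
     (t
      (user-error "Point is not on a bracket")))
    ;; Leave the mark at the other end so that C-w can kill the region.
    (push-mark start t t)
    (copy-region-as-kill (min start (point)) (max start (point)))
    ;; `copy-region-as-kill' asks Emacs to deactivate the mark; undo
    ;; that so the selected region stays highlighted.
    (setq deactivate-mark nil)))

;; Hypothetical key binding; pick any key you like.
(global-set-key (kbd "C-c m") #'copy-match-paren)
```

Because the mark is left at the other end of the bracketed text, a following C-w (kill-region) removes exactly the text that was just copied.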

If you instead want to remove the embedded S from the above tree structure, after calling copy-match-paren, call kill-region (C-w) to remove the selected region, which gives you the following result:
 \Tree [.{\bf S}  [.NP {buta ga} ]
                  [.NP {kuma_i ni} ]
                   [.V -ase-ta ] ]
 


AVMs in avm.sty like the following can be edited in a similar way.
\[\textit{verb}\\
  PHON & \textit{nom-ase} \\
  CONT & \[QUANTS &  \@a  \\
           NCL    &  \[\textit{cause-rel}\\
                       ACTOR & \textit{i} \\
                       UNDERGOER & \textit{j} \\
                       EFFECT \; \[QUANTS & \@b \\
                                   NCL &  \[\textit{drink} &  \\
                                            ACTOR & \textit{j} \\
                                            UNDERGOER & \textit{k} \]\]\]\]\]
 
Here, the only difference is that the brackets are escaped with a backslash. So, if you want to delete a portion of a feature structure like this, you should either move to the closing bracket first and select and delete the region from there, or select and delete the region from the opening bracket and get rid of the extra backslash by hand. (Copying works as expected.)


Yusuke Kubota
Last Modified: <March 21 2006>

This document was translated from LaTeX by HeVeA.