Unix tools exercises

These all use a sample text called exatext1, which you should download into a directory of your own.

Pipelines

  1. Create a frequency list of the words in exatext1 that contain two consecutive vowels.
  2. How many words of 5 letters or more are there in exatext1?
  3. How many different words of exactly 5 letters are there in exatext1?
  4. What is the most frequent 7-letter word?
  5. List all words with exactly two non-consecutive vowels.
  6. List all words in exatext1 ending in "-ing". Which of those words are morphologically derived? (Hint: on Julius, spell -v shows morphological derivations. Be careful, though: versions derived from GNU Spell do not seem to.)
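
Most of these exercises build on the same basic pipeline: split the text into one word per line, sort, and count. The following is a minimal sketch of that starting point, not a solution to any one exercise. The function name wordfreq is hypothetical, and the sketch assumes that exatext1 is plain English text and that a "word" is a run of ASCII letters, compared without regard to case.

```shell
# Hypothetical helper: print a frequency list of the words in a file.
# Assumes a "word" is a run of ASCII letters; case is ignored.
wordfreq() {
    tr -cs 'A-Za-z' '\n' < "$1" |   # turn every run of non-letters into one newline
    tr 'A-Z' 'a-z' |                # fold to lower case
    grep -v '^$' |                  # drop the empty line left by leading non-letters
    sort |                          # bring identical words together
    uniq -c |                       # count each distinct word
    sort -rn                        # most frequent first
}

# Usage: wordfreq exatext1 | head
```

An individual exercise can then usually be approached by placing a grep filter between the tr and sort stages, or by piping the result into wc -l to count lines.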

Awk

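The file exatext2 contains imaginary data. Each line gives a word, its part of speech (noun, verb or adj), and the number of times it occurs in each of two texts, A and B.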
shop   noun  41  32
shop   verb  13  7
red    adj   2   0
work   noun  17  19
bowl   noun  3   1
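
To try the awk commands in this section without downloading anything, the listing above can be written to a local file. This is a minimal sketch, assuming you are working in a scratch directory where creating (or overwriting) a file named exatext2 is harmless.

```shell
# Recreate the sample data shown above as ./exatext2.
# Columns: word, part of speech, count in Text A, count in Text B.
cat > exatext2 <<'EOF'
shop   noun  41  32
shop   verb  13  7
red    adj   2   0
work   noun  17  19
bowl   noun  3   1
EOF

# Sanity check: awk exits non-zero if any line lacks exactly four fields.
awk 'NF != 4 {bad++} END {exit (bad > 0)}' exatext2 && echo "exatext2 looks fine"
```

The runs of spaces between columns do not matter, because awk by default splits fields on any amount of whitespace.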

Print every word that occurs more than 15 times in Text A:
awk '$3 > 15 {print $1}' < exatext2
shop
work
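
For each entry that is more frequent in Text B than in Text A, report its combined count and its part of speech:
awk '$4>$3 {print $1,"occurred",$3+$4,"times when used as a",$2 }' < exatext2
work occurred 36 times when used as a noun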

Total the noun counts for each text:
awk '$2 == "noun" {texta = texta + $3; textb = textb + $4}
     END {print "Nouns:", texta, "times in Text A and",
      textb, "times in Text B"}' < exatext2
Nouns: 61 times in Text A and 52 times in Text B

Chris Brew
Last modified: Thu Apr 5 15:58:23 EDT 2007