Unix tools exercises
These exercises all use a sample
text called exatext1, which you should download
into a directory of your own.
Pipelines
- Create a frequency list of the words in exatext1 with two
consecutive vowels.
- How many words of 5 letters or more are there in exatext1?
- How many different words of exactly 5 letters are there in
exatext1?
- What is the most frequent 7-letter word?
- List all words with exactly two non-consecutive vowels.
- List all words in exatext1 ending in "-ing".
Which of those words are morphologically derived words?
(Hint: On Julius, spell -v shows morphological
derivations. But be careful, since versions derived
from GNU Spell don't seem to.)
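Most of the exercises above follow the same shape: split the text into one word per line, filter, then sort and count. The sketch below shows that shape for the first exercise only, and is one possible approach rather than the intended answer. It makes several assumptions not stated in the exercise: a word is a maximal run of ASCII letters, case is ignored, the vowels are a, e, i, o, u, and "two consecutive vowels" means at least one pair of adjacent vowels. The function name freq_vowel_pairs and the two-line sample file are hypothetical stand-ins so the sketch runs without exatext1.

```shell
# Hypothetical stand-in for exatext1, so the sketch runs on its own.
sample=$(mktemp)
printf 'The book about the moon.\nA good book, read by the sea.\n' > "$sample"

# freq_vowel_pairs FILE
# Print a frequency list of the words in FILE that contain two adjacent vowels.
freq_vowel_pairs() {
    tr -cs 'A-Za-z' '\n' < "$1" |      # put each run of letters on its own line
    tr 'A-Z' 'a-z' |                   # fold upper case to lower case
    grep -E '[aeiou]{2}' |             # keep words with two adjacent vowels
    sort | uniq -c |                   # count identical words
    sort -k1,1nr -k2                   # most frequent first, ties by word
}

freq_vowel_pairs "$sample"
```

To try it on the real text, call freq_vowel_pairs exatext1 from the directory you downloaded the file into. Changing only the grep line gives a starting point for the other exercises.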
Awk
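The file exatext2 contains imaginary data: word counts from two texts, A and B.
Each line gives a word, its part of speech (noun, verb or adj), its count in
Text A, and its count in Text B.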
shop noun 41 32
shop verb 13 7
red adj 2 0
work noun 17 19
bowl noun 3 1
awk '$3 > 15 {print $1}' < exatext2
shop
work
awk '$4>$3 {print $1,"occurred",$3+$4,"times when used as a",$2 }' < exatext2
work occurred 36 times when used as a noun
awk '$2 == "noun" {texta = texta + $3; textb = textb + $4}
END {print "Nouns:", texta, "times in Text A and",
textb, "times in Text B"}' < exatext2
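The noun totals can be checked by hand from the table: 41 + 17 + 3 = 61 for Text A and 32 + 19 + 1 = 52 for Text B. The sketch below recreates the five sample lines in a temporary file and runs the same program on it. The temporary file stands in for exatext2, and the function name noun_totals is hypothetical, introduced only so the program can be called on any file.

```shell
# Recreate the sample table in a temporary file (stand-in for exatext2).
data=$(mktemp)
cat > "$data" <<'EOF'
shop noun 41 32
shop verb 13 7
red adj 2 0
work noun 17 19
bowl noun 3 1
EOF

# noun_totals FILE
# Sum column 3 (Text A) and column 4 (Text B) over the lines tagged "noun".
noun_totals() {
    awk '$2 == "noun" { texta += $3; textb += $4 }
         END { print "Nouns:", texta, "times in Text A and",
                     textb, "times in Text B" }' "$1"
}

noun_totals "$data"
# prints: Nouns: 61 times in Text A and 52 times in Text B
```

The END block runs once after the last input line, which is why the totals are printed only once rather than per noun.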
Chris Brew
Last modified: Thu Apr 5 15:58:23 EDT 2007