Phase 8: Using Your Search Tree
To round out our assignment, we'll use our TreeSet class template as an actual dictionary of English words.
We've provided a source file, minispell.cpp, which uses TreeSet<std::string> to provide a proof-of-concept for a rudimentary spelling checker. It's basically the same program you saw in Homework 6, but with some additional options to control insertion style.
Let's Go!
Adapt Your Makefile to Build minispell and Build It
Adjust your Makefile so that you build the program minispell (the executable will link just minispell.o since TreeSet is now a class template), and create this executable.
It will help if your Makefile has an OPTFLAGS= variable, as you then can set that variable from the command-line to easily have make build everything with optimization, as, for example,
make clean
make OPTFLAGS="-O3 -flto" LDFLAGS="-flto" minispell
or, if your LDFLAGS variable in your Makefile includes other libraries, you should instead run
make clean
make OPTFLAGS="-O3 -flto" LDFLAGS="-lsomelibraryineed -flto" minispell
where you replace -lLDFLAGS= on the command line overrides whatever is in your Makefile).
Run minispell
If you run the program with the --help argument, it will produce a usage message:
> ./minispell --help
Usage: ./minispell [options] [file-to-check ...]
Options:
-h, --help Print this message and exit.
-f, --file-order Insert words in the order they appear (default).
-s, --shuffled-order Insert words in random order, by shuffling the input.
-b, --balanced-order Insert words in a balanced order
-r, --random-insert Use a tree with RANDOMIZED insertion style (default)
-t, --root-insert Use a tree with ROOT insertion style
-l, --leaf-insert Use a tree with LEAF insertion style
-S, --use-std-set Use std::set instead of TreeSet
-n, --num-dict-words Number of words to read from the dictionary.
-m, --num-check-words Number of words to check for spelling.
-d, --dict-file Use a different dictionary file.
Default dictionary file: /cs70/data/smalldict.words
Default file to check: /cs70/data/ispell.words
If you run it without arguments, you should see output similar to
> ./minispell
Reading words from /cs70/data/smalldict.words... done!
Inserting into TreeSet, using randomized insertion (in order read)... done!
- insertion took 0.000517011 seconds
- 341 nodes, height 20, average depth 8.78299
- median word in dictionary: 'long'
Reading words from /cs70/data/ispell.words... done!
Looking up these words in the dictionary... done!
- looking up took 0.00717264 seconds
- 34831 words read, 325 in dictionary
Exploration
There are five dictionary files available in /cs70/data/ that you can use with this program:
smalldict.words- A small dictionary of 341 common English words.
ispell.words- The main dictionary used by the
ispellspelling checker, containing 34,831 common words. largedict.words- A larger dictionary of 281,478 English words, mostly taken from scrabble word lists.
web2- A dictionary of 234,937 words, originally based on some version of “Webster's Dictionary” and provided on most Unix systems.
all.words- The above dictionaries combined, for a total of 410,239 words.
Explore some of the other options to the program by doing some other runs. Also, feel free to run the program multiple times to see if/when the program output varies.
Interesting things to try include running
./minispell -r -d /cs70/data/all.words /cs70/data/web2
and contrasting the results with running
./minispell -S -d /cs70/data/all.words /cs70/data/web2
Questions
On Gradescope, you'll find some questions asking you to run this program and think about the results.
(When logged in, completion status appears here.)