Lab 6: fgrep
fgrep is a simple command that searches one or more files
for a given string. In the I/O lecture we saw the “guts” of fgrep,
    n = strlen(search_string);
    while (1) {
        nbytes = read(fd, buf, sizeof buf);
        for (int i = 0; i < nbytes - n + 1; i++) {
            if (strncmp(&buf[i], search_string, n) == 0)
                /* Print line containing search_string */
        }
    }
and discussed the fact that this code fails if the search string spans two I/O buffers.
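One common way to avoid the boundary problem is to carry the last n - 1 bytes of each buffer over to the front of the next one, so that a match straddling two reads is still seen in one piece. The following is only a minimal sketch of that idea, not the lab's required solution: the name fd_contains and the 4096-byte buffer are assumptions made for illustration, and the function merely reports whether the string occurs rather than printing the matching line.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `needle` occurs anywhere in the data
 * readable from fd, 0 if it does not, and -1 on a read error. */
int fd_contains(int fd, const char *needle)
{
    char buf[4096];             /* arbitrary size chosen for this sketch */
    size_t n, keep = 0;         /* keep = bytes carried over from last read */

    assert(needle != NULL);
    n = strlen(needle);
    if (n == 0)
        return 1;               /* the empty string matches trivially */
    if (n > sizeof buf / 2)
        return -1;              /* sketch only: needle too long for buf */

    for (;;) {
        ssize_t nbytes = read(fd, buf + keep, sizeof buf - keep);
        if (nbytes < 0)
            return -1;          /* read error */
        if (nbytes == 0)
            return 0;           /* end of file, nothing found */
        size_t have = keep + (size_t)nbytes;

        /* Check every position where a full match fits in the buffer. */
        for (size_t i = 0; i + n <= have; i++)
            if (memcmp(buf + i, needle, n) == 0)
                return 1;

        /* Move the last n-1 bytes (or everything, if we have fewer) to
         * the front so a match spanning two reads is not lost. */
        keep = have < n - 1 ? have : n - 1;
        memmove(buf, buf + have - keep, keep);
    }
}
```

A real fgrep would also need to remember where the current line started so that it can print the whole line, which this sketch leaves out.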
In this lab you'll create a version of fgrep that doesn't have
that major bug, and that also offers a number of other features that are
common in well-written Unix filters, such as
- “Switches” (options) on the command line can be used to modify the program's behavior in useful ways.
- Input can come from stdin or from a file.
- Multiple files can be processed in a single invocation.
- There is no built-in limit on the size of the file being processed, nor on the length of lines in that file.
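To illustrate the "no limit on line length" feature, here is a hedged sketch that leans on the POSIX getline function, which allocates and grows its own buffer as needed. Whether the lab permits getline (as opposed to managing a buffer around read yourself) is an assumption you should check; the function name print_matching_lines is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: copies every line of `in` that contains `needle`
 * to `out` and returns the number of matching lines. */
long print_matching_lines(FILE *in, FILE *out, const char *needle)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;
    ssize_t len;
    long matches = 0;

    assert(in != NULL && out != NULL && needle != NULL);
    while ((len = getline(&line, &cap, in)) != -1) {
        /* strstr() stops at a NUL byte, so lines containing embedded
         * NULs are only searched up to that point in this sketch. */
        if (strstr(line, needle) == NULL)
            continue;
        fputs(line, out);
        if (line[len - 1] != '\n')
            fputc('\n', out);    /* last line of file lacked a newline */
        matches++;
    }
    free(line);
    return matches;
}
```

Because the buffer is grown on demand, a 100,000-character line is handled the same way as a 10-character one.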
Overview of a Unix Filter
A “proper” Unix filter program has a number of common characteristics:
- By default, it reads from standard input (stdin) and writes to standard output (stdout).
- It isn't “chatty”: it does its job without progress reports. (By default.)
- One or more file names can be given on the command line, in which case it reads those files rather than stdin.
- Switches (options) are introduced by a single dash (-) followed by a single character, or, alternatively, by two dashes (--) followed by a longer name.
- Switches can appear in any order.
- Switches always precede file names and other arguments. Switches and arguments can't be intermixed.
- The exit status indicates whether the filter succeeded (using a filter-specific definition of “success”).
- If the filter is invoked incorrectly, it prints a “usage” message that briefly summarizes the correct invocation.
- Errors (including the usage message) are reported to stderr.
- A separate manual page thoroughly documents the program. The documentation is explicitly not part of the program itself.
For this lab, we'll ignore the man page requirement, but implement
the rest correctly.
Grading
The lab is worth 100 points, scored as follows:
- Basic Functionality
- 40 points for being able to find (and not find) strings in a single file, correctly handling files that do not end in a newline, and generating a correct exit status.
- Switches
- 5 points for each of the four switches you're asked to implement (i.e., 20 points if you implement all four).
- Standard Input and Multiple Files
- 15 points for correctly handling multiple files or no files on the command line (including correct line prefixes).
- Long Lines
- 10 points for handling lines of arbitrary length.
- Switch Ordering
- 5 points for accepting arbitrary switch orders (note that the library functions getopt and getopt_long do this for you; easy points!).
- Missing Files
- 5 points for correctly handling missing files, including producing the proper exit code.
- Miscellaneous
- 5 points for generating correct usage messages.
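The multiple-file, standard-input, and missing-file items above can be pictured with a small driver loop. This is a sketch under stated assumptions, not the lab's specification: the exit codes follow the convention used by grep (0 if a match was found, 1 if none, 2 on error), the "name:" prefix is applied only when more than one file is named, and the function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Searches one stream; prefixes each match with "label:" when label is
 * non-NULL. Returns the number of matching lines. */
static long search_stream(FILE *in, const char *label,
                          const char *needle, FILE *out)
{
    char *line = NULL;
    size_t cap = 0;
    ssize_t len;
    long matches = 0;

    while ((len = getline(&line, &cap, in)) != -1) {
        if (strstr(line, needle) == NULL)
            continue;
        if (label != NULL)
            fprintf(out, "%s:", label);
        fputs(line, out);
        if (line[len - 1] != '\n')
            fputc('\n', out);
        matches++;
    }
    free(line);
    return matches;
}

/* Hypothetical driver: returns the exit status for the whole run,
 * assuming grep's convention of 0 = found, 1 = not found, 2 = error. */
int search_files(const char *needle, int nfiles, char *files[], FILE *out)
{
    long matches = 0;
    int had_error = 0;

    assert(needle != NULL && out != NULL);
    if (nfiles == 0)                        /* no files: read stdin */
        matches += search_stream(stdin, NULL, needle, out);
    for (int i = 0; i < nfiles; i++) {
        FILE *in = fopen(files[i], "r");
        if (in == NULL) {
            perror(files[i]);               /* report on stderr, keep going */
            had_error = 1;
            continue;
        }
        matches += search_stream(in, nfiles > 1 ? files[i] : NULL,
                                 needle, out);
        fclose(in);
    }
    if (had_error)
        return 2;
    return matches > 0 ? 0 : 1;
}
```

Note that a missing file does not stop the run: the remaining files are still searched, and the error is reflected only in the final status.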
Steps