CS 105

Writing and Testing Your Code

Write Your Code

Use the fgrep specification document to write your code. Be sure to debug your code as you go.

Note that there are some useful hints included later on the page.

C Library Header Files and Functions

The C library includes a great deal of well-tested functionality that you should take advantage of—there's almost never any advantage to writing your own version of a function already in the library.

Following are some suggested header files and functions to get you started.

Useful C Library Header Files

We found the following C header files to be either useful or necessary in our solution:

errno.h
Declares the global pseudovariable errno, which is needed for understanding the cause of an error.
stdio.h
Declares all the functions needed to use the stdio (standard I/O) package, which is the standard method for conveniently handling I/O in Unix.
stdlib.h
Contains the declarations for a large number of library functions, notably including malloc, realloc, and free.
string.h
Declares all the important string functions.

You may find that you need other header files as well; if you compile with -Wall the compiler will give you hints about missing headers.

Useful C Library Functions

There are a number of useful functions in the C library that will make writing fgrep simpler:

getline(bp, bsizep, f)
Reads a line of input from the file f into a buffer pointed to by *bp, which is of size *bsizep. If the buffer is too small to hold the line, getline will automatically resize it using realloc. The return value is the length of the line read, or -1 on error or end of file.
getopt(argc, argv, optstring)
Parses command-line options. optstring is a string that specifies the valid options; for example, if optstring is "ab:c", then -a and -c are valid options that don't take an argument, while -b is a valid option that does take an argument. The return value is the option character for the next option found, or -1 when there are no more options. The getopt function also sets the global variables optind (the index of the next argument to be processed) and optarg (the argument for the current option, if it requires one).
strstr(haystack, needle)
Returns a pointer to the first occurrence of the substring needle in the string haystack, or NULL if needle is not found.
strcmp(a, b)
Compares the strings a and b, returning 0 if they match and nonzero otherwise.
strcasecmp(a, b)
Is the same as strcmp except that it ignores case.
strlen(s)
Returns the length of string s. strlen takes time linear in the length of the string, so DO NOT call it in a loop condition (or anywhere else that runs on every iteration); call it once outside the loop and save its value.
ferror(f)
Returns nonzero (true) if there has been an error on the FILE * f.
strerror(errno)
Returns a string describing the error code it is given. Pass it the global variable errno to describe the most recent failed library or system call (you need to include <errno.h> and <string.h> to use it).
malloc, realloc, and free
Allocate, expand, and free memory.
fopen and fclose
Open and close files and set them up for using the standard input/output functions.
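
As a rough illustration of how several of these functions might fit together, here is a minimal sketch. The struct options record, the helper names parse_options and print_matches, and the option string "in" are all assumptions made for this example; the real set of switches comes from the fgrep specification, and your own design may look quite different.

```c
#define _POSIX_C_SOURCE 200809L  /* asks glibc to declare getline() and getopt() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical option record; only two example switches are shown. */
struct options {
    int ignore_case;   /* set by -i */
    int line_numbers;  /* set by -n */
};

/* Parse the switches. Returns the index of the first non-option
 * argument, or -1 if getopt reported an unsupported switch. */
int parse_options(int argc, char *argv[], struct options *opts)
{
    int c;

    opts->ignore_case = 0;
    opts->line_numbers = 0;
    while ((c = getopt(argc, argv, "in")) != -1) {
        switch (c) {
        case 'i': opts->ignore_case = 1; break;
        case 'n': opts->line_numbers = 1; break;
        default:  return -1;           /* getopt returned '?' */
        }
    }
    return optind;
}

/* Print every line of f that contains pattern. Returns the number of
 * matching lines, or -1 if a read error occurred. */
long print_matches(FILE *f, const char *pattern)
{
    char *line = NULL;   /* getline allocates and resizes this buffer */
    size_t cap = 0;
    ssize_t len;
    long matches = 0;

    while ((len = getline(&line, &cap, f)) != -1) {
        if (strstr(line, pattern) != NULL) {
            fputs(line, stdout);
            if (len == 0 || line[len - 1] != '\n')
                putchar('\n');         /* last line had no newline */
            matches++;
        }
    }
    int failed = ferror(f);            /* distinguish error from EOF */
    free(line);
    return failed ? -1 : matches;
}
```

In a main function you might call parse_options first, treat argv[first] as the pattern (where first is the returned index), and then call print_matches on each file opened with fopen, reporting failures with strerror(errno).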

Hints

  • Don't try to write everything at once! Either:
    • Start with argument processing. After you've processed the switches, put in some temporary code that prints the values of all your switch variables, and then loops through the remaining arguments, printing them with the %s format specifier. Once you have argument processing done, you can start writing the search functions.
    • Start with the line-reading algorithm. Get that working first, and test it by printing out the lines you read. Once you have that working, you can start writing the search functions.
  • Similarly, start writing the search function in a basic way. Don't try to handle all the options from the start; instead, concentrate on finding and printing matching lines. Once you have that going, you can adapt your code to add more features.
  • Use gdb! Debuggers are amazingly useful; that's why we taught you how to use gdb.
  • In gdb, use step and next, not stepi and nexti. You really don't want to debug this lab at the instruction level.
  • As mentioned above, you should start your self-expanding buffer at a very small size; we recommend 2 bytes. (A 1-byte buffer will cause problems with fgets, so don't go overboard.) Make the initial buffer size a constant by using #define. If necessary, step through the code with gdb, but be sure to use next to skip over malloc and realloc; you really don't want to wade through those functions. If you get into them accidentally, you can get out again with finish.
  • Be sure to test as you go along!
  • Once you're starting to get close to a full implementation, you can (and should) compare your output to the results from the “official” (system) version of fgrep.

    If you tested your program with

    ./fgrep -i test test1.txt; echo $?
    

    then you can run the system version by simply removing the ./, as in

    fgrep -i test test1.txt; echo $?
    

    Your output (and exit code) should match the system fgrep exactly. In fact, that's how we're going to test your implementation.
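
The self-expanding buffer hint can be sketched roughly as follows. This is one possible shape, not the required design: the helper name read_line, its return convention, the constant name INITIAL_BUFSIZE, and the doubling growth policy are all assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_BUFSIZE 2   /* deliberately tiny so the growth path gets exercised */

/*
 * Hypothetical helper: read one line from f into *bufp (capacity *capp),
 * growing the buffer with realloc as needed. Pass *bufp == NULL on the
 * first call. Returns 1 if a line was read, 0 at end of file with nothing
 * read, and -1 on a read error or allocation failure.
 */
int read_line(FILE *f, char **bufp, size_t *capp)
{
    size_t len = 0;   /* characters stored so far, excluding the '\0' */

    if (*bufp == NULL) {
        *bufp = malloc(INITIAL_BUFSIZE);
        if (*bufp == NULL)
            return -1;
        *capp = INITIAL_BUFSIZE;
    }

    /* fgets stores at most (size - 1) characters plus a terminating '\0' */
    while (fgets(*bufp + len, (int)(*capp - len), f) != NULL) {
        len += strlen(*bufp + len);        /* measure only the new piece */
        if (len > 0 && (*bufp)[len - 1] == '\n')
            return 1;                      /* got a complete line */

        /* No newline yet: double the buffer and keep reading. */
        size_t newcap = *capp * 2;
        char *bigger = realloc(*bufp, newcap);
        if (bigger == NULL)
            return -1;
        *bufp = bigger;
        *capp = newcap;
    }

    if (ferror(f))
        return -1;
    return len > 0 ? 1 : 0;   /* final line without a newline still counts */
}
```

Because each call to strlen measures only the newly read piece, the total work per line stays linear in the line's length. The caller owns the buffer and should free it when done.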

Testing

Both as you work and when you're done, you should be testing to make sure your code is working properly.

Some things to test for:

Short Patterns
Make sure you can match single characters and short strings.
Short Lines
Make sure you can match a line that is exactly the same as the pattern.
Very Long Lines
Make sure you can match lines that are extremely long, such as hundreds or even thousands of bytes. And make sure you don't have \( \mathrm{O}(N^2) \) behavior on those long lines.
Files With No Final Newline
The file test3.txt deliberately omits the final newline, which is a common violation of good Unix practice. Make sure you can find the word “newline” in it.
Line Prefixes
When there are multiple files, the matched lines should be prefixed with the filename. If the -n switch is present, the line number should be given after the filename. Test both with and without the -n switch and with and without multiple file arguments.
Multiple Matches
When a line contains multiple matches, it should be printed only once. Similarly, with the -l switch, if a file contains multiple matches it should still only be listed once.
Arbitrary Switch Ordering
Test with different switch orderings (e.g., both -i -n and -n -i).
Switch Combinations
What happens when two or more switches are combined, such as -i -l? What about when they don't make sense together, such as -l -n?
Illegal Usage
Be sure to generate a usage message if a supplied switch isn't supported, or if there is no pattern provided.
Missing Files
Handle inaccessible files correctly. Pay particular attention to fgrep's behavior when using the -q switch with and without missing files.
Exit Status
Use echo $? (which must be the very next command after you run fgrep) to show fgrep's exit status after various tests.

Note that the above is not necessarily an exhaustive list. You should also come up with tests of your own.
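
One common source of quadratic behavior on very long lines is calling strlen on every loop iteration. The sketch below shows the linear-time pattern; the helper name lowercase_in_place is hypothetical, and whether you need such a helper at all depends on how you choose to handle case-insensitive matching.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert s to lowercase in place in O(N) time. */
void lowercase_in_place(char *s)
{
    size_t n = strlen(s);   /* computed once, outside the loop */

    /* Writing "i < strlen(s)" here instead would rescan the whole
     * string on every iteration, giving O(N^2) behavior. */
    for (size_t i = 0; i < n; i++)
        s[i] = (char)tolower((unsigned char)s[i]);
}
```

The cast to unsigned char matters because passing a negative char value to tolower is undefined behavior.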

To Complete This Part of the Assignment

You'll know you're done with this part of the assignment when you've done all of the following:
