Writing and Testing Your Code
Write Your Code
Use the fgrep specification document to write
your code. Be sure to debug your code as you go.
Note that there are some useful hints included later on the page.
C Library Header Files and Functions
The C library includes a great deal of well-tested functionality that you should take advantage of—there's almost never any advantage to writing your own version of a function already in the library.
Following are some suggested header files and functions to get you started.
Useful C Library Header Files
We found the following C header files to be either useful or necessary in our solution:
errno.h - Declares the global pseudovariable errno, which is needed for understanding the cause of an error.
stdio.h - Declares all the functions needed to use the stdio (standard I/O) package, which is the standard method for conveniently handling I/O in Unix.
stdlib.h - Contains the declarations for a large number of library functions, notably including malloc, realloc, and free.
string.h - Declares all the important string functions.
You may find that you need other header files as well; if you compile
with -Wall the compiler will give you hints about missing headers.
Useful C Library Functions
There are a number of useful functions in the C library that will make
writing fgrep simpler:
getline(bp, bsizep, f) - Reads a line of input from the file f into a buffer pointed to by *bp, which is of size *bsizep. If the buffer is too small to hold the line, getline will automatically resize it using realloc. The return value is the length of the line read, or -1 on error or end of file.
getopt(argc, argv, optstring) - Parses command-line options. optstring is a string that specifies the valid options; for example, if optstring is "ab:c", then -a and -c are valid options that don't take an argument, while -b is a valid option that does take an argument. The return value is the option character for the next option found, or -1 when there are no more options. The getopt function also sets the global variables optind (the index of the next argument to be processed) and optarg (the argument for the current option, if it requires one).
strstr(haystack, needle) - Returns a pointer to the first occurrence of the substring needle in the string haystack, or NULL if needle is not found.
strcmp(a, b) - Compares the strings a and b, returning 0 if they match and nonzero otherwise.
strcasecmp(a, b) - Is the same as strcmp except that it ignores case.
strlen(s) - Returns the length of string s. strlen takes time linear in the length of the string, so DO NOT use it inside a for construct; call it outside the loop and save its value.
ferror(f) - Returns nonzero (true) if there has been an error on the FILE *f.
strerror(errno) - Gives a string representation of the most recent I/O error, which is encoded in the global variable errno (you need to include <errno.h> and <string.h> to use it).
malloc, realloc, and free - Allocate, expand, and free memory.
fopen and fclose - Open and close files and set them up for using the standard input/output functions.
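As an illustration of the getopt description above, here is a minimal sketch that parses the "ab:c" example option string. The names struct demo_opts and parse_demo_opts are hypothetical, invented for this example; they are not part of the assignment, and your fgrep will need its own option string taken from the specification.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>   /* getopt, optind, optarg */

/* Hypothetical container for the switches in the "ab:c" example. */
struct demo_opts {
    int a_flag;          /* nonzero if -a was given */
    int c_flag;          /* nonzero if -c was given */
    const char *b_arg;   /* argument of -b, or NULL if -b was absent */
};

/*
 * Parse the switches described by the option string "ab:c".
 * Returns the index of the first non-option argument, or -1 if getopt
 * reported an unknown switch or a -b with no argument.
 */
int parse_demo_opts(int argc, char *argv[], struct demo_opts *opts)
{
    int ch;

    assert(opts != NULL);
    opts->a_flag = 0;
    opts->c_flag = 0;
    opts->b_arg = NULL;

    while ((ch = getopt(argc, argv, "ab:c")) != -1) {
        switch (ch) {
        case 'a':
            opts->a_flag = 1;
            break;
        case 'b':
            opts->b_arg = optarg;   /* points into argv; do not free it */
            break;
        case 'c':
            opts->c_flag = 1;
            break;
        default:                    /* getopt returns '?' on errors */
            return -1;
        }
    }
    return optind;                  /* first argument that is not a switch */
}
```

If the returned index is first, then argv[first] through argv[argc - 1] are the remaining arguments. getopt keeps its position in the global optind, so this sketch is meant to be run once over a given argv.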
Hints
- Don't try to write everything at once! Either:
  - Start with argument processing. After you've processed the switches, put in some temporary code that prints the values of all your switch variables, and then loops through the remaining arguments, printing them with the %s format specifier. Once you have argument processing done, you can start writing the search functions.
  - Start with the line-reading algorithm. Get that working first, and test it by printing out the lines you read. Once you have that working, you can start writing the search functions.
- Similarly, start writing the search function in a basic way. Don't try to handle all the options from the start; instead, concentrate on finding and printing matching lines. Once you have that going, you can adapt your code to add more features.
- Use gdb! Debuggers are amazingly useful; that's why we taught you how to use gdb.
- In gdb, use step and next, not stepi and nexti. You really don't want to debug this lab at the instruction level.
- As mentioned above, you should start your self-expanding buffer at a very small size; we recommend 2 bytes. (A 1-byte buffer will cause problems with fgets, so don't go overboard.) Make the initial buffer size a constant by using #define. If necessary, step through the code with gdb. But be sure to use next to skip over malloc and realloc; you really don't want to wade through those functions. If you get into them accidentally, you can get out again with finish.
- Be sure to test as you go along!
- Once you're starting to get close to a full implementation, you can (and should) compare your output to the results from the “official” (system) version of fgrep. If you tested your program with
      ./fgrep -i test test1.txt; echo $?
  then you can run the system version by simply removing the ./, as in
      fgrep -i test test1.txt; echo $?
  Your output (and exit code) should match the system fgrep exactly. In fact, that's how we're going to test your implementation.
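The self-expanding buffer mentioned in the hints might be sketched as follows. This is one possible shape, not the required design: the name read_line, its calling convention, and the doubling growth policy are all assumptions made for illustration. The buffer starts at 2 bytes so that the growth path runs on almost every line, and a final line with no trailing newline is still returned.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_BUFSIZE 2   /* deliberately tiny so growth is exercised */

/*
 * Read one line from f into a malloc'd buffer that doubles as needed.
 * Hypothetical helper: *bufp and *sizep carry the buffer between calls
 * (start with *bufp == NULL). Returns the line length, counting the
 * '\n' if present, or -1 on end of file, I/O error, or allocation failure.
 */
long read_line(FILE *f, char **bufp, size_t *sizep)
{
    size_t len = 0;

    assert(f != NULL && bufp != NULL && sizep != NULL);

    if (*bufp == NULL) {
        *bufp = malloc(INITIAL_BUFSIZE);
        if (*bufp == NULL)
            return -1;
        *sizep = INITIAL_BUFSIZE;
    }

    /* Each fgets call appends to whatever has been read so far. */
    while (fgets(*bufp + len, (int)(*sizep - len), f) != NULL) {
        len += strlen(*bufp + len);      /* scans only the new chunk */
        if (len > 0 && (*bufp)[len - 1] == '\n')
            return (long)len;            /* complete line */

        /* No newline yet: double the buffer and keep reading. */
        size_t newsize = *sizep * 2;
        char *newbuf = realloc(*bufp, newsize);
        if (newbuf == NULL)
            return -1;
        *bufp = newbuf;
        *sizep = newsize;
    }

    /* End of file: hand back a final unterminated line if there is one. */
    if (len > 0 && !ferror(f))
        return (long)len;
    return -1;
}
```

Because strlen is applied only to the newly read chunk, the total work stays linear in the line length. The sketch assumes lines contain no NUL bytes, and the caller is responsible for calling free on the buffer when done.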
Testing
Both as you work and when you're done, you should be testing to make sure your code is working properly.
Some things to test for:
- Short Patterns
  - Make sure you can match single characters and short strings.
- Short Lines
  - Make sure you can match a line that is exactly the same as the pattern.
- Very Long Lines
  - Make sure you can match lines that are extremely long, such as hundreds or even thousands of bytes. And make sure you don't have \( \mathrm{O}(N^2) \) behavior on those long lines.
- Files With No Final Newline
  - The file test3.txt deliberately omits the final newline, which is a common violation of good Unix practice. Make sure you can find the word “newline” in it.
- Line Prefixes
  - When there are multiple files, the matched lines should be prefixed with the filename. If the -n switch is present, the line number should be given after the filename. Test both with and without the -n switch and with and without multiple file arguments.
- Multiple Matches
  - When a line contains multiple matches, it should be printed only once. Similarly, with the -l switch, if a file contains multiple matches it should still only be listed once.
- Arbitrary Switch Ordering
  - Test with different switch orderings (e.g., both -i -n and -n -i).
- Switch Combinations
  - What happens when two switches or more are combined, such as -i -l? What about when they don't make sense when used together, such as with -l -n?
- Illegal Usage
  - Be sure to generate a usage message if a supplied switch isn't supported, or if there is no pattern provided.
- Missing Files
  - Handle inaccessible files correctly. Pay particular attention to fgrep's behavior when using the -q switch with and without missing files.
- Exit Status
  - Use echo $? (which must be the very next command after you run fgrep) to show fgrep's exit status after various tests.
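Two of the cases above, multiple matches on one line and a file with no final newline, come down to how the match loop is written. The following is a rough sketch using getline and strstr, not a reference solution: print_matches is a hypothetical helper, it ignores every switch, and appending a newline to an unterminated last line is an assumption about the system fgrep's behavior that you should verify yourself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/*
 * Copy every line of `in` that contains `pattern` to `out`, once per line.
 * Hypothetical helper; returns the number of matching lines, or -1 on
 * a read error.
 */
long print_matches(FILE *in, const char *pattern, FILE *out)
{
    char *line = NULL;    /* getline allocates and resizes this buffer */
    size_t cap = 0;
    ssize_t len;
    long matches = 0;

    assert(in != NULL && pattern != NULL && out != NULL);

    while ((len = getline(&line, &cap, in)) != -1) {
        if (strstr(line, pattern) == NULL)
            continue;
        /* One fputs per line, however many times the pattern occurs. */
        fputs(line, out);
        /* Assumed behavior: an unterminated last line still ends in '\n'. */
        if (len == 0 || line[len - 1] != '\n')
            fputc('\n', out);
        matches++;
    }

    if (ferror(in))
        matches = -1;
    free(line);
    return matches;
}
```

Since strstr stops at the first occurrence, a line with several matches is still tested and printed only once.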
Note that the above is not necessarily an exhaustive list. You should also come up with tests of your own.