Guidelines for Command Line Interface Design

Mark Kampe

1. Introduction

Ninety-nine percent of everything that is written about human interface engineering is about Graphical User Interfaces (GUIs). There are good reasons for this:

It is said that a picture is worth a thousand words. Much more information can be presented in a two dimensional graphical display than could be represented by text occupying the same number of square inches.
Just as people who do not speak a language often resort to gestures for communication, it feels quite natural to express our needs by pointing at mnemonic graphical icons.

Information presented in a graphical interface is easier to comprehend. Graphical interfaces are easier to learn. Many people consider GUIs to be the foundation of usability.

All of this is true, and yet we still find command line interfaces (and their sibling, simple textual output) all around us. There are many reasons to use command line interfaces, and there are usability factors that can be applied to the design of these interfaces.

2. Why Command Line Interfaces

I know people who believe the only reasons that CLIs exist are: troglodyte users and lazy programmers. Both of these do exist, and they do indeed favor CLIs ... but there are much better reasons for CLIs to continue to exist:

scriptability
expressive power
conciseness of expression
composability of functions
parseability
minimal environments

2.1 Scriptability

Many of the things that people do with computers are done, perhaps with minor variations, over and over again. If the functions to be performed can be controlled from a CLI, it is easy to create a script of commands to be run. The use of command scripts is so popular that "command line interpreters" have evolved into rich programming languages with variables, decisions, iteration and functions.

Scripts can be run many times, and will always be the same. There is no danger of a person misspelling a word, or clicking slightly outside of a box. Scripts can be reviewed, and adapted for new uses. Once they are perfected, they can be run successfully by people who do not understand how they work. It is tempting to think that we can record cursor motions and mouse-clicks and replay them, but this doesn't work. The sizes and positions of dialog boxes change with the text they contain, the size and contents of the screen, and from one release to another. If you want a complex operation performed correctly, time after time, a command script is the simplest and best way to do it.

2.2 Expressive Power

Most of us can express ourselves far more clearly in a written or verbal language than we can with gestures. This is because our verbal languages have a richer vocabulary and more elaborate syntax. The same is true when it comes to describing a set of operations to perform. A richer language can express more complex thoughts.

Once-simple command line interpreters have evolved into complex programming languages:

they have variables
they can perform arithmetic, logical, and character computations.
they can make decisions
they support various types of loops
they have functions with parameter substitution
they are capable of operating on the results of the programs they run.
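
As a small illustration of the features just listed, here is a sketch in POSIX shell. The function names size_of and total_bytes are made up for this example; only wc is a real command.

```shell
# size_of FILE: a function with parameter substitution
size_of() {
    wc -c < "$1"
}

# total_bytes FILE...: add up the sizes of the regular files named
total_bytes() {
    total=0                            # a variable
    for f in "$@"; do                  # a loop
        if [ -f "$f" ]; then           # a decision
            n=$(size_of "$f")          # operate on the result of a program
            total=$((total + n))       # arithmetic
        else
            echo "skipping $f: not a regular file" >&2
        fi
    done
    echo "$total"
}
```

Calling total_bytes with a list of file names prints a single number, which another command could in turn consume.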

It is possible to describe computations in a scripting language that could never be described through a Graphical User Interface (unless the GUI was for a program editor ... it's been done).

GUIs make it very easy to perform standard operations on a limited set of objects with a few simple variations. GUIs do not have a rich enough vocabulary and syntax to describe complex operations.

2.3 Conciseness of Expression

For doing obvious operations on a limited set of objects, GUIs can be very simple to use.

For the primary operation, move the cursor to the object and left-click.
To select from a list of simple secondary operations, you can usually move the cursor to the object and right-click.
To process an object with a running application, you can drag the object into an icon for the desired service.
To process multiple objects you can do regional or incremental selects before selecting the operation to perform.

The ease, simplicity, and intuitiveness of these visual metaphors are major reasons for the success of GUIs. Why would anyone need to use a more complex language?

I just say "a porterhouse, medium rare, baked potato, with sour cream, salad with ranch dressing".

Sit an experienced Word user next to an experienced vi/emacs user, and see how long it takes each of them to make a set of edits. The latter editors may have arcane command languages ... but the simple fact is that 3 or 4 keystrokes can be entered much more quickly and easily than shifting a hand over to the mouse, moving to an object, pulling down a menu, and selecting an option.

When the lists of options are long, the decision trees are deep, or there are many different selections to be combined, selecting what you want from menus becomes extremely cumbersome ... and the cost of learning a real language becomes a better investment. It takes less body movement, and less time to enter a command than it takes to use a mouse ... and the more complex the command, the greater the difference.

2.4 Composability of Functions

A key premise of object oriented programming is that one can define a useful object, and then use it over and over again, without having to reinvent the same functionality every time you need it. This is the "toolbox" principle ... which was made famous by the cryptic commands of the early UNIX systems.

Major graphical applications tend to have numerous menus to import, export, and transform data in all kinds of interesting ways. The "toolbox" approach is to build a dozen small commands that each perform an interesting data manipulation:

sort reads input lines, and writes them back out in a sorted order.
grep reads input lines, compares them against a regular expression, and writes the lines that match.
uniq reads input lines, compares them, and prints out only the unique ones (no duplicates).
cut reads input lines, extracts fields from them based on specified delimiters, and writes the extracted fields to its output.
find walks a directory tree looking for names that match a specified pattern, and executes a specified command on each of them.
tr reads an input stream and performs specified character translations.
mc reads input lines and rewrites them in multiple columns.
more reads input lines and sends them to its output, one screen-full at a time.
pr reads input lines, breaks them into pages with headers and footers and sends them to its output.
print reads input lines and sends them to a printer.
mail reads input lines and turns them into a message, which it sends to a specified recipient.

If you think in GUI terms, these would seem to be quirky little parts of programs ... and you might laugh at how strange and cryptic they are ... until you see someone type in a one line command that searches his entire disk for music, pulls out the Blues albums, and copies them all into an artist/album/title directory hierarchy on your flash drive, and emails you a list of the albums you just got. Then you will realize that there may be something to this blue-collar toolkit stuff.

In a GUI world, you think about what you can do with the program you are running. In a CLI world you think about what can be done with the transitive closure of all existing programs, and the new ones that you write. You don't write a program that does exactly what you need. You write a new program that does the piece of what you need that isn't covered by existing programs.
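
To make the toolbox idea concrete, here is one possible composition of a few of the commands above. The function name top_words is hypothetical; tr, sort, uniq and head are standard tools, and none of them knows anything about counting words.

```shell
# top_words N: read text on standard input, print the N most frequent words
top_words() {
    tr -cs '[:alpha:]' '\n' |       # split the text into one word per line
    tr '[:upper:]' '[:lower:]' |    # fold upper case to lower case
    sort |                          # bring identical words together
    uniq -c |                       # collapse duplicates, prefixing a count
    sort -rn |                      # largest counts first
    head -n "${1:-3}"               # keep the first N lines (default 3)
}
```

Something like top_words 10 < chapter.txt would then list the ten most common words in a (hypothetical) file chapter.txt. The only new "program" is the arrangement of the existing pieces.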

2.5 Output Parseability

GUIs tend to display information in well delimited dialog boxes, scrollable lists and navigable data trees. These representations are designed for visual clarity and intuitive traversal by human beings. A program that wanted to extract specific information from such a display might find it relatively difficult to do so.

CLI programs tend to produce output in lines, columns, or other more readily parseable formats. This makes it relatively easy for another program to find and extract specific pieces of information from CLI output. As an example, it has long been common for people to write shut-down scripts that run the ps (process status) command to get information about running processes, use the grep (regular expression processor) command to extract the line that describes a desired daemon, use the cut command to pull out the process ID field from that line, and then use the kill command to send a signal to the desired process.
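
The shut-down idiom just described might be sketched as follows. So that the sketch behaves the same everywhere, it works on an invented sample listing rather than live ps output, and it only prints the kill command it would run. The daemon name exampled and the function name daemon_pid are made up.

```shell
# daemon_pid NAME: read a ps-style listing on stdin, print the PID of NAME
daemon_pid() {
    grep " $1\$" |        # keep the line whose command column is NAME
    sed 's/^ *//' |       # drop the padding in front of the PID column
    cut -d ' ' -f 1       # the PID is now the first space-separated field
}

# Invented sample of what a process listing might look like.
sample='  PID TTY          TIME CMD
    1 ?        00:00:02 init
  412 ?        00:00:00 sshd
  977 ?        00:00:05 exampled'

pid=$(printf '%s\n' "$sample" | daemon_pid exampled)
echo "would run: kill -TERM $pid"
```

In a real script the listing would come from ps itself. Column layouts differ between systems, so the field extraction would need to be checked against the local ps.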

2.6 Minimal Environments

Thus far we have considered advantages that CLIs may offer over GUIs. CLIs are also used in situations where there is insufficient infrastructure to support a GUI.

There are systems that lack either graphical displays or the storage and cycles required by the programs that support graphical output.
During the early stages of system start-up the complex user-mode services that are required to manage graphical sessions may not yet have been started.
When debugging the operating system, we cannot use services implemented by the operating system. This limits us to the use of firmware or minimal kernel-supported serial console sessions.
Diagnosis and management of remote systems must often be performed over (relatively) slow serial communications sessions, that cannot support the bandwidth required for graphical output.

In all of these situations, a command line interface may be the only practical mode of interaction.

3. CLIs and Usability

Many people would say that if you have to resort to a CLI, you have already abandoned usability ... but (as noted previously) there are positive reasons to use a CLI. Moreover, if you think about it for a minute, it becomes clear that usability considerations are (if anything) even more important with CLIs than with GUIs.

If we consider the factors that contribute to usability (familiarity, intuitiveness, simplicity, consistency of metaphors, robustness, adaptability) we see that none of them (with the possible exception of intuitiveness) is particularly tied to graphical interfaces. We are still going to apply the same principles to the design of command line interfaces ... and we will just have to work a little bit harder on intuitiveness.

The primary user-costs in most GUIs are finding the desired item, cursor motion, and clicks. We mitigate these costs by consistency of placement (making things easier to find) and designing our menus and dialogs around the most likely operations. In CLIs the primary costs are remembering command names and argument orders, and then having to type them. We attempt to mitigate the memory cost by using mnemonic names and standard conventions. We attempt to mitigate the typing costs by keeping names short and/or allowing abbreviations.

3.1 Mnemonic Command Names

We attempt to make command names easier to remember by making those names mnemonic (reminiscent of words that describe their functions). But we also want to make sure that commands have short (and intuitive) abbreviations:

sort ... to sort input lines
find ... to find desired files
copy (cp) ... for copying files
remove (rm) ... to delete a file

Unfortunately such naming discipline has not always been followed. In UNIX, Ken Thompson and Dennis Ritchie seem to have gone for short but less intuitively obvious names:

ls for directory listings
cat (short for concatenate) for appending multiple files together and sending them to the standard output.
grep (short for global regular expression print) for pulling desired lines out of an input stream.
awk for a pattern-processing language designed by Aho, Weinberger and Kernighan.

The basic set of UNIX commands proved to be very useful, and the tool-box concept which they pioneered changed the way that people thought about interactive commands and command languages. I think, however, that it is safe to say that these commands (which are still available on all UNIX-derived systems) have survived, not due to their names, but despite them. If there is a lesson we can learn from this, it is probably that mnemonic names are extremely valuable as learning aids ... but after they become so familiar that they are simply tokens in a language, typed by muscle-memory, their mnemonic qualities become much less important.

3.2 Intuitive Arguments

For many commands, the arguments are obvious. If I am deleting files,

del project.bak *.o

it is obvious that the arguments will be the names of the files to be deleted, and the order of the arguments is of little consequence.

Where multiple arguments must be specified, it often helps if there is some rationale to the order. So it is that the copy (cp) command takes source files as its first argument, and the destination as its last argument:

We can almost see the implied word to in these commands, which makes the argument order intuitive. This then becomes a convention, where all commands that take input and output file names interpret earlier file names as input files, and the last file name as a target.
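
As an illustration of this source-then-destination order, here are two cp commands run in a scratch directory. All of the file and directory names are hypothetical.

```shell
# Set up a scratch directory with two small files and a target directory.
work=$(mktemp -d) && cd "$work"
mkdir archive
echo "draft" > report.txt
echo "notes" > notes.txt

cp report.txt report.bak            # copy report.txt "to" report.bak
cp report.txt notes.txt archive     # copy both files "to" the directory archive
```

In both cases everything before the last argument is a source, and the last argument is the target.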

In cases where multiple arguments play more subtle roles (e.g. different types of input files which cannot be automatically distinguished from one another, or other types of arguments), expecting people to remember the correct order for multiple arguments is a formula for disaster, so another syntax is required. UNIX has a dd (disk to disk copy) command that can be used to perform copies with specified blocksizes (often important for the performance of large copy operations, or when copying to record structured devices like magnetic tapes). It has a dozen or more possible arguments, and allows them (only the necessary ones) to be specified in a name=value format:

dd if=/dev/disk01 of=/dev/tape bs=64K count=200

It does take a few more keystrokes to type the name=, but it replaces (hard to remember) argument order with (easier to remember) parameter names. If most parameters have reasonable default values (so that they usually do not need to be specified), and the keywords have short abbreviations (e.g. if for input file), the result can be something that is both easy to remember and easy to type.
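
A command of our own could accept the same name=value style with very little code. The sketch below is not how dd is implemented; blockcopy is a hypothetical function, its default block size of 512 is an assumption, and it only reports what it would do.

```shell
# blockcopy if=SRC of=DST [bs=SIZE]: arguments may appear in any order
blockcopy() {
    src="" dst="" bs=512                  # defaults (bs=512 is an assumed default)
    for arg in "$@"; do
        case "$arg" in
            if=*) src=${arg#if=} ;;       # strip the "if=" prefix, keep the value
            of=*) dst=${arg#of=} ;;
            bs=*) bs=${arg#bs=} ;;
            *)    echo "blockcopy: unknown argument '$arg'" >&2
                  return 2 ;;
        esac
    done
    echo "copy $src -> $dst in blocks of $bs"
}

blockcopy of=/dev/tape if=/dev/disk01     # order does not matter, bs is defaulted
```

Because each value carries its own name, the user never has to remember which position means what.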

3.3 Supporting CLI Infrastructure

Another way to make arguments easy to remember is to provide standard (command independent) means of specifying common arguments. Almost all commands have a notion of input and output files. In UNIX, the shell program (the command interpreter) provided a general syntax for specifying input and output files:

ls > dir.out # send the output to the file dir.out

sh < script # run the commands in the file script

cc foo.c 2> errors # send the errors to the file errors

grep ERROR log >> errors # find lines in log that contain "ERROR" and append those lines to the existing file errors

ls | grep '\.c$' # list the files in this directory and keep only the names of the .c files

Given that all UNIX shells support input and output redirection, most commands, if no specific files are specified:

read from their standard input (file descriptor 0)
write to their standard output (file descriptor 1)
send errors to their standard error (file descriptor 2)

As a result of this:

users need not remember the syntax for specifying input and output files to various programs.
users can easily turn the output of one program into the input of another and use standard tools to process that output to obtain the desired information.
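
A command written to these conventions needs no file-handling options of its own. Here is a minimal sketch; shout is a hypothetical filter and the file names are made up.

```shell
# shout: a hypothetical filter that upper-cases its input
shout() {
    if [ "$#" -ne 0 ]; then
        # complaints go to standard error so they never pollute the output
        echo "shout: takes no arguments; reads standard input" >&2
        return 2
    fi
    tr '[:lower:]' '[:upper:]'      # standard input in, standard output out
}

work=$(mktemp -d)
printf 'hello\nworld\n' > "$work/in.txt"

shout < "$work/in.txt" > "$work/out.txt"    # input and output files chosen by the shell
shout < "$work/in.txt" | sort -r            # or the output handed to another program
```

The function itself never opens a file. The shell decides where its input comes from and where its output goes.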

3.4 Command Line Switches

Originally, in UNIX, there were two kinds of arguments:

the names of files to be operated on
switches (usually preceded by a -) that enabled various options.

DOS had similar conventions, but used a slash (/) to introduce option controlling switches. If I wanted a listing of the directory foo, but I wanted a long listing (with all the details), I would add the -l (long) switch:

ls -l foo

If I wanted to include all of the sub-directories under foo, I would also add the -R (recursive) switch:

ls -lR foo

To the extent that there are general qualifiers that are commonly used, it is a good idea to write new programs to use the same qualifiers that are used in popular existing programs:

-l long (give me all the details)
-v verbose (tell me every step)
-q quiet (skip the commentary)
-R operate recursively (files in sub-directories)
-f force (I don't care if the file is read only)

These are quickly learned, and tend to work well. It is command-specific switches that people have trouble remembering. Obviously, the letters should be mnemonically related to the functions the switches perform. This can help users to remember the names of switches they have used before, but it does not do much for users who are trying to figure out how to use the program. This is an area where (browsable) menus are clearly superior.

CLI programs do not have run-time browsable menus, but they do tend to have usage messages. Most CLI programs, if they detect an invalid switch (or missing or otherwise improper arguments), will print out a usage message that describes the expected arguments and enumerates all of the legal switch values. Some programs make a point of ensuring that the argument help or the switch values -h or -help are not otherwise legal, so that they can be specifically used to get the usage message. Another common convention is that if help is specified along with another legal option, the program prints out a more detailed usage message for the specified option. Usage messages may be too short to completely describe the use of the program ... but they do go a long way to eliminate the problem of knowing what options are available, or remembering the switch name for a particular option.
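
One way to follow the usage-message conventions above is sketched here. The command name archive and its switches are hypothetical, and for brevity the sketch looks at only one leading switch.

```shell
# usage: print the expected arguments and every legal switch
usage() {
    echo "usage: archive [-v] [-q] file ..."
    echo "  -v      verbose: report each file as it is processed"
    echo "  -q      quiet: print nothing but errors"
    echo "  -h      print this message"
}

archive() {
    case "${1:-}" in
        -h|-help|help) usage; return 0 ;;        # explicit request: stdout, success
        -v|-q)         shift ;;                  # a legal switch (ignored in this sketch)
        -*)            echo "archive: unknown switch '$1'" >&2
                       usage >&2; return 2 ;;    # mistake: stderr, failure status
    esac
    if [ "$#" -eq 0 ]; then
        echo "archive: no files given" >&2
        usage >&2
        return 2
    fi
    echo "archiving $# file(s)"
}
```

Note that a requested usage message goes to the standard output with a success status, while one provoked by a mistake goes to the standard error with a failure status, so scripts can tell the two apart.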

3.5 CLI Argument Standards

How does one specify multiple switches?

UNIX systems have tended towards single letter switches (which are easier to type), and allow multiple switches to be combined into a single argument:

ls -alRu

which means the same thing as

ls -a -l -R -u

This convention was encouraged through a standard argument processing function (getopt(3)). But the combination of multiple switches into a single argument fundamentally assumes that all switches are single letters. If you wanted to support switches that could involve multiple letters, you would need to separate each switch into its own argument (as DOS did):

dir /verbose /wide /o:a

Realizing that this involved considerably more typing, DOS also allowed users to specify unambiguous abbreviations for arguments:

dir /v /w /o:a

The older UNIX convention is indeed a little bit more concise, but much more cryptic. Thus it is that newer UNIX/LINUX CLI programs have tended towards allowing (more mnemonic) multi-letter arguments and separating switches into distinct arguments (e.g. svn).
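
The single-letter convention can be sketched with getopts, the shell's built-in counterpart to getopt(3). The command name lister is hypothetical, and it only reports which switches it saw.

```shell
# lister [-alR] [dir ...]: report the switches and directories given
lister() {
    all=no long=no recurse=no
    OPTIND=1                          # reset so the function can be called again
    while getopts alR opt; do
        case "$opt" in
            a) all=yes ;;
            l) long=yes ;;
            R) recurse=yes ;;
            *) echo "usage: lister [-alR] [dir ...]" >&2
               return 2 ;;
        esac
    done
    shift $((OPTIND - 1))             # what remains are the non-switch arguments
    echo "all=$all long=$long recurse=$recurse dirs=$*"
}

lister -alR foo          # switches combined into one argument
lister -a -l -R foo      # the same switches given separately
```

Both calls print the same line, because getopts treats a combined argument exactly like the separate single-letter switches it contains.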

4. Common Problems and Solutions

As observed in the previous chapter, the basic principles of usability remain the same whether the interfaces are graphical or command line. The previous chapter discussed some of the unique problems that CLIs present in the areas of intuitiveness and simplicity. There are other usability issues, where CLIs and GUIs face exactly the same problems, and address them in similar ways.

4.1 Robustness and Helpfulness

Most interesting programs are susceptible to errors (e.g. input syntax errors, missing files, failures of partner services). The difference between more and less usable programs is often how they deal with these errors:

How completely, accurately, and understandably they describe the cause of the error.
How able they are to detect and correct simple, common, and obvious errors.
How reasonably they continue operation in the face of the error.

When an error is encountered, it must be described in a way that is clear enough to enable the user to easily find and fix the problem. Consider, for example, a compiler, and three possible ways of reporting an error:

syntax error
foo.c 175: syntax error
line 175 of file foo.c, in function getDrive:
expression beginning "(jbod.ndrives" contains an invalid operator "RESERVE"

A GUI front end to the compilation process could further improve on this by going to the file in question and highlighting the offending token ... but not having graphical output seldom precludes doing a good job of describing a problem.

For some programs, an input error may justify a warning, but not prevent the program from going on to perform its other functions. Some errors may be temporary, and a program can recover from them by continuing to retry until the error condition has been corrected. Even for programs where an input error precludes successful completion of the request, there may still be value in continuing to check for other errors before aborting.

After encountering the above syntax error, the compiler could respond in many ways:

abort
resume compilation with the next file
resume compilation with the next statement

A GUI front end might actually give the user the opportunity to correct the error and immediately resume compilation ... but not being able to immediately ask the user for guidance does not preclude intelligent and graceful error handling.

A command that is implemented as a CLI has fewer opportunities for complex output and interaction with the user. This is not, however, an excuse for lazy error handling.
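
As a sketch of what this can look like in a script, the hypothetical check_files function below names the file and the cause of each problem, carries on with the remaining files, and still reports overall failure at the end. The rule that a file must be non-empty is invented for the example.

```shell
# check_files FILE...: complain about each missing or empty file, then summarize
check_files() {
    failures=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "check_files: $f: no such file" >&2
        elif [ ! -s "$f" ]; then
            echo "check_files: $f: file is empty (expected at least one line)" >&2
        else
            continue                    # this file is fine, go on to the next
        fi
        failures=$((failures + 1))      # note the error, but keep checking
    done
    if [ "$failures" -gt 0 ]; then
        echo "check_files: $failures of $# file(s) failed" >&2
        return 1
    fi
}
```

A single run tells the user about every bad file, rather than stopping at the first one and forcing a fix-and-retry cycle for each.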

4.2 Adaptability and Configurability

Many programs have large numbers of specifiable options. Most try to reduce the resulting complexity by assigning reasonable default values to each. To the extent that the default values are well chosen, users can avoid having to specify most parameters, and the use of the program can be greatly simplified. This is equally true for both GUI and CLI programs.

There are often parameters for which different users would want different default values. There are a few basic approaches that have been taken to this problem:

environment variables
per user (and/or) per application configuration files
system registries

While each of these involves different details, they all share a few basic steps:

associate a variable name with each specifiable parameter or option.
when the program starts up, search the appropriate places (environment, configuration files, registries) to find values for those variables.
initialize the value of each configurable parameter or option from the corresponding variable.

Such external preference configuration mechanisms can have the effect of specifying dozens or hundreds of parameter values every time the program runs ... and these are values that do not have to be specified on the command line.
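
The steps above might be sketched as follows for a single parameter. Everything named here is an assumption made for illustration: the variable REPORT_WIDTH, the file .reportrc with its width=NN lines, the default of 80, and the choice to let the environment override the file.

```shell
# resolve_width [CONFIG_FILE]: print the effective page width
resolve_width() {
    conf=${1:-$HOME/.reportrc}
    width=80                                    # built-in default
    if [ -r "$conf" ]; then
        # the last "width=NN" line in the per-user file overrides the default
        v=$(sed -n 's/^width=//p' "$conf" | tail -n 1)
        if [ -n "$v" ]; then
            width=$v
        fi
    fi
    if [ -n "${REPORT_WIDTH:-}" ]; then         # the environment overrides the file
        width=$REPORT_WIDTH
    fi
    echo "$width"
}
```

A user who always wants wide reports puts width=132 in the file once. A script that needs something different for one run can set REPORT_WIDTH for just that command.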

A key difference between CLIs and GUIs is how these preferences are managed:

GUI programs tend to have preference configuration dialogs and maintain user preference files, where they automatically save the last specified values for each configurable property. In this way, new sessions seem to automatically pick up where the previous session left off.
CLI programs tend to use environment variables that are explicitly set by users (in shell session initialization scripts). This enables users to very specifically control the default behavior of each application.

The GUI approach is much easier for a novice to use (since the preference dialogs can guide them through the available options). The CLI approach is much more amenable to scripting (because environment variables are typically initialized by scripts). Also, the CLI approach makes it easy for a single user to have multiple profiles (using different default values for different situations).

Either way, the fundamental principle of configurable application design is to identify all configurable aspects of program behavior, and make it possible to externally specify (per user or per context) default values for each of them.

4.3 On-line Help

GUI programs often offer a rich assortment of help mechanisms:

entire on-line manuals with index and full-text search capabilities.
context sensitive guidance that automatically provides help information for the currently selected features and operations.
cursor sensitive tool-tip pop-ups that describe on-screen artifacts as the cursor moves over them.

For CLI programs, the options are a little bit weaker:

General usage messages in response to command line errors.
Specific usage messages in response to specific user requests for additional help information on a specific topic.
Problem assessments and corrective action suggestions printed in response to errors.
The user can often open on-line documentation in another window (and perhaps use full-text search capabilities there).

While these options are indeed less elegant than those that can be offered by GUI applications, they still leave ample room for providing good on-line help information.

5. Summary

GUI applications enjoy fundamental usability advantages over CLI based applications:

the availability of an organized two dimensional display with rich content navigation widgets makes it possible to make more (and more complex) output easily understandable.
the availability of browsable menus and context sensitive help (including things like tool-tip pop-ups) makes it much easier for users to understand what their options are at any particular juncture.

Nonetheless, there are many other factors that would drive a decision to provide some functionality in a CLI form. This is not a reason to give up on usability. All of the basic principles of usability still apply to CLIs, and designing our software with these considerations in mind can result in significantly more usable programs.
