-----

1D Elementary Data Structures

-----

What the application needs

Terms describing the data structure from the point of view of the application, which only cares how it behaves and not how it is implemented.

list
Generic term for a collection of objects. May or may not contain duplicates. Application may or may not require that it be kept in a specified order.

ordered list
A list in which the order matters to the application. Therefore, for example, the implementer cannot scramble the order to improve efficiency.

set
List where the order does not matter to the application (implementer can pick order so as to optimize performance) and in which there are no duplicates.

multi-set
Like a set, but may contain duplicates.
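
As a rough illustration of the difference between a set and a multi-set, the sketch below uses the C++ standard library containers std::set and std::multiset. The helper names distinct_count and copies_kept are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Number of elements left after storing v in a set (duplicates collapse).
std::size_t distinct_count(const std::vector<int>& v) {
    std::set<int> s(v.begin(), v.end());
    assert(s.size() <= v.size());   // a set never grows past its input
    return s.size();
}

// Number of copies of x kept after storing v in a multi-set.
std::size_t copies_kept(const std::vector<int>& v, int x) {
    std::multiset<int> m(v.begin(), v.end());
    return m.count(x);
}
```

Both containers happen to keep their elements sorted internally, which is one example of the implementer picking an order to suit performance.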

double-ended queue (deque)
An ordered list in which insertion and deletion occur only at the two ends of the list. That is, elements cannot be inserted into the middle of the list or deleted from the middle of the list.

stack
An ordered list in which insertion and deletion both occur only at one end (e.g. at the start).

queue
An ordered list in which insertion always occurs at one end and deletion always occurs at the other end.
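
The restricted lists above differ only in which ends accept insertion and deletion. The sketch below is one way to see that, using the standard std::deque as the underlying container for both a stack and a queue. The function names are hypothetical, chosen for this example.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Remove and return the element at the back; d must not be empty.
int take_back(std::deque<int>& d) {
    assert(!d.empty());
    int x = d.back();
    d.pop_back();
    return x;
}

// Remove and return the element at the front; d must not be empty.
int take_front(std::deque<int>& d) {
    assert(!d.empty());
    int x = d.front();
    d.pop_front();
    return x;
}

// Stack: insert and delete at the same end (here, the back).
// Returns the elements in the order they come out.
std::vector<int> stack_order(const std::vector<int>& input) {
    std::deque<int> d;
    for (int x : input) d.push_back(x);
    std::vector<int> out;
    while (!d.empty()) out.push_back(take_back(d));
    return out;
}

// Queue: insert at one end (the back), delete at the other (the front).
std::vector<int> queue_order(const std::vector<int>& input) {
    std::deque<int> d;
    for (int x : input) d.push_back(x);
    std::vector<int> out;
    while (!d.empty()) out.push_back(take_front(d));
    return out;
}
```

Given the input 1, 2, 3, the stack hands the elements back in reverse order and the queue hands them back in the order they arrived.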

Implementation methods

There are a variety of options for the person implementing a list (or set or stack or whatever).

array
We all know what arrays are. Arrays are included here because a list can be implemented using a 1D array. If the maximum length of the list is not known in advance, code must be provided to detect array overflow and expand the array. Expanding requires allocating a new, longer array, copying the contents of the old array, and deallocating the old array.

Arrays are commonly used when two conditions hold. First, the maximum length of the list can be accurately estimated in advance (so array expansion is rarely needed). Second, insertion and deletion occur only at the ends of the list. (Insertion and deletion in the middle of an array-based list is slow, because every element after that position must be shifted over by one.)
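
The expansion step described above (allocate a longer array, copy, deallocate the old one) might look roughly like this. The ArrayList struct and its functions are hypothetical, and doubling the capacity is an assumption of this sketch; the text only requires that the new array be longer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical array-based list of ints that grows when it overflows.
struct ArrayList {
    int*        data     = nullptr;
    std::size_t length   = 0;   // elements currently stored
    std::size_t capacity = 0;   // elements the array can hold
};

// Append x at the end, expanding the array first if it is full.
void append(ArrayList& a, int x) {
    if (a.length == a.capacity) {                     // overflow detected
        // Assumed growth policy: start at 4, then double.
        std::size_t new_cap = (a.capacity == 0) ? 4 : 2 * a.capacity;
        int* bigger = new int[new_cap];               // allocate a longer array
        for (std::size_t i = 0; i < a.length; ++i)    // copy the old contents
            bigger[i] = a.data[i];
        delete[] a.data;                              // deallocate the old array
        a.data = bigger;
        a.capacity = new_cap;
    }
    a.data[a.length++] = x;
}

// Read the element at index i, which must be in range.
int get(const ArrayList& a, std::size_t i) {
    assert(i < a.length);
    return a.data[i];
}

// Release the storage and reset the list to empty.
void destroy(ArrayList& a) {
    delete[] a.data;
    a = ArrayList{};
}
```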

linked list
A list implemented by a set of nodes, each of which points to the next. An object of class (or struct) "node" contains a field pointing to the next node, as well as any number of fields of data. Optionally, there may be a second "list" class (or struct) used as a header for the list. One field of the list class is a pointer to the first node in the list. Other fields may also be included in the "list" object, such as a pointer to the last node in the list, the length of the list, etc.

Linked lists are commonly used when the length of the list is not known in advance and/or when it is frequently necessary to insert and/or delete in the middle of the list.
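
A minimal sketch of the node and header structures described above, with insertion in the middle of the list. All names here (Node, List, insert_after, nth, clear) are illustrative choices, not part of any particular library.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int   value;   // the data field
    Node* next;    // pointer to the next node, or nullptr at the end
};

// Optional header object for the whole list.
struct List {
    Node*       first  = nullptr;
    Node*       last   = nullptr;
    std::size_t length = 0;
};

// Insert a new node holding x immediately after node prev.
// Passing prev == nullptr inserts at the front of the list.
Node* insert_after(List& list, Node* prev, int x) {
    Node* n = new Node{x, nullptr};
    if (prev == nullptr) {
        n->next = list.first;
        list.first = n;
    } else {
        n->next = prev->next;
        prev->next = n;
    }
    if (n->next == nullptr) list.last = n;   // the new node is now the tail
    ++list.length;
    return n;
}

// Return the value at position i (0-based) by walking from the front.
int nth(const List& list, std::size_t i) {
    assert(i < list.length);
    const Node* p = list.first;
    while (i-- > 0) p = p->next;
    return p->value;
}

// Free every node and reset the header.
void clear(List& list) {
    while (list.first != nullptr) {
        Node* dead = list.first;
        list.first = dead->next;
        delete dead;
    }
    list.last = nullptr;
    list.length = 0;
}
```

Note that insert_after touches only a couple of pointers no matter how long the list is, whereas nth has to walk the list one node at a time.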

doubly-linked vs. singly-linked lists
In a doubly-linked list, each node points to the next node and also to the previous node. In a singly-linked list, each node points to the next node but not back to the previous node.
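
One practical consequence of the back pointer is that a node can be removed given only a pointer to that node; in a singly-linked list, the previous node would first have to be found by walking from the start. The following is a sketch under that reading, with hypothetical names DNode and unlink.

```cpp
#include <cassert>

struct DNode {
    int    value;
    DNode* prev;   // previous node, or nullptr for the first node
    DNode* next;   // next node, or nullptr for the last node
};

// Detach n from its neighbors without freeing it.
// head is the list's pointer to its first node; it is updated
// if n happened to be the first node.
void unlink(DNode*& head, DNode* n) {
    assert(n != nullptr);
    if (n->prev != nullptr) n->prev->next = n->next;
    else                    head = n->next;
    if (n->next != nullptr) n->next->prev = n->prev;
    n->prev = nullptr;
    n->next = nullptr;
}
```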

circular list
A linked list in which the last node points to the first node. If the list is doubly-linked, the first node must also point back to the last node.
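
Because a circular list has no null pointer marking its end, a traversal has to stop when it returns to the node where it started. A small sketch for the singly-linked case, with hypothetical names CNode and count_nodes:

```cpp
#include <cassert>
#include <cstddef>

struct CNode {
    int    value;
    CNode* next;   // never nullptr in a well-formed circular list
};

// Count the nodes in a circular list, starting from any node.
// An empty list is represented here by start == nullptr.
std::size_t count_nodes(const CNode* start) {
    if (start == nullptr) return 0;
    std::size_t n = 0;
    const CNode* p = start;
    do {
        assert(p != nullptr);   // a null link would mean the list is not circular
        ++n;
        p = p->next;
    } while (p != start);       // stop once we are back where we began
    return n;
}
```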

binary tree
Used to implement lists whose elements have a natural order (e.g. numbers), when either (a) the application would like the list kept in this order or (b) the order of elements is irrelevant to the application (e.g. this list is implementing a set).

Each element in a binary tree is stored in a "node" class (or struct). Each node contains pointers to a left child node and a right child node. In some implementations, it may also contain a pointer to the parent node. A tree may also have an object of a second "tree" class (or struct) which serves as a header for the tree. The "tree" object contains a pointer to the root of the tree (the node with no parent) and whatever other information the programmer wants to squirrel away in it (e.g. number of nodes currently in the tree).

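In a binary search tree, the kind of binary tree described here, elements are kept sorted in left-to-right order across the tree. That is, if N is a node, then the value stored in N must be larger than every value stored in N's left subtree and smaller than every value stored in N's right subtree. (Comparing N only with its two children is not enough: a node deeper in the left subtree could otherwise hold a value larger than N's.) Variant trees may have the opposite order (smaller values to the right rather than to the left) or may allow two different nodes to contain equal values.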
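
The ordering rule can be sketched in code as follows. Insertion walks down from the root, going left for smaller values and right for larger ones, and an in-order traversal then visits the values in sorted order. The names TreeNode, Tree, insert, in_order, sorted_values, and destroy are hypothetical, and this sketch assumes the basic variant in which equal values are not allowed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct TreeNode {
    int       value;
    TreeNode* left;
    TreeNode* right;
    explicit TreeNode(int v) : value(v), left(nullptr), right(nullptr) {}
};

// Optional header object for the whole tree.
struct Tree {
    TreeNode*   root = nullptr;
    std::size_t size = 0;      // number of nodes currently in the tree
};

// Insert x, keeping smaller values to the left and larger to the right.
// Returns false (and changes nothing) if x is already present.
bool insert(Tree& t, int x) {
    TreeNode** slot = &t.root;             // the pointer we may overwrite
    while (*slot != nullptr) {
        if (x < (*slot)->value)      slot = &(*slot)->left;
        else if (x > (*slot)->value) slot = &(*slot)->right;
        else                         return false;   // duplicate
    }
    *slot = new TreeNode(x);
    ++t.size;
    return true;
}

// Visit left subtree, then the node, then right subtree.
void in_order(const TreeNode* n, std::vector<int>& out) {
    if (n == nullptr) return;
    in_order(n->left, out);
    out.push_back(n->value);
    in_order(n->right, out);
}

// All values in the tree, in increasing order.
std::vector<int> sorted_values(const Tree& t) {
    std::vector<int> out;
    in_order(t.root, out);
    assert(out.size() == t.size);   // every node is visited exactly once
    return out;
}

// Free a subtree (children before the node itself).
void destroy(TreeNode* n) {
    if (n == nullptr) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}
```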


This page is maintained by Geoff Kuenning.