When to Use Which Container
The C++ standard library provides different container types with different
abilities. The question now is: When do you use which container type? The table below provides an overview. However, it
contains general statements that might not fit in reality. For example, if you
manage only a few elements, you can ignore the complexity, because for a small
collection a cheap operation with linear complexity can beat an expensive
operation with logarithmic complexity.
As a supplement to the table, the following rules of thumb might help:
- By default, use a vector. It has the simplest internal data structure and provides random access. Thus, data access is convenient and flexible, and data processing is often fast enough.
- If you often insert and/or remove elements at the beginning and the end of a sequence, use a deque. Also use a deque if it is important that the amount of internal memory used by the container shrinks when elements are removed. In addition, because a vector usually uses one block of memory for its elements, a deque might be able to contain more elements because it uses several blocks.
- If you often insert, remove, and move elements in the middle of a container, consider using a list. Lists provide special member functions to move elements from one container to another in constant time. Note, however, that because a list provides no random access, you might suffer significant performance penalties when accessing elements inside the list if you have only the beginning of the list.
  Like all node-based containers, a list doesn't invalidate iterators that refer to elements, as long as those elements are part of the container. Vectors invalidate all of their iterators, pointers, and references whenever they exceed their capacity, and some of them on insertions and deletions. Deques can invalidate iterators, pointers, and references whenever their size changes.
- If you need a container that handles exceptions so that each operation either succeeds or has no effect, use either a list (without calling assignment operations and sort() and, if comparing the elements may throw, without calling merge(), remove(), remove_if(), and unique()) or an associative container (without calling the multiple-element insert operations and, if copying/assigning the comparison criterion may throw, without calling swap()).
- If you often need to search for elements according to a certain criterion, use a set or a multiset that sorts elements according to this criterion. Keep in mind that logarithmic complexity pays off quickly: finding one of 1,000 elements takes about 10 comparisons in a balanced binary tree, compared with up to 1,000 in a linear search. In this case, the typical advantages of binary trees apply.
  A hash table commonly provides five to ten times faster lookup than a binary tree. So if a hash container is available, you might consider using it. Since C++11, the standard library provides hash containers (unordered_set, unordered_multiset, unordered_map, and unordered_multimap); before C++11, hash tables were not standardized, so you needed the source code of a nonstandard implementation to stay portable. However, hash containers have no ordering, so they are no good if you need to rely on element order.
- To process key/value pairs, use a map or a multimap (or the hash version, if available).
- If you need an associative array, use a map.
- If you need a dictionary, use a multimap.
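The rule about lists mentions member functions that move elements between containers in constant time without invalidating iterators. The following is a minimal sketch of that idea using splice(); the helper name move_to_back is hypothetical and only serves the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>

// Hypothetical helper: moves the element at position `index` of `from`
// to the end of `to` and returns an iterator to the moved element.
std::list<std::string>::iterator
move_to_back(std::list<std::string>& from, std::size_t index,
             std::list<std::string>& to)
{
    assert(index < from.size());

    // A list has no random access, so reaching the element is a linear walk.
    auto pos = std::next(from.begin(), static_cast<std::ptrdiff_t>(index));

    // splice() relinks the node in constant time; the string is not copied.
    to.splice(to.end(), from, pos);

    // The iterator stays valid and now refers to the element inside `to`.
    return pos;
}
```

Note the trade-off the rule describes: the move itself is cheap, but finding the element by position costs time proportional to its distance from the beginning.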
Feature | Vector | Deque | List | Set | Multiset | Map | Multimap |
Typical internal data structure | Dynamic array | Array of arrays | Doubly linked list | Binary tree | Binary tree | Binary tree | Binary tree |
Elements | Value | Value | Value | Value | Value | Key/value pair | Key/value pair |
Duplicates allowed | Yes | Yes | Yes | No | Yes | Not for the key | Yes |
Random access available | Yes | Yes | No | No | No | With key | No |
Iterator category | Random access | Random access | Bidirectional | Bidirectional (element constant) | Bidirectional (element constant) | Bidirectional (key constant) | Bidirectional (key constant) |
Search/find elements | Slow | Slow | Very slow | Fast | Fast | Fast for key | Fast for key |
Inserting/removing of elements is fast | At the end | At the beginning and the end | Anywhere | — | — | — | — |
Inserting/removing invalidates iterators, references, pointers | On reallocation | Always | Never | Never | Never | Never | Never |
Frees memory for removed elements | Never | Sometimes | Always | Always | Always | Always | Always |
Allows memory reservation | Yes | No | — | — | — | — | — |
Transaction safe (success or no effect) | Push/pop at the end | Push/pop at the beginning and the end | All except sort() and assignments | All except multiple-element insertions | All except multiple-element insertions | All except multiple-element insertions | All except multiple-element insertions |
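Two rows of the table, "Duplicates allowed" and "Random access available", are what make a map suitable as an associative array and a multimap suitable as a dictionary. The sketch below illustrates the difference; the function names count_words and lookup_all are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Map as an associative array: operator[] gives access by key and inserts
// a value-initialized entry (0 for int) if the key is not present yet.
std::map<std::string, int> count_words(const std::vector<std::string>& words)
{
    std::map<std::string, int> counts;
    for (const auto& w : words) {
        ++counts[w];
    }
    // A map stores each key only once, however often the word occurs.
    assert(counts.size() <= words.size());
    return counts;
}

// Multimap as a dictionary: one key may have several entries, so a lookup
// yields a range of matches rather than a single value.
std::vector<std::string>
lookup_all(const std::multimap<std::string, std::string>& dict,
           const std::string& word)
{
    std::vector<std::string> result;
    auto range = dict.equal_range(word);
    for (auto it = range.first; it != range.second; ++it) {
        result.push_back(it->second);
    }
    return result;
}
```

A multimap has no operator[], because a key does not identify a single value there; equal_range() is the usual way to get all entries for a key.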
A problem that is not easy to solve is how to sort objects according to two
different sorting criteria. For example, you might have to keep elements in an
order provided by the user while providing search capabilities according to
another criterion. As in databases, you need fast access regarding two or
more different criteria. In this case, you could probably use two sets or two
maps that share the same objects with different sorting criteria. However,
having objects in two collections is a special issue.
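One way to share the same objects between two collections is to store shared pointers in two maps, each keyed by a different criterion. The following is only a sketch under that assumption; the Person type, the PersonIndex class, and the choice of name and id as the two criteria are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical element type with two search criteria.
struct Person {
    std::string name;
    int id;
};

using PersonPtr = std::shared_ptr<const Person>;

// Keeps every Person once in memory and indexes it by name and by id.
class PersonIndex {
public:
    // Returns false and changes nothing if the name or id is already used.
    bool add(const std::string& name, int id)
    {
        if (byName_.count(name) != 0 || byId_.count(id) != 0) {
            return false;
        }
        PersonPtr p = std::make_shared<Person>(Person{name, id});
        byName_.emplace(name, p);
        byId_.emplace(id, p);
        // Both maps must always describe the same set of objects.
        assert(byName_.size() == byId_.size());
        return true;
    }

    // Logarithmic lookup by the first criterion; empty pointer if not found.
    PersonPtr findByName(const std::string& name) const
    {
        auto it = byName_.find(name);
        return it == byName_.end() ? PersonPtr() : it->second;
    }

    // Logarithmic lookup by the second criterion; empty pointer if not found.
    PersonPtr findById(int id) const
    {
        auto it = byId_.find(id);
        return it == byId_.end() ? PersonPtr() : it->second;
    }

private:
    std::map<std::string, PersonPtr> byName_;
    std::map<int, PersonPtr> byId_;
};
```

The sketch shows why this is a special issue: every modification has to keep both collections consistent, and the stored objects are const here so that a key cannot change behind the back of either map.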
The automatic sorting of associative containers does not mean that these
containers perform better when sorting is needed. This is because an associative
container sorts each time a new element gets inserted. It is often faster to
use a sequence container and to sort all elements once, after they have all been
inserted, by using one of the several sort algorithms.
As an illustration, consider two simple programs that sort all strings read
from the standard input and print them without duplicates, each using a
different container: one uses a set, the other uses a vector.
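A minimal sketch of how the two versions might look is given here. It is written as two functions that read from any input stream instead of two complete programs, and the function names and the run_on_text helper are hypothetical; the measurements quoted in this section were not made with this code.

```cpp
#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Set version: the set sorts on insertion and silently drops duplicates.
void print_unique_with_set(std::istream& in, std::ostream& out)
{
    std::istream_iterator<std::string> first(in), last;
    std::set<std::string> coll(first, last);
    std::copy(coll.begin(), coll.end(),
              std::ostream_iterator<std::string>(out, "\n"));
}

// Vector version: read everything, sort once, skip duplicates while printing.
void print_unique_with_vector(std::istream& in, std::ostream& out)
{
    std::istream_iterator<std::string> first(in), last;
    std::vector<std::string> coll(first, last);
    std::sort(coll.begin(), coll.end());
    // unique_copy() only skips adjacent duplicates, so it relies on sorting.
    assert(std::is_sorted(coll.begin(), coll.end()));
    std::unique_copy(coll.begin(), coll.end(),
                     std::ostream_iterator<std::string>(out, "\n"));
}

// Convenience helper for trying either version on an in-memory text.
std::string run_on_text(void (*program)(std::istream&, std::ostream&),
                        const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    program(in, out);
    return out.str();
}
```

To get the behavior described in the text, a main() function would call one of the two functions with std::cin and std::cout.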
When I tried both programs with about 150,000 strings on my system, the
vector version was approximately 10% faster. Adding a call to reserve() made the vector version 5% faster. Allowing
duplicates (using a multiset instead of a set and calling copy() instead of
unique_copy()) changed things dramatically:
the vector version was more than 40% faster! These measurements are not
representative; however, they do show that it is often worth trying different
ways of processing elements.
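To try such a comparison yourself, something like the following sketch could serve as a starting point. It covers the variant that allows duplicates (multiset versus sorted vector) and times both with std::chrono; all names are hypothetical, and the numbers it produces depend on the data, the compiler, and the system.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <set>
#include <string>
#include <vector>

// Multiset version: every insertion keeps the collection sorted.
std::vector<std::string> sort_with_multiset(const std::vector<std::string>& input)
{
    std::multiset<std::string> coll(input.begin(), input.end());
    return std::vector<std::string>(coll.begin(), coll.end());
}

// Vector version: reserve memory, insert everything, then sort once.
std::vector<std::string> sort_with_vector(const std::vector<std::string>& input)
{
    std::vector<std::string> coll;
    coll.reserve(input.size());
    coll.assign(input.begin(), input.end());
    std::sort(coll.begin(), coll.end());
    return coll;
}

// Runs f once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double elapsed_ms(F&& f)
{
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct Timings {
    double multiset_ms;
    double vector_ms;
};

// Times both versions on the same input and checks that they agree.
Timings compare(const std::vector<std::string>& input)
{
    std::vector<std::string> a, b;
    Timings t{};
    t.multiset_ms = elapsed_ms([&] { a = sort_with_multiset(input); });
    t.vector_ms = elapsed_ms([&] { b = sort_with_vector(input); });
    assert(a == b);
    return t;
}
```

A single run like this is only a rough indication; repeating the measurement and using realistic input gives a more reliable picture.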
In practice, predicting which container type is the best is often difficult.
The big advantage of the STL is that you can try different versions without much
effort. The major work (implementing the different data structures and
algorithms) is done. You have only to combine them in a way that is best for
you.