Sunday, May 6, 2012

stl vector capacity size


Vectors copy their elements into their internal dynamic array. The elements always have a certain order. Thus, vectors are a kind of ordered collection. Vectors provide random access. Thus, you can access every element directly in constant time, provided you know its position. The iterators are random access iterators, so you can use any algorithm of the STL.

Vectors provide good performance if you append or delete elements at the end. If you insert or delete in the middle or at the beginning, performance gets worse. This is because every element behind has to be moved to another position. In fact, the assignment operator would be called for every following element.

Size and Capacity

Part of the way in which vectors give good performance is by allocating more memory than they need to contain all their elements. To use vectors effectively and correctly you should understand how size and capacity cooperate in a vector. Vectors provide the usual size operations size(), empty(), and max_size(). An additional "size" operation is the capacity() function. capacity() returns the number of characters a vector could contain in its actual memory. If you exceed the capacity(), the vector has to reallocate its internal memory.

The capacity of a vector is important for two reasons:
  1. Reallocation invalidates all references, pointers, and iterators for elements of the vector.
  2. Reallocation takes time.

Thus, if a program manages pointers, references, or iterators into a vector, or if speed is a goal, it is important to take the capacity into account.

To avoid reallocation, you can use reserve() to ensure a certain capacity before you really need it. In this way, you can ensure that references remain valid as long as the capacity is not exceeded:

std::vector<int> v; // create an empty vector
v.reserve (80); // reserve memory for 80 elements

Another way to avoid reallocation is to initialize a vector with enough elements by passing additional arguments to the constructor. For example, if you pass a numeric value as parameter, it is taken as the starting size of the vector:

std::vector<T> v(5); // creates a vector and initializes it
with five values
// (calls five times the default constructor of type T)

Of course, the type of the elements must provide a default constructor for this ability. But note that for complex types, even if a default constructor is provided, the initialization takes time. If the only reason for initialization is to reserve memory, you should use reserve().

The concept of capacity for vectors is similar to that for strings , with one big difference: Unlike strings, it is not possible to call reserve() for vectors to shrink the capacity. Calling reserve() with an argument that is less than the current capacity is a no-op. Furthermore, how to reach an optimal performance regarding speed and memory usage is implementation defined. Thus, implementations might increase capacity in larger steps. In fact, to avoid internal fragmentation, many implementations allocate a whole block of memory (such as 2K) the first time you insert anything if you don't call reserve() first yourself. This can waste Jots of memory if you have many vectors with only a few small elements. Because the capacity of vectors never shrinks, it is guaranteed that references, pointers, and iterators remain valid even when elements are deleted or changed, provided they refer to a position before the manipulated elements. However, insertions may invalidate references, pointers, and iterators.

There is a way to shrink the capacity indirectly: Swapping the contents with another vector swaps the capacity. The following function shrinks the capacity while preserving the elements:

template <class T>
void shrinkCapacity(std::vector<T>& v)
{
   std::vector<T> tmp(v); // copy elements into a new vector
   v.swap(tmp); // swap internal vector data
}

You can even shrink the capacity without calling this function by calling the following statement:

Note: You (or your compiler) might consider this statement as being incorrect because it calls a non constant member function for a temporary value. However, standard C++ allows you to call a non constant member function for temporary values.

//shrink capacity of vector v for type T
std::vector<T>(v).swap(v);

However, note that after swap(), all references, pointers, and iterators swap their containers. They still refer to the elements to which they referred on entry. Thus, shrinkCapacity() invalidates all references, pointers, and iterators

See Also:

No comments:

Post a Comment