Ranges in stl
All algorithms process one or more ranges of elements.
Such a range might, but is not required to, embrace all elements of a container. Therefore, to be
able to handle subsets of container elements, you pass the beginning and the end of the range
as two separate arguments rather than the whole collection as one argument.
This interface is flexible but dangerous. The caller must
ensure that the first and second arguments define a valid range. This is the case if the
end of the range is reachable from the beginning by iterating through the elements. This means,
it is up to the programmer to ensure that both iterators belong to the same container and that
the beginning is not behind the end. If this is not the case, the behavior is undefined and
endless loops or forbidden memory access may result. In this respect, iterators are just as unsafe
as ordinary pointers. But note that undefined behavior also means that an implementation of
the STL is free to find such kinds of errors and handle them accordingly. The following
paragraphs show that ensuring that ranges are valid is not always as easy as it sounds.
Every algorithm processes half-open ranges. Thus, a range
is defined so that it includes the position used as the beginning of the range but excludes
the position used as the end. This concept is often described by using the traditional
mathematical notations for half-open ranges:
[begin,end)
or
[begin,end[
I use the first alternative in this book.
The half-open range concept has the advantages that were described here(it is simple and avoids special handling for empty collections). However, it also has some disadvantages. Consider the following example:
In this example, the collection is initialized with
integral values from 20 to 40. When
the search for an element with the value 3 fails, find()
returns the end of the processed range (coll.end() in this example) and
assigns it to pos. Using that return value as the beginning of the range in the following call of reverse()
poses no problem because it results in the following call:
reverse (coll.end(), coll.end());
This is simply a call to reverse an empty range. Thus, it
is an operation that has no effect (a socalled "no-op").
However, if find() is used to find the first
and the last elements of a subset, you should consider that passing these iterator positions as a range
will exclude the last element. So, the first call of max_element()
max_element (pos25, pos35)
finds 34 and not 35:
max: 34
To process the last element, you have to pass the
position that is one past the last element:
max_element (pos25, ++pos35)
Doing this yields the correct result:
max: 35
Note that this example uses a list as the container.
Thus, you must use operator ++ to get the position that is behind pos35. If you have random access
iterators, as with vectors and deques, you also could use the expression pos35 + 1. This
is because random access iterators allow "iterator arithmetic.
Of course, you could use pos25 and pos35 to
find something in that subrange. Again, to search including pos35 you have to pass the position
after pos35. For example:
//increment pos35
to search with its value included
++pos35;
pos30 = find(pos25,pos35, //range
30); //value
if (pos30 == pos35) {
cout << "30 is in NOT the
subrange" << endl;
}
else {
cout << "30 is in the
subrange" << endl;
}
All the examples in this section work only because you
know that pos25 is in front of pos35.
Otherwise, [pos25,pos35) would not be a valid
range. If you are not sure which element is in front, things are getting more complicated and undefined
behavior may occur.
Suppose you don't know whether the element that has value
25
is in front of the element that has value 35. It might even be possible that
one or both values are not present. By using random access iterators, you can call operator <
to check this:
if (pos25 < pos35) {
//only [pos25,pos35)
is valid
...
}
else if (pos35 < pos25) {
//only [pos35,pos25)
is valid
...
}
else {
//both are equal, so both must be end()
...
}
However, without random access iterators you have no
simple, fast way to find out which iterator is in front. You can only search for one iterator in the
range of the beginning to the other iterator or in the range of the other iterator to the end. In this
case, you should change your algorithm as follows: Instead of searching for both values in the
whole source range, you should try to find out, while searching for them, which value comes first. For
example:
pos25 = find (coll.begin(), coll.end(), //range
25); //value
pos35 = find (coll.begin(), pos25, //range
35); //value
if (pos35 != pos25) {
/*
* pos35
is in front of pos25
* so, only [pos35,pos25)
is valid
*/
...
}
else {
pos35 = find (pos25, coll.end(), //range
35); //value
if (pos35 != pos25) {
/*pos25
is in front of pos35
*so, only [pos25,pos35)
is valid
*/
...
}
else {
// both are equal, so both must be end()
...
}
}
In contrast to the previous version, here you don't
search for pos35 in the full range of all elements of coll. Instead, you first search for
it from the beginning to pos25. Then, if it's not found, you search for it in the part that contains the remaining elements after pos25. As a result you know which iterator position comes first and which
subrange is valid.
This implementation is not very efficient. A more
efficient way to find the first element that either has value 25 or value 35 is to
search exactly for that. You could do this by using some abilities of the STL that are not introduced yet as follows:
pos = find_if (coll.begin(), coll.end(), //range
compose_f_gx_hx(logical_or<bool>(), //criterion
bind2nd(equal_to<int>(), 25),
bind2nd(equal_to<int>(), 35)));
switch (*pos) {
case 25:
//element with value 25
comes first
pos25 = pos;
pos35 = find (++pos, coll.end(), //range
35); //value
...
break;
case 35:
//element with value 35
comes first
pos35 = pos;
pos25 = find (++pos, coll.end(), //range
25); //value
...
break;
default:
//no element with value 25
or 35 found
...
break;
}
Here, a special expression is used as a sorting criterion
that allows a search of the first element that has either value 25 or value 35. The
expression is a combination of several predefined function objects.
See Also:
No comments:
Post a Comment