Because machines differ and because C left many things undefined. For
details, including definitions of the terms “undefined”, “unspecified”,
“implementation defined”, and “well-formed”, see the ISO C++ standard.
Note that the meanings of those terms differ from their definitions in the
ISO C standard and from some common usage. You can get wonderfully
confused discussions when people don’t realize that not everybody shares
the same definitions.
This is a correct, if unsatisfactory, answer. Like C, C++ is meant to
exploit hardware directly and efficiently. This implies that C++ must
deal with hardware entities such as bits, bytes, words, addresses,
integer computations, and floating-point computations the way they are
on a given machine, rather than how we might like them to be. Note that
many “things” that people refer to as “undefined” are in fact
“implementation defined”, so that we can write perfectly specified code
as long as we know which machine we are running on. Sizes of integers
and the rounding behavior of floating-point computations fall into that
category.
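For instance, a sketch of querying such implementation-defined properties with sizeof and std::numeric_limits (the particular values printed depend on the machine and the compiler):

    #include <iostream>
    #include <limits>

    int main()
    {
        // implementation-defined: the sizes of the integer types
        std::cout << "sizeof(int)  = " << sizeof(int) << '\n';
        std::cout << "sizeof(long) = " << sizeof(long) << '\n';

        // implementation-defined: the rounding style used for floating-point
        std::cout << "double rounding style = "
                  << static_cast<int>(std::numeric_limits<double>::round_style) << '\n';
    }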
Consider what is probably the best known and most infamous example of undefined behavior:
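A minimal sketch, assuming the out-of-bounds index 100 used in the a[100] and p[100] references below:

    int a[10];
    a[100] = 7;     // range error: far outside the array, but not checked

    int* p = a;
    // ...
    p[100] = 7;     // the same range error through a pointer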
The C++ (and C) notions of array and pointer are direct
representations of a machine’s notion of memory and addresses, provided
with no overhead. The primitive operations on pointers map directly onto
machine instructions. In particular, no range checking is done. Doing
range checking would impose a cost in terms of run time and code size. C
was designed to outcompete assembly code for operating-systems tasks,
so that was a necessary decision. Also, C – unlike C++ – has no
reasonable way of reporting a violation had a compiler decided to
generate code to detect it: there are no exceptions in C. C++ followed C
for reasons of compatibility and because C++ also competes directly with
assembler (in OS, embedded systems, and some numeric computation
areas). If you want range checking, use a suitable checked class
(vector, smart pointer, string, etc.), as sketched below. A good
compiler could catch the range error for a[100] at compile time;
catching the one for p[100] is far more difficult, and in general it is
impossible to catch every range error at compile time.
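For instance, a sketch of checked access using std::vector’s at() member, which performs the bounds check and throws std::out_of_range on violation (the variable names here are illustrative only):

    #include <iostream>
    #include <stdexcept>
    #include <vector>

    int main()
    {
        std::vector<int> v(10);
        try {
            v.at(100) = 7;   // checked access: throws instead of scribbling over memory
        }
        catch (const std::out_of_range& e) {
            std::cerr << "range error: " << e.what() << '\n';
        }
    }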
Other examples of undefined behavior stem from the compilation
model. A compiler cannot detect an inconsistent definition of an object
or a function in separately compiled translation units. For example:
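A minimal sketch, assuming two translation units that define the struct S mentioned below with its members in different orders (the member and function names are illustrative only):

    // file1.c:
    struct S { int x, y; };
    int get_x(struct S* p) { return p->x; }

    // file2.c:
    struct S { int y, x; };                    // inconsistent definition of S
    int get_x(struct S* p);
    int use(struct S* p) { return get_x(p); }  // undefined behavior when linked with file1.c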
Compiling file1.c and file2.c and linking the results into the same
program is illegal in both C and C++. A linker could catch the
inconsistent definition of S, but it is not obliged to do so (and most
don’t). In many cases, it can be quite difficult to catch
inconsistencies between separately compiled translation units.
Consistent use of header files helps minimize such problems, and there
are some signs that linkers are improving. Note that C++ linkers do
catch almost all errors related to inconsistently declared functions, as
the sketch below illustrates.
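For instance, a sketch of why C++ linkers catch inconsistently declared functions: function types take part in linkage (via name mangling), so the mismatched declaration below typically produces an unresolved-symbol error at link time rather than silent undefined behavior (file and function names are illustrative only):

    // a.cpp: declares and calls f with one signature
    int f(int);
    int call_it() { return f(1); }

    // b.cpp: defines f with a different signature
    int f(double x) { return static_cast<int>(x); }

    // Linking a.cpp and b.cpp fails: the reference to f(int) stays unresolved,
    // because the two signatures mangle to different symbols.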
Finally, we have the apparently unnecessary and rather annoying undefined behavior of individual expressions. For example:
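A minimal sketch, assuming the variable j and the expression ++i+i++ discussed below; the hypothetical functions out1 and out2 illustrate the unspecified order of argument evaluation also mentioned below:

    #include <iostream>

    int out1() { std::cout << 1; return 1; }
    int out2() { std::cout << 2; return 2; }

    void f(int, int) { }

    int main()
    {
        int i = 10;
        int j = ++i + i++;    // value of j unspecified: i is modified twice
                              // in one expression, with no sequencing between them
        f(out1(), out2());    // prints 12 or 21: argument order is unspecified
        std::cout << '\n' << j << '\n';
    }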
The value of j is unspecified to allow compilers to produce optimal
code. It is claimed that the difference between what can be produced
giving the compiler this freedom and requiring “ordinary left-to-right
evaluation” can be significant. Leading experts are unconvinced, but
with innumerable compilers “out there” taking advantage of the freedom
and some people passionately defending that freedom, a change would be
difficult and could take decades to penetrate to the distant corners of
the C and C++ worlds. It is disappointing that not all compilers warn
against code such as ++i+i++. Similarly, the order of evaluation of
arguments is unspecified.
There is a sentiment that too many “things” are left undefined,
unspecified, implementation-defined, etc. To address this, the ISO C++
committee has created Study Group 12 to review such cases and recommend
wide-ranging tightening-up that would reduce undefined, unspecified, and
implementation-defined behavior.