Suppose a value of type double is cast to type float. If the double holds NaN, it is converted to NaN. If the double holds Inf, it is converted to Inf. If the double holds -1e+50, it is converted to -Inf.
Is this behavior guaranteed by the C++ standard, or by IEEE 754? Or is it implementation-defined behavior, or even undefined behavior, so that one should not count on -1e+50 being converted to -Inf?
Strictly speaking, the standard does not define the sizes of float and double, but in my answer I will assume that they have different sizes and are 32-bit and 64-bit types, respectively.
A float can hold magnitudes from about 1.18×10^(−38) to about 3.4×10^38, so if you try to store in a float a double value that cannot be represented in it, you get undefined behavior. Therefore, you should not rely on the result.
My original answer, stating that you get undefined behavior, is incorrect. Undefined behavior is obtained if we try to store in a float a number that cannot be represented in it. Here is how the standard describes it:
C++14 standard [conv.double]
A prvalue of floating point type can be converted to a prvalue of another floating point type. If the source value can be exactly represented in the destination type, the result of the conversion is that exact representation. If the source value is between two adjacent destination values, the result of the conversion is an implementation-defined choice of either of those values. Otherwise, the behavior is undefined.
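As a rough illustration of the three branches in the quoted paragraph, here is a minimal sketch for finite inputs. The names ConvCase and classify are hypothetical, and the sketch assumes float has a narrower range and precision than double, as on common platforms. Whether the third branch is really undefined depends on the implementation, which is the point discussed next.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical labels for the branches of [conv.double].
enum class ConvCase { Exact, BetweenAdjacent, OutsideFiniteRange };

// Hypothetical helper: decide which branch applies to a finite double.
ConvCase classify(double d) {
    assert(std::isfinite(d));  // this sketch covers finite inputs only
    const double max_f = std::numeric_limits<float>::max();
    if (std::fabs(d) > max_f) {
        // Beyond the largest finite float; never converted here.
        return ConvCase::OutsideFiniteRange;
    }
    // Within the finite range the conversion is always well-defined.
    const float f = static_cast<float>(d);
    return static_cast<double>(f) == d ? ConvCase::Exact
                                       : ConvCase::BetweenAdjacent;
}
```

For example, classify(0.5) falls in the exact branch, classify(0.1) falls between two adjacent float values, and classify(-1e+50) lies outside the finite range of float.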
But all of this concerns an abstract implementation, whereas ours is quite specific (and this was my mistake: I applied the abstract description to a specific implementation and drew the wrong conclusion). Since I made an assumption about the range of valid float values, I chose a specific implementation, IEEE 754. And having chosen this implementation, I need to reason from it, not from the abstraction.
According to IEEE 754, infinity is part of the type, so the largest and smallest finite values are not its limits. Everything that lies outside the finite range is simply mapped to a representable value according to the rounding rules. So, by this rule, if you try to represent -1e+50 in a float, you get -INF. This is an ordinary float value, and it is guaranteed when the implementation's constant reporting IEEE 754 conformance for float is true *. That is, we are in this case from the standard: "the source value is between two adjacent destination values, the result of the conversion is an implementation-defined choice of either of those values".
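A minimal sketch of how this could be checked in code. It uses std::numeric_limits<float>::is_iec559 as the conformance check, which is my assumption about the constant meant here, and it assumes the default round-to-nearest mode (with a directed rounding mode, an overflowing value may become the largest finite float instead). The name narrow_ieee is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: narrow a double to float on an IEEE 754 implementation.
float narrow_ieee(double d) {
    // Assumption: is_iec559 is the constant that reports IEEE 754 conformance.
    static_assert(std::numeric_limits<float>::is_iec559,
                  "this sketch assumes an IEEE 754 float");
    const float f = static_cast<float>(d);
    // Sanity check: NaN stays NaN, and nothing else turns into NaN.
    assert(std::isnan(d) == std::isnan(f));
    return f;
}
```

Under these assumptions, narrow_ieee(-1e+50) yields negative infinity, which can be confirmed with std::isinf and std::signbit.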
If the above constant is false, the answer to the question may be different, but it must still be considered within a specific implementation, because the question contains a specific number, which cannot be evaluated without details of the implementation.
To sum up: whether this is undefined behavior or implementation-dependent behavior depends on the representation of the floating-point types and cannot be decided apart from it.
* I did not find the text of the document itself, but many documents that refer to it state this.
In C, by the way, this is stated more explicitly: if the environment supports infinity, the result is defined, and for the example from the question it is guaranteed to be -inf. This is described in the C11 standard [5.2.4.2.2/p5]:
The minimum range of representable values for a floating type is the most negative finite floating-point number representable in that type through the most positive finite floating-point number representable in that type. In addition, if negative infinity is representable in a type, the range of that type is extended to all negative real numbers; likewise, if positive infinity is representable in a type, the range of that type is extended to all positive real numbers.
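The range rule in this paragraph can be sketched roughly as follows. This is only an illustration of the C wording expressed with C++ facilities. The name in_float_range is hypothetical, and std::numeric_limits<float>::has_infinity is used as a simplifying stand-in for "infinity is representable", without distinguishing the negative and positive cases.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: is d within the range of float as the quoted
// paragraph defines it?
bool in_float_range(double d) {
    assert(!std::isnan(d));  // NaN is not a real number; not covered here
    using lim = std::numeric_limits<float>;
    if (std::fabs(d) <= lim::max()) {
        return true;  // inside the finite range
    }
    // Outside the finite range: in range only if infinity is representable,
    // because the range is then extended to all real numbers.
    return lim::has_infinity;
}
```

With an IEEE 754 float, in_float_range(-1e+50) is true, which matches the claim that the conversion has a defined result there.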