Question:
What elements of the C language are unsupported in C++? What C code will not be accepted by a C++ compiler? Particularly interested in the behavior of g++.
Answer:
C has differed significantly from C++ ever since C++ first appeared. (The new features of C99 make it easy to write C code that will not compile as C++, but there is no need to resort to C99 at all. Even the "classic" standard C, C89/90, differs markedly from C++.)
There are many serious differences, but per the question we are only interested in those that make correct C code incorrect in C++. Without claiming completeness, I will try to list these differences and give a code example for each. The key point is that the syntactic constructs used exist in both languages, i.e. at first glance the code looks innocent enough from a C++ point of view.
-
In C, it is allowed to "lose" the trailing '\0' when initializing a character array with a string literal
char s[4] = "1234";
The code is incorrect from a C++ point of view.
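Since the resulting array is not a string, it has to be handled by size rather than with the str* functions. A minimal C-only sketch (it deliberately uses the initialization that C++ rejects); copy_digits is a hypothetical helper name introduced for illustration:

```c
#include <assert.h>
#include <string.h>

/* Copies the four digits into out (at least 5 bytes) as a proper string
   and returns the number of characters copied. */
size_t copy_digits(char *out)
{
    char s[4] = "1234";        /* valid C: the '\0' does not fit and is dropped */
    assert(out != NULL);       /* precondition of this sketch */
    memcpy(out, s, sizeof s);  /* s is not a string, so copy by size */
    out[sizeof s] = '\0';      /* terminate explicitly */
    return sizeof s;
}
```

Some compilers may warn about the unterminated initialization, but the declaration itself is valid C.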
-
C supports "tentative" definitions. In one translation unit, you can make multiple external definitions of the same object
/* At file scope */
int a;
int a;
int a, a, a;
The code is incorrect from a C++ point of view.
-
In C, typedef type names and struct type tags are in different namespaces and do not conflict with each other. For example, such a set of declarations is valid in C
struct A { int i; };
typedef struct B { int i; } A;
typedef struct C { int i; } C;
In C++, there is no separate concept of a "tag" for class types: class names share the same namespace with typedef names and may conflict with them. For partial backward compatibility with idiomatic C code, C++ allows you to declare typedef aliases that match the names of existing class types, but only if the alias refers to the class type of that same name.
In the example above, the first typedef declaration is incorrect from a C++ point of view.
-
In C, an "unfamiliar" struct type name mentioned in a function's parameter list is a declaration of a new type local to that function. In the parameter list this type can be declared as incomplete and then completed in the body of the function
/* The type `struct S` is not known at this point */
void foo(struct S *p)
{
  struct S { int a; } s;
  p = &s;
  p->a = 5;
}
In this code, everything is correct from the point of view of the C language: p has the same type as &s, etc.
From the point of view of the C++ language, the mention of an "unfamiliar" struct type name in a function's parameter list is also a declaration of a new type. However, this new type is not local: it is considered to belong to the enclosing namespace. Therefore, from the point of view of the C++ language, the local definition of the type S in the above example has nothing to do with the type S mentioned in the parameter list. The assignment p = &s is not possible due to a type mismatch. The code is incorrect from a C++ point of view.
-
C allows new types to be defined inside a cast operator, inside the sizeof operator, and in function declarations (return types and parameter types)
int a = sizeof(enum E { A, B, C }) + (enum X { D, E, F }) 0;
enum E e = B;
int b = e + F;
The code is incorrect from a C++ point of view.
-
The C language allows the definition of external objects of incomplete types, provided that the type is extended and becomes complete somewhere further in the same translation unit
/* At file scope */
struct S s;
struct S { int i; };
The above set of declarations is correct from a C point of view, but incorrect from a C++ point of view. The C++ language unconditionally forbids defining objects of incomplete types.
-
In C, many statements create their own implicit enclosing scope in addition to the scope of the statement's "body", while C++ creates a single scope. For instance
for (int i = 0; i < 10; ++i) { int i = 42; }
In C, the loop body is a scope nested within the scope of the loop header, so this code is correct. In C++, there is only one scope, which rules out a "nested" declaration of i.
-
C allows jumping over declarations with initialization
switch (1) { int a = 42; case 1:; }
The code is incorrect from a C++ point of view.
-
In C, nested struct type declarations place the name of the inner type in the outer (enclosing) scope
struct A { struct B { int b; } a; };
struct B b;
The code is incorrect from a C++ point of view.
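When both compilers have to accept the code, one option is to avoid relying on where the nested name ends up. A minimal sketch, assuming it is acceptable to define the inner type at file scope first; sum_members is a hypothetical function name:

```c
#include <assert.h>

struct B { int b; };           /* defined at file scope: visible as struct B in C and C++ */
struct A { struct B a; };

/* Uses struct B both as a member of struct A and on its own. */
int sum_members(void)
{
    struct A outer = { { 1 } };
    struct B b = { 2 };        /* no reliance on the name leaking out of struct A */
    assert(sizeof outer.a == sizeof b);
    return outer.a.b + b.b;
}
```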
-
C allows implicit conversion of pointers from type void *
void *p = 0;
int *pp = p;
The code is incorrect from a C++ point of view.
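In practice this difference most often shows up with the result of malloc or calloc. A minimal sketch, assuming the goal is code accepted by both compilers; make_ints is a hypothetical helper name:

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates n zero-initialized ints; returns NULL on allocation failure. */
int *make_ints(size_t n)
{
    void *raw;
    assert(n > 0);               /* precondition of this sketch */
    raw = calloc(n, sizeof(int));
    return (int *)raw;           /* cast: redundant in C, required in C++ */
}
```

The caller is expected to release the memory with free.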
-
C (before C23) supports function declarations without prototypes
/* At file scope */
void foo();
void bar() { foo(1, 2, 3); }
The code is incorrect from a C++ point of view.
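If the intent is code that both compilers accept, the usual remedy is to give the function a prototype. A minimal sketch; add3 and call_add3 are hypothetical names used only for illustration:

```c
#include <assert.h>

int add3(int a, int b, int c);  /* prototype: parameter types are declared */

int call_add3(void)
{
    int r = add3(1, 2, 3);      /* arguments are checked against the prototype */
    assert(r == 6);
    return r;
}

int add3(int a, int b, int c)
{
    return a + b + c;
}
```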
-
In C, enum values are freely implicitly convertible to and from int
enum E { A, B, C } e = A;
e = e + 1;
The code is incorrect from a C++ point of view.
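The conversion from enum to int is still implicit in C++, so only the direction back to the enum needs help. A minimal sketch of a form that should compile in both languages; next_e is a hypothetical helper name:

```c
#include <assert.h>

enum E { A, B, C };

/* Returns the enumerator following e; the caller must not pass the last one. */
enum E next_e(enum E e)
{
    assert(e != C);            /* precondition of this sketch */
    return (enum E)(e + 1);    /* cast: redundant in C, required in C++ */
}
```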
-
The copy constructors and assignment operators implicitly generated by the C++ compiler cannot copy volatile objects. In C, copying volatile objects is not a problem.
struct S { int i; };
volatile struct S v = { 0 };
struct S s = v;
The code is incorrect from a C++ point of view.
-
In C, string literals are of type char [N], while in C++ they are const char [N]. Even if "classic" C++ supports the conversion of a string literal to char * as an exception, this exception only works when applied directly to the string literal
char *p = &"abcd"[0];
The code is incorrect from a C++ point of view.
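Declaring the pointer as const char * sidesteps the difference, since adding const is an implicit conversion in C as well. A minimal sketch; first_char is a hypothetical function name:

```c
#include <assert.h>
#include <string.h>

/* Returns a pointer to the first character of a string literal. */
const char *first_char(void)
{
    const char *p = &"abcd"[0];  /* const-qualified pointer: accepted in C and C++ */
    assert(strlen(p) == 4);
    return p;
}
```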
-
C allows the use of "meaningless" storage class specifiers in declarations that do not declare objects
static struct S { int i; };
The code is incorrect from a C++ point of view.
Additionally, note that in C typedef is also formally just one of the storage class specifiers, which allows you to create typedef declarations that do not declare aliases.
typedef struct S { int i; };
C++ does not allow such typedef declarations.
-
C allows explicit repetition of cv-qualifiers in declarations
const const int a = 42;
The code is incorrect from a C++ point of view. (C++ allows a similar "redundant" qualification, but only through intermediate type names: typedef names, template parameters).
-
In C, any integer constant expression with value 0 can be used as a null pointer constant
void *p = 2 - 2;
void *q = -0;
This was also the case in C++ before the adoption of the C++11 standard. However, in modern C++, of integer values, only the literal value 0 (an integer literal) can act as a null pointer constant, and more complex expressions are no longer valid. The above initializations are incorrect from a C++ point of view.
-
In C, you can make a non-defining declaration of an object of type void
extern void v;
(A definition of such an object will not work, because void is an incomplete type.) In C++, even a non-defining declaration is forbidden.
-
In C, a bit field declared as type int without an explicit signed or unsigned can be either signed or unsigned (implementation-defined). In C++, such a bit field is always signed.
-
Before C23, the C preprocessor is not familiar with literals like true and false. In such C code, true and false are only available as macros defined in the <stdbool.h> header file. If these macros are not defined, then according to the rules of the preprocessor, both #if true and #if false should behave like #if 0.
At the same time, the C++ preprocessor must natively recognize the true and false literals, and its #if directive must behave the "expected" way with these literals.
This can be a source of incompatibilities when the C code does not include <stdbool.h>
#if true
int a[-1];
#endif
This code is obviously incorrect in C++, and at the same time can be easily compiled in C.
-
The C language does not support cv-qualification for rvalues. In particular, the cv-qualification of a function's return value is ignored by the language. Together with the automatic conversion of arrays to pointers, this allows you to bypass some rules of const correctness
struct S { int a[10]; };
const struct S foo() { struct S s; return s; }
int main() { int *p = foo().a; }
From a C++ perspective, the return value of foo() and hence the array foo().a are const, and implicit conversion of foo().a to int * is not possible.
-
In C, an implicit conflict between internal and external linkage in declarations of the same entity results in undefined behavior, whereas in C++ such a conflict makes the program ill-formed. Setting up such an implicit conflict requires a rather tricky configuration
static int a; /* Internal linkage */
void foo(void)
{
  int a; /* Hides the outer static `a`, has no linkage */
  {
    /* Because the outer static `a` is hidden, this declares `a` with
       external linkage. Now `a` is declared with both external and
       internal linkage: a conflict */
    extern int a;
  }
}
In C++, such an extern declaration is erroneous.
-
Recursive calls to main are allowed in C, but not in C++.
-
Since C++11, the C++ preprocessor no longer treats the sequence <string or character literal><identifier> as independent tokens. From the point of view of the C++ language, <identifier> in this situation is a literal suffix. To avoid this interpretation, in C++ these tokens should be separated by a space
uint32_t a = 42;
printf("%"PRIu32, a);
This code is correct from a C point of view, but incorrect from a C++ point of view.
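With the space inserted, the same call should be accepted by both languages. A minimal sketch that formats into a buffer instead of printing; format_u32 is a hypothetical helper name:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Formats v into buf (of size n) and returns the number of characters
   that the full output requires, as reported by snprintf. */
int format_u32(char *buf, size_t n, uint32_t v)
{
    assert(buf != NULL && n > 0);  /* precondition of this sketch */
    /* The space keeps PRIu32 from being read as a literal suffix in C++11. */
    return snprintf(buf, n, "%" PRIu32, v);
}
```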
-
The C language allows the definition of const objects without initialization
void foo() { const int a; }
In C++, such a declaration is incorrect.