Chapter 2 Variables and Basic Types
2.1 Primitive Built-in Types
Arithmetic Types

- Machine-level representation:
- The smallest chunk of addressable memory is a byte.
- The basic unit of storage is a word.
- The size (largest value) of arithmetic types varies across machines: compilers are allowed to use larger sizes than the minimum required.
- Two categories of arithmetic types: integral and floating-point.
- Character types:
charis guaranteed to be big enough to hold numeric values corresponding to characters in the machine’s basic character set.wchar_t,char16_t,char32_tare used for extended character sets:wchar_tis large enough for the machine’s largest extended character set.char16_tandchar32_tare for Unicode characters.- There are three distinct basic character types:
char,signed charandunsigned char.charis not necessarily the same assigned char, which may lead to problems across machines.
- Floating-point types:
float,double,long doubleare represented in 1, 2 and 3/4 words.- IEEE floating-point standard: sign + biased exponent (8/11 bits) + normalized mantisa (23/52 bits)
- Biased exponent ($2^{b-1}-1$): you could treat the bits of a floating-point number as a (signed) integer of the same size, and if you compared them that way, the values will sort into the same order as the floating point numbers they represented.
- Normalized mantisa: significant digits in scientific notation.
- Special values:
| Exponent | Mantisa | Value |
|---|---|---|
| 0 | 0 | 0 |
| 255 | 0 | Infinity |
| 0 | not 0 | Denormalised |
| 255 | not 0 | Not a number (NAN) |
Type Conversions
- Type conversion behavior:
- Assigning floating-point values to integral values will involve value truncation (not identical bit pattern). The value stored is the part before the decimal point (not in scientific notation).
- Precision may be lost when assigning an integral value to a floating-point type (not identical bit pattern).
- Assigning out-of-range values to unsigned types will involve modulo operation (the same bit pattern).
- Assigning out-of-range values to signed types are undefined (some processors generate a hardware exception on overflow).
- Avoid undefined/implementation-defined behavior (e.g. assuming the size of
int).
- Avoid undefined/implementation-defined behavior (e.g. assuming the size of
- Cases of type conversion:
- The compiler applies type conversions when another arithmetic type is expected.
- Signed values are converted to unsigned values if we use both in an arithmetic expression (not limited to addition/subtraction). Don’t mix signed and unsigned types.
Literals
- Integer literal:
- Decimal, octal (
0), hexadecimal (0x/0X) notation.- The octal and hexadecimal literals have the smallest type of
int,unsigned int,long,unsigned long,long long, orunsigned long long. The decimal literals have the smallest type ofint,long, orlong long(decimal literals are signed by default). There are no literals ofshort. - The type can be explicitly specified with suffixes:
u/Uforunsigned,l/Lforlong, andll/LLforlong long. This is specified as the minimum type -lmay belong long.
- The octal and hexadecimal literals have the smallest type of
- Technically speaking, the value of a decimal literal is never a negative number.
- Decimal, octal (
- Floating-point literal: decimal point (
.) or an exponent specified using scientific notation (Eore).- Floating-point literals are always
doubleby default. It doesn’t depend on the value. - The type can be explicitly specified with suffixes:
f/Fforfloat, andl/Lforlong double.
- Floating-point literals are always
- Character literal: enclosed in single quotes (
'). - String literal: enclosed in double quotes (
"). The type is array of constantchars (const char []), where a null character ('\0') will be appended at the end.- The type can be explicitly specified with prefixes:
uforchar16_t,Uforchar32_t,Lforwchar_t,u8for utf-8charstring literals. - Two string literals adjacent to each other separated only by spaces, tabs, or newlines are concatenated into a single literal.
- Nonprintable and control characters can be represented using escape sequences. We can also use generalized escape sequences:
\xfollowed by hexadecimal digits or\followed by octal digits of the character value.\uses only the first three octal digits following it, but\xuses up all the hex digits.
- The type can be explicitly specified with prefixes:
| Character | Escape Sequence |
|---|---|
| newline | \n |
| vertical tab | \v |
| backslash | \\ |
| carriage return | \r |
| horizontal tab | \t |
| backspace | \b |
| question mark | \? |
| formfeed | \f |
| alert (bell) | \a |
| single quote | \' |
| double quote | \" |
- Other literals:
true/false(boolean) andnullptr(pointer/address).
2.2 Variables
Variable Definitions
- Definition: base type specifier followed by a list of declarators separated by commas. Each name may have an initial value (initialized, not assigned).
- The name of each object becomes visible immediately; it is possible to initialize a variable to the value of one defined earlier.
- Four ways of initialization:
- Loss of information from conversion is not allowed in list initialization. It can protect us from unintentional narrowing conversions.
int units_sold(0); // direct initialization
int units_sold = 0; // copy initialization
int units_sold{0}; // list initialization
int units_sold = {0}; // list initialization
Summary of four forms of initialization:
- The form of initialization is different from how the value is initialized.
- For copy initialization, we can only supply a single initializer - a constructor (not necessarily the copy constructor) will be called with the initializer passed in as the argument.
- Copy initialization may be suppressed with
explicit.- For in-class initializers, we must either use copy initialization or list initialization (no direct initialization). See section 2.6 for details.
- We can supply a list of values only by using list initialization - list-initializing will be preferred.
- If such constructors don’t exist, it will fall back to direct-initializing with elements as arguments.
- Initializer lists don’t require all elements to be of the same type - they may be used for direct initialization. This is different from the template
initializer_list, which requires elements to be of the same type.
- A variable is default initialized without an initializer. Refer to section 7.5.3 for details. The value depends on the type and where it is defined:
- Built-in/Compound (arrays/pointers) types: variables defined outside of blocks are zero-initialized; those defined inside a block are uninitialized - undefined. Uninitialized variables have indeterminate values, which may cause errors hard to debug.
- Classes: up to the class whether we can define objects without an initializer and what value will it have.
Variable Declarations and Definitions
- C++ supports separate compilation. In order to share symbols and code across files, declarations and definitions are distinguished:
- Declaration makes a name known to the program (included to use the name by others). A symbol may be declared multiple times.
- Definition creates the associated entity (the type should be in line with declaration. The compiler will not necessarily check this). A symbol may be defined only once in a program.
- A variable definition is a declaration (in contrast to “definition” mentioned above). We need to use
externwithout an explicit initializer to obtain a declaration that is not also a definition.- It is an error to provide an initializer on an
externinside a function. - To use a variable in more than one file requires declarations separate from definition.
- It is an error to provide an initializer on an
extern int x; // declaration but not a definition
int y = 0; // declaration and definition
extern int z = 0; // declaration and definition
Identifiers
- Identifiers can be letters, digits and the underscore. There are no limits on name length. They must begin with a letter or underscore. The names are case-sensitive.
- C++ keywords and alternative operator names cannot be used as identifiers.


- The C++ standard also reserves a set of names for use in libraries:
- Identifiers beginning with an underscore followed immediately by an uppercase letter (in any scope).
- Identifiers containing adjacent underscores (or “double underscore”) (in any scope).
- Identifiers beginning with an underscore (in the global namespace/outside a function).
Scope of a Name
- Names are visible from the point of declaration until the end of the enclosing scope (block or file/global).
- The scope operator (
::) can be used to use the name in a specific scope. Since the global scope has no name, the scope operator with an empty left-hand side refers to the global scope.- It is almost always a bad idea to define a local variable with the same name as a global variable that the function may use.
2.3 Compound Types
References
- A (lvalue) reference defines an alternative name for an object. A reference must be initialized: it will be bound to its initializer. There is no way to rebind a reference.
- Since references are not objects, we may not define a reference to a reference.
- The type of a reference and the referred object should match exactly with two exceptions:
- We can initialize a reference to
constfrom any expression convertible to the type of reference (section 2.4.1). - We can initialize a reference to a base-class type from an object of a derived-class type (section 15.2.3).
- We can initialize a reference to
- A reference may be bound only to an object, not to a literal or the result of an expression. Basically, a non-
constreference must be bound to a non-constlvalue (section 4.1).
Pointers
- A pointer is an object: it can be assigned and copied and need not be initialized.
- We get the address of an object by using the address-of operator (
&). We may not define a pointer to a reference (not an object). - The type of a pointer and the pointed object must match with two exceptions:
- We can set a pointer to
constto a nonconstobject (section 2.4.2). - We can set a pointer to a base-class type to an object of a derived-class type (section 15.2.3).
- We can set a pointer to
- We use the dereference operator (
*) to access the object pointed to by a pointer, or use the member access operator (->) if the pointer points to a non-built-in object. - Null pointers:
nullptr(a special type convertible to any pointer type),0/NULL(avoided in modern C++ programs). - Pointers can be used in conditions (non-zero pointers evaluate to
true) or comparison operations (==/!=).- It is possible for a pointer to an object and a pointer one past the end of a different object to hold the same address.
void *pointers can hold the address of any data pointer type (cast automatically when assigned), but we cannot dereference avoid *pointer without an explicit cast. We use them to deal with memory as memory instead of accessing objects.
Understanding Compound Type Declarations
- There are no limits to how many type modifiers (
&/*) can be applied to a declarator. - Reference to a pointer:
int *&r = p(read from right to left).
2.4 const Qualifier
Declaration and Definition
constobject values cannot be changed after initialization. They must be initialized at the point of declaration.- When we use an object to (copy) initialize another object, it doesn’t matter whether either or both are
constobjects.
- When we use an object to (copy) initialize another object, it doesn’t matter whether either or both are
constvariable definitions are local to a file. The compiler handlesconstvariables differently from ordinary ones:constobjects initialized from a compile-time constant will be replaced with its corresponding value during compilation.constobjects initialized from a non-constant expression will remain as objects (variables). They are local to the file: no multi-definition compilation errors.
- In order to reuse a
constvariable across files, we can define a single instance of it by usingexternon both the definition and declarations:- In contrast,
externis only needed on the declaration for non-constvariables.
- In contrast,
// file.cpp
extern const int bufSize = fcn(); // !!! extern makes it visible in other files
// file.h
extern const int bufSize;
References to const
- “Reference to
const” is abbreviated as “constreference”. There are noconstreferences - a reference is not an object. - We can initialize a reference to
constfrom any expression convertible to the type of reference - non-constobject, literal or general expression. Basically, we can bind aconstreference to an rvalue or an lvalue (section 4.1).- In this way, a temporary object is created by the compiler to store the result of the expression, and is bound by the reference.
- The underlying object might be non-
constand be changed by other means.
int i = 42;
const int &r1 = i; // ok: plain int object
const int &r2 = 42; // ok: literal
const int &r3 = r1 * 2; // ok: general expression that evaluates to int
int &r4 = r1 * 2; // error: r4 is a plain, non-const reference
double dval = 3.14;
const int &ri = dval; // ok: bound to a temporary object - int 3
Pointers and const
- A pointer to
constmay not be used to change the pointed object. We may store the address of aconstobject only in a pointer toconst.- Like references, pointers to
constcan be bound to non-constobjects.
- Like references, pointers to
constpointers have unchangeable address values: put theconstafter the*(int *const) - read from right to left.
Top-Level const
- Top-level
const: the pointer (object) itself isconst; low-levelconst: referring/pointing to aconstobject. - When we copy an object, top-level
consts (of both sides) are ignored - copying doesn’t change the copied object, and we can declare the new object asconstor non-constas we wish. However, low-levelconstis never ignored.
int i = 0;
int *const p1 = &i; // top-level
const int ci = 42; // top-level
const int *p2 = &ci; // low-level
const int *const p3 = p2; // top-level + low-level
int *p = p3; // error: p3 has a low-level const but p doesn’t
p2 = p3; // ok: p2 has the same low-level const qualification as p3
p2 = &i; // ok: we can convert int* to const int*
int &r = ci; // error: can’t bind an ordinary int& to a const int object
const int &r2 = i; // ok: can bind const int& to plain int
constexpr and Constant Expressions
- A constant expression is an expression whose value cannot change and can be evaluated at compile time: literals,
constobjects initialized from constant expressions, etc. - We might define a
constvariable with an initializer that we think is a constant expression. However, when we use that variable in a context that requires a constant expression we may discover that the initializer was not a constant expression. In order to avoid this, we can useconstexprdeclaration to ask the compiler to verify whether a variable is a constant expression.- Warning:
constexpris different from constant expressions -constexpris only a way to verify whether it is a constant expression. constexprimplicitly impliesconst.
- Warning:
constexpr int mf = 20; // 20 is a constant expression
constexpr int limit = mf + 1; // mf + 1 is a constant expression
constexpr int sz = size(); // ok only if size is a constexpr function
- The types can be used as
constexprare known as literal types: arithmetic types, references, pointers, and literal classes (section 7.5.6). constexprpointers can be initialized with objects on the heap (defined outside functions or static) but not with objects on the stack.constexprspecifier always applies to the pointer, not the type to which the pointer points.
const int *p = nullptr; // p is a pointer to a const int - low-level
constexpr int *q = nullptr; // q is a const pointer to int - top-level
constexpr int i = 42;
int j = 0;
constexpr const int *p = &i; // p is a constant pointer to the const int i
constexpr int *p1 = &j; // p1 is a constant pointer to the int j
2.5 Dealing with Types
Type Aliases
- Two ways of defining a type alias -
typedefandusing:
typedef double wages; // wages is a synonym for double
typedef wages base, *p; // base is a synonym for double, p for double*
using SI = Sales_item; // SI is a synonym for Sales_item
- However, there are pitfalls to be aware of: compound types are viewed as a whole and cannot be simply replaced with the underlying types:
typedef char *pstring;
const pstring cstr = 0; // cstr is a constant pointer to char
const char *cstr = 0; // wrong interpretation of const pstring cstr
const pstring *ps; // ps is a pointer to a constant pointer to char
The auto Type Specifier
autotells the compiler to deduce the type from the initializer (must have one). Multiple variables defined with a singleautoshould have consistent types.- No implicit type conversion:
auto sz = 0, pi = 3.14is not allowed.
- No implicit type conversion:
- When we use a reference as an initializer, the compiler uses the referenced object’s type for
auto’s type deduction. const-related type deduction:- Top-level
constwill be ignored byautoduring type deduction. - However, when we ask for a reference to an
auto-deduced type, top-levelconsts in the initializer are not ignored - the defined reference will have low-levelconst. - For temporary objects as initializers (literals/expressions), no top-level
constis implied, and the deduced type will be non-const. Therefore,constis necessary when binding a reference to a temporary object.
- Top-level
// Top-level const
int i = 0;
const int ci = i;
auto b = ci; // int
auto e = &ci; // const int * (low-level)
const auto f = ci; // const int (top-level)
// References
auto &g = ci; // const int & (top-level const is preserved when using auto with a reference)
const auto &h = ci; // const int & (const is not necessary here)
auto &j = 42; // not allowed (literal - no top-level const; deduced type is int)
const auto &j = 42; // const int &
// Type consistency
auto k = ci, &l = i; // k is int; l is int&
auto &m = ci, *p = &ci; // m is const int&; p is a pointer to const int
auto &n = i, *p2 = &ci; // error: type deduced from i is int; type deduced from &ci is const int
The decltype Type Specifier
decltype(...)is used to let the compiler analyze the expression to determine the type without evaluating the expressions.constand reference-related type deduction:- When the expression is a variable, then the exact type including top-level
constand reference is returned. This is the only context where a reference variable is not treated as a synonym for the referred object. - For other expressions, the result is a reference type if the expression yields an lvalue (section 4.1.1), e.g. dereference.
- This leads to some interesting behaviors. If we wrap a variable in one or more sets of parentheses, it becomes an expression that yields an lvalue, so
decltypewill return a reference type.
- This leads to some interesting behaviors. If we wrap a variable in one or more sets of parentheses, it becomes an expression that yields an lvalue, so
- When the expression is a variable, then the exact type including top-level
int i = 42, *p = &i, &r = i;
decltype(r) a = i; // a is int &
decltype(r + 0) b; // b is int: a trick to get the referred type
decltype(*p) c = i; // c is int &
decltype((i)) d = i; // d is int &
decltype(i) e; // e is int
decltype(i = 45) f; // f is int &: i = 45 will not be executed - only for type deduction
2.6 Defining Our Own Data Structures
Defining the Sales_data Type
- The close curly that ends the class/struct body must be followed by a semicolon. It is needed because we can define variables after the class body:
struct Sales_data { /* ... */ } accum, trans, *salesptr;.- It is not recommended though: it obscures the code by combining the definition of two different entities.
- Under C++11, we can supply in-class initializers for data members if we don’t want them default initialized. They must either be enclosed inside curly braces or follow an
=sign.
Writing Our Own Header Files
- Classes are usually defined in header files to ensure the definition is the same in each file. Headers contain entities that can be defined only once in any given file, but since it might be included more than once, we need to write headers that is safe even if it is included multiple times.
- The preprocessor is a program that runs before the compiler and changes the source text of programs.
#include: It will be replaced with the contents of the specified header.- Preprocessor variables (
#define,#ifdef,#ifndef,#endif) are used to define header guards. We typically ensure uniqueness by basing the guard’s name on the name of a class.