Topic 1 - Introduction to C++

Warning

This is just a quick introduction to (or review of) C++.
This course is not a course on programming or OOP.
By now you should be familiar with all the concepts in this topic; we are only focusing on the syntax.

C++ is an old language, and its syntax is very old school.
The worst part, if you come directly from Python, is that with C++ we need to compile!
We will take a first class to discuss the syntax.

#include <iostream>
using namespace std;

/**
 * A class for simulating an integer memory cell.
 */
class Int
{
  public:
    /**
     * Construct the Int.
     * Initial value is 0.
     */
    Int( ){
        storedValue = 0;
    }

    /**
     * Construct the Int.
     * Initial value is initialValue.
     */
    Int( int initialValue ){
        storedValue = initialValue;
    }

    /**
     * Return the stored value.
     */
    int read( ){
        return storedValue;
    }

    /**
     * Change the stored value to x.
     */
    void write( int x ){
        storedValue = x;
    }

  private:
    int storedValue;
};

int main( )
{
    Int m;

    m.write( 5 );
    cout << "Cell contents: " << m.read( ) << endl;

    return 0;
}

Basic class syntax

Private/Public

In C++, similarly to Java (or Java similarly to C++), the members of a class can be either:

private: members that cannot be accessed from outside the class.
public: members that can be accessed from outside the class.
or protected (but we will not talk about it).

Usually, we put the public members at the beginning and the private members at the end.

To declare public or private members, you just type the label, then your members:

class Int
{
    public:

        /**
         * ...
         * Public members
         * ...
         */

    private:

        /**
         * ...
         * Private members
         * ...
         */

};

Constructor

Constructors are almost always public; otherwise it would be impossible to create an object from outside the class.

The simplest constructor is just the name of the class followed by parentheses:

class Int{
  public:
    Int(){
    }
};

If you have data members (variables) to initialize, you can assign a default value inside the constructor:

class Int{
  public:
    Int(){
        storedValue = 0;
    }

  private:
    int storedValue;
};

Now, if we want to specify the values of the variables during the initialization then we can just add parameters to the constructors:

class Int{
  public:
    Int(int initialValue){
        storedValue = initialValue;
    }

  private:
    int storedValue;
};

When your data members have a simple type, an assignment statement is enough to initialize them. However, when a data member has a more complex type (or is const or a reference), assigning to it inside the constructor body can be inefficient or not even allowed.

This is why C++ provides the initialization list:

class Int{
  public:
    Int(int initialValue) :
        storedValue{initialValue}
    {

    }

  private:
    int storedValue;
};

Warning

You should always use braces; it is the convention, and it works for all types (unlike parentheses).
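As a minimal sketch of why the initialization list matters, consider a class with a const member, which can be initialized but never assigned. The class Label, its members and the function demoLabel are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical class, for illustration only.
class Label
{
  public:
    // Both members are initialized before the constructor body runs.
    Label( int initialId, const std::string & initialText ) :
        id{ initialId }, text{ initialText }
    {
    }

    int getId( ) const { return id; }
    const std::string & getText( ) const { return text; }

  private:
    const int id;       // const: can be initialized, but never assigned
    std::string text;   // built directly from the argument
};

// Small check of the class above.
void demoLabel( )
{
    Label l{ 7, "seven" };
    assert( l.getId( ) == 7 );
    assert( l.getText( ) == "seven" );
}
```

Writing id = initialId; inside the constructor body instead would not compile, because id is const. Braces also reject narrowing conversions: int n{ 3.7 }; is an error, while int n( 3.7 ); compiles and silently truncates.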

Getters and Setters

Getters and setters work much as they do in any other language.

Again, there are some rules in C++ that you need to know.

Activity

Take a close look at the following code and tell me if you see anything different/strange/etc.

class Int
{
public:
    /**
    * Return the stored value.
    */
    int read( ) const{
        return storedValue;
    }

    /**
    * Change the stored value to x.
    */
    void write( int x ){
        storedValue = x;
    }

private:
    int storedValue;
};

Getters:

The getters in C++ should always have the keyword const after the parentheses. It marks the method as an accessor that promises not to modify the object.

By default all methods are mutators in C++, meaning that they can modify any data member of the object.

Setters:

The setters don’t return anything, so they use the keyword void in place of a return type.
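The sketch below shows what const buys us in practice: through a constant reference, only const methods can be called. The helper peek and the function demoConst are hypothetical names added for this illustration.

```cpp
#include <cassert>

class Int
{
  public:
    int read( ) const { return storedValue; }     // accessor
    void write( int x ) { storedValue = x; }      // mutator

  private:
    int storedValue = 0;
};

// Hypothetical helper: cell is a constant reference, so only
// const methods may be called on it.
int peek( const Int & cell )
{
    // cell.write( 1 );   // would not compile: write is not const
    return cell.read( );
}

void demoConst( )
{
    Int m;
    m.write( 5 );
    assert( peek( m ) == 5 );
}
```

If read were declared without const, the call inside peek would be rejected by the compiler as well.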

Separation of Interface and Implementation

In C++, we usually do not put the interface and the implementation of a class in the same file.

We use two files for that: a .h and a .cpp.

Note

As it is an old language, programmers have used different extensions over the years.

For .h, you can find .hpp, .hh.

For .cpp you can find .cc, .cxx, .c++.

In this course, only use the default ones.

Going back to our example, we need to separate the interface from the implementation. We obtain the following files:

Int.h

#ifndef Int_H
#define Int_H

/**
 * A class for simulating an integer memory cell.
 */
class Int
{
  public:
    explicit Int( int initialValue = 0 );
    int read( ) const;
    void write( int x );
    
  private:
    int storedValue;
};

#endif

Int.cpp

#include "Int.h"

/**
 * Construct the Int with initialValue
 */
Int::Int( int initialValue ) : storedValue{ initialValue }
{
}

/**
 * Return the stored value.
 */
int Int::read( ) const
{
    return storedValue;
}

/**
 * Store x.
 */
void Int::write( int x )
{
    storedValue = x;
}

Preprocessor Commands

If it is the first time that you see some C++ or C code, then you must wonder why we have the following lines:

#ifndef Int_H
#define Int_H

// Some code...

#endif

These lines are called preprocessor directives; together they form an include guard.

In this case, the guard makes sure that the contents of Int.h are only included once in each source file being compiled.

Each time you need to use the class Int, you will need the interface, so you will include Int.h.

However, a source file can end up including Int.h several times, directly or through other headers, and the class would then be defined more than once.

To guard against this, we use the first two lines to define a macro (Int_H in our case); if this macro is already defined, the preprocessor skips everything up to the matching #endif.

#endif is at the end of the file to close the condition.
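To see the guard at work without several files, the sketch below pastes the text of a small header twice into one file, which is roughly what two #include lines would do. Point, POINT_H and demoGuard are hypothetical names for this illustration.

```cpp
#include <cassert>

// First "inclusion": POINT_H is not defined yet, so the struct is seen.
#ifndef POINT_H
#define POINT_H
struct Point
{
    int x;
    int y;
};
#endif

// Second "inclusion" of the same text: POINT_H is now defined, so the
// preprocessor skips everything up to #endif. Without the guard, Point
// would be defined twice and the file would not compile.
#ifndef POINT_H
#define POINT_H
struct Point
{
    int x;
    int y;
};
#endif

void demoGuard( )
{
    Point p{ 1, 2 };
    assert( p.x + p.y == 3 );
}
```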

So, now when you want to use this class you can include it multiple times without any issue:

main.cpp

#include <iostream>
#include "Int.h"
using namespace std;

int main( )
{
    Int m;   // Or, Int m( 0 ); but not Int m( );

    m.write( 5 );
    cout << "Cell contents: " << m.read( ) << endl;

    return 0;
}

Notice that both main.cpp and Int.cpp have the include:

#include "Int.h"

Scope Resolution Operator and Signatures Matching

Again, we need to discuss some C++ syntax.

In the implementation file, each method needs to identify the class it is part of. Otherwise the compiler treats it as an ordinary function that does not belong to any class, which leads to compilation or linking errors.

The syntax is always:

ClassName::member

Now looking back at Int.cpp:

Int.cpp

#include "Int.h"

/**
 * Construct the Int with initialValue
 */
Int::Int( int initialValue ) : storedValue{ initialValue }
{
}

/**
 * Return the stored value.
 */
int Int::read( ) const
{
    return storedValue;
}

/**
 * Store x.
 */
void Int::write( int x )
{
    storedValue = x;
}

Not too complicated, but something that you need to remember.

Even with the correct syntax, you need to make sure that the method signature matches the declaration exactly.

Example

The method read() is a getter, so it has the const keyword:

int read( ) const;

So in the implementation you need to have it too:

int Int::read( ) const
{
    return storedValue;
}

If you remove it, the code will not compile: without const, the compiler sees a different method, one that was never declared in the class.

Objects Are Declared Like Primitive Types

One last thing about objects in C++.

An object is declared just like a primitive type, so the legal declarations for the Int class are the following:

Int obj1;       // Zero parameter constructor
Int obj2(20);   // One parameter constructor

The following are incorrect:

Int obj3 = 42;  // Error: the constructor is explicit
Int obj4();     // Declares a function named obj4, not an object

obj4 is very confusing, because in other languages it would create an object.

This is why, since C++11, we prefer to use braces to initialize variables.

C++11 introduced the following declarations:

Int obj1;       // Zero parameter constructor, same as before
Int obj2{20};   // One parameter constructor, same as before (but with braces)
Int obj4{ };    // Zero parameter constructor
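The declarations above can be checked with a short sketch. It assumes an Int class with an explicit constructor and a default value of 0, as in Int.h; the function demoDeclarations is a hypothetical name.

```cpp
#include <cassert>

class Int
{
  public:
    explicit Int( int initialValue = 0 ) : storedValue{ initialValue } { }
    int read( ) const { return storedValue; }

  private:
    int storedValue;
};

void demoDeclarations( )
{
    Int a;          // object, zero-parameter constructor
    Int b{ 20 };    // object, one-parameter constructor
    Int c{ };       // object, zero-parameter constructor
    // Int d( );    // would declare a function d returning an Int,
                    // so d.read( ) would not compile

    assert( a.read( ) == 0 );
    assert( b.read( ) == 20 );
    assert( c.read( ) == 0 );
}
```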

Note

You can use the syntax you prefer, but be consistent.

Activity

Create a class Float, separated in two files.

Try this class.

C++ Details

There are some C++ details that are important for us, so we need to review them.

Pointers

Remember, a pointer variable is a variable that stores the address where another object resides.

To illustrate how pointers work we use the previous main.cpp and modify it.

#include <iostream>
#include "Int.h"
using namespace std;

int main( )
{
    Int *m;

    m = new Int{ 0 };
    m->write( 5 );
    cout << "Cell contents: " << m->read( ) << endl;

    delete m;
    
    return 0;
}

In the declaration of a pointer variable, the variable name is always preceded by a *. You don’t need to assign a value during the declaration; it can be done later.

When you finally want to create a new object, you need to use the keyword new.

If you want to use the zero parameter constructor, the following declarations are legal:

m = new Int();  // OK
m = new Int{};  // C++ 11
m = new Int;    // Preferred

Important

When you create an object with new, you need to delete it manually!

delete m;

Accessing Members of an Object through a Pointer

If a pointer variable points at a class type, then a visible member of the object being pointed to can be accessed with the -> operator.

m->write( 5 );
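As a small sketch, the -> operator is a shorthand for dereferencing the pointer first and then using the dot operator. The class below is a condensed Int, and demoArrow is a hypothetical name.

```cpp
#include <cassert>

class Int
{
  public:
    explicit Int( int initialValue = 0 ) : storedValue{ initialValue } { }
    int read( ) const { return storedValue; }
    void write( int x ) { storedValue = x; }

  private:
    int storedValue;
};

void demoArrow( )
{
    Int *m = new Int{ 0 };

    m->write( 5 );                  // access through the pointer
    assert( (*m).read( ) == 5 );    // same member, written the long way

    (*m).write( 8 );
    assert( m->read( ) == 8 );

    delete m;                       // every new needs a matching delete
}
```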

Address-of Operator (&)

One important operator is the address-of operator &.

This operator returns the memory location where an object resides and is useful for implementing an alias.

cout << "Address: " << &m << endl;

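Here m is itself a pointer variable, so this prints the memory address where the pointer m is stored (printing m instead would give the address of the Int it points to).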
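A minimal sketch of the address-of operator on a plain int, including the case of a pointer that has its own address. The function demoAddressOf and the variable names are hypothetical.

```cpp
#include <cassert>

void demoAddressOf( )
{
    int x = 3;
    int *p = &x;      // p stores the address of x

    *p = 7;           // writing through p changes x
    assert( x == 7 );
    assert( p == &x );

    int **pp = &p;    // a pointer is itself a variable with an address
    assert( *pp == p );
    assert( **pp == 7 );
}
```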

Lvalues, Rvalues, and References

Ok, now we need to speak about lvalues, rvalues and references.

It will confuse you a lot at first, so stay focused!

An lvalue is an expression that identifies a non-temporary object. Basically, every variable that has a name.

An rvalue, on the other hand, is an expression that identifies a temporary object or is a value not associated with any object.

Example

Consider the following code:

vector<string> arr(3);
const int x = 2;
int y;
// ...
int z = x + y;
string str = "foo";
vector<string> *ptr = &arr;

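The expressions arr, str, arr[x], y, z, ptr and (*ptr)[x] are all lvalues.

However, 2, x+y, &x and str.substr(0,1) are rvalues. Strictly speaking, the string literal "foo" is itself an lvalue, but an expression such as string("foo"), which builds a temporary string from it, is an rvalue.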

Note

Intuitively, if the function call computes an expression whose value does not exist prior to the call and does not exist once the call is finished unless it is copied somewhere, it is likely to be an rvalue.

A reference type allows us to define a new name for an existing value.

In C++ you can reference lvalues and rvalues! Usually that’s where you start to be confused.

Lvalues

An lvalue reference is declared by placing an & after some type. It then becomes an alias for the object it references.

Example

string str = "hell";
string & rstr = str;    // rstr is an alias of str
rstr += 'o';            // changes str to "hello"

There are three common uses of lvalue references:

Aliasing complicated names.
Range for loops.
Avoiding a copy.

Case 1:

auto & whichList = theLists[myhash(x, theLists.size())];
if ( find(begin(whichList), end(whichList), x) != end(whichList) ){
    return false;
}
whichList.push_back(x);

Case 2:

Suppose we want to increase by 1 all values in a vector. This is easy with a for loop:

for (int i = 0; i < arr.size(); i++){
    arr[i]++;
}

But a range for loop would be more elegant:

for (auto x: arr){
    x++;            // It will not work, x is a copy!
}

The correct solution is:

for (auto &x: arr){
    x++;            // It is working, x is an alias.
}
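Both range for loops can be compared side by side in a short sketch. The function demoRangeFor and the vector contents are hypothetical; the asserts state what each loop leaves in the vector.

```cpp
#include <cassert>
#include <vector>

void demoRangeFor( )
{
    std::vector<int> arr{ 1, 2, 3 };

    for( auto x : arr ){
        x++;                // x is a copy: arr is unchanged
    }
    assert( arr[ 0 ] == 1 && arr[ 1 ] == 2 && arr[ 2 ] == 3 );

    for( auto & x : arr ){
        x++;                // x is an alias: arr is modified
    }
    assert( arr[ 0 ] == 2 && arr[ 1 ] == 3 && arr[ 2 ] == 4 );
}
```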

Case 3:

The last use case is to avoid a copy.

Suppose we have a function findMax that returns the largest value in a vector. Then, we would use it this way:

auto x = findMax( arr );

It will work, but x will be a copy of the largest value. This is fine if it is what we want!

Otherwise, provided that findMax returns a reference to the element, a reference is enough and we don’t need to use more memory:

auto &x = findMax( arr );   // Now we have a reference / alias and not a copy.
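The notes do not give the body of findMax, so the sketch below is only one possible version, assuming a non-empty vector of strings and a return by constant reference. The function demoFindMax and the sample data are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical findMax; assumes arr is not empty.
const std::string & findMax( const std::vector<std::string> & arr )
{
    std::size_t maxIndex = 0;
    for( std::size_t i = 1; i < arr.size( ); ++i ){
        if( arr[ maxIndex ] < arr[ i ] ){
            maxIndex = i;
        }
    }
    return arr[ maxIndex ];   // reference to an element of the caller's vector
}

void demoFindMax( )
{
    std::vector<std::string> arr{ "pear", "zebra", "apple" };

    auto copy = findMax( arr );      // std::string: a copy is made
    auto & alias = findMax( arr );   // const std::string &: no copy

    assert( copy == "zebra" );
    assert( &alias == &arr[ 1 ] );   // alias is the element itself
    assert( &copy != &arr[ 1 ] );    // copy lives elsewhere
}
```

If findMax returned by value instead, the line with auto & would not compile, because a non-constant lvalue reference cannot bind to a temporary.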

Parameter passing

Many languages (Java included) pass all parameters using call-by-value: the actual argument is copied into the formal parameter.

However, in C++ there are three ways to pass arguments (four, but it is not important).

To understand why call-by-value alone is not sufficient, we will take three example functions:

double average(double a, double b);     // return the average between a and b
void swap(double a, double b);          // swap both numbers;   Wrong parameter type
string randomItem(vector<string> arr);  // returns a random item from the vector; Inefficient

Activity

Can you tell me why swap will not work?
Why is randomItem inefficient?

average

The average function is perfectly fine. It can calculate the average and return it.

a and b are copies, so whatever the function does to them cannot affect the caller’s variables, which makes this the safe choice.

swap

This function has a big issue.

It doesn’t return anything, and a and b are copies, so even if we swap them inside the function, they will not be swapped outside the function.

What we need is for these variables to be passed as references.

void swap(double &a, double &b);

This is called a call-by-reference (or call-by-lvalue-reference).
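The difference between the two versions can be sketched as follows. The names swapByValue, swapByReference and demoSwap are hypothetical, chosen to avoid clashing with std::swap.

```cpp
#include <cassert>

// Call-by-value: a and b are copies of the caller's variables.
void swapByValue( double a, double b )
{
    double tmp = a;
    a = b;
    b = tmp;        // only the local copies are swapped
}

// Call-by-reference: a and b are aliases of the caller's variables.
void swapByReference( double & a, double & b )
{
    double tmp = a;
    a = b;
    b = tmp;
}

void demoSwap( )
{
    double x = 1.0, y = 2.0;

    swapByValue( x, y );
    assert( x == 1.0 && y == 2.0 );   // nothing changed for the caller

    swapByReference( x, y );
    assert( x == 2.0 && y == 1.0 );   // the caller's variables are swapped
}
```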

randomItem

This function works, but it is very inefficient.

Suppose that arr is a very long vector.

The vector is copied.
The copy is sent to the function.
The function uses a random generator to generate a number between 0 and arr.size()-1.
It then returns the string at this index.

Copying an entire vector just to read one of its items is very expensive. A reference works just as well, without the added memory consumption.

The optimized function is:

string randomItem(const vector<string> &arr);

Activity

Can you tell me why we use const?

This is called call-by-constant-reference.
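One possible body for randomItem is sketched below, assuming a non-empty vector and the standard <random> facilities; the notes do not say which random generator is intended, so that choice is an assumption. demoRandomItem is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Sketch only; assumes arr is not empty.
std::string randomItem( const std::vector<std::string> & arr )
{
    static std::random_device seed;
    static std::mt19937 generator( seed( ) );

    // Index chosen uniformly between 0 and arr.size( ) - 1, both included.
    std::uniform_int_distribution<std::size_t> pick( 0, arr.size( ) - 1 );

    // arr.push_back( "x" );   // would not compile: arr is const
    return arr[ pick( generator ) ];
}

void demoRandomItem( )
{
    const std::vector<std::string> arr{ "a", "b", "c" };
    std::string item = randomItem( arr );
    assert( item == "a" || item == "b" || item == "c" );
}
```

Because the parameter is a constant reference, the function cannot modify the caller's vector by accident, and it also accepts vectors that are themselves declared const.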

Returning Values

The classic way to return a value from a function is called return-by-value.

double average(double a, double b){
    return (a+b)/2;
}

However, it creates the same issue with large objects: returning an existing object, such as an element of a vector, copies it.

Instead, we can use return-by-constant-reference:

LargeType randomItem1( const vector<LargeType> & arr){
    return arr[randomInt(0, arr.size()-1)];
}

const LargeType& randomItem2( const vector<LargeType> & arr){
    return arr[randomInt(0, arr.size()-1)];
}

vector<LargeType> vec;
/**
 * ...
 */

LargeType item1 = randomItem1(vec);            // copy
LargeType item2 = randomItem2(vec);            // copy
const LargeType &item3 = randomItem2(vec);     // no copy
auto &item4 = randomItem2(vec);                // no copy

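This has a drawback too: item3 cannot be modified, and the reference is only valid as long as the object it refers to (here, an element of vec) still exists.

The last one is return-by-reference, in which you just return a (non-constant) reference to an object. It is only used in specific contexts.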
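As a sketch of one such context, a function can return a non-constant reference to an element so that the caller can modify it in place. The function largest, which assumes a non-empty vector, and demoReturnByReference are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper; assumes arr is not empty.
int & largest( std::vector<int> & arr )
{
    std::size_t maxIndex = 0;
    for( std::size_t i = 1; i < arr.size( ); ++i ){
        if( arr[ maxIndex ] < arr[ i ] ){
            maxIndex = i;
        }
    }
    return arr[ maxIndex ];   // refers to the caller's element, not a local
}

void demoReturnByReference( )
{
    std::vector<int> arr{ 4, 9, 2 };

    largest( arr ) = 0;             // writes directly into arr[ 1 ]
    assert( arr[ 1 ] == 0 );

    int & alias = largest( arr );   // the largest is now arr[ 0 ]
    alias++;
    assert( arr[ 0 ] == 5 );
}
```

A function must never return a reference to one of its own local variables, since that variable no longer exists once the function returns.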

The Big-Five

In C++, classes come with five special functions that, by default, are already implemented for you:

destructor
copy constructor
move constructor
copy assignment operator
move assignment operator

Destructor

This one is simple. When called, the destructor frees any resources acquired during the use of the object.

This means calling delete for every corresponding new, closing files, etc.

The destructor is called when the object goes out of scope, or when delete is applied to a pointer to it.

Copy Constructor and Move Constructor

There are two special constructors that are required to construct a new object, initialized to the same state as another object of the same type:

copy constructor: if the existing object is an lvalue
move constructor: if the existing object is an rvalue

Concretely it allows us to do the following:

Int B = C;      // Copy construct if C is lvalue; Move construct if C is rvalue
Int B{C};       // Copy construct if C is lvalue; Move construct if C is rvalue

Do not confuse it with:

B = C; // Assignment operator

A copy or move constructor is also called when:

an object is passed using call-by-value instead of & or const &
an object is returned by value instead of & or const &

Copy Assignment and Move Assignment (operator=)

The assignment operator is called when = is applied to two objects that have both been constructed.

If we are doing the following:

B = C;

We intend to copy C to B.

If C is an lvalue, then the copy assignment operator is used. Otherwise, if C is an rvalue, it is the move assignment operator that is used.
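To tell construction apart from assignment, and copy apart from move, the sketch below uses a hypothetical class Tracer that records which special function touched it last. It relies on std::move from <utility>, which is not covered in these notes: it simply casts its argument to an rvalue.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class that remembers which special function ran last.
class Tracer
{
  public:
    Tracer( ) : lastCall{ "default constructor" } { }
    Tracer( const Tracer & ) : lastCall{ "copy constructor" } { }
    Tracer( Tracer && ) : lastCall{ "move constructor" } { }

    Tracer & operator= ( const Tracer & )
    {
        lastCall = "copy assignment";
        return *this;
    }

    Tracer & operator= ( Tracer && )
    {
        lastCall = "move assignment";
        return *this;
    }

    std::string lastCall;
};

void demoTracer( )
{
    Tracer c;
    Tracer b = c;                 // new object, c is an lvalue
    assert( b.lastCall == "copy constructor" );

    Tracer d = std::move( c );    // new object, std::move( c ) is an rvalue
    assert( d.lastCall == "move constructor" );

    b = d;                        // b already exists, d is an lvalue
    assert( b.lastCall == "copy assignment" );

    b = Tracer{ };                // b already exists, the temporary is an rvalue
    assert( b.lastCall == "move assignment" );
}
```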

Default

By default, these functions work very well, and you don’t need to change anything.

The main issue comes when you have pointers inside the class.

The default destructor does not delete what the pointers point to.
The default copy operations copy the pointer value itself, so both objects end up pointing at the same data; it is not a real (deep) copy.
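The problem with the defaults can be sketched with a hypothetical struct Shallow that holds a raw pointer and defines none of the five functions itself.

```cpp
#include <cassert>

// Hypothetical type that keeps all the default special functions.
struct Shallow
{
    int *value;
};

void demoShallowCopy( )
{
    Shallow a{ new int{ 1 } };
    Shallow b = a;                  // default copy: copies the pointer only

    assert( a.value == b.value );   // both point at the same int
    *b.value = 2;
    assert( *a.value == 2 );        // changing b also changed a

    delete a.value;   // the default destructor would not do this for us;
                      // deleting b.value as well would be a double delete
}
```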

Example

The signatures of these methods for the class Int are given below:

~Int();                             // Destructor
Int(const Int &rhs);                // Copy constructor
Int(Int &&rhs);                     // Move constructor
Int &operator= (const Int &rhs);    // Copy assignment
Int &operator= (Int &&rhs);         // Move assignment

Important

If you modify one of these methods, it is advised to modify all of them.

Pointer Exceptions

Consider a version of our class, named IntCell below, in which the data member is a pointer.

We need to change the implementation of the Big-Five.

#include <iostream>
#include <utility>   // std::swap
using namespace std;


class IntCell
{
  public:
    explicit IntCell( int initialValue = 0 )
    { 
      storedValue = new int{ initialValue }; 
    }
    
    ~IntCell( )
    {
      delete storedValue;
    }

    IntCell( const IntCell & rhs )
    { 
      storedValue = new int{ *rhs.storedValue };
    }

    IntCell( IntCell && rhs ) : storedValue{ rhs.storedValue }
    {
      rhs.storedValue = nullptr;
    }
    
    IntCell & operator= ( const IntCell & rhs )
    {
      if( this != & rhs ){
        *storedValue = *rhs.storedValue;
      }
      return *this;
    }
    
    IntCell & operator= ( IntCell && rhs )
    {
      std::swap( storedValue, rhs.storedValue );
      return *this;
    }
    
    int read( ) const
    {
      return *storedValue;
    }
    void write( int x )
    {
      *storedValue = x;
    }
    
  private:
    int *storedValue;
};
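The sketch below exercises these five functions. It repeats a condensed copy of IntCell so that it compiles on its own; demoIntCell is a hypothetical name, and the asserts state what a deep copy should guarantee.

```cpp
#include <cassert>
#include <utility>

// Condensed copy of the IntCell class above.
class IntCell
{
  public:
    explicit IntCell( int initialValue = 0 )
      : storedValue{ new int{ initialValue } } { }
    ~IntCell( ) { delete storedValue; }

    IntCell( const IntCell & rhs )
      : storedValue{ new int{ *rhs.storedValue } } { }
    IntCell( IntCell && rhs ) : storedValue{ rhs.storedValue }
    {
        rhs.storedValue = nullptr;
    }

    IntCell & operator= ( const IntCell & rhs )
    {
        if( this != &rhs ){
            *storedValue = *rhs.storedValue;
        }
        return *this;
    }

    IntCell & operator= ( IntCell && rhs )
    {
        std::swap( storedValue, rhs.storedValue );
        return *this;
    }

    int read( ) const { return *storedValue; }
    void write( int x ) { *storedValue = x; }

  private:
    int *storedValue;
};

void demoIntCell( )
{
    IntCell a{ 1 };
    IntCell b = a;              // copy constructor: b gets its own int
    b.write( 2 );
    assert( a.read( ) == 1 );   // a is untouched: this is a deep copy
    assert( b.read( ) == 2 );

    IntCell c{ 3 };
    c = a;                      // copy assignment: copies the value
    a.write( 9 );
    assert( c.read( ) == 1 );

    IntCell d = std::move( b ); // move constructor: d takes over b's int
    assert( d.read( ) == 2 );
    // b must not be read any more: its pointer is now nullptr
}
```

When demoIntCell returns, each destructor deletes its own int exactly once; deleting the nullptr left in b is harmless.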