Neural Network From Scratch with C++

A neural network is a computational model loosely inspired by how the human brain makes decisions. Neural networks consist of layers of nodes, or neurons: an input layer, one or more hidden layers, and an output layer. These layers process and learn from data.

The Architecture Behind Neural Networks

The key components of a neural network are the Inputs, Weights, Biases, the Transfer and Activation functions, and the Layers of nodes: the Input, Hidden, and Output layers.

[Figure: the layers of a neural network. Photo from ujjwalkarn.me]

We will use this basic architecture in our examples, training the model on the iris dataset to classify iris flowers. Each node in the input layer receives inputs, which are then combined linearly using weights and biases before a non-linear activation function is applied.

By adjusting the Weight and Bias through backpropagation, we can model the relationships between our input data and the target predictions.
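To make that linear combination concrete, here is a minimal sketch of a single neuron’s forward step. The names here are illustrative only and are not part of the classes we define below:

#include <cmath>
#include <cstddef>
#include <vector>

// A minimal sketch of one neuron: combine the inputs linearly with
// weights and a bias, then pass the sum through a non-linear activation.
double neuronOutput(const std::vector<double> &inputs,
                    const std::vector<double> &weights,
                    double bias) {
    double z = bias;                        // start from the bias term
    for (std::size_t i = 0; i < inputs.size(); ++i)
        z += weights[i] * inputs[i];        // add the weighted inputs
    return 1.0 / (1.0 + std::exp(-z));      // sigmoid activation (defined later)
}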

Peeking at the Dataset

Suppose our dataset looks like this:

sepal_length | sepal_width | petal_length | petal_width | setosa | virginica | versicolor
0.0833333333333 | 0.666666666667 | 0.0 | 0.0416666666667 | 1.0 | 0.0 | 0.0
0.722222222222 | 0.458333333333 | 0.694915254237 | 0.916666666667 | 0.0 | 1.0 | 0.0
0.666666666667 | 0.416666666667 | 0.677966101695 | 0.666666666667 | 0.0 | 0.0 | 1.0
0.777777777778 | 0.416666666667 | 0.830508474576 | 0.833333333333 | 0.0 | 1.0 | 0.0
0.666666666667 | 0.458333333333 | 0.779661016949 | 0.958333333333 | 0.0 | 1.0 | 0.0
0.388888888889 | 0.416666666667 | 0.542372881356 | 0.458333333333 | 0.0 | 0.0 | 1.0
0.666666666667 | 0.541666666667 | 0.796610169492 | 0.833333333333 | 0.0 | 1.0 | 0.0

We have sepal_length, sepal_width, petal_length, and petal_width as our input nodes, and setosa, virginica, and versicolor as our output nodes.
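Note that these values are not raw centimeter measurements; each feature appears to have been min-max normalized into the range [0, 1], so all features sit on a comparable scale:

x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}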

For our calculations, we’ll convert this dataset into a matrix format, which will look something like this:

Input Matrix:

\begin{bmatrix} sl_1 & sw_1 & pl_1 & pw_1 \\ sl_2 & sw_2 & pl_2 & pw_2 \\ sl_3 & sw_3 & pl_3 & pw_3 \\ sl_4 & sw_4 & pl_4 & pw_4 \\ sl_5 & sw_5 & pl_5 & pw_5 \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}

Output Matrix:

\begin{bmatrix} se_1 & vi_1 & vc_1 \\ se_2 & vi_2 & vc_2 \\ se_3 & vi_3 & vc_3 \\ se_4 & vi_4 & vc_4 \\ se_5 & vi_5 & vc_5 \\ \vdots & \vdots & \vdots \end{bmatrix}
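As a rough standalone sketch of how those two matrices could be built (the comma-separated file layout and the filename are assumptions based on the table above; in the classes below, reading the dataset will be the job of Axon::seedData):

#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Read the seven-column dataset into a 4-column input matrix and a
// 3-column one-hot output matrix. Assumes comma-separated rows with
// no header line.
void loadIris(const std::string &filename,
              std::vector<std::vector<double>> &inputs,
              std::vector<std::vector<double>> &outputs) {
    std::ifstream file(filename);
    std::string line;
    while (std::getline(file, line)) {
        std::stringstream row(line);
        std::string cell;
        std::vector<double> values;
        while (std::getline(row, cell, ','))
            values.push_back(std::stod(cell));
        if (values.size() != 7) continue;    // skip malformed rows
        inputs.push_back({values[0], values[1], values[2], values[3]});
        outputs.push_back({values[4], values[5], values[6]});
    }
}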

Defining Functions and Types

Let’s define two classes: the Axon class, which will serve as a container for our neuron layers, and the Neuron class, which will act as the foundation for our neural network.

#include <string>
#include <vector>

class Axon {
    private:
        Node inputNode;                      // the input node,
        int inputNodeLength;                 // and the length of that node

        Node outputNode;                     // the output node,
        int outputNodeLength;                // and the length of the last node

        std::vector<Cell> isolatedCells;     // these are the hidden layers

    public:
        Axon();
        Axon(int inputNodeLength, int outputNodeLength);

        void seedData(std::string filename); // reads the dataset from a file
        void addCell(int nodes);             // hyperparameter: adds a hidden layer with the given number of nodes

        ~Axon();
};
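The Node and Cell types are not shown in this part. As a placeholder, here is a minimal sketch of what they might hold; the member names are assumptions, not the definitions used later:

#include <vector>

// Hypothetical sketch: a Node as a layer's activations plus the weights
// and biases feeding it, and a Cell as one hidden layer.
struct Node {
    std::vector<double> values;                  // activations of this layer
    std::vector<std::vector<double>> weights;    // incoming connection weights
    std::vector<double> biases;                  // one bias per node
};

struct Cell {
    Node node;          // the hidden layer itself
    int  nodeLength;    // number of nodes in the layer
};

With those in place, wiring up the network for the iris data could look like this (the filename and layer size are illustrative):

Axon axon(4, 3);             // 4 input features, 3 output classes
axon.seedData("iris.csv");   // filename is an assumption
axon.addCell(5);             // one hidden layer with 5 nodes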

In the Neuron class, we’ll have epoch and learningRate properties. These are hyperparameters used to specify the details of the learning process. In more advanced models, such hyperparameters can be tuned to optimize the results of training.

Next, we’ll define the feedForward, backProp, and linkAdjustment class methods, which will later be implemented to adjust the Weight and Bias of the nodes.

class Neuron {
    private:
        int   epoch        = 0;      // hyperparameter: the number of iterations the training will run
        float learningRate = 0.0f;   // hyperparameter: the step size of each iteration while moving toward a minimum of the loss function
        Axon  *learningAxon;         // the network being trained

        void feedForward(Node primalNode, std::vector<Cell> &isolatedCell);
        void backProp(std::vector<Cell> &isolatedCell);
        void linkAdjustment(std::vector<Cell> &isolatedCell);

    public:
        Neuron();
        Neuron(int epoch, float learningRate);

        ~Neuron();
};
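Although the implementations come in the next part, linkAdjustment is where the usual gradient-descent update would be applied, with the learningRate \eta scaling each step:

w \leftarrow w - \eta \frac{\partial E}{\partial w}, \qquad b \leftarrow b - \eta \frac{\partial E}{\partial b}

where E is the loss function being minimized.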

We also need to define our activation function, which is applied during the forward pass, along with its derivative, which we will use during backpropagation. There are many choices of activation function, but here we are going to use the sigmoid.

#include <cmath>

inline double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// Derivative of the sigmoid, written in terms of the sigmoid itself.
inline double sigmoidPrime(double x) {
    return sigmoid(x) * (1.0 - sigmoid(x));
}
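A quick sanity check of these helpers; the values in the comments follow directly from the formulas:

#include <iostream>

int main() {
    std::cout << sigmoid(0.0)      << "\n";   // 0.5: the midpoint of the curve
    std::cout << sigmoidPrime(0.0) << "\n";   // 0.25: the gradient is steepest at 0
    std::cout << sigmoid(10.0)     << "\n";   // ~0.99995: saturates toward 1
    return 0;
}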

This function’s primary role is to map input values to a range between 0 and 1, which is particularly useful in supervised classification problems where the output can be interpreted as probabilities. The sigmoid function is widely favored due to its smooth gradient, which aids in gradient-based optimization methods.

The probabilistic interpretation of the sigmoid output is especially advantageous in binary classification tasks, where it can represent the likelihood of a given input belonging to a specific class.

And that’s all for now; we’ll continue in the next part.
