In simple terms, a neural network is a computer model designed after the human nervous system and brain. However, a neural network in Python can just as easily be described without the human analogy: you can treat it as a mathematical function that maps an input to an output.
You might wonder, though: why build a neural network from scratch?
Neural network libraries for Python, such as PyBrain, are available and could be used. However, implementing a network yourself is the best way to understand how it works.
Before getting started, note that the code used here is meant for understanding only. Now, let's get to work.
1. Generating A Dataset
You need a dataset before you can start building the neural network in Python. You can generate it yourself or use a tool or library that ships with built-in datasets. Take two classes of points spread over the x and y axes and plot them. The two classes could stand for anything, for example medical data or business data.
We are focusing on machine learning classifiers that predict the class of a point from its x and y coordinates. Here, however, the data cannot be separated by a straight line, which limits the use of linear classifiers. This means that you cannot use logistic regression unless you add non-linear features by hand.
This is the plus point of neural networks: there is no need to hand-craft features such as polynomials. The hidden layer of the network learns useful features on its own.
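As a rough illustration of such a dataset, the sketch below generates two classes of points that no straight line can separate. The ring layout, the helper name `make_rings`, and its parameters are assumptions made for this example only; any non-linearly-separable dataset would do.

```python
import numpy as np

def make_rings(n_per_class=100, noise=0.1, seed=0):
    """Return points X of shape (2n, 2) and labels y of shape (2n, 1)."""
    rng = np.random.default_rng(seed)
    # Random angles for every point in both classes.
    angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n_per_class)
    # Class 0 lies near radius 1, class 1 near radius 3.
    radii = np.concatenate([np.full(n_per_class, 1.0),
                            np.full(n_per_class, 3.0)])
    radii = radii + rng.normal(0.0, noise, size=2 * n_per_class)
    # Convert polar coordinates to x and y.
    X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
    y = np.concatenate([np.zeros(n_per_class),
                        np.ones(n_per_class)]).reshape(-1, 1)
    return X, y

X, y = make_rings()
print(X.shape, y.shape)  # (200, 2) (200, 1)
```

Because one class surrounds the other, a linear classifier cannot split them, which is exactly the situation described above.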
2. Training A Neural Network
Next, create a 3-layer neural network with an input layer, a hidden layer, and an output layer. Since the data has two dimensions, x and y, the input layer has 2 nodes (with more dimensions, you need correspondingly more input nodes).
The number of classes determines the number of nodes in the output layer. Since our class takes two values (0 and 1), the output layer will have 2 nodes, one per class. (The simplified code further down uses a single sigmoid output node instead, which also works for two classes.) You can extend the network to more classes later.
The dimensionality of the hidden layer, however, is up to you. You can use as many nodes as you want. More nodes let the network fit more complex functions, but they come at a cost:
- Making predictions and learning the network parameters both require more computation.
- The network becomes more prone to overfitting the data.
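To get a feel for the first cost, the following sketch counts the parameters of a 2-input, 2-output network for several hidden layer sizes. The function name `count_parameters` and the sizes tried are illustrative choices, not part of the original code.

```python
def count_parameters(n_input, n_hidden, n_output):
    # One weight per connection between adjacent layers.
    weights = n_input * n_hidden + n_hidden * n_output
    # One bias per hidden node and per output node.
    biases = n_hidden + n_output
    return weights + biases

for n_hidden in (1, 3, 5, 20, 50):
    print(n_hidden, count_parameters(2, n_hidden, 2))
```

The count grows linearly with the hidden layer size here, and every extra parameter has to be both stored and learned.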
Then you will need an activation function for the hidden layer, which transforms the layer's inputs into its outputs. The most commonly used activation functions are sigmoid, tanh, and ReLU.
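For reference, the three activation functions can be written in a few lines of NumPy. This is a minimal sketch; the function names are simply chosen for this example.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes any real number into the range (-1, 1).
    return np.tanh(z)

def relu(z):
    # Passes positive values through and clips negatives to 0.
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))
print(tanh(z))
print(relu(z))
```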
Creating a class:
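    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class NeuralNetwork:
        def __init__(self, x, y):
            self.input = x
            # Random initial weights: input -> 4 hidden nodes -> 1 output node.
            self.weights1 = np.random.rand(self.input.shape[1], 4)
            self.weights2 = np.random.rand(4, 1)
            self.y = y
            self.output = np.zeros(y.shape)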
Here x is the input layer and ŷ is the output layer, with one hidden layer in between whose weights start at random values. The weights and biases are denoted by W and b respectively. Note that the code above uses 4 hidden nodes and leaves out the biases to keep things simple. The sigmoid function is the activation function.
The output of the neural network will become:
ŷ = σ(W2 σ(W1 x + b1) + b2)
As you can see in the equation, W and b determine the output ŷ. The right values of W and b are what make the predictions accurate. This fine-tuning of W and b from the data is known as training the neural network.
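The output equation translates almost directly into code. The sketch below keeps the biases b1 and b2 and treats x as a single input vector; the function name `predict`, the layer sizes, and the random starting values are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, W1, b1, W2, b2):
    # Hidden layer: sigma(W1 x + b1)
    hidden = sigmoid(W1 @ x + b1)
    # Output layer: sigma(W2 hidden + b2)
    return sigmoid(W2 @ hidden + b2)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)  # 2 inputs -> 4 hidden nodes
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # 4 hidden nodes -> 1 output
y_hat = predict(np.array([0.5, -1.2]), W1, b1, W2, b2)
print(y_hat)
```

With untrained random weights the printed value is meaningless; training is what moves W and b toward useful values.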
Each training iteration consists of two steps:
- Feedforward: calculating the predicted value ŷ.
- Backpropagation: updating the values of W and b.
Add the feedforward method to the class above:
        def feedforward(self):
            # Biases are left out (treated as 0) to keep the example simple.
            self.layer1 = sigmoid(np.dot(self.input, self.weights1))
            self.output = sigmoid(np.dot(self.layer1, self.weights2))
After this, you need a way to judge how good the predictions are, and for that you need a loss function. The goal of training is to minimize the value of the loss function. Here we use the sum-of-squares error as the loss function.
The sum-of-squares error is the sum, over all samples, of the squared difference between the predicted and the actual value.
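In code, the sum-of-squares error is a one-liner. The sample values below are made up purely to show the calculation.

```python
import numpy as np

def sum_of_squares(y, y_hat):
    # Square each difference, then add everything up.
    return np.sum((y - y_hat) ** 2)

y = np.array([[0.0], [1.0], [1.0], [0.0]])      # actual values
y_hat = np.array([[0.1], [0.8], [0.6], [0.3]])  # made-up predictions
print(sum_of_squares(y, y_hat))  # approximately 0.30
```

A perfect prediction gives a loss of 0, and the loss grows quickly as predictions drift away from the actual values.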
Gradient Descent:
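Then comes backpropagation. Once you have measured the error with the loss function, you need to propagate it back in order to update W and b. To do so, take the derivative of the loss function with respect to W and b. The derivative tells you in which direction to adjust each parameter, and repeatedly stepping against it is known as gradient descent.
But the loss function does not contain W and b directly, so you cannot differentiate with respect to them in one step. You need the chain rule to break the calculation into simple pieces.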
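Writing z = Wx + b for the input to the sigmoid, so that ŷ = σ(z), the loss is:

$$\mathrm{Loss}(y, \hat{y}) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

For a single sample, the chain rule gives:

$$\frac{\partial \mathrm{Loss}(y, \hat{y})}{\partial W} = \frac{\partial \mathrm{Loss}(y, \hat{y})}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z} \cdot \frac{\partial z}{\partial W}$$

$$= -2(y - \hat{y}) \cdot \sigma'(z) \cdot x$$

$$= -2(y - \hat{y}) \cdot \hat{y}(1 - \hat{y}) \cdot x$$

The last step uses the derivative of the sigmoid function, σ'(z) = σ(z)(1 - σ(z)) = ŷ(1 - ŷ). Gradient descent moves W against this derivative, so each update adds a multiple of 2(y - ŷ) · ŷ(1 - ŷ) · x to W.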
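One way to convince yourself that the chain rule result is right is to compare it with a numerical estimate of the derivative. The sketch below does this for a single sigmoid neuron with ŷ = σ(wx + b); the derivative of (y - ŷ)² with respect to ŷ is -2(y - ŷ), which is where the minus sign in the code comes from. All values and function names are arbitrary choices for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, b, x, y):
    # Squared error of a single sigmoid neuron on one sample.
    return (y - sigmoid(w * x + b)) ** 2

def analytic_grad_w(w, b, x, y):
    # Chain rule: dLoss/dy_hat * dy_hat/dz * dz/dw
    y_hat = sigmoid(w * x + b)
    return -2.0 * (y - y_hat) * y_hat * (1.0 - y_hat) * x

w, b, x, y = 0.3, -0.1, 1.5, 1.0
eps = 1e-6
# Central finite difference as an independent estimate of dLoss/dw.
numeric = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
analytic = analytic_grad_w(w, b, x, y)
print(numeric, analytic)
```

The two printed numbers should agree to several decimal places. The derivative is negative here because the prediction is below the target, so increasing w reduces the loss.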
Now simply add the backpropagation code to the class to complete the network.
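A minimal sketch of what the completed class might look like is shown here. The `backprop` and `loss` methods, the `seed` argument, the XOR-style toy data (with a constant third column standing in for the missing hidden-layer bias), and the 1500 iterations are all assumptions made for this illustration, not a definitive implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(s):
    # Derivative of the sigmoid, written in terms of its output s = sigmoid(z).
    return s * (1.0 - s)

class NeuralNetwork:
    def __init__(self, x, y, seed=0):
        rng = np.random.default_rng(seed)  # seeded for repeatable runs
        self.input = x
        self.weights1 = rng.random((self.input.shape[1], 4))
        self.weights2 = rng.random((4, 1))
        self.y = y
        self.output = np.zeros(y.shape)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        # Negative gradient of the sum-of-squares loss at the output layer.
        delta2 = 2 * (self.y - self.output) * sigmoid_derivative(self.output)
        # Propagate it back through weights2 to the hidden layer.
        delta1 = np.dot(delta2, self.weights2.T) * sigmoid_derivative(self.layer1)
        # Adding the negative gradient is a gradient descent step
        # with a learning rate of 1.
        self.weights2 += np.dot(self.layer1.T, delta2)
        self.weights1 += np.dot(self.input.T, delta1)

    def loss(self):
        return np.sum((self.y - self.output) ** 2)

# XOR of the first two columns: not separable by a straight line.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

nn = NeuralNetwork(X, y)
nn.feedforward()
initial_loss = nn.loss()
for _ in range(1500):
    nn.feedforward()
    nn.backprop()
nn.feedforward()
print(initial_loss, nn.loss())
print(nn.output)
```

The loss after training should be lower than the initial loss, and the outputs should move toward the target values. How close they get depends on the starting weights and the number of iterations.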
This looks complicated; however, once you start writing the code, the terms and the basics fall into place. Don't worry if the predictions do not exactly match the actual values. A slight difference is expected, and it helps to avoid overfitting, so the network generalizes better to unseen data.