Wednesday, December 7, 2022

Output Layer of a Neural Network for Regression and Classification Problems

Regression output layer:

When developing a neural network to solve a regression problem, the output layer should have exactly one node. Here we are not trying to map inputs to a variety of class labels, but rather trying to predict a single continuous target value for each sample. Therefore, our network should have one output node to return one – and exactly one – output prediction for each sample in our dataset.

The activation function for a regression problem is linear. In Keras this can be specified with activation='linear', or simply left unspecified, since the default activation=None also applies no activation (i.e., linear).

Linear activation function: The linear activation function, also known as "no activation" or the "identity function" (the input is multiplied by 1.0), is one where the activation is proportional to the input. The function does nothing to the weighted sum of the input; it simply returns the value it was given.
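
As a minimal Keras sketch of a regression network (the hidden-layer sizes and the 10-feature input shape are illustrative assumptions, not taken from the post), the output layer is a single node with linear activation:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(10,)),                # illustrative: 10 input features
    layers.Dense(64, activation="relu"),     # illustrative hidden layer
    # Output layer: exactly one node with linear activation (also the
    # behavior you get if activation is left unspecified), returning one
    # continuous prediction per sample.
    layers.Dense(1, activation="linear"),
])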

Evaluation metrics for regression: Mean Squared Error (MSE) is the most commonly used loss function; other available options are listed below.

  • Root Mean Squared Error (RMSE) – a good option if you’d like the error to be in the same units as the target variable
  • Mean Absolute Error (MAE) – useful for when you need an error that scales linearly
  • Median Absolute Error (MdAE) – de-emphasizes outliers
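
A sketch of compiling the regression model from the earlier snippet with MSE as the loss, tracking RMSE and MAE as metrics (median absolute error is not a built-in Keras metric, so it would need scikit-learn or a custom metric):

from tensorflow import keras

model.compile(
    optimizer="adam",
    loss="mse",                                      # Mean Squared Error
    metrics=[keras.metrics.RootMeanSquaredError(),   # RMSE, same units as the target
             "mae"],                                 # Mean Absolute Error
)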

Classification output layer:

If your target resides in a single vector of 0s and 1s, the number of output nodes in your neural network should be 1 and the activation function used on the final layer should be sigmoid. On the other hand, if your target is a matrix of one-hot-encoded vectors, your output layer should have one node per class (2 for binary classification) and the activation function on the final layer should be softmax. For binary classification, the last layer with its single sigmoid node effectively acts as a logistic regression deciding the class output.

Example: if Y has category values of (yes, no), then one-hot encoding gives 2 columns, encoded_yes and encoded_no. In this case the output layer needs 2 neurons, one for each encoded column.
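
Below is a sketch of the two binary-classification output-layer options described above (the hidden-layer size and the 10-feature input shape are illustrative assumptions):

from tensorflow import keras
from tensorflow.keras import layers

# Option 1: target is a single vector of 0s and 1s -> one sigmoid node.
binary_model = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Option 2: target is one-hot encoded (encoded_yes, encoded_no) -> two
# softmax nodes, one per encoded column.
onehot_model = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(2, activation="softmax"),
])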

Evaluation metrics for Classification:

The loss function used for binary classification problems is determined by the data format as well. When dealing with a single target vector of 0s and 1s, you should use BinaryCrossentropy as the loss function. When your target variable is stored as One-Hot-Encoded values, you should use the CategoricalCrossentropy loss function.
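
A short sketch of matching the loss function to the target format, assuming the binary_model and onehot_model defined above:

from tensorflow import keras

# Single target vector of 0s and 1s -> BinaryCrossentropy.
binary_model.compile(
    optimizer="adam",
    loss=keras.losses.BinaryCrossentropy(),
    metrics=["accuracy"],
)

# One-hot-encoded target matrix -> CategoricalCrossentropy.
onehot_model.compile(
    optimizer="adam",
    loss=keras.losses.CategoricalCrossentropy(),
    metrics=["accuracy"],
)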

Reference:

https://www.enthought.com/blog/neural-network-output-layer/
