Method to get the shape of a TensorFlow element
Saturday, March 11, 2023
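A minimal sketch of the usual ways to get a tensor's shape, assuming TensorFlow 2.x (the tensor value here is illustrative):

```python
import tensorflow as tf

t = tf.constant([[1, 2, 3], [4, 5, 6]])

# Static shape, known at graph-construction time (a TensorShape object)
print(t.shape)            # (2, 3)
print(t.shape.as_list())  # [2, 3]

# Dynamic shape, evaluated at runtime (a tensor of dimension sizes)
print(tf.shape(t).numpy())  # [2 3]
```

The static `t.shape` can contain `None` for dimensions not known until runtime (e.g. the batch dimension), while `tf.shape(t)` always gives the concrete sizes.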
Tensorflow general methods
Saturday, March 4, 2023
Dropout
- Dropout is a regularization technique.
- You only use dropout during training. Don't use dropout (randomly eliminate nodes) during test time.
- Apply dropout both during forward and backward propagation.
- During training time, divide each dropout layer by keep_prob to keep the same expected value for the activations. For example, if keep_prob is 0.5, then we will on average shut down half the nodes, so the output will be scaled by 0.5 since only the remaining half are contributing to the solution. Dividing by 0.5 is equivalent to multiplying by 2. Hence, the output now has the same expected value. You can check that this works even when keep_prob is other values than 0.5.
Wednesday, March 1, 2023
Deep Learning methodology using gradient descent
Usual Deep Learning methodology to build the model:
- Initialize parameters / define hyperparameters
- Loop for num_iterations:
  - Forward propagation
  - Compute the cost function
  - Backward propagation (compute gradients)
  - Update parameters (gradient descent)
- Use the trained parameters to predict on new data
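The methodology above can be sketched as a minimal gradient-descent loop; linear regression on toy data is an illustrative choice, not from the original:

```python
import numpy as np

np.random.seed(1)
X = np.random.rand(100, 1)
Y = 3 * X + 2 + 0.01 * np.random.randn(100, 1)  # toy data: y ≈ 3x + 2

# Initialize parameters / define hyperparameters
w, b = 0.0, 0.0
learning_rate, num_iterations = 0.5, 1000

for i in range(num_iterations):
    # Forward propagation
    Y_hat = w * X + b
    # Compute the cost (mean squared error)
    cost = np.mean((Y_hat - Y) ** 2)
    # Backward propagation: gradients of the cost
    dw = np.mean(2 * (Y_hat - Y) * X)
    db = np.mean(2 * (Y_hat - Y))
    # Update parameters
    w -= learning_rate * dw
    b -= learning_rate * db

print(w, b)  # close to 3 and 2
```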
Sunday, December 18, 2022
Split a data set into train, cross-validation, and test sets
print(f"the shape of the original set (input) is: {x.shape}")
print(f"the shape of the original set (target) is: {y.shape}\n")
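A common sketch for the split itself uses scikit-learn's train_test_split twice; the 60/20/20 ratios and the random data are illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split

x = np.random.rand(100, 5)
y = np.random.rand(100)
print(f"the shape of the original set (input) is: {x.shape}")
print(f"the shape of the original set (target) is: {y.shape}\n")

# First split off 60% for training, then split the remainder 50/50
# into cross-validation and test sets (20% each).
x_train, x_rest, y_train, y_rest = train_test_split(x, y, test_size=0.4, random_state=1)
x_cv, x_test, y_cv, y_test = train_test_split(x_rest, y_rest, test_size=0.5, random_state=1)
print(x_train.shape, x_cv.shape, x_test.shape)  # (60, 5) (20, 5) (20, 5)
```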
Tuesday, December 13, 2022
Epochs and batches
We provide the epoch count while fitting/training the model, as below.
Example: model.fit(X, Y, epochs=100)
In the fit statement above, the number of epochs was set to 100. This specifies that the entire data set should be applied during training 100 times. During training, you see output describing the progress of training that looks like this:
Epoch 1/100
157/157 [==============================] - 0s 1ms/step - loss: 2.2770
The first line, Epoch 1/100, describes which epoch the model is currently running. For efficiency, the training data set is broken into 'batches'. The default size of a batch in Tensorflow is 32. If the model has 5000 training examples (X_train), that works out to roughly 157 batches.
The notation on the second line, 157/157 [====, describes which batch has been executed.
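The batch count can be checked directly: 5000 examples with the default batch size of 32 round up to 157 batches, since the last, partially filled batch still counts.

```python
import math

num_examples = 5000
batch_size = 32  # TensorFlow's default

# Each epoch runs ceil(num_examples / batch_size) batches
num_batches = math.ceil(num_examples / batch_size)
print(num_batches)  # 157
```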
Monday, December 12, 2022
Derivative using python
Libraries for derivative
from sympy import symbols, diff
Let's try this out on a simple function.
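The original doesn't name the function, so J = w² is an illustrative choice here:

```python
from sympy import symbols, diff

w = symbols('w')
J = w ** 2           # illustrative function, J = w^2

dJ_dw = diff(J, w)   # symbolic derivative
print(dJ_dw)              # 2*w
print(dJ_dw.subs(w, 3))   # 6
```

`diff` returns a symbolic expression, and `subs` evaluates it at a concrete point.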
Sunday, December 11, 2022
SparseCategoricalCrossentropy or CategoricalCrossentropy
Tensorflow has two potential formats for target values, and the selection of the loss defines which is expected.
- SparseCategoricalCrossentropy: expects the target to be an integer corresponding to the index. For example, if there are 10 potential target values, y would be between 0 and 9.
- CategoricalCrossentropy: expects the target value of an example to be one-hot encoded, where the value at the target index is 1 while the other N-1 entries are zero. An example with 10 potential target values, where the target is 2, would be [0,0,1,0,0,0,0,0,0,0].
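The two formats can be compared in plain NumPy (a sketch of the math, not the TensorFlow implementation; the probability vector is illustrative): the sparse loss indexes the predicted probability directly, while the categorical loss dots the one-hot vector with the log-probabilities, and both give the same number.

```python
import numpy as np

# Predicted class probabilities for one example (illustrative)
probs = np.array([0.05, 0.05, 0.7, 0.05, 0.05, 0.02, 0.02, 0.02, 0.02, 0.02])

# SparseCategoricalCrossentropy-style target: the class index itself
y_sparse = 2
loss_sparse = -np.log(probs[y_sparse])

# CategoricalCrossentropy-style target: a one-hot vector
y_onehot = np.zeros(10)
y_onehot[2] = 1.0
loss_categorical = -np.sum(y_onehot * np.log(probs))

print(loss_sparse, loss_categorical)  # identical values
```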
Friday, December 9, 2022
Get the output of each layer in Neural network
Let's consider the following simple neural network:
import keras.backend as K
from keras.models import Model
from keras.layers import Input, Dense
import numpy as np

input_layer = Input((10,))
layer_1 = Dense(10)(input_layer)
layer_2 = Dense(20)(layer_1)
layer_3 = Dense(5)(layer_2)
output_layer = Dense(1)(layer_3)
model = Model(inputs=input_layer, outputs=output_layer)

# Some random input; consider this model trained
features = np.random.rand(100, 10)

# With a Keras backend function, get the outputs of all the layers
get_all_layer_outputs = K.function([model.layers[0].input],
                                   [l.output for l in model.layers[0:]])
layer_output = get_all_layer_outputs([features])
# layer_output is a list of all the layers' outputs.
# If the model is trained, the outputs are computed with the trained
# weights; otherwise they are computed with the initial weights.
Wednesday, December 7, 2022
Output layer of Neural network for Regression and classification problems
Regression output layer:
When developing a neural network to solve a regression problem, the output layer should have exactly one node. Here we are not trying to map inputs to a variety of class labels, but rather trying to predict a single continuous target value for each sample. Therefore, our network should have one output node to return one – and exactly one – output prediction for each sample in our dataset.
The activation function for a regression problem will be linear. This can be defined by using activation='linear' or leaving it unspecified to employ the default parameter value activation=None.
Linear activation function: The linear activation function, also known as "no activation" or the "identity function" (output multiplied by 1.0), is one where the activation is proportional to the input. The function doesn't do anything to the weighted sum of the input; it simply passes through the value it was given.
Evaluation metrics for regression: The MSE loss function is the most common choice, with other available options as below.
- Root Mean Squared Error (RMSE) – a good option if you’d like the error to be in the same units as the target variable
- Mean Absolute Error (MAE) – useful for when you need an error that scales linearly
- Median Absolute Error (MdAE) – de-emphasizes outliers
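The three error metrics above, sketched in NumPy on illustrative predictions:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # same units as the target
mae = np.mean(np.abs(y_true - y_pred))           # scales linearly with error
mdae = np.median(np.abs(y_true - y_pred))        # de-emphasizes outliers

print(rmse, mae, mdae)
```

Because MdAE takes the median of the absolute errors, a single very large error barely moves it, while it inflates both RMSE and (to a lesser degree) MAE.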
Classification output layer:
If your data has a target that resides in a single vector of 0s and 1s, the number of output nodes in your neural network will be 1 and the activation function used on the final layer should be sigmoid. On the other hand, if your target is a matrix of one-hot-encoded vectors, your output layer should have as many nodes as there are classes (two for a binary Yes/No target) and the activation function on the final layer should be softmax. Usually for binary classification, the last layer is a single sigmoid node (logistic regression) deciding the class output.
Example: if Y has category values of (Yes, No), then one-hot encoding gives 2 columns, encoded_yes and encoded_no. In these cases we need 2 neurons in the output layer, one per class.
Evaluation metrics for Classification:
The loss function used for binary classification problems is determined by the data format as well. When dealing with a single target vector of 0s and 1s, you should use BinaryCrossentropy as the loss function. When your target variable is stored as One-Hot-Encoded values, you should use the CategoricalCrossentropy loss function.
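The two binary setups compute the same thing; a NumPy sketch (the logit values are illustrative) shows that one sigmoid node and a two-node softmax yield the same class probability:

```python
import numpy as np

# Single-node setup: one logit z with sigmoid activation,
# paired with a 0/1 target and BinaryCrossentropy
z = 1.2
p_sigmoid = 1 / (1 + np.exp(-z))

# Two-node setup: logits (z1, z0) with softmax activation,
# paired with a one-hot target and CategoricalCrossentropy
z1, z0 = 1.2, 0.0  # sigmoid(z) equals softmax over the pair (z, 0)
exps = np.exp([z1, z0])
p_softmax = exps[0] / exps.sum()

print(p_sigmoid, p_softmax)  # identical probabilities
```

So the choice between the two is driven by how the targets are stored, not by modeling power.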
Reference:
https://www.enthought.com/blog/neural-network-output-layer/
Friday, December 2, 2022
Training a sine wave with a feed-forward neural network
Let's create some sample sine-wave data and add some noise to it.
# let's train the model with a feed-forward neural network
Information: One can tune the neural network to any number of hidden layers and any number of neurons per layer.
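A sketch of the whole post under illustrative assumptions (the network size, epoch count, and noise level are not from the original):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Sample sine-wave data with some noise added
np.random.seed(0)
x = np.linspace(0, 2 * np.pi, 500).reshape(-1, 1)
y = np.sin(x) + 0.1 * np.random.randn(*x.shape)

# A small feed-forward network; the hidden-layer sizes are illustrative
model = Sequential([
    Dense(32, activation='relu', input_shape=(1,)),
    Dense(32, activation='relu'),
    Dense(1, activation='linear'),  # regression output: one linear node
])
model.compile(loss='mse', optimizer='adam')
model.fit(x, y, epochs=100, verbose=0)

y_pred = model.predict(x, verbose=0)
print(y_pred.shape)  # (500, 1)
```

Both the number of hidden layers and the neurons per layer can be tuned; wider or deeper networks fit the curve more closely at the risk of fitting the noise.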
Sunday, November 18, 2018
Training a Convolutional Neural Network to detect Digits
import joblib  # sklearn.externals.joblib was removed in newer scikit-learn versions
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
import numpy as np
import pandas as pd
import os
#changes working directory
os.chdir("D:\\Kartheek\\DigitRecog")
(X_train,y_train),(X_test,y_test)=mnist.load_data()
x_train=X_train.reshape(X_train.shape[0],28,28,1).astype('float32')
x_test=X_test.reshape(X_test.shape[0],28,28,1).astype('float32')
print(X_train.shape)
print(X_test.shape)
#os.chdir("D:\\NeuralNetwork")
batch_size = 132
num_classes = 10
epochs = 16
# input image dimensions
img_rows, img_cols = 28, 28
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
#cv2.imwrite('messigray.png',x_train[0])
model = Sequential()
model.add(Conv2D(64, kernel_size=(5, 5),
activation='relu',
input_shape=(28,28,1)))
model.add(Conv2D(128, (5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
Images = pd.read_csv("test.csv")
Images = Images.values
Images.shape
x_predict=Images.reshape(28000,28,28,1)
y_predict=model.predict(x_predict)
Submission = pd.read_csv("submission.csv")
z=y_predict.argmax(axis=1)
Submission["Label"]=z
Submission.to_csv("submission.csv",index=False)
joblib.dump(model, "digits_cls.pkl", compress=3)