Introduction to Neural Networks
By the end of this practical you will know how to:
Open your neuralnets R project.
Open a new R script. Save it as a new file called representation_practical.R in the 2_Code folder.
Using library(), load the packages tidyverse and keras.
# install.packages("tidyverse")
# install.packages("keras")
# Load packages necessary for this exercise
library(tidyverse)
library(keras)
Using source(), load the helper_2.R file in your 2_Code folder.
# Load helper_2.R
source("2_Code/helper_2.R")
Part 1: Fashion
Use readRDS() to load the fashion.RDS dataset as a new object.
# MNIST fashion data
fashion <- readRDS(file = "1_Data/fashion.RDS")
Inspect the contents of the fashion object using str().
# Inspect contents
str(fashion)
# PREPARATIONS -------
# fashion items
fashion_labels <- c('T-shirt/top','Trouser','Pullover','Dress','Coat',
'Sandal','Shirt','Sneaker','Bag','Ankle boot')
# split fashion train
c(fashion_train_images, fashion_train_items) %<-% fashion$train
# split fashion test
c(fashion_test_images, fashion_test_items) %<-% fashion$test
# reshape images
fashion_train_images_serialized <- array_reshape(fashion_train_images, c(nrow(fashion_train_images), 784))
fashion_test_images_serialized <- array_reshape(fashion_test_images, c(nrow(fashion_test_images), 784))
# rescale images
fashion_train_images_serialized <- fashion_train_images_serialized / 255
fashion_test_images_serialized <- fashion_test_images_serialized / 255
# expand criterion
fashion_train_items_onehot <- to_categorical(fashion_train_items, 10)
fashion_test_items_onehot <- to_categorical(fashion_test_items, 10)
# MODELING -------
# initialize deepnet
net <- keras_model_sequential()
# add layer
net %>%
  layer_dense(input_shape = 784, units = 256, activation = "relu") %>%
  layer_dense(units = 144, activation = "relu") %>%
  layer_dense(units = 10, activation = "softmax")
# model information
summary(net)
# loss, optimizers, & metrics
net %>% compile(
  optimizer = 'adam',
  loss = 'categorical_crossentropy',
  metrics = c('accuracy')
)
# fit network
net %>% fit(
  x = fashion_train_images_serialized,
  y = fashion_train_items_onehot,
  epochs = 10
)
# extract weights
weights <- get_weights(net)
Using str(), inspect the structure of the weights object. Do the contents line up with your expectations?
# inspect weights
str(weights)
There are six elements: three contain the weights (elements 1, 3, 5) and three contain the biases (elements 2, 4, 6).
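The expected pattern of weights and biases can be derived from the layer sizes alone. The following is a minimal sketch that assumes the architecture defined above (784 inputs, hidden layers of 256 and 144 units, 10 outputs); the names layer_sizes and expected_shapes are illustrative and not part of the practical.

```r
# Layer sizes taken from the network defined above: input, two hidden layers, output
layer_sizes <- c(784, 256, 144, 10)

# For each pair of consecutive layers, a weight matrix (n_in x n_out)
# is followed by a bias vector (n_out)
n_in  <- head(layer_sizes, -1)
n_out <- tail(layer_sizes, -1)

expected_shapes <- list()
for (i in seq_along(n_in)) {
  expected_shapes[[2 * i - 1]] <- c(n_in[i], n_out[i])  # weights
  expected_shapes[[2 * i]]     <- n_out[i]              # biases
}

str(expected_shapes)

# Total number of trainable parameters implied by these shapes
n_params <- sum(sapply(expected_shapes, prod))
n_params
```

The six entries should mirror what str(weights) reports, and n_params should agree with the parameter count shown by summary(net).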
Use the first element in weights to calculate the activation patterns, also known as the embedding, at the first hidden layer for the first 1,000 fashion items, ignoring the bias and the activation function. This can be done easily using matrix multiplication (%*%).
# calculate first-layer embedding
embedding <- fashion_train_images_serialized[1:1000, ] %*% weights[[1]]
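To see what the matrix multiplication does, here is a small sketch on made-up data. The matrices X and W are toy stand-ins for the image matrix and weights[[1]]; the values are invented for illustration only.

```r
# Toy data: 3 "images" with 4 pixels each (rows = items, columns = pixels)
X <- matrix(c(0, 1, 0, 1,
              1, 0, 1, 0,
              1, 1, 0, 0), nrow = 3, byrow = TRUE)

# Toy weights: 4 input pixels feeding 2 hidden nodes
W <- matrix(c( 0.5, -1.0,
               1.0,  0.0,
              -0.5,  2.0,
               0.0,  1.0), nrow = 4, byrow = TRUE)

# Each row of the result holds the pre-activation values of one item
E <- X %*% W
dim(E)

# One entry computed by hand: item 1, node 1
sum(X[1, ] * W[, 1])
E[1, 1]
```

Each entry of E is the weighted sum of one item's pixels for one node, so the result has one row per item and one column per node.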
Assess the dimensionality of embedding using dim(). Does it have the correct number of rows and columns?
Use the plot_embedding() function, which you loaded earlier when you sourced the helper_2.R file, to visualize the activations. Rows in the plot will be the 1,000 fashion items and columns the 256 nodes of the embedding at the first hidden layer. Looks a bit messy, right?
# plot activation
plot_embedding(embedding)
Extract the first 1,000 fashion items from fashion_train_items and then use those to order the rows in embedding.
# extract fashion items
items <- fashion_train_items[1:1000]
# order activations
embedding <- embedding[order(items), ]
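The reordering step relies on order() returning row indices rather than sorted values. A minimal sketch with invented labels, where toy_items and toy_embedding are illustrative names:

```r
# Toy labels for six items and a matching embedding with one row per item
toy_items <- c(2, 0, 1, 0, 2, 1)
toy_embedding <- matrix(seq_len(12), nrow = 6,
                        dimnames = list(paste0("item", 1:6), NULL))

# order() returns the row indices that sort the labels
idx <- order(toy_items)
idx

# Indexing with idx groups rows of the same category together
toy_sorted <- toy_embedding[idx, ]
rownames(toy_sorted)
toy_items[idx]
```

After indexing, all rows belonging to category 0 come first, then category 1, and so on, which is what produces the bands in the plot.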
Use plot_embedding() to plot the reordered embedding. Things should look a lot clearer. The bands correspond to the different items, with the 0-item ("T-shirt/top") at the bottom and the 9-item ("Ankle boot") at the top.
Use the cosine() function from the helper_2.R file to determine the similarities between the fashion item vectors in the embedding. Cosine similarity is the cosine of the angle between the vectors of two fashion items in the 256-dimensional space that is the embedding. It is closely related to the standard correlation coefficient: the two coincide when both vectors are mean-centered.
# calculate cosine similarities
fashion_cosines <- cosine(embedding)
plot_cosine(fashion_cosines)
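The helper's cosine() implementation is not shown in this practical, so the following is only a sketch of how pairwise cosine similarity between rows can be computed in base R. The function name cosine_sketch is hypothetical. The last lines illustrate the link to the correlation coefficient.

```r
# Hypothetical stand-in for the helper's cosine(): pairwise cosine
# similarity between the rows of a numeric matrix
cosine_sketch <- function(m) {
  norms <- sqrt(rowSums(m^2))          # Euclidean length of each row
  (m %*% t(m)) / (norms %o% norms)     # dot products divided by length products
}

set.seed(1)
m <- matrix(rnorm(5 * 8), nrow = 5)
cos_mat <- cosine_sketch(m)

# Centering each row first turns cosine into the Pearson correlation
m_centered <- m - rowMeans(m)
all.equal(cosine_sketch(m_centered), cor(t(m)))
```

The diagonal of cos_mat is 1 because every vector has an angle of zero with itself.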
Use the plot_cosine() function (also from the helper_2.R file) to plot the matrix of cosine values. The categories 0 to 9 go from top to bottom and from left to right. Light grey values indicate high cosine similarity, darker ones low cosine similarity. Try to make sense of the plot.
# Plot cosine similarities
plot_cosine(fashion_cosines)
Visualize the cosine similarities with multidimensional scaling (MDS) using plot_cosine_mds(). Try to make sense of the plot.
# plot cosine similarities using MDS
plot_cosine_mds(fashion_cosines, fashion_labels[items[order(items)]+1])
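How plot_cosine_mds() works internally is not documented here. As a rough sketch of the general idea, classical MDS from base R's stats package (cmdscale()) can project a similarity matrix into two dimensions once similarities are converted to dissimilarities. Using 1 minus cosine as the dissimilarity is an assumption made for this illustration, and the data are random toy vectors.

```r
# Toy similarity matrix for four items, built from random vectors
set.seed(2)
v <- matrix(rnorm(4 * 6), nrow = 4,
            dimnames = list(c("a", "b", "c", "d"), NULL))
norms <- sqrt(rowSums(v^2))
sim <- (v %*% t(v)) / (norms %o% norms)

# Turn similarities into dissimilarities, then project to two dimensions
dissim <- as.dist(1 - sim)
coords <- cmdscale(dissim, k = 2)

# Items with similar vectors end up close together in the plot
plot(coords, type = "n", xlab = "Dimension 1", ylab = "Dimension 2")
text(coords, labels = rownames(coords))
```

Points that lie close together in such a plot have similar embedding vectors, which is why heavily overlapping categories are likely candidates for confusion.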
# prediction confusion matrix
pred = net %>% predict_classes(fashion_test_images_serialized)
table(fashion_labels[fashion_test_items+1], fashion_labels[pred+1])
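If predict_classes() is not available in your installed keras version (it was removed from more recent releases), the predicted class can be recovered from the class probabilities returned by predict(). The sketch below uses an invented probability matrix in place of real network output; probs, pred_toy, and truth_toy are illustrative names.

```r
# Toy class probabilities: 4 items (rows) by 3 classes (columns),
# standing in for the output of predict() on a softmax network
probs <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.1, 0.8,
                  0.3, 0.6, 0.1,
                  0.2, 0.5, 0.3), nrow = 4, byrow = TRUE)

# Column index of the largest probability per row, shifted to 0-based labels
pred_toy <- max.col(probs, ties.method = "first") - 1
pred_toy

# Cross-tabulate true against predicted labels, as in the confusion matrix above
truth_toy <- c(0, 2, 1, 2)
table(truth = truth_toy, predicted = pred_toy)
```

Off-diagonal counts in the table mark items whose predicted category differs from the true one.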
Compare the confusion matrix with the MDS plot, paying particular attention to Coat, Pullover, and Shirt, which in the cosine MDS pretty much sit on top of each other.
# second layer embedding
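relu <- function(z) {z[z < 0] <- 0; z}
# add the bias by appending a column of ones to the inputs
z_1 <- cbind(fashion_train_images_serialized[1:1000, ], 1) %*% rbind(weights[[1]], weights[[2]])
# apply the activation function to the first hidden layer
a_1 <- t(apply(z_1, 1, relu))
# second-layer embedding, again ignoring bias and activation
embedding <- a_1 %*% weights[[3]]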
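The code above folds the bias into the matrix product by appending a column of ones to the inputs and the bias vector as an extra row of the weights. A small sketch on random toy data shows that this gives the same result as multiplying first and adding the bias afterwards. All object names and values here are invented for illustration.

```r
relu <- function(z) pmax(z, 0)   # element-wise max(z, 0); keeps matrix shape

set.seed(3)
X <- matrix(runif(5 * 4), nrow = 5)   # 5 toy items, 4 inputs
W <- matrix(rnorm(4 * 3), nrow = 4)   # 4 inputs, 3 hidden nodes
b <- rnorm(3)                         # one bias per hidden node

# Variant 1: append a column of ones to X and the bias as an extra row of W
z_trick <- cbind(X, 1) %*% rbind(W, b)

# Variant 2: multiply first, then add the bias to every row
z_sweep <- sweep(X %*% W, 2, b, "+")

all.equal(z_trick, z_sweep, check.attributes = FALSE)

# Apply the activation function to obtain the layer's output
a <- relu(z_trick)
```

Because pmax() works element-wise on the whole matrix, the activation can be applied in one call without looping over rows.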
Part 2: Words
Use readRDS() to load the capital.RDS dataset as a new object.
# load embeddings
capital <- readRDS(file = "1_Data/capital.RDS")
Using rownames(), inspect the words for which embeddings are present.
# rownames of capital
rownames(capital)
Use plot_embedding() to plot the capital embeddings.
# plot capital embedding
plot_embedding(capital)
# plot capital cosine similarities
capital_cosine <- cosine(capital)
plot_cosine(capital_cosine)
# plot capital cosine similarities using MDS
capital_cosine <- cosine(capital)
plot_cosine_mds(capital_cosine, rownames(capital_cosine), col = FALSE)