
[Lesson 2] Your First Model: Fashion MNIST (1)

- In the previous lesson, I got a quick look at how to build and train neural networks using TensorFlow and Keras, and how neural networks and the training process work.

 

- In this lesson, I will create a neural network that can recognize items of clothing in images.

 

- Remember, an example is a feature (input) and label (output) pair that I feed to the training loop.

 

- The Fashion-MNIST dataset consists of 70,000 gray-scale images of clothing, each 28 by 28 pixels.

 

Fashion-MNIST dataset
Full list of all the 10 different items

 

- Out of these 70,000 images, we'll use 60,000 to train the neural network and 10,000 to test how well it can recognize items of clothing (i.e., split the dataset into training and testing sets).

 

Test Data & Training Data
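
- A minimal sketch of loading this split in code, assuming the built-in Keras copy of Fashion-MNIST (the course itself loads the data through TensorFlow Datasets, so the exact loader here is illustrative):

import tensorflow as tf

# Keras ships a copy of Fashion-MNIST that is already split into
# 60,000 training images and 10,000 test images.
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.fashion_mnist.load_data()

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print(train_images.shape)   # (60000, 28, 28), gray-scale pixel values 0-255
print(test_images.shape)    # (10000, 28, 28)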

- Each image is 784 (28 * 28) bytes. So our job is to create a neural network that takes the 784 bytes as input, and then identifies which of the 10 different items of clothing the image represents.

 

 

 

What our network will look like

 

- The input to our model is an array of length 784 (28 * 28). The process of converting a 2D image into a 1D vector like this is called flattening.

ex) tf.keras.layers.Flatten(input_shape=(28, 28, 1))
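
- Just to illustrate what flattening does to the data (using an all-zeros NumPy array standing in for one image), it simply reshapes the 28 x 28 grid into a single vector of 784 values:

import numpy as np

image = np.zeros((28, 28))      # a placeholder 28x28 gray-scale image
flattened = image.reshape(-1)   # flattening: 2D image -> 1D vector
print(flattened.shape)          # (784,)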

 

- The input will be fully connected to the first dense layer of our network, where we've chosen to use 128 units.

ex) tf.keras.layers.Dense(128, activation=tf.nn.relu)

 

- ReLU (Rectified Linear Unit) gives a Dense layer more power. It is a mathematical function that looks like this:

 

ReLU

- The ReLU function gives an output of 0 if the input is negative or zero, and if the input is positive, the output is equal to the input; in other words, f(x) = max(0, x).
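
- A tiny NumPy sketch of the same rule, just for intuition (in the model we simply pass tf.nn.relu as the activation):

import numpy as np

def relu(x):
    # 0 for negative or zero inputs, the input itself for positive inputs
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.3, 4.0])))   # [0.  0.  0.  1.3 4. ]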

 

- ReLU gives the network the ability to solve nonlinear problems, which covers most problems we want to solve. Adding ReLU to our Dense layers can help the network solve them.

 

- ReLU is a type of activation function. If you want to know more, see this article on ReLU in Deep Learning:

 

Rectified Linear Units (ReLU) in Deep Learning (www.kaggle.com)

 

- The output layer contains 10 units because our Fashion MNIST dataset contains 10 classes of articles of clothing.

 

- The model could give us the following probabilities.

 

Probabilities

- These 10 output values are probabilities. These 10 numbers are also called a probability distribution, in other words, a distribution of probabilities over each of the output classes of clothing, all summing up to one.
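
- For intuition, a minimal NumPy sketch of what softmax does with 10 made-up scores (the softmax layer does this for us):

import numpy as np

def softmax(logits):
    # Turn 10 raw scores into probabilities that sum to 1.
    exps = np.exp(logits - np.max(logits))   # subtract the max for numerical stability
    return exps / exps.sum()

scores = np.array([2.0, 1.0, 0.1, -1.0, 0.5, 0.0, 3.0, -2.0, 1.5, 0.3])
probs = softmax(scores)
print(probs.sum())   # 1.0 (up to floating-point rounding)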

 

- We'll tell our output dense layer to create these probabilities using the softmax activation function.

ex) tf.keras.layers.Dense(10, activation=tf.nn.softmax)
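
- Putting the three layers above together, the whole model could be assembled like this (a minimal sketch; compiling and training the model come later in the lesson):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),      # 28x28 image -> 784-long vector
    tf.keras.layers.Dense(128, activation=tf.nn.relu),     # hidden layer with 128 units
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)    # probabilities for the 10 classes
])

model.summary()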

 

- Review some of the new terms

  • Flattening: The process of converting a 2D image into a 1D vector
  • ReLU: An activation function that allows a model to solve nonlinear problems
  • Softmax: A function that provides probabilities for each possible output class
  • Classification: A machine learning model used for distinguishing among two or more output categories