- In the previous lesson, I got a quick look at how to build and train neural networks using TensorFlow and Keras, and at how neural networks and the training process work.
- In this lesson, I will create a neural network that can recognize items of clothing in images.
- Remember, an example is a feature (input) and label (output) pair that I feed to the training loop.
- The Fashion-MNIST dataset consists of 70,000 gray-scale images of clothing, each 28 by 28 pixels.
- Out of these 70,000 images, we'll use 60,000 to train the neural network and 10,000 to test how well our neural network can recognize the items of clothing (i.e., we split the dataset into training and testing sets, as shown below).
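A minimal sketch of loading the dataset and checking the split with tf.keras.datasets.fashion_mnist (the variable names are mine):

```python
import tensorflow as tf

# Keras downloads Fashion-MNIST and returns it already split
# into 60,000 training and 10,000 test examples.
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.fashion_mnist.load_data()

print(train_images.shape)  # (60000, 28, 28)
print(test_images.shape)   # (10000, 28, 28)
```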
- Each image is 784 (28 * 28) pixels, stored as one byte per pixel. So our job is to create a neural network that takes these 784 values as input and then identifies which of the 10 different items of clothing the image represents.
- The input to our model is an array of length 784 (28 * 28). The process of converting a 2D image into a vector is called flattening.
ex) tf.keras.layers.Flatten(input_shape=(28, 28, 1))
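For illustration only (plain NumPy, names are mine), flattening simply reshapes the 28 by 28 grid into a single vector of 784 values:

```python
import numpy as np

image = np.zeros((28, 28))  # a single gray-scale image
flat = image.reshape(-1)    # flatten the 2D grid into a 1D vector
print(flat.shape)           # (784,)
```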
- The input will be fully connected to the first dense layer of our network, where we've chosen to use 128 units.
ex) tf.keras.layers.Dense(128, activation=tf.nn.relu)
- ReLU (Rectified Linear Unit) gives a Dense layer more power. It is a mathematical function that looks like this: f(x) = max(0, x).
- The ReLU function gives an output of 0 if the input is negative or zero; if the input is positive, the output is equal to the input.
- ReLU gives the network the ability to solve nonlinear problems, which describes most problems we want to solve. Adding ReLU to our Dense layers helps with exactly that (see the sketch below).
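A quick illustration of the definition above (plain Python, the function name is mine):

```python
def relu(x):
    # 0 for negative or zero inputs, identity for positive inputs
    return max(0.0, x)

print(relu(-3.0))  # 0.0
print(relu(2.5))   # 2.5
```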
- ReLU is a type of activation function. If you want to know more, see this article: Rectified Linear Units (ReLU) in Deep Learning (www.kaggle.com)
- The output layer contains 10 units because our Fashion MNIST dataset contains 10 types of articles of clothing.
- For a given input image, the model gives us 10 output values.
- These 10 output values are probabilities. Together they're also called a probability distribution; in other words, a distribution of probabilities over the output classes of clothing, all summing up to one.
- We'll tell our output Dense layer to create these probabilities using the softmax activation function.
ex) tf.keras.layers.Dense(10, activation=tf.nn.softmax)
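Putting the three layers above together, here's a minimal sketch of the whole model. The compile settings (Adam optimizer, sparse categorical cross-entropy) are my assumptions, since the lesson covers training separately:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),    # 28x28 image -> vector of 784
    tf.keras.layers.Dense(128, activation=tf.nn.relu),   # hidden layer with 128 units
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)  # 10 class probabilities
])

# Assumed training configuration, not stated in the notes above.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```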
- A review of some of the new terms:
- Flattening: The process of converting a 2D image into a 1D vector
- ReLU: An activation function that allows a model to solve nonlinear problems
- Softmax: A function that provides probabilities for each possible output class
- Classification: A machine learning model used for distinguishing among two or more output categories