[Lesson 7] Time Series Forecasting

supremo7 2020. 9. 30. 01:33

- A time series is a sequence of data points arranged at regular time intervals. Time series prediction means building a mathematical model from a given time series and using it to predict what will happen in the future.

 

- In this course I will learn how to train deep neural networks to forecast time series.

 

- A time series is an ordered sequence of values, usually equally spaced in time: every year, day, second, or even every few microseconds, as in the audio clip example.

 

  • Time series that have a single value at each time step are called univariate.
  • Time series that have multiple values at each time step are called multivariate.
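
To make the distinction concrete, here is a minimal sketch using NumPy arrays. The values and the choice of temperature and humidity as the two variables are made up for illustration; they are not from the course.

```python
import numpy as np

# Hypothetical data: 5 daily time steps.
# Univariate: one value (e.g. temperature) per time step -> shape (5,)
univariate = np.array([20.1, 21.3, 19.8, 22.0, 21.5])

# Multivariate: several values (e.g. temperature and humidity) per time step -> shape (5, 2)
multivariate = np.array([
    [20.1, 0.61],
    [21.3, 0.58],
    [19.8, 0.65],
    [22.0, 0.55],
    [21.5, 0.57],
])

print(univariate.shape)    # one axis: time
print(multivariate.shape)  # two axes: time, variable
```

The first axis is time in both cases; a multivariate series simply adds a second axis for the variables.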

 

 

Multivariate time series

 

 

- Time series can be used to analyze just about anything that evolves over time.

 

- The most obvious use is predicting the future, which is called forecasting. I may also want to predict the past, so to speak.

 

- This is useful when data in the time series is missing or corrupted. Filling in such data is called imputation (substitution).
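
As a rough illustration of imputation, the sketch below fills missing values (marked as NaN) by linear interpolation between the neighboring known values. Linear interpolation is only one simple choice that I am assuming here; the course does not prescribe a particular method, and the data is made up.

```python
import numpy as np

# Hypothetical series with missing values marked as NaN.
series = np.array([1.0, 2.0, np.nan, 4.0, np.nan, np.nan, 7.0])

missing = np.isnan(series)
steps = np.arange(len(series))

# Fill each missing step by interpolating linearly between the known steps.
imputed = series.copy()
imputed[missing] = np.interp(steps[missing], steps[~missing], series[~missing])

print(imputed)
```

Only the NaN positions are overwritten; the observed values are left untouched.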

 

- Time series analysis can also be used to detect anomalies (irregularities).

 

- Time series come in all shapes and sizes, but some patterns are more common than others, so it's useful to recognize them.

 

- For example, trend, seasonality, and noise are common patterns.
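
The three patterns can be sketched by building a synthetic series as their sum. This is a minimal sketch, not the notebook's code: the function names, the sine-shaped seasonal pattern, and all the numbers (baseline 10, slope 0.05, period 365, amplitude 40, noise level 5) are my own assumptions.

```python
import numpy as np

def trend(time, slope=0.0):
    """Linear trend: the series drifts up or down at a constant rate."""
    return slope * time

def seasonality(time, period, amplitude=1.0):
    """Pattern that repeats every `period` steps (a sine wave is assumed here)."""
    season_time = (time % period) / period  # position within the cycle, in [0, 1)
    return amplitude * np.sin(2 * np.pi * season_time)

def white_noise(time, noise_level=1.0, seed=None):
    """Random, unpredictable fluctuations around zero."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(len(time)) * noise_level

time = np.arange(4 * 365)  # four years of daily time steps
series = (
    10                                            # baseline level
    + trend(time, slope=0.05)
    + seasonality(time, period=365, amplitude=40)
    + white_noise(time, noise_level=5, seed=42)
)
```

Plotting `series` against `time` should show an upward drift, a yearly cycle, and jitter on top of both.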

 

 

Common patterns

 

 

<Colab Notebook>

- To access the Colab Notebook, log in to your Google account and click on the link below:

Common Patterns

 


- Since neural networks rely on stochasticity (i.e. randomness) to initialize their parameters, and gradient descent selects random batches of training data at each iteration, it is perfectly normal if the outputs you see when you run the Colabs are slightly different.

 

 

- The simplest approach is to take the last value and assume that the next value will be the same. This is called naive forecasting.
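
Naive forecasting can be sketched in a few lines: the forecast for each time step is simply the value observed one step earlier. The data is made up, and scoring with the mean absolute error (MAE) is my own choice of metric for this example.

```python
import numpy as np

# Hypothetical series.
series = np.array([10.0, 12.0, 11.0, 13.0, 15.0, 14.0])

# The forecast for step t is the value observed at step t - 1,
# so the forecasts are the series shifted by one step.
naive_forecast = series[:-1]
actual = series[1:]

# Mean absolute error of the naive forecast (metric assumed for illustration).
mae = np.mean(np.abs(actual - naive_forecast))
print(mae)
```

A baseline like this is useful because any more complex model should at least beat it.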

 

- To measure the performance of our forecasting model, we typically want to split the time series into a training period, a validation period, and a test period. This is called fixed partitioning.
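
A minimal sketch of fixed partitioning: the series is cut at two fixed time steps, keeping the chronological order (no shuffling). The placeholder series and the split points 70 and 85 are hypothetical.

```python
import numpy as np

series = np.arange(100, dtype=float)  # placeholder series with 100 time steps

# Hypothetical split points: the periods are contiguous and in time order.
train_end, valid_end = 70, 85

train = series[:train_end]           # oldest data
valid = series[train_end:valid_end]  # used to tune and compare models
test = series[valid_end:]            # most recent data, used for the final check

print(len(train), len(valid), len(test))
```

Unlike a random train/test split, every validation step comes after every training step, and every test step comes after every validation step.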

 

 

Fixed Partitioning

 

 

- The performance measured on the held-out period will not always be a very reliable estimate, because the time series may behave differently in the future, but hopefully it will be a reasonable one.

 

- Before deploying my model to production, I should train it one last time on the full time series, including the test set.

 

- Including the test set is important because the most recent period usually contains the most useful information for predicting the future. For the same reason, it is common to use only a training period and a validation period, so that the test period is in the future.

 

- Roll-forward partitioning starts with a short training period and gradually increases it. The drawback is that it requires much more training time, but the benefit is that it more closely mimics production conditions.
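
The idea can be sketched as a generator of growing training periods, each followed by a short validation period. The helper `roll_forward_splits` and its parameters are hypothetical names of my own, and a naive forecast stands in for real model training to keep the example short.

```python
import numpy as np

def roll_forward_splits(n_steps, initial_train, horizon):
    """Yield (train_indices, valid_indices) with a training period that grows each round."""
    train_end = initial_train
    while train_end + horizon <= n_steps:
        yield np.arange(train_end), np.arange(train_end, train_end + horizon)
        train_end += horizon  # extend the training period by one horizon

series = np.arange(20, dtype=float)  # placeholder series

errors = []
for train_idx, valid_idx in roll_forward_splits(len(series), initial_train=10, horizon=5):
    # Stand-in for "train a model on the training period":
    # repeat the last training value over the validation period.
    forecast = np.full(len(valid_idx), series[train_idx[-1]])
    errors.append(np.mean(np.abs(series[valid_idx] - forecast)))

print(errors)
```

Each round requires a fresh training run, which is where the extra training time comes from.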

 

 

Roll forward Partitioning

 

 

- For simplicity, we will use fixed partitioning in the rest of this course.

 

<Colab Notebook>

- To access the Colab Notebook, log in to your Google account and click on the link below:

Naive Forecasting

 


 

<Source Code>

github.com/HoYoungChun/TensorFlow_study/blob/master/11_Common%20patterns.py