Genius Tool to Compare the Best Time-Series Models for Multi-Step Time Series Modeling

2020-10-19

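Multi-Step Forecasting and Evaluation
The code snippet below shows how to correctly reshape the data into (1, n_input, n) in order to predict the following week. For a 3-month multivariate time series whose test data has 23 samples (including the predicted outputs from previous steps, i.e. 21 + 2), the data is reshaped to (7
Forecasting 3 weeks by feeding in the predicted outputs of the previous weeks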
# make a forecast
def forecast(model
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0]*data.shape[1]
    # retrieve last observations for input data
    input_x = data[-n_input:
    # reshape into [1
    input_x = input_x.reshape((1
    # forecast the next week
    yhat = model.predict(input_x
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat
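The reshaping step can be illustrated with a small self-contained sketch. It assumes the history is stored as weekly blocks of shape (weeks, 7, n_features) and uses a hypothetical stand-in model (`MeanModel`, which is not part of the original notebook) so that it runs without Keras. The name `forecast_sketch` and the choice to keep all features are assumptions made for illustration only.

```python
import numpy as np

class MeanModel:
    """Hypothetical stand-in for a trained Keras model (assumption)."""
    def __init__(self, n_out=7):
        self.n_out = n_out

    def predict(self, x, verbose=0):
        # x has shape (samples, n_input, n_features); repeat the window mean
        # of feature 0 as a naive n_out-step forecast
        mean = x[:, :, 0].mean(axis=1, keepdims=True)
        return np.repeat(mean, self.n_out, axis=1)

def forecast_sketch(model, history, n_input):
    # flatten [weeks, days, features] into [weeks*days, features]
    data = np.array(history)
    data = data.reshape((data.shape[0] * data.shape[1], data.shape[2]))
    # keep the last n_input observations, all features (multivariate assumption)
    input_x = data[-n_input:, :]
    # reshape into (1, n_input, n_features): a batch containing one sample
    input_x = input_x.reshape((1, input_x.shape[0], input_x.shape[1]))
    yhat = model.predict(input_x, verbose=0)
    # drop the batch dimension to get the forecast vector
    return yhat[0]

# three weeks of toy data with 3 features; week w is filled with the value w
history = [np.ones((7, 3)) * w for w in range(1, 4)]
yhat = forecast_sketch(MeanModel(), history, n_input=14)
print(yhat.shape)
```

With n_input=14 the model only ever sees the last two weeks of the flattened history, whatever the length of the history list.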
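Note 2: To see the evaluation results and plots for each of the steps described below, check the notebook on GitHub (https://github.com/sharmi1206/covid-19-analysis, notebook ts_dlearn_mstep_forecats.ipynb).
Here, at weekly granularity, we evaluate the model at each step and compare its predictions with the actual output.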
# evaluate one or more weekly forecasts against expected values
def evaluate_forecasts(actual
    print("Actual Results"
    print("Predicted  Results"
    scores = list()
    # calculate an RMSE score for each day
    for i in range(actual.shape[1]):
        # calculate mse
        mse = mean_squared_error(actual[:
        # calculate rmse
        rmse = sqrt(mse)
        # store
        scores.append(rmse)
        plt.figure(figsize=(14
        plt.plot(actual[:
        plt.plot(predicted[:
        plt.title(ModelType + ' based Multi-Step Time Series Active Cases Prediction for step ' + str(i))
        plt.legend()
        plt.show()
    # calculate overall RMSE
    s = 0
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row
    # the overall score is computed once, after both loops have finished
    score = sqrt(s / (actual.shape[0] * actual.shape[1]))
    return score
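The two RMSE calculations above (one score per forecast day, plus one overall score) can be written compactly with NumPy. This is a minimal sketch that assumes `actual` and `predicted` are 2-D arrays of shape (weeks, days); the function name `rmse_scores` is illustrative and does not appear in the original notebook.

```python
import numpy as np

def rmse_scores(actual, predicted):
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sq_err = (actual - predicted) ** 2
    # RMSE per forecast day: average squared error over weeks (axis 0)
    per_step = np.sqrt(sq_err.mean(axis=0))
    # overall RMSE: average squared error over every week and day
    overall = float(np.sqrt(sq_err.mean()))
    return overall, per_step

# toy data (assumption): 2 weeks, 2 forecast days
actual = np.array([[1.0, 2.0], [3.0, 4.0]])
predicted = np.array([[1.0, 4.0], [3.0, 2.0]])
overall, per_step = rmse_scores(actual, predicted)
print(overall, per_step)
```

The overall score is the root of the mean squared error over all cells. It is not the mean of the per-day RMSE values, so the two can differ.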
def evaluate_model(train
    model = None
    # fit model
    if ModelType == 'LSTM':
        print('lstm')
        model = build_model_lstm(train
    elif ModelType == 'BI_LSTM':
        print('bi_lstm')
        model = build_model_bi_lstm(train
    elif ModelType == 'CNN':
        print('cnn')
        model = build_model_cnn(train
    elif ModelType == 'LSTM_CNN':
        print('lstm_cnn')
        model = build_model_cnn_lstm(train
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast(model
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i
    # evaluate predictions days for each week
    predictions = array(predictions)
    score
    return score
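The walk-forward loop in evaluate_model can be isolated from the model-building code. The sketch below replaces the trained network with a hypothetical persistence baseline (`naive_forecast`, which simply repeats the last observed week), so it illustrates the validation pattern only and is not the author's model.

```python
import numpy as np

def naive_forecast(history):
    # hypothetical baseline (assumption): next week equals the last known week
    return history[-1]

def walk_forward(train, test):
    # history starts as the list of training weeks
    history = [week for week in train]
    predictions = []
    for i in range(len(test)):
        # forecast the coming week from everything seen so far
        predictions.append(naive_forecast(history))
        # then reveal the real observation for use in the next forecast
        history.append(test[i])
    return np.array(predictions)

# toy data (assumption): 2 training weeks and 2 test weeks of 7 days each
train = np.array([[1.0] * 7, [2.0] * 7])
test = np.array([[3.0] * 7, [4.0] * 7])
preds = walk_forward(train, test)
print(preds.shape)
```

Each test week is forecast before its true values are appended to the history, so no forecast ever uses the week it is predicting.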
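Here we demonstrate both univariate and multivariate multi-step time series forecasting.
Multi-Step Conv2D + LSTM (Univariate and Multivariate) Forecasting for the State of Delhi
Figure: CNN-LSTM (convolutional neural network + long short-term memory) network architecture
One type of CNN-LSTM is the ConvLSTM (used mainly for two-dimensional spatio-temporal data), in which the convolutional reading of the input is built directly into each LSTM unit.
For this particular univariate time series, our input vector is
[timesteps = 14, rows = 1, columns = 7, features = 2 (input and output)]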
# train CONV LSTM2D model
def build_model_cnn_lstm_2d(train
    # prepare data
    train_x
    # define parameters
    verbose
    n_timesteps
    # reshape into subsequences [samples
    train_x = train_x.reshape((train_x.shape[0]
    # reshape output into [samples
    train_y = train_y.reshape((train_y.shape[0]
    # define model
    model = Sequential()
    model.add(ConvLSTM2D(filters=64
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200
    model.add(TimeDistributed(Dense(100
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse'
    # fit network
    model.fit(train_x
    return model
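Keras's ConvLSTM2D layer expects 5-D input of the form (samples, time, rows, cols, channels) with the default channels-last layout, which is why the training data is reshaped into subsequences before fitting. The sketch below shows only that reshape, using NumPy. The values n_steps = 2 and n_length = 7 are assumptions chosen so that n_input = n_length * n_steps = 14, and the toy array stands in for the real training windows.

```python
import numpy as np

n_steps, n_length = 2, 7          # assumed: 2 subsequences of 7 days each
n_input = n_steps * n_length      # 14 days of input per sample
n_features = 1                    # univariate case

# toy training windows (assumption): 5 samples of shape (n_input, n_features)
train_x = np.arange(5 * n_input * n_features, dtype=float)
train_x = train_x.reshape((5, n_input, n_features))

# reshape into [samples, time steps, rows, cols, channels]
train_x_5d = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))
print(train_x_5d.shape)
```

Each 14-day window becomes two "frames" of one row by seven columns, so the convolution slides along the days of a week while the recurrence runs across the two weeks.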
# convert history into inputs and outputs
def to_supervised_2cnn_lstm(train
    # flatten data
    data = train.reshape((train.shape[0]*train.shape[1]
    X
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            x_input = data[in_start:in_end
            x_input = x_input.reshape((len(x_input)
            X.append(x_input)
            y.append(data[in_end:out_end
        # move along one time step
        in_start += 1
    return array(X)
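The sliding-window conversion can be tried on toy data to see how many samples it produces. This is a sketch of the univariate case under stated assumptions: the target is taken to be column 0, n_out defaults to 7, and `to_supervised_sketch` is an illustrative name, not the notebook's function.

```python
import numpy as np

def to_supervised_sketch(train, n_input, n_out=7):
    # flatten [weeks, days, features] into [weeks*days, features]
    data = train.reshape((train.shape[0] * train.shape[1], train.shape[2]))
    X, y = [], []
    in_start = 0
    for _ in range(len(data)):
        in_end = in_start + n_input
        out_end = in_end + n_out
        # only keep windows that fit entirely inside the data
        if out_end <= len(data):
            # univariate input: column 0 only, shaped (n_input, 1)
            X.append(data[in_start:in_end, 0].reshape((n_input, 1)))
            # target: the next n_out values of column 0 (assumed target column)
            y.append(data[in_end:out_end, 0])
        # slide the window forward by one day
        in_start += 1
    return np.array(X), np.array(y)

# toy data (assumption): 4 weeks of 7 days with a single feature, values 0..27
train = np.arange(28, dtype=float).reshape((4, 7, 1))
X, y = to_supervised_sketch(train, n_input=14, n_out=7)
print(X.shape, y.shape)
```

Because the window advances one day at a time instead of one week at a time, 28 days of data yield 8 overlapping training samples.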
# make a forecast
def forecast_2cnn_lstm(model
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0]*data.shape[1]
    # retrieve last observations for input data
    input_x = data[-n_input:
    # reshape into [samples
    input_x = input_x.reshape((1
    # forecast the next week
    yhat = model.predict(input_x
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat
# evaluate a single model
def evaluate_model_2cnn_lstm(train
    # fit model
    model = build_model_cnn_lstm_2d(train
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast_2cnn_lstm(model
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i
    # evaluate predictions days for each week
    predictions = array(predictions)
    score
    return score
Reading the state data and indexing the time column:
df_state_all = pd.read_csv('all_states/all.csv')
df_state_all = df_state_all.drop(columns=['Latitude'
stateName = unique_states[8]
dataset = df_state_all[df_state_all['Name of State / UT'] == unique_states[8]]
dataset = dataset.sort_values(by='Date'
dataset = dataset[(dataset['Date'] >= '2020-03-25') & (dataset['Date'] <= '2020-06-06')]
print(np.shape(dataset))
daterange = dataset['Date'].values
no_Dates = len(daterange)
dateStart = daterange[0]
dateEnd = daterange[no_Dates - 1]
print(dateStart)
print(dateEnd)
dataset = dataset.drop(columns=['Unnamed: 0'
print(np.shape(dataset))
n = np.shape(dataset)[0]
scaler = MinMaxScaler(feature_range=(0
# split into train and test
train
# define the number of subsequences and the length of subsequences
n_steps
# define the total days to use as input
n_input = n_length * n_steps
score
# summarize scores
summarize_scores(ModelType
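The model parameters can be summarized as:
Figure: Model summary, Conv2D + LSTM
The evaluate_model function appends the model's prediction score after each step and returns it at the end.
The figure below shows the actual versus predicted results of the multi-step ConvLSTM2D model after the predictions have been inverse-transformed (to remove the effect of scaling).
Figure: Univariate ConvLSTM2D
For a multivariate time series with 22 input features and one output prediction, we make the following changes: in the function forecast_2cnn_lstm, we replace the input reshaping so that it carries the multivariate features.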
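The inverse transform mentioned above undoes min-max scaling so that errors are reported in the original units (case counts) and not in scaled units. The original code uses scikit-learn's MinMaxScaler. The sketch below instead re-implements the arithmetic by hand with NumPy, assumes a (0, 1) feature range, and uses hypothetical helper names and toy values.

```python
import numpy as np

def minmax_fit(values):
    # remember the data range seen during fitting
    return float(np.min(values)), float(np.max(values))

def minmax_transform(values, lo, hi):
    # scale into [0, 1] (assumed feature range)
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

def minmax_inverse(scaled, lo, hi):
    # map scaled values back to the original units
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

cases = np.array([100.0, 250.0, 400.0])   # toy active-case counts (assumption)
lo, hi = minmax_fit(cases)
scaled = minmax_transform(cases, lo, hi)
restored = minmax_inverse(scaled, lo, hi)
print(scaled, restored)
```

An RMSE computed on scaled values is not comparable with one computed on raw counts, so forecasts should be inverse-transformed before they are scored or plotted.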
# In function forecast_2cnn_lstm
input_x = data[-n_input:
# replacing 0 with :
# reshape into [samples
input_x = input_x.reshape((1
# replacing 1 with data.shape[1] for multi-variate
In addition, in the function to_supervised_2cnn_lstm, we change the feature selection of x_input from 0 to : and its feature size from 1 to 23, as shown below:
x_input = data[in_start:in_end
x_input = x_input.reshape((len(x_input)
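The difference between the univariate and multivariate input can be checked on toy data. This sketch assumes n_steps = 2, n_length = 7, and a flattened history with 23 columns. The array contents are made up, and only the slicing and reshaping pattern is meant to mirror the change described above.

```python
import numpy as np

n_steps, n_length = 2, 7            # assumed subsequence layout
n_input = n_steps * n_length

# toy flattened history (assumption): 42 days, 23 columns
data = np.arange(42 * 23, dtype=float).reshape((42, 23))

# univariate: column 0 only, a single channel
uni = data[-n_input:, 0].reshape((1, n_steps, 1, n_length, 1))

# multivariate: every column, one channel per feature
multi = data[-n_input:, :].reshape((1, n_steps, 1, n_length, data.shape[1]))

print(uni.shape, multi.shape)
```

Only the last axis changes between the two cases, so the rest of the pipeline can stay the same as long as the model's input shape is declared with the matching number of channels.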
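Figure: Multivariate ConvLSTM2D
Conv2D + BI_LSTM
We can go further and try a bidirectional LSTM with a 2D convolutional layer, as shown in the figure below. Apart from using a BI-LSTM in place of the single LSTM, the stacking of the model and the subsequent layers are the same as in the previous step.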
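Comparison of Model Metrics on the Test Dataset
Deep learning method                          RMSE
LSTM                                          912.224
Bidirectional LSTM                            1317.841
CNN                                           1021.518
LSTM + CNN                                    891.076
Conv2D + LSTM (univariate, single-step)       1288.416
Conv2D + LSTM (multivariate, multi-step)      863.163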
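Conclusion
In this blog, I discussed multi-step time series forecasting with deep learning models and compared them on the basis of RMSE. We observe that the multivariate, multi-step ConvLSTM2D model performs best over the 7-day forecast horizon (RMSE 863.163), followed by LSTM + CNN (891.076), LSTM (912.224), and CNN (1021.518). Accuracy could be improved further through effective hyperparameter tuning and a more extensive evaluation with different hidden layers and neuron counts.
Although we see model accuracy drop for multi-step models, they can still be a useful tool for long-term forecasting, where the previous week's predicted results play a dominant role in the predicted output.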
