Technical Stock Price Analysis with Exponential Moving Average (EMA) and Forecasting using ARIMA Model
_____________________________________________________
The analysis comprises two models: (1) an Exponentially Weighted Moving Average (EMA) analysis and (2) an ARIMA model. The first is intended to produce buy/sell signals from crossovers between the price and its historical average; the second is trained to forecast stock prices over a specified number of observation points.
import pandas as pd
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdt
import seaborn as sns
import mplfinance
from pylab import rcParams
%matplotlib inline
import os
import warnings
warnings.filterwarnings('ignore')
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.arima_model import ARIMA
from pmdarima.arima import auto_arima
from sklearn.metrics import mean_squared_error, mean_absolute_error
import math
Prices = pd.read_csv("C:\\...\\Closing prices 2012-2021.csv", parse_dates=['Date'])
Prices['Date']= pd.to_datetime(Prices['Date'])
Prices = Prices.sort_values(by="Date")
Prices.set_index('Date', inplace = True)
Prices.index.name = None
Prices
| | Close PFE USD | Close NSDAQ USD | Close NSRGF USD | Close AAPL USD | Close AZN USD | Close KO USD |
---|---|---|---|---|---|---|
2012-01-03 | 21.48 | 24.96 | 58.25 | 14.69 | 23.86 | 35.07 |
2012-01-04 | 21.28 | 24.62 | 57.75 | 14.77 | 23.74 | 34.85 |
2012-01-05 | 21.11 | 24.66 | 57.20 | 14.93 | 23.43 | 34.69 |
2012-01-06 | 21.08 | 24.43 | 56.60 | 15.09 | 23.49 | 34.47 |
2012-01-09 | 21.33 | 24.33 | 57.41 | 15.06 | 23.29 | 34.47 |
... | ... | ... | ... | ... | ... | ... |
2021-11-26 | 54.00 | 203.68 | 131.31 | 156.81 | 56.58 | 53.73 |
2021-11-29 | 52.40 | 209.08 | 131.30 | 160.24 | 55.53 | 54.58 |
2021-11-30 | 53.73 | 203.23 | 129.00 | 165.30 | 54.83 | 52.45 |
2021-12-01 | 54.68 | 199.03 | 127.57 | 164.77 | 54.88 | 52.30 |
2021-12-02 | 53.04 | 201.23 | 128.48 | 163.76 | 54.79 | 53.07 |
2497 rows × 6 columns
Descriptive statistics are shown in the table below. For every selected stock, the table gives the number of observations (count), the mean, the standard deviation (std), which quantifies the dispersion of the dataset, the minimum (min) and maximum (max) price, and the 25%, 50% and 75% quantiles.
The total number of observations is 2497. NSDAQ has the largest standard deviation, which indicates that its price varied the most in absolute (USD) terms over the period.
Prices.describe()
| | Close PFE USD | Close NSDAQ USD | Close NSRGF USD | Close AAPL USD | Close AZN USD | Close KO USD |
---|---|---|---|---|---|---|
count | 2497.000000 | 2497.000000 | 2497.000000 | 2497.000000 | 2497.000000 | 2497.000000 |
mean | 33.252002 | 75.290224 | 84.917145 | 48.040521 | 36.423244 | 44.604537 |
std | 5.767063 | 42.992460 | 18.819134 | 37.214999 | 10.341610 | 5.607982 |
min | 20.480000 | 21.240000 | 55.800000 | 13.950000 | 20.020000 | 33.500000 |
25% | 29.640000 | 39.920000 | 72.850000 | 23.610000 | 29.040000 | 40.600000 |
50% | 33.280000 | 68.340000 | 77.600000 | 32.190000 | 34.280000 | 43.510000 |
75% | 36.320000 | 95.130000 | 99.040000 | 53.060000 | 41.270000 | 47.630000 |
max | 54.680000 | 212.830000 | 135.640000 | 165.300000 | 63.830000 | 60.130000 |
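Because a standard deviation in USD depends on the price level of each stock, a relative measure can complement the comparison. The sketch below computes the coefficient of variation (std divided by mean) from the figures in the table above; the names `stats` and `cv` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Means and standard deviations copied from the describe() table above.
stats = pd.DataFrame(
    {
        "mean": [33.252002, 75.290224, 84.917145, 48.040521, 36.423244, 44.604537],
        "std": [5.767063, 42.992460, 18.819134, 37.214999, 10.341610, 5.607982],
    },
    index=["PFE", "NSDAQ", "NSRGF", "AAPL", "AZN", "KO"],
)

# Coefficient of variation: dispersion relative to the average price level.
stats["cv"] = stats["std"] / stats["mean"]
print(stats.sort_values("cv", ascending=False).round(3))
```

On this relative scale the ranking can differ from the ranking by raw standard deviation, so both views are worth reading together.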
A moving average is a simple form of technical analysis used in stock trading strategies. The average is taken over a period selected by the trader; it smooths price trends by filtering out the “noise” of random short-term price fluctuations and helps to decide when a stock should be bought or sold (buy/sell signals are produced from crossovers with, or divergences from, the calculated historical average).
In this analysis, the Exponentially Weighted Moving Average (EMA) was selected over the Simple Moving Average (SMA) because it gives more weight to the most recent data points, whereas the SMA weights all data points in its window equally. As a result, the EMA reacts more quickly to price changes than the SMA.
For the analysis, short-term and long-term EMAs were calculated: 20 days for the short-term EMA and 200 days for the long-term EMA (values chosen as common practice after reviewing several publications on moving average methodology).
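The difference in reaction speed can be illustrated on a synthetic series with a single price jump. This is only a sketch on made-up data: the series `price` and the 20-day window are assumptions chosen for illustration.

```python
import pandas as pd

# Synthetic price: flat at 10 for 40 days, then a jump to 20 (illustrative data only).
price = pd.Series([10.0] * 40 + [20.0] * 40)

sma = price.rolling(window=20).mean()
ema = price.ewm(span=20, adjust=False).mean()

# Compare both averages over the first five days after the jump (positions 40..44).
comparison = pd.DataFrame({"price": price, "SMA20": sma, "EMA20": ema}).iloc[40:45]
print(comparison.round(2))
```

In the days right after the jump the EMA has moved further towards the new level than the SMA of the same length.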
The Exponentially Weighted Moving Average is calculated as follows:
**EMA(T) = PRICE(T) × K + EMA(Y) × (1 − K)**
where:
T = today
Y = yesterday
N = number of days in the EMA
K = 2 ÷ (N + 1)
Source: https://www.dailyfx.com/education/moving-averages/ema-exponential-moving-average.html
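As a rough check of the recursion, the sketch below applies it by hand and compares the result with the pandas `ewm(span=N, adjust=False)` call used in this notebook. It assumes the usual smoothing factor K = 2 / (N + 1); the function name `ema_manual` is hypothetical.

```python
import numpy as np
import pandas as pd

def ema_manual(prices, n_days):
    """EMA via the recursion price_today * k + ema_yesterday * (1 - k)."""
    k = 2.0 / (n_days + 1)          # smoothing factor (assumed definition of K)
    ema = [prices[0]]               # seed with the first price, as pandas does with adjust=False
    for price_today in prices[1:]:
        ema.append(price_today * k + ema[-1] * (1.0 - k))
    return np.array(ema)

# First five PFE closing prices from the table above.
closes = np.array([21.48, 21.28, 21.11, 21.08, 21.33])

manual = ema_manual(closes, n_days=20)
pandas_ema = pd.Series(closes).ewm(span=20, adjust=False).mean().to_numpy()
print(np.allclose(manual, pandas_ema))
```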
Health care sector companies analyzed: Pfizer, Inc. (PFE) and AstraZeneca, Plc (AZN)
(1) Calculating short and long term Exponentially Weighted Moving Average for PFE:
PFE_short_term_EMA = Prices['Close PFE USD'].ewm(span=20, adjust=False).mean()
PFE_long_term_EMA = Prices['Close PFE USD'].ewm(span=200, adjust=False).mean()
As can be seen from the first graph, the long-term moving average line is smoother than the short-term one, because the long-term EMA averages price swings over a longer period. The long-term EMA shows how a stock behaves over roughly a year and helps to identify the trend, while the short-term EMA is preferred for short-term swing trading.
The simplified rule for using a moving average as a trading strategy is that as long as the price stays above the EMA, higher prices should be expected; conversely, as long as the price is below the EMA, lower prices should be expected. The change is signalled when the price line crosses the EMA line in the graph. This is only one simplified strategy among many; for example, buy/sell signals can also be generated when the short-term moving average crosses the long-term moving average.
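The short-term/long-term crossover variant mentioned above could be sketched as follows. This is a minimal illustration on synthetic prices, not a tested trading rule; the function name `crossover_signals` and the synthetic series are assumptions.

```python
import numpy as np
import pandas as pd

def crossover_signals(close, short_span=20, long_span=200):
    """Return +1 where the short EMA crosses above the long EMA, -1 where it crosses below, else 0."""
    short_ema = close.ewm(span=short_span, adjust=False).mean()
    long_ema = close.ewm(span=long_span, adjust=False).mean()
    above = (short_ema > long_ema).astype(int)   # 1 while the short EMA is on top
    return above.diff().fillna(0).astype(int)    # a change of state marks a crossover

# Synthetic prices for illustration only: a decline followed by a recovery.
close = pd.Series(np.concatenate([np.linspace(100, 60, 300), np.linspace(60, 120, 300)]))
signals = crossover_signals(close)
print(signals[signals != 0])
```

A value of +1 would be read as a buy signal and -1 as a sell signal; on real prices such signals lag the turning point because both averages lag the price.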
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close PFE USD'], "cadetblue", label='PFE Close')
ax[0].plot(PFE_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close PFE USD'], "cadetblue", label='PFE Close')
ax[1].plot(PFE_short_term_EMA, 'indianred', label='Short-term EMA')
ax[0].set_title('PFE Long-Term EMA')
ax[1].set_title('PFE Short-Term EMA')
ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)
plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")
For PFE a clear up-trend is seen in both graphs which means that higher peaks are expected.
(2) Calculating short and long term Exponentially Weighted Moving Average for AZN:
AZN_short_term_EMA = Prices['Close AZN USD'].ewm(span=20, adjust=False).mean()
AZN_long_term_EMA = Prices['Close AZN USD'].ewm(span=200, adjust=False).mean()
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close AZN USD'], "cadetblue", label='AZN Close')
ax[0].plot(AZN_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close AZN USD'], "cadetblue", label='AZN Close')
ax[1].plot(AZN_short_term_EMA, 'darkorchid', label='Short-term EMA')
ax[0].set_title('AZN Long-Term EMA')
ax[1].set_title('AZN Short-Term EMA')
ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)
plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")
For AZN a beginning of the down-trend is seen in both graphs which means that lower peaks and lower troughs over time are expected.
Technology sector companies analyzed: Nasdaq, Inc. (NSDAQ) and Apple, Inc. (AAPL)
(1) Calculating short and long term Exponentially Weighted Moving Average for NSDAQ:
NSDAQ_short_term_EMA = Prices['Close NSDAQ USD'].ewm(span=20, adjust=False).mean()
NSDAQ_long_term_EMA = Prices['Close NSDAQ USD'].ewm(span=200, adjust=False).mean()
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close NSDAQ USD'], "slategrey", label='NSDAQ Close')
ax[0].plot(NSDAQ_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close NSDAQ USD'], "slategrey", label='NSDAQ Close')
ax[1].plot(NSDAQ_short_term_EMA, 'darkorchid', label='Short-term EMA')
ax[0].set_title('NSDAQ Long-Term EMA')
ax[1].set_title('NSDAQ Short-Term EMA')
ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)
plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")
For NSDAQ a clear up-trend is seen in long term EMA trend line which means that higher peaks are expected in long-term perspective, however, a down-trend is seen in short term EMA.
(2) Calculating short and long term Exponentially Weighted Moving Average for AAPL:
AAPL_short_term_EMA = Prices['Close AAPL USD'].ewm(span=20, adjust=False).mean()
AAPL_long_term_EMA = Prices['Close AAPL USD'].ewm(span=200, adjust=False).mean()
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close AAPL USD'], "slategrey", label='AAPL Close')
ax[0].plot(AAPL_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close AAPL USD'], "slategrey", label='AAPL Close')
ax[1].plot(AAPL_short_term_EMA, 'darkorchid', label='Short-term EMA')
ax[0].set_title('AAPL Long-Term EMA')
ax[1].set_title('AAPL Short-Term EMA')
ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)
plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")
For AAPL, a clear up-trend is seen in both graphs, which suggests that higher peaks can be expected.
Food sector companies analyzed: Nestle SA (NSRGF) and Coca-Cola, Inc. (KO)
(1) Calculating short and long term Exponentially Weighted Moving Average for NSRGF:
NSRGF_short_term_EMA = Prices['Close NSRGF USD'].ewm(span=20, adjust=False).mean()
NSRGF_long_term_EMA = Prices['Close NSRGF USD'].ewm(span=200, adjust=False).mean()
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close NSRGF USD'], "darkseagreen", label='NSRGF Close')
ax[0].plot(NSRGF_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close NSRGF USD'], "darkseagreen", label='NSRGF Close')
ax[1].plot(NSRGF_short_term_EMA, 'darkorchid', label='Short-term EMA')
ax[0].set_title('NSRGF Long-Term EMA')
ax[1].set_title('NSRGF Short-Term EMA')
ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)
plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")
As for NSDAQ, the NSRGF graph shows a clear up-trend in the long-term EMA line, which suggests that higher peaks can be expected in the long term, while the short-term EMA shows a down-trend.
(2) Calculating short and long term Exponentially Weighted Moving Average for KO:
KO_short_term_EMA = Prices['Close KO USD'].ewm(span=20, adjust=False).mean()
KO_long_term_EMA = Prices['Close KO USD'].ewm(span=200, adjust=False).mean()
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close KO USD'], "darkseagreen", label='KO Close')
ax[0].plot(KO_long_term_EMA, 'indianred', label='Long-term EMA')
ax[1].plot(Prices['Close KO USD'], "darkseagreen", label='KO Close')
ax[1].plot(KO_short_term_EMA, 'darkorchid', label='Short-term EMA')
ax[0].set_title('KO Long-Term EMA')
ax[1].set_title('KO Short-Term EMA')
ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)
plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")
Once again, as for NSDAQ and NSRGF, the KO graph shows an up-trend in the long-term EMA line but a down-trend in the short-term EMA. However, the long-term up-trend is not as pronounced as for NSDAQ. When a stock price crosses its 200-day EMA, it is a technical signal that a reversal may have occurred (seen in the KO Long-Term EMA graph).
The EMA is used to identify the predominant trend and patterns in the market, as it reduces the noise of everyday price fluctuations. Many trading strategies are based on the EMA; the most straightforward one is as follows:
(1) a long position is held as long as the price series is above the EMA line; (2) a short position is held as long as the price series is below the EMA line (either the short-term or the long-term EMA, depending on the aims of the investing/trading).
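The rule above can be expressed as a position series. The sketch below is a minimal illustration using the last five KO closing prices from the price table and a deliberately short span so that the arithmetic is easy to follow; the function name `ema_position` and the span of 3 are assumptions.

```python
import numpy as np
import pandas as pd

def ema_position(close, span=200):
    """+1 (long) while the close is above its EMA, -1 (short) while it is below, 0 if equal."""
    ema = close.ewm(span=span, adjust=False).mean()
    return np.sign(close - ema).astype(int)

# Last five KO closing prices from the table above; span=3 is for illustration only.
ko_close = pd.Series(
    [53.73, 54.58, 52.45, 52.30, 53.07],
    index=pd.to_datetime(["2021-11-26", "2021-11-29", "2021-11-30", "2021-12-01", "2021-12-02"]),
)
position = ema_position(ko_close, span=3)
print(position)
```

The first value is 0 because the EMA is seeded with the first price, so price and EMA coincide on that day.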
Checking AZN and NSDAQ prices over the period 2012-01 to 2021-12:
fig, ax = plt.subplots(2, 1, figsize=(16,9))
ax[0].plot(Prices['Close AZN USD'], "cadetblue", label='AZN Close')
ax[0].set_title('AZN Price 2012 01-2021 12')
ax[1].plot(Prices['Close NSDAQ USD'], "slategrey", label='NSDAQ Close')
ax[1].set_title('NSDAQ Price 2012 01-2021 12')
ax[0].legend(loc='upper left', frameon=False)
ax[1].legend(loc='upper left', frameon=False)
plt.tight_layout()
sns.set(font_scale=1.5, style="whitegrid")
Data used for ARIMA should be stationary; the graphs above suggest that the data are non-stationary. To confirm this assumption, the Augmented Dickey-Fuller (ADF) test can be performed.
Hypotheses for the ADF test:
*H0: Time series non-stationary*
*H1: Time series stationary*
Checking whether the AZN and NSDAQ time series are stationary by using the Augmented Dickey-Fuller test:
print("ADF p-value AZN:", adfuller(Prices['Close AZN USD'])[1])
print("ADF p-value NSDAQ:", adfuller(Prices['Close NSDAQ USD'])[1])
ADF p-value AZN: 0.7996286742279053
ADF p-value NSDAQ: 1.0
Since the p-value for both stocks is greater than 0.05, H0 cannot be rejected: both time series are treated as non-stationary and should be decomposed and transformed in order to build a reliable ARIMA model.
Seasonality and trend in AZN and NSDAQ prices should be separated from the series. The graphs below show an upward trend for both stocks.
AZN_seasonality = seasonal_decompose(Prices['Close AZN USD'], model='additive', period=60)
fig = AZN_seasonality.plot()
fig.set_size_inches(16, 9)
NSDAQ_seasonality = seasonal_decompose(Prices['Close NSDAQ USD'], model='additive', period=60)
fig = NSDAQ_seasonality.plot()
fig.set_size_inches(16, 9)
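As a step towards stationarity, the series is log-transformed, which stabilizes the variance. According to the methodology, a rolling average is then calculated over the logged values; note that with daily data, rolling(12) covers 12 trading days rather than 12 months.
Logging AZN and NSDAQ values and calculating the 12-observation rolling mean and standard deviation: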
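rcParams['figure.figsize'] = 16, 9
# Log-transform AZN prices and compute rolling statistics over 12 observations
AZN_log = np.log(Prices['Close AZN USD'])
AZN_MA = AZN_log.rolling(12).mean()
AZN_STD = AZN_log.rolling(12).std()
# Same transformation for NSDAQ, kept in separate variables so AZN results are not overwritten
NSDAQ_log = np.log(Prices['Close NSDAQ USD'])
NSDAQ_MA = NSDAQ_log.rolling(12).mean()
NSDAQ_STD = NSDAQ_log.rolling(12).std()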
After log-transforming the values, the ARIMA model can be built; the remaining trend is handled by the differencing term (d) of the model.
First, the data are split into training and test sets. As seen below, 70% of the AZN dataset is used for training and the remaining 30% for testing, while for NSDAQ 90% is used for training and 10% for testing.
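A small sketch of how the log transform and first differencing (the `d` term of ARIMA) act on a trending series is given below. The series is synthetic, with a constant growth rate chosen for illustration, and the variable names are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic price growing 0.1% per step (illustrative data only).
steps = np.arange(500)
price = pd.Series(100.0 * np.exp(0.001 * steps))

log_price = np.log(price)                # exponential growth becomes a straight line
log_return = log_price.diff().dropna()   # first difference removes the linear trend

print(log_price.iloc[[0, -1]].round(4).tolist())
print(log_return.round(6).unique())
```

The logged series still rises over time, while its first difference is constant, which is the behaviour a stationary input requires.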
train_AZN, test_AZN = AZN_log[0:int(len(AZN_log)*0.70)], AZN_log[int(len(AZN_log)*0.70):]
train_NSDAQ, test_NSDAQ = NSDAQ_log[0:int(len(NSDAQ_log)*0.90)], NSDAQ_log[int(len(NSDAQ_log)*0.90):]
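The resulting set sizes are worth checking, since the forecast horizon used later has to match the length of each test set. The helper below is a hypothetical illustration that reproduces the slicing arithmetic for the 2497 observations.

```python
def chronological_split(n_obs, train_fraction):
    """Return (train_size, test_size) for a split that keeps time order."""
    train_size = int(n_obs * train_fraction)
    return train_size, n_obs - train_size

n_obs = 2497  # number of rows in Prices
azn_sizes = chronological_split(n_obs, 0.70)
nsdaq_sizes = chronological_split(n_obs, 0.90)
print("AZN train/test:  ", azn_sizes)
print("NSDAQ train/test:", nsdaq_sizes)
```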
plt.figure(figsize=(16,9))
plt.plot(train_AZN, 'cadetblue', label='AZN Train')
plt.plot(test_AZN, 'indianred', label='AZN Test')
plt.plot(train_NSDAQ, 'slategrey', label='NSDAQ Train')
plt.plot(test_NSDAQ, 'sandybrown', label='NSDAQ Test')
plt.legend(loc='upper left', frameon=False)
The ARIMA model has three parameters, p, d and q, that must be defined. In this analysis, auto ARIMA is used to select them automatically. To build a more accurate model, p, d and q should be evaluated more carefully with additional tests.
ARIMA_Model = auto_arima(train_AZN, start_p=0,start_q=0,test='adf', max_p=3, max_q=3, m=1, d=None, seasonal=False, start_P=0, D=0,
trace=True, error_action='ignore',suppress_warnings=True, stepwise=True)
Performing stepwise search to minimize aic
 ARIMA(0,1,0)(0,0,0)[0] intercept : AIC=-9968.176, Time=0.21 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept : AIC=-9967.525, Time=0.23 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept : AIC=-9967.563, Time=0.20 sec
 ARIMA(0,1,0)(0,0,0)[0] : AIC=-9969.487, Time=0.15 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept : AIC=-9965.028, Time=0.55 sec
Best model: ARIMA(0,1,0)(0,0,0)[0]
Total fit time: 1.355 seconds
Auto ARIMA selected the p, d, q values that best fit the AZN dataset:
p = 0
d = 1
q = 0
Since values are known, they can be used in ARIMA model for AZN.
AZN_ARIMA_Model = ARIMA(train_AZN, order=(0,1,0))
fitted_from_above_AZN= AZN_ARIMA_Model.fit()
fitted_from_above_AZN.summary()
Dep. Variable: | D.Close AZN USD | No. Observations: | 1746 |
---|---|---|---|
Model: | ARIMA(0, 1, 0) | Log Likelihood | 4986.088 |
Method: | css | S.D. of innovations | 0.014 |
Date: | Sun, 05 Dec 2021 | AIC | -9968.176 |
Time: | 23:17:22 | BIC | -9957.246 |
Sample: | 1 | HQIC | -9964.135 |

| | coef | std err | z | P>\|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
const | 0.0003 | 0.000 | 0.830 | 0.407 | -0.000 | 0.001 |
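With p = 0 and q = 0, an ARIMA(0,1,0) model with a constant is a random walk with drift, so its point forecast is a straight line starting at the last observed value. The sketch below reproduces that forecast by hand on a few log prices; the function name is hypothetical, and estimating the drift as the mean of the first differences is an assumption about how the constant is obtained.

```python
import numpy as np

def random_walk_drift_forecast(log_prices, horizon):
    """h-step forecasts of a random walk with drift: last value + h * mean difference."""
    log_prices = np.asarray(log_prices, dtype=float)
    drift = np.diff(log_prices).mean()       # estimate of the constant term
    steps_ahead = np.arange(1, horizon + 1)
    return log_prices[-1] + drift * steps_ahead

# First five AZN closing prices from the table above, on the log scale.
azn_log = np.log([23.86, 23.74, 23.43, 23.49, 23.29])
forecast = random_walk_drift_forecast(azn_log, horizon=3)
print(np.exp(forecast).round(2))   # back-transformed to USD
```

This is why a forecast from such a model shows a trend direction but no short-term fluctuations.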
ARIMA_Model = auto_arima(train_NSDAQ, start_p=0,start_q=0,test='adf', max_p=3, max_q=3, m=1, d=None, seasonal=False, start_P=0, D=0,
trace=True, error_action='ignore',suppress_warnings=True, stepwise=True)
Performing stepwise search to minimize aic
 ARIMA(0,1,0)(0,0,0)[0] intercept : AIC=-12533.901, Time=0.30 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept : AIC=-12543.093, Time=0.33 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept : AIC=-12542.377, Time=0.26 sec
 ARIMA(0,1,0)(0,0,0)[0] : AIC=-12530.647, Time=0.12 sec
 ARIMA(2,1,0)(0,0,0)[0] intercept : AIC=-12543.945, Time=0.34 sec
 ARIMA(3,1,0)(0,0,0)[0] intercept : AIC=-12547.461, Time=0.69 sec
 ARIMA(3,1,1)(0,0,0)[0] intercept : AIC=-12554.554, Time=2.56 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept : AIC=-12541.666, Time=2.30 sec
 ARIMA(3,1,2)(0,0,0)[0] intercept : AIC=-12547.410, Time=0.60 sec
 ARIMA(2,1,2)(0,0,0)[0] intercept : AIC=-12539.425, Time=1.60 sec
 ARIMA(3,1,1)(0,0,0)[0] : AIC=-12541.874, Time=0.46 sec
Best model: ARIMA(3,1,1)(0,0,0)[0] intercept
Total fit time: 9.570 seconds
Auto ARIMA selected the p, d, q values that best fit the NSDAQ dataset:
p = 3
d = 1
q = 1
Since values are known, they can be used in ARIMA model for NSDAQ.
NSDAQ_ARIMA_Model = ARIMA(train_NSDAQ, order=(3,1,1))
fitted_from_above_NSDAQ= NSDAQ_ARIMA_Model.fit()
fitted_from_above_NSDAQ.summary()
Dep. Variable: | D.Close NSDAQ USD | No. Observations: | 2246 |
---|---|---|---|
Model: | ARIMA(3, 1, 1) | Log Likelihood | 6283.278 |
Method: | css-mle | S.D. of innovations | 0.015 |
Date: | Sun, 05 Dec 2021 | AIC | -12554.556 |
Time: | 23:17:32 | BIC | -12520.254 |
Sample: | 1 | HQIC | -12542.035 |

| | coef | std err | z | P>\|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
const | 0.0007 | 0.000 | 2.741 | 0.006 | 0.000 | 0.001 |
ar.L1.D.Close NSDAQ USD | 0.4335 | 0.121 | 3.584 | 0.000 | 0.196 | 0.671 |
ar.L2.D.Close NSDAQ USD | 0.0649 | 0.024 | 2.655 | 0.008 | 0.017 | 0.113 |
ar.L3.D.Close NSDAQ USD | -0.0876 | 0.021 | -4.154 | 0.000 | -0.129 | -0.046 |
ma.L1.D.Close NSDAQ USD | -0.5023 | 0.120 | -4.183 | 0.000 | -0.738 | -0.267 |

| | Real | Imaginary | Modulus | Frequency |
---|---|---|---|---|
AR.1 | 1.7133 | -1.1476j | 2.0621 | -0.0939 |
AR.2 | 1.7133 | +1.1476j | 2.0621 | 0.0939 |
AR.3 | -2.6858 | -0.0000j | 2.6858 | -0.5000 |
MA.1 | 1.9910 | +0.0000j | 1.9910 | 0.0000 |
For forecasting, alpha is set to 0.05, which corresponds to a 95% confidence interval.
# The forecast horizon must equal the test-set length so the forecast can be indexed by the test dates
fc1, se, conf = fitted_from_above_AZN.forecast(len(test_AZN), alpha=0.05)
forecast_series_AZN = pd.Series(fc1, index=test_AZN.index)
lower_series_AZN = pd.Series(conf[:, 0], index=test_AZN.index)  # lower bound of the interval
upper_series_AZN = pd.Series(conf[:, 1], index=test_AZN.index)  # upper bound of the interval
fc2, se, conf = fitted_from_above_NSDAQ.forecast(len(test_NSDAQ), alpha=0.05)
forecast_series_NSDAQ = pd.Series(fc2, index=test_NSDAQ.index)
lower_series_NSDAQ = pd.Series(conf[:, 0], index=test_NSDAQ.index)
upper_series_NSDAQ = pd.Series(conf[:, 1], index=test_NSDAQ.index)
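To show what alpha controls, the sketch below builds the interval by hand for the simplest case, a random walk without drift, where the forecast standard error grows with the square root of the horizon. It is an illustration under that simplifying assumption, not a reproduction of the statsmodels computation; the function name and the last observed value are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def random_walk_interval(last_value, sigma, horizon, alpha=0.05):
    """(lower, upper) bounds for a driftless random-walk forecast at each step 1..horizon."""
    z = NormalDist().inv_cdf(1 - alpha / 2)        # about 1.96 for alpha = 0.05
    steps_ahead = np.arange(1, horizon + 1)
    half_width = z * sigma * np.sqrt(steps_ahead)  # uncertainty grows with sqrt(h)
    return last_value - half_width, last_value + half_width

# sigma = 0.014 is the "S.D. of innovations" reported for AZN; the last value is hypothetical.
lower, upper = random_walk_interval(last_value=np.log(40.0), sigma=0.014, horizon=750)
print(np.exp(lower[[0, -1]]).round(2), np.exp(upper[[0, -1]]).round(2))
```

A smaller alpha gives a wider band, and over a long horizon the band becomes wide even for a small innovation standard deviation.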
plt.plot(train_AZN, 'cadetblue', label='AZN Historical data')
plt.plot(test_AZN, color = 'indianred', label='AZN Stock Price')
plt.plot(forecast_series_AZN, color = 'green')
plt.plot(train_NSDAQ, 'slategrey', label='NSDAQ Historical data')
plt.plot(test_NSDAQ, color = 'sandybrown', label='NSDAQ Stock Price')
plt.plot(forecast_series_NSDAQ, color = 'green',label='Prediction')
plt.legend(loc='upper left', frameon=False)
The green lines in the graph above show the forecast trend, which can be compared with the actual NSDAQ and AZN prices (on the log scale). The graph suggests that the forecast captures the general direction: the trend line is upward, as both stock prices (AZN and NSDAQ) increase gradually.
Moreover, additional measures can be used to check whether the model is acceptable. According to the methodology, if the mean absolute percentage error (MAPE) is around 2.5%, ARIMA can be used for predicting future prices.
Checking MAPE for AZN and NSDAQ ARIMA:
MAPE_AZN = np.mean(np.abs(fc1 - test_AZN)/np.abs(test_AZN))
MAPE_NSDAQ = np.mean(np.abs(fc2 - test_NSDAQ)/np.abs(test_NSDAQ))
print('MAPE AZN: '+str(MAPE_AZN))
print('MAPE NSDAQ: '+str(MAPE_NSDAQ))
MAPE AZN: 0.03651058508201623
MAPE NSDAQ: 0.03846177730169523
MAPE is about 3.7% for AZN and 3.8% for NSDAQ, which is higher than 2.5%, so the model cannot be considered the most reliable one. However, MAPE is still below 10%, which is generally regarded as a good result. Note that MAPE here is computed on log prices, not on prices in USD.
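Since the MAPE above is calculated on log prices, it may help to see how the scale affects the figure. The sketch below computes MAPE on hypothetical prices both in USD and on the log scale; the function name `mape` and the numbers are illustrative assumptions, not results from this notebook.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(predicted - actual) / np.abs(actual))

# Hypothetical prices in USD, for illustration only.
actual_price = np.array([50.0, 55.0, 60.0])
forecast_price = np.array([45.0, 55.0, 66.0])

mape_price = mape(actual_price, forecast_price)                 # error on the USD scale
mape_log = mape(np.log(actual_price), np.log(forecast_price))   # error on the log scale
print(f"MAPE on prices:     {mape_price:.2%}")
print(f"MAPE on log prices: {mape_log:.2%}")
```

To report the error in price terms, the forecast and the test set could be back-transformed with `np.exp` before computing MAPE.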