The log likelihood is only defined for stationary time series when using
full MLE. This is how and why the model is restricted to be stationary.
Invertibility isn't a strict requirement but helps with point
identification.
Unstable when near the unit circle means that they are not precisely
estimated. This leads to problems in forecasting, since forecasts from
coefficients of 0.99, 0.975, and 0.9 look very different after a few steps.
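To make the last point concrete, here is a minimal sketch, assuming a zero-mean AR(1) model whose h-step-ahead point forecast is phi**h times the last observed value. The helper name ar1_forecast and the starting value of 1.0 are illustrative choices, not part of statsmodels.

```python
import numpy as np


def ar1_forecast(phi, last_value, steps):
    """Point forecasts of a zero-mean AR(1) for horizons 1..steps.

    The h-step-ahead forecast is phi**h * last_value.
    """
    horizons = np.arange(1, steps + 1)
    return last_value * phi ** horizons


last_value = 1.0  # hypothetical last observation
for phi in (0.99, 0.975, 0.9):
    path = ar1_forecast(phi, last_value, steps=20)
    # Compare how quickly each forecast path decays toward zero.
    print(f"phi={phi}: h=5 -> {path[4]:.3f}, h=20 -> {path[19]:.3f}")
```

Small differences in the estimated coefficient compound with the horizon, so an imprecise estimate near 1 translates into very different forecast paths.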
On Wed, Nov 6, 2019, 21:13 Ido Hadanny ***@***.***> wrote:
> it is not possible to get a non-stationary / non-invertible model
> can you please point me to a paper or to the code of how you're doing
> that? are you replacing the bad roots with their reciprocals? or is it
> another procedure?
> There's nothing wrong with a non-stationary / non-invertible model
> why do you say so? in https://otexts.com/fpp2/arima-r.html
> they say that:
> Any roots close to the unit circle may be numerically unstable, and the
> corresponding model will not be good for forecasting
> doesn't that mean that non-stationary / non-invertible models are bad? or
> is it only a problem if the roots are *close* to the unit circle, not if
> they are far inside/outside the circle? and if it's not a problem, then why
> did you say about the non-stationarity of the start_params that:
> if they suggest a non-stationary model then that likely indicates
> problems with the model specification
> It's probably best to use SARIMAX, yes
> Then maybe it's a good idea that the popular
> https://github.com/tgsmith61591/pmdarima package use
> SARIMAX instead of ARIMA...
> can you please point me to a paper or to the code of how you're doing that? are you replacing the bad roots with their reciprocals? or is it another procedure?
We maximize the likelihood function numerically, and we do not consider parameter combinations that would lead to a non-stationary / non-invertible model (as long as enforce_stationarity=True and enforce_invertibility=True).
Specifically, this is done by numerically maximizing over an unconstrained parameter space so that these parameters essentially describe partial autocorrelations. Then we convert these unconstrained (partial autocorrelation) parameters into the corresponding autoregressive or moving average components, which will be stationary / invertible by definition. The citation is:
Monahan, John F. 1984. "A Note on Enforcing Stationarity in Autoregressive-moving Average Models." Biometrika 71 (2) (August 1): 403-404.
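As a rough illustration of that idea, the sketch below maps arbitrary real numbers to partial autocorrelations in (-1, 1) and then applies the Durbin-Levinson recursion to obtain AR coefficients. It is a simplified stand-alone version written for this thread, not the statsmodels implementation; the function names and the sign convention are assumptions.

```python
import numpy as np


def constrain_stationary(unconstrained):
    """Map unconstrained reals to AR coefficients of a stationary model.

    Step 1: squash each value into (-1, 1) so it can act as a partial
            autocorrelation.
    Step 2: run the Durbin-Levinson recursion to turn partial
            autocorrelations into AR coefficients phi_1..phi_p.
    """
    x = np.asarray(unconstrained, dtype=float)
    pacf = x / np.sqrt(1.0 + x ** 2)
    phi = np.zeros(0)
    for r in pacf:
        # phi_{k,j} = phi_{k-1,j} - r_k * phi_{k-1,k-j}, and phi_{k,k} = r_k
        phi = np.append(phi - r * phi[::-1], r)
    return phi


def is_stationary(phi):
    """True if all roots of 1 - phi_1 z - ... - phi_p z^p lie outside the unit circle."""
    phi = np.asarray(phi, dtype=float)
    roots = np.roots(np.r_[-phi[::-1], 1.0])
    return bool(np.all(np.abs(roots) > 1.0))


# Any real-valued input, however large, yields a stationary AR polynomial.
phi = constrain_stationary([2.0, -1.5, 0.3])
print(phi, is_stationary(phi))
```

The optimizer is free to move anywhere in the unconstrained space, and the transformation guarantees that the likelihood is only ever evaluated at stationary parameter values. The same mapping applied to the moving average side gives invertibility.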
Thank you very much for these insights. I'll be sure to check out this paper and try my best to understand the technique.
But just to close this logic puzzle, there's one piece that's eluding me:
When I asked you whether a non-stationary fit from the start_params method (CSS or some other approximate estimator) is a problem, you said it's a big problem, and you even considered throwing an error:
> Because the starting parameters estimators are usually consistent estimators of the true parameters (even if not efficient), if they suggest a non-stationary model then that likely indicates problems with the model specification
but when I asked you about a non-stationary result from the fancy MLE estimation, you said that it's not really a problem:
> There's nothing wrong with a non-stationary / non-invertible model.
And you're not at all worried about *almost* non-stationary results, which can happen and which Hyndman (https://otexts.com/fpp2/arima-r.html) suggests never to use:
> The auto.arima() function is even stricter, and will not select a model with roots *close* to the unit circle either
The answer to your question is that there are three different issues here:
1. What is the correct *order of integration* of a time series
2. What constitutes a valid model for *estimation*
3. What parameter values are *numerically stable*
For (1): If your data is integrated (non-stationary) then if you select an SARIMAX model that enforces stationarity, you will not be able to recover the true data generating process. What will happen is that the parameter estimates will likely be very close to a non-stationary model, but as I mentioned above, they are constrained to be stationary.
That is why if you select a model that enforces stationarity, but the estimated starting parameters are non-stationary, we issue the warning, so that you know that your model may be inappropriate for the data.
In the page you linked to, Hyndman is describing a procedure to automatically determine an SARIMAX model specification, including the order of integration. He is using a heuristic procedure to do this, and so he has apparently made the choice that it is best to reject models that are very close to being non-stationary (I would guess in favor of an additional application of differencing).
It is different here, though, because SARIMAX requires that you specify the model you want to fit. It then finds the best parameters *for the given specification*. The closer analogue to SARIMAX in R is Arima() and not auto.arima().
For (2): SARIMAX is estimated by putting the model into state space form, and there is no theoretical problem with non-stationary state space models. The likelihood function is slightly different due to different initializations, but our model class can handle this case with no problem.
For (3): there are various known statistical issues with numerical stability around the boundaries of parameter constraints. We bound our parameters very slightly away from the boundary for this reason. Hyndman's concern, however, appears to me to be not so much about numerical stability as it is about finding a good heuristic method for automatically selecting model orders.
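For anyone who wants a heuristic check of this kind on top of a fitted model, the following sketch computes the smallest root modulus of an AR polynomial and flags coefficients whose roots are within a tolerance of the unit circle. The function names and the tolerance of 0.01 are hypothetical choices for illustration; they are not taken from statsmodels or from auto.arima().

```python
import numpy as np


def min_root_modulus(ar_coefs):
    """Smallest |z| among the roots of 1 - phi_1 z - ... - phi_p z^p."""
    phi = np.asarray(ar_coefs, dtype=float)
    roots = np.roots(np.r_[-phi[::-1], 1.0])
    if roots.size == 0:
        # No AR terms (or all zero): there are no roots to worry about.
        return float("inf")
    return float(np.min(np.abs(roots)))


def near_unit_circle(ar_coefs, tol=0.01):
    """Flag models whose smallest root is inside or within tol of the unit circle."""
    return min_root_modulus(ar_coefs) < 1.0 + tol


for coefs in ([0.5], [0.995], [1.2, -0.3]):
    print(coefs, round(min_root_modulus(coefs), 4), near_unit_circle(coefs))
```

A check like this is a model selection rule layered on top of estimation, which is a separate question from whether the estimation itself is valid.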
auto_arima's error_action="ignore" does not work when alternative training methods are specified
alkaline-ml/pmdarima#312