-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AR(IMA) implementation #15
Comments
I've actually been poking at this stuff over the past couple of days, for R's arima() function is implemented as a state-space model, which allows it Definitely worth discussing the big-picture approach to take. I'm used to Sam On Mon, Feb 10, 2014 at 12:38 PM, Theodore Papamarkou <
|
Hi! Of course R implementation was my first thought too, but you also can look to python implementation: https://github.com/statsmodels/statsmodels/blob/master/statsmodels/tsa/arima_model.py Also I'm sorry I hadn't time to merge my garch code before, I think I'll have a time to do it this week finally |
That's great @ElOceanografo, @AndreyKolev, it would be amazing if you manage to merge arima and/or garch! There are a few points for discussion, for example if we are going to deal with missing values (I imagine yes), and therefore if we are going to work with data arrays or time arrays or float64 arrays, but these are not so urgent. As long as we have some Julia code for arima or garch to work with, we can easily change these later. |
Possibly ping @carljv about model hierarchy. Heads up on TimeSeries. It no longer supports DataFrames/DataArrays but instead has implemented the TimeArray type. immutable TimeArray{T,N} <: AbstractTimeArray
timestamp::Vector{Date{ISOCalendar}}
values::Array{T,N}
colnames::Vector{ASCIIString}
function TimeArray(timestamp::Vector{Date{ISOCalendar}}, values::Array{T,N}, colnames::Vector{ASCIIString})
nrow, ncol = size(values, 1), size(values, 2)
nrow != size(timestamp, 1) ? error("values must match length of timestamp"):
ncol != size(colnames,1) ? error("column names must match width of array"):
timestamp != unique(timestamp) ? error("there are duplicate dates"):
~(flipud(timestamp) == sort(timestamp) || timestamp == sort(timestamp)) ? error("dates are mangled"):
flipud(timestamp) == sort(timestamp) ?
new(flipud(timestamp), flipud(values), colnames):
new(timestamp, values, colnames)
end
end The inner constructor looks like a lot of code but simply enforces obvious invariants and orders the data from youngest to oldest. Overall, the design's goals are simplicity speed. It would be interesting to take it through the paces with time models and see where the weaknesses are. |
Thanks @milktrader. Two questions, as I look at the
Cheers! |
It does not yet deal with missing values. That missing piece is being explored in the DataArrays branch. The creation of a TimeArray is straightforward. julia> mydates = [today():days(1):date(2014,3,1)]
20-element Array{Date{ISOCalendar},1}:
2014-02-10
2014-02-11
2014-02-12
2014-02-13
2014-02-14
2014-02-15
⋮
2014-02-24
2014-02-25
2014-02-26
2014-02-27
2014-02-28
2014-03-01
julia> myarray = [rand(20) rand(20)];
julia> mycolnames = ["r1", "r2"];
julia> mytimearray = TimeArray(mydates, myarray, mycolnames)
20x2 TimeArray{Float64,2} 2014-02-10 to 2014-03-01
r1 r2
2014-02-10 | 0.58 0.13
2014-02-11 | 0.81 0.64
2014-02-12 | 0.18 0.07
2014-02-13 | 0.54 0.07
...
2014-02-25 | 0.42 0.99
2014-02-26 | 0.49 0.47
2014-02-27 | 0.77 0.19
2014-02-28 | 0.97 0.55
2014-03-01 | 0.45 0.72 The |
That's really helpful. One more question; is there any constructor that allows leaving the timestamp empty or to some default value if the user of a time series model is not so much interested in the |
If you don't mind cloning the julia> Pkg.clone("git://github.com/JuliaQuant/MarketData.jl.git")
julia> using MarketData
julia> OHLC
16103x6 TimeArray{Float64,2} 1950-01-03 to 2013-12-31
Open High Low Close Volume Adj Cl
1950-01-03 | 16.66 16.66 16.66 16.66 1260000.00 16.66
1950-01-04 | 16.85 16.85 16.85 16.85 1890000.00 16.85
1950-01-05 | 16.93 16.93 16.93 16.93 2550000.00 16.93
1950-01-06 | 16.98 16.98 16.98 16.98 2010000.00 16.98
...
2013-12-24 | 1828.02 1833.32 1828.02 1833.32 1307630000.00 1833.32
2013-12-26 | 1834.96 1842.84 1834.96 1842.02 1982270000.00 1842.02
2013-12-27 | 1842.97 1844.89 1839.81 1841.40 2052920000.00 1841.40
2013-12-30 | 1841.47 1842.47 1838.77 1841.07 0.00 1841.07
2013-12-31 | 1842.61 1849.44 1842.41 1848.36 2312840000.00 1848.36 Yeah, the |
Dropping down to just an array is simple julia> OHLC.values
16103x6 Array{Float64,2}:
16.66 16.66 16.66 16.66 1.26e6 16.66
16.85 16.85 16.85 16.85 1.89e6 16.85
16.93 16.93 16.93 16.93 2.55e6 16.93
16.98 16.98 16.98 16.98 2.01e6 16.98
17.08 17.08 17.08 17.08 2.52e6 17.08
17.03 17.03 17.03 17.03 2.16e6 17.03
⋮ ⋮
1822.92 1829.75 1822.92 1827.99 2.85154e9 1827.99
1828.02 1833.32 1828.02 1833.32 1.30763e9 1833.32
1834.96 1842.84 1834.96 1842.02 1.98227e9 1842.02
1842.97 1844.89 1839.81 1841.4 2.05292e9 1841.4
1841.47 1842.47 1838.77 1841.07 0.0 1841.07
1842.61 1849.44 1842.41 1848.36 2.31284e9 1848.36 |
That's good. What I had in mind is this situation; once the time series models rely on time arrays (as their input or output), the user may just have a float64 array, i.e. the |
@AndreyKolev I've looked into the best way to merge two repositories like these and it's actually not a simple matter. Given that the original TimeModels doesn't have a deep history and the history for GARCH is relatively short, I think the best approach is as follows (but let's discuss this before someone pulls the trigger)
Now you have a branch called |
@Scidom I've been creating dummy timestamp data doing this.
So if you want 100 observations with dummy timestamps, you'd do:
|
@milktrader OK I can try it |
@AndreyKolev I just noticed we already have a branch named |
Just a quick question so that I Know where we are standing with ARIMA. @ElOceanografo is ARIMA coding for now in your realm of activity? This is perfectly fine, just want to know so that we hold off from any coding in case you are/will be working on it. |
In theory yes...though I'm pretty busy with other work right now. If The idea for the state space ARIMA-fitting procedure (in case it's not -specify order of ARIMA model: p, d, q I think we should probably have classic methods (Levinson-Durbin, OLS) Sam On Wed, Feb 12, 2014 at 12:59 PM, Theodore Papamarkou <
|
A week is more than fine - I don't think I will have time within the next 2-3 weeks. If let's say after a month none of us has made any progress, then I will try to make time to lay my hands on it. |
Alright, sounds good. On Wed, Feb 12, 2014 at 2:26 PM, Theodore Papamarkou <
|
Hello, why different languages returns different coefficients of an AR data? The data being the same. I have checked with R, Python and MATLAB, and coefficients are different in each languages. What is reason behind this? Thanks. |
What data are you using? And are the differences substantial? |
Hi @milktrader. I am sorry I haven't had much time to get more actively involved in
TimeModels
. I wanted to ask you what would be the best way to create a model hierarchy in order to implement ARIMA. This won't be useful only to those who work with time series, it will be very beneficial for MCMC too given that several summary statistics for MCMC rely on AR(p) models.I would be happy to try implement ARIMA or to help with it. Do you have any recommendations from which language we could use some code as prototype for ARIMA or maybe at least AR?
The text was updated successfully, but these errors were encountered: