MLE of a multivariate Hawkes process

I’m struggling with implementing the maximum likelihood estimator for a multivariate Hawkes process (HP). Specifically, while the analytical expression for a log-likelihood function of a univariate HP can be found easily online (e.g. Ozaki, 1979), there seem to be different (inconsistent or equivalent?) versions of the log-likelihood function of a multivariate HP out there. I also tried to derive the estimator myself below and I get yet another result (I’m very new to this subject though). Could somebody clear this up for me? Thanks!

This is my own go at a derivation (I follow the notation used in Laub et al., 2015). Consider a collection of $m$ counting processes $N = (N_1, \ldots, N_m)$ with $t_{i,j}$ the observed arrival times for each counting process ($i = 1, \ldots, m$ and $j$ a natural number). Define a multivariate HP with exponentially decaying excitation functions such that the intensities are

$$\lambda_i(t) = \lambda_i + \sum_{j=1}^{m} \sum_{t_{j,k} < t} \alpha_{i,j} e^{-\beta_{i,j}(t - t_{j,k})}.$$

For this $m$-variate HP the log-likelihood $\ln L(t)$ is equal to the sum of the individual log-likelihoods, i.e. $\ln L(t) = \sum_{j=1}^{m} \ln L_j(t)$, with each individual component

$$\ln L_j(t) = -\int_0^T \lambda_j(u)\,du + \int_0^T \ln \lambda_j(u)\,dN_j(u).$$

Let us first focus on the first part, which we call the compensator Λ.

[Figure: line-by-line derivation of the compensator $\Lambda$; image not reproduced]

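Combining this with the results for the other parts of the log-likelihood should result in:

$$\ln L_1(t_i) = -\lambda_1 T + \frac{\alpha_{1,1}}{\beta_{1,1}} \sum_{f=1}^{F} \left[ e^{-\beta_{1,1}(t_{1,F} - t_{1,f})} - 1 \right] + \frac{\alpha_{1,2}}{\beta_{1,2}} \sum_{g=1}^{G} \left[ e^{-\beta_{1,2}(t_{2,G} - t_{2,g})} - 1 \right] + \sum_{f=1}^{F} \ln\left[ \lambda_1 + \sum_{j=1}^{2} \alpha_{1,j} R_{1,j}(f) \right]$$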

with $R_{1,j}(f) = \sum_{t_{j,k} < t_{1,f}} e^{-\beta_{1,j}(t_{1,f} - t_{j,k})}$. A similar expression can be derived for $\ln L_2(t_i)$.

However, when I compare this result with other articles, I notice some differences. For example, in Toke (slide 56) the expression for the compensator is very different (it sums over every element for every event type) and there is no $\lambda_i T$ term. Next, in Crowley (2013) (pg. 29) the expression for the compensator is much more elaborate. Further, equation 2.8 (page 9) in Zheng (2013) offers yet another alternative (it sums over a subset of the elements for every event type) (note: there is a Matlab implementation at the end of the document). The article that most closely resembles what I find is Carlsson et al. (2007), page 6. As you can see, I'm clearly confused. What is the correct likelihood function that I should program?


  • Ozaki, 1979, Maximum likelihood estimation of Hawkes' self-exciting point processes

  • Crowley, 2013, Point Process Models for Multivariate High-Frequency Irregularly Spaced Data

  • Laub, Taimre & Pollett, 2015, Hawkes Processes

  • Zheng, 2013, High frequency dynamics of order flow

  • Carlsson, Foo, Lee & Shek, 2007, High Frequency Trade Prediction with Bivariate Hawkes Process


There is a small mistake in the derivation. In line 5 (in the inserted figure) one needs $T = t_{1,F} = t_{2,G}$ for the identity to be correct, and this is generally not the case. The terms in the final sums should be $e^{-\beta_{i,1}(T - t_{1,f})} - 1$ and $e^{-\beta_{i,2}(T - t_{2,g})} - 1$, respectively. Otherwise the derivation looks correct.
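A quick numerical check makes the point concrete. The sketch below is not taken from the question: the parameter values, event times, and function names are made up for illustration, and only the Python standard library is used. It integrates the intensity of component 1 over the observation window with a midpoint rule and compares the result with the closed form that uses $T$, and with the variant that uses the last event times instead.

```python
import math

# Hypothetical parameters for component 1 of a bivariate process (illustration only).
lam1 = 0.5                       # baseline intensity lambda_1
alpha = [0.8, 0.3]               # alpha_{1,1}, alpha_{1,2}
beta = [1.5, 2.0]                # beta_{1,1}, beta_{1,2}
events = [[0.7, 1.9, 2.4, 4.1],  # t_{1,f}
          [1.2, 3.3, 3.6]]       # t_{2,g}
T = 5.0                          # end of the observation window


def intensity(u):
    """lambda_1(u): baseline plus exponentially decayed kicks from earlier events."""
    value = lam1
    for a, b, times in zip(alpha, beta, events):
        value += sum(a * math.exp(-b * (u - t)) for t in times if t < u)
    return value


def compensator(end_points):
    """Closed form, with end_points[j] used as the upper limit for event type j."""
    value = lam1 * T
    for a, b, times, end in zip(alpha, beta, events, end_points):
        value += (a / b) * sum(1.0 - math.exp(-b * (end - t)) for t in times)
    return value


def midpoint_integral(f, lo, hi, n=100_000):
    """Plain midpoint rule; accurate enough here to tell the two formulas apart."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


numeric = midpoint_integral(intensity, 0.0, T)
with_T = compensator([T, T])                                    # upper limit T
with_last_event = compensator([events[0][-1], events[1][-1]])   # t_{1,F}, t_{2,G}

print(f"numerical integral      : {numeric:.6f}")
print(f"closed form with T      : {with_T:.6f}")
print(f"closed form with t_F,t_G: {with_last_event:.6f}")
```

With these made-up numbers the version using $T$ should agree with the numerical integral up to discretisation error, while the version using the last event times drops the contribution accumulated between the last events and $T$.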

A slightly simpler derivation can take line 3 as a starting point. Then interchange the sums and the integration, with the resulting inner integral running from $t_{j,k}$ to $T$.
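Spelled out for component $i$, the interchange looks as follows. This is a sketch in the notation of the question, assuming that line 3 of the figure is the integral of the intensity with the double sum still inside the integrand; each kernel term is zero before its event time, which is why the inner integral starts at $t_{j,k}$.

```latex
\begin{aligned}
\Lambda_i(T) = \int_0^T \lambda_i(u)\,du
  &= \lambda_i T
     + \sum_{j=1}^{m} \sum_{k:\, t_{j,k} < T}
       \int_{t_{j,k}}^{T} \alpha_{i,j}\, e^{-\beta_{i,j}(u - t_{j,k})}\,du \\
  &= \lambda_i T
     + \sum_{j=1}^{m} \frac{\alpha_{i,j}}{\beta_{i,j}}
       \sum_{k:\, t_{j,k} < T} \left[ 1 - e^{-\beta_{i,j}(T - t_{j,k})} \right].
\end{aligned}
```

The log-likelihood contains $-\Lambda_i(T)$, which gives the terms $e^{-\beta_{i,j}(T - t_{j,k})} - 1$ with a positive prefactor $\alpha_{i,j}/\beta_{i,j}$.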

It might be worth noting that for the Hawkes process considered here, it is possible to compute $\lambda_i(t_{i,j})$ recursively, which implies that the computational complexity of the log-likelihood can be made linear in the number of jumps (instead of quadratic, as the double sum over the jumps suggests).
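A minimal sketch of that recursion, assuming NumPy, distinct event times, and the exponential kernel from the question. The function names, the argument layout (`alpha[i, j]`, `beta[i, j]`), and the example numbers are my own choices for illustration, not taken from any of the cited references. The state `R[i, j]` holds the decayed sum over past events of type `j` as seen by component `i`; it is updated once per event, so the cost grows linearly with the number of events. A direct double-sum version is included only as a cross-check.

```python
import numpy as np


def hawkes_loglik(events, T, lam, alpha, beta):
    """Log-likelihood of an m-variate Hawkes process with exponential kernels.

    events : list of m sorted sequences, events[j][k] = t_{j,k} in [0, T]
    lam    : (m,) baselines; alpha, beta : (m, m) arrays indexed [i, j]
    Assumes all event times are distinct.
    """
    lam, alpha, beta = (np.asarray(x, dtype=float) for x in (lam, alpha, beta))
    m = len(events)
    times = np.concatenate([np.asarray(e, dtype=float) for e in events])
    types = np.concatenate([np.full(len(e), j) for j, e in enumerate(events)])
    order = np.argsort(times, kind="stable")
    times, types = times[order], types[order]

    # Compensator part, with T (not the last event time) as the upper limit.
    loglik = -lam.sum() * T
    for j in range(m):
        tj = np.asarray(events[j], dtype=float)
        decay = 1.0 - np.exp(-np.outer(beta[:, j], T - tj))  # shape (m, n_j)
        loglik -= np.sum(alpha[:, j] / beta[:, j] * decay.sum(axis=1))

    # Recursive part: R[i, j] = sum over past type-j events of exp(-beta_ij * lag).
    R = np.zeros((m, m))
    t_prev = 0.0
    for t, n in zip(times, types):
        R *= np.exp(-beta * (t - t_prev))            # decay every sum to time t
        loglik += np.log(lam[n] + alpha[n] @ R[n])   # log intensity of type n at t
        R[:, n] += 1.0                               # the new event enters column n
        t_prev = t
    return loglik


def hawkes_loglik_naive(events, T, lam, alpha, beta):
    """Same quantity via the direct double sum (quadratic in the number of events)."""
    lam, alpha, beta = (np.asarray(x, dtype=float) for x in (lam, alpha, beta))
    m = len(events)
    loglik = 0.0
    for i in range(m):
        loglik -= lam[i] * T
        for j in range(m):
            loglik -= alpha[i, j] / beta[i, j] * sum(
                1.0 - np.exp(-beta[i, j] * (T - s)) for s in events[j])
        for t in events[i]:
            rate = lam[i] + sum(
                alpha[i, j] * np.exp(-beta[i, j] * (t - s))
                for j in range(m) for s in events[j] if s < t)
            loglik += np.log(rate)
    return loglik


# Made-up bivariate example: both versions should agree up to rounding error.
events = [np.array([0.7, 1.9, 2.4, 4.1]), np.array([1.2, 3.3, 3.6])]
lam = [0.5, 0.4]
alpha = [[0.8, 0.3], [0.2, 0.6]]
beta = [[1.5, 2.0], [1.0, 2.5]]
fast = hawkes_loglik(events, 5.0, lam, alpha, beta)
slow = hawkes_loglik_naive(events, 5.0, lam, alpha, beta)
print(fast, slow)
```

For estimation, the negative of this value would be handed to a numerical optimizer, with the parameters constrained to be positive.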

I doubt that there are inconsistent versions of the likelihood in the literature, but there may, of course, be mistakes in some of the references. Another (likely) possibility is that the notation or the assumptions differ, or that the representations are, indeed, equivalent, but written in different ways. One possibility is that the baseline intensity $\lambda_i$ is omitted, so that the $\lambda_i T$ term disappears.

Source : Link , Question Author : Pilik , Answer Author : NRH
