Probability model for duration: how to write the log-likelihood function for a Weibull distribution to fit aggregate data

Background: I have basic to moderate knowledge of probability and statistics, and I am familiar with the R programming language and optimization routines. I'm reading an excellent article, "Jobs, Strikes, and Wars: Probability Models for Duration" by Morrison and Schmittlein. The authors go on to develop probability models to fit duration data like … Read more
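As a sketch of the kind of likelihood the question is after — in Python rather than R, on simulated individual-level durations (the aggregate-data grouping discussed in the article is left aside), with all data and starting values made up:

```python
import numpy as np
from scipy.optimize import minimize

# Simulated "durations" (e.g. strike lengths): Weibull with shape 1.5, scale 10.
rng = np.random.default_rng(0)
durations = 10.0 * rng.weibull(1.5, size=500)

def neg_log_lik(params, t):
    """Negative log-likelihood of a Weibull(shape k, scale lam) sample.
    log f(t) = log k - k*log lam + (k-1)*log t - (t/lam)^k
    """
    k, lam = params
    if k <= 0 or lam <= 0:          # keep the optimizer inside the valid region
        return np.inf
    return -np.sum(np.log(k) - k * np.log(lam)
                   + (k - 1) * np.log(t) - (t / lam) ** k)

# Derivative-free minimization of the negative log-likelihood.
res = minimize(neg_log_lik, x0=[1.0, 1.0], args=(durations,),
               method="Nelder-Mead")
k_hat, lam_hat = res.x
```

With 500 observations the estimates land close to the generating values (shape 1.5, scale 10); the same function transfers to R's `optim` almost line for line.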

Linear regression and maximum likelihood theory

Maximum likelihood estimator theory often comes with the theoretical result on the variance $\sigma^2(\hat\theta_{MLE}) \sim -\left(E\left[\frac{\partial^2 \log L(\theta=\theta_0)}{\partial\theta^2}\right]\right)^{-1}$, called the inverse of the Fisher information ($\theta_0$ are the unknown exact parameters). The proofs I have seen of this result always consider the problem of estimating the parameters of an unknown distribution having at one's disposal $\{y_1,\dots,y_T\}$, a sample … Read more
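The asymptotic variance formula is easy to check numerically. A minimal sketch (my own toy example, not from the question): for the exponential rate $\lambda$, the Fisher information per observation is $1/\lambda^2$, so $\mathrm{Var}(\hat\lambda) \approx \lambda^2/n$; repeated simulation reproduces this:

```python
import numpy as np

rng = np.random.default_rng(1)
lam0, n, reps = 2.0, 200, 2000

# The MLE of an exponential rate is 1/sample-mean; replicate many samples
# to measure the sampling variance of the estimator directly.
mles = np.array([1.0 / rng.exponential(scale=1.0 / lam0, size=n).mean()
                 for _ in range(reps)])

# Inverse Fisher information: I(lam) = 1/lam^2 per observation,
# so the asymptotic variance of the MLE is lam^2 / n.
asymptotic_var = lam0 ** 2 / n
empirical_var = mles.var()
```

The empirical variance across replications matches the inverse-information prediction to within Monte Carlo error.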

Fitting a neural network via maximum likelihood

Is it possible to fit an NN given pairs (X, Y), such that NN(X) = Y, using maximum likelihood (ML) and not gradient descent (GD)? Example: I have an NN with weights W; I feed it 10 samples X and get 10 outputs Y; I add noise e and get new outputs Y+e; I evaluate … Read more
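One relevant observation: with Gaussian output noise, maximizing the likelihood is exactly minimizing squared error, and that objective can be optimized by a derivative-free method instead of gradient descent. A sketch with a deliberately tiny one-weight "network" (my own stand-in, not the question's architecture), 10 samples, and Nelder-Mead:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(10, 1))
w_true, b_true, sigma = 1.5, -0.5, 0.1
Y = w_true * X[:, 0] + b_true + rng.normal(0, sigma, size=10)  # outputs Y + e

def neg_log_lik(params):
    # "Network" NN(x) = w*x + b with known Gaussian noise sigma:
    # the negative log-likelihood is proportional to the squared error.
    w, b = params
    resid = Y - (w * X[:, 0] + b)
    return 0.5 * np.sum(resid ** 2) / sigma ** 2

# Nelder-Mead uses no gradients at all — pure likelihood evaluations.
res = minimize(neg_log_lik, x0=[0.0, 0.0], method="Nelder-Mead")
w_hat, b_hat = res.x
```

For a real multi-layer network the likelihood surface is the same object GD descends on, so "ML instead of GD" is really a choice of optimizer, not of objective; derivative-free methods scale poorly past a handful of weights.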

Likelihood approximations for shared parameter problems

In graphical models people tend to exploit conditional independence to factorize the likelihood and make the problem simpler. By simpler I mean that the dimension is reduced due to factorization. For example: consider 3 random variables $y_1, y_2$ and $y_3$, and consider the graph structure $y_1 \to y_3 \leftarrow y_2$. In this case we can write the joint density $f(y_1,y_2,y_3) = f(y_1)\,f(y_2)\,f(y_3 \mid y_1,y_2)$ … Read more
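The factorization in the excerpt can be made concrete with illustrative Gaussian factors (my own choice; the question does not specify distributions):

```python
import numpy as np
from scipy.stats import norm

# f(y1, y2, y3) = f(y1) f(y2) f(y3 | y1, y2) for the graph y1 -> y3 <- y2,
# with assumed factors: y1, y2 ~ N(0,1) and y3 | y1, y2 ~ N(y1 + y2, 1).
def joint_density(y1, y2, y3):
    return norm.pdf(y1) * norm.pdf(y2) * norm.pdf(y3, loc=y1 + y2, scale=1.0)

p = joint_density(0.2, -0.1, 0.1)
```

The payoff is that each factor depends on few variables, so estimating or maximizing the likelihood decomposes into low-dimensional pieces instead of one problem over the full joint.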

Numerical issues in Maximum Likelihood Estimation

I am estimating the following noise model: \begin{equation} n \sim \mathcal N\left(\mu(x,y),\sigma(x,y)^2\right)\in\mathbb R \end{equation} \begin{equation} \begin{cases} \mu(x,y) = k_1+k_2x+k_3x^2+k_4y+k_5y^2 \\ \sigma(x,y) = k_6+k_7x+k_8x^2+k_9y+k_{10}y^2 \\ \end{cases} \end{equation} where $x\in[0,3]$ and $y\in[0,\pi/2]$ (thus, scaling does not immediately seem to be an issue). The sample $\{\hat n_i, \hat x_i, \hat y_i\}_{i=1}^N$ has size $N=10981$. Here are some … Read more
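A frequent source of numerical trouble in models like this is that the optimizer visits parameter values where $\sigma(x,y) \le 0$ and the log-likelihood becomes NaN. One common remedy (an assumption on my part, not necessarily the question's issue) is to model $\log\sigma$ instead of $\sigma$. A one-dimensional sketch with made-up coefficients:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
x = rng.uniform(0, 3, size=2000)
n_obs = rng.normal(1.0 + 0.5 * x, 0.2 + 0.1 * x)  # heteroskedastic noise

def neg_log_lik(params):
    # Parameterizing log(sigma) keeps the standard deviation positive for
    # ANY parameter value, so the objective never returns NaN during search.
    a, b, c, d = params
    mu = a + b * x
    log_sig = c + d * x
    return np.sum(log_sig + 0.5 * ((n_obs - mu) / np.exp(log_sig)) ** 2)

res = minimize(neg_log_lik, x0=np.zeros(4), method="BFGS")
a_hat, b_hat = res.x[:2]
```

The mean parameters are recovered accurately even though the log-linear form for $\sigma$ is only an approximation of the generating model.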

Parameters estimation by MLE and Kalman filter

I am trying to estimate the parameters of a discrete nonlinear state-space model using MLE and a Kalman filter: $x_k = f(x_{k-1},\theta) + q_{k-1}$, $y_k = h(x_k,\theta) + r_k$. Here $x_k$ is the state of the dynamic system at discrete time $k$, $y_k$ is the measurement, and $q_{k-1}$ and $r_k$ are independent process and measurement Gaussian noise sequences with zero means and covariances $Q_k$ and $R_k$ respectively. … Read more
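The standard recipe is the prediction-error decomposition: the Kalman filter's innovations and their variances give the log-likelihood term by term, and that likelihood is then maximized over $\theta$. A sketch for the scalar *linear* special case (for the question's nonlinear $f$, $h$ one would substitute an EKF or UKF); all numbers are illustrative:

```python
import numpy as np

def kalman_loglik(y, a, q, r, x0=0.0, p0=1.0):
    """Log-likelihood of x_k = a*x_{k-1} + q_{k-1}, y_k = x_k + r_k
    via the prediction-error decomposition."""
    x, p, ll = x0, p0, 0.0
    for yk in y:
        x_pred = a * x                 # predict state
        p_pred = a * a * p + q        # predict covariance
        v = yk - x_pred                # innovation
        s = p_pred + r                 # innovation variance
        ll += -0.5 * (np.log(2 * np.pi * s) + v * v / s)
        k_gain = p_pred / s            # update
        x = x_pred + k_gain * v
        p = (1 - k_gain) * p_pred
    return ll

# Simulate data, then do MLE of "a" by scanning the likelihood over a grid.
rng = np.random.default_rng(4)
a_true, q, r = 0.8, 0.1, 0.1
x, ys = 0.0, []
for _ in range(400):
    x = a_true * x + rng.normal(0, np.sqrt(q))
    ys.append(x + rng.normal(0, np.sqrt(r)))
grid = np.linspace(0.1, 0.99, 90)
a_hat = grid[np.argmax([kalman_loglik(ys, a, q, r) for a in grid])]
```

In practice one would hand `-kalman_loglik` to a numerical optimizer over all of $\theta$ rather than scan a grid; the grid just makes the likelihood-evaluation role of the filter explicit.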

Maximum Likelihood Stopping Tolerance

I have a large-scale problem that I am training with MLE, and it is taking quite a long time, so I would like to set a stopping condition. How does one set a tolerance level in MLE optimization: absolute change in logL < tolerance; absolute change in logL / number_of_samples < tolerance; or % change in logL < tolerance? Obviously … Read more
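Of the three candidates, the relative (percent) change has the attraction of being invariant to the scale of the likelihood, whereas an absolute threshold depends on sample size and units. A sketch of such a stopping loop on a toy problem (the quadratic objective and gradient step are my own illustration):

```python
def optimize_with_tolerance(step_fn, loglik_fn, theta0,
                            rel_tol=1e-6, max_iter=10_000):
    """Iterate step_fn, stopping when the RELATIVE change in logL falls
    below rel_tol. Returns (theta, number of iterations used)."""
    theta = theta0
    ll_old = loglik_fn(theta)
    for i in range(max_iter):
        theta = step_fn(theta)
        ll_new = loglik_fn(theta)
        if abs(ll_new - ll_old) <= rel_tol * (abs(ll_old) + 1e-12):
            return theta, i + 1
        ll_old = ll_new
    return theta, max_iter

# Toy likelihood logL(theta) = -(theta-3)^2 - 5, climbed by gradient steps.
loglik = lambda th: -(th - 3.0) ** 2 - 5.0
step = lambda th: th - 0.2 * (th - 3.0)   # one gradient-ascent step
theta_hat, n_iter = optimize_with_tolerance(step, loglik, theta0=0.0)
```

One caveat worth noting: if logL can cross or approach zero, a pure relative test misbehaves, which is why the small additive constant appears in the denominator; many practitioners also combine the tolerance with a cap on iterations, as above.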

Maximum likelihood estimation using a random number generator

Imagine that I have a random number generator $X$ where it is impossible (or mathematically intractable) to calculate its probability density function – this can happen when we compose several simpler random variables, that is, when the parameters of a random variable are determined by other random variables. That random variable depends on one (or more) … Read more
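One crude but workable approach when only the generator is available is simulated maximum likelihood: approximate the density at each candidate parameter from a large batch of simulated draws, then maximize that approximate likelihood. A sketch with an assumed compound generator (gamma-distributed exponential scale) and a histogram density estimate:

```python
import numpy as np

rng = np.random.default_rng(5)

def simulator(theta, size, rng):
    """Black-box generator: the scale of an exponential is itself random,
    giving a compound distribution with an awkward closed-form density."""
    scales = rng.gamma(shape=theta, scale=1.0, size=size)
    return rng.exponential(scale=scales)

theta_true = 3.0
data = simulator(theta_true, size=500, rng=rng)   # observed sample

def approx_loglik(theta, data, n_sim=20_000):
    # Estimate the unknown density from simulated draws via a histogram,
    # then evaluate the observed points against it.
    sims = simulator(theta, n_sim, rng)
    hist, edges = np.histogram(sims, bins=60, density=True)
    idx = np.clip(np.searchsorted(edges, data) - 1, 0, len(hist) - 1)
    dens = np.maximum(hist[idx], 1e-12)            # floor to avoid log(0)
    return np.sum(np.log(dens))

grid = np.linspace(1.0, 6.0, 21)
theta_hat = grid[np.argmax([approx_loglik(th, data) for th in grid])]
```

A kernel density estimate would smooth the approximation, and methods like approximate Bayesian computation pursue the same "simulate instead of evaluate" idea more systematically.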

Maximum Likelihood Estimator – Covariance Squared Exponential Matlab

Following the Rasmussen & Williams GPML machine learning book, I'm trying to implement my Gaussian process in MATLAB, avoiding other existing toolboxes or complex pre-assembled functions. But now I'm stuck on the minimization of the negative log marginal likelihood in order to estimate the hyperparameters of my noisy covariance function, and I'm not sure of … Read more
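The objective in question is $\frac{1}{2}\mathbf{y}^\top K^{-1}\mathbf{y} + \frac{1}{2}\log|K| + \frac{n}{2}\log 2\pi$ with $K = K_f + \sigma_n^2 I$. A sketch in Python rather than MATLAB, with simulated data and the hyperparameters optimized on the log scale so they stay positive (the structure transfers to MATLAB directly):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
X = np.sort(rng.uniform(0, 5, size=40))
y = np.sin(X) + rng.normal(0, 0.1, size=40)      # noisy observations

def neg_log_marg_lik(log_params, X, y):
    """Negative log marginal likelihood of a GP with a squared-exponential
    covariance plus i.i.d. noise."""
    sf, ell, sn = np.exp(log_params)             # signal std, lengthscale, noise std
    d2 = (X[:, None] - X[None, :]) ** 2
    K = sf**2 * np.exp(-0.5 * d2 / ell**2) + (sn**2 + 1e-8) * np.eye(len(X))
    L = np.linalg.cholesky(K)                    # stable solve and log-determinant
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (0.5 * y @ alpha + np.sum(np.log(np.diag(L)))
            + 0.5 * len(X) * np.log(2 * np.pi))

x0 = np.log([1.0, 1.0, 0.5])
res = minimize(neg_log_marg_lik, x0, args=(X, y), method="Nelder-Mead")
sf_hat, ell_hat, sn_hat = np.exp(res.x)
```

The Cholesky factor gives both the linear solve and $\log|K| = 2\sum_i \log L_{ii}$ cheaply; GPML's own `minimize` routine would instead use the analytic gradients from chapter 5, which are worth adding once the objective itself is verified.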