Maximum Likelihood Estimation (MLE) is the process of estimating the parameters of a model from sample data by finding the parameter values that maximise the likelihood function. Consider a sample of independent and identically distributed random variables \( X = (X_1, X_2, \ldots, X_n) \); by independence, the joint density function factorises as
$$f(X_1, X_2, \ldots, X_n) = f(X_1)\, f(X_2) \cdots f(X_n).$$
The likelihood function is defined as the joint density of the observed values \( (x_1, x_2, \ldots, x_n) \), viewed as a function of the parameter \( \theta \):
$$L(\theta; x_1, x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta).$$
The maximum likelihood estimator is then given by:
$$ \hat{\theta} = \underset{\theta}{\arg\max} \; L(\theta; x_1, x_2, \ldots, x_n). $$
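As an illustration (not part of the original derivation), the sketch below evaluates the likelihood of a Bernoulli model on a small made-up 0/1 sample and locates its maximum with a crude grid search over \( \theta \); the data, the Bernoulli choice, and the grid are all assumptions made purely for demonstration.

```python
# Minimal sketch: likelihood of a Bernoulli model on a hypothetical sample,
# maximised by brute-force grid search over theta.
import numpy as np

x = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])  # hypothetical 0/1 observations

def likelihood(theta, x):
    # L(theta) = prod_i f(x_i | theta), with f(x | theta) = theta^x (1-theta)^(1-x)
    return np.prod(theta**x * (1 - theta)**(1 - x))

thetas = np.linspace(0.01, 0.99, 999)
L = np.array([likelihood(t, x) for t in thetas])
theta_hat = thetas[np.argmax(L)]
print(theta_hat)  # close to the sample mean, 0.7
```

For this model the grid search recovers what the calculus gives directly: the maximiser is the sample proportion of ones.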
Finding the maximum of a product of terms is tedious in practice, so we instead work with the logarithm of \( L \), which turns the product into a summation:
$$\log L(\theta; x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} \log f(x_i \mid \theta).$$
Since the logarithm is a strictly increasing function, maximising \( L \) is equivalent to maximising \( \log L \), which is a much simpler problem.
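To make this concrete, the sketch below (again with made-up data) draws a hypothetical exponential sample, minimises the negative log-likelihood numerically with scipy.optimize.minimize_scalar, and checks the result against the closed-form estimator \( \hat{\lambda} = 1/\bar{x} \); the distribution, sample size, and bounds are assumptions chosen only for the example.

```python
# Sketch: numerical maximisation of the log-likelihood for hypothetical
# exponential data, compared with the closed-form estimate 1 / mean(x).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=500)  # made-up sample, true rate = 0.5

def neg_log_likelihood(lam, x):
    # -log L(lambda) = -sum_i log f(x_i | lambda), with f(x | lambda) = lambda * exp(-lambda * x)
    return -np.sum(np.log(lam) - lam * x)

res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 10.0),
                      args=(x,), method="bounded")
print(res.x, 1 / x.mean())  # the two estimates agree closely
```

Minimising the negative log-likelihood is the standard trick for using a numerical minimiser to perform the maximisation above.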