Numerical Estimation of Parameter Covariances

Neil A Thacker, Ashley Seepujak, Paul D Tar, Georgios Krokos

Semantic Scholar (2015)

Abstract
The conventional method for estimating parameter covariances, following a model fitting procedure, involves an analytical estimate of the second derivatives. This makes use of the minimum variance (or Cramér-Rao) bound. We present an approach based upon direct inspection of the behaviour of the cost function around the minimum. The new approach is not only more convenient, but also provides a better approximation to the average behaviour of the cost function, whilst supporting useful tests of cost function behaviour. A test for asymmetry can indicate poor cost function construction and poor parameter specification. Elimination of these problems can lead to better optimisation and to more appropriate covariances. In general, therefore, numerical estimation of covariances can result in both better software and better algorithm design methodology.

1 The Motivation for a Numerical Estimation of Covariances

The purpose of many software algorithms is the estimation of parameters, and the best algorithms are designed using statistical principles. Generally, we can relate the information available in the data, s1, to the information we desire, s2, using the probability P(s2|s1), or something monotonically related to it. The most common approach is to use Likelihood; many algorithms are based around the optimisation of a Likelihood-based cost function, using one of a number of optimisation routines. A meaningful solution which minimises the cost function represents the optimal solution.

A sufficient condition for a located optimum of the cost function to be the global optimum is that both the cost function and the feasible region of the optimisation problem are convex [9]. In practice, most methods erroneously locate local optima, as opposed to the global optimum. Discontinuous cost functions are particularly susceptible to this problem, as they may have several local optima. It is therefore always important to try to ensure that cost functions do not have numerical problems which generate unnecessary discontinuities.

Although a cost function can be optimised using brute-force methods (testing every possible combination of parameters and subsequently choosing the optimal set), it is generally much more efficient to use a search process. Search processes are more efficient when we can assume some properties of the underlying cost function. General cost functions can exhibit a range of behaviours, generating a consequent range of problems for optimisation algorithms. In order of difficulty, valid cost functions (i.e. those based appropriately on probability theory) can be: smooth, differentially discontinuous, noisy, or even stochastic (i.e. non-deterministic). An approximately quadratic function might be considered the simplest, and can be optimised using a number of efficient methods, generally involving the computation of the derivatives of the cost function with respect to each parameter. One category of approach is referred to as 'quasi-Newton', and includes methods which iterate towards a solution using a series of matrix inverses or line searches. Specific algorithms include Levenberg-Marquardt [1] and conjugate gradient [2,3]. Though there are slight differences in numerical stability, most common optimisation schemes are well enough understood to be used to locate a reliable solution when the cost function is smooth and continuous.
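Since the conventional route makes covariances available as a by-product of quasi-Newton optimisation, it is worth seeing what that looks like in practice. Below is a minimal sketch using NumPy and SciPy's BFGS minimiser on an invented straight-line fitting problem; the data, model, and noise level are illustrative assumptions, not taken from the paper. For a negative log-likelihood cost, the inverse Hessian at the minimum gives the conventional (minimum variance bound) covariance estimate.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)

    # Synthetic straight-line data, y = a*x + b, with Gaussian noise of known sigma.
    x = np.linspace(0.0, 10.0, 50)
    sigma = 0.5
    y = 1.5 * x + 2.0 + rng.normal(0.0, sigma, x.size)

    def neg_log_likelihood(p):
        """Negative log-likelihood of the straight-line model (constants dropped)."""
        a, b = p
        return 0.5 * np.sum(((y - (a * x + b)) / sigma) ** 2)

    res = minimize(neg_log_likelihood, x0=[1.0, 1.0], method="BFGS")

    # For an approximately quadratic cost, the inverse Hessian at the minimum
    # approximates the parameter covariance matrix (the minimum variance bound).
    print("fitted parameters:", res.x)
    print("covariance estimate:\n", res.hess_inv)

Note that the inverse Hessian returned by BFGS is itself an iterative approximation, and is only meaningful to the extent that the cost function really is quadratic near the minimum, which is precisely the assumption questioned below.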
Provided that any choice of algorithm ultimately converges on the same optimal solution, its statistical characteristics must be the same, irrespective of the optimisation method. The properties of the parametric solution (for example, estimation error and correlations) are therefore dictated by the underlying probability theory, and not by the algorithm used to find it, despite the often strongly held contrary opinions of developers (evaluations often show indistinguishable performance for competing approaches [4,5]). The choice of algorithm therefore normally depends upon other issues, for example robustness to local optima, computational efficiency, ease of application, and auxiliary factors such as the provision of parameter covariances. Parameter covariances are particularly important for scientific applications, where we need to use the parameters in hypothesis tests, or to combine them with other such estimates. Quasi-Newton approaches are often selected for their good efficiency, but they also provide easy access to parameter covariances (the required second derivatives being computed as part of the optimisation algorithm). To achieve this efficiency, however, they rely on a cost function which is quadratic in the region of the minimum. In the context of basic research, we often do not know the behaviour of a novel Likelihood function, and approximately quadratic behaviour of the cost function is not something one can simply expect. We
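A sketch of the direct-inspection alternative described in the abstract, for a single parameter, might look as follows. The function name, step size, and the rise of delta = 0.5 above the minimum (the one-standard-deviation crossing for a negative log-likelihood; delta = 1.0 for a chi-squared cost) are illustrative assumptions rather than details taken from the paper.

    import numpy as np

    def scan_errors(cost, theta_hat, delta=0.5, step=1e-3, max_steps=100000):
        """Walk a one-parameter cost function away from its minimum in both
        directions until it rises by `delta` above the minimum value
        (delta = 0.5 marks the one-standard-deviation points of a negative
        log-likelihood; use delta = 1.0 for a chi-squared cost)."""
        c0 = cost(theta_hat)
        half_widths = []
        for direction in (+1.0, -1.0):
            theta = theta_hat
            for _ in range(max_steps):
                theta += direction * step
                if cost(theta) >= c0 + delta:
                    break
            half_widths.append(abs(theta - theta_hat))
        sigma_plus, sigma_minus = half_widths
        # A large relative difference between the two half-widths flags an
        # asymmetric cost function, i.e. potentially poor cost function
        # construction or parameter specification.
        asymmetry = abs(sigma_plus - sigma_minus) / (0.5 * (sigma_plus + sigma_minus))
        return sigma_plus, sigma_minus, asymmetry

    def quadratic(t):
        # Known analytic case: a -log L with true sigma = 0.3, minimum at t = 2.
        return (t - 2.0) ** 2 / (2.0 * 0.3 ** 2)

    print(scan_errors(quadratic, theta_hat=2.0))  # expect approx (0.3, 0.3, ~0)

Comparing the two half-widths gives the kind of asymmetry test the abstract proposes: a large asymmetry value warns that a single symmetric covariance is a poor summary, and that the cost function or parameterisation may need revisiting. For multiple parameters, the same scan would be applied to each parameter in turn, holding the others at their fitted values or re-optimising them at each step.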