Physikalisch-Technische Bundesanstalt (PTB)

Regression

Working Group 8.42

Description

Regression problems occur in many metrological applications, e.g. in everyday calibration tasks (as illustrated in Annex H.3 of the GUM), in the evaluation of interlaboratory comparisons, the characterization of sensors [Matthews et al., 2014], the determination of fundamental constants [Bodnar et al., 2014], interpolation or prediction tasks [Wübbeler et al., 2012], and many more. Such problems arise when the quantity of interest cannot be measured directly, but has to be inferred from measurement data (and their uncertainties) using a mathematical model that relates the quantity of interest to the data. For example, regressions may serve to evaluate the functional relation between variables.

Fig. 1: Illustration of a typical straight line regression problem with normally distributed measurement errors. Displayed are the fitted mean regression curve (solid line) and its pointwise 95% credible intervals (dashed lines). The thin vertical line shows a prediction at a new value $x$ and its 95% credible interval. The small circles represent the measurement data.

Definition and Examples

Regression problems often take the form
$$
\begin{equation*}
y_i = f_{\boldsymbol{\theta}}(x_i) + \varepsilon_i , \quad i=1, \ldots, n \,,
\end{equation*}
$$
where the measurements $\boldsymbol{y}=(y_1, \ldots, y_n)^\top$ are explained by a function $f_{\boldsymbol{\theta}}$ evaluated at values $\boldsymbol{x}=(x_1, \ldots, x_n)^\top$ and depending on unknown parameters $\boldsymbol{\theta}=(\theta_1, \ldots, \theta_p)^\top$. The measurement error $\boldsymbol{\varepsilon}=(\varepsilon_1, \ldots, \varepsilon_n)^\top$ follows a specified distribution $p(\boldsymbol{\varepsilon} | \boldsymbol{\theta}, \boldsymbol{\sigma})$, where $\boldsymbol{\sigma}$ denotes the parameters of the error distribution.
Regressions may be used to describe the relationship between a traceable, highly accurate reference device with values denoted by $x$ and a device to be calibrated with values denoted by $y$. The pairs $(x_i,y_i)$ then denote simultaneous measurements made by the two devices of the same measurand such as, for example, temperature.
A simple example is the Normal straight line regression model (as illustrated in Figure 1):
$$
\begin{equation} \label{int_reg_eq1}
y_i = \theta_1 + \theta_2 x_i + \varepsilon_i , \quad \varepsilon_i \stackrel{iid}{\sim} \text{N}(0, \sigma^2), \quad i=1, \ldots, n \,.
\end{equation}
$$
The basic goal of regression tasks is to estimate the unknown parameters $\boldsymbol{\theta}$ of the regression function and possibly also the unknown parameters $\boldsymbol{\sigma}$ of the error distribution. The estimated regression model may then be used to evaluate the shape of the regression function, to make predictions at intermediate (interpolated) or extrapolated $x$-values, or to invert the regression function to predict $x$-values for new measurements.
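The three uses named above (estimation, prediction, inversion) can be sketched for the straight line model with ordinary least squares. This is a minimal illustration, not the working group's method: the function names, the simulated data and the choice of NumPy are assumptions made here, and only standard uncertainties are reported rather than the credible intervals shown in Fig. 1.

```python
import numpy as np

def fit_straight_line(x, y):
    """Ordinary least-squares fit of y = theta1 + theta2 * x.

    Returns the estimate theta_hat, its covariance matrix, and the
    residual variance estimate s2 (with n - 2 degrees of freedom).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])      # design matrix [1, x]
    theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ theta_hat
    s2 = residuals @ residuals / (len(x) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return theta_hat, cov, s2

def predict(x_new, theta_hat, cov, s2):
    """Prediction at x_new with two standard uncertainties:
    u_curve for the fitted line, u_obs for a new observation."""
    g = np.array([1.0, x_new])
    mean = g @ theta_hat
    u_curve = np.sqrt(g @ cov @ g)
    u_obs = np.sqrt(g @ cov @ g + s2)   # adds the new measurement error
    return mean, u_curve, u_obs

def invert(y_new, theta_hat):
    """Point estimate of x for a new measurement y_new (no uncertainty)."""
    return (y_new - theta_hat[0]) / theta_hat[1]

# Simulated data (hypothetical values): theta1 = 1.0, theta2 = 0.5, sigma = 0.2
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 11)
y = 1.0 + 0.5 * x + rng.normal(0.0, 0.2, size=x.size)

theta_hat, cov, s2 = fit_straight_line(x, y)
mean, u_curve, u_obs = predict(4.5, theta_hat, cov, s2)
x_est = invert(mean, theta_hat)
```

The inversion step returns a point estimate only; attaching an uncertainty to the inverted $x$-value requires propagating the uncertainty of both $\boldsymbol{\theta}$ and the new measurement, which this sketch leaves out.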

Research

Decisions based on regression analyses require a reliable evaluation of measurement uncertainty. However, the current state of the art in uncertainty evaluation in metrology (i.e. the GUM and its supplements) provides little guidance for regression. One reason is that the GUM guidelines are based on a model that relates the quantity of interest (the measurand) to the input quantities, yet regression models cannot be uniquely formulated as such a measurement function. Annex H.3 of the GUM nevertheless suggests, by way of example, one possibility for analyzing regression problems. However, this analysis contains elements from both classical (least squares) and Bayesian statistics, so that the results are not deduced from state-of-knowledge distributions and usually differ from those of a purely classical or purely Bayesian approach, as shown in [Elster et al., 2011].


Consequently, there is a need for guidance and research in metrology for uncertainty evaluation in regression problems. The Joint Committee for Guides in Metrology (JCGM) has recognized this need. PTB Working Group 8.42 led the development of guidance for Bayesian inference in regression problems within the EMRP project NEW04, which is summarized in a Guide [Elster et al., 2015]. This Guide also contains template solutions for specific regression problems with known values $\boldsymbol{x}$ and is available free of charge at the NEW04 project web page. For regression problems with Gaussian measurement errors and linear regression functions (such as in formula (1)), [Klauenberg et al., 2015_2] provide guidance when extensive numerical calculations (such as Markov Chain Monte Carlo methods) are to be avoided in a Bayesian inference.
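To indicate why Bayesian inference can avoid extensive numerical calculations in the Gaussian linear case, the following standard textbook result may be helpful. It is stated here under assumptions that are not taken from the cited references: the design matrix $X$ (with rows $(1, x_i)$ for the straight line model), the noninformative prior $p(\boldsymbol{\theta}, \sigma^2) \propto 1/\sigma^2$, and exactly known $x$-values. With the least-squares quantities
$$
\begin{equation*}
\hat{\boldsymbol{\theta}} = (X^\top X)^{-1} X^\top \boldsymbol{y} , \quad s^2 = \frac{(\boldsymbol{y} - X\hat{\boldsymbol{\theta}})^\top (\boldsymbol{y} - X\hat{\boldsymbol{\theta}})}{n-p} \,,
\end{equation*}
$$
the marginal posterior of the regression parameters is a multivariate Student $t$ distribution with $n-p$ degrees of freedom,
$$
\begin{equation*}
\boldsymbol{\theta} \,|\, \boldsymbol{y} \sim t_{n-p}\!\left(\hat{\boldsymbol{\theta}}, \; s^2 (X^\top X)^{-1}\right) \,.
\end{equation*}
$$
Under these assumptions, credible intervals for $\boldsymbol{\theta}$ are available in closed form and coincide numerically with the classical confidence intervals. Informative priors generally lead to different results.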

Regression problems often involve uncertainty in the $x$-values as well. Within the EMPIR project 17NRM05 EMUE, three adaptable examples were developed which illustrate different aspects of fitting a straight line:

  • For calibrating a sonic nozzle in line with the GUM, [Martens et al., 2020a] demonstrates how all uncertainties involved can be quantified and emphasizes the importance of accounting for correlation.
  • For two methods measuring haemoglobin, [Martens et al., 2020b] quantifies the uncertainty when comparing measurement methods. In particular, the example demonstrates how correlations can be accounted for and shows their impact on regression estimates and uncertainties.
  • For calibrating a torque measuring system with known $x$-values, [Martens et al., 2020c] compares the approaches according to the GUM and Bayes. The Bayesian approach is recommended because it accounts for limited and differing knowledge about the variability of each observation. Analytic expressions are supplied.
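The impact of correlation mentioned in these examples can be illustrated with a generalized least-squares fit. The sketch below is not taken from the cited examples: it assumes exactly known $x$-values and a known covariance matrix $V$ of the $y$-values, and all numbers are hypothetical. Uncertainty in the $x$-values would require an errors-in-variables method, which is not implemented here.

```python
import numpy as np

def gls_straight_line(x, y, V):
    """Generalized least-squares fit of y = theta1 + theta2 * x.

    V is the (assumed known) covariance matrix of the measurements y.
    Returns the estimate and its covariance matrix (X^T V^-1 X)^-1.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    Vinv_X = np.linalg.solve(V, X)     # V^-1 X without forming V^-1
    Vinv_y = np.linalg.solve(V, y)     # V^-1 y
    cov = np.linalg.inv(X.T @ Vinv_X)
    theta_hat = cov @ (X.T @ Vinv_y)
    return theta_hat, cov

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
u = 0.1      # hypothetical standard uncertainty of each y_i
rho = 0.8    # hypothetical common correlation coefficient

n = x.size
V_uncorr = u**2 * np.eye(n)
V_corr = u**2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))

theta_u, cov_u = gls_straight_line(x, y, V_uncorr)
theta_c, cov_c = gls_straight_line(x, y, V_corr)

print("slope uncertainty, uncorrelated:", np.sqrt(cov_u[1, 1]))
print("slope uncertainty, correlated:  ", np.sqrt(cov_c[1, 1]))
```

For this particular equal-correlation structure the estimates are the same in both cases, while the slope uncertainty is smaller in the correlated case. Other correlation structures can change the estimates as well.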


In addition, PTB Working Group 8.42 carries out research emerging from metrological applications involving regression. For example,

  • for the analysis of magnetic field fluctuation thermometry, [Wübbeler et al., 2012] propose and validate a Bayesian approach and [Wübbeler et al., 2013] a simplified approach to perform interpolations or predictions based on regression results,
  • for the determination of fundamental constants, [Bodnar et al., 2014] provide an objective Bayesian inference and compare it to the Birge ratio method,
  • for the analysis of immunological tests called ELISA, [Klauenberg et al., 2015] have developed informative prior distributions which are widely applicable,
  • for the calibration of flow meters, [Kok et al., 2015] provide a Bayesian analysis which accounts for constraints on the values of the regression curve.
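For readers unfamiliar with the Birge ratio method mentioned above, the following sketch shows its common textbook form for a weighted mean of independent results. It is an illustration under assumptions made here (function name, example values, unconditional scaling) and does not reproduce the analysis in the cited paper.

```python
import numpy as np

def birge_adjusted_mean(values, uncertainties):
    """Weighted mean with Birge ratio adjustment.

    Returns the weighted mean, its unadjusted standard uncertainty,
    the Birge ratio, and the uncertainty scaled by the Birge ratio.
    """
    values = np.asarray(values, dtype=float)
    u = np.asarray(uncertainties, dtype=float)
    w = 1.0 / u**2                                  # inverse-variance weights
    mean = np.sum(w * values) / np.sum(w)
    u_mean = np.sqrt(1.0 / np.sum(w))
    chi2 = np.sum(w * (values - mean) ** 2)         # observed chi-square
    birge = np.sqrt(chi2 / (values.size - 1))       # Birge ratio
    return mean, u_mean, birge, birge * u_mean

# Hypothetical results from three laboratories
mean, u_mean, birge, u_adj = birge_adjusted_mean(
    [10.0, 10.2, 9.9], [0.1, 0.1, 0.1]
)
print(mean, u_mean, birge, u_adj)
```

A Birge ratio well above 1 indicates that the results scatter more than their stated uncertainties suggest. Some conventions apply the scaling only in that case; this sketch always returns the scaled value and leaves that decision to the caller.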

Publications

Article

Title: Reliable uncertainty evaluation for ODE parameter estimation - a comparison
Author(s): S. Eichstädt;C. Elster
Journal: Journal of Physics: Conference Series
Year: 2014
Volume: 490
Issue: 1
Pages: 012230
Publisher: IOP Publishing
DOI: 10.1088/1742-6596/490/1/012230
ISSN: 1742-6596
Web URL: http://iopscience.iop.org/article/10.1088/1742-6596/490/1/012230
Keywords: Regression, ODE, parameter identification, dynamic calibration, modelling
Tags: 8.42, Dynamics, Regression
