GpRegressor

class inference.gp.GpRegressor(x: numpy.ndarray, y: numpy.ndarray, y_err: numpy.ndarray = None, y_cov: numpy.ndarray = None, hyperpars: numpy.ndarray = None, kernel: inference.gp.covariance.CovarianceFunction = <class 'inference.gp.covariance.SquaredExponential'>, mean: inference.gp.mean.MeanFunction = <class 'inference.gp.mean.ConstantMean'>, cross_val: bool = False, optimizer: str = 'bfgs', n_processes: int = 1, n_starts: int = None)

A class for performing Gaussian-process regression in one or more dimensions.

Gaussian-process regression (GPR) is a non-parametric regression technique that can fit arbitrarily spaced data in any number of dimensions. A key feature of GPR is its ability to account for uncertainties on the data (which must be assumed to be Gaussian) and to propagate them to the regression estimate, which is itself modelled as a multivariate normal distribution.
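
The way Gaussian data uncertainties feed into the estimate can be illustrated with a short from-scratch sketch. This is not the implementation used by this class: the zero prior mean, the squared-exponential kernel with fixed amplitude and length-scale, and the helper names `sq_exp_kernel` and `gp_posterior` are all assumptions made for illustration.

```python
import numpy as np

def sq_exp_kernel(a, b, amplitude=1.0, length=0.5):
    """Squared-exponential covariance between point sets a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return amplitude**2 * np.exp(-0.5 * sq_dist / length**2)

def gp_posterior(x, y, y_err, points):
    """Posterior mean and standard deviation of a zero-mean GP at `points`."""
    # Gaussian data errors enter as a covariance matrix added to the kernel matrix;
    # independent errors give a diagonal matrix of variances.
    K = sq_exp_kernel(x, x) + np.diag(y_err**2)
    K_s = sq_exp_kernel(points, x)
    mean = K_s @ np.linalg.solve(K, y)
    cov = sq_exp_kernel(points, points) - K_s @ np.linalg.solve(K, K_s.T)
    # Clip tiny negative variances caused by round-off before taking the root
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

x = np.linspace(0.0, 1.0, 8).reshape(-1, 1)   # shape (number of points, number of dimensions)
y = np.sin(2 * np.pi * x[:, 0])
points = np.array([[0.5]])

# The same data with small and with large error bars
_, sigma_small = gp_posterior(x, y, np.full(8, 0.01), points)
_, sigma_large = gp_posterior(x, y, np.full(8, 0.5), points)
```

With larger errors on the data, the standard deviation of the estimate at the same point is larger, which is the propagation of uncertainty described above.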

Parameters
  • x – The x-data points as a 2D numpy.ndarray with shape (number of points, number of dimensions). Alternatively, a list of array-like objects can be given, which will be converted to an ndarray internally.

  • y – The y-data values as a 1D numpy.ndarray.

  • y_err – The errors on the y-data values, supplied as a 1D numpy.ndarray. This technique explicitly assumes that errors are Gaussian, so the supplied error values represent normal distribution standard deviations. If this argument is not specified, the errors are taken to be small but non-zero.

  • y_cov – A covariance matrix representing the uncertainties on the y-data values. This is an alternative to the ‘y_err’ keyword argument, allowing the y-data covariance matrix to be specified directly.

  • hyperpars – An array specifying the hyper-parameter values to be used by the covariance function class, which by default is SquaredExponential. See the documentation for the relevant covariance function class for a description of the required hyper-parameters. Generally this argument should be left unspecified, in which case the hyper-parameters will be selected automatically.

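  • kernel (class) – The covariance function class which will be used to model the data. The covariance function classes can be imported from the gp module and then passed to GpRegressor using this keyword argument.

  • mean (class) – The mean function class which will be used to model the data, which by default is ConstantMean.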

  • cross_val (bool) – If set to True, leave-one-out cross-validation is used to select the hyper-parameters in place of the marginal likelihood.

  • optimizer (str) – Selects the method used to optimize the hyper-parameter values. The available options are “bfgs” for scipy.optimize.fmin_l_bfgs_b or “diffev” for scipy.optimize.differential_evolution.

  • n_processes (int) – Sets the number of processes used in optimizing the hyper-parameter values. Multiple processes are only used when the optimizer keyword is set to “bfgs”.

  • n_starts (int) – Sets the number of randomly-selected starting positions from which the BFGS algorithm is launched during hyper-parameter optimization. If unspecified, the number of starting positions is determined based on the total number of hyper-parameters.
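
The multi-start optimization described by the optimizer and n_starts arguments can be sketched as follows. This is an illustration under stated assumptions rather than the procedure this class actually follows: it uses a zero-mean process with a squared-exponential kernel, optimizes the logarithms of two hyper-parameters within hand-picked bounds, and the function name `neg_log_marginal_likelihood` and the value of `n_starts` are hypothetical.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def neg_log_marginal_likelihood(log_theta, x, y, y_err):
    """Negative log marginal likelihood of a zero-mean GP (1D inputs, squared-exponential kernel)."""
    amplitude, length = np.exp(log_theta)
    sq_dist = (x[:, None, 0] - x[None, :, 0]) ** 2
    K = amplitude**2 * np.exp(-0.5 * sq_dist / length**2) + np.diag(y_err**2)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # data-fit term + half the log-determinant + normalization constant
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * len(y) * np.log(2 * np.pi)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 15).reshape(-1, 1)
y_err = np.full(15, 0.1)
y = np.sin(2 * np.pi * x[:, 0]) + rng.normal(scale=y_err)

bounds = [(-3.0, 3.0), (-3.0, 3.0)]   # assumed bounds on log(amplitude), log(length)
lower = [b[0] for b in bounds]
upper = [b[1] for b in bounds]
n_starts = 5                           # hypothetical choice for this sketch

best_theta, best_value = None, np.inf
for _ in range(n_starts):
    start = rng.uniform(lower, upper)  # randomly selected starting position
    theta, value, _ = fmin_l_bfgs_b(
        neg_log_marginal_likelihood, start, args=(x, y, y_err),
        approx_grad=True, bounds=bounds,
    )
    if value < best_value:             # keep the best local optimum found so far
        best_theta, best_value = theta, value

amplitude, length = np.exp(best_theta)
```

Launching a local optimizer from several starting positions reduces the chance of settling in a poor local optimum, at the cost of one optimization run per start.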

__call__(points: ndarray)

Calculate the mean and standard deviation of the regression estimate at a series of specified spatial points.

Parameters

points – The points at which the mean and standard deviation of the regression estimate are to be calculated, given as a 2D numpy.ndarray with shape (number of points, number of dimensions). Alternatively, a list of array-like objects can be given, which will be converted to an ndarray internally.

Returns

Two 1D arrays, the first containing the means and the second containing the standard deviations.

build_posterior(points: ndarray)

Generates the full mean vector and covariance matrix for the Gaussian-process posterior distribution at a set of specified points.

Parameters

points – The points for which the mean vector and covariance matrix are to be calculated, given as a 2D numpy.ndarray with shape (number of points, number of dimensions). Alternatively, a list of array-like objects can be given, which will be converted to an ndarray internally.

Returns

The mean vector as a 1D array, followed by the covariance matrix as a 2D array.
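
One common use of a full mean vector and covariance matrix is drawing correlated sample curves from the posterior. In the sketch below, `mu` and `cov` are hand-built stand-ins for the two returned values, so that the example runs on its own; the numbers used to build them are arbitrary assumptions.

```python
import numpy as np

points = np.linspace(0.0, 1.0, 30).reshape(-1, 1)

# Stand-ins for the mean vector (1D) and covariance matrix (2D) of the posterior
mu = np.sin(2 * np.pi * points[:, 0])
sq_dist = (points[:, None, 0] - points[None, :, 0]) ** 2
cov = 0.1**2 * np.exp(-0.5 * sq_dist / 0.2**2)
cov += 1e-10 * np.eye(len(mu))   # small jitter for numerical stability

rng = np.random.default_rng(2)
# Each row of `samples` is one curve drawn from the multivariate normal distribution
samples = rng.multivariate_normal(mu, cov, size=200)

# The pointwise standard deviation is the square root of the covariance diagonal
sigma = np.sqrt(np.diag(cov))
```

Unlike the pointwise standard deviations, the off-diagonal covariance terms carry the correlation between neighbouring points, which is what makes the sampled curves smooth.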

gradient(points: ndarray)

Calculate the mean and covariance of the gradient of the regression estimate with respect to the spatial coordinates at a series of specified points.

Parameters

points – The points at which the mean vector and covariance matrix of the gradient of the regression estimate are to be calculated, given as a 2D numpy.ndarray with shape (number of points, number of dimensions). Alternatively, a list of array-like objects can be given, which will be converted to an ndarray internally.

Returns

Two arrays, means and covariances, containing the means and covariances of the gradient at each given spatial point. If the number of spatial dimensions N is greater than 1, then the covariances array is a set of 2D covariance matrices, having shape (M,N,N) where M is the given number of spatial points.
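
The shapes described above can be illustrated with a from-scratch sketch of the gradient of a Gaussian-process estimate. It is not the implementation used by this method: it assumes a zero prior mean and an isotropic squared-exponential kernel with fixed hyper-parameters, and the helper names `sq_exp` and `gp_gradient` are hypothetical.

```python
import numpy as np

def sq_exp(a, b, amp, length):
    """Squared-exponential covariance between point sets a (n, N) and b (m, N)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return amp**2 * np.exp(-0.5 * sq_dist / length**2)

def gp_gradient(x, y, y_err, points, amp=1.0, length=0.4):
    """Mean (M, N) and covariance (M, N, N) of the gradient of a zero-mean GP."""
    n_dims = x.shape[1]
    K = sq_exp(x, x, amp, length) + np.diag(y_err**2)
    alpha = np.linalg.solve(K, y)
    means, covariances = [], []
    for p in points:
        k = sq_exp(p[None, :], x, amp, length)[0]          # shape (n_data,)
        # Derivative of the kernel with respect to the query point, shape (n_data, N)
        dk = -(p[None, :] - x) / length**2 * k[:, None]
        means.append(dk.T @ alpha)
        # Prior covariance of the gradient for this kernel, reduced by the data
        prior = (amp / length) ** 2 * np.eye(n_dims)
        covariances.append(prior - dk.T @ np.linalg.solve(K, dk))
    return np.array(means), np.array(covariances)

rng = np.random.default_rng(3)
x = rng.uniform(-1.0, 1.0, size=(40, 2))                   # 40 data points, N = 2
y = np.sin(x[:, 0]) * np.cos(x[:, 1])
y_err = np.full(40, 0.05)
points = np.array([[0.0, 0.0], [0.3, -0.2], [-0.5, 0.4]])  # M = 3

means, covariances = gp_gradient(x, y, y_err, points)
```

For M = 3 points in N = 2 dimensions, `means` has shape (3, 2) and `covariances` has shape (3, 2, 2), one 2D covariance matrix per point.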

Example code

Example code can be found in the Gaussian-process regression Jupyter notebook demo.