GpRegressor
- class inference.gp.GpRegressor(x: numpy.ndarray, y: numpy.ndarray, y_err: numpy.ndarray = None, y_cov: numpy.ndarray = None, hyperpars: numpy.ndarray = None, kernel: inference.gp.covariance.CovarianceFunction = <class 'inference.gp.covariance.SquaredExponential'>, mean: inference.gp.mean.MeanFunction = <class 'inference.gp.mean.ConstantMean'>, cross_val: bool = False, optimizer: str = 'bfgs', n_processes: int = 1, n_starts: int = None)
A class for performing Gaussian-process regression in one or more dimensions.
Gaussian-process regression (GPR) is a non-parametric regression technique which can fit arbitrarily spaced data in any number of dimensions. A key feature of GPR is its ability to account for uncertainties on the data (which must be assumed to be Gaussian) and to propagate those uncertainties to the regression estimate, which is itself modelled as a multivariate normal distribution.
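As a rough illustration of how the regression estimate becomes a multivariate normal distribution, the sketch below implements the textbook GP posterior equations in plain NumPy. It is not the GpRegressor implementation: the function names, the fixed hyper-parameter values and the zero prior mean are assumptions made for brevity (the class itself defaults to ConstantMean and selects hyper-parameters automatically).

```python
import numpy as np

def squared_exponential(a, b, amplitude=1.0, length=1.0):
    """Squared-exponential covariance between point sets a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return amplitude**2 * np.exp(-0.5 * sq_dist / length**2)

def gp_posterior(x, y, y_err, points, amplitude=1.0, length=1.0):
    """Posterior mean vector and covariance matrix of a zero-mean GP (assumed prior)."""
    # Gaussian data errors enter as variances on the diagonal of the data covariance.
    K_xx = squared_exponential(x, x, amplitude, length) + np.diag(y_err**2)
    K_px = squared_exponential(points, x, amplitude, length)
    K_pp = squared_exponential(points, points, amplitude, length)
    mean = K_px @ np.linalg.solve(K_xx, y)
    cov = K_pp - K_px @ np.linalg.solve(K_xx, K_px.T)
    return mean, cov

# Arbitrarily spaced 1D data, stored with shape (number of points, number of dimensions).
x = np.array([[0.0], [1.0], [2.5], [4.0]])
y = np.sin(x[:, 0])
y_err = np.full(4, 0.05)

points = np.linspace(0.0, 4.0, 9)[:, None]
mean, cov = gp_posterior(x, y, y_err, points)

# Marginal standard deviations are the square roots of the covariance diagonal.
sigma = np.sqrt(np.clip(np.diag(cov), 0.0, None))
```

In this sketch the data uncertainty shows up directly in the result: at the measured positions the posterior standard deviation cannot exceed the supplied y_err, and it grows between and beyond the data points.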
- Parameters:
x – The x-data points as a 2D numpy.ndarray with shape (number of points, number of dimensions). Alternatively, a list of array-like objects can be given, which will be converted to an ndarray internally.
y – The y-data values as a 1D numpy.ndarray.
y_err – The error on the y-data values, supplied as a 1D numpy.ndarray. This technique explicitly assumes that errors are Gaussian, so the supplied error values represent normal-distribution standard deviations. If this argument is not specified, the errors are taken to be small but non-zero.
y_cov – A covariance matrix representing the uncertainties on the y-data values. This is an alternative to the ‘y_err’ keyword argument, allowing the y-data covariance matrix to be specified directly.
hyperpars – An array specifying the hyper-parameter values to be used by the covariance function class, which by default is SquaredExponential. See the documentation for the relevant covariance function class for a description of the required hyper-parameters. Generally this argument should be left unspecified, in which case the hyper-parameters will be selected automatically.
kernel (class) – The covariance function class which will be used to model the data. The covariance function classes can be imported from the gp module and then passed to GpRegressor using this keyword argument.
cross_val (bool) – If set to True, leave-one-out cross-validation is used to select the hyper-parameters in place of the marginal likelihood.
optimizer (str) – Selects the method used to optimize the hyper-parameter values. The available options are “bfgs” for scipy.optimize.fmin_l_bfgs_b or “diffev” for scipy.optimize.differential_evolution.
n_processes (int) – Sets the number of processes used in optimizing the hyper-parameter values. Multiple processes are only used when the optimizer keyword is set to “bfgs”.
n_starts (int) – Sets the number of randomly-selected starting positions from which the BFGS algorithm is launched during hyper-parameter optimization. If unspecified, the number of starting positions is determined based on the total number of hyper-parameters.
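To make the idea of selecting hyper-parameters through the marginal likelihood more concrete, here is a minimal NumPy sketch. Everything in it is an assumption for illustration: the function name, the squared-exponential form with a fixed amplitude, the zero prior mean, and above all the simple grid search, which merely stands in for the L-BFGS-B and differential-evolution optimizers named above.

```python
import numpy as np

def log_marginal_likelihood(x, y, y_err, amplitude, length):
    """Log marginal likelihood of a zero-mean GP with a squared-exponential kernel."""
    sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    K = amplitude**2 * np.exp(-0.5 * sq_dist / length**2) + np.diag(y_err**2)
    L = np.linalg.cholesky(K)                      # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # 0.5 * log(det K) equals the sum of the logs of the Cholesky diagonal.
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * len(y) * np.log(2 * np.pi))

# Synthetic data (hypothetical): a noisy sine curve.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 25)[:, None]
y_err = np.full(25, 0.1)
y = np.sin(x[:, 0]) + rng.normal(scale=y_err)

# Grid search over the length-scale, used here in place of a real optimizer.
lengths = np.geomspace(0.1, 10.0, 30)
scores = np.array([log_marginal_likelihood(x, y, y_err, 1.0, l) for l in lengths])
best_length = lengths[np.argmax(scores)]
```

A gradient-based optimizer can stall in a local maximum of this kind of objective, which is one plausible reason for launching it from several random starting positions.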
- __call__(points: ndarray)
Calculate the mean and standard deviation of the regression estimate at a series of specified spatial points.
- Parameters:
points – The points at which the mean and standard deviation of the regression estimate are to be calculated, given as a 2D numpy.ndarray with shape (number of points, number of dimensions). Alternatively, a list of array-like objects can be given, which will be converted to an ndarray internally.
- Returns:
Two 1D arrays, the first containing the means and the second containing the standard deviations.
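The (number of points, number of dimensions) layout is easy to get wrong when evaluating on a grid, so the following sketch shows one way to build such arrays with NumPy and to map 1D results back onto the grid. The variable names are hypothetical, and the `values` array is only a stand-in for the 1D array of means that a call would return.

```python
import numpy as np

# 1D case: a vector of n positions becomes an (n, 1) array.
x1d = np.linspace(0.0, 1.0, 5)
points_1d = x1d.reshape(-1, 1)

# 2D case: flatten a regular grid into an (n_points, 2) array.
gx = np.linspace(0.0, 1.0, 4)
gy = np.linspace(-1.0, 1.0, 3)
mesh_x, mesh_y = np.meshgrid(gx, gy, indexing="ij")
points_2d = np.column_stack([mesh_x.ravel(), mesh_y.ravel()])

# Results come back as 1D arrays of length n_points; reshaping with the
# mesh shape restores the grid layout. `values` is a stand-in array here.
values = np.arange(points_2d.shape[0], dtype=float)
grid_values = values.reshape(mesh_x.shape)
```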
- build_posterior(points: ndarray, mean_only=False)
Generates the full mean vector and covariance matrix for the Gaussian-process posterior distribution at a set of specified points.
- Parameters:
points – The points for which the mean vector and covariance matrix are to be calculated, given as a 2D numpy.ndarray with shape (number of points, number of dimensions). Alternatively, a list of array-like objects can be given, which will be converted to an ndarray internally.
mean_only – If set to True, only the mean vector of the posterior is calculated and returned, instead of both the mean and covariance.
- Returns:
The mean vector as a 1D array, followed by the covariance matrix as a 2D array.
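One common use of a full mean vector and covariance matrix is drawing correlated sample curves from the posterior. The sketch below assumes stand-in `mean` and `cov` arrays constructed by hand (the values are hypothetical); with a fitted regressor they would instead be the two arrays returned by build_posterior.

```python
import numpy as np

# Stand-in posterior (hypothetical values): a sine-shaped mean and a
# squared-exponential covariance with a marginal variance of 0.04.
points = np.linspace(0.0, 5.0, 20)
mean = np.sin(points)
diff = points[:, None] - points[None, :]
cov = 0.04 * np.exp(-0.5 * diff**2)

# Marginal standard deviations are the square roots of the covariance diagonal.
sigma = np.sqrt(np.diag(cov))

# Draw 200 sample curves; a tiny diagonal jitter guards against round-off
# making the covariance matrix numerically indefinite.
rng = np.random.default_rng(1)
jitter = 1e-10 * np.eye(points.size)
samples = rng.multivariate_normal(mean, cov + jitter, size=200)
```

Each row of `samples` is one curve evaluated at all 20 points. Unlike independent draws from the marginal means and standard deviations, these curves respect the correlations between neighbouring points.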
- gradient(points: ndarray)
Calculate the mean and covariance of the gradient of the regression estimate with respect to the spatial coordinates at a series of specified points.
- Parameters:
points – The points at which the mean vector and covariance matrix of the gradient of the regression estimate are to be calculated, given as a 2D numpy.ndarray with shape (number of points, number of dimensions). Alternatively, a list of array-like objects can be given, which will be converted to an ndarray internally.
- Returns:
Two arrays containing the means and covariances of the gradient at each given spatial point. If the number of spatial dimensions N is greater than 1, then the covariances array is a set of 2D covariance matrices, having shape (M,N,N) where M is the given number of spatial points.
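For intuition about the (M,N,N) output shape, the sketch below computes the gradient mean and covariance analytically for a squared-exponential kernel in plain NumPy. It is an illustration under stated assumptions, not the class's own code: the function names, the fixed hyper-parameters and the zero prior mean are all hypothetical choices.

```python
import numpy as np

def se_kernel(a, b, amplitude, length):
    """Squared-exponential covariance between a (m, N) and b (n, N)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return amplitude**2 * np.exp(-0.5 * sq_dist / length**2)

def gp_gradient(x, y, y_err, points, amplitude=1.0, length=1.0):
    """Mean (M, N) and covariance (M, N, N) of the gradient of a zero-mean GP."""
    # An explicit inverse keeps the sketch short; a solver is preferable in practice.
    K_inv = np.linalg.inv(se_kernel(x, x, amplitude, length) + np.diag(y_err**2))
    K_px = se_kernel(points, x, amplitude, length)
    # dK[m, i, a]: derivative of k(points[m], x[i]) w.r.t. coordinate a of points[m].
    dK = -(points[:, None, :] - x[None, :, :]) / length**2 * K_px[:, :, None]
    means = np.einsum("mia,ij,j->ma", dK, K_inv, y)
    # Prior covariance of the gradient of a squared-exponential GP at a single point.
    prior = (amplitude / length) ** 2 * np.eye(x.shape[1])
    covs = prior - np.einsum("mia,ij,mjb->mab", dK, K_inv, dK)
    return means, covs

# Hypothetical 2D data set with N = 2 spatial dimensions.
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 3.0, size=(15, 2))
y = np.sin(x[:, 0]) + np.cos(x[:, 1])
y_err = np.full(15, 0.05)

points = np.array([[0.5, 0.5], [1.5, 2.0], [2.5, 1.0]])  # M = 3
means, covs = gp_gradient(x, y, y_err, points)
```

Here `means[m]` is the expected gradient vector at `points[m]` and `covs[m]` is the N-by-N covariance matrix describing the uncertainty of that vector.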
Example code
Example code can be found in the Gaussian-process regression Jupyter notebook demo.