Borehole Function#

import numpy as np
import matplotlib.pyplot as plt
import uqtestfuns as uqtf

The Borehole test function is an eight-dimensional scalar-valued function. The function has been used in the context of sensitivity analysis [HG83, Wor87] and metamodeling [MMY93].

Test function instance#

To create a default instance of the Borehole test function:

my_testfun = uqtf.Borehole()

Check if it has been correctly instantiated:

print(my_testfun)
Function ID      : Borehole
Input Dimension  : 8 (fixed)
Output Dimension : 1
Parameterized    : False
Description      : Borehole function from Harper and Gupta (1983)
Applications     : metamodeling, sensitivity

Description#

The Borehole function models the flow rate of water through a borehole drilled from the ground surface through two aquifers. The model assumes laminar-isothermal flow, and there is no groundwater gradient, with steady-state flow between the upper aquifer and the borehole and between the borehole and the lower aquifer [HG83]. The function computes the water flow rate through the borehole using the following analytical formula:

\[ \mathcal{M}(\boldsymbol{x}) = \frac{2 \, \pi \, T_u \, (H_u - H_l)}{\ln{(r/rw)} \left[1 + \frac{2 \, L \, Tu}{\ln{(r/rw)} \, r_w^2 K_w} + \frac{T_u}{T_l} \right]} \]

where \(\boldsymbol{x} = \{ r_w, r, T_u, H_u, T_l, H_l, L, K_w\}\) is the vector of input variables defined below. The unit of the output is \(\left[ \mathrm{m}^3 / \mathrm{year} \right]\).

Probabilistic input#

There are two probabilistic input models available as shown in the table below.

No.

Keyword

Source

1.

Harper1983 (default)

[HG83]

2.

Morris1993

[MMY93]

In both specifications, the input variables are modeled as independent random variables. The marginals of the original specification (from [HG83]) are shown below:

Hide code cell source
print(my_testfun.prob_input)
Function ID     : Borehole
Input ID        : Harper1983
Input Dimension : 8
Description     : Probabilistic input model of the Borehole model from
                  Harper and Gupta (1983)
Marginals       :

 No.    Name    Distribution        Parameters                          Description
-----  ------  --------------  ---------------------  -----------------------------------------------
  1      rw        normal      [0.1       0.0161812]            radius of the borehole [m]
  2      r       lognormal        [7.71   1.0056]                 radius of influence [m]
  3      Tu       uniform        [ 63070. 115600.]      transmissivity of upper aquifer [m^2/year]
  4      Hu       uniform          [ 990. 1100.]         potentiometric head of upper aquifer [m]
  5      Tl       uniform          [ 63.1 116. ]        transmissivity of lower aquifer [m^2/year]
  6      Hl       uniform           [700. 820.]          potentiometric head of lower aquifer [m]
  7      L        uniform          [1120. 1680.]                length of the borehole [m]
  8      Kw       uniform         [ 9985. 12045.]     hydraulic conductivity of the borehole [m/year]

Copulas         : Independence

Note

In [MMY93], the non-uniform distributions (\(r_w\) and \(r\)) are replaced with uniform distributions.

For example, to create a Borehole test function using the input specification by [MMY93]:

my_testfun = uqtf.Borehole(input_id="Morris1993")

Reference results#

This section provides several reference results of typical UQ analyses involving the test function.

Sample histogram#

Shown below is the histogram of the output based on \(100'000\) random points:

Hide code cell source
np.random.seed(42)
xx_test = my_testfun.prob_input.get_sample(100000)
yy_test = my_testfun(xx_test)

plt.hist(yy_test, bins="auto", color="#8da0cb");
plt.grid();
plt.ylabel("Counts [-]");
plt.xlabel("$\mathcal{M}(\mathbf{X})$");
plt.gcf().set_dpi(150);
../_images/22a810dd36a7aa4e843c44d0890e0cf08bce1d3b745adefe5a23c5123bf47321.png

Moments estimation#

Shown below is the convergence of a direct Monte-Carlo estimation of the output mean and variance with increasing sample sizes.

Hide code cell source
# --- Compute the mean and variance estimate
np.random.seed(42)
sample_sizes = np.array([1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7], dtype=int)
mean_estimates = np.empty(len(sample_sizes))
var_estimates = np.empty(len(sample_sizes))

for i, sample_size in enumerate(sample_sizes):
    xx_test = my_testfun.prob_input.get_sample(sample_size)
    yy_test = my_testfun(xx_test)
    mean_estimates[i] = np.mean(yy_test)
    var_estimates[i] = np.var(yy_test)

# --- Compute the error associated with the estimates
mean_estimates_errors = np.sqrt(var_estimates) / np.sqrt(np.array(sample_sizes))
var_estimates_errors = var_estimates * np.sqrt(2 / (np.array(sample_sizes) - 1))

# --- Do the plot
fig, ax_1 = plt.subplots(figsize=(6,4))

ax_1.errorbar(
    sample_sizes,
    mean_estimates,
    yerr=mean_estimates_errors,
    marker="o",
    color="#66c2a5",
    label="Mean",
)
ax_1.set_xlabel("Sample size")
ax_1.set_ylabel("Output mean estimate")
ax_1.set_xscale("log");
ax_2 = ax_1.twinx()
ax_2.errorbar(
    sample_sizes + 1,
    var_estimates,
    yerr=var_estimates_errors,
    marker="o",
    color="#fc8d62",
    label="Variance",
)
ax_2.set_ylabel("Output variance estimate")

# Add the two plots together to have a common legend
ln_1, labels_1 = ax_1.get_legend_handles_labels()
ln_2, labels_2 = ax_2.get_legend_handles_labels()
ax_2.legend(ln_1 + ln_2, labels_1 + labels_2, loc=0)

plt.grid()
fig.set_dpi(150)
../_images/25c58cb0e23aa265fdd3498fd75ee400048826a0a93ebfe3046373b00a067b74.png

The tabulated results for each sample size is shown below.

Hide code cell source
from tabulate import tabulate

# --- Compile data row-wise
outputs = []
for (
    sample_size,
    mean_estimate,
    mean_estimate_error,
    var_estimate,
    var_estimate_error,
) in zip(
    sample_sizes,
    mean_estimates,
    mean_estimates_errors,
    var_estimates,
    var_estimates_errors,
):
    outputs += [
        [
            sample_size,
            mean_estimate,
            mean_estimate_error,
            var_estimate,
            var_estimate_error,
            "Monte-Carlo",
        ],
    ]

header_names = [
    "Sample size",
    "Mean",
    "Mean error",
    "Variance",
    "Variance error",
    "Remark",
]

tabulate(
    outputs,
    headers=header_names,
    floatfmt=(".1e", ".4e", ".4e", ".4e", ".4e", "s"),
    tablefmt="html",
    stralign="center",
    numalign="center",
)
Sample size Mean Mean error Variance Variance error Remark
1.0e+01 6.1835e+01 9.3631e+00 8.7667e+02 4.1327e+02 Monte-Carlo
1.0e+02 7.2622e+01 2.8201e+00 7.9528e+02 1.1304e+02 Monte-Carlo
1.0e+03 7.3542e+01 8.6838e-01 7.5409e+02 3.3741e+01 Monte-Carlo
1.0e+04 7.2562e+01 2.8161e-01 7.9307e+02 1.1216e+01 Monte-Carlo
1.0e+05 7.3016e+01 8.9094e-02 7.9377e+02 3.5498e+00 Monte-Carlo
1.0e+06 7.2920e+01 2.8100e-02 7.8959e+02 1.1166e+00 Monte-Carlo
1.0e+07 7.2893e+01 8.8805e-03 7.8863e+02 3.5269e-01 Monte-Carlo

References#

[HG83] (1,2,3,4)

William V. Harper and Sumant K. Gupta. Sensitivity/uncertainty analysis of a borehole scenario comparing latin hypercube sampling and deterministic sensitivity approaches. Technical Report BMI/ONWI-516, Office of Nuclear Waste Isolation, Battelle Memorial Institute, 1983. URL: https://inldigitallibrary.inl.gov/PRR/84393.pdf.

[Wor87]

Brian A. Worley. Deterministic uncertainty analysis. Technical Report ORNL-6428, Oak Ridge National Laboratory (ORNL), Oak Ridge, Tennessee, 1987. doi:10.2172/5534706.

[MMY93] (1,2,3,4)

Max D. Morris, T. J. Mitchell, and D. Ylvisaker. Bayesian design and analysis of computer experiments: use of derivatives in surface prediction. Technometrics, 35(3):245–255, 1993. doi:10.1080/00401706.1993.10485320.