A data.frame with a numeric response and numeric explanatory variables, simulated from the real-life borehole example: the flow rate of water through a borehole from an upper aquifer to a lower aquifer separated by an impermeable rock layer.
Format
data_numeric_regression
A data frame with 10000 rows and 9 columns:
- \(\mathcal{Y}\)
response variable, flow rate through the borehole.
- \(r_{\mathrm{w}}\)
the radius of the borehole.
- \(r\)
the radius of influence.
- \(T_{\mathrm{u}}\)
the transmissivity of the upper aquifer.
- \(T_1\)
the transmissivity of the lower aquifer.
- \(H_{\mathrm{u}}\)
the potentiometric head of the upper aquifer.
- \(H_1\)
the potentiometric head of the lower aquifer.
- \(L\)
the length of the borehole.
- \(K_{\mathrm{w}}\)
the hydraulic conductivity of the borehole.
Details
The response variable \(\mathcal{Y}\), the flow rate through the borehole in \(\mathrm{m}^3/\mathrm{yr}\), is a nonlinear function of the inputs: $$\mathcal{Y}=\frac{2 \pi T_{\mathrm{u}}\left(H_{\mathrm{u}}-H_1\right)}{\ln \left(r / r_{\mathrm{w}}\right)\left[1+\frac{2 L T_{\mathrm{u}}}{\ln \left(r / r_{\mathrm{w}}\right) r_{\mathrm{w}}^2 K_{\mathrm{w}}}+\frac{T_{\mathrm{u}}}{T_1}\right]},$$ where the 8 input variables and their usual input ranges are:
\(r_{\mathrm{w}} \in[0.05,0.15]\) is the radius of the borehole (\(\mathrm{m}\));
\(r \in[100,50000]\) is the radius of influence (\(\mathrm{m}\));
\(T_{\mathrm{u}} \in[63070,115600]\) is the transmissivity of the upper aquifer (\(\mathrm{m}^2/\mathrm{yr}\));
\(T_1 \in[63.1,116]\) is the transmissivity of the lower aquifer (\(\mathrm{m}^2/\mathrm{yr}\));
\(H_{\mathrm{u}} \in[990,1110]\) is the potentiometric head of the upper aquifer (\(\mathrm{m}\));
\(H_1 \in[700,820]\) is the potentiometric head of the lower aquifer (\(\mathrm{m}\));
\(L \in[1120,1680]\) is the length of the borehole (\(\mathrm{m}\));
\(K_{\mathrm{w}} \in[9855,12045]\) is the hydraulic conductivity of the borehole (\(\mathrm{m}/\mathrm{yr}\)).
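As a rough illustration of how the response depends on the eight inputs, the following Python sketch evaluates the flow rate at the midpoint of each listed range. It is not part of the package: the function name `borehole_flow` and the midpoint values are hypothetical choices made for this example. It also assumes the standard form of the borehole function, in which \(K_{\mathrm{w}}\) enters the denominator to the first power so that every term inside the square bracket is dimensionless.

```python
import math


def borehole_flow(r_w, r, T_u, T_1, H_u, H_1, L, K_w):
    """Flow rate through the borehole in m^3/yr.

    Assumes the standard borehole function, with K_w to the first power.
    """
    log_ratio = math.log(r / r_w)
    numerator = 2.0 * math.pi * T_u * (H_u - H_1)
    bracket = 1.0 + 2.0 * L * T_u / (log_ratio * r_w**2 * K_w) + T_u / T_1
    return numerator / (log_ratio * bracket)


# Evaluate at the midpoint of each listed input range (illustrative values only).
y_mid = borehole_flow(
    r_w=0.10, r=25050.0, T_u=89335.0, T_1=89.55,
    H_u=1050.0, H_1=760.0, L=1400.0, K_w=10950.0,
)
print(y_mid)
```

The flow rate is proportional to the head difference \(H_{\mathrm{u}}-H_1\), so it is zero when the two heads are equal and grows as the upper head rises.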
\(r_{\mathrm{w}}\) follows the normal distribution \(\mathcal{N}\left(0.10,0.0161812^2\right)\), \(r\) follows the lognormal distribution \(\operatorname{Lognormal}\left(7.71,1.0056^2\right)\), and all other variables follow continuous uniform distributions on their corresponding ranges.
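The input distributions described above can be sketched in Python with the standard library alone. This is an illustrative assumption about how such inputs might be drawn, not the code used to build the dataset: the helper `sample_inputs` and the seed are hypothetical, the lognormal parameters are taken to be the mean and standard deviation on the log scale, and no truncation is applied, so draws of \(r_{\mathrm{w}}\) and \(r\) may fall outside the listed ranges.

```python
import random


def sample_inputs(rng):
    """Draw one set of the 8 borehole inputs from the stated distributions."""
    return {
        # Normal with mean 0.10 and standard deviation 0.0161812 (not truncated).
        "r_w": rng.gauss(0.10, 0.0161812),
        # Lognormal; 7.71 and 1.0056 are assumed to be log-scale mean and sd.
        "r": rng.lognormvariate(7.71, 1.0056),
        # Continuous uniform on each listed range.
        "T_u": rng.uniform(63070.0, 115600.0),
        "T_1": rng.uniform(63.1, 116.0),
        "H_u": rng.uniform(990.0, 1110.0),
        "H_1": rng.uniform(700.0, 820.0),
        "L": rng.uniform(1120.0, 1680.0),
        "K_w": rng.uniform(9855.0, 12045.0),
    }


rng = random.Random(2024)  # arbitrary seed, chosen only for reproducibility
draws = [sample_inputs(rng) for _ in range(5)]
for row in draws:
    print(row)
```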