Title: | Agnostic Fay-Herriot Model for Small Area Statistics |
---|---|
Description: | Implements the Agnostic Fay-Herriot model, an extension of the traditional small area model. In place of normal sampling errors, the sampling error distribution is estimated with a Gaussian process to accommodate a broader class of distributions. This flexibility is most useful in the presence of bounded, multi-modal, or heavily skewed sampling errors. |
Authors: | Marten Thompson [aut, cre, cph], Snigdhansu Chatterjee [ctb, cph] |
Maintainer: | Marten Thompson <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.2.1 |
Built: | 2025-02-14 04:42:26 UTC |
Source: | https://github.com/cran/agfh |
A maker function that returns a function. The returned function is the adjusted profile likelihood of the data for a given (latent) variance, from Yoshimori & Lahiri (2014).
adj_profile_likelihood_theta_var_maker(X, Y, D)
adj_profile_likelihood_theta_var_maker(X, Y, D)
X |
observed independent data to be analyzed |
Y |
observed dependent data to be analyzed |
D |
known precisions of response |
Returns the adjusted profile likelihood as a function of the variance term in the latent model.
Marten Thompson [email protected]
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) adj.lik <- adj_profile_likelihood_theta_var_maker(X, Y, D) adj.lik(0.5)
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) adj.lik <- adj_profile_likelihood_theta_var_maker(X, Y, D) adj.lik(0.5)
A maker function that returns a function. The returned function is the adjusted residual likelihood of the data for a given (latent) variance, from Yoshimori & Lahiri (2014).
adj_resid_likelihood_theta_var_maker(X, Y, D)
adj_resid_likelihood_theta_var_maker(X, Y, D)
X |
observed independent data to be analyzed |
Y |
observed dependent data to be analyzed |
D |
known precisions of response |
Returns the adjusted residual likelihood as a function of the variance term in the latent model.
Marten Thompson [email protected]
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) adj.lik <- adj_resid_likelihood_theta_var_maker(X, Y, D) adj.lik(0.5)
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) adj.lik <- adj_resid_likelihood_theta_var_maker(X, Y, D) adj.lik(0.5)
Find predictions of using posterior samples from the AGFH model
agfh_theta_new_pred(X_new, beta_samples, theta_var_samples)
agfh_theta_new_pred(X_new, beta_samples, theta_var_samples)
X_new |
single new independent data to be analyzed |
beta_samples |
posterior samples of latent regression parameter |
theta_var_samples |
posterior samples of latent variance parameter |
X_new
should be
x
shaped.
beta_samples
and theta_var_samples
should contain the same number of samples (columns for the former, length of the latter).
Vector containing n samples-many estimates of at
X_new
.
Marten Thompson [email protected]
p <- 3 n.post.samp <- 10 X.new <- matrix(rep(1,p), nrow=1) beta.samp <- matrix(rnorm(n.post.samp*p, mean=2, sd=0.1), ncol=n.post.samp) thvar.samp <- runif(n.post.samp, 0.1, 1) th.preds <- agfh_theta_new_pred(X.new, beta.samp, thvar.samp)
p <- 3 n.post.samp <- 10 X.new <- matrix(rep(1,p), nrow=1) beta.samp <- matrix(rnorm(n.post.samp*p, mean=2, sd=0.1), ncol=n.post.samp) thvar.samp <- runif(n.post.samp, 0.1, 1) th.preds <- agfh_theta_new_pred(X.new, beta.samp, thvar.samp)
Test a sample against the null hypothesis that it comes from a standard Normal distribution.
anderson_darling(samples)
anderson_darling(samples)
samples |
vector of values to be tested |
Wrapper function for corresponding functionality in goftest
. Originally, from Anderson and Darling (1954).
A list containing
name |
authors of normality test applied i.e. 'Anderson Darling' |
statistic |
scalar value of test statistics |
p.value |
corresponding p-value of the test |
Anderson and Darling (1954) via goftest
.
sample <- rnorm(100) anderson_darling(sample)
sample <- rnorm(100) anderson_darling(sample)
The traditional Fay-Herriot small area model has a Normal latent variable and Normal observed response errors. This method generates data with Normal latent variables and Beta errors on the response. Note that the sampling errors are transformed so their mean and variance match the the first two moments of the traditional model.
beta_err_gen (M, p, D, lambda, a, b)
beta_err_gen (M, p, D, lambda, a, b)
M |
number of areal units |
p |
dimension of regressors i.e. |
D |
vector of precisions for response, length |
lambda |
value of latent variance |
a |
first shape parameter of Beta distribution |
b |
second shape parameter of Beta distribution |
A list containing
D |
copy of argument 'D' |
beta |
vector of length 'p' latent coefficients |
lambda |
copy of argument 'lambda' |
X |
matrix of independent variables |
theta |
vector of latent effects |
Y |
vector of responses |
err |
vector of sampling errors |
name |
name of sampling error distribution, including shape parameters |
Marten Thompson [email protected]
M <- 50 p <- 3 D <- rep(0.1, M) lamb <- 1/2 dat <- beta_err_gen(M, p, D, lamb, 1/2, 1/4)
M <- 50 p <- 3 D <- rep(0.1, M) lamb <- 1/2 dat <- beta_err_gen(M, p, D, lamb, 1/2, 1/4)
Test a sample against the null hypothesis that it comes from a standard Normal distribution.
cramer_vonmises(samples)
cramer_vonmises(samples)
samples |
vector of values to be tested |
Wrapper function for corresponding functionality in goftest
. Originally developed in Cramer (1928), Mises (1931), and Smirnov (1936).
A list containing
name |
authors of normality test applied i.e. 'Cramer von Mises' |
statistic |
scalar value of test statistics |
p.value |
corresponding p-value of the test |
Cramer (1928), Mises (1931), and Smirnov (1936) via goftest
.
sample <- rnorm(100) cramer_vonmises(sample)
sample <- rnorm(100) cramer_vonmises(sample)
The traditional Fay-Herriot small area model has a Normal latent variable and Normal observed response errors. This method generates data with Normal latent variables and Gamma errors on the response. Note that the sampling errors are transformed so their mean and variance match the the first two moments of the traditional model.
gamma_err_gen (M, p, D, lambda, shape, rate)
gamma_err_gen (M, p, D, lambda, shape, rate)
M |
number of areal units |
p |
dimension of regressors i.e. |
D |
vector of precisions for response, length |
lambda |
value of latent variance |
shape |
shape parameter of Gamma distribution |
rate |
rate parameter of Gamma distribution |
A list containing
D |
copy of argument 'D' |
beta |
vector of length 'p' latent coefficients |
lambda |
copy of argument 'lambda' |
X |
matrix of independent variables |
theta |
vector of latent effects |
Y |
vector of responses |
err |
vector of sampling errors |
name |
name of sampling error distribution, including shape and rate parameters |
Marten Thompson [email protected]
M <- 50 p <- 3 D <- rep(0.1, M) lamb <- 1/2 dat <- gamma_err_gen(M, p, D, lamb, 1/2, 10)
M <- 50 p <- 3 D <- rep(0.1, M) lamb <- 1/2 dat <- gamma_err_gen(M, p, D, lamb, 1/2, 10)
Find predictions using posterior samples from the traditional Fay-Herriot hierarchical bayesian model
hb_theta_new_pred(X_new, beta_samples, theta_var_samples)
hb_theta_new_pred(X_new, beta_samples, theta_var_samples)
X_new |
single new independent data to be analyzed |
beta_samples |
posterior samples of latent regression parameter |
theta_var_samples |
posterior samples of latent variance parameter |
X_new
should be
x
shaped.
beta_samples
and theta_var_samples
should contain the same number of samples (columns for the former, length of the latter).
Vector containing n samples-many estimates of at
X_new
.
Marten Thompson [email protected]
p <- 3 n.post.samp <- 10 X.new <- matrix(rep(1,p), nrow=1) beta.samp <- matrix(rnorm(n.post.samp*p, mean=2, sd=0.1), ncol=n.post.samp) thvar.samp <- runif(n.post.samp, 0.1, 1) th.preds <- hb_theta_new_pred(X.new, beta.samp, thvar.samp)
p <- 3 n.post.samp <- 10 X.new <- matrix(rep(1,p), nrow=1) beta.samp <- matrix(rnorm(n.post.samp*p, mean=2, sd=0.1), ncol=n.post.samp) thvar.samp <- runif(n.post.samp, 0.1, 1) th.preds <- hb_theta_new_pred(X.new, beta.samp, thvar.samp)
Test a sample against the null hypothesis that it comes from a standard Normal distribution.
kolmogorov_smirnov(samples)
kolmogorov_smirnov(samples)
samples |
vector of values to be tested |
Wrapper function for corresponding functionality in stats
. Originally, from Kolmogorov (1933).
A list containing
name |
name of normality test applied i.e. 'Komogorov Smirnov' |
statistic |
scalar value of test statistics |
p.value |
corresponding p-value from test |
Kolmogorov (1933) via stats
.
sample <- rnorm(100) kolmogorov_smirnov(sample)
sample <- rnorm(100) kolmogorov_smirnov(sample)
A maker function that returns a function. The returned function is a sampler for the agnostic Fay-Herriot model.
X |
observed independent data to be analyzed |
Y |
observed dependent data to be analyzed |
D |
known precisions of response |
var_gamma_a |
latent variance prior parameter, |
var_gamma_b |
latent variance prior parameter, |
S |
vector of starting support values for |
kern.a0 |
scalar variance parameter of GP kernel |
kern.a1 |
scalar lengthscale parameter of GP kernel |
kern.fuzz |
scalar noise variance of kernel |
Creates a Metropolis-within-Gibbs sampler of the agnostic Fay-Herriot model (AGFH).
Returns a sampler, itself a function of initial parameter values (a list with values for , the latent variance of
, and starting values for
, typically zeros), number of samples, thinning rate, and scale of Metropolis-Hastings jumps for
sampling.
Marten Thompson [email protected]
n <- 10 X <- matrix(1:n, ncol=1) Y <- 2*X + rnorm(n, sd=1.1) D <- rep(1, n) ag <- make_agfh_sampler(X, Y, D) params.init <- list( beta=1, theta=rep(0,n), theta.var=1, gamma=rep(0,n) ) ag.out <- ag(params.init, 5, 1, 0.1)
n <- 10 X <- matrix(1:n, ncol=1) Y <- 2*X + rnorm(n, sd=1.1) D <- rep(1, n) ag <- make_agfh_sampler(X, Y, D) params.init <- list( beta=1, theta=rep(0,n), theta.var=1, gamma=rep(0,n) ) ag.out <- ag(params.init, 5, 1, 0.1)
A maker function that returns a function. The returned function is a Gibbs sampler for the traditional Fay-Herriot model.
make_gibbs_sampler(X, Y, D, var_gamma_a=1, var_gamma_b=1)
make_gibbs_sampler(X, Y, D, var_gamma_a=1, var_gamma_b=1)
X |
observed independent data to be analyzed |
Y |
observed dependent data to be analyzed |
D |
known precisions of response |
var_gamma_a |
latent variance prior parameter, |
var_gamma_b |
latent variance prior parameter, |
Returns a Gibbs sampler, itself a function of initial parameter values (a list with values for , and latent variance of
), number of samples, and thinning rate.
Marten Thompson [email protected]
n <- 10 X <- matrix(1:n, ncol=1) Y <- 2*X + rnorm(n, sd=1.1) D <- rep(1, n) gib <- make_gibbs_sampler(X, Y, D) params.init <- list( beta=1, theta=rep(0,n), theta.var=1 ) gib.out <- gib(params.init, 5, 1)
n <- 10 X <- matrix(1:n, ncol=1) Y <- 2*X + rnorm(n, sd=1.1) D <- rep(1, n) gib <- make_gibbs_sampler(X, Y, D) params.init <- list( beta=1, theta=rep(0,n), theta.var=1 ) gib.out <- gib(params.init, 5, 1)
Find maximum a posteriori estimate using posterior samples
map_from_density(param.ts, plot=FALSE)
map_from_density(param.ts, plot=FALSE)
param.ts |
vector of scalar samples |
plot |
boolean, plot or not |
Finds location of max of density from samples.
Scalar MAP estimate.
Marten Thompson [email protected]
n.post.samp <- 10 beta.samp <- rnorm(n.post.samp, 0, 1/2) map_from_density(beta.samp)
n.post.samp <- 10 beta.samp <- rnorm(n.post.samp, 0, 1/2) map_from_density(beta.samp)
Merely wanted to use such a function by name; nothing fancy
mse(x,y)
mse(x,y)
x |
vector of values |
y |
vector of values |
A scalar: the MSE between x
and y
.
Marten Thompson [email protected]
mse(seq(1:10), seq(10:1))
mse(seq(1:10), seq(10:1))
The Fay-Herriot small area model has a Normal latent variable and Normal observed response. This generates data according to that specification.
null_gen (M, p, D, lambda)
null_gen (M, p, D, lambda)
M |
number of areal units |
p |
dimension of regressors i.e. |
D |
vector of precisions for response, length |
lambda |
value of latent variance |
A list containing
D |
copy of argument 'D' |
beta |
vector of length 'p' latent coefficients |
lambda |
copy of argument 'lambda' |
X |
matrix of independent variables |
theta |
vector of latent effects |
Y |
vector of responses |
err |
vector of sampling errors |
name |
name of sampling error distribution |
Marten Thompson [email protected]
M <- 50 p <- 3 D <- rep(0.1, M) lamb <- 1/2 dat <- null_gen(M, p, D, lamb)
M <- 50 p <- 3 D <- rep(0.1, M) lamb <- 1/2 dat <- null_gen(M, p, D, lamb)
A maker function that returns a function. The returned function is the (non-adjusted) residual likelihood of the data for a given (latent) variance, from Yoshimori & Lahiri (2014).
resid_likelihood_theta_var_maker(X, Y, D)
resid_likelihood_theta_var_maker(X, Y, D)
X |
observed independent data to be analyzed |
Y |
observed dependent data to be analyzed |
D |
known precisions of response |
Returns the (non-adjusted) residual likelihood as a function of the variance term in the latent model.
Marten Thompson [email protected]
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) resid.lik <- resid_likelihood_theta_var_maker(X, Y, D) resid.lik(0.5)
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) resid.lik <- resid_likelihood_theta_var_maker(X, Y, D) resid.lik(0.5)
Traditional EBLUE Estimator of Beta
RM_beta_eblue(X, Y, D, theta_var_est)
RM_beta_eblue(X, Y, D, theta_var_est)
X |
observed independent data to be analyzed |
Y |
observed dependent data to be analyzed |
D |
known precisions of response |
theta_var_est |
estimate of variance term for latent model |
Traditional EBLUE estimator of beta.
Returns a vector estimate of beta.
Marten Thompson [email protected]
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) th.var.est <- 0.1 RM_beta_eblue(X, Y, D, th.var.est)
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) th.var.est <- 0.1 RM_beta_eblue(X, Y, D, th.var.est)
Traditional EBLUP Estimator of Theta
RM_theta_eblup(X, Y, D, theta.var.est)
RM_theta_eblup(X, Y, D, theta.var.est)
X |
observed independent data to be analyzed |
Y |
observed dependent data to be analyzed |
D |
known precisions of response |
theta.var.est |
estimate of variance term for latent model; if |
Traditional EBLUP estimator of latent values theta.
Returns a vector of estimates of theta.
Marten Thompson [email protected]
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) th.var.est <- 0.1 RM_theta_eblup(X, Y, D, th.var.est) RM_theta_eblup(X, Y, D)
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) th.var.est <- 0.1 RM_theta_eblup(X, Y, D, th.var.est) RM_theta_eblup(X, Y, D)
X
valuesTraditional EBLUP Estimator of Theta for new X
values
RM_theta_new_pred(X.new, beta.est)
RM_theta_new_pred(X.new, beta.est)
X.new |
new independent data to be analyzed |
beta.est |
estimate of regression term for latent model |
Simply X
'beta.est
Returns a vector of estimates of theta.
Marten Thompson [email protected]
X <- matrix(1:10, ncol=1) b <- 1 RM_theta_new_pred(X, b)
X <- matrix(1:10, ncol=1) b <- 1 RM_theta_new_pred(X, b)
Simple moment-based estimator of the variance of the latent model.
RM_theta_var_moment_est(X, Y, D)
RM_theta_var_moment_est(X, Y, D)
X |
observed independent data to be analyzed |
Y |
observed dependent data to be analyzed |
D |
known precisions of response |
Simple moment-based estimator of the variance of the latent model.
Returns a scalar estimate of variance.
Marten Thompson [email protected]
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) RM_theta_var_moment_est(X, Y, D)
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) RM_theta_var_moment_est(X, Y, D)
Test a sample against the null hypothesis that it comes from a standard Normal distribution.
shapiro_wilk(samples)
shapiro_wilk(samples)
samples |
vector of values to be tested |
Wrapper function for corresponding functionality in stats
. Originally, from Shapiro and Wilk (1975).
A list containing
name |
authors of normality test applied i.e. 'Shapiro Wilk' |
statistic |
scalar value of test statistics |
p.value |
corresponding p-value of the test |
Shapiro and Wilk (1975) via stats
.
sample <- rnorm(100) shapiro_wilk(sample)
sample <- rnorm(100) shapiro_wilk(sample)
Test a sample against the null hypothesis that it comes from a standard Normal distribution with the specified test.
test_u_normal(samples, test)
test_u_normal(samples, test)
samples |
vector of values to be tested |
test |
name of test, one of |
Convenience function for consistent syntax in calling shapiro_wilk
, kolmogorov_smirnov
, cramer_vonmises
, and anderson_darling
tests.
A list containing
name |
authors of normality test applied |
statistic |
scalar value of test statistics |
p.value |
corresponding p-value from test |
Marten Thompson [email protected]
sample <- rnorm(100) test_u_normal(sample, 'SW')
sample <- rnorm(100) test_u_normal(sample, 'SW')
A basic grid search optimizer. Here, used to estimate the variance in the latent model by maximum likelihood.
theta_var_est_grid(likelihood_theta_var)
theta_var_est_grid(likelihood_theta_var)
likelihood_theta_var |
some flavor of likelihood function in terms of latent variance |
likelihood_theta_var
may be created using adj_resid_likelihood_theta_var_maker
or similar.
We recommended implementing a more robust optimizer.
The scalar value that optimizes likelihood_theta_var
, or an error if this value is on the search boundary .
Marten Thompson [email protected]
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) adj.lik <- adj_resid_likelihood_theta_var_maker(X, Y, D) theta_var_est_grid(adj.lik)
X <- matrix(1:10, ncol=1) Y <- 2*X + rnorm(10, sd=1.1) D <- rep(1, 10) adj.lik <- adj_resid_likelihood_theta_var_maker(X, Y, D) theta_var_est_grid(adj.lik)