
This vignette summarizes the key mathematical objects and estimating-equation derivations behind the empirical likelihood (EL) estimator implemented in the nmar package, and maps them to code. It covers both data-frame (IID) and survey design use cases, allows arbitrary numbers of response-model and auxiliary covariates, and supports both logit and probit response families.

Notation

Units

  • $i = 1, \ldots, n$ indexes respondents (those with observed $Y$)
  • $R_i \in \{0, 1\}$ is the response indicator; we work on the observed subset $R_i = 1$

Data

  • Outcome: $Y_i$ (observed when $R_i = 1$; missing otherwise)
  • Response covariates: row vector $Z_i \in \mathbb{R}^K$, the $i$th row of the missingness-model design matrix built from the formula as an intercept, the LHS outcome expression (evaluated in the model frame), and any additional missingness predictors on the RHS after |
  • Auxiliary covariates: row vector $X_i \in \mathbb{R}^L$ (possibly $L = 0$), from the auxiliary RHS (no intercept)
  • Population auxiliary means: $\mu_x \in \mathbb{R}^L$, known; names match the columns of $X$

Mapping to code:

  • response_model_matrix (the missingness_design in el_prepare_inputs()) corresponds to $Z$ and has columns: (Intercept), the evaluated LHS outcome expression, and any RHS2 predictors (after the | in the formula)
  • auxiliary_matrix corresponds to $X$ (no intercept); we center it in code as $X - \mu_x$

Response Model (Family functions)

  • Linear predictor: $\eta_i = Z_i \beta$
  • Response probability: $w_i \equiv g(\eta_i) = \mathrm{linkinv}(\eta_i)$
  • First derivative: $\dfrac{dw}{d\eta}(\eta_i) = \mu_{\eta,i} = \mathrm{mu.eta}(\eta_i)$
  • Second derivative: $\dfrac{d^2 w}{d\eta^2}(\eta_i) = \mathrm{d2mu.deta2}(\eta_i)$

Here linkinv, mu.eta, and d2mu.deta2 refer to the chosen response family (logit or probit). We follow the paper’s $w_i$ notation for the response probability and reserve $p_i^{\text{EL}}$ for empirical-likelihood weights. A minimal sketch of such a family object follows.
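For concreteness, the following sketch shows one way to write a family object with these three slots. It is only an illustration of the notation; el_family_sketch and its layout are hypothetical and not the package’s internal family constructor.

```r
# Hypothetical sketch of logit/probit family helpers exposing the three
# slots used in this vignette: linkinv, mu.eta, and d2mu.deta2.
el_family_sketch <- function(link = c("logit", "probit")) {
  link <- match.arg(link)
  if (link == "logit") {
    list(
      linkinv    = function(eta) plogis(eta),                      # w = g(eta)
      mu.eta     = function(eta) plogis(eta) * (1 - plogis(eta)),  # dw/deta
      d2mu.deta2 = function(eta) {                                 # d2w/deta2
        w <- plogis(eta)
        w * (1 - w) * (1 - 2 * w)
      }
    )
  } else {
    list(
      linkinv    = function(eta) pnorm(eta),        # w = Phi(eta)
      mu.eta     = function(eta) dnorm(eta),        # dw/deta = phi(eta)
      d2mu.deta2 = function(eta) -eta * dnorm(eta)  # d2w/deta2 = -eta * phi(eta)
    )
  }
}
```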

Weight Re-parameterization

  • $W \in (0, 1)$ is a nuisance scalar; we parameterize via $z = \operatorname{logit}(W)$ for stability and set $W = \operatorname{plogis}(z)$
  • $\lambda_W \in \mathbb{R}$ and $\lambda_x \in \mathbb{R}^L$ are EL Lagrange multipliers for the $W$-constraint and the auxiliary constraints

EL Weights

  • Denominator: $D_i = 1 + \lambda_W (w_i - W) + (X_i - \mu_x)^\top \lambda_x$
  • Base sampling weights: $a_i = 1$ (IID) or $a_i =$ the survey base weight for respondent $i$
  • EL weights for respondents: $p_i^{\text{EL}} \propto a_i / D_i$ (proportionality normalized by totals below)

Estimator

  • $\hat{Y} = \sum p_i^{\text{EL}} Y_i / \sum p_i^{\text{EL}}$ (a short R sketch of this algebra follows)
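The standalone sketch below puts these definitions together: it computes $D_i$, the EL weights, and the ratio estimator for given parameter values. The function name and interface are illustrative, not the package API.

```r
# Illustrative sketch of the EL weight and estimator algebra (not package code).
# Inputs: y (outcomes), w (response probabilities), W, lambda_W, lambda_x,
# X (auxiliary matrix), mu_x (population means), a (base weights).
el_point_estimate_sketch <- function(y, w, W, lambda_W, lambda_x, X, mu_x, a) {
  Xc <- sweep(X, 2, mu_x, "-")                        # X_i - mu_x
  D  <- 1 + lambda_W * (w - W) + drop(Xc %*% lambda_x)
  m  <- a / pmax(D, 1e-8)                             # unnormalized EL masses a_i / D_i
  p  <- m / sum(m)                                    # probability-scale EL weights
  sum(p * y) / sum(p)                                 # ratio estimator Y_hat
}
```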

Notation at a Glance

| Symbol | Meaning |
|---|---|
| $i$ | Respondent index (rows with observed $Y$) |
| $Y_i$ | Outcome for unit $i$ (observed if $R_i = 1$) |
| $Z_i$ | Row of response design matrix (includes intercept) |
| $X_i$ | Row of auxiliary design (no intercept) |
| $\mu_x$ | Known population means of auxiliaries (vector) |
| $\beta$ | Response-model coefficients |
| $\eta_i = Z_i \beta$ | Linear predictor for response model |
| $w_i$ | $\mathrm{linkinv}(\eta_i)$ (logit: $\mathrm{plogis}$; probit: $\Phi$) |
| $\mu_{\eta,i}$ | $dw_i / d\eta_i$ |
| $\lambda_W$ | Multiplier for the $W$-constraint $\sum (w_i - W)/D_i = 0$ |
| $\lambda_x$ | Multipliers for the auxiliary constraints $\sum (X_i - \mu_x)/D_i = 0$ |
| $D_i$ | $1 + \lambda_W (w_i - W) + (X_i - \mu_x)^\top \lambda_x$ |
| $a_i$ | Base weight (IID: 1; survey: design weight) |
| $p_i^{\mathrm{EL}}$ | Empirical-likelihood weight $\propto a_i / D_i$ |
| $\hat{Y}$ | $\sum p_i^{\mathrm{EL}} Y_i / \sum p_i^{\mathrm{EL}}$ |
| $F(\theta)$ | Stacked estimating system ($\beta$, $W$, constraints) |
| $A$ | Jacobian $\partial F / \partial \theta$ |

Note: QLS use $\theta$ for the response-model parameter; in this vignette that parameter is $\beta$. We use $\theta$ for the full stacked unknown vector solved by the EL engine.

Engines

  • Family: “logit” (default) or “probit”. For respondents ($R_i = 1$), the score with respect to $\eta$ is $s_i = \partial \log w_i / \partial \eta_i = \mu_{\eta,i} / w_i$ (equal to $1 - w_i$ for logit and $\phi(\eta_i)/\Phi(\eta_i)$ for probit). In code we compute these using stable log-domain formulas for probit and clip probabilities away from 0 and 1 when they appear in ratios.
  • Scaling: optional standardization of design matrices and $\mu_x$ via nmar_scaling_recipe

Data and Interface Constraints

Before applying the EL equations, the implementation enforces several constraints on the formula and data (el_prepare_inputs() in src_dev/engines/el/impl/input.R and the entry points in src_dev/engines/el/impl/dataframe.R and src_dev/engines/el/impl/survey.R):

  • Single outcome source: The LHS expression must reference exactly one outcome source variable in data (for example Y_miss). Any transformation is applied to this variable in the model frame, and the transformed values must be finite for all respondent rows (no new NA/NaN introduced among $R_i = 1$).
  • Outcome only via LHS in the response model: The raw outcome variable and the LHS expression are not allowed on RHS1 (auxiliaries) or RHS2 (missingness predictors), either explicitly or via . expansion. The response model uses the evaluated LHS outcome column as a dedicated predictor in missingness_design, together with an intercept and any additional RHS2 predictors.
  • Auxiliaries among respondents: Auxiliary variables (RHS1) must be fully observed and non-constant among respondents. If auxiliary_means are not supplied, auxiliaries must be fully observed in the full data so that population means can be estimated from the sample.
  • Missingness predictors among respondents: Missingness predictors (RHS2) must be fully observed among respondents. Zero-variance predictors are allowed but generate a warning; their columns still enter the response model design matrix.
  • Respondents-only data: When the outcome has no missing values (respondents-only data), the EL engines require n_total to be supplied so that $N_{\text{pop}}$ can be set on the analysis scale. If auxiliaries are requested in this setting, auxiliary_means must also be supplied; otherwise the engines error with a descriptive message.

From Paper to Implementation: Core Ideas

The paper (Qin-Leung-Shao, JASA 2002) sets up EL under nonignorable nonresponse using:

  • Empirical likelihood weights for respondents that satisfy:
    • Zero-sum residual: $\sum p_i^{\text{EL}} (w_i - W) = 0$
    • Auxiliary moments: $\sum p_i^{\text{EL}} (X_i - \mu_x) = 0$
  • A response-model probability $w_i = g(\eta_i)$ with $\eta_i = Z_i \beta$

In our code, we adopt the same EL structure and estimating equations, extended to arbitrary $Z$ and $X$ and to survey designs. For uncertainty, we provide bootstrap variance (IID resampling and survey replicate-weight bootstrap). Because $\hat{Y}$ is a ratio-of-weights estimator, any common normalization of $p_i^{\text{EL}} \propto a_i / D_i$ cancels in $\hat{Y}$; only relative weights matter (the KKT multipliers $\lambda$ enforce the constraints; normalization affects only a common scale that vanishes in the ratio).

Closed-form $\lambda_W$ (QLS Eq. 10, IID path)

QLS derive $\lambda_W = (N/n - 1)/(1 - W)$ when $a_i \equiv 1$ and $n$ counts respondents. In our IID (data-frame) path we reuse this relation with base weights fixed at $a_i \equiv 1$:

  • Let $n_{\text{resp\_weighted}} = \sum_{i: R_i = 1} a_i$ be the respondent-weighted total.
  • Let $N_{\text{pop}}$ be the analysis-scale population total (by default nrow(data), or a user-supplied n_total).

In the data.frame path el_prepare_inputs() is called with weights = NULL, so respondent_weights are identically 1 and $n_{\text{resp\_weighted}} = n$. We then set

$$ \lambda_W = \frac{N_{\text{pop}}/n_{\text{resp\_weighted}} - 1}{1 - W}, $$

which reduces exactly to the original QLS formula when $N_{\text{pop}}$ is the total number of sampled units. This closed form is used only in the IID path to profile out $\lambda_W$; for complex survey designs we instead treat $\lambda_W$ as a free parameter and solve for it jointly with $(\beta, W, \lambda_x)$ via the design-weighted system described in the survey extension section.
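As a one-line translation of this closed form (the helper name below is illustrative; the actual internal helper is the el_lambda_W() mentioned in the code cross-reference tables later in this vignette):

```r
# Closed-form QLS multiplier lambda_W in the IID path (illustrative helper).
# N_pop: analysis-scale population total; n_resp_weighted: sum of base weights
# over respondents (equal to n when all a_i = 1); W: unconditional response rate.
el_lambda_W_closed_form <- function(N_pop, n_resp_weighted, W) {
  C_const <- N_pop / n_resp_weighted - 1
  C_const / (1 - W)
}
```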

Guarding and Numerical Stability

We solve the stacked system with a consistent guarding policy across equations, Jacobian, and post-solution:

  • Cap $\eta$: $\eta \leftarrow \max\{\min(\eta, \eta_{\max}), -\eta_{\max}\}$
  • Compute $w = \mathrm{linkinv}(\eta)$; clip $w$ to $[10^{-12}, 1 - 10^{-12}]$ when used in ratios
  • Guard denominators: $D_i \leftarrow \max\{D_i, \delta\}$ with a small $\delta > 0$
  • In the Jacobian, multiply terms involving $\partial(1/D_i)/\partial\cdot$ by $\mathbb{1}\{D_i^{\text{raw}} > \delta\}$ so the analytic Jacobian matches the piecewise-smooth equations being solved

For the probit link, $s_i(\eta) = \partial \log w/\partial \eta = \phi(\eta)/\Phi(\eta)$ (the inverse Mills ratio) is computed in the log domain for stability; its derivative is $\dfrac{d s_i}{d\eta} = -\eta\, s_i - s_i^2$.
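A minimal sketch of that log-domain computation (illustrative only, not the package’s internal code):

```r
# Stable computation of s(eta) = phi(eta) / Phi(eta) for the probit link,
# done in the log domain, together with ds/deta = -eta * s - s^2.
probit_score_sketch <- function(eta) {
  s <- exp(dnorm(eta, log = TRUE) - pnorm(eta, log.p = TRUE))
  list(s = s, ds_deta = -eta * s - s^2)
}
```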

Equation Crosswalk (QLS 2002 -> This Vignette/Code)

  • QLS (5): discrete mass form for $p_i$ with two multipliers -> our $D_i = 1 + \lambda_W (w_i - W) + (X_i - \mu_x)^\top \lambda_x$ and $p_i^{\text{EL}} \propto a_i / D_i$.
  • QLS (7): $\sum \dfrac{x_i - \mu_x}{1 + \cdots} = 0$ (or with $\mu_x$ replaced by $\bar{X}$ when auxiliary variables are observed for all sampled units) -> our auxiliary constraints $\sum a_i (X_i - \mu_x)/D_i = 0$, where $\mu_x$ is taken from auxiliary_means if supplied, otherwise estimated from the full input (unweighted for IID, design-weighted for surveys).
  • QLS (8): $\sum \dfrac{w_i - W}{1 + \cdots} = 0$ -> our $W$-equation $\sum a_i (w_i - W)/D_i = 0$.
  • QLS (10): $\hat{\lambda}_2 = (N/n - 1)/(1 - W)$ -> in the IID path we set $\lambda_W = ((N_{\text{pop}}/n_{\text{resp\_weighted}}) - 1)/(1 - W)$; in the survey path $\lambda_W$ is solved from the additional linkage equation $g_W^{(1)}$.
  • Estimator $\hat{Y}$ in QLS -> our ratio $\hat{Y} = \sum p_i^{\mathrm{EL}} Y_i / \sum p_i^{\mathrm{EL}}$ using $p_i^{\mathrm{EL}} \propto a_i / D_i$.

Likelihood and Profiling (sketch)

QLS start from the factorized semiparametric likelihood (their Eq. (2)):

$$ \mathcal{L}(\beta, W, F) = \left\{ \prod_{i=1}^{n} \frac{w(Y_i, X_i; \beta)\, dF(Y_i, X_i)}{W} \right\} W^{n} (1 - W)^{N - n}, $$

where $W = \iint w(y, x; \beta)\, dF(y, x)$ is the unconditional response rate. The $W^{-n}$ factor in the conditional likelihood cancels the $W^{n}$ in the binomial term, so the overall likelihood is equivalently proportional to

$$ \left\{ \prod_{i=1}^{n} w(Y_i, X_i; \beta)\, dF(Y_i, X_i) \right\} (1 - W)^{N - n}. $$

Maximization is subject to (i) $\int dF = 1$, (ii) $\int X\, dF = \mu_x$ (or $\bar{X}$ when applicable), and (iii) $\int w(Y, X; \beta)\, dF = W$. Discretizing $F$ at observed respondents by assigning unknown masses $p_i$ and introducing multipliers $\lambda$, the KKT conditions yield the familiar EL weight form with denominator

$$ D_i \;=\; 1 + \lambda_W (w_i - W) + (X_i - \mu_x)^\top \lambda_x, $$

and, with base weights $a_i$, the working masses are proportional to $a_i / D_i$.

Remark on conditioning: QLS’s Eq. (2) writes the first product as $\prod_i [\, w(y_i, x_i; \beta)\, dF(y_i, x_i)/W\,]$ so that it explicitly represents the likelihood of $(Y_i, X_i)$ conditional on $R_i = 1$. Multiplying by the binomial term $W^{n}(1 - W)^{N - n}$ yields the same overall likelihood as above because the $W^{-n}$ in the first factor cancels the $W^{n}$ in the second. Both factorizations lead to the same estimating equations and the same profiled log-likelihood form used subsequently in QLS after introducing the multipliers.

KKT and Denominator (details)

There are two closely related objects in the EL construction:

  • The unknown conditional masses $p_i$ on respondent support points (these are the $p_i$ in QLS).
  • The probability mass weights actually used to form expectations under the discretized law. For surveys this mass is proportional to $a_i p_i$ (because $a_i$ represents how many population units respondent $i$ stands for).

In a survey-weighted setting (with base weights $a_i$ acting as multiplicities), we can write the discretized empirical distribution as $F_{\text{EL}}(A) = \sum_i a_i p_i\, \mathbf{1}\{(y_i, x_i) \in A\}$, with constraints $\sum_i a_i p_i = 1$, $\sum_i a_i p_i (X_i - \mu_x) = 0$, and $\sum_i a_i p_i (w_i - W) = 0$.

Introducing Lagrange multipliers $(\lambda_0, \lambda_x, \lambda_W)$ for these constraints and profiling the $p_i$’s gives the KKT stationarity conditions

$$ \frac{\partial}{\partial p_i} \Big[ \sum_j a_j \log p_j - \lambda_0 \Big( \sum_j a_j p_j - 1 \Big) - \lambda_x^\top \sum_j a_j p_j (X_j - \mu_x) - \lambda_W \sum_j a_j p_j (w_j - W) \Big] = 0, $$

which solve to

$$ p_i \;\propto\; \frac{1}{1 + \lambda_x^\top (X_i - \mu_x) + \lambda_W (w_i - W)} \;\equiv\; \frac{1}{D_i}. $$

Normalizing to enforce $\sum_i a_i p_i = 1$ yields $p_i = D_i^{-1} / \sum_j a_j D_j^{-1}$. The probability mass weight placed on respondent $i$ under $F_{\text{EL}}$ is then $a_i p_i = a_i D_i^{-1} / \sum_j a_j D_j^{-1} \propto a_i / D_i$. In the implementation we store unnormalized EL masses $m_i = a_i / D_i$ and use probability-scale weights $p_i^{\text{EL}} = m_i / \sum_j m_j$ for expectations.

$$ m_i \;\propto\; \frac{a_i}{D_i} \quad \text{with} \quad D_i = 1 + \lambda_W (w_i - W) + (X_i - \mu_x)^\top \lambda_x. $$

The EL weights $p_i^{\text{EL}}$ are then used to build the mean estimator

$$ \hat{Y} \;=\; \frac{\sum_i p_i^{\text{EL}} Y_i}{\sum_i p_i^{\text{EL}}}. $$

The remaining unknowns $(\beta, W, \lambda_x)$ (and $\lambda_W$ in the survey system) are determined by the estimating equations below.

Clarification: Relationship Between $W$ and $\lambda_W$

In the IID (data-frame) path, the EL multiplier for the response-rate constraint is expressed as

$$ \lambda_W = \frac{C}{1 - W}, \quad \text{with } C = \frac{N_{\text{pop}}}{n_{\text{resp\_weighted}}} - 1 \text{ and } W = \operatorname{plogis}(z). $$

Intuition: In the EL KKT system, the constraint $\sum p_i^{\text{EL}} (w_i - W) = 0$ sits alongside the normalization and (optionally) the auxiliary constraints. Incorporating base weights $a_i$ and the ratio between population and respondent totals induces a scaling of the multiplier linked to the mass constraint. Writing $\lambda_W$ in this scaled form keeps the parameter on a numerically stable scale and lets the derivative structure (with respect to $z$ via $W$) be handled cleanly. This is consistent with the EL structure when the baseline mass is $n_{\text{resp\_weighted}}$ and the “full population” target is $N_{\text{pop}}$, and it is exactly what the IID code path uses to match the normalization implied by base weights.

Derivation sketch (KKT, IID case): The discretized semiparametric likelihood (QLS, 2002) maximizes, over the unknown masses $\{p_i\}$ at observed points and over $(\beta, W)$,

$$ \ell(\beta, W, \lambda_x, \lambda_W) \;=\; \sum_{i=1}^{n} \log w_i(\beta) \;+\; (N_{\text{pop}} - n_{\text{resp\_weighted}}) \log(1 - W) \;-\; \sum_{i=1}^{n} \log\!\Big( 1 + (X_i - \mu_x)^\top \lambda_x + \lambda_W (w_i - W) \Big), $$

subject to the normalization and moment constraints that generate the EL denominator. In the IID QLS case ($a_i \equiv 1$), profiling the $p_i$’s under $\sum_i p_i = 1$ gives $p_i = 1/(n D_i)$ and therefore $\sum_i D_i^{-1} = n$. Combining this identity with the first-order condition for $W$ yields the closed form

$$ \lambda_W \;=\; \frac{N_{\text{pop}}/n_{\text{resp\_weighted}} - 1}{1 - W} \;=\; \frac{C}{1 - W}, $$

which coincides with QLS (10) when $a_i \equiv 1$. This closed-form relationship is used in the IID EL implementation to profile out $\lambda_W$. In the survey-design path, by contrast, $\lambda_W$ is treated as an explicit unknown and the linkage between $W$ and $\lambda_W$ is enforced through the additional equation

$$ g_W^{(1)}(\beta, W, \lambda_W, \lambda_x) \;=\; \frac{T_0}{1 - W} - \lambda_W \sum_{i \in R} \frac{d_i}{D_i} = 0, $$

where $T_0 = N_{\mathrm{pop}} - \sum_{i \in R} d_i$ and the $d_i$ are the design weights. At the QLS simple-random-sampling limit (equal weights, no auxiliaries) this system reduces to the same closed-form relation.
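As a sketch, the survey linkage residual $g_W^{(1)}$ can be evaluated as follows (illustrative names, not the package’s internal function):

```r
# Survey linkage residual g_W^(1): T0 / (1 - W) - lambda_W * sum(d_i / D_i).
# d: respondent design weights; D: guarded EL denominators; N_pop: population
# total on the analysis scale.
g_W_link_sketch <- function(W, lambda_W, d, D, N_pop) {
  T0 <- N_pop - sum(d)
  T0 / (1 - W) - lambda_W * sum(d / D)
}
```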

Estimating Equations

Unknown parameters: $\beta \in \mathbb{R}^K$, $z \in \mathbb{R}$ (for $W = \operatorname{plogis}(z)$), $\lambda_x \in \mathbb{R}^L$; define $\theta = (\beta, z, \lambda_x)$.

Define $w_i = \mathrm{linkinv}(\eta_i)$ and $\mu_{\eta,i} = \frac{dw}{d\eta}(\eta_i)$ (denoted mu.eta(eta_i) in code).

In the IID (data-frame) path all base weights are $a_i \equiv 1$, so we can use the closed-form Qin-Leung-Shao (QLS) relation between $W$ and the EL multiplier for the response constraint. Writing $C = N_{\text{pop}}/n_{\text{resp\_weighted}} - 1$ with $n_{\text{resp\_weighted}} = \sum_i a_i$, QLS show that $\lambda_W = C/(1 - W) = (N_{\text{pop}}/n_{\text{resp\_weighted}} - 1)/(1 - W)$. Our IID implementation follows this and profiles out $\lambda_W$: the unknowns for the Newton solver are $(\beta, z, \lambda_x)$.

In the survey path, base weights are general design weights $a_i = d_i$ and the corresponding QLS-style relation no longer has a simple closed form. In that case we treat $\lambda_W$ as an additional free parameter and include a separate equation linking $\lambda_W$ and $W$ (see the “Survey extension” section below).

Denominator: $D_i = 1 + \lambda_W (w_i - W) + (X_i - \mu_x)^\top \lambda_x$, with $D_i \geq \epsilon$ enforced numerically.

Define the score term $s_i = \mu_{\eta,i}/w_i$ (the unit-level contribution to the log-likelihood score with respect to $\eta$). For logit, $s_i = 1 - w_i$; for probit, $s_i$ behaves like $\phi(\eta_i)/\Phi(\eta_i)$ when $w_i$ is bounded away from 0 via clipping (as implemented).

Intuition (why this score appears): for each respondent we observe $R_i = 1$, so the Bernoulli log-likelihood contribution of the response model is $\log w_i(\eta_i)$. Differentiating with respect to the linear predictor gives

$$ \frac{\partial}{\partial \eta_i} \log w_i(\eta_i) \,=\, \frac{1}{w_i}\, \frac{dw_i}{d\eta_i} \,=\, \frac{\mu_{\eta,i}}{w_i} \;\equiv\; s_i. $$

Thus $s_i$ measures the local sensitivity of the observed-response likelihood to $\eta_i$. In the logit family, $\mu_{\eta,i} = w_i(1 - w_i)$, so $s_i = 1 - w_i$, the familiar residual-like term; in the probit family, $s_i = \phi(\eta_i)/\Phi(\eta_i)$, the inverse Mills ratio. The EL $\beta$-equations balance this likelihood score against the EL penalty term $\lambda_W\, \mu_{\eta,i}/D_i$, enforcing the calibration constraints while fitting the response model.

The System of Estimating Equations $F(\theta) = 0$

$\beta$-equations ($K$ equations): $\sum a_i Z_i \,[\, s_i - \lambda_W \mu_{\eta,i}/D_i \,] = 0$

W-equation (1 equation): $\sum a_i (w_i - W)/D_i = 0$

Auxiliary constraints ($L$ equations): $\sum a_i (X_i - \mu_x)/D_i = 0$

This is exactly how el_build_equation_system() constructs the equation function in code (src_dev/engines/el/impl/equations.R).

Intuition: the $\beta$-equations equate the score of the respondent log-likelihood with the EL penalty term $\lambda_W \mu_{\eta,i}/D_i$; the $W$-equation centers the modeled response probabilities around the unconditional mean $W$ under the EL weights; the auxiliary equations calibrate the centered auxiliaries to zero mean under the EL weights.
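To make the stacking concrete, here is a self-contained sketch of the IID system $F(\theta)$ mirroring the three blocks above. It reuses el_family_sketch() from earlier and is not the package’s el_build_equation_system(), whose interface differs.

```r
# Illustrative stacked IID estimating equations F(theta).
# Z: response design matrix (with intercept); X: auxiliary matrix;
# mu_x: population auxiliary means; N_pop: population total; fam: a list
# with linkinv() and mu.eta() as in el_family_sketch() above.
el_F_iid_sketch <- function(theta, Z, X, mu_x, N_pop, fam) {
  K <- ncol(Z); L <- ncol(X); n <- nrow(Z)
  beta     <- theta[seq_len(K)]
  z        <- theta[K + 1]
  lambda_x <- theta[K + 1 + seq_len(L)]
  W        <- plogis(z)
  lambda_W <- (N_pop / n - 1) / (1 - W)            # closed-form QLS relation (a_i = 1)
  eta    <- drop(Z %*% beta)
  w      <- fam$linkinv(eta)
  mu_eta <- fam$mu.eta(eta)
  s      <- mu_eta / pmax(w, 1e-12)                # score d log w / d eta
  Xc <- sweep(X, 2, mu_x, "-")
  D  <- pmax(1 + lambda_W * (w - W) + drop(Xc %*% lambda_x), 1e-8)
  c(
    drop(crossprod(Z, s - lambda_W * mu_eta / D)), # beta-equations
    sum((w - W) / D),                              # W-equation
    drop(crossprod(Xc, 1 / D))                     # auxiliary constraints
  )
}
```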

Code cross-reference (equations and Jacobian)

The tables below map the theory blocks to the exact builders and argument/variable names in src_dev/engines/el/impl/equations.R and src_dev/engines/el/impl/jacobian.R.

Estimating-equation builders (equations.R)

| Theory block | IID (data.frame) implementation | Survey (survey.design) implementation | Code identifiers used inside the closure |
|---|---|---|---|
| Unknown vector $\theta$ | el_build_equation_system(...)(params) with params = c(beta, z, lambda_x) | el_build_equation_system_survey(...)(params) with params = c(beta, z, lambda_W, lambda_x) | beta_vec, z, W <- plogis(z), lambda_x, (survey) lambda_W |
| Denominator $D_i$ | el_denominator(lambda_W, W_bounded, Xc_lambda, w_i, denom_floor) | el_denominator(lambda_W, W_bounded, Xc_lambda, w_i, denom_floor) | dpack$denom (guarded), inv_denominator <- dpack$inv, active <- dpack$active |
| $w_i$ and derivatives | el_core_eta_state(family, eta_raw, ETA_CAP) | el_core_eta_state(family, eta_raw, ETA_CAP) | eta_raw, w_i, mu_eta_i, s_eta_i |
| $\beta$ equations $g_\beta$ | eq_betas <- shared_weighted_Xty(missingness_model_matrix, respondent_weights, beta_eq_term) | same | missingness_model_matrix, respondent_weights, beta_eq_term <- s_eta_i - lambda_W * mu_eta_i * inv_denominator |
| $W$ constraint $g_W^{(2)}$ | eq_W <- crossprod(respondent_weights * inv_denominator, (w_i - W_bounded)) | eq_W_constraint <- crossprod(respondent_weights * inv_denominator, (w_i - W_bounded)) | w_i, W_bounded, inv_denominator |
| Auxiliary constraints $g_x$ | eq_constraints <- shared_weighted_Xty(X_centered, respondent_weights, inv_denominator) | same | X_centered <- sweep(auxiliary_matrix, 2, mu_x_scaled, "-") |
| $\lambda_W$ profiling / linkage | IID: lambda_W <- el_lambda_W(C_const, W_bounded) with C_const <- (N_pop/n_resp_weighted) - 1 | Survey: eq_W_link <- (T0/(1-W_bounded)) - lambda_W * sum_d_over_D | IID: C_const, n_resp_weighted; Survey: T0 <- N_pop - n_resp_weighted, sum_d_over_D <- crossprod(respondent_weights, inv_denominator) |

Analytic Jacobian builders (jacobian.R)

| Object | IID (data.frame) builder | Survey (survey.design) builder | Notes on block ordering / names |
|---|---|---|---|
| $A(\theta) = \partial F/\partial\theta$ | el_build_jacobian(...)(params) | el_build_jacobian_survey(...)(params) | Both return a square matrix full_mat with parameter ordering matching the corresponding equations.R closure. |
| Parameter ordering | params = c(beta, z, lambda_x) | params = c(beta, z, lambda_W, lambda_x) | Indices in code: IID uses idx_beta, idx_W; survey uses idx_beta, idx_z, idx_lambdaW, idx_lambda_x. |
| Equation ordering (rows) | c(beta eqs, W eq, aux eqs) | c(beta eqs, W constraint, aux eqs, W link) | Survey row indices are annotated in el_build_jacobian_survey() as idx_eq_beta, idx_eq_W, idx_eq_aux, idx_eq_link. |
| Guard consistency | el_denominator(...); active <- dpack$active | same | Terms involving derivatives of 1/D_i are multiplied by active to match the denominator floor in the equations. |

Remarks

  • For logit and probit, $s_i$ is the log-likelihood score $\partial \log w_i/\partial \eta_i = \mu_{\eta,i}/w_i$ (equal to $1 - w_i$ for logit; behaves like $\phi/\Phi$ for probit when $w_i$ is clipped away from 0). This follows the paper’s MLE derivation; the EL constraints supply the nonparametric part.

Survey extension: design-weighted QLS system

The original QLS paper derives these equations under simple random sampling, where each respondent has equal weight. In practice we often work with complex survey designs and design weights $d_i \approx 1/\pi_i$, where $\pi_i$ is the inclusion probability for unit $i$. In our implementation we extend the QLS system using a design-weighted empirical likelihood:

  • For respondents $i \in R$ we use base weights $a_i = d_i$.
  • We approximate the unknown distribution of $(Y, X)$ by a discrete measure $F_{\text{EL}}(A) = \sum_{i \in R} d_i p_i\, \mathbf{1}\{(y_i, x_i) \in A\}$, where $p_i \ge 0$ and $\sum_i d_i p_i = 1$.
  • Expectations under $F$ are represented by design-weighted sums $\sum_i d_i p_i(\cdot)$.

We impose the following constraints, which are the design-weighted analogues of QLS (3):

  • Normalization: $\sum_{i \in R} d_i p_i = 1$.
  • Response-rate constraint: $\sum_{i \in R} d_i p_i \bigl( w_i(\beta) - W \bigr) = 0$.
  • Auxiliary constraints (vector case): $\sum_{i \in R} d_i p_i (X_i - \mu_x) = 0$.

Maximizing the design-weighted pseudo-likelihood under these constraints yields EL weights of the same tilted form as in QLS: $p_i(\beta, W, \lambda_x, \lambda_W) \propto 1/D_i$ with $D_i = 1 + \lambda_W \bigl( w_i(\beta) - W \bigr) + (X_i - \mu_x)^\top \lambda_x$, and with the proportionality constant chosen such that $\sum_i d_i p_i = 1$. In our implementation the unnormalized EL masses are $m_i = d_i/D_i$, and the probability-scale weights are $p_i^{\mathrm{EL}} = m_i/\sum_j m_j$.

The corresponding design-weighted QLS estimating system in $(\beta, W, \lambda_W, \lambda_x)$ can be written as:

  • Auxiliary block: $g_x(\beta, W, \lambda_W, \lambda_x) = \sum_{i \in R} d_i \dfrac{X_i - \mu_x}{D_i} = 0$.
  • Response-rate constraint: $g_W^{(2)}(\beta, W, \lambda_W, \lambda_x) = \sum_{i \in R} d_i \dfrac{w_i(\beta) - W}{D_i} = 0$.
  • Score equations for $\beta$: $g_\beta(\beta, W, \lambda_W, \lambda_x) = \sum_{i \in R} d_i \left[ \dfrac{\partial \log w_i(\beta)}{\partial \beta} - \lambda_W \dfrac{1}{D_i} \dfrac{\partial w_i(\beta)}{\partial \beta} \right] = 0$.
  • Linkage between $\lambda_W$ and the nonrespondent total: $g_W^{(1)}(\beta, W, \lambda_W, \lambda_x) = \dfrac{T_0}{1 - W} - \lambda_W \sum_{i \in R} \dfrac{d_i}{D_i} = 0$, where $T_0 = N_{\mathrm{pop}} - \sum_{i \in R} d_i$ on the analysis scale.

In code this system is implemented by el_build_equation_system_survey() in src_dev/engines/el/impl/equations.R. The parameter vector is $\theta_{\text{survey}} = (\beta, z, \lambda_W, \lambda_x)$ with $z = \operatorname{logit}(W)$, and the solver treats $\lambda_W$ as an explicit unknown. When all design weights are equal and $N_{\text{pop}}$ and the respondent count match the simple-random-sampling setup, this system reduces exactly to the original QLS equations (6)-(10).

For survey designs we build an analytic Jacobian for this design-weighted system whenever the response family supplies a second derivative d2mu.deta2 (logit and probit). The Jacobian structure mirrors the IID case but with the expanded parameter vector $(\beta, z, \lambda_W, \lambda_x)$ and the additional blocks for $g_W^{(1)}$ and $g_W^{(2)}$. When analytic derivatives are not available, nleqslv falls back to numeric/Broyden Jacobians.

Wu-style strata augmentation (survey designs)

For some stratified designs, especially when the NMAR mechanism varies strongly across strata, it is important that the EL weights preserve the stratum composition implied by the survey design. Following ideas from Wu-style calibration, we augment the auxiliary vector with stratum indicators when a survey.design object is provided:

  • Recover a strata factor from the design (prefer design$strata; fall back to the original strata= call when needed).
  • Build dummy variables for strata (dropping one reference level).
  • Compute stratum totals $N_h$ on the analysis scale from the design weights and convert them to stratum shares $W_h = N_h / N_{\mathrm{pop}}$.
  • Append these stratum dummies to the auxiliary matrix $X$ and their targets $W_h$ to the auxiliary means.

The EL constraints then include additional terms of the form $\sum_{i \in R} d_i p_i \bigl( \mathbf{1}\{H_i = h\} - W_h \bigr) = 0$ for each nonreference stratum $h$. This forces the EL weights to reproduce the design-implied stratum shares while still adjusting within strata for NMAR. In the implementation this augmentation is performed in the survey entry point (src_dev/engines/el/impl/survey.R) before auxiliary means are resolved, and the resulting augmented auxiliaries flow through to el_build_equation_system() or el_build_equation_system_survey() depending on the data type. The behavior is controlled by the logical strata_augmentation argument of el_engine() (default TRUE); it has an effect only when data is a survey.design with defined strata.

Implementation detail: when the user does not supply auxiliary_means, the targets for the augmented stratum indicators are obtained automatically as the design-weighted means of those dummy columns in the full sample (via el_resolve_auxiliaries()), which equal the design-implied stratum shares on the analysis scale. When the user does supply auxiliary_means, the augmentation appends the implied $W_h$ targets to that vector.
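A rough sketch of this augmentation step is shown below. It assumes a particular layout of the strata inside the design object (design$strata as a data frame), which may not hold for every survey.design; the package’s survey entry point handles these details.

```r
# Illustrative Wu-style strata augmentation. Assumptions: design$strata holds
# the strata variable(s); weights(design) returns the design weights; X is the
# auxiliary matrix for the same units as the design.
augment_strata_sketch <- function(X, design) {
  h  <- factor(design$strata[, 1])                 # strata factor (assumed layout)
  d  <- weights(design)                            # design weights
  Zh <- model.matrix(~ h)[, -1, drop = FALSE]      # stratum dummies, drop reference level
  N_pop <- sum(d)
  W_h   <- colSums(d * Zh) / N_pop                 # design-implied stratum shares
  list(X_aug = cbind(X, Zh), target_shares = W_h)
}
```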

Our survey EL implementation should be viewed as a design-weighted analogue of QLS, informed by the pseudo empirical likelihood literature (Chen and Sitter 1999; Wu 2005), rather than a verbatim implementation of any single paper.

Analytic Jacobian ($A$ Matrix, IID path)

For the IID (data-frame) path we differentiate $F(\theta) = 0$ with respect to $\theta = (\beta, z, \lambda_x)$. Let:

  • $\eta_i = Z_i \beta$, $w_i = \mathrm{linkinv}(\eta_i)$, $\mu_{\eta,i} = \dfrac{dw}{d\eta}(\eta_i)$, $\mu''_i = \dfrac{d^2 w}{d\eta^2}(\eta_i)$
  • $W = \operatorname{plogis}(z)$, $\dfrac{dW}{dz} = W(1 - W)$
  • $\lambda_W = \dfrac{C}{1 - W}$, so $\dfrac{d\lambda_W}{dW} = \dfrac{C}{(1 - W)^2}$ and $\dfrac{d\lambda_W}{dz} = \dfrac{d\lambda_W}{dW} \cdot \dfrac{dW}{dz}$
  • $X_{\text{centered},i} = X_i - \mu_x$

Intermediate Derivatives

  • $s_i = \mu_{\eta,i}/w_i \;\Rightarrow\; \dfrac{ds_i}{d\eta_i} = (\mu'_{\eta,i} w_i - \mu_{\eta,i}^2)/w_i^2$ with $\mu'_{\eta,i} = \dfrac{d\mu_{\eta,i}}{d\eta_i} = \dfrac{d^2 w}{d\eta_i^2} \equiv \mu''_i$ (this is d2mu.deta2(eta_i) in code)
  • $D_i = 1 + \lambda_W (w_i - W) + X_{\text{centered},i}^\top \lambda_x$
    • $\dfrac{\partial D_i}{\partial \eta_i} = \lambda_W \mu_{\eta,i}$
    • $\dfrac{\partial D_i}{\partial z} = \dfrac{\partial \lambda_W}{\partial z} (w_i - W) - \lambda_W \dfrac{dW}{dz}$
    • $\dfrac{\partial D_i}{\partial \lambda_x} = X_{\text{centered},i}$

Define $\text{inv}_i = 1/D_i$ and the scalar term driving the $\beta$-equations:

$$ T_i = s_i - \lambda_W \mu_{\eta,i}\, \text{inv}_i, \qquad s_i = \frac{\mu_{\eta,i}}{w_i}. $$

For the logit and probit families we use simpler closed-form derivatives of $s_i$ in code: for logit, $ds_i/d\eta_i = -\mu_{\eta,i}$ (because $s_i = 1 - w_i$); for probit, $ds_i/d\eta_i = -(\eta_i r_i + r_i^2)$ with $r_i = \phi(\eta_i)/\Phi(\eta_i)$ the inverse Mills ratio. The expression above is kept as the generic fallback for other families.

Compute Its Derivatives

Using $\mu'_{\eta,i} = d\mu_{\eta,i}/d\eta_i = \mathrm{d2mu.deta2}(\eta_i)$ and $dw_i/d\eta_i = \mu_{\eta,i}$,

$$ \frac{\partial s_i}{\partial \eta_i} = \frac{\mu'_{\eta,i} w_i - \mu_{\eta,i}^2}{w_i^2}. $$

Also $\dfrac{\partial\, \text{inv}_i}{\partial \eta_i} = -\text{inv}_i^2 \cdot \dfrac{\partial D_i}{\partial \eta_i} = -\text{inv}_i^2\, (\lambda_W \mu_{\eta,i})$. Therefore

$$ \frac{\partial T_i}{\partial \eta_i} = \frac{\mu'_{\eta,i} w_i - \mu_{\eta,i}^2}{w_i^2} - \lambda_W \mu'_{\eta,i}\, \text{inv}_i + \lambda_W^2\, \mu_{\eta,i}^2\, \text{inv}_i^2. $$

$$ \frac{\partial T_i}{\partial z} = -\frac{\partial \lambda_W}{\partial z}\, \mu_{\eta,i}\, \text{inv}_i + \lambda_W \mu_{\eta,i}\, \text{inv}_i^2\, \frac{\partial D_i}{\partial z} $$

$$ \frac{\partial T_i}{\partial \lambda_x} = \lambda_W \mu_{\eta,i}\, \text{inv}_i^2\, X_{\text{centered},i} $$

Assemble Jacobian Blocks (with $a_i$ weights)

$J_{\beta\beta}$ ($K \times K$): $J_{11} = \sum a_i Z_i^\top \left[ \dfrac{\partial T_i}{\partial \eta_i} \right] Z_i$

$J_{\beta z}$ ($K \times 1$): $J_{12} = \sum a_i Z_i^\top \left[ \dfrac{\partial T_i}{\partial z} \right]$

$J_{\beta\lambda}$ ($K \times L$): $J_{13} = \sum a_i Z_i^\top \left[ \dfrac{\partial T_i}{\partial \lambda_x} \right]$

$J_{z\beta}$ ($1 \times K$): derivative of the W-equation with respect to $\beta$

Equation: $G_W = \sum a_i (w_i - W)\, \text{inv}_i$

$$ \frac{\partial G_W}{\partial \eta_i} = a_i \left[ \mu_{\eta,i}\, \text{inv}_i - (w_i - W)\, \text{inv}_i^2 \left( \frac{\partial D_i}{\partial \eta_i} \right) \right] = a_i \left[ \mu_{\eta,i}\, \text{inv}_i - (w_i - W)\, \text{inv}_i^2\, (\lambda_W \mu_{\eta,i}) \right] $$

Then: $J_{21} = \sum \dfrac{\partial G_W}{\partial \eta_i} \cdot Z_i$

$J_{zz}$ ($1 \times 1$): $\dfrac{\partial G_W}{\partial z} = \sum a_i \left[ -\dfrac{dW}{dz}\, \text{inv}_i - (w_i - W)\, \text{inv}_i^2\, \dfrac{\partial D_i}{\partial z} \right]$

$J_{z\lambda}$ ($1 \times L$): $\dfrac{\partial G_W}{\partial \lambda_x} = \sum a_i \left[ -(w_i - W)\, \text{inv}_i^2\, X_{\text{centered},i} \right]$

$J_{\lambda\beta}$ ($L \times K$): constraints $H(\lambda): \sum a_i\, \text{inv}_i\, X_{\text{centered},i} = 0$

$$ \frac{\partial H}{\partial \eta_i} = -a_i\, \text{inv}_i^2\, \frac{\partial D_i}{\partial \eta_i}\, X_{\text{centered},i} = -a_i\, \text{inv}_i^2\, (\lambda_W \mu_{\eta,i})\, X_{\text{centered},i} $$

Thus, component-wise, $J_{31} = \sum_i a_i\, (-\lambda_W \mu_{\eta,i}\, \text{inv}_i^2)\, X_{\text{centered},i}^\top Z_i$. In compact matrix form:

$$ J_{31} = X_{\text{centered}}^\top \operatorname{diag}\!\big( -a_i\, \lambda_W\, \mu_{\eta,i}\, \text{inv}_i^2 \big)\, Z. $$

$J_{\lambda z}$ ($L \times 1$): $\dfrac{\partial H}{\partial z} = -\sum a_i\, \text{inv}_i^2 \left( \dfrac{\partial D_i}{\partial z} \right) X_{\text{centered},i}$

$J_{\lambda\lambda}$ ($L \times L$): $\dfrac{\partial H}{\partial \lambda_x} = -X_{\text{centered}}^\top \operatorname{diag}(a_i\, \text{inv}_i^2)\, X_{\text{centered}}$

These expressions match the unguarded analytic derivatives; in the code (src_dev/engines/el/impl/jacobian.R), any terms involving derivatives of $1/D_i$ are additionally multiplied by the active mask $\mathbb{1}\{D_i^{\text{raw}} > \delta\}$ to respect the denominator floor used for numerical stability.
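As an illustration of how two of these guarded blocks can be assembled with diagonal weighting, here is a sketch of $J_{31}$ and $J_{\lambda\lambda}$ (illustrative only, not the package’s el_build_jacobian()):

```r
# Sketch of the J_31 and J_lambda-lambda blocks with the active mask applied.
# Z: response design; Xc: centered auxiliaries; a: base weights; mu_eta,
# inv_D, active: vectors as defined in the text (inv_D = 1/D_i).
J31_sketch <- function(Z, Xc, a, lambda_W, mu_eta, inv_D, active) {
  d <- -a * lambda_W * mu_eta * inv_D^2 * active   # guarded d(1/D_i)/d eta_i factor
  crossprod(Xc, d * Z)                             # X_centered^T diag(d) Z
}
J33_sketch <- function(Xc, a, inv_D, active) {
  crossprod(Xc, (-a * inv_D^2 * active) * Xc)      # -X_centered^T diag(a inv^2) X_centered
}
```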

Why Analytic A Helps

  • Newton-Raphson (as used in our outer solve) linearizes $F(\theta)$ near the current iterate: $F(\theta + \Delta) \approx F(\theta) + A(\theta)\, \Delta$. The update $\Delta$ solves $A\, \Delta = -F$, so a high-quality $A$ is critical for fast, stable convergence.

Solving Strategy and Initialization

  • In the IID path the unknowns are $\theta = (\beta, z, \lambda_x)$ with $W = \operatorname{plogis}(z)$. In the survey path the unknowns are $(\beta, z, \lambda_W, \lambda_x)$, with $\lambda_W$ treated as a free parameter. In both cases we solve the full stacked system $F(\theta) = 0$ via Newton with the analytic Jacobian $A = \partial F/\partial\theta$ using nleqslv.
  • Globalization and scaling: we rely on nleqslv’s globalization (default global = "qline", xscalm = "auto") and enforce denominator positivity ($\min_i D_i \ge \varepsilon$) within equation evaluations. Optional standardization of design matrices improves conditioning.
  • Initialization: by default $\beta$ starts at zeros in the scaled space (unless the user supplies start$beta), and $z$ is seeded at $\operatorname{logit}$(observed response rate). An internal last-chance Broyden retry may be used if Newton fails to converge; this is not a user-facing mode. A minimal call sketch follows this list.
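A minimal call sketch, reusing el_F_iid_sketch() and el_family_sketch() from earlier (names and defaults are illustrative; the package wraps nleqslv with its own retries and diagnostics):

```r
library(nleqslv)

# Illustrative outer solve of F(theta) = 0 for the IID path.
solve_el_iid_sketch <- function(Z, X, mu_x, N_pop, family = "logit") {
  fam <- el_family_sketch(family)
  K <- ncol(Z); L <- ncol(X); n <- nrow(Z)
  theta0 <- c(rep(0, K), qlogis(n / N_pop), rep(0, L))  # beta = 0, z = logit(response rate)
  nleqslv(
    x      = theta0,
    fn     = el_F_iid_sketch,
    method = "Newton",
    global = "qline",
    xscalm = "auto",
    Z = Z, X = X, mu_x = mu_x, N_pop = N_pop, fam = fam
  )
}
```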

Practical Identifiability and Diagnostics

The EL system balances the parametric response-model score against calibration constraints. Identifiability can weaken in the following situations:

  • Weak or nearly collinear auxiliaries: if the $X_i - \mu_x$ have little variation or are nearly collinear with the response-score direction, the constraint block in $A = \partial F/\partial\theta$ becomes ill-conditioned.
  • Inconsistent auxiliary means: if the supplied $\mu_x$ are far from what the respondent sample can support (under the response model), the denominators $D_i$ cluster near 0 and $\kappa(A)$ inflates.
  • Heavy nonresponse or near-boundary $W$: when $W$ approaches 0 or 1, $\lambda_W = C/(1 - W)$ can spike and amplify sensitivity.

Diagnostics exposed by the implementation help assess these issues:

  • jacobian_condition_number ($\kappa(A)$), max_equation_residual, denominator summaries (min, lower quantiles, median), weight concentration (max share, top-5 share, ESS), and the trimming fraction.

Mitigations include standardizing predictors, trimming extreme weights (trim_cap), adding informative response-model predictors, and preferring bootstrap variance when diagnostics indicate fragility.

Survey Design Details

We extend QLS’s methodology to complex surveys in the following complementary ways:

  • Estimating equations with base weights: All sums include the base weight $a_i$; set $a_i$ to the survey design weight for respondents. The totals $N_{\text{pop}}$ and $n_{\text{resp\_weighted}} = \sum a_i$ are computed from the design weights and used throughout the design-weighted system.

  • Nonrespondent total $T_0$ in the linkage equation: In the survey-specific system we form $T_0 = N_{\text{pop}} - n_{\text{resp\_weighted}}$ and enforce the linkage between $W$ and $\lambda_W$ through the equation $T_0/(1 - W) - \lambda_W \sum d_i/D_i = 0$ rather than using the closed form $\lambda_W = ((N_{\text{pop}}/n_{\text{resp\_weighted}}) - 1)/(1 - W)$.

  • Bootstrap variance via replicate weights: For standard errors, we use bootstrap replicate-weight designs created with svrep::as_bootstrap_design. For each replicate, the estimator is re-fit on a reconstructed design using that replicate’s weights, and survey::svrVar is used to compute the variance of replicate estimates with appropriate scaling.

Weight scale note

The survey system is defined on an analysis scale through $N_{\text{pop}}$ and the design weights $d_i$. By default we set $N_{\text{pop}} = \sum_i d_i$ using weights(design). If the design weights have been rescaled (for example, to sum to the sample size for numerical reasons), you should supply n_total on the intended population-total scale so that $T_0 = N_{\text{pop}} - \sum_{i \in R} d_i$ is computed consistently with your analysis.

This matches the paper’s guidance to adapt the likelihood/estimating framework to stratification or unequal-probability sampling while relying on standard survey resampling for uncertainty. Analytic variance has not been implemented yet.

Degrees-of-freedom: For confidence intervals, we use survey degrees-of-freedom (t-quantiles) when a survey.design is supplied; otherwise, we use normal quantiles.
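A small sketch of that quantile choice (illustrative; this is not a package function):

```r
# Quantile used to form a two-sided confidence interval at the given level:
# a t quantile with survey degrees of freedom when a design is supplied,
# otherwise a normal quantile.
ci_quantile_sketch <- function(level, design = NULL) {
  alpha <- 1 - level
  if (!is.null(design)) {
    qt(1 - alpha / 2, df = survey::degf(design))
  } else {
    qnorm(1 - alpha / 2)
  }
}
```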

Scaling and Unscaling

Scaling (optional; standardize=TRUE)

  • Compute a nmar_scaling_recipe: for each column $j$ in $Z$ and $X$ (excluding the intercept), using (if present) the same base weights $a_i$ that enter the estimating equations:
    • $\text{mean}_j$, $\text{sd}_j$; if $\text{sd}_j \approx 0$, set $\text{sd}_j = 1$ to avoid blow-ups.
  • Transform:
    • $Z_{\text{scaled}}[, j] = (Z_{\text{un}}[, j] - \text{mean}_j)/\text{sd}_j$
    • $X_{\text{scaled}}[, j] = (X_{\text{un}}[, j] - \text{mean}_j)/\text{sd}_j$
    • $\mu_{x,\text{scaled}}[j] = (\mu_{x,\text{un}}[j] - \text{mean}_j)/\text{sd}_j$

Unscaling $\beta$ and vcov

  • Construct a linear map $D$ of size $K \times K$:
    • For columns $j \neq$ intercept: $D[j, j] = 1/\text{sd}_j$
    • For the intercept row, adjust to absorb the centering: $D[\text{intercept}, j] = -\text{mean}_j/\text{sd}_j$
  • Transform: $\beta_{\text{unscaled}} = D\, \beta_{\text{scaled}}$; if a covariance matrix is available, $\text{vcov}_{\text{unscaled}} = D\, \text{vcov}_{\text{scaled}}\, D^\top$. A small sketch of this map follows this list.
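A small sketch of the unscaling map for $\beta$ (illustrative helper; the package itself uses unscale_coefficients(), as noted below). It assumes the first coefficient is the intercept and that means and sds are ordered like the remaining columns.

```r
# Build the K x K map D and apply it to the scaled coefficients.
unscale_beta_sketch <- function(beta_scaled, means, sds) {
  K <- length(beta_scaled)
  D <- diag(K)                                # D[1, 1] = 1 for the intercept
  for (j in seq_len(K)[-1]) {
    D[j, j] <- 1 / sds[j - 1]                 # rescale slope j
    D[1, j] <- -means[j - 1] / sds[j - 1]     # intercept absorbs the centering
  }
  drop(D %*% beta_scaled)
}
```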

Code: centralized in src_dev/shared/scaling.R; engines call validate_and_apply_nmar_scaling() and unscale_coefficients(). For the EL engine, only β\beta is currently unscaled because no analytic coefficient covariance is computed.

Bootstrap Variance

  • IID:
    • Resample rows with replacement ($n$ out of $n$), re-run the estimator, and compute the variance of the bootstrap $\hat{Y}$ values; warn if many replicates fail; return the square root of that variance (see the sketch after this list).
  • Survey:
    • Convert to bootstrap replicate-weight design via svrep::as_bootstrap_design.
    • For each replicate, re-construct a temporary design and run estimator; use survey::svrVar to compute variance of replicate estimates (with scale/rscales).
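A minimal sketch of the IID bootstrap loop (illustrative; the packaged version adds failure handling and reporting, and the survey path uses replicate weights as described above):

```r
# IID bootstrap standard error for a point-estimator function fit_fun(),
# which is assumed to return the EL mean for a data.frame.
bootstrap_se_sketch <- function(data, fit_fun, B = 200) {
  n <- nrow(data)
  est <- replicate(B, {
    idx <- sample.int(n, n, replace = TRUE)
    tryCatch(fit_fun(data[idx, , drop = FALSE]), error = function(e) NA_real_)
  })
  if (mean(is.na(est)) > 0.1) warning("many bootstrap replicates failed")
  sqrt(var(est, na.rm = TRUE))
}
```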

Code mapping:

  • Engine: el_engine(..., family, standardize, trim_cap, variance_method, ...) in src_dev/engines/el/engine.R
  • Dispatch: run_engine.nmar_engine_el(...) in src_dev/engines/el/run_engine.R adapts the formula and forwards arguments to internal el() methods.
  • EL Core: el_estimator_core(...) in src_dev/engines/el/impl/core.R runs:
    • Construct $F(\theta)$ via el_build_equation_system() (src_dev/engines/el/impl/equations.R).
    • Solve $F(\theta) = 0$ via nleqslv (Newton with analytic Jacobian when available, Broyden fallback).
    • Build EL weights, mean, and diagnostics.
  • Jacobian: el_build_jacobian(...) in src_dev/engines/el/impl/jacobian.R returns the analytic $A$ whenever the family supplies d2mu.deta2 (logit, probit).
  • Variance: Bootstrap variance is implemented in src_dev/shared/bootstrap.R.
  • S3 result: src_dev/engines/el/s3.R defines EL-specific print and summary methods (print.nmar_result_el, summary.nmar_result_el). Generic methods such as tidy(), glance(), weights(), and coef() are defined for the parent nmar_result class in src_dev/S3/nmar_result_methods.R.

Practical Notes

  • Denominator guard: $D_i \ge \varepsilon$ (default $10^{-8}$) across all steps; diagnostics report extreme fractions.
  • Eta cap option: you can adjust the $\eta$ cap via options(nmar.eta_cap = 60) (default is 50) to suit your data scale and link.

Algorithm

We solve the full stacked system $F(\theta) = 0$ with Newton using the analytic Jacobian $A = \partial F/\partial\theta$ and globalization via nleqslv. Denominator positivity ($\min_i D_i \ge \varepsilon$), predictor standardization, and a capped $\eta$ ensure numerical stability. For the IID path the estimating equations are those of Qin, Leung, and Shao (2002), up to the small numeric guards on $\eta$, $w_i$, and $D_i$; for survey designs we use a consistent design-weighted analogue.

Input: Z (response design), X (auxiliary design), mu_x (population means),
       a (base weights), family (logit/probit), trim_cap, tolerances.
Initialize: beta = 0 in scaled space (or user-supplied start),
            z = logit(observed response rate), lambda_x = 0
            (and lambda_W = 0 for survey designs).
Repeat until convergence of F(theta) = 0:
  1) Compute eta = Z beta, w = linkinv(eta), W = plogis(z).
     - IID (data.frame): set lambda_W = ((N_pop/n_resp_weighted) - 1)/(1 - W).
     - survey.design: use the current lambda_W component of theta.
  2) Evaluate full stacked equations using guarded denominators
     D_i = 1 + lambda_W (w_i - W) + (X_i - mu_x)^T lambda_x.
  3) Compute analytic Jacobian A = dF/dtheta (if available; else numeric/Broyden).
  4) Newton step: solve A * step = -F with globalization; enforce min D_i >= eps.
  5) Update theta <- theta + step.
Return: p_i proportional to a_i / D_i and Y_hat = sum(p_i * Y_i) / sum(p_i).

References

  • Qin, J., Leung, D., and Shao, J. (2002). Estimation with survey data under nonignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97(457), 193-200.
  • Chen, J., and Sitter, R. R. (1999). A pseudo empirical likelihood approach to the effective use of auxiliary information in complex surveys. Statistica Sinica, 9, 385-406.
  • Wu, C. (2005). Algorithms and R codes for the pseudo empirical likelihood method in survey sampling. Survey Methodology, 31(2), 239-243.