
Empirical Likelihood Theory for NMAR
Source:vignettes/el_empirical_likelihood_theory.Rmd
el_empirical_likelihood_theory.RmdThis vignette summarizes the key mathematical objects and
estimating-equation derivations behind the empirical likelihood (EL)
estimator implemented in the nmar package, and maps them to
code. It covers both data-frame (IID) and survey design use cases,
allows arbitrary numbers of response-model and auxiliary covariates, and
supports both logit and probit response families.
Notation
Units
- index respondents (those with observed )
- is the response indicator; we work on observed subset
Data
- Outcome: (observed when ; missing otherwise)
-
Response covariates: row vector
,
the
th
row of the missingness-model design matrix built from the formula as an
intercept, the LHS outcome expression (evaluated in the model frame),
and any additional missingness predictors on the RHS after
| - Auxiliary covariates: row vector (possibly ), from auxiliary RHS (no intercept)
- Population auxiliary means: , known; names match columns of
Mapping to code: - response_model_matrix (the
missingness_design in el_prepare_inputs())
corresponds to
and has columns: (Intercept), the evaluated LHS outcome
expression, and any RHS2 predictors (after the | in the
formula) - auxiliary_matrix corresponds to
(no intercept); we center it in code as
Response Model (Family functions)
- Linear predictor:
- Response probability:
- First derivative:
-
Second derivative:
Here
linkinv,mu.eta, andd2mu.deta2refer to the chosen response family (logit or probit). We follow the paper’s notation for the response probability and reserve for empirical-likelihood weights.
Weight Re-parameterization
- nuisance scalar; we parameterize via for stability and set
- and are EL Lagrange multipliers for the -constraint and the auxiliary constraints
EL Weights
- Denominator:
- Base sampling weights: (IID) or survey base weight for respondent
- EL weights for respondents: (proportionality normalized by totals below)
Notation at a Glance
| Symbol | Meaning |
|---|---|
| Respondent index (rows with observed ) | |
| Outcome for unit (observed if ) | |
| Row of response design matrix (includes intercept) | |
| Row of auxiliary design (no intercept) | |
| Known population means of auxiliaries (vector) | |
| Response-model coefficients | |
| Linear predictor for response model | |
| (logit: ; probit: ) | |
| Multiplier for the -constraint | |
| Multipliers for auxiliary constraints | |
| Base weight (IID: 1; survey: design weight) | |
| Empirical-likelihood weight | |
| Stacked estimating system (beta, W, constraints) | |
| Jacobian |
Note: QLS use for the response-model parameter; in this vignette that parameter is . We use for the full stacked unknown vector solved by the EL engine.
Engines
- Family: “logit” (default) or “probit”. For respondents (), the score with respect to is (equals for logit and for probit). In code we compute these using stable log-domain formulas for probit and clip probabilities away from 0 and 1 when they appear in ratios.
- Scaling: optional standardization of design matrices and
via
nmar_scaling_recipe
Data and Interface Constraints
Before applying the EL equations, the implementation enforces several
constraints on the formula and data (el_prepare_inputs() in
src_dev/engines/el/impl/input.R and the entry points in
src_dev/engines/el/impl/dataframe.R and
src_dev/engines/el/impl/survey.R):
-
Single outcome source: The LHS expression must
reference exactly one outcome source variable in
data(for exampleY_miss). Any transformation is applied to this variable in the model frame, and the transformed values must be finite for all respondent rows (no new NA/NaN introduced among ). -
Outcome only via LHS in the response model: The raw
outcome variable and the LHS expression are not allowed on RHS1
(auxiliaries) or RHS2 (missingness predictors), either explicitly or via
.expansion. The response model uses the evaluated LHS outcome column as a dedicated predictor inmissingness_design, together with an intercept and any additional RHS2 predictors. -
Auxiliaries among respondents: Auxiliary variables
(RHS1) must be fully observed and non-constant among respondents. If
auxiliary_meansare not supplied, auxiliaries must be fully observed in the full data so that population means can be estimated from the sample. - Missingness predictors among respondents: Missingness predictors (RHS2) must be fully observed among respondents. Zero-variance predictors are allowed but generate a warning; their columns still enter the response model design matrix.
-
Respondents-only data: When the outcome has no
missing values (respondents-only data), the EL engines require
n_totalto be supplied so that can be set on the analysis scale. If auxiliaries are requested in this setting,auxiliary_meansmust also be supplied; otherwise the engines error with a descriptive message.
From Paper to Implementation: Core Ideas
The paper (Qin-Leung-Shao, JASA 2002) sets EL under nonignorable response using:
-
Empirical likelihood weights for respondents that
satisfy:
- Zero-sum residual:
- Auxiliary moments:
- A response model probability ,
In our code, we adopt the same EL structure and estimating equations. We extend it to arbitrary and , and to survey designs. For uncertainty, we provide bootstrap variance (IID resampling and survey replicate-weight bootstrap). Because is a ratio-of-weights estimator, any common normalization of cancels in ; only relative weights matter (the KKT multipliers enforce the constraints; normalization affects only a common scale that vanishes in the ratio).
Closed-form (QLS Eq. 10, IID path)
QLS derive when and counts respondents. In our IID (data-frame) path we reuse this relation with base weights fixed at :
- Let be the respondent-weighted total.
- Let
be the analysis-scale population total (by default
nrow(data), or a user-suppliedn_total).
In the data.frame path el_prepare_inputs()
is called with weights = NULL, so
respondent_weights are identically 1 and
.
We then set
which reduces exactly to the original QLS formula when is the total number of sampled units. This closed form is used only in the IID path to profile out ; for complex survey designs we instead treat as a free parameter and solve for it jointly with via the design-weighted system described in the survey extension section.
Guarding and Numerical Stability
We solve the stacked system with a consistent guarding policy across equations, Jacobian, and post-solution:
- Cap :
- Compute ; clip to when used in ratios
- Guard denominators: with a small
- In the Jacobian, multiply terms involving by so the analytic Jacobian matches the piecewise-smooth equations being solved
For the probit link, (Mills ratio) is computed in the log domain for stability; its derivative is .
Equation Crosswalk (QLS 2002 -> This Vignette/Code)
- QLS (5): Discrete mass form for with two multipliers -> Our and .
- QLS (7):
(or with
replaced by
when auxiliary variables are observed for all sampled units) -> Our
auxiliary constraints
,
where
is taken from
auxiliary_meansif supplied, otherwise estimated from the full input (unweighted for IID, design-weighted for surveys). - QLS (8): -> Our -equation .
- QLS (10): -> In the IID path we set ; in the survey path is solved from the additional linkage equation .
- Estimator in QLS -> Our ratio using .
Likelihood and Profiling (sketch)
QLS start from the factorized semiparametric likelihood (their Eq. (2)): where is the unconditional response rate. The factor in the conditional likelihood cancels the in the binomial term, so the overall likelihood is equivalently proportional to
Maximization is subject to (i) , (ii) (or when applicable), and (iii) . Discretizing at observed respondents by assigning unknown masses and introducing multipliers , the KKT conditions yield the familiar EL weight form with denominator
and, with base weights , the working masses are proportional to .
Remark on conditioning: QLS’s Eq. (2) writes the first product as so that it explicitly represents the likelihood of conditional on . Multiplying by the binomial term yields the same overall likelihood as above because the in the first factor cancels the in the second. Both factorizations lead to the same estimating equations and the same profiled log-likelihood form used subsequently in QLS after introducing the multipliers.
KKT and Denominator (details)
There are two closely related objects in the EL construction:
- The unknown conditional masses on respondent support points (these are the in QLS).
- The probability mass weights actually used to form expectations under the discretized law. For surveys this mass is proportional to (because represents how many population units respondent stands for).
In a survey-weighted setting (with base weights acting as multiplicities), we can write the discretized empirical distribution as with constraints
Introducing Lagrange multipliers for these constraints and profiling the ’s gives the KKT stationarity conditions
which solve to
Normalizing to enforce yields The probability mass weight placed on respondent under is then In the implementation we store unnormalized EL masses and use probability-scale weights for expectations.
The EL weights are then used to build the mean estimator
The remaining unknowns (and in the survey system) are determined by the estimating equations below.
Clarification: Relationship Between and
In the IID (data-frame) path, the EL multiplier for the response-rate constraint is expressed as
Intuition: In the EL KKT system, the constraint sits alongside normalization and (optionally) auxiliary constraints. Incorporating base weights and the ratio between population and respondent totals induces a scaling of the multiplier linked to the mass constraint. Writing in this scaled form keeps the parameter on a numerically stable scale and lets the derivative structure (with respect to via ) be handled cleanly. This is consistent with the EL structure when the baseline mass is and the “full population” target is , and it is exactly what the IID code path uses to match the normalization implied by base weights.
Derivation sketch (KKT, IID case): The discretized semiparametric likelihood (QLS, 2002) maximizes, over the unknown masses at observed points and over ,
subject to the normalization and moment constraints that generate the EL denominator. In the IID QLS case (), profiling the ’s under gives and therefore . Combining this identity with the first-order condition for yields the closed form
which coincides with QLS (10) when . This closed-form relationship is used in the IID EL implementation to profile out . In the survey-design path, by contrast, is treated as an explicit unknown and the linkage between and is enforced through the additional equation
where and are the design weights. At the QLS simple-random-sampling limit (equal weights, no auxiliaries) this system reduces to the same closed-form relation.
Estimating Equations
Unknown parameters: , (for ), ; define .
Define
and
(denoted mu.eta(eta_i) in code).
In the IID (data-frame) path all base weights are , so we can use the closed-form Qin-Leung-Shao (QLS) relation between and the EL multiplier for the response constraint. Writing QLS show that Our IID implementation follows this and profiles out : the unknowns for the Newton solver are .
In the survey path, base weights are general design weights and the corresponding QLS-style relation no longer has a simple closed form. In that case we treat as an additional free parameter and include a separate equation linking and (see the “Survey extension” section below).
Denominator: , with enforced numerically.
Define the score term (the unit-level contribution to the log-likelihood score with respect to ). For logit, ; for probit, behaves like when is bounded away from 0 via clipping (as implemented).
Intuition (why this score appears): for each respondent we observe , so the Bernoulli log-likelihood contribution of the response model is . Differentiating w.r.t. the linear predictor gives
Thus measures the local sensitivity of the observed-response likelihood to . In the logit family, so -the familiar residual-like term; in the probit family, , the (inverse) Mills ratio. The EL -equations balance this likelihood score against the EL penalty term , enforcing the calibration constraints while fitting the response model.
The System of Estimating Equations
-equations ( equations):
W-equation (1 equation):
Auxiliary constraints ( equations):
These are exactly how el_build_equation_system
constructs the function in code
(src_dev/engines/el/impl/equations.R).
Intuition: the -equations equate the score of the respondent log-likelihood with the EL penalty term ; the -equation centers the modeled response probabilities around the unconditional mean under the EL weights; the auxiliary equations calibrate the centered auxiliaries to zero mean under the EL weights.
Code cross-reference (equations and Jacobian)
This table maps the theory blocks to the exact builders and
argument/variable names in:
src_dev/engines/el/impl/equations.R and
src_dev/engines/el/impl/jacobian.R.
Estimating-equation builders (equations.R)
| Theory block | IID (data.frame) implementation | Survey (survey.design) implementation | Code identifiers used inside the closure |
|---|---|---|---|
| Unknown vector |
el_build_equation_system(...)(params) with
params = c(beta, z, lambda_x)
|
el_build_equation_system_survey(...)(params) with
params = c(beta, z, lambda_W, lambda_x)
|
beta_vec, z,
W <- plogis(z), lambda_x, (survey)
lambda_W
|
| Denominator | el_denominator(lambda_W, W_bounded, Xc_lambda, w_i, denom_floor) |
el_denominator(lambda_W, W_bounded, Xc_lambda, w_i, denom_floor) |
dpack$denom (guarded),
inv_denominator <- dpack$inv,
active <- dpack$active
|
| and derivatives | el_core_eta_state(family, eta_raw, ETA_CAP) |
el_core_eta_state(family, eta_raw, ETA_CAP) |
eta_raw, w_i, mu_eta_i,
s_eta_i
|
| equations | eq_betas <- shared_weighted_Xty(missingness_model_matrix, respondent_weights, beta_eq_term) |
same |
missingness_model_matrix,
respondent_weights,
beta_eq_term <- s_eta_i - lambda_W * mu_eta_i * inv_denominator
|
| constraint | eq_W <- crossprod(respondent_weights * inv_denominator, (w_i - W_bounded)) |
eq_W_constraint <- crossprod(respondent_weights * inv_denominator, (w_i - W_bounded)) |
w_i, W_bounded,
inv_denominator
|
| Auxiliary constraints | eq_constraints <- shared_weighted_Xty(X_centered, respondent_weights, inv_denominator) |
same | X_centered <- sweep(auxiliary_matrix, 2, mu_x_scaled, "-") |
| profiling / linkage | IID: lambda_W <- el_lambda_W(C_const, W_bounded)
with C_const <- (N_pop/n_resp_weighted) - 1
|
Survey:
eq_W_link <- (T0/(1-W_bounded)) - lambda_W * sum_d_over_D
|
IID: C_const, n_resp_weighted; Survey:
T0 <- N_pop - n_resp_weighted,
sum_d_over_D <- crossprod(respondent_weights, inv_denominator)
|
Analytic Jacobian builders (jacobian.R)
| Object | IID (data.frame) builder | Survey (survey.design) builder | Notes on block ordering / names |
|---|---|---|---|
el_build_jacobian(...)(params) |
el_build_jacobian_survey(...)(params) |
Both return a square matrix full_mat with parameter
ordering matching the corresponding equations.R
closure. |
|
| Parameter ordering | params = c(beta, z, lambda_x) |
params = c(beta, z, lambda_W, lambda_x) |
Indices in code: IID uses idx_beta, idx_W;
survey uses idx_beta, idx_z,
idx_lambdaW, idx_lambda_x. |
| Equation ordering (rows) | c(beta eqs, W eq, aux eqs) |
c(beta eqs, W constraint, aux eqs, W link) |
Survey row indices are annotated in
el_build_jacobian_survey() as idx_eq_beta,
idx_eq_W, idx_eq_aux,
idx_eq_link. |
| Guard consistency | el_denominator(...); active <- dpack$active |
same | Terms involving derivatives of 1/D_i are multiplied by
active to match the denominator floor in the
equations. |
Survey extension: design-weighted QLS system
The original QLS paper derives these equations under simple random sampling, where each respondent has equal weight. In practice we often work with complex survey designs and design weights , where is the inclusion probability for unit . In our implementation we extend the QLS system using a design-weighted empirical likelihood:
- For respondents we use base weights .
- We approximate the unknown distribution of by a discrete measure where and .
- Expectations under are represented by design-weighted sums .
We impose the following constraints, which are the design-weighted analogues of QLS (3):
- Normalization:
- Response-rate constraint:
- Auxiliary constraints (vector case):
Maximizing the design-weighted pseudo-likelihood under these constraints yields EL weights of the same tilted form as in QLS: with the proportionality constant chosen such that . In our implementation the unnormalized EL masses are and the probability-scale weights are .
The corresponding design-weighted QLS estimating system in can be written as:
- Auxiliary block:
- Response-rate constraint:
- Score equations for :
- Linkage between and the nonrespondent total: where on the analysis scale.
In code this system is implemented by
el_build_equation_system_survey() in
src_dev/engines/el/impl/equations.R. The parameter vector
is
and the solver treats
as an explicit unknown. When all design weights are equal and
and the respondent count match the simple random sampling setup, this
system reduces exactly to the original QLS equations (6)-(10).
For survey designs we build an analytic Jacobian for this
design-weighted system whenever the response family supplies a second
derivative d2mu.deta2 (logit and probit). The Jacobian
structure mirrors the IID case but with the expanded parameter vector
and the additional blocks for
and
.
When analytic derivatives are not available, nleqslv falls
back to numeric/Broyden Jacobians.
Wu-style strata augmentation (survey designs)
For some stratified designs, especially when the NMAR mechanism
varies strongly across strata, it is important that the EL weights
preserve the stratum composition implied by the survey
design. Following ideas from Wu-style calibration, we augment the
auxiliary vector with stratum indicators when a
survey.design object is provided:
- Recover a strata factor from the design (prefer
design$strata; fall back to the originalstrata=call when needed). - Build dummy variables for strata (dropping one reference level).
- Compute stratum totals on the analysis scale from the design weights and convert to stratum shares .
- Append these stratum dummies to the auxiliary matrix and their targets to the auxiliary means.
The EL constraints then include additional terms of the form
for each nonreference stratum
.
This forces the EL weights to reproduce the design-implied stratum
shares while still adjusting within strata for NMAR. In the
implementation this augmentation is performed in the survey entry point
(src_dev/engines/el/impl/survey.R) before auxiliary means
are resolved, and the resulting augmented auxiliaries flow through to
el_build_equation_system() or
el_build_equation_system_survey() depending on the data
type. The behavior is controlled by the logical
strata_augmentation argument of el_engine()
(default TRUE); it has an effect only when
data is a survey.design with defined
strata.
Implementation detail: when the user does not supply
auxiliary_means, the targets for the augmented stratum
indicators are obtained automatically as the design-weighted means of
those dummy columns in the full sample (via
el_resolve_auxiliaries()), which equals the design-implied
stratum shares on the analysis scale. When the user does supply
auxiliary_means, the augmentation appends the implied
targets to that vector.
Our survey EL implementation should be viewed as a design-weighted analogue of QLS, informed by the pseudo empirical likelihood literature (Chen and Sitter 1999; Wu 2005), rather than a verbatim implementation of any single paper.
Analytic Jacobian ( Matrix, IID path)
For the IID (data-frame) path we differentiate with respect to . Let:
- , , ,
- ,
- , so and
Intermediate Derivatives
-
with
(this is
d2mu.deta2(eta_i)in code) -
Define and the scalar term driving -equations:
For the logit and probit families we use simpler closed-form derivatives of in code: for logit, (because ); for probit, with the Mills ratio. The expression above is kept as the generic fallback for other families.
Assemble Jacobian Blocks (with weights)
():
():
():
(): derivative of W-equation w.r.t.
Equation:
Then:
():
():
(): constraints
Thus, component-wise . In compact matrix form:
():
():
These expressions match the unguarded analytic derivatives; in the
code (src_dev/engines/el/impl/jacobian.R), any terms
involving derivatives of
are additionally multiplied by the active mask
to respect the denominator floor used for numerical stability.
Why Analytic A Helps
- Newton-Raphson (as used in our outer solve) linearizes near the current iterate: . The update solves , hence a high-quality is critical for fast, stable convergence.
Solving Strategy and Initialization
- In the IID path the unknowns are
with
.
In the survey path the unknowns are
,
with
treated as a free parameter. In both cases we solve the full stacked
system
via Newton with the analytic Jacobian
using
nleqslv. - Globalization and scaling: we rely on
nleqslv’s globalization (defaultglobal = "qline",xscalm = "auto") and enforce denominator positivity () within equation evaluations. Optional standardization of design matrices improves conditioning. - Initialization: by default
starts at zeros in the scaled space (unless the user supplies
start$beta), and is seeded at . An internal last-chance Broyden retry may be used if Newton fails to converge; this is not a user-facing mode.
Practical Identifiability and Diagnostics
The EL system balances the parametric response-model score against calibration constraints. Identifiability can weaken in the following situations:
- Weak or nearly collinear auxiliaries: if have little variation or are nearly collinear with the response score direction, the constraint block in becomes ill-conditioned.
- Inconsistent auxiliary means: if supplied are far from what the respondent sample can support (under the response model), denominators cluster near 0 and inflates.
- Heavy nonresponse or near-boundary : when approaches 0 or 1, can spike and amplify sensitivity.
Diagnostics exposed by the implementation help assess these issues:
-
jacobian_condition_number(),max_equation_residual, denominator summaries (min, lower quantiles, median), weight concentration (max share, top-5 share, ESS), and the trimming fraction.
Mitigations include standardizing predictors, trimming extreme
weights (trim_cap), adding informative response-model
predictors, and preferring bootstrap variance when diagnostics indicate
fragility.
Survey Design Details
We extend QLS’s methodology to complex surveys in two complementary ways:
Estimating equations with base weights: All sums include the base weight ; set to the survey design weight for respondents. Totals and are computed from the design weights and used throughout the design-weighted system.
Nonrespondent total in the linkage equation: In the survey-specific system we form and enforce the linkage between and through the equation rather than using the closed-form .
Bootstrap variance via replicate weights: For standard errors, we use bootstrap replicate-weight designs created with
svrep::as_bootstrap_design. For each replicate, the estimator is re-fit on a reconstructed design using that replicate’s weights, andsurvey::svrVaris used to compute the variance of replicate estimates with appropriate scaling.
Weight scale note
The survey system is defined on an analysis scale through
and the design weights
.
By default we set
using weights(design). If the design weights have been
rescaled (for example, to sum to the sample size for numerical reasons),
you should supply n_total on the intended population-total
scale so that
is computed consistently with your analysis.
This matches the paper’s guidance to adapt the likelihood/estimating framework to stratification or unequal-probability sampling while relying on standard survey resampling for uncertainty. Analytic variance has not been implemented yet.
Degrees-of-freedom: For confidence intervals, we use survey
degrees-of-freedom (t-quantiles) when a survey.design is
supplied; otherwise, we use normal quantiles.
Scaling and Unscaling
Scaling (optional; standardize=TRUE)
-
Compute a
nmar_scaling_recipe: for each column in and (excluding intercept), using (if present) the same base weights that enter the estimating equations:- , ; if , set to avoid blow-ups.
-
Transform:
Unscaling and vcov
-
Construct linear map
of size
:
- For columns intercept:
- For intercept: adjust to absorb centering:
- Transform: ; if a covariance matrix is available,
Code: centralized in src_dev/shared/scaling.R; engines
call validate_and_apply_nmar_scaling() and
unscale_coefficients(). For the EL engine, only
is currently unscaled because no analytic coefficient covariance is
computed.
Bootstrap Variance
-
IID:
- Resample rows with replacement ( to ), re-run estimator, compute of bootstrap s; warn if many failures; return .
-
Survey:
- Convert to bootstrap replicate-weight design via
svrep::as_bootstrap_design. - For each replicate, re-construct a temporary design and run
estimator; use
survey::svrVarto compute variance of replicate estimates (with scale/rscales).
- Convert to bootstrap replicate-weight design via
Code mapping:
- Engine:
el_engine(..., family, standardize, trim_cap, variance_method, ...)insrc_dev/engines/el/engine.R - Dispatch:
run_engine.nmar_engine_el(...)insrc_dev/engines/el/run_engine.Radapts the formula and forwards arguments to internalel()methods.-
el.data.frame()/el.survey.design()insrc_dev/engines/el/impl/dataframe.Randsrc_dev/engines/el/impl/survey.Rprepare inputs, callel_estimator_core(), and wrap results.
-
- EL Core:
el_estimator_core(...)insrc_dev/engines/el/impl/core.Rruns:- Construct
via
el_build_equation_system()(src_dev/engines/el/impl/equations.R). - Solve
via
nleqslv(Newton with analytic Jacobian when available, Broyden fallback). - Build EL weights, mean, and diagnostics.
- Construct
via
- Jacobian:
el_build_jacobian(...)insrc_dev/engines/el/impl/jacobian.Rreturns analytic A whenever family suppliesd2mu.deta2(logit, probit). - Variance: Bootstrap variance is implemented in
src_dev/shared/bootstrap.R. - S3 result:
src_dev/engines/el/s3.Rdefines EL-specific print and summary methods (print.nmar_result_el,summary.nmar_result_el). Generic methods such astidy(),glance(),weights(), andcoef()are defined for the parentnmar_resultclass insrc_dev/S3/nmar_result_methods.R.
Practical Notes
- Denominator guard: (default ) across all steps; diagnostics report extreme fractions.
- Eta cap option: you can adjust the
cap via
options(nmar.eta_cap = 60)(default is 50) to suit your data scale and link
Algorithm
We solve the full stacked system
with Newton using the analytic Jacobian
and globalization via nleqslv. Denominator positivity
(),
predictor standardization, and capped
ensure numerical stability. For the IID path the estimating equations
are Qin, Leung and Shao (2002) up to the small numeric guards on
,
,
and
;
for survey designs we use a consistent design-weighted analogue.
Input: Z (response design), X (auxiliary design), mu_x (population means),
a (base weights), family (logit/probit), trim_cap, tolerances.
Initialize: beta = 0 in scaled space (or user-supplied start),
z = logit(observed response rate), lambda_x = 0
(and lambda_W = 0 for survey designs).
Repeat until convergence of F(theta) = 0:
1) Compute eta = Z beta, w = linkinv(eta), W = plogis(z).
- IID (data.frame): set lambda_W = ((N_pop/n_resp_weighted) - 1)/(1 - W).
- survey.design: use the current lambda_W component of theta.
2) Evaluate full stacked equations using guarded denominators
D_i = 1 + lambda_W (w_i - W) + (X_i - mu_x)^T lambda_x.
3) Compute analytic Jacobian A = dF/dtheta (if available; else numeric/Broyden).
4) Newton step: solve A * step = -F with globalization; enforce min D_i >= eps.
5) Update theta <- theta + step.
Return: p_i \propto a_i / D_i and \hat{Y} = Sum p_i Y_i / Sum p_i.
References
- Qin, J., Leung, D., and Shao, J. (2002). Estimation with survey data under nonignorable nonresponse or informative sampling. Journal of the American Statistical Association, 97(457), 193-200.
- Chen, J., and Sitter, R. R. (1999). A pseudo empirical likelihood approach to the effective use of auxiliary information in complex surveys. Statistica Sinica, 9, 385-406.
- Wu, C. (2005). Algorithms and R codes for the pseudo empirical likelihood method in survey sampling. Survey Methodology, 31(2), 239-243.