More issues in the difR package for differential item functioning analysis in R

I wrote last time about the difR package (Magis, Beland, Tuerlinckx, & De Boeck, 2010) and how it doesn’t account for missing data in Mantel-Haenszel DIF analysis. I’ve noticed two more issues as I’ve continued testing the package (version 5.1).

  1. The problem with Mantel-Haenszel also appears in the code for the standardization method, accessed via difR:::difStd, which calls difR:::stdPDIF. Look there and you’ll see base:::length used to obtain counts (e.g., number of correct/incorrect for focal and reference groups at a given score level). Missing data will throw off these counts. So, difR standardization and MH are only recommended with complete data.
  2. In the likelihood ratio method, code for pseudo $R^2$ (used as a measure of DIF effect size) can lead to errors for some models. The code also seems to assume no missing data. More on these issues below.

DIF with the likelihood ratio method is performed using the difR:::difLogistic function, which ultimately calls difR:::Logistik to do the modeling (via glm) and calculate the $R^2$. The functions for calculating $R^2$ are embedded within the difR:::Logistik function.

R2 <- function(m, n) {
  1 - (exp(-m$null.deviance / 2 + m$deviance / 2))^(2 / n)
}
R2max <- function(m, n) {
  1 - (exp(-m$null.deviance / 2))^(2 / n)
}
R2DIF <- function(m, n) {
  R2(m, n) / R2max(m, n)
}

These functions capture $R^2$ as defined by Nagelkerke (1991), which is a modification to Cox and Snell (1989). When these are run via difR:::Logistik, the sample size n argument is set to the number of rows in the data set, which ignores missing data on a particular item. So, n will be inflated for items with missing data, and $R^2$ will be reduced (assuming a constant deviance).

In addition to the missing data issue, because of the way they’re written, these functions stretch the precision limits of R. In the R2max function specifically, the model deviance is first converted to a log-likelihood, and then a likelihood, before raising to 2/n. The problem is, large deviances correspond to very small likelihoods. A deviance of 500 gives us a likelihood of 7.175096e-66, which R can manage. But a deviance of 1500 gives us a likelihood of 0, which produces $R^2 = 1$.

The workaround is simple – avoid calculating likelihoods by rearranging terms. Here’s how I’ve written them in the epmr package.

r2_cox <- function(object, n = length(object$y)) {
  1 - exp((object\$deviance - object\$null.deviance) / n)
}
r2_nag <- function(object, n = length(object$y)) {
  r2_cox(object, n) / (1 - exp(-object$null.deviance / n))
}

And here are two examples that compare results from difR with epmr and DescTools. The first example shows how roughly 10% missing data reduces $R^2$ by as much as 0.02 when using difR. Data come from the verbal data set, included in difR.

# Load example data from the difR package
# See ?difR:::verbal for details
data("verbal", package = "difR")

# Insert missing data on first half of items
set.seed(42)
np <- nrow(verbal)
ni <- 24
na_index <- matrix(
  sample(c(TRUE, FALSE), size = np * ni / 2,
    prob = c(.1, .9), replace = TRUE),
  nrow = np, ncol = ni / 2)
verbal[, 1:(ni / 2)][na_index] <- NA

# Get R2 from difR
# verbal[, 26] is the grouping variable gender
verb_total <- rowSums(verbal[, 1:ni], na.rm = TRUE)
verb_difr <- difR:::Logistik(verbal[, 1:ni],
  match = verb_total, member = verbal[, 26],
  type = "udif")

# Fit the uniform DIF models by hand
# To test for DIF, we would compare these with base
# models, not fit here
verb_glm <- vector("list", ni)
for (i in 1:ni) {
  verbal_sub <- data.frame(y = verbal[, i],
    t = verb_total, g = verbal[, 26])
  verb_glm[[i]] <- glm(y ~ t + g, family = "binomial",
    data = verbal_sub)
}

# Get R2 from epmr and DescTools packages
verb_epmr <- sapply(verb_glm, epmr:::r2_nag)
verb_desc <- sapply(verb_glm, DescTools:::PseudoR2,
  which = "Nag")

# Compare
# epmr and DescTools match for all items
# difR matches for the last 12 items, but R2 on the
# first 12 are depressed because of missing data
verb_tab <- data.frame(item = 1:24,
  pct_na = apply(verbal[, 1:ni], 2, epmr:::summiss) / np,
  difR = verb_difr$R2M0, epmr = verb_epmr,
  DescTools = verb_desc)

This table shows results for items 9 through 16, the last four items with missing data and the first four with complete data.

item pct_na difR epmr DescTools
9 0.089 0.197 0.203 0.203
10 0.085 0.308 0.318 0.318
11 0.139 0.408 0.429 0.429
12 0.136 0.278 0.293 0.293
13 0.000 0.405 0.405 0.405
14 0.000 0.532 0.532 0.532
15 0.000 0.370 0.370 0.370
16 0.000 0.401 0.401 0.401
Some results from first example

The second example shows a situation where $R^2$ in the difR package comes to 1. Data are from the 2009 administration of PISA, included in epmr.

# Prep data from epmr::PISA09
# Vector of item names
rsitems <- c("r414q02s", "r414q11s", "r414q06s",
  "r414q09s", "r452q03s", "r452q04s", "r452q06s",
  "r452q07s", "r458q01s", "r458q07s", "r458q04s")

# Subset to USA and Canada
pisa <- subset(PISA09, cnt %in% c("USA", "CAN"))

# Get R2 from difR
pisa_total <- rowSums(pisa[, rsitems],
  na.rm = TRUE)
pisa_difr <- difR:::Logistik(pisa[, rsitems],
  match = pisa_total, member = pisa$cnt,
  type = "udif")

# Fit the uniform DIF models by hand
pisa_glm <- vector("list", length(rsitems))
for (i in seq_along(rsitems)) {
  pisa_sub <- data.frame(y = pisa[, rsitems[i]],
    t = pisa_total, g = pisa$cnt)
  pisa_glm[[i]] <- glm(y ~ t + g, family = "binomial",
    data = pisa_sub)
}

# Get R2 from epmr and DescTools packages
pisa_epmr <- sapply(pisa_glm, epmr:::r2_nag)
pisa_desc <- sapply(pisa_glm, DescTools:::PseudoR2,
  which = "Nag")

# Compare
pisa_tab <- data.frame(item = seq_along(rsitems),
  difR = pisa_difr$R2M0, epmr = pisa_epmr,
  DescTools = pisa_desc)

Here are the resulting $R^2$ for each package, across all items.

item difR epmr DescTools
1 1 0.399 0.399
2 1 0.268 0.268
3 1 0.514 0.514
4 1 0.396 0.396
5 1 0.372 0.372
6 1 0.396 0.396
7 1 0.524 0.524
8 1 0.465 0.465
9 1 0.366 0.366
10 1 0.410 0.410
11 1 0.350 0.350
Results from second example

References

Cox, D. R. & Snell, E. J. (1989). The analysis of binary data. London: Chapman and Hall.

Magis, D., Beland, S, Tuerlinckx, F, & De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42, 847-862.

Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78, 691-692.