A Practical Guide to Local Dependence in Latent Class Models


Introduction

A common, serious problem with studies that use Latent Class Analysis (LCA) is local dependence. The basic latent class model assumes that manifest variables are independent of each other within latent classes. This assumption is often untrue, and when it is, the latent class model must be modified.

Often some manifest variables (for brevity, "items") are related or dependent. Examples include items on related symptoms, multiple indicators, or repetitions of the same item. Such items are termed "conditionally dependent" or "locally dependent" because they are associated within latent classes. If this dependence is not accounted for, model fit indices (such as the likelihood-ratio chi-squared statistic, G2) will be too high. This will lead one to add latent classes in order to fit the data. For example, one might end up with a six-latent class solution when, had local dependence been taken into account, a three-latent class solution would have sufficed. The extra latent classes most likely will not reflect genuine (e.g., biologically or genetically based) subgroups.

Previous literature has underemphasized this issue. That is changing though, especially as new software makes it easier to consider local dependence models.

It isn't hard to handle local dependence in latent class models (LCMs). Here we consider in simple practical terms how to do so. For more technical discussion see Hagenaars, 1988, or an extended version of this document.

First we will see how to detect locally dependent items. A simple computer program for this is provided. Second, we will see how to incorporate local dependence into LCMs with standard latent class analysis software. Examples and program command files are provided.

The main LCA software considered here is LEM (Vermunt, 1997). LEM's flexible command language is well suited to local dependence LCMs. Another reason for focusing on LEM is that it can be downloaded for free, making it widely available.

The new Latent GOLD program (Vermunt & Magidson, 2000) has special features for handling local dependence in LCMs; using the program's Windows interface, one can easily detect and specify local dependence for pairs of items. I will update this page once I've had a chance to use Latent GOLD more.

As will be described, local dependence can also be modeled with PANMARK (van de Pol, Langeheine & de Jong, 1998) and MLLSA (Clogg, 1977) by recoding variables or with use of special parameter restrictions. This is satisfactory for simpler problems, but can become cumbersome if there are many locally dependent items or complex dependencies.

(For more information on these LCA programs, see the Latent Class Analysis Software page.)


Types of Local Dependence

Figure 1 shows types of associations among variables in a latent class model. X and Y represent latent class variables, and A to H represent items.

An arrow from X or Y to an item indicates that response probabilities for the item depend on the latent class. A diagram with only these arrows (e.g., part i) would portray a standard LCM with local independence.

               X          X          X -------> Y
             / | \      /   \      /   \      /   \
            A  B  C    D<--->E    E     F    G     H

              (i)        (ii)           (iii)
Figure 1 (draft). Diagram illustrating different sources of local dependence among manifest variables.

Diagram parts (ii) and (iii) illustrate two types of local dependence. The two-headed arrow connecting items D and E denotes simple local dependence of these items; no assumption is made about the cause of the dependence. The situation is similar to "correlated error" among continuous variables in a LISREL-type model.

In contrast, items G and H are locally dependent because of mutual dependence on a second latent variable, Y. This model would apply, for example, if items G and H were multiple indicators of a latent variable Y.

For only two items, the model used to represent their local dependence makes little practical difference--the same degree of model fit and the same basic parameter values will be obtained. (Still, one should try to select a model consistent with theory.) When there are three or more locally dependent items, however, the model chosen does matter. This will be clearer in later discussion.


Detecting Local Dependence

Attention here focuses on LCMs with dichotomous items. To some extent, the methods here can be generalized to polytomous-item models.

For illustration we consider data on four diagnostic tests for the human immunodeficiency virus (HIV), described in Table 1 and reported by Alvord et al. (1988).

    Table 1.  Four AIDS diagnostic tests used by Alvord
     et al. (1988).
     ----------------------------------------------------
     Test   Label       Description
     ----------------------------------------------------
      A     RIA-ag121   Radioimmunoassay of antigen ag121
      B     RIA-p24     Radioimmunoassay of HIV p24
      C     RIA-gp120   Radioimmunoassay of HIV gp120
      D     ELISA       Enzyme-linked immunosorbent assay
     ----------------------------------------------------

RIA-ag121 tests for a specific antigen, ag121, to the AIDS virus. RIA-p24 and RIA-gp120 test for the presence of specific proteins of the virus itself. The ELISA method also tests directly for the virus's presence.

Table 2 summarizes test results for 428 subjects.

    Table 2.  Results of LCA of four diagnostic
    tests for AIDS virus.
    ------------------------------------------------
                               Model 1      Model 2
                              ---------    ---------
    Test Result*  Observed    Expected     Expected
     A  B  C  D   frequency   frequency    frequency
    -------------------------------------------------
     1  1  1  1     170       169.366      169.714
     1  1  1  2      15        14.837       14.464
     1  1  2  1       0         0.000        0.000
     1  1  2  2       0         0.000        0.000
     1  2  1  1       6         6.250        6.286
     1  2  1  2       0         0.548        0.536
     1  2  2  1       0         0.000        0.000
     1  2  2  2       0         0.000        0.000
     2  1  1  1       4         5.193        4.821
     2  1  1  2      17         9.096       17.000
     2  1  2  1       0         0.000        0.000
     2  1  2  2      83        90.509       83.000
     2  2  1  1       1         0.192        0.179
     2  2  1  2       4        11.520        4.000
     2  2  2  1       0         0.000        0.000
     2  2  2  2     128       120.491      128.000
    ------------------------------------------------
     G-squared                 16.23         3.06
     df                         6            4
     p                          0.01         0.55
    ------------------------------------------------
    *  1 = negative result; 2 = positive result.
    Notes:  Model 1:  Standard 2-class LCM;
            Model 2:  2-class LCM with local
              dependence of tests B and C.

The many unobserved patterns suggest that a more deterministic model for the data might be appropriate. However, for the sake of a computational example, we put this reservation aside.

Standard LCM with conditional independence

Alvord et al. (1988) reasoned there should be two latent classes, corresponding to presence and absence of the HIV virus. Model 1 is a standard two-class LCM with conditional independence. A LEM command file to estimate this model is as follows:


   * Example 1
   *
   * Model 1:  Unrestricted 2-class latent class model
   * Data:  Alvord et al., 1988
   *
   * A = Radioimmunoassay (RIA) for antigen ag121
   * B = RIA using purified HIV p24
   * C = RIA using purified HIV gp120
   * D = Enzyme-linked immunosorbent assay (ELISA)
   *
   lat 1
   man 4
   dim 2 2 2 2 2
   lab X A B C D
   mod X A|X B|X C|X D|X
   dat [170 15 0 0 6 0 0 0 4 17 0 83 1 4 0 128]
   sta A|X [ .8 .2 .2 .8  ]
   sta B|X [ .8 .2 .2 .8  ]
   sta C|X [ .8 .2 .2 .8  ]
   sta D|X [ .8 .2 .2 .8  ]

The last four lines, optional, give start values for the conditional response probabilities. Their purpose here is to make Latent Class 1 correspond to the disease-negative cases and Latent Class 2 to the disease-positive cases.

Model 1 fits poorly; the G2 and X2 statistics are 16.2 and 17.1, with 6 df. However, for comparison, we consider its estimated parameter values.
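The fit statistics quoted above can be checked by hand from Table 2. The following Python sketch is an illustration only, not part of LEM: it recomputes G2 and X2 from the observed and Model 1 expected frequencies as printed in Table 2. The function names are made up for this example, and the p-value helper relies on a closed form of the chi-squared upper tail that holds only for even df. Because the tabled frequencies are rounded to three decimals, the results agree with the reported values only approximately.

```python
import math

# Observed and Model 1 expected frequencies, copied from Table 2
# (response patterns in the table's row order).
observed = [170, 15, 0, 0, 6, 0, 0, 0, 4, 17, 0, 83, 1, 4, 0, 128]
expected = [169.366, 14.837, 0.0, 0.0, 6.250, 0.548, 0.0, 0.0,
            5.193, 9.096, 0.0, 90.509, 0.192, 11.520, 0.0, 120.491]


def g_squared(obs, exp):
    """Likelihood-ratio statistic; cells with zero observed count add nothing."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(obs, exp) if o > 0)


def pearson_x2(obs, exp):
    """Pearson statistic; cells with zero expected count are skipped."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp) if e > 0)


def chi2_upper_tail(x, df):
    """Upper-tail chi-squared probability (closed form, even df only)."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


g2 = g_squared(observed, expected)
x2 = pearson_x2(observed, expected)
p_g2 = chi2_upper_tail(g2, 6)
print(f"G2 = {g2:.2f}, X2 = {x2:.2f}, p(G2) = {p_g2:.3f}")
```

Substituting the Model 2 column of Table 2 and 4 df gives the corresponding check for the local dependence model.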

Let X1 and X2 denote the two latent classes and let, say, A1 and A2 denote a negative and positive outcome on Test A. One commonly reports results as the conditional probabilities of a positive item response in each latent class, or P(A2|X1), P(A2|X2), P(B2|X1), etc. Table 3 summarizes these values:


     Table 3.  Conditional Positive Response Probabilities for
     Latent Class Models of AIDS Tests Data
     ---------------------------------------------------
                       Model 1               Model 2

                   Probability of        Probability of
                   positive result       positive result
                    Latent Class          Latent Class
                   ---------------      ----------------
     Test            1        2            1        2
     ---------------------------------------------------
       A           0.030    1.000*       0.028    1.000*
       B           0.036    0.570        0.036    0.570
       C           0.000*   0.913        0.000*   0.911
       D           0.081    1.000*       0.079    1.000*
     ---------------------------------------------------
     Latent
     Class
     Prevalence    0.460    0.540        0.459    0.541
     ---------------------------------------------------
     *Parameter fixed to boundary value in estimation.
     Note:  Models defined in Table 2.
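To make the local independence assumption concrete, the sketch below rebuilds Model 1 expected frequencies from the rounded estimates in Table 3. Within each latent class the item response probabilities are simply multiplied, and the class-specific products are then weighted by the class prevalences. This is an illustrative Python calculation with hypothetical variable names, not LEM output; because Table 3 is rounded to three decimals, the results match the Model 1 column of Table 2 only to within a few tenths.

```python
from itertools import product

N = 428  # total number of subjects (Table 2)

# Rounded Model 1 estimates from Table 3.
prevalence = [0.460, 0.540]              # P(X1), P(X2)
p_positive = {                           # P(positive result | latent class)
    "A": [0.030, 1.000],
    "B": [0.036, 0.570],
    "C": [0.000, 0.913],
    "D": [0.081, 1.000],
}


def expected_frequency(pattern):
    """Expected count for a response pattern such as (1, 1, 1, 1).

    Coding follows Table 2: 1 = negative result, 2 = positive result.
    Within each class the item probabilities are multiplied, which is
    the local independence assumption.
    """
    total = 0.0
    for cls, prev in enumerate(prevalence):
        prob = prev
        for test, response in zip("ABCD", pattern):
            p_pos = p_positive[test][cls]
            prob *= p_pos if response == 2 else 1.0 - p_pos
        total += prob
    return N * total


expected = {pat: expected_frequency(pat) for pat in product((1, 2), repeat=4)}
for pat in [(1, 1, 1, 1), (2, 1, 2, 2), (2, 2, 2, 2)]:
    print(pat, round(expected[pat], 1))
```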

Diagnostics

A number of authors have discussed diagnostics for detecting locally dependent items (Espeland & Handelman, 1988; Garrett & Zeger, 2000; Hagenaars, 1988; Qu et al., 1996; Vermunt & Magidson, 2000). Several methods compare observed and model-predicted crossclassification frequencies for pairs of items; that is, for each (dichotomous) item pair I and J, one compares the observed 2 x 2 crossclassification table for item responses with the corresponding table predicted by the LCM being considered. Higher association of the items in the observed two-way table than in the expected two-way table implies local dependence.

We use a modified version of Garrett and Zeger's (2000) Log-Odds Ratio Check (LORC). (Their complete method involves advanced statistical methods, including Markov Chain Monte Carlo [MCMC] estimation; the simpler version here should usually produce similar conclusions). The modified method, summarized below, can be easily applied with use of a program that is supplied here.

Method. For each pair of items:

  1. Construct the observed and model-predicted two-way cross-classification frequency tables for the two items.

  2. Calculate the log-odds ratio (psi) in both the observed and expected two-way tables.

  3. Calculate the standard error of psi for the expected data, using the formula sigma(psi) = sqrt(1/a + 1/b + 1/c + 1/d) where a, b, c and d denote the four frequencies of the two-way table. The observed-data psi is then expressed as a z-score relative to the expected-data psi. That is,
                    psi(observed) - psi(expected)
              z  =  ----------------------------        (1)
                        sigma[psi(expected)]
    
    
  4. Examine whether the z-value exceeds a critical value of, say, +/- 1.645 or +/- 1.96. If so, that is evidence that the items are locally dependent. These z-values, however, are only guides; one should not interpret the p-values too literally. The main issue is their relative magnitude.

    Table 4.  Indices of Local Dependence among HIV Tests
    --------------------------------------------------------
                        Expected           Observed
                        log odds           log odds   z-
     Tests      G^2     ratio      s.e.    ratio      value
    --------------------------------------------------------
     A    B     0.11     3.53     0.412     3.67      0.35
     A    C     0.00     8.02     1.431     8.02      0.00
     A    D     0.04     6.20     0.511     6.30      0.20
     B    C     4.95     2.66     0.280     3.36      2.52 *
     B    D     0.05     3.45     0.421     3.35     -0.23
     C    D     0.00     7.64     1.427     7.64      0.00
    --------------------------------------------------------
    Note:  Tests defined in Table 1.
    * p < .05, two-tailed.

Results of this method applied to the Alvord et al. data are shown in Table 4. Tests B and C appear to be locally dependent, as indicated by the relatively large z-value associated with them. (For comparison, a G2 statistic is also shown for each pair of items; this is the likelihood-ratio chi-squared statistic comparing the observed and expected two-way table for the items [Espeland & Handelman, 1988]; again, this is much larger for Tests B and C.)
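The four steps of the method can be sketched in a few lines of Python. This is an illustration written for this page, not the CONDEP program; the function names are hypothetical. It uses the observed and Model 1 expected frequencies from Table 2 and should reproduce, up to rounding, the Table 4 rows for pairs whose two-way tables have no empty cells. Pairs with an empty cell (A with C, and C with D) would need some adjustment that is not described here, so the sketch leaves them out; calling it on such a pair raises an error.

```python
import math
from itertools import product

# Response patterns in Table 2 row order (A, B, C, D; 1 = negative, 2 = positive).
patterns = list(product((1, 2), repeat=4))
observed = [170, 15, 0, 0, 6, 0, 0, 0, 4, 17, 0, 83, 1, 4, 0, 128]
expected = [169.366, 14.837, 0.0, 0.0, 6.250, 0.548, 0.0, 0.0,
            5.193, 9.096, 0.0, 90.509, 0.192, 11.520, 0.0, 120.491]
ITEMS = "ABCD"


def two_way(freqs, item_i, item_j):
    """Step 1: collapse the 16 pattern frequencies into 2 x 2 cells (a, b, c, d)."""
    i, j = ITEMS.index(item_i), ITEMS.index(item_j)
    cells = {(r, s): 0.0 for r in (1, 2) for s in (1, 2)}
    for pat, f in zip(patterns, freqs):
        cells[(pat[i], pat[j])] += f
    return cells[(1, 1)], cells[(1, 2)], cells[(2, 1)], cells[(2, 2)]


def log_odds_ratio(a, b, c, d):
    """Step 2: log-odds ratio (psi) of a 2 x 2 table."""
    return math.log((a * d) / (b * c))


def lorc(item_i, item_j):
    """Steps 2-3: return (expected psi, its s.e., observed psi, z) for one pair."""
    obs = two_way(observed, item_i, item_j)
    exp = two_way(expected, item_i, item_j)
    psi_obs = log_odds_ratio(*obs)
    psi_exp = log_odds_ratio(*exp)
    se = math.sqrt(sum(1.0 / cell for cell in exp))  # from the expected table
    return psi_exp, se, psi_obs, (psi_obs - psi_exp) / se


for pair in [("A", "B"), ("B", "C"), ("B", "D")]:
    psi_exp, se, psi_obs, z = lorc(*pair)
    print(pair, f"{psi_exp:.2f} {se:.3f} {psi_obs:.2f} {z:.2f}")
```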

Table 4 was constructed by the CONDEP program. Program input consisted of the first 6 columns of Table 2, which were cut and pasted from the LEM output into a separate file. An executable version of the program, with documentation, is available for download (ZIPped file, about 39k).

Vermunt and Magidson (2000) recently described a promising new way to diagnose local dependence. This estimates the improvement in model fit obtained by allowing local dependence of each pair of items. The Latent GOLD program includes this method.


Modeling Local Dependence

There are several ways to relax local independence assumptions in LCMs. We will illustrate three methods with the Alvord et al. (1988) data:

  1. the joint item method;

  2. the multiple indicator method;

  3. the loglinear formulation of LCA.

The joint item method

For simple kinds of local dependence, such as among pairs of items, the joint item method is useful. The principle is to replace two or more dependent items with a joint item, created by considering all combinations of levels of the original items. In the present case, we create a new item BC with four levels, as shown in Table 5.

                   Table 5.  Joint Item BC
                    Created by Considering
                  All Combinations of Levels
                       on Tests B and C
                    ---------------------
                               Levels on
                               original
                   Level on      items
                   new item    ---------
                      BC         B   C
                    ---------------------
                      1          1   1
                      2          1   2
                      3          2   1
                      4          2   2
                    ---------------------

One then recodes the data accordingly, and estimates a standard LCM on the new item set--i.e., here, the three items A, BC and D. The LCM estimated is a standard LCM with no special provisions: the local dependence of B and C is handled by the recoding.

The Alvord et al. data so recoded, in indexed-frequency format with unobserved response patterns dropped, is as follows:


    1   1   1     170
    1   1   2      15
    1   3   1       6
    2   1   1       4
    2   1   2      17
    2   2   2      83
    2   3   1       1
    2   3   2       4
    2   4   2     128
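For readers who do need to recode by hand (for example, for a program other than LEM), the recoding is mechanical. The Python sketch below is an illustration with hypothetical variable names: it maps each (B, C) combination to its level on the joint item BC according to Table 5, drops the unobserved response patterns, and prints rows in the same A, BC, D, frequency layout as the listing above.

```python
from itertools import product

# Cell frequencies in the order of the LEM `dat` line (A, B, C, D; D varies fastest).
freqs = [170, 15, 0, 0, 6, 0, 0, 0, 4, 17, 0, 83, 1, 4, 0, 128]

# Table 5: joint item level for each (B, C) combination.
JOINT_LEVEL = {(1, 1): 1, (1, 2): 2, (2, 1): 3, (2, 2): 4}

recoded = [
    (a, JOINT_LEVEL[(b, c)], d, n)
    for (a, b, c, d), n in zip(product((1, 2), repeat=4), freqs)
    if n > 0  # drop unobserved response patterns
]

for row in recoded:
    print("{:5d}{:4d}{:4d}{:8d}".format(*row))
```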


With LEM, however, it is not necessary to recode the data; the joint item method can be used in a way that is mostly transparent to the user.

To illustrate with the Alvord et al. data, let Model 2 denote a two-latent class LCM with local dependence between Tests B and C. The LEM commands to estimate this model via the joint item method are as follows:


   * Example 2
   *
   * Model 2:  2-class LCM with local dependence of Tests B
   * and C modeled by the joint item method
   * Data:  Alvord et al., 1988
   *
   * A = Radioimmunoassay (RIA) for antigen ag121
   * B = RIA using purified HIV p24
   * C = RIA using purified HIV gp120
   * D = Enzyme-linked immunosorbent assay (ELISA)
   *
   lat 1
   man 4
   dim 2 2 2 2 2
   lab X A B C D
   mod X A|X BC|X D|X
   dat [170 15 0 0 6 0 0 0 4 17 0 83 1 4 0 128]
   sta A|X [ .8 .2 .2 .8  ]
   sta D|X [ .8 .2 .2 .8  ]


The main difference between this and the commands for Model 1 is that the model (mod) line substitutes BC|X for B|X and C|X.

For Model 2, LEM obtains G2 = 3.06 (p = .549) and X2 = 4.49 (p = .344) with 4 df. These imply acceptable model fit.

Parameter estimates for Model 2 are shown in Table 3. The estimated conditional response probabilities for Tests A and D can be obtained from the CONDITIONAL PROBABILITIES output section. For example, LEM shows:

   * P(A|X) *

     1 | 1          0.9724  (0.0122)
     2 | 1          0.0276  (0.0122)
     1 | 2          0.0000  (0.0000) *
     2 | 2          1.0000  (0.0000) *

However, since B and C comprise a joint item, their output is different. In the STATISTICS section, LEM shows:

   * P(BC|X) *

     1 1 | 1        0.9643  (0.0133)
     1 2 | 1        0.0000  (0.0000) *
     2 1 | 1        0.0357  (0.0133)
     2 2 | 1        0.0000  (0.0000) *
     1 1 | 2        0.0716  (0.0172)
     1 2 | 2        0.3584  (0.0315)
     2 1 | 2        0.0172  (0.0086)
     2 2 | 2        0.5527  (0.0327)

The conditional probabilities we seek are marginal sums of this table. For example,

     P(B2|X2) = P(B2,C1|X2) + P(B2,C2|X2)
              = .0172 + .5527 = .5699.
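The same marginal sums can be taken for every item and latent class at once. The short Python sketch below is only an illustration (the dictionary layout and function name are made up for this example); it uses the P(BC|X) estimates exactly as printed above, so results are accurate to the four decimals LEM reports.

```python
# P(BC|X) as printed by LEM above: keys are (B, C), values are [class 1, class 2].
p_bc_given_x = {
    (1, 1): [0.9643, 0.0716],
    (1, 2): [0.0000, 0.3584],
    (2, 1): [0.0357, 0.0172],
    (2, 2): [0.0000, 0.5527],
}


def marginal_positive(item, latent_class):
    """P(item = 2 | X = latent_class), summing over the other item's levels."""
    position = {"B": 0, "C": 1}[item]
    return sum(
        probs[latent_class - 1]
        for levels, probs in p_bc_given_x.items()
        if levels[position] == 2
    )


for item in ("B", "C"):
    for cls in (1, 2):
        print(f"P({item}2|X{cls}) = {marginal_positive(item, cls):.4f}")
```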


LEM supplies these marginal response probabilities in the LATENT CLASS OUTPUT section. There we find:

   *** LATENT CLASS OUTPUT ***

             X  1    X  2
            0.4589  0.5411
     A  1   0.9724  0.0000
     A  2   0.0276  1.0000
     B  1   0.9643  0.4301
     B  2   0.0357  0.5699
     C  1   1.0000  0.0888
     C  2   0.0000  0.9112
     D  1   0.9215  0.0000
     D  2   0.0785  1.0000

So we can directly see that, for example, P(B2|X2) = .5699 and P(C2|X2) = .9112.

With PANMARK or MLLSA one must calculate the marginal response probabilities from the output supplied.

Neither LEM nor PANMARK supplies the standard errors for the marginal response probabilities (MLLSA supplies no standard errors at all). This is so regardless of which method is used to model local dependence. They can be calculated by the delta method, but this may entail a fair amount of work. Latent GOLD supplies these standard errors automatically.

Many applied studies using LCA do not report parameter standard errors, so the importance of this is unclear. By analogy, note that one seldom reports standard errors of factor loadings in factor analysis. In any case, this issue should not be seen as a reason to avoid incorporating local dependence into LCMs.

In Table 3 one sees that the parameter estimates for Model 2 are very close to those for Model 1. This similarity of results sometimes occurs with a local dependence LCM and sometimes it does not. However, note the substantial drop in the G2 model fit statistic associated with Model 2. Model 1, a standard two-class LCM, clearly does not fit the data. To fit the data with a conditional-independence LCM, another latent class would need to be added, which would complicate and potentially distort the results. It would, for example, make it harder to estimate the sensitivity and specificity of each diagnostic test. By allowing for local dependence of Tests B and C, we fit the data with a two-latent class model.

One can also construct joint variables that combine responses to three or more items. Further, there can be several joint items in a given model. However, there cannot be overlap in the original items that define different joint items. For example, we could not have one joint item defined by items A and B and another joint item defined by items B and C, because both joint items would include item B.

The multiple indicator method

This method works by modeling locally dependent items as multiple indicators of a common latent variable. For our data, the situation is as shown in Figure 2.
                 X
               / | \
              /  |  \
             /   |   \
            A    Y    D
                / \
               B   C

Figure 2 (draft). Local dependence with items B and C being multiple indicators of the latent variable Y.

The latent variable Y is an unobserved construct or entity which both B and C measure. The principle is familiar from LISREL-type modeling; the difference is that here Y is a discrete latent variable--i.e., a latent class variable with two or more levels.

The LEM command file to estimate the model in Figure 2 is as follows:


   * Example 3
   *
   * Model 2a:  2-class LCM with local dependence of Tests B
   * and C modeled by the multiple-indicator method
   * Data:  Alvord et al., 1988
   *
   * A = Test 1
   * B = Test 2
   * C = Test 3
   * D = Test 4
   *
   lat 2
   man 4
   dim 2 2 2 2 2 2
   lab X Y A B C D
   mod X Y|X A|X B|Y C|Y D|X
   dat [170 15 0 0 6 0 0 0 4 17 0 83 1 4 0 128]
   sta A|X [ .8 .2 .2 .8  ]
   sta B|Y [ .8 .2 .2 .8  ]
   sta C|Y [ .8 .2 .2 .8  ]
   sta D|X [ .8 .2 .2 .8  ]

We could instead use the model (mod) line:
   mod XY A|X B|Y C|Y D|X  
However, specifying X Y|X rather than XY has the effect of structuring the output in a more helpful way.

Because two latent variables are specified (X and Y), LEM will actually estimate a four-class LCM, each class representing a combination of levels on X and Y. However, with LEM this is mainly transparent, and the substantive interpretation of a two-class model with Latent Class 1 = AIDS virus absent and Latent Class 2 = AIDS virus present is not obscured.

When the commands above are submitted to LEM, a model is estimated that fits the data to the same degree as the joint item method. That is expected, since the models are identical at a fundamental level. In the CONDITIONAL PROBABILITIES output section we find that the estimates for latent class probabilities and the conditional response probabilities for items A and D are the same as before.

We must manipulate output results slightly to get the conditional response probabilities for items B and C. Taking item B, for example, the CONDITIONAL PROBABILITIES section shows the following:


   * P(Y|X) *

     1 | 1          1.0000  (0.0000) *
     2 | 1          0.0000  (0.0000) *
     1 | 2          0.0643  (0.0195)
     2 | 2          0.9357  (0.0195)

   * P(B|Y) *

     2 | 1          0.0357  (0.0133)
     2 | 2          0.6066  (0.0336)

From these values we calculate

     P(B2|X1) = P(B2|Y1)*P(Y1|X1) + P(B2|Y2)*P(Y2|X1)
              = .0357 * 1.0 + .6066 * 0 = .036

     P(B2|X2) = P(B2|Y1)*P(Y1|X2) + P(B2|Y2)*P(Y2|X2)
              = .0357 * .0643 + .6066 * .9357 = .570

In this way we get the same conditional response probabilities for items B and C as obtained by the joint item method.
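The calculation for item B can be written compactly as a sum over the levels of Y. The Python sketch below is an illustration only, with hypothetical variable names, using the P(Y|X) and P(B|Y) estimates printed above. The same pattern would apply to item C once its P(C|Y) estimates are read from the LEM output.

```python
# Estimates printed by LEM above.
p_y_given_x = {1: {1: 1.0000, 2: 0.0000},   # P(Y = y | X = 1)
               2: {1: 0.0643, 2: 0.9357}}   # P(Y = y | X = 2)
p_b2_given_y = {1: 0.0357, 2: 0.6066}       # P(B = 2 | Y = y)


def p_b2_given_x(x):
    """P(B2|X = x) = sum over y of P(B2|Y = y) * P(Y = y|X = x)."""
    return sum(p_b2_given_y[y] * p_y_given_x[x][y] for y in (1, 2))


for x in (1, 2):
    print(f"P(B2|X{x}) = {p_b2_given_x(x):.3f}")
```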

Goodman (1974a) first described how to implement this method with special restrictions on the latent class model (see also Hagenaars, 1988). With that technique, the multiple-indicator method can be used with programs like PANMARK and MLLSA. The restrictions constrain various conditional response probabilities to be equal across pairs of latent classes. A parameter restriction file for estimating this model with PANMARK is also available.

The loglinear formulation of LCA

Clogg (1995) distinguished between the classical formulation and the loglinear formulation of LCA. The classical method is the standard parameterization of Lazarsfeld and Henry (1968) and Goodman (1974b), with the basic model parameters being the latent class prevalences and conditional response probabilities. The loglinear formulation (Haberman, 1979) reparameterizes the latent class model as a type of loglinear model. With the loglinear formulation it becomes much simpler to specify certain types of local dependence models.

A LEM command file to model local dependence of Tests B and C using the loglinear formulation is as follows:


   * Example 4
   *
   * Model 2b:  2-class LCM with local dependence of Tests B
   * and C modeled via the loglinear formulation
   * Data:  Alvord et al., 1988
   *
   * A = Test 1
   * B = Test 2
   * C = Test 3
   * D = Test 4
   *
   lat 1
   man 4
   dim 2 2 2 2 2
   lab X A B C D
   mod {AX BCX DX}
   dat [170 15 0 0 6 0 0 0 4 17 0 83 1 4 0 128]
   sta A|X [ .8 .2 .2 .8  ]
   sta D|X [ .8 .2 .2 .8  ]

Again, the only difference here is with the mod line. Model effects are specified in the conventional notation for loglinear models (see, for example, Hagenaars, 1990). In particular, the term BCX means that the model includes terms associated with the main effects of tests B and C, the two-way interactions BX, CX and BC, and the three-way interaction between B, C and X.

Again, this model is fundamentally identical to those specified in Examples 2 and 3, and the same model fit and parameter estimates are obtained. Estimated latent class prevalences and conditional response probabilities are found directly in the LATENT CLASS OUTPUT section.

Comparison of methods

When there are only pairs of locally dependent dichotomous variables, all three methods discussed here give the same results. However, when there is local dependence among three or more items, the methods can give different results.

To appreciate this, consider a two-latent class LCM with a joint variable created for dichotomous manifest variables A, B, and C. The joint variable ABC has eight levels and is associated with seven independent estimated response probabilities per latent class.

Suppose instead that we wish to model local dependence between each pair of items, (A, B), (B, C), and (A, C), as shown in Figure 3. Here there are only six estimated parameters per latent class associated with the response probabilities of these items. (These correspond to the three paths from X to the items and the three paths between the items).

                 X
               / | \
              /  |  \
             /   |   \
            A<-->B<-->C
             \_______/


Figure 3 (draft). Local dependence among each pair of items A, B, and C.

With the loglinear formulation we could specify either model by including or not including the three-way interaction between A, B and C. That is, in the model line we could specify the effects associated with these items either by ABCX or by the three terms ABX, ACX and BCX.
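The difference between these two specifications grows quickly with the number of locally dependent items. The Python sketch below counts per-class parameters for k dichotomous items under each specification; it is a simple extrapolation of the counts of seven and six given above for three items, not output from any LCA program, and the function names are hypothetical.

```python
from math import comb


def joint_item_params(k):
    """Free response probabilities per class for one joint item built from
    k dichotomous items (all 2**k combinations, minus one constraint)."""
    return 2 ** k - 1


def pairwise_params(k):
    """Per-class terms when every pair of the k items is associated but no
    higher-order association is allowed: k item effects plus one per pair."""
    return k + comb(k, 2)


for k in (2, 3, 4):
    print(k, joint_item_params(k), pairwise_params(k))
```

For k = 2 the two counts coincide, which is in line with the earlier remark that for a single pair of items the choice of model makes little practical difference.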

Alternatively we could view A, B and C as multiple indicators, as illustrated in Figure 4.

                 X
                 |
                 Y
               / | \
             A   B   C


Figure 4 (draft). Local dependence with items A, B and C multiple indicators of latent variable Y.

Then their local dependence could be modeled using only four parameters per latent class (that is, assuming only two levels for the latent variable Y).

Thus when there are sets of more than two locally dependent items the choice of models becomes more of an issue. Theory will of course help guide the choice. In some cases one may wish to compare results using different methods.


Summary and Conclusions

The present discussion may be summarized as follows:

References


This page maintained by:

John Uebersax
jsuebersax@yahoo.com

Last updated: 10 August 2000


Copyright (c) 2000 John S. Uebersax. All rights reserved.