Topics 3 & 4: Analysis of Two-Level Longitudinal Data

Summary

This La Trobe University lecture document describes the analysis of two-level longitudinal data, focusing on the Autism study conducted by Anderson et al. (2009). It examines how social development in autistic children changes over time and how it relates to their language development at age 2. The lecture covers the specification of a linear mixed model, likelihood ratio tests for random and fixed effects, model diagnostics and the interpretation of the parameter estimates.

Full Transcript


Topics 3 & 4: Analysis of Two-Level Longitudinal Data
Analysis of Repeated Measures, La Trobe STA5ARM

1. The Autism study (Ch 6.2)

Description of the Autism study

The data we use in Topics 3 and 4 are taken from the Autism study conducted by Anderson et al. (2009). We focus on data from 158 children with autism spectrum disorder (ASD). The data consist of the following variables:

Childid: A factor variable that identifies the child.
Vsae: A continuous variable that measures the social development of a child at ages 2, 3, 5, 9 and 13 years. Note: not all children were measured at each age.
Age: The age of the child in years (2, 3, 5, 9 or 13).
Sicdegp: A factor variable that measures the language development of the child at age 2. It has 3 levels (1 = Low, 2 = Medium, 3 = High).

Table 1 presents the data on each of the variables for two children.

Table 1: Autism data for children 1 and 2

Childid  Vsae  Age  Sicdegp
1        6     2    3
1        7     3    3
1        18    5    3
1        25    9    3
1        27    13   3
2        6     2    1
2        7     3    1
2        7     5    1
2        8     9    1
2        14    13   1

Our aim is to examine the following research questions:

1. How does social development change over time for autistic children?
2. Does the level of language development at age 2 (initial language development) influence the social development trajectory over time for autistic children?
3. Does the trajectory of social development over time differ between children?

These research questions, together with numerical and graphical displays of the data, enable us to determine the structure of the linear mixed model we will use. Numerical and graphical summaries of the data, produced in R, allow us to quickly see the relationships between the variables in the study.
2. Numerical and graphical analysis (Ch 6.2)

Examining research question 1 numerically and graphically

Table 2 below provides the number of observations, sample mean and sample standard deviation for Vsae (which measures social development) grouped by Age.

Table 2: Summary of Vsae grouped by Age

Age  n    Vsae mean  Vsae stdev
2    156  9.090      3.856
3    149  15.255     7.978
5    91   21.484     13.319
9    119  39.555     32.617
13   95   60.600     48.920

Figure 1 displays a scatter plot of the values of Vsae vs Age.

Figure 1: Scatter plot of Vsae vs Age (x-axis: Age; y-axis: Vsae)

The orange squares represent the observed mean values of Vsae at each Age level.

Table 2 and Figure 1 show that, on average, social development (measured by Vsae) for autistic children improves over time in a roughly linear manner. This suggests that our linear mixed model should include the linear effect of Age on Vsae. Also, note that the variability of Vsae scores increases with Age.

Examining research question 2 numerically and graphically

Table 3 below provides the number of observations, sample mean and sample standard deviation for Vsae grouped by Age and Sicdegp (which measures initial language development).

Table 3: Summary of Vsae grouped by Age and Sicdegp

Sicdegp.f  Age  n   Vsae mean  Vsae stdev
1          2    50  7.000      2.733
1          3    47  12.021     6.264
1          5    29  15.034     7.921
1          9    36  25.556     28.416
1          13   28  37.107     35.538
2          2    66  8.667      3.536
2          3    64  14.078     6.199
2          5    36  17.694     7.996
2          9    48  32.125     23.399
2          13   41  58.829     50.267
3          2    40  12.400     3.425
3          3    38  21.237     9.379
3          5    26  33.923     15.781
3          9    35  64.143     34.588
3          13   26  88.692     46.340
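The grouped summaries in Tables 2 and 3 are simple per-group counts, means and standard deviations of long-format data. The lecture produces them in R; the sketch below is only an illustration in Python, and it uses just the ten rows of Table 1 (children 1 and 2), so its numbers will not match Table 2. The function name `summarise_by_age` is hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

# Rows from Table 1 only (children 1 and 2): (Childid, Vsae, Age, Sicdegp).
# The full study has 158 children; this subset is purely illustrative.
rows = [
    (1, 6, 2, 3), (1, 7, 3, 3), (1, 18, 5, 3), (1, 25, 9, 3), (1, 27, 13, 3),
    (2, 6, 2, 1), (2, 7, 3, 1), (2, 7, 5, 1), (2, 8, 9, 1), (2, 14, 13, 1),
]


def summarise_by_age(rows):
    """Return {age: (n, mean, sd)} for Vsae grouped by Age."""
    groups = defaultdict(list)
    for _childid, vsae, age, _sicdegp in rows:
        groups[age].append(vsae)
    return {
        age: (len(v), mean(v), stdev(v) if len(v) > 1 else float("nan"))
        for age, v in sorted(groups.items())
    }


summary = summarise_by_age(rows)
for age, (n, m, sd) in summary.items():
    print(f"Age {age:>2}: n={n}  mean={m:.3f}  sd={sd:.3f}")
```

Grouping by the pair (Age, Sicdegp) instead of Age alone would give the layout of Table 3.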
Figure 2 displays a scatter plot of the values of Vsae vs Age for each level of Sicdegp.

Figure 2: Vsae vs Age for each level of Sicdegp (x-axis: Age; y-axis: Vsae; legend: Sicdegp.f = 1, 2, 3)

The red, blue and green squares represent the observed mean values of Vsae at each Age level for the Low, Medium and High levels of Sicdegp, respectively.

Table 3 and Figure 2 show that the trajectory of mean social development over time is linear for the Low and High levels of initial language development but is possibly quadratic for the Medium level. The steeper slope of the green line in Figure 2 suggests that the rate of improvement of social development over time is greater for children who have High initial language development than for those who have a lower level of initial language development.

These points suggest that our linear mixed model should include the simple quadratic effect of Age, the simple effect of Sicdegp, and the interaction effects Age × Sicdegp and Age² × Sicdegp. There is some doubt, however, whether the quadratic effect of Age and the interaction effect Age² × Sicdegp should be included in the model. We will test for these effects.

Table 3 and Figure 2 also show that mean social development at age 2 is similar across the levels of initial language development.

Examining research question 3 graphically

Figure 3 displays each child's trajectory of Vsae values over time, where the trajectories are grouped by the levels of Sicdegp.

Figure 3: Vsae trajectories over time for each child (panels: Sicdegp 1, 2, 3; x-axis: Age; y-axis: Vsae)

Each coloured line is one child's trajectory of Vsae values over time.
Figure 3 shows that the starting point (age 2) of the trajectories of social development over time is similar between children. This suggests that our linear mixed model does not need a random intercept for each child. Note that we usually test for this, but in this example the inclusion of the random intercept caused computational issues.

Figure 3 also shows that the trajectories of social development over time differ between children. Some children show no improvement over time, some have linear trajectories that vary, while others have quadratic trajectories that vary. This suggests that a random linear effect of Age and a random quadratic effect of Age, for each child, should be included in the model.

3. Linear mixed model for Autism study (Ch 6.3)

Individual observation specification of our full linear mixed model

We examine research questions 1 to 3 using the full linear mixed model,

\[
\begin{aligned}
\text{Vsae}_{ti} ={}& \beta_0 + \beta_1\,\text{Age2}_{ti} + \beta_2\,\text{Age2Sq}_{ti} + \beta_3\,\text{Sicdegp2}_{i} + \beta_4\,\text{Sicdegp3}_{i} \\
&+ \beta_5\,\text{Age2}_{ti} \times \text{Sicdegp2}_{i} + \beta_6\,\text{Age2}_{ti} \times \text{Sicdegp3}_{i} \\
&+ \beta_7\,\text{Age2Sq}_{ti} \times \text{Sicdegp2}_{i} + \beta_8\,\text{Age2Sq}_{ti} \times \text{Sicdegp3}_{i} \\
&+ \mu_{1i}\,\text{Age2}_{ti} + \mu_{2i}\,\text{Age2Sq}_{ti} + \varepsilon_{ti}.
\end{aligned}
\tag{1}
\]

$\text{Vsae}_{ti}$ is the social development score for child $i$ ($i = 1, \ldots, 158$) at occasion $t$ ($t = 1, \ldots, 5$).
$\text{Age2}_{ti}$ is the age of child $i$ less 2 years, at occasion $t$.
$\text{Age2Sq}_{ti}$ is the squared value of $\text{Age2}_{ti}$, for child $i$ at occasion $t$.
$\text{Sicdegp2}_{i} = 1$ if child $i$ has a Medium level of initial language development and 0 otherwise.
$\text{Sicdegp3}_{i} = 1$ if child $i$ has a High level of initial language development and 0 otherwise.

$\beta_0$ is the fixed intercept and can be interpreted as the mean of Vsae for children who are 2 years old and have a Low level of initial language development.
$\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$ are the fixed simple effects (or conditional effects) of Age2, Age2Sq, Sicdegp2 and Sicdegp3, respectively.

$\beta_1$ is the linear effect of Age2 on Vsae for children who have a Low level of initial language development. See if you can interpret the other simple effects.

Technically, $\beta_1$ is the slope of the mean Vsae trajectory, at 2 years of age, for children who have a Low level of initial language development.

$\beta_5$, $\beta_6$, $\beta_7$ and $\beta_8$ are the fixed interaction effects of Age2 × Sicdegp2, Age2 × Sicdegp3, Age2Sq × Sicdegp2 and Age2Sq × Sicdegp3, respectively.

$\beta_5$ can be interpreted as the difference in the linear effect of Age2 on Vsae between children who have a Medium level of initial language development and those who have a Low level. See if you can interpret the other interaction effects.

$\mu_{1i}$ is the random linear effect of Age2 on Vsae, specific to child $i$.
$\mu_{2i}$ is the random quadratic effect of Age2 on Vsae, specific to child $i$.
$\varepsilon_{ti}$ is the random error associated with measuring Vsae at occasion $t$, for child $i$.

Matrix specification of our full linear mixed model

In matrix form our full linear mixed model is

\[
Y_i = X_i \beta + Z_i \mu_i + \varepsilon_i ,
\]

where $Y_i$ represents a 5 × 1 vector of social development scores for child $i$. That is,

\[
Y_i = \begin{bmatrix} \text{Vsae}_{1i} \\ \text{Vsae}_{2i} \\ \text{Vsae}_{3i} \\ \text{Vsae}_{4i} \\ \text{Vsae}_{5i} \end{bmatrix}.
\]
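To make the matrix form concrete, the sketch below builds $X_i$ and $Z_i$ for one child directly from the predictor definitions in model (1). It is an illustration only: the function name `design_matrices` and the plain-list representation are assumptions, not part of the lecture material, and it assumes the child was measured at every age passed in.

```python
AGES = [2, 3, 5, 9, 13]  # measurement ages in years


def design_matrices(sicdegp, ages=AGES):
    """Build X_i (fixed effects) and Z_i (random effects) for one child.

    sicdegp: 1 = Low, 2 = Medium, 3 = High (Low is the reference level).
    Column order of X_i follows model (1): intercept, Age2, Age2Sq,
    Sicdegp2, Sicdegp3, Age2*Sicdegp2, Age2*Sicdegp3,
    Age2Sq*Sicdegp2, Age2Sq*Sicdegp3.
    Z_i holds the columns Age2 and Age2Sq (no random intercept).
    """
    s2 = 1 if sicdegp == 2 else 0
    s3 = 1 if sicdegp == 3 else 0
    X, Z = [], []
    for age in ages:
        a = age - 2   # Age2: age less 2 years
        a2 = a * a    # Age2Sq
        X.append([1, a, a2, s2, s3, a * s2, a * s3, a2 * s2, a2 * s3])
        Z.append([a, a2])
    return X, Z


# Child 1 in Table 1 has Sicdegp = 3 (High) and was measured at all five ages.
X1, Z1 = design_matrices(sicdegp=3)
for row in X1:
    print(row)
```

For a child who missed some occasions, passing only the observed ages gives matrices with correspondingly fewer rows.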
$X_i$ is a 5 × 9 matrix consisting of a column of 1's and the known values of the predictors Age2, Age2Sq, Sicdegp2, Sicdegp3, Age2 × Sicdegp2, Age2 × Sicdegp3, Age2Sq × Sicdegp2 and Age2Sq × Sicdegp3 for child $i$. That is, for child $i$,

\[
X_i = \begin{bmatrix}
1 & 0 & 0 & \text{Sicdegp2}_i & \text{Sicdegp3}_i & 0 \times \text{Sicdegp2}_i & 0 \times \text{Sicdegp3}_i & 0 \times \text{Sicdegp2}_i & 0 \times \text{Sicdegp3}_i \\
1 & 1 & 1 & \text{Sicdegp2}_i & \text{Sicdegp3}_i & 1 \times \text{Sicdegp2}_i & 1 \times \text{Sicdegp3}_i & 1 \times \text{Sicdegp2}_i & 1 \times \text{Sicdegp3}_i \\
1 & 3 & 9 & \text{Sicdegp2}_i & \text{Sicdegp3}_i & 3 \times \text{Sicdegp2}_i & 3 \times \text{Sicdegp3}_i & 9 \times \text{Sicdegp2}_i & 9 \times \text{Sicdegp3}_i \\
1 & 7 & 49 & \text{Sicdegp2}_i & \text{Sicdegp3}_i & 7 \times \text{Sicdegp2}_i & 7 \times \text{Sicdegp3}_i & 49 \times \text{Sicdegp2}_i & 49 \times \text{Sicdegp3}_i \\
1 & 11 & 121 & \text{Sicdegp2}_i & \text{Sicdegp3}_i & 11 \times \text{Sicdegp2}_i & 11 \times \text{Sicdegp3}_i & 121 \times \text{Sicdegp2}_i & 121 \times \text{Sicdegp3}_i
\end{bmatrix}.
\]

$\beta$ is a 9 × 1 vector of 9 unknown fixed effect parameters. That is,

\[
\beta = \begin{bmatrix} \beta_0 & \beta_1 & \beta_2 & \beta_3 & \beta_4 & \beta_5 & \beta_6 & \beta_7 & \beta_8 \end{bmatrix}^{T}.
\]

$Z_i$ is a 5 × 2 matrix containing the known values of the predictors Age2 and Age2Sq. That is,

\[
Z_i = \begin{bmatrix} 0 & 0 \\ 1 & 1 \\ 3 & 9 \\ 7 & 49 \\ 11 & 121 \end{bmatrix}.
\]

$\mu_i$ is a 2 × 1 random effect vector containing the random linear effect of Age2 and the random quadratic effect of Age2, for child $i$. That is,

\[
\mu_i = \begin{bmatrix} \mu_{1i} \\ \mu_{2i} \end{bmatrix}.
\]

We assume that the random effect vectors are independent between children. That is, $\mu_i$ is independent of $\mu_{i'}$ for $i \neq i'$.

We assume that $\mu_i$ follows a multivariate normal distribution with 2 × 1 mean vector $0$ and 2 × 2 unstructured covariance matrix $D$.
That is, $\mu_i \sim N(0, D)$ where

\[
0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\qquad
D = \operatorname{Var}(\mu_i) = \begin{bmatrix} \sigma^2_{\mu_1} & \sigma_{\mu_1,\mu_2} \\ \sigma_{\mu_1,\mu_2} & \sigma^2_{\mu_2} \end{bmatrix}.
\]

The variance-covariance parameter vector of $D$ is

\[
\theta_D = \begin{bmatrix} \sigma^2_{\mu_1}, & \sigma^2_{\mu_2}, & \sigma_{\mu_1,\mu_2} \end{bmatrix}^{T}.
\]

$\varepsilon_i$ is a 5 × 1 random error vector containing the 5 random errors associated with the 5 social development scores for child $i$. That is,

\[
\varepsilon_i = \begin{bmatrix} \varepsilon_{1i} \\ \varepsilon_{2i} \\ \varepsilon_{3i} \\ \varepsilon_{4i} \\ \varepsilon_{5i} \end{bmatrix}.
\]

We assume that the random error vectors of different children are independent of each other.

We assume that $\varepsilon_i$ follows a multivariate normal distribution with 5 × 1 mean vector $0$ and 5 × 5 diagonal covariance matrix $R$. That is, $\varepsilon_i \sim N(0, R)$ where

\[
0 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix},
\qquad
R = \operatorname{Var}(\varepsilon_i) = \begin{bmatrix}
\sigma^2 & 0 & 0 & 0 & 0 \\
0 & \sigma^2 & 0 & 0 & 0 \\
0 & 0 & \sigma^2 & 0 & 0 \\
0 & 0 & 0 & \sigma^2 & 0 \\
0 & 0 & 0 & 0 & \sigma^2
\end{bmatrix}.
\]

The variance-covariance parameter vector of $R$ is $\theta_R = \sigma^2$.

Note that usually I would have specified an unstructured covariance matrix to begin with; however, in this example there were computational issues with specifying a structure that is more liberal than the diagonal structure.

In practice it is a good idea to compare different structures for the covariance matrix $R$ to see which structure fits the data appropriately. Unfortunately, because of computational issues, we could not do this for this example. Topic 5 will demonstrate how to examine and compare different structures for the $R$ matrix.

We assume the random error vectors $\varepsilon_1, \ldots, \varepsilon_{158}$ are independent of the random effect vectors $\mu_1, \ldots, \mu_{158}$.

We assume that there is no correlation between the predictors and the random errors (level-1 exogeneity).
We assume that there is no correlation between the predictors and the random effects (level-2 exogeneity).

4. Hypothesis tests for random effects (Ch 6.5 & 2.6)

REML-based likelihood ratio test

Figure 3 seemed to show that the quadratic effect of Age on Vsae differed between children. We can test for this by testing whether the child-specific random quadratic effect of Age2, $\mu_{2i}$, should be included in model (1).

If the random quadratic effect of Age2 is not included in model (1) then our model would be

\[
\begin{aligned}
\text{Vsae}_{ti} ={}& \beta_0 + \beta_1\,\text{Age2}_{ti} + \beta_2\,\text{Age2Sq}_{ti} + \beta_3\,\text{Sicdegp2}_{i} + \beta_4\,\text{Sicdegp3}_{i} \\
&+ \beta_5\,\text{Age2}_{ti} \times \text{Sicdegp2}_{i} + \beta_6\,\text{Age2}_{ti} \times \text{Sicdegp3}_{i} \\
&+ \beta_7\,\text{Age2Sq}_{ti} \times \text{Sicdegp2}_{i} + \beta_8\,\text{Age2Sq}_{ti} \times \text{Sicdegp3}_{i} \\
&+ \mu_{1i}\,\text{Age2}_{ti} + \varepsilon_{ti}.
\end{aligned}
\tag{2}
\]

We can choose between the reference model (1) and the nested model (2) by testing the null hypothesis $H_0: \sigma^2_{\mu_2} = 0$ vs the alternative hypothesis $H_1: \sigma^2_{\mu_2} > 0$.

The null hypothesis $H_0: \sigma^2_{\mu_2} = 0$ is the same as saying that the random quadratic effects $\mu_{2,1}, \ldots, \mu_{2,158}$ are all zero, since (i) $H_0$ states that there is no variability in the child-specific random quadratic effects and (ii) the mean of the random quadratic effects is equal to zero.

The null hypothesis $H_0: \sigma^2_{\mu_2} = 0$ also implies that $\sigma_{\mu_1,\mu_2} = 0$, since the covariance between two variables is 0 if one of the two variables does not vary.

We can test the above hypotheses by using the REML-based Likelihood Ratio Test (RLRT).

In general the RLRT p-value for testing the null hypothesis that the variance of a random effect is zero is given by

\[
p\text{-value} = 0.5 \times P\left(\chi^2_{r-1} > 2\ln(L_{RR}) - 2\ln(L_{NR})\right) + 0.5 \times P\left(\chi^2_{r} > 2\ln(L_{RR}) - 2\ln(L_{NR})\right).
\]

$r$ is the number of random effects in the reference model.
$L_{RR}$ is the value of the likelihood function of the Reference model evaluated using REML estimates.
$L_{NR}$ is the value of the likelihood function of the Nested model evaluated using REML estimates.
See Verbeke & Molenberghs (2000) for further insight into RLRTs.

Testing the random quadratic effect of Age in our LMM

The REML-based log-likelihood of the reference model (model (1)) is -2307.638.
The REML-based log-likelihood of the nested model (model (2)) is -2349.601.

The RLRT p-value for testing the null hypothesis that the variance of the random quadratic effect is zero is

\[
p\text{-value} = 0.5 \times P\left(\chi^2_{1} > 83.927\right) + 0.5 \times P\left(\chi^2_{2} > 83.927\right) \approx 0.
\]

Thus, there is sufficient statistical evidence to suggest that $\sigma^2_{\mu_2} > 0$, so we retain the random quadratic effect of Age2 in model (1).

In practice we usually keep lower-order linear effects in our model if we keep higher-order quadratic effects (Morrell et al., 1997). Therefore, we retain the child-specific random linear effect of Age2, $\mu_{1i}$, in model (1).

5. Hypothesis tests for fixed effects (Ch 6.5 & 2.6)

ML-based likelihood ratio test

The likelihood ratio tests we use to test fixed effects in an LMM are based on maximum likelihood (ML) estimation. It is not appropriate to use REML estimation for testing fixed effects in an LMM (Morrell, 1998; Pinheiro & Bates, 2000; Verbeke & Molenberghs, 2000).

Figure 2 showed that the rate of improvement of social development over time is greater for children who have High initial language development than for those who have a lower level of initial language development.

It also showed that the trajectory of mean social development over time is linear for the Low and High levels of initial language development but is possibly quadratic for the Medium level.

There is some doubt whether the interaction effect of Age2Sq × Sicdegp should be included in the model. Let's test for this.
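The RLRT p-value for the random quadratic effect can be checked numerically from the two REML log-likelihoods quoted in Section 4. The sketch below is an illustration rather than the lecture's own code: the helper names are hypothetical, and it relies on the standard closed forms for the chi-square upper tail with 1 and 2 degrees of freedom so that only the Python standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi2_df > x) for df = 1 or 2 (closed forms)."""
    if df == 1:
        # A chi-square with 1 df is the square of a standard normal.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # A chi-square with 2 df is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 are implemented in this sketch")


def rlrt_p_value(loglik_reference, loglik_nested, r):
    """Equal-weight mixture of chi2_{r-1} and chi2_r tail probabilities.

    r is the number of random effects in the reference model.
    """
    stat = 2.0 * (loglik_reference - loglik_nested)
    p = 0.5 * chi2_sf(stat, r - 1) + 0.5 * chi2_sf(stat, r)
    return stat, p


# REML log-likelihoods of model (1) (reference) and model (2) (nested).
stat, p = rlrt_p_value(-2307.638, -2349.601, r=2)
print(f"test statistic = {stat:.3f}, p-value = {p:.3g}")
```

The rounded log-likelihoods give a statistic of about 83.926, slightly different from the quoted 83.927, presumably because the lecture used unrounded values. Either way the p-value is effectively zero.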
If the interaction effect of Age2Sq × Sicdegp were not included in the model then our model would be

\[
\begin{aligned}
\text{Vsae}_{ti} ={}& \beta_0 + \beta_1\,\text{Age2}_{ti} + \beta_2\,\text{Age2Sq}_{ti} + \beta_3\,\text{Sicdegp2}_{i} + \beta_4\,\text{Sicdegp3}_{i} \\
&+ \beta_5\,\text{Age2}_{ti} \times \text{Sicdegp2}_{i} + \beta_6\,\text{Age2}_{ti} \times \text{Sicdegp3}_{i} \\
&+ \mu_{1i}\,\text{Age2}_{ti} + \mu_{2i}\,\text{Age2Sq}_{ti} + \varepsilon_{ti}.
\end{aligned}
\tag{3}
\]

We can choose between the reference model (1) and the nested model (3) by testing the null hypothesis $H_0: \beta_7 = \beta_8 = 0$ vs the alternative hypothesis $H_1: \beta_7 \neq 0$ or $\beta_8 \neq 0$.

We test the above hypotheses using the ML-based Likelihood Ratio Test (MLRT).

In general the MLRT p-value for testing the null hypothesis that the fixed effects are equal to zero is given by

\[
p\text{-value} = P\left(\chi^2_{s} > 2\ln(L_{RM}) - 2\ln(L_{NM})\right).
\]

$s$ is equal to the number of fixed effects in the reference model less the number of fixed effects in the nested model.
$L_{RM}$ is the value of the likelihood function of the Reference model evaluated using ML estimates.
$L_{NM}$ is the value of the likelihood function of the Nested model evaluated using ML estimates.

Testing the interaction effect of Age2Sq × Sicdegp

The ML-based log-likelihood of the reference model (model (1)) is -2305.222.
The ML-based log-likelihood of the nested model (model (3)) is -2306.157.

The MLRT p-value for testing the null hypothesis that $\beta_7$ and $\beta_8$ are equal to zero is

\[
p\text{-value} = P\left(\chi^2_{2} > 1.87\right) = 0.393.
\]

Thus, there is insufficient statistical evidence to suggest that either $\beta_7 \neq 0$ or $\beta_8 \neq 0$. Therefore we drop the interaction effect of Age2Sq × Sicdegp from model (1).

Testing the interaction effect of Age2 × Sicdegp

Figure 2 suggests that there is a quite obvious interaction effect of Age2 × Sicdegp.
Let's test it. If this effect were not included in model (3) then our model would be

\[
\begin{aligned}
\text{Vsae}_{ti} ={}& \beta_0 + \beta_1\,\text{Age2}_{ti} + \beta_2\,\text{Age2Sq}_{ti} + \beta_3\,\text{Sicdegp2}_{i} + \beta_4\,\text{Sicdegp3}_{i} \\
&+ \mu_{1i}\,\text{Age2}_{ti} + \mu_{2i}\,\text{Age2Sq}_{ti} + \varepsilon_{ti}.
\end{aligned}
\tag{4}
\]

We can choose between the reference model (3) and the nested model (4) by testing the null hypothesis $H_0: \beta_5 = \beta_6 = 0$ vs the alternative hypothesis $H_1: \beta_5 \neq 0$ or $\beta_6 \neq 0$.

The ML-based log-likelihood of the reference model (model (3)) is -2306.157.
The ML-based log-likelihood of the nested model (model (4)) is -2317.848.

The MLRT p-value for testing the null hypothesis that $\beta_5$ and $\beta_6$ are equal to zero is

\[
p\text{-value} = P\left(\chi^2_{2} > 23.382\right) \approx 0.
\]

Thus, there is sufficient statistical evidence to suggest that either $\beta_5 \neq 0$ or $\beta_6 \neq 0$. Therefore we retain the interaction effect of Age2 × Sicdegp in model (3).

Testing the quadratic effect of Age2

Figures 1 and 2 suggest that there is some doubt over whether the quadratic effect of Age2 should be included in our current model (model (3)). Let's test for this.

If the quadratic effect of Age2 is not included in model (3) then our model would be

\[
\begin{aligned}
\text{Vsae}_{ti} ={}& \beta_0 + \beta_1\,\text{Age2}_{ti} + \beta_3\,\text{Sicdegp2}_{i} + \beta_4\,\text{Sicdegp3}_{i} \\
&+ \beta_5\,\text{Age2}_{ti} \times \text{Sicdegp2}_{i} + \beta_6\,\text{Age2}_{ti} \times \text{Sicdegp3}_{i} \\
&+ \mu_{1i}\,\text{Age2}_{ti} + \mu_{2i}\,\text{Age2Sq}_{ti} + \varepsilon_{ti}.
\end{aligned}
\tag{5}
\]

We can choose between the reference model (3) and the nested model (5) by testing the null hypothesis $H_0: \beta_2 = 0$ vs the alternative hypothesis $H_1: \beta_2 \neq 0$.

The ML-based log-likelihood of the reference model (model (3)) is -2306.157.
The ML-based log-likelihood of the nested model (model (5)) is -2309.383.

The MLRT p-value for testing the null hypothesis that $\beta_2$ is equal to zero is

\[
p\text{-value} = P\left(\chi^2_{1} > 6.452\right) = 0.011.
\]
Thus, there is sufficient statistical evidence to suggest that $\beta_2 \neq 0$. Therefore we retain the quadratic effect of Age2 in model (3).

This result is somewhat surprising. In Topic 10 we will examine the Autism example using the marginal model approach. The quadratic effect of Age2 in that approach is found to be not statistically significant.

6. Our final model

The model we will use for examining the research questions is model (3),

\[
\begin{aligned}
\text{Vsae}_{ti} ={}& \beta_0 + \beta_1\,\text{Age2}_{ti} + \beta_2\,\text{Age2Sq}_{ti} + \beta_3\,\text{Sicdegp2}_{i} + \beta_4\,\text{Sicdegp3}_{i} \\
&+ \beta_5\,\text{Age2}_{ti} \times \text{Sicdegp2}_{i} + \beta_6\,\text{Age2}_{ti} \times \text{Sicdegp3}_{i} \\
&+ \mu_{1i}\,\text{Age2}_{ti} + \mu_{2i}\,\text{Age2Sq}_{ti} + \varepsilon_{ti}.
\end{aligned}
\]

7. Marginal and conditional values of Vsae

Modeling the marginal value of Vsae

The expected value of $\text{Vsae}_{ti}$ in model (3) is referred to as the marginal value of $\text{Vsae}_{ti}$, and is given by

\[
\begin{aligned}
E(\text{Vsae}_{ti}) ={}& \beta_0 + \beta_1\,\text{Age2}_{ti} + \beta_2\,\text{Age2Sq}_{ti} + \beta_3\,\text{Sicdegp2}_{i} + \beta_4\,\text{Sicdegp3}_{i} \\
&+ \beta_5\,\text{Age2}_{ti} \times \text{Sicdegp2}_{i} + \beta_6\,\text{Age2}_{ti} \times \text{Sicdegp3}_{i}.
\end{aligned}
\]

The marginal value of $\text{Vsae}_{ti}$, at a given age, is the same for all children who belong to the same initial language development level.

The marginal value of $\text{Vsae}_{ti}$, at a given age, for children who have a Low initial language development level is

\[
E(\text{Vsae}_{ti} \mid \text{Sicdegp} = \text{Low}) = \beta_0 + \beta_1\,\text{Age2}_{ti} + \beta_2\,\text{Age2Sq}_{ti}.
\]

Sicdegp = Low means that $\text{Sicdegp2}_{i} = 0$ and $\text{Sicdegp3}_{i} = 0$.
$\beta_1$ is the linear effect of Age2 on Vsae for children who have a Low level of initial language development.
What is the interpretation of $\beta_0$?

Express the marginal value of $\text{Vsae}_{ti}$, at a given age, for each of the Medium and High levels of initial language development.
What is the linear effect of Age2 for children who have a High initial language development level?
What is the quadratic effect of Age2 for children who have a High initial language development level? Is this effect common to all initial language development levels?
What is the mean of Vsae for children who are 2 years old and have a Medium level of initial language development?

Find the expression for

\[
E(\text{Vsae}_{ti} \mid \text{Sicdegp} = \text{Medium}) - E(\text{Vsae}_{ti} \mid \text{Sicdegp} = \text{Low})
\]

and the expression for

\[
E(\text{Vsae}_{ti} \mid \text{Sicdegp} = \text{High}) - E(\text{Vsae}_{ti} \mid \text{Sicdegp} = \text{Low}).
\]

The interpretations of $\beta_3$, $\beta_4$, $\beta_5$ and $\beta_6$ should be clear now.

Modeling the conditional value of Vsae

The expected value of $\text{Vsae}_{ti}$ in model (3), at a given age, conditional on the child-specific random linear and quadratic effects of Age2, is referred to as the conditional value of $\text{Vsae}_{ti}$, and is given by

\[
\begin{aligned}
E(\text{Vsae}_{ti} \mid \mu_{1i}, \mu_{2i}) ={}& \beta_0 + \beta_1\,\text{Age2}_{ti} + \beta_2\,\text{Age2Sq}_{ti} + \beta_3\,\text{Sicdegp2}_{i} + \beta_4\,\text{Sicdegp3}_{i} \\
&+ \beta_5\,\text{Age2}_{ti} \times \text{Sicdegp2}_{i} + \beta_6\,\text{Age2}_{ti} \times \text{Sicdegp3}_{i} \\
&+ \mu_{1i}\,\text{Age2}_{ti} + \mu_{2i}\,\text{Age2Sq}_{ti}.
\end{aligned}
\]

Each child has their own conditional value of $\text{Vsae}_{ti}$.

8. Predictions and residuals (Ch 6.8 & 2.8)

The predicted marginal value of Vsae and marginal residuals

Let $\hat\beta_0$, $\hat\beta_1$, $\hat\beta_2$, $\hat\beta_3$, $\hat\beta_4$, $\hat\beta_5$ and $\hat\beta_6$ denote the GLS estimates of the fixed effects $\beta_0$, $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$, $\beta_5$ and $\beta_6$, respectively.

The predicted marginal value of $\text{Vsae}_{ti}$ is given by

\[
\begin{aligned}
\widehat{E}(\text{Vsae}_{ti}) ={}& \hat\beta_0 + \hat\beta_1\,\text{Age2}_{ti} + \hat\beta_2\,\text{Age2Sq}_{ti} + \hat\beta_3\,\text{Sicdegp2}_{i} + \hat\beta_4\,\text{Sicdegp3}_{i} \\
&+ \hat\beta_5\,\text{Age2}_{ti} \times \text{Sicdegp2}_{i} + \hat\beta_6\,\text{Age2}_{ti} \times \text{Sicdegp3}_{i}.
\end{aligned}
\]

A marginal residual for model (3) is the difference between the observed value of Vsae and the predicted marginal value of Vsae and is given by

\[
\hat\varepsilon^{*}_{ti} = \text{Vsae}_{ti} - \widehat{E}(\text{Vsae}_{ti}).
\]

Usually, in practice, marginal residuals are analysed in marginal models (Topic 10) rather than in models that include random effects.
The predicted conditional value of Vsae and conditional residuals

Let $\hat\mu_{1i}$ and $\hat\mu_{2i}$ denote the empirical best linear unbiased predictions (EBLUPs) of the random effects $\mu_{1i}$ and $\mu_{2i}$, respectively, for child $i$.

The predicted conditional value of $\text{Vsae}_{ti}$ is given by

\[
\begin{aligned}
\widehat{E}(\text{Vsae}_{ti} \mid \mu_{1i}, \mu_{2i}) ={}& \hat\beta_0 + \hat\beta_1\,\text{Age2}_{ti} + \hat\beta_2\,\text{Age2Sq}_{ti} + \hat\beta_3\,\text{Sicdegp2}_{i} + \hat\beta_4\,\text{Sicdegp3}_{i} \\
&+ \hat\beta_5\,\text{Age2}_{ti} \times \text{Sicdegp2}_{i} + \hat\beta_6\,\text{Age2}_{ti} \times \text{Sicdegp3}_{i} \\
&+ \hat\mu_{1i}\,\text{Age2}_{ti} + \hat\mu_{2i}\,\text{Age2Sq}_{ti}.
\end{aligned}
\]

A conditional residual for model (3) is the difference between the observed value of Vsae and the predicted conditional value of Vsae and is given by

\[
\hat\varepsilon_{ti} = \text{Vsae}_{ti} - \widehat{E}(\text{Vsae}_{ti} \mid \mu_{1i}, \mu_{2i}).
\]

9. Diagnostics of our final model (Ch 6.8, 6.9 & 2.8)

Diagnostics of our final model

Before we interpret the parameter estimates of model (3), we should first check (i) the agreement between the predicted values of Vsae and the observed values of Vsae and (ii) the assumptions of our LMM.

We begin by checking the agreement between the predicted marginal values of Vsae (that come from fitting model (3) to the data) and the observed mean values of Vsae.

Checking predicted marginal values of Vsae

Figure 4 plots the predicted marginal values and observed mean values of Vsae as a function of Age for each level of Sicdegp.

Figure 4: Predicted marginal and observed mean values of Vsae (x-axis: Age; y-axis: Vsae; legend: Sicdegp.f = 1, 2, 3)

The smooth lines correspond to the predicted marginal values of Vsae and the squares correspond to the observed mean values of Vsae. The marginal portion of our model seems to fit the data quite well.
Checking predicted conditional values of Vsae

Figure 5 plots the predicted conditional and observed values of Vsae as a function of Age for the first five children in the High initial language development group.

Figure 5: Predicted conditional and observed values of Vsae (x-axis: Age; y-axis: Vsae; legend: Childid.f = 1, 3, 4, 19, 21)

The smooth lines correspond to the predicted conditional values of Vsae and the circles correspond to the observed values of Vsae. Again, the conditional portion of our model seems to fit the data quite well.

Checking agreement between the predicted conditional and observed values of Vsae

Figure 6 plots the predicted conditional values of Vsae vs the observed values of Vsae.

Figure 6: Predicted conditional Vsae vs Observed Vsae (x-axis: Cond_Vsae_hat; y-axis: Vsae)

Overall, the conditional portion of our model fits the data fairly well. There seem to be some outliers.

Figure 7 plots Cook's distance for each child's observations. This plot identifies which observations are highly influential when regressing the observed values of Vsae on the predicted conditional values of Vsae.

Figure 7: Cook's distance for each child's observations (x-axis: Obs. number; y-axis: Cook's distance; fitted model: lm(Vsae ~ Cond_Vsae_hat); labelled observations: 47, 134, 537)

Observations 47, 134 and 537 are influential. These correspond to observations taken at age 9 for children 49, 180 and 124, respectively.

Examining assumptions associated with the random effects

Table 4 below shows the EBLUPs of the random effects $\mu_{1i}$ and $\mu_{2i}$ for four children.
Table 4: EBLUPs of the random effects $\mu_{1i}$ and $\mu_{2i}$

Childid  $\hat\mu_{1i}$  $\hat\mu_{2i}$
1        -4.355          -0.151
2        -2.651          -0.014
3        -5.382          -0.093
4        2.309           0.608

Figure 8 displays a hexagonal 2-D plot of the paired EBLUPs $(\hat\mu_{1i}, \hat\mu_{2i})$. It provides an indication of the bivariate distribution of $[\mu_{1i}, \mu_{2i}]^{T}$.

Figure 8: Hexagonal 2-D plot of $(\hat\mu_{1i}, \hat\mu_{2i})$ (x-axis: mu1; y-axis: mu2; colour scale: value)

There seems to be an outlier at the bottom right-hand corner of the plot. This paired value belongs to child 124.

Figure 9 displays the histogram of the EBLUPs of $\mu_{1i}$. It provides an indication of the distribution of $\mu_{1i}$.

Figure 9: Histogram of $\hat\mu_{1i}$ (x-axis: mu1; y-axis: Count)

There seems to be some skewness to the right, but overall the distribution is fairly normal.

Figure 10 displays the histogram of the EBLUPs of $\mu_{2i}$. It provides an indication of the distribution of $\mu_{2i}$.

Figure 10: Histogram of $\hat\mu_{2i}$ (x-axis: mu2; y-axis: Count)

The distribution seems fairly normal.

Examining assumptions associated with the random errors

Figure 11 displays a scatter plot of the conditional residuals vs Age. This plot provides an indication of whether $\operatorname{Var}(\varepsilon_{ti})$ is constant over Age.

Figure 11: Scatter plot of conditional residuals vs Age (x-axis: Age; y-axis: Conditional Residuals)

Other than some outliers at ages 3, 5 and 9, there seems to be no severe violation of constant variance.
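As a worked illustration of the conditional predictions and conditional residuals examined in these diagnostics, the sketch below evaluates them for child 1. It assumes the fixed-effect estimates reported in Table 6, the EBLUPs for child 1 in Table 4, and the observed scores for child 1 in Table 1; the function names are hypothetical, and because all inputs are rounded the results are approximate.

```python
# Fixed-effect estimates for model (3), as reported in Table 6.
BETA = {"b0": 8.476, "b1": 2.081, "b2": 0.109, "b3": 1.365,
        "b4": 4.988, "b5": 0.573, "b6": 4.068}


def marginal_prediction(age, sicdegp, beta=BETA):
    """Predicted marginal value of Vsae at a given age and Sicdegp level."""
    a = age - 2                      # Age2
    a2 = a * a                       # Age2Sq
    s2 = 1 if sicdegp == 2 else 0    # Medium indicator
    s3 = 1 if sicdegp == 3 else 0    # High indicator
    return (beta["b0"] + beta["b1"] * a + beta["b2"] * a2
            + beta["b3"] * s2 + beta["b4"] * s3
            + beta["b5"] * a * s2 + beta["b6"] * a * s3)


def conditional_prediction(age, sicdegp, mu1_hat, mu2_hat, beta=BETA):
    """Marginal prediction plus the child's EBLUP contributions."""
    a = age - 2
    return marginal_prediction(age, sicdegp, beta) + mu1_hat * a + mu2_hat * a * a


# Child 1: Sicdegp = 3 (Table 1), EBLUPs -4.355 and -0.151 (Table 4).
observed = {2: 6, 3: 7, 5: 18, 9: 25, 13: 27}
for age, y in observed.items():
    fitted = conditional_prediction(age, 3, -4.355, -0.151)
    print(f"Age {age:>2}: observed={y:>2}  fitted={fitted:7.3f}  residual={y - fitted:7.3f}")
```

At age 13 the fitted conditional value is about 28.1 against an observed score of 27, while the marginal prediction for the High group at that age is about 94.3, which shows how far an individual trajectory can sit from the group mean.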
Figure 12 displays a histogram of the conditional residuals at each level of Age. This plot provides an indication of whether $\varepsilon_{ti}$ is normally distributed at each level of Age.

Figure 12: Histogram of conditional residuals at each Age level (panels: Age 2, 3, 5, 9, 13; x-axis: Conditional Residuals; y-axis: Count)

The distribution at each Age level seems fairly normal and is centred at zero (except at age 2).

Remarks on the diagnostics of our final model

The marginal and conditional parts of the model seem to fit the data quite well (Figures 4 to 6); however, there are outliers associated with children 49, 180 and 124 (Figure 7).

There seems to be an obvious outlier associated with child 124 when examining the bivariate distribution of the random effects (Figure 8).

There seems to be no severe violation of the constant variance and normality assumptions for $\varepsilon_{ti}$ as a function of Age (Figures 11 and 12).

We refitted the model without the observations associated with children 49, 180 and 124 and found that the results are very similar to the primary results (which include all 158 children) that are discussed in the next section.

10. Interpreting the parameter estimates of our model (Ch 6.7)

Estimate of the variance-covariance matrix D

We use the observations belonging to all 158 children to fit model (3).

The estimate of the variance-covariance matrix $D$ of model (3) is

\[
\hat{D} = \begin{bmatrix} 14.524 & -0.415 \\ -0.415 & 0.127 \end{bmatrix}.
\]

The estimate of $\theta_D$ is $\hat\theta_D = [14.524, 0.127, -0.415]^{T}$.
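The standard deviations of the random effects and their correlation follow directly from the entries of $\hat{D}$. A minimal sketch, using only the rounded values printed above (so the last digit may differ from software output):

```python
import math

# Estimated covariance matrix of the random effects (mu_1i, mu_2i) in model (3).
D_hat = [[14.524, -0.415],
         [-0.415, 0.127]]

sd_mu1 = math.sqrt(D_hat[0][0])          # standard deviation of the random linear effect
sd_mu2 = math.sqrt(D_hat[1][1])          # standard deviation of the random quadratic effect
corr = D_hat[0][1] / (sd_mu1 * sd_mu2)   # correlation = covariance / (sd1 * sd2)

print(f"sd(mu1) = {sd_mu1:.3f}, sd(mu2) = {sd_mu2:.3f}, corr = {corr:.3f}")
```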
Estimate of the correlation matrix of µi

The estimate of the correlation matrix of the random effect vector, µi, of model (3) is

\[
\widehat{\mathrm{Corr}}(\mu_i) = \begin{pmatrix} 1.000 & -0.306 \\ -0.306 & 1.000 \end{pmatrix}.
\]

The estimate of the correlation between µ1i and µ2i is -0.306. This suggests that children with a larger random linear effect of Age2 tend to have a smaller random quadratic effect of Age2.

Confidence intervals for σµ1, σµ2 and Corr(µ1i, µ2i)

Table 5 provides the estimates and approximate 95% confidence intervals for the standard deviations of µ1i and µ2i, and the correlation between µ1i and µ2i.

Table 5: Estimates and 95% C.I.'s for σµ1, σµ2 and Corr(µ1i, µ2i)

Parameter          Lower     Est.     Upper
σµ1                3.195     3.811    4.545
σµ2                0.289     0.356    0.438
Corr(µ1i, µ2i)    -0.514    -0.306   -0.065

The approximate 95% confidence interval for the correlation between the random effects does not contain 0. This result, together with the fact that the variances of the random effects are not similar (see the estimate D̂ of the variance-covariance matrix D), suggests that the unstructured covariance matrix for the random effect vector was an appropriate choice for this data.

Estimate of the variance-covariance matrix R

The estimate of the variance-covariance matrix, R, of model (3) is

\[
\hat{R} = \begin{pmatrix}
38.790 & 0.000 & 0.000 & 0.000 & 0.000 \\
0.000 & 38.790 & 0.000 & 0.000 & 0.000 \\
0.000 & 0.000 & 38.790 & 0.000 & 0.000 \\
0.000 & 0.000 & 0.000 & 38.790 & 0.000 \\
0.000 & 0.000 & 0.000 & 0.000 & 38.790
\end{pmatrix}.
\]

The estimate of σ² is σ̂² = 38.790.
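The estimates of D and σ² together determine the marginal covariance of a child's observations through the usual linear mixed model identity Var(Yi) = Z D Z^T + R. The Python sketch below works through that arithmetic for a child measured at all five ages. It rests on assumptions that are not stated in this part of the lecture: that Age2 means Age − 2, and that the random-effects design matrix Z has the two columns Age2 and Age2² with no random intercept (inferred from µ1i and µ2i being described as the random linear and quadratic effects of Age2). The function name marginal_cov is hypothetical.

```python
# Rounded estimates quoted in the lecture
D = [[14.524, -0.415],
     [-0.415, 0.127]]
sigma2 = 38.790

# ASSUMPTION: Age2 = Age - 2, and Z has columns Age2 and Age2^2 (no random intercept)
ages = [2, 3, 5, 9, 13]
Z = [[age - 2, (age - 2) ** 2] for age in ages]


def marginal_cov(Z, D, sigma2):
    """Return Z D Z^T + sigma2 * I as a list of lists."""
    n = len(Z)
    q = len(D)
    V = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            # (Z D Z^T)[j][k] = sum over p, r of Z[j][p] * D[p][r] * Z[k][r]
            zdz = sum(Z[j][p] * D[p][r] * Z[k][r]
                      for p in range(q) for r in range(q))
            # Independent errors add sigma2 on the diagonal only
            V[j][k] = zdz + (sigma2 if j == k else 0.0)
    return V


V = marginal_cov(Z, D, sigma2)
for row in V:
    print("  ".join(f"{value:9.3f}" for value in row))
```

Because the inputs are rounded to three decimals, the result agrees with the lecture's reported estimate of Var(Yi) only approximately. For example, the variance at age 3 comes out as 52.611 against the reported 52.610, and the discrepancy grows at later ages where Age2² is large.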
Estimate of the variance-covariance matrix of Yi

The estimate of the variance-covariance matrix of Yi in model (3) is

\[
\widehat{\mathrm{Var}}(Y_i) = \begin{pmatrix}
38.790 & 0.000 & 0.000 & 0.000 & 0.000 \\
0.000 & 52.610 & 39.728 & 84.617 & 120.268 \\
0.000 & 39.728 & 157.333 & 273.606 & 425.247 \\
0.000 & 84.617 & 273.606 & 769.400 & 1292.980 \\
0.000 & 120.268 & 425.247 & 1292.980 & 2543.202
\end{pmatrix}.
\]

The estimate of Var(Yi) shows that the variability of Vsae scores for a child increases with age. We can see this in Figures 1 to 3.

Estimates of the fixed effects

The estimates of the fixed effects in model (3), together with their corresponding standard errors, degrees of freedom, observed test statistics and p-values, are presented in Table 6.

Table 6: Estimates of fixed effects in model (3)

Fixed effect   Value   Std.Error   DF    t-value   p-value
β0             8.476   0.709       448   11.947    <0.001
β1             2.081   0.648       448    3.210     0.001
β2             0.109   0.043       448    2.548     0.011
β3             1.365   0.922       155    1.481     0.141
β4             4.988   1.038       155    4.805    <0.001
β5             0.573   0.796       448    0.719     0.472
β6             4.068   0.880       448    4.624    <0.001

Confidence intervals for the fixed effects

The approximate 95% confidence intervals for the fixed effects in model (3) are presented in Table 7.

Table 7: 95% confidence intervals for fixed effects in model (3)

Fixed effect   Lower    Upper
β0              7.082   9.870
β1              0.807   3.355
β2              0.025   0.193
β3             -0.456   3.185
β4              2.937   7.038
β5             -0.992   2.137
β6              2.339   5.797

Interpreting the fixed effect estimates

The estimate of β0 is 8.48. We estimate that the mean Vsae score for children who are 2 years old and have a Low level of initial language development is 8.48, which is significantly different from 0 (95% CI = [7.08, 9.87], p-value < 0.001).
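The link between Tables 6 and 7 can be checked with a little arithmetic. Assuming the intervals are of the usual form estimate ± (t-quantile × standard error), which the lecture does not state explicitly here, each interval should be centred on its estimate, and its half-width divided by the standard error should recover the quantile that was used. The Python sketch below performs that check on the rounded table values; the dictionary and function names are hypothetical.

```python
# (estimate, std. error, lower, upper), copied from Tables 6 and 7
fixed_effects = {
    "beta0": (8.476, 0.709, 7.082, 9.870),
    "beta1": (2.081, 0.648, 0.807, 3.355),
    "beta2": (0.109, 0.043, 0.025, 0.193),
    "beta3": (1.365, 0.922, -0.456, 3.185),
    "beta4": (4.988, 1.038, 2.937, 7.038),
    "beta5": (0.573, 0.796, -0.992, 2.137),
    "beta6": (4.068, 0.880, 2.339, 5.797),
}


def implied_multiplier(se, lower, upper):
    """Half-width of the interval expressed in standard-error units."""
    return (upper - lower) / (2 * se)


midpoints = {}
multipliers = {}
for name, (est, se, lower, upper) in fixed_effects.items():
    midpoints[name] = (lower + upper) / 2
    multipliers[name] = implied_multiplier(se, lower, upper)
    print(f"{name}: estimate={est:.3f}  midpoint={midpoints[name]:.3f}  "
          f"multiplier={multipliers[name]:.3f}")
```

Each interval is centred on its estimate, and the multipliers all fall close to 1.97, which is in line with a 0.975 t-quantile on 448 or 155 degrees of freedom. The small spread (about 1.95 for β2, for example) comes from the three-decimal rounding in the tables.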
The estimates of β1 and β2 are 2.08 and 0.11, respectively. Both are positive and significantly different from 0 (β1: 95% CI = [0.81, 3.35], p-value = 0.001; β2: 95% CI = [0.02, 0.19], p-value = 0.011), so we estimate that the mean Vsae score for children who have a Low level of initial language development accelerates as a function of Age.

The estimate of β3 is 1.36. We estimate that the mean Vsae score for 2 year old children who have a Medium level of initial language development is 1.36 more than that of 2 year old children who have a Low level of initial language development; however, this difference is not statistically significant (95% CI = [−0.46, 3.19], p-value = 0.141).

The estimate of β4 is 4.99. We estimate that the mean Vsae score for 2 year old children who have a High level of initial language development is 4.99 more than that of 2 year old children who have a Low level of initial language development, and this difference is significant (95% CI = [2.94, 7.04], p-value < 0.001).

The estimate of β5 is 0.57. We estimate that the linear effect of Age2 on Vsae for children who have a Medium level of initial language development is 0.57 greater than for children who have a Low level of initial language development; however, this difference is not statistically significant (95% CI = [−0.99, 2.14], p-value = 0.472). Therefore there is insufficient evidence to say that there is a difference in the trajectory of mean Vsae scores over time between Medium and Low levels of initial language development.

The estimate of β6 is 4.07. We estimate that the linear effect of Age2 on Vsae for children who have a High level of initial language development is 4.07 greater than for children who have a Low level of initial language development, and this difference is significant (95% CI = [2.34, 5.80], p-value < 0.001).
Therefore there is evidence that the mean Vsae score for children who have a High level of initial language development increases faster, as a function of Age, than for children with a Low level.

11. Summary of results

1 How does social development change over time for autistic children?

On average, the social development for autistic children improves linearly over time.

2 Does the level of language development at age 2 (initial language development) influence the social development trajectory over time for autistic children?

At two years of age, the mean social development of children who have a High level of initial language development is greater than that of children identified at a Low level. Also, the mean social development for children who have a High level of initial language development improves faster over time than that of children rated at a Low level.

3 Does the trajectory of social development over time differ between children?

The social development of some children shows no improvement over time, while the social development of other children improves linearly over time. Furthermore, there are children whose social development accelerates over time. These varying social development trajectories over time are found at all levels of initial language development.

12. References

Anderson, D., Oti, R., Lord, C., & Welch, K. (2009). Patterns of growth in adaptive social abilities among children with autism spectrum disorders. Journal of Abnormal Child Psychology, 37(7): 1019–1034.

Morrell, C. (1998). Likelihood ratio testing of variance components in the linear mixed-effects model using restricted maximum likelihood. Biometrics, 54(4): 1560–1568.

Morrell, C., Pearson, J., & Brant, L. (1997). Linear transformations of linear mixed-effects models. The American Statistician, 51(4): 338–343.
Pinheiro, J., & Bates, D. (2000). Mixed-Effects Models in S and S-PLUS. Berlin: Springer-Verlag.

Verbeke, G., & Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. New York, NY: Springer-Verlag.

West, B., Welch, K., & Galecki, A. (2015). Linear mixed models: A practical guide using statistical software, 2nd ed. Boca Raton, FL: CRC Press.
