Few psychologists would deny that psychological measurement is messy and that observed scores are biased by systematic and random measurement error. It is therefore surprising that few psychologists actually try to do something about this by using multi-method measurement models to control for measurement error. Multi-method measurement models are usually published by psychometricians who are interested only in measurement, not in substantive theories. As a result, most published work relies on messy measures with unknown validity. This is also true for research on well-being. Given the lack of emphasis on measurement, it is also very difficult to find competent reviewers for work that uses measurement models, because most well-being researchers are not trained in measurement.
To avoid the pain of peer review, I am presenting the results of a complex model of well-being as a blog post. This model addresses many unanswered questions in the well-being literature. This may sound like a good thing, but it is a problem. Psychology articles are designed to answer one question at a time, preferably without any concern about effect sizes. Does money buy happiness? Yes or no? Does neuroticism predict well-being? Of course it does. This piecemeal work often culminates in a meta-analysis that shows the average correlation across many studies. Although this information quantifies effects, nobody knows what to do with it. Psychologists are not used to thinking in terms of complex models with multiple variables that can all influence each other. In fact, they despise these models because they make causal assumptions with correlational data. What would happen if we allowed that? After all, we pride ourselves on the use of experiments, which makes us superior to the social sciences that use correlations. The problem is that nobody has come up with a good experiment to study what makes lives good that could be administered on MTurk for 20 cents per participant.
Ok, rant over.
1. A measurement model of well-being.
The most widely used indicators of well-being are life-satisfaction judgments. They have high face validity because they directly ask respondents to evaluate their lives with a subjectively chosen ideal. However, face validity does not imply that a measure is valid or that all of the variance in it is valid variance. The most widely used evidence to claim that self-ratings of life-satisfaction are valid indicators of well-being is that they show convergent validity with informant ratings. Although everybody might be lying, it is more likely that agreement between raters reflects some valid information about individuals’ lives.
Lucas, Suh, and Diener (1996) demonstrated convergent validity of self-ratings with averaged informant ratings.
These data cannot be used to create a measurement model because the self-ratings at time 1 and time 2 are likely to share some method variance. This would explain why they are more highly correlated with each other than with averaged informant ratings that reduce random and systematic measurement error by averaging across raters.
The first measurement model of well-being was published in 2013 (Zou, Schimmack, & Diener, 2013). Rather than averaging informants, each rater was treated as an independent method. The data came from the Mississauga Family Study. In this study, students who lived with their biological parents participated in a round-robin family study. Thus, there were three targets (students, mothers, fathers) and three informants (students, mothers, fathers). These data can be analyzed with a planned-missing-data model with four measures (self, student informant, mother informant, father informant), where data are missing whenever the informant is the target (e.g., informant ratings of students by students).
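The planned-missing structure can be made concrete with a small sketch. The measure names (`self`, `student_informant`, and so on) are labels chosen here for illustration, not variable names from the original data set; the only rule encoded is that a family member cannot serve as an informant about themselves.

```python
RATERS = ["student", "mother", "father"]


def observed_measures(target):
    """Return which of the four measures exist for one target.

    Every target provides a self-rating. An informant rating exists
    only when the informant is a different family member.
    """
    measures = {"self": True}
    for rater in RATERS:
        measures[f"{rater}_informant"] = rater != target
    return measures


for target in RATERS:
    row = observed_measures(target)
    missing = [name for name, seen in row.items() if not seen]
    print(f"target={target}: missing by design -> {missing}")
```

Each target ends up with three observed measures and one measure that is missing by design, which is why all four measures can still be modeled together.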
|Measure|Self-Rating|Student Informant|Mother Informant|
|---|---|---|---|
|Life-Satisfaction Student Informant|0.38| | |
|Life-Satisfaction Mother Informant|0.31|0.35| |
|Life-Satisfaction Father Informant|0.37|0.38|0.47|
The correlations in Table 2 show convergent validity for all four methods. Moreover, there is no evidence that self-ratings are more valid than informant ratings, which would have produced stronger self-informant correlations than informant-informant correlations. All methods seem to be equally valid. The main deviation is the correlation between mothers and fathers as informants. This could reflect some shared method variance in parents’ ratings of students’ well-being. However, a single-factor model fits these data reasonably well, CFI = .983, RMSEA = .048.
Nevertheless, adding a method factor (i.e., a correlation between the error variances) for mothers and fathers as informants improved model fit, CFI = 1.000, RMSEA = .000.
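To see why the extra parameter helps, here is a rough sketch that fits a single-factor model to the six correlations in the table by unweighted least squares, once with all pairs constrained and once with the mother-father pair set free. This is an illustration under simplifying assumptions: it uses the published correlations rather than the raw data, ignores the planned-missing structure, and does not reproduce the maximum-likelihood estimates or fit indices reported above. The function names are made up for this example.

```python
from itertools import combinations

RATERS = ["self", "student", "mother", "father"]
R = {
    ("self", "student"): 0.38,
    ("self", "mother"): 0.31,
    ("self", "father"): 0.37,
    ("student", "mother"): 0.35,
    ("student", "father"): 0.38,
    ("mother", "father"): 0.47,
}


def corr(i, j):
    """Look up a correlation regardless of the order of the pair."""
    return R[(i, j)] if (i, j) in R else R[(j, i)]


def fit_one_factor(free_pairs=(), sweeps=500):
    """Least-squares loadings for a single-factor model.

    Pairs in free_pairs get their own residual correlation, so the
    factor does not have to reproduce them and they drop out of the loss.
    """
    free = {frozenset(p) for p in free_pairs}
    loadings = {r: 0.6 for r in RATERS}
    for _ in range(sweeps):
        for i in RATERS:
            others = [j for j in RATERS
                      if j != i and frozenset((i, j)) not in free]
            # Exact least-squares update for one loading, others held fixed
            num = sum(corr(i, j) * loadings[j] for j in others)
            den = sum(loadings[j] ** 2 for j in others)
            loadings[i] = num / den
    residuals = {(i, j): corr(i, j) - loadings[i] * loadings[j]
                 for i, j in combinations(RATERS, 2)}
    return loadings, residuals


def ssr(residuals, skip=()):
    """Sum of squared residuals, ignoring any freed pairs."""
    skipped = {frozenset(p) for p in skip}
    return sum(v ** 2 for pair, v in residuals.items()
               if frozenset(pair) not in skipped)


parents = [("mother", "father")]
full_load, full_res = fit_one_factor()
free_load, free_res = fit_one_factor(free_pairs=parents)

print("misfit, all pairs constrained:", round(ssr(full_res), 5))
print("misfit, parents freed:", round(ssr(free_res, skip=parents), 5))
print("self loading, parents freed:", round(free_load["self"], 3))
print("mother-father residual:", round(free_res[("mother", "father")], 3))
```

In this sketch, the part of the mother-father correlation that the common factor does not reproduce is what the correlated error term absorbs.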
The path coefficient from the factor (the unobserved, latent variable) to the observed self-rating scores is .597, which implies that only about one-third of the variance in self-ratings (.597² ≈ .36) is valid variance that is also reflected in informant ratings. As these scale scores are based on the 3-item Satisfaction with Life scale (Diener et al., 1985), with a reliability of about 85%, this means that a large portion of the variance in self-ratings is systematic measurement error. This hasn’t stopped well-being researchers from treating single-item life-satisfaction ratings as perfectly valid measures that do not require a measurement model (e.g., World Happiness Reports).
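The arithmetic behind this decomposition is short enough to spell out. The sketch below uses only the two numbers given in the text (the loading of .597 and the approximate reliability of .85) and follows the interpretation above that reliable variance not shared with informants counts as systematic measurement error.

```python
loading = 0.597      # standardized loading of self-ratings on the factor
reliability = 0.85   # approximate reliability of the scale scores

valid = loading ** 2                    # variance shared with informants
random_error = 1 - reliability          # unreliable variance
systematic_error = reliability - valid  # reliable, but not shared

print(f"valid variance:    {valid:.3f}")
print(f"systematic error:  {systematic_error:.3f}")
print(f"random error:      {random_error:.3f}")
```

Under these assumptions, the systematic component is larger than the valid component, which is the point of the paragraph above.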
In conclusion, the first part of the model answers three questions. First, do life-satisfaction judgments have some validity? The answer is that they very very likely have some validity. Second, is all of the reliable variance in life-satisfaction judgments valid? The answer is that it is very very unlikely that this is the case. The third question is how much of the variance in life-satisfaction judgments is valid variance? Of course, there is no precise answer to this question, but a reasonable answer is that it is more than a quarter and less than fifty percent.
2. Affect and Well-Being
The second question is what predicts variation in life-satisfaction judgments. One theory of well-being is that the only relevant information is how much pleasure versus displeasure we experience in our lives. This theory is known as hedonism and usually attributed to Bentham who famously proposed that we are all slaves to pleasure and pain. In philosophy the question was whether it can be objectively justified to define well-being in this way. The modern consensus is that well-being cannot be reduced to feelings because individuals might want to do other things with their lives. Thus, a subjective theory of well-being at least allows individuals in principle to have high well-being with a lot of pain and little pleasure. However, it seems likely that people care about their feelings and that they influence how they evaluate their lives. This brings us to two empirical questions. First, how much of the variance in well-being is explained by feelings? Second, what else influences well-being independent of feelings?
Many studies have examined the first question using just self-ratings. However, this creates problems. Feelings are also measured with self-ratings, which creates shared method variance, and there could also be systematic method variance that is unique to judgments of feelings or to life-satisfaction judgments. To overcome these problems, it is possible to create measurement models for feelings and life-satisfaction judgments and to examine their relationship at the level of the unobserved variables, which removes measurement error from the observed scores.
To go slowly, I first show the results for a model in which memory-based ratings of happy feelings are used to predict life-satisfaction judgments. Model fit was acceptable, CFI = .989, RMSEA = .036.
The autogenerated model diagram in Mplus looks pretty crappy, but it does show all of the paths, including the correlated error variances for ratings by the same rater as well as for father and mother informant ratings. The key finding is that the standardized effect of happiness on life-satisfaction judgments is .718. In a model with a single predictor, we can square this parameter and see that happiness alone accounts for 51% of the variance in well-being.
We can now do the same for negative affect. Very similar results are obtained with global ratings of unpleasantness or ratings of sadness (.69 vs. .71). Thus, sadness also explains about 50% of the variance in well-being.
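As a quick check on these single-predictor figures, the sketch below squares the standardized coefficients reported in the text. The coefficients are entered as magnitudes, as in the text; the negative sign of the sadness effect does not matter once the value is squared.

```python
coefficients = {
    "happiness": 0.718,
    "unpleasantness": 0.69,
    "sadness": 0.71,
}

# With one standardized predictor, explained variance is the squared coefficient
explained = {name: beta ** 2 for name, beta in coefficients.items()}

for name, r2 in explained.items():
    print(f"{name}: {r2:.3f}")

naive_sum = explained["happiness"] + explained["sadness"]
print(f"happiness + sadness if simply added: {naive_sum:.3f}")
```

Adding the two shares gives slightly more than 100%, which is only possible because the two predictors overlap; the shares cannot simply be summed.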
If happiness and sadness were independent, these results would confirm the hedonist theory of well-being, but happiness and sadness are not independent. It is therefore necessary to include happiness and sadness simultaneously as predictors to see how much of the variance in well-being is explained by affect alone and how much variance is left to be explained by something else. This is of course just multiple regression with the notable difference that regression is conducted at the level of unobserved variables that control for measurement error.
The autogenerated model looks fine because it doesn’t include the correlated errors. The actual model included these errors. The fit of the model was acceptable, CFI = .989, RMSEA = .029. The correlation between happiness and sadness was r = -.59, which is similar to results in the only two multi-method studies of this relationship (Diener, Smith, & Fujita, 1995; Zou et al., 2013). The residual (unexplained) variance in well-being was 37%. Thus, affect explains roughly two-thirds of the valid variance in well-being, but one-third is not explained by affect.
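The combined figure can be roughly reconstructed from the numbers already reported, using the standard formula for the explained variance of two correlated, standardized predictors. This is only an approximation under stated assumptions: the single-predictor coefficients come from separate models, the sadness coefficient is taken to be .71 with a negative sign, and the function name is made up for this sketch.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 when y is regressed on two standardized, correlated predictors."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


r_happy = 0.718       # happiness with life-satisfaction (from the text)
r_sad = -0.71         # sadness with life-satisfaction (sign assumed negative)
r_happy_sad = -0.59   # happiness with sadness (from the text)

r2 = r_squared_two_predictors(r_happy, r_sad, r_happy_sad)
print(f"explained by affect: {r2:.2f}")
print(f"left unexplained:    {1 - r2:.2f}")
```

The result lands close to the reported residual variance of 37%, and well below what the two single-predictor shares would add up to.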
Thus, the first conclusion that we can draw from these results is that well-being cannot be reduced to the balance of pleasure and displeasure. Apparently, we are not slaves to our passions and have ways to embrace our lives even when we are not experiencing pleasure or dislike our lives even if we do. Most important, the remaining unexplained variance is not just measurement error because the model controlled for it.
This brings us to the second question: what else predicts well-being independent of our feelings? This question will be examined in Part II of this series, called “How to build a monster model of well-being.”
Continued here. Part 2