The “time trade-off” (TTO) is the most widely used method to “quality adjust” life years for “QALYs” in cost utility analysis. In this paper we ask whether it is theoretically likely that the TTO is valid for this use. The TTO consists of a trade off between longevity and quality of life. Firstly, we argue that it is impossible to control for all factors that may influence one’s willingness to sacrifice lifetime. Secondly, we argue that longevity and quality of life are too closely interrelated for the hypothetical trade off to reveal real preferences. Thirdly, we argue that the TTO handles the value of a life year inconsistently because it simultaneously assumes that it changes (as an outcome measure) and that it doesn’t change (as a currency unit). Lastly, we ask whether the difficulties stem from an inherent contradiction in trying to quantify quality of life. The problems of theoretical validity and internal consistency contrast with the use of the results as exact measurements. We conclude that cost utility analysis based on TTO cannot be trusted as a tool for setting priorities in health.
Cost utility analysis is an ambitious approach designed to compare the costs and the benefits of alternative health programmes.1,2 The aim is to make all kinds of health gains directly comparable on a numeric level. The great potential of the cost utility analysis, as well as the controversy, lies in the use of a common “currency” for all health outcomes, such as the quality adjusted life years (QALYs). In terms of QALYs, the value of a long life lived with chronic illness can be directly compared with the value of a short life lived in good health (figure 1). The QALY consists of the number of life years lived, multiplied by a weighting of the value of those life years on a numeric scale from 0.00 (death) to 1.00 (perfect quality of life).
The value of one year lived in the most desirable condition (1.00) is considered equivalent to three years lived with a quality of life weighting of 0.33 or to four years with 0.25. These valuations are called quantified value of health states,3 quality of life weights,4 utilities,5 health values,6 preferences for health states,7 preferences for quality of life,8 and the worth (utility) of perceived quality of life,9 in different publications. We will refer to them generally as quality of life weightings.
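The arithmetic behind these equivalences can be made explicit. The following is a minimal sketch, not a standard implementation: the function name `qalys` is an assumption introduced here for illustration, and the three cases are the ones mentioned in the text.

```python
def qalys(life_years: float, weight: float) -> float:
    """Return quality adjusted life years: life years times a weight in [0, 1]."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0.00 (death) and 1.00")
    return life_years * weight


# The three cases from the text: one year at 1.00,
# three years at 0.33, and four years at 0.25.
one_good_year = qalys(1, 1.00)
three_years = qalys(3, 0.33)
four_years = qalys(4, 0.25)

print(one_good_year, round(three_years, 2), four_years)
```

Because 0.33 is a rounded third, three years at that weighting come to 0.99 rather than exactly one QALY. The sketch treats the weighting as a fixed multiplier, which is how it is used in the construction of QALYs.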
The time trade off method (TTO)10,11 is the most commonly used method for eliciting such quality of life weightings for QALYs.12 It is also being considered by the World Health Organisation as an improvement of the protocol employed to adjust life years for disability as in “DALYs”.13,14 The method has been advocated as the best there is,15,16 and is often referred to as a previously validated technique.9,17,18
The TTO method consists in eliciting the respondents’ trade offs between length of life and quality of life. The respondents are asked which proportion of their remaining lifetime they would be willing to sacrifice in return for being relieved of a given health problem. The more lifetime they are willing to forgo in exchange for perfect health, the greater the burden attributed to their health state. The quality of life improvement ascribed to an intervention is the average quality of life weighting after treatment, minus the quality of life weighting before treatment, multiplied by the duration of this effect. In practical decision making, interventions are increasingly judged as being cost effective or not, on the basis of changes in the TTO weights and the resulting cost utility analysis.19–21
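The two calculations described above can be sketched as follows. This is an illustration under stated assumptions: the function names `tto_weight` and `qaly_gain` and all numbers are hypothetical, and the weight is taken as one minus the proportion of remaining lifetime sacrificed, so that a respondent who trades nothing receives 1.00.

```python
def tto_weight(years_traded: float, remaining_years: float) -> float:
    """Quality of life weighting implied by a single TTO answer.

    Assumes the weighting is 1 minus the proportion of remaining
    lifetime the respondent is willing to sacrifice.
    """
    if remaining_years <= 0:
        raise ValueError("remaining_years must be positive")
    if not 0 <= years_traded <= remaining_years:
        raise ValueError("years_traded must lie between 0 and remaining_years")
    return 1.0 - years_traded / remaining_years


def qaly_gain(weight_before: float, weight_after: float, duration_years: float) -> float:
    """Improvement ascribed to an intervention: change in weighting times duration."""
    return (weight_after - weight_before) * duration_years


# Hypothetical respondent: willing to give up 5 of 20 remaining years.
before = tto_weight(5, 20)
# Hypothetical post-treatment weighting of 0.90, lasting 10 years.
gain = qaly_gain(before, 0.90, 10)
print(before, round(gain, 2))
```

In practice the weightings before and after treatment are averages over a group of respondents rather than single answers.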
How quality of life weightings should be assigned to health states constitutes a large part of the current discussion in health economics. One topic that is frequently investigated and debated is the finding that different methods tend to give different quality of life weightings for the same health state.22–26 A number of technical adjustments have been suggested to explain or compensate for this, including for instance the use of “power functions” when translating the results of one technique into those of another.27–29 The difference between weightings elicited under conditions of uncertainty—that is, in “true gambles”—and under conditions of certainty, such as the TTO, is a central topic.30 Refusal to trade with lifetime has been explained as denial,31 and attempts have been made to adjust for risk aversion by using certainty equivalents.32 There are claims that TTO results depend on the perspective of the respondents,33 and on discussions of whose values count.34 The patient’s perspective is often recommended, but there is concern about the subjective influence of the time perspective and of adaptation when evaluating temporary and chronic diseases.27,29,35 The TTO results are found to depend on age and sex group, and efforts have been made to provide standard population quality of life weightings as a reference.36,37 Both Harris,38 and Arnesen and Nord,39 have also raised ethical discussions about assigning less value to the life years of those who are ill.
What are seldom discussed, however, are the underlying assumptions related to assigning quality of life weightings to diagnostic groups. The aim of this paper is to explore whether it is theoretically likely that the TTO technique, used in real patient populations, will generate valid quality of life weightings for use in cost utility analysis.
In what follows we will address the validity of using the TTO results as quality of life weightings in the construction of QALYs. The validity of the TTO cannot be established against a “gold standard” because none exists.40,41 In general, a method is considered valid if it actually measures what it claims to measure.42,43 The TTO, however, is said to measure a whole range of phenomena, and assessing validity is not straightforward. Instead of looking at the different definitions of what the TTO intends to capture, one may look at the way the results are used. The use of the results of the TTO in the construction of QALYs is always the same: namely, as inputs on the y axis in figure 1. The quality of life weightings on the y axis are handled as fixed units, in precisely the same way as the life years on the x axis.
For the purpose of this study, we shall formulate the TTO as a hypothesis:
The proportion of remaining lifetime that respondents, on average, are willing to sacrifice in exchange for being relieved of a specific health problem can be used as a quality of life weighting for this health problem.
This hypothesis is not directly falsifiable or verifiable in itself. One may, however, discuss the validity of the hypothesis by discussing the premises on which it is built (as in the axiomatic-deductive method43). We identify the following four assumptions embedded in this hypothesis:
Willingness to trade lifetime for improved health reflects current health.
The hypothetical trade off reveals “true” preferences.
Sacrificed life years constitute a stable “currency”.
Quality of life can be measured on a numeric level.
We shall now address the validity of the TTO by discussing whether these assumptions are reasonable in real world conditions.
DISCUSSION OF ASSUMPTIONS BEHIND THE TTO METHOD
Does willingness to trade lifetime for improved health reflect current health?
When developing the TTO method, Torrance assumed that “The less desirable is the health state, the greater is the amount of life that the subject will trade off in order to be free of the health state.”11 In other words, he assumes that the loss of quality of life associated with an illness can be measured by asking the respondents how much life time they would sacrifice to be cured of that illness. Is this a reasonable assumption?
We know that people react differently to illness. While the knowledge of a lethal condition makes some people value their days and minutes more than before, it makes others perceive a loss of meaning in their remaining days. It is, however, not the individual valuations, but the average quality of life in a diagnostic group that is of interest for use in cost utility analysis. It may be reasonable to anticipate some kind of association between the average willingness to trade with time and the average burden of health problems. But is it possible to know how much of this willingness is caused by the health problem under study, and how much is due to other factors?
In empirical studies, some respondents refuse to trade any lifetime in exchange for improvements in health. The rate of “zero traders” has no obvious relation to the severity of the disease. For instance, in two studies of metastatic symptomatic cancer44 and elderly people experiencing a loss of physical autonomy,45 more than half of the patients refused to trade a single day of life in exchange for perfect quality of life. This gave a median quality of life weight of 1.00 which should reflect perfect health. As a contrast, three quarters of women with mild menopausal symptoms said they would sacrifice considerable amounts of lifetime in exchange for being relieved of their problem.46 Box 1 shows some published reasons for refusing to trade with lifetime, in connection with the TTO in patient populations.
Box 1 Reasons for refusing to trade lifetime in exchange for health improvements
“Life is sweet”, “there are still things to enjoy”, “I choose to live day by day”, “I like to feel positive”, “this is not an acceptable question”, “individuals have no right to manipulate life expectancy”.44
“Time with my family is too precious”, “my handicapped son needs me to take care of him”.46
“Only God says when we die”, “impossible to answer”.41
“Silly question”, “too hard to imagine”, in conflict with religion, personal, or philosophical beliefs.47
The question was too hypothetical, in contrast with religion, older patients failed to grasp what was asked.48
“One should not be making deals with God”.49
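The effect of zero traders on a group result can be illustrated with a small sketch. The answers below are invented for illustration and are not taken from any of the cited studies. They show only that when more than half of a group refuses to trade, the median weighting is 1.00 whatever the remaining respondents answer.

```python
from statistics import median

# Hypothetical answers: proportion of remaining lifetime each of ten
# respondents would give up. Six of the ten are "zero traders".
fractions_traded = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.10, 0.25, 0.40, 0.50]

# Weighting assumed to be 1 minus the proportion sacrificed.
weights = [1.0 - f for f in fractions_traded]

zero_trader_rate = sum(f == 0.0 for f in fractions_traded) / len(fractions_traded)
median_weight = median(weights)
mean_weight = sum(weights) / len(weights)

print(zero_trader_rate, median_weight, round(mean_weight, 3))
```

The mean is less extreme than the median in this invented sample, but it is still pulled towards 1.00 by respondents whose refusal may have little to do with the severity of their condition.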
Among studies of large samples of the general population, the proportions of respondents who are unwilling to exchange length of life for health improvements vary considerably. For instance, Lundberg et al found a rate of zero traders of 2%, while Wells et al found 85%.37,50 From these examples, it seems that either TTO questions are posed so differently that their respective results are not comparable, or the reluctance to trade with lifetime is primarily affected by factors other than the severity of the condition. These possibilities are further explored in separate papers (Arnesen T, Trommald M. Roughly right or precisely wrong? A systematic review of quality of life weights elicited with the time trade-off. Unpublished data, 2003).51
Fowler et al suggested that the TTO is a measure of respondents’ reluctance to give up life rather than of the severity of their health state.52 We would add that all components affecting the will to live have, in principle, a direct influence on the trade off between lifetime and health improvements. Aspects of life such as having children, friends, and social esteem may, in many cases, have a greater impact on the TTO than the health problem being studied. The ability to enjoy life may be more important for the TTO result than any illness. Responsibilities for others may override concerns for one’s own health. Religious beliefs may forbid any thought of such negotiation. In this complex situation, how can one check whether a sample of patients is representative of the diagnostic group they represent? Controlling for all the factors that might influence the TTO, even if it were possible, would require huge, random samples of the population under study, because the confounding factors are as complex as human beings themselves.
Does a hypothetical trade between longevity and quality of life reveal true preferences?
Trade offs between quantity and quality of life do occur in the real world. Some patients and their advisors are forced to choose between higher chances of survival, or dying sooner with less pain. The TTO gives its respondents a similar kind of choice, but presented hypothetically or as a mind game. There will be no execution of the deal, and their choice will not influence their fate. The TTO presents an interesting and thought provoking question, but does it reveal the preferences that would have materialised if the respondents really had to make that very choice?
The TTO approach is rooted in consumer theory in which the value of an item is decided by the customer’s actions in the market. There are no “true” values other than the price obtained. The price of a good is normally paid in money but similar trade offs can be performed between any two goods. For instance, the value of a kilogram of oranges may be expressed in terms of the number of apples that the respondent is willing to “pay” to have them. The difference between the strengths of the respondent’s preferences for a kilo of oranges and for a kilo of pears may thus be compared numerically in terms of apples paid for each. The question is whether the same logic is valid for a trade off between quantity of life and quality of life. Can one use a hypothetical trade off to compare the value of life lived in different health states in terms of the number of life years the respondent is willing to pay to improve those states?
It is difficult to conceptualise a trade of lifetime against quality of life because they are so deeply interrelated. Quality and quantity are different dimensions of existence, and neither of them can substitute for the other. One needs to be alive to enjoy quality of life, and good health normally leads to longevity. It is therefore not certain that there exists a true point of indifference between preferences for living well and preferences for living long.
Even if we assumed that such a true indifference point did exist, it is still questionable whether we would be able to discover it. Choices involving trade offs between length of life and quality of life cannot be observed in a real market. Therefore health economists have opted for hypothetical choices as the second best solution.53 The reliance on health state valuations derives its force and assumed validity from the theory of revealed preferences. However, according to Slovic, a main theme in behavioural decision research during the past two decades has been that “people’s preferences are often constructed in the process of elicitation”.54 In a study using TTO, Lenert et al concluded “The results suggest that utility values are heavily influenced by, if not created during, the process of elicitation.”55 Kaplan et al comment that “human information processors do poorly at integrating complex probability information when making decisions that involve risk”,56 and Naylor and Llewellyn-Thomas put it this way: “How many individuals who make poor purchases in a clothing store face ‘immediate painless death’ as penalty function for their temporary lack of taste?”.57 Another question concerns what would happen if the respondents knew that the results would be used for setting priorities between their own diagnostic group and other patients.58
A trade off between living well and living long may be easier to perform the more hypothetical it is. The results of the TTO elicitation process seem to be highly dependent on whose lifetime the respondent is “playing” with. Experts and relatives tend to be more willing to trade off lifetime in exchange for quality of life than the patients themselves.33 This could be an argument for eliciting valuations from those affected by a condition rather than from healthy experts. Changing the perspective of the respondent would not, however, give us a way of verifying whether TTO measures a genuine “willingness to pay” or rather the “willingness to play”.
Can sacrificed life years be used as a stable currency?
In the TTO approach, the burden of a health state is established by a “price” paid in life years. The more years the respondent is willing to pay to get well, the bigger the (perceived) potential health gain. A constant, proportional trade off is assumed. Use of life years as a currency to express the value of a health state has a democratic flavour compared to using the willingness to pay money, since life years are more equally distributed within society. However, on a closer look, we find that there are problems with the approach. The unit of payment, the value of a life year, may not be a stable currency. In some versions of the TTO, the respondents are specifically reminded that they should consider each year to have the same value.55,59 However, most versions of the TTO do not specify the kind of life years to be traded with.
Some people would never exchange a single day of their lifetime for any other benefit. In this case, the value of a life year is immeasurable, or, in a mathematical sense, infinitely positive. Other people may have reached a state where they are satisfied with their lifetime and are well at ease with facing death. To them, length of life has a neutral value. Some people find life extremely painful, and religious people may see death as an entrance into a better life than this. For people who would rather be dead than living, length of life has negative value.
Most people have a different attitude to the idea of trading time immediately in the present, compared with trading at the end of their life. This goes beyond the tendency towards a greater willingness to “pay” if the payment is postponed, which could be dealt with through discounting. When considering whether or not to sacrifice the last years of their lives, the respondents’ imagination of those years will influence the trade. Do they see themselves in their current state of health with a social life like now? Or rather in an institution somewhere, forgotten, crippled, and sad? The same respondent may be willing to trade ten bad years or one good year for the same health gain.
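The consequence for the resulting weighting can be shown with a small numerical sketch. The remaining lifetime of 20 years is an assumption made here purely for illustration, and the weighting is again taken as one minus the proportion of lifetime sacrificed.

```python
def implied_weight(years_traded: float, remaining_years: float) -> float:
    """Weighting implied by a TTO answer: 1 minus the proportion traded."""
    return 1.0 - years_traded / remaining_years


remaining_years = 20  # hypothetical remaining lifetime

# Same respondent, same health gain, two ways of imagining the years given up.
weight_if_bad_years = implied_weight(10, remaining_years)  # trades ten "bad" years
weight_if_good_year = implied_weight(1, remaining_years)   # trades one "good" year

print(weight_if_bad_years, weight_if_good_year)
```

Under these assumptions the same health state would be assigned a weighting of about 0.5 or about 0.95, depending only on how the respondent pictures the years being traded.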
The above discussion reveals that the TTO method has a problem of internal consistency: On the one hand, it is assumed that lifetime has a stable value so that life years can be used as interchangeable units of measurement or currency. On the other hand, the whole point of the TTO method is to find out how much illness reduces—that is, changes—the value of life years. In other words, the method simultaneously assumes that the value of a life year changes (as an outcome measure) and that it doesn’t change (as a currency unit).
Is it possible to quantify quality of life?
A first requirement for assessing validity is that it be known what is being measured. Gill and Feinstein found that out of 75 papers with “quality of life” in their titles, only 15% defined quality of life.60 As mentioned in the introduction, the TTO has been said to measure a wide range of concepts. When launching the TTO, Torrance et al presented it as a measure of utilities, quality of life, and of health. The definition of health they employed was “a state of complete physical, mental and social well being”.10 With many different concepts in use, it is difficult to keep track of what is being measured.
The problem of defining a common concept of quality of life goes deeper than terminology. The question of what constitutes “the good life” is one of the basic philosophical questions. In some cultures, a good life is what is good for the community, while in other cultures “the good life” consists in following the prescriptions of religion and tradition. In the western world, the emphasis tends to be on individual happiness and personal self realisation. Concepts of quality of life vary not only between cultures but also between people in the same culture, and even for the same individual according to situation and age.
In the vast and rapidly growing literature about TTO and similar methods, there has been relatively little attention paid to the apparent contradiction in the attempt to quantify quality of life. In other scientific fields, one typically distinguishes between topics appropriate for quantitative and qualitative research methods. Traditionally, quantities are counted and qualities are judged. Crossing the borders for what can be quantified often strikes us as going against common sense. For instance, The Beatles poked fun at attempts to quantify happiness when they sang: “…some kind of happiness is measured out in miles…”. Aristotle observed that one cannot exceed the level of precision inherent in the subject itself: “Our subject would be sufficiently articulated if it should achieve the level of clarity that is appropriate to its subject matter”. Before using quantifications of the most complex quality of all, the quality of life, one should at least be able to define what one has measured, and be able to defend theoretically the claim that the numbers obtained reflect this quality.
In empirical studies and reviews, the TTO has been found to give unexpected results. For example, “metastatic symptomatic cancer” was assigned a median quality of life weight of 1.00 corresponding to a perfect health related quality of life,44 while osteoarthrosis of the hip got a mean quality of life weight of 0.29 corresponding to a very poor health state.61 One specific health state, “end stage renal disease on in-centre haemodialysis”, elicited with the TTO on the patients’ own behalf, was assigned quality of life weights between 0.39 and 0.87.62,63 The TTO results assigned to diagnostic groups in empirical studies are reviewed in a separate article.64
Unexpected empirical results could be caused by inappropriate use of the TTO methodology. Dolan et al found that the quality of life results varied more across variant than across method.27 There is extensive variation in the TTO methodology employed. For instance the hypothetical life expectancy, used as a frame for the time trade off, varies from one month44 to 30 years.65 This is further explored in a review of the use of TTO methodologies.51
But another possible explanation for such variation is a lack of theoretical validity. This is what we have discussed here. We find that the problems of theoretical validity may well explain unexpected empirical results. Our discussion is limited to the use of quality of life weightings for QALYs, and we have not addressed other potential uses of the TTO, such as clinical decision making at an individual level. However we are not aware that anybody has seriously advocated the TTO for this use, because disease specific instruments are preferred when comparison across diagnostic groups is not needed.
Although the TTO is the most widely used method for assigning quality of life weightings to QALYs, the fact that it has problems is not news to the health economic community, even though there is no consensus about what the problems consist in. Given the difficulties, one may therefore ask why the method is so widely used.
One part of the explanation may be that the demand for handy, quantified health outcome measures is strong and increasing. If effects on quality of life are not quantified, they cannot be included in any economic analysis requiring quantification. Accurate, numerical quality of life weights would accordingly make it easier to take quality of life effects into account when scarce resources are distributed. The great uses one might make of quality of life weightings were they available does not, however, imply that it is possible to elicit them.
Bernard Williams has described the dilemma in this way:
Again and again defenders of such values are faced with the dilemma of either refusing to quantify the value in question, in which case it disappears from the sum altogether, or else of trying to attach some quantity to it, in which case they misrepresent what they are about…66
He suggests that “Some of the effort should rather be devoted to learning—or learning again—how to think intelligently about conflicts of values which are incommensurable.”66
In the field of valuations of health states there is, in our opinion, a disproportion between the abundance of empirical and technical papers and a lack of a more conceptual debate. Comparing cost and effectiveness of health services could have been a fruitful common area of research for economists, health workers, philosophers, and social scientists. As it is, however, the research field remains largely “unseen” by those who are not health economists. Discussions within the same paradigmatic framework tend to turn around technical solutions, rather than questioning the underlying assumptions. Quality of life assessment on the numerical level thus runs the risk of being a closed research field where critiques from “outsiders” and fundamental or ethical questions are dismissed as beginners’ faults.
This lack of contact between research fields does not necessarily lead to a blank dismissal of each other’s research. It may also lead to a mutual overestimation of the precision of each other’s work. For instance, we find that health workers are more prone to use cost utility analysis as an exact science than are health economists themselves, and that health economists handle diagnostic groups such as short stature3,67 or depression68 as more definite and homogeneous unities than medical doctors would ever do. One may say that the “typical patient” in health economics has a single well defined illness, for which there exists one set of treatment options with well established effects and risks, while the “typical patient” in clinical medicine has a combination of chronic and acute, medical and social problems for which the effects of treatment are difficult to predict precisely.
Cross disciplinary discussion is often abandoned because there is little agreement on what the problems are, what the interesting questions are, and what the appropriate methods and limits for scientific solutions to these problems are. Yet it is often in the clash between different traditions that the best solutions are found. We therefore believe that a broader approach, which also involves clinicians, decision makers, and social scientists, is necessary to address difficult questions of how to measure, value, and compare health outcomes.
In this paper we have discussed the validity of the assumptions underpinning what is presumably the best available method for assessing quality of life weightings for cost utility analysis—the time trade off method. We have identified four assumptions underpinning the method and discussed each of them. We argue that the assumptions are both unrealistic and inconsistent in real world conditions. We find a number of reasons for doubting that the average hypothetical sacrifice of lifetime in a diagnostic group is a valid measure of the average quality of life in that group. We conclude that we find no justification for using the results of the TTO as quality of life weightings in the construction of QALYs.
The field of health economics has contributed substantially to the topic of how to measure health for economic analysis. If the TTO is the best method available, however, our discussion bears upon the validity of the cost utility approach as such. We believe that there is now a need for new perspectives in a conceptual and normative debate about how best to compare the cost and effects of health interventions, and about how to strike the right balance between giving a true answer and giving one that is practical in use.