We continue the series with Knowledge Bomb #7. The purpose and motivation for this series are outlined in the first entry. As a reminder, the blog entry is a summary of the conference presentation, meant to serve as a reference for #FOAMED and allow for discussion.
Thanks to Dr. Chad Shirk, PGY2, for his work producing the following BOMB. For this entry Dr. Dan Robinson, Attending and Medical Education Fellow, has contributed an extensive response to help explain some of the details of the paper – enjoy.
A 66-year-old female with a history of COPD presents to the ED with a chief complaint of intermittent heart palpitations over the past week. EKG reveals atrial fibrillation. The patient denies any chest pain, and her troponin level is normal. The remainder of the history and workup is normal. The patient is feeling better and asks if she can go home. Do you send her home? Do you recommend admission? Luckily, your attending hands you a new article that looks to resolve this dilemma. Will you use it?
- A retrospective cohort study
- Done in Ontario, Canada (from a large database on all ED visits)
- Included patients aged 20-105 with a first-time ED visit carrying a primary diagnosis of a-fib
- Primary outcome: 30-day all-cause mortality after the index ED visit
- Created two different models
- Complex (logistic regression models)
- Simple (variables selected a priori)
Two different models were created for predicting 30-day death after ED visits for a-fib. The simpler model can be easily committed to memory and serves as a starting point for risk-stratifying patients. It uses the mnemonic TrOPs-BAC, which represents the variables that combine to create a risk score divided into low (0-1), moderate (2-3), and high (≥4) (see below – left). If the score falls in the moderate-risk zone, or if a more in-depth analysis is desired, the complex model, whose scoring system requires computer-aided calculation, is recommended for admission decisions (see below – right).
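The band cutoffs above (low 0-1, moderate 2-3, high ≥4) can be sketched as a small helper. Note this is illustrative only: the actual TrOPs-BAC component variables and their point values are in the original paper and are not reproduced here.

```python
def risk_band(score: int) -> str:
    """Map a TrOPs-BAC point total to a risk band.

    Only the band cutoffs stated in the text (low 0-1, moderate 2-3,
    high >=4) are used; the component variables that produce the score
    come from the original article.
    """
    if score <= 1:
        return "low"
    if score <= 3:
        return "moderate"
    return "high"
```

For the case patient's simple-model score of 2, `risk_band(2)` returns `"moderate"`, which is the zone where the authors recommend moving to the complex model.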
This study provides an easy-to-use tool (much like the HEART score) to risk-stratify patients with a-fib. It can clarify disposition decisions and decrease unnecessary hospital admissions, which can be very helpful in a busy ED. However, it has limits: it does not specify or separate new-onset vs. chronic a-fib, and it has not yet been prospectively validated.
Application to my practice (PGY2 Perspective)
This article is not quite practice-changing yet. It was a small study outside the U.S., and it has not been prospectively validated. It also does not comment on whether the a-fib was new-onset within a certain time frame or a previously known chronic diagnosis. Nonetheless, this is a great start toward risk-stratifying this group of patients, and when used in the right setting it can be a great way to support your decision to admit or discharge a patient with atrial fibrillation.
Back to the case: according to the simple model she scores a 2, which indicates use of the complex model, where she scores a 5. In this instance, a patient who may seem lower risk at first glance actually comes out as higher risk on the complex model, suggesting that admission and observation is the safer route. In other cases, where patients are moderate risk on both scoring systems, doctor-patient discussion is needed to determine disposition.
Attending Response – Dr. Dan Robinson – Medical Education Perspective
Dr. Robinson has written a great explanation on the various types of validity followed by analysis of the paper and his perspective on its application.
What is validity?
Validity is the evidence (an argument) that you can use to support or refute the claims/meaning/interpretation of data.1,2 In our case: do the data support the claims of this paper (the predictive model)? Below is a discussion of different types of validity and how they relate to the AFTER study.
You should not say that a model or test is valid. Rather, it has validity evidence to support it. 1,2
Medical education and research gurus do NOT talk about face validity anymore as a way of saying something has validity evidence.2 Face validity is the weakest (arguably non-existent) of all types of validity: it tests only whether a person thinks the model does what it says it is supposed to do. In this article, the authors invoke face validity when comparing the simple/pragmatic model to the complex model: were there variables in the simple model, not included in the complex model, that changed the sensitivity of the complex model when added? There were not. Thus it has "face validity."
Internal validity is whether the assumed predictor(s) caused or did not cause the outcome, or whether it was something else (a confounder).3 Put another way: were there any confounding variables that were not measured? I feel that this model has good internal validity evidence. Without other knowledge of potential confounders the model looks appropriate, but this type of methodology often has many unknown confounders. Because this is a retrospective paper, we cannot conclude cause and effect; we can only infer it.
Are the findings of this study relevant to other populations, settings, patients, etc.? Cross-validation methodology (the method used in this study) adds some element of external validity evidence.3
Is this measure valid?
No test is “valid” across all domains; it can only show validity evidence. This model has validity evidence for the population in Canada. Does it have validity evidence for our population here in Chicago? There is no external validity evidence for that. We can make inferences from Table 1 as to whether our populations are similar or different, but without a new study in our population there would not be much validity evidence for us to use this model.
Below I am going to discuss different sections of the paper and explain the methodology/interpretation of these results as well as issues I have with the paper.
Cross Validation Method:
This type of methodology uses one subset of a larger population to derive a predictive model and the other subsample to estimate how well the model fits the data. It is a measure of external validity evidence, but there is inherent bias in the approach: the dataset is the same for both, since it was merely subdivided (remember this for application below!). The validation dataset will therefore fit the predictive model better than the model will fit when taken elsewhere, because the same source data are being used to test it.3
Cross validation counts as external validity evidence because you are asserting that the validation sample resembles other populations – what you would expect those populations to look like – and are thus using it as a proxy for truly external data. It does give you some estimate. Cross validation is therefore external validity evidence and NOT internal validity evidence: internal validity asks whether the presumed predictor actually caused the outcome or whether another variable (a confounder) did, and cross-validation does not test this.
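The split-sample design described above can be sketched in a few lines. Everything here is illustrative: the derivation fraction, the seed, and the stand-in records are assumptions, not values from the AFTER study.

```python
import random

def split_sample(records, derivation_fraction=0.67, seed=42):
    """Split one dataset into derivation and validation subsamples.

    This mirrors the split-sample (cross-validation) design discussed
    above: both subsamples come from the same source population, which
    is why validation performance is an optimistic estimate compared
    with truly external data. The fraction and seed are illustrative.
    """
    rng = random.Random(seed)
    shuffled = records[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * derivation_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in "patient records" for demonstration only.
patients = list(range(100))
derivation, validation = split_sample(patients)
```

Because `derivation` and `validation` are drawn from the same pool, any quirks of that pool (case mix, coding practices, exclusions) are shared by both halves, which is exactly the bias the attending response flags.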
Issues in methods
The authors excluded secondary a-fib diagnoses because of a significantly higher mortality rate in that group. Is this important for our population? Do our physicians prioritize their diagnosis lists in a way that would make this model work for us?
There appear to be a few differences between the derivation and validation samples that go without comment by the authors, and there is no discussion of how these groups were randomized from the total population. This leaves us to assume that the authors considered the issue and interpreted it correctly; assumptions are dangerous.
There is no discussion of power calculations; it would be nice to have this reported.
Notes on the Results:
It is important to discuss the use of the c statistic (concordance): 0.5 is random concordance and 1.0 is perfect concordance. This study showed good concordance, with a c statistic of 0.81 in the validation cohort.
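The c statistic can be understood as a pairwise comparison: the probability that a randomly chosen patient who had the outcome (here, 30-day death) received a higher model score than a randomly chosen patient who did not, with ties counting as half. A minimal sketch of that definition, with toy numbers rather than study data:

```python
from itertools import product

def c_statistic(scores_events, scores_nonevents):
    """Concordance (c) statistic by exhaustive pairwise comparison.

    Probability that a randomly chosen patient with the outcome scores
    higher than a randomly chosen patient without it; ties count 0.5.
    0.5 is random concordance, 1.0 is perfect concordance.
    """
    pairs = list(product(scores_events, scores_nonevents))
    concordant = sum(1.0 for e, n in pairs if e > n)
    tied = sum(0.5 for e, n in pairs if e == n)
    return (concordant + tied) / len(pairs)
```

For example, `c_statistic([3, 2], [1, 2])` evaluates four pairs (three concordant, one tied) and returns 0.875. The study's reported 0.81 in the validation cohort sits comfortably above the 0.5 chance line.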
Is this practice changing? No, not yet. I think the methods are mostly sound, but as I brought up above I have issues with the methods and their reporting. The exclusion criteria are especially important to examine if you are trying to implement the model in your own setting. The study shows benefit.
But I have reservations. This model has face validity (which we are not supposed to use anymore) and some internal and external validity evidence. Further research applying the model to a population external to the original is needed to show whether it is ready for prime time. The cross-validation method is useful, but it is not the only way to show external validity evidence, and it shows such evidence only if your population is similar to the original one. I would like this study to be replicated in other settings and institutions (maybe here in Chicago) to see if the results are similar. One last thing: a prospective analysis would allow researchers to see whether there is a change in outcomes, which would give much more validity evidence to this model.
- Downing SM, Yudkowsky R. Assessment in Health Professions Education. Routledge; 2009.
- Yudkowsky R. Internal validity. In: Robinson D, ed. Phone conversation; 2016.
- Lineberry M. Internal validity. In: Robinson D, ed. Email; 2016.
Dr. Dan Robinson is finishing his fellowship in Simulation Medicine and Medical Education at UIC. He is also completing his Masters in Health Professions Education at UIC. Dr. Robinson is interested in the use of simulation as a novel way to teach in medical education. His interests also include curriculum development and research in medical education.