"The OMOP Common Data Model allows for the systematic analysis of disparate observational databases. The concept behind this approach is to transform data contained within those databases into a common format (data model) as well as a common representation (terminologies, vocabularies, coding schemes), and then perform systematic analyses using a library of standard analytic routines that have been written based on the common format." - Observational Health Data Science and Informatics (OHDSI)
The Common Data Model (CDM) can support electronic health record data, claims data, and more. Once a source of data is converted to the CDM schema, the data can be analyzed using standardized tools. OHDSI develops and maintains open-source tools for data quality, characterization, product safety surveillance, comparative effectiveness, quality and others. Some of the best known include ATHENA, ATLAS and HADES.
OHDSI makes the Medicare DE-SynPUF data available in OMOP format. I used this data to create and analyze a cohort of 203 individuals who were exposed to amiodarone hydrochloride for more than 30 days.
First I had to locate concept IDs for RxNorm drug codes containing the ingredient to ensure they were present. Then, I queried the drug_exposure table for persons who had been exposed to any medication containing amiodarone for at least 30 days.
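The cohort step described above can be sketched roughly as follows. This is a minimal illustration, not the SQL used for the analysis: it builds a toy in-memory SQLite database with the OMOP table and column names (drug_exposure, concept_ancestor), uses made-up concept IDs and rows, and assumes that "at least 30 days" means a single exposure record spanning 30 or more days.

```python
import sqlite3

# Hypothetical concept IDs for this toy example only; look up real ones in ATHENA.
INGREDIENT_ID = 1  # stands in for the amiodarone ingredient concept

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept_ancestor (
    ancestor_concept_id INTEGER,
    descendant_concept_id INTEGER
);
CREATE TABLE drug_exposure (
    person_id INTEGER,
    drug_concept_id INTEGER,
    drug_exposure_start_date TEXT,
    drug_exposure_end_date TEXT
);
-- Concepts 11 and 12 are toy products containing the ingredient; 99 is unrelated.
INSERT INTO concept_ancestor VALUES (1, 11), (1, 12);
INSERT INTO drug_exposure VALUES
    (100, 11, '2009-01-01', '2009-03-01'),
    (101, 12, '2009-01-01', '2009-01-10'),
    (102, 99, '2009-01-01', '2009-06-01');
""")

# Keep persons with an exposure of 30+ days to any drug descending from the ingredient,
# and record the earliest qualifying start date as the index date.
query = """
SELECT de.person_id, MIN(de.drug_exposure_start_date) AS index_date
FROM drug_exposure AS de
JOIN concept_ancestor AS ca
  ON ca.descendant_concept_id = de.drug_concept_id
WHERE ca.ancestor_concept_id = ?
  AND julianday(de.drug_exposure_end_date)
      - julianday(de.drug_exposure_start_date) >= 30
GROUP BY de.person_id
ORDER BY de.person_id
"""
cohort = conn.execute(query, (INGREDIENT_ID,)).fetchall()
print(cohort)
```

Only person 100 qualifies in the toy data: person 101 was exposed for 9 days, and person 102 received an unrelated drug.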
I used this cohort to get post-period visits and procedures associated with these 203 individuals. Click on the thumbnail to view the full SQL on GitHub.
The three graphs below show the proportion of gender (left), race (center) and ethnicity (right) in the amiodarone-exposed patients.
As expected, the age distribution of the exposed cohort patients aligns with that of the typical Medicare population.
The geographical distribution of the patients does roughly coincide with areas that have the most Medicare beneficiaries, and the largest populations in general. However, based on demographics alone, I would have expected a larger number of patients in Illinois, New York and the DC area.
The Visits data contained diagnostic codes and visit information for 5,595 visits for the 203 exposed cohort patients. These visits took place on dates >= the beginning of their amiodarone treatment. There were an average of 3 diagnosis codes per visit.
The Procedures data contained diagnostic and CPT codes and procedure information for 5,817 visits for exposed cohort patients. These procedures took place on dates >= the beginning of their amiodarone treatment, and there were an average of 2.7 diagnosis codes per procedure.
LUNG
793.19 Abnormal findings on diagnostic imaging of lung
786.30, 786.39 Cough with hemorrhage, respiratory passage hemorrhage
794.2 Abnormal PFTs
794.8 Abnormal LFTs
518.82 ARDS
514, 518.4 Pulmonary edema
518.3 Pulmonary eosinophilia NOS
515, 518.89, 516.34 Other interstitial diseases
136.3 Acute interstitial disease, Pneumocystosis
786, 786.05, 786.09, 786.39, 786.52, 786.7, 786.9 Symptoms involving respiratory system and other chest symptoms, including:
Shortness of breath
Other respiratory abnormalities
Other hemoptysis
Painful respiration
Abnormal chest sounds
Other symptoms involving respiratory system and chest
THYROID
794.5 Abnormal thyroid testing
245.4 Drug-induced thyroiditis
244.2, 244.3 Hypothyroidism, medication-induced
242.90 Other thyrotoxicosis
LIVER
573.8 Other liver disease
OTHER
V58.69 Other long term (current) drug therapy
The mean interval between amiodarone initiation and condition was 302.6 days, nearly a year.
"The overall prevalence of adverse effects from amiodarone therapy is as high as 15% within the first year of use and 50% for long-term use."
Lung, Liver and Thyroid AEs related to amiodarone therapy are some of the most common side effects of the drug:
Pulmonary toxicity
"Pulmonary toxicity typically presents in the first year of use and most commonly resembles interstitial lung disease. However, pulmonary toxicity also can present as organizing pneumonia, pleural effusion, acute respiratory distress syndrome, or diffuse alveolar hemorrhage. Mortality from amiodarone-induced pulmonary toxicity has been reported to be close to 10%."
Thyroid disease & thyrotoxicosis
"Amiodarone therapy may result in hypo- or hyperthyroidism, with hypothyroidism being almost twice as common. Toxicity usually is related to thyroiditis. Amiodarone-associated thyrotoxicosis can be difficult to treat and carries a high risk of mortality."
Liver toxicity & GI effects
"There is a 1% annual incidence of liver toxicity in patients treated with amiodarone. Most cases resolve after stopping the drug; however, toxicity can occasionally progress to end-stage liver disease and cirrhosis. IV amiodarone may cause acute liver injury within one day of infusion. Nausea, anorexia, and constipation are the most common gastrointestinal side effects."
While this analysis looks only at Lung, Liver and Thyroid conditions, other conditions related to amiodarone toxicity include:
Torsades de Pointes (TdP)
"Amiodarone may induce TdP in the first 48 hours of IV administration"
Neurological toxicity
"Neurologic toxicity can occur in up to 27.5% of patients, ranging from cognitive impairment to peripheral neuropathy, ataxia, and in some rare cases, quadriplegia."
Other impacts
"Dermatologic effects include blue skin discoloration ("smurf-skin") and photosensitivity. In rare instances, amiodarone may cause epididymitis and erectile dysfunction."
Reference:
Florek JB, Girzadas D. Amiodarone. [Updated 2023 Feb 6]. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2023 Jan-. Available from: https://www.ncbi.nlm.nih.gov/books/NBK482154/
While there were 57% females and 43% males in the exposure cohort, there were 69% females and 28% males in the impacted cohort.
The exposure cohort was 88% white, a much higher share than among all Americans on Medicare in 2010 (59% white). The impacted cohort was 75% white and 17% Black, indicating that proportionally more Black patients were impacted.
100% of the impacted cohort were non-Hispanic; however, the exposure cohort had only 3 Hispanic patients.
Impacted patient ages at sentinel visits skewed much older than the exposed cohort's age distribution, with mean age at diagnosis of 77 years and median age of 80 years.
Impacted and non-impacted state proportions seemed similar across the board, but each state was tested against the average of all other states for statistically significantly higher #s of exposed patients with AEs.
A. Was there a significant relationship / dependence between biological sex and amiodarone AEs in the visit data?
statistic=4.010
p-value=0.045
dof=1
expected frequency (females/males), original cohort = [67.35960591, 38.64039409]
expected frequency (females/males), sentinel cohort = [61.64039409, 35.35960591]
Yes, the Chi-squared results indicate we should reject the null hypothesis at the 95% confidence level (alpha = 0.05, and p < alpha at 0.045), and conclude that female biological sex was statistically significantly associated with having amiodarone AEs after exposure.
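A test of this kind can be sketched in plain Python as shown here. The observed counts are an assumption: they are illustrative values chosen to be consistent with the expected frequencies reported above, not counts taken from the source. The sketch applies the Yates continuity correction, which scipy.stats.chi2_contingency also applies by default to 2x2 tables.

```python
import math

def chi2_yates_2x2(table):
    """Chi-squared test of independence for a 2x2 table with Yates correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    expected = [[row_totals[i] * col_totals[j] / n for j in range(2)]
                for i in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Shrink each |observed - expected| by 0.5 (never below zero).
            diff = max(abs(table[i][j] - expected[i][j]) - 0.5, 0.0)
            stat += diff ** 2 / expected[i][j]
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, expected

# Illustrative counts (assumption): rows = original / sentinel, columns = female / male.
observed = [[60, 46], [69, 28]]
stat, p_value, expected = chi2_yates_2x2(observed)
print(f"statistic={stat:.3f} p-value={p_value:.3f} dof=1")
print(expected)
```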
B. Was there a significant relationship / dependence between self-identified race and amiodarone AEs in the visit data?
statistic=3.15
p-value=0.207
dof=2
expected frequency (white/Black/unspecified), original cohort = [85.11330049, 14.09852217, 6.78817734]
expected frequency (white/Black/unspecified), sentinel cohort = [77.88669951, 12.90147783, 6.21182266]
No, self-identified race and the # of amiodarone AEs demonstrated statistical independence.
C. Was there any significant difference in the mean ages of patients with amiodarone exposure who had at least one AE diagnosis at a visit, and those who didn't?
Descriptive Statistics, Age at Non-AE Visits vs. AE Visits:
Welch's T-Test Results:
statistic=-5.244
p-value=.00000167
df=3332
Yes, the cohorts' mean ages are significantly different.
Older people exposed to amiodarone may be more likely to have a lung, liver or thyroid-associated AE. (Although they may also just be more likely to have lung, liver or thyroid conditions due to age.)
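For reference, the Welch statistic and its degrees of freedom can be computed as in this minimal sketch. The age lists are hypothetical, and the p-value is left out because it needs the t distribution, which scipy.stats.ttest_ind(a, b, equal_var=False) provides.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical ages at visit, for illustration only.
non_ae_ages = [66, 70, 71, 73, 74, 75, 78, 79]
ae_ages = [74, 77, 79, 80, 82, 84]

t_stat, df = welch_t(non_ae_ages, ae_ages)
print(f"statistic={t_stat:.3f} df={df:.1f}")
```

A negative statistic here means the first group (non-AE visits) has the lower mean age, which matches the sign of the result reported above.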
D. Were there significant differences in the # of exposed patients with AEs vs. without AEs by state?
There were only 38 states to compare, because there were no exposed patients in some of the states.
No, after comparing cross tabs for each state, there were no significant geographical differences.
A code snippet in Python used for testing each crosstab using Fisher's Exact Test is embedded below.
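A per-state test of this kind can be sketched as follows. This is an illustrative version, not the snippet used for the analysis: the state counts are hypothetical, each state is compared against all other states combined, and the two-sided p-value sums the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical counts per state: (patients with AEs, patients without AEs).
state_counts = {"FL": (12, 10), "TX": (8, 9), "CA": (9, 11)}
total_ae = sum(ae for ae, _ in state_counts.values())
total_no_ae = sum(no_ae for _, no_ae in state_counts.values())

p_values = {}
for state, (ae, no_ae) in state_counts.items():
    # Row 1: this state; row 2: all other states combined.
    p_values[state] = fisher_exact_two_sided(
        ae, no_ae, total_ae - ae, total_no_ae - no_ae)

for state, p in p_values.items():
    print(f"{state}: p-value={p:.3f}")
```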
277 visits document these sentinel lung, liver and thyroid-related diagnoses & symptoms.
The majority of patients who suffered a suspected complication did so within the first year of amiodarone treatment.
54% (150) of these AE diagnoses occurred within the first 6-7 months (30 weeks) of therapy.
90.6% of all diagnoses were lung / breathing related diagnoses.
Distribution of Sentinel Condition Visits
Distribution of AE Types
Most Common Visit Conditions - Amiodarone-Related Lung Toxicity
I used seasonal decomposition and significant covariates to compare Poisson and NB2 regression models for forecasting the # of AE-related visits per week.
First, sentinel visit data were analyzed to isolate Trend and Seasonality. The moving average window was 19 weeks.
There is a seasonal peak approximately every May, and a seasonal trough approximately every July. The trend peaks in late January - early February of 2010. The residuals show no obvious remaining pattern.
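The trend component of such a decomposition can be sketched as a centered moving average. This is a simplified illustration on hypothetical weekly counts, assuming an odd window with equal weights. It also shows why the first and last 9 weeks have no trend value with a 19-week window.

```python
def centered_moving_average(values, window):
    """Centered moving average; positions without a full window are None."""
    if window % 2 == 0:
        raise ValueError("this sketch only handles odd windows")
    half = window // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        trend[i] = sum(values[i - half:i + half + 1]) / window
    return trend

# Hypothetical weekly visit counts, for illustration only.
weekly_visits = [3 + (week % 5) for week in range(40)]

trend = centered_moving_average(weekly_visits, window=19)
usable = [t for t in trend if t is not None]
print(len(weekly_visits), len(usable))
```

With a 19-week window, 18 observations (9 at each end) are lost, so any model that uses the trend as a covariate has to drop those weeks.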
Because these are weekly visit counts, a Poisson model was fit first, using the decomposed trend & seasonality, plus two covariates that were highly correlated but not collinear with the outcome variable: total # of procedures for the visits, and total # of diagnoses for the visits. The model fit and diagnostics are presented below:
The visit_count is the endogenous variable here, and visit_dx_count (diagnosis count), visit_proc_count (procedures count), trend and seasonality are exogenous. The intercept is .5 (+/- 0.23); because the model uses a log link, this is the log of the expected visit_count when all the other variables are zero. The model was estimated by maximum likelihood on 68 observations (observations where the trend was null due to the MA window were of necessity excluded), assuming a Poisson distribution with log link, and had 63 (=68-4-1) degrees of freedom.
The visit_dx_count and trend variables were significant at the 95% confidence level, but both had small coefficients / marginal effects.
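To make the log link concrete, here is a minimal sketch of how a fitted count model of this form turns covariates into an expected weekly visit count. All coefficient values are hypothetical placeholders, not the fitted values from this analysis.

```python
import math

# Hypothetical coefficients on the log scale (not the fitted values).
coef = {
    "Intercept": 1.0,
    "visit_dx_count": 0.02,
    "visit_proc_count": 0.01,
    "trend": 0.10,
    "seasonality": 0.05,
}

def expected_visits(row):
    """Expected count under a log link: exp(intercept + sum of coef * value)."""
    linear = coef["Intercept"] + sum(coef[name] * value for name, value in row.items())
    return math.exp(linear)

baseline = {"visit_dx_count": 0, "visit_proc_count": 0, "trend": 0, "seasonality": 0}
one_more_dx = dict(baseline, visit_dx_count=1)

print(expected_visits(baseline))
print(expected_visits(one_more_dx) / expected_visits(baseline))
```

Under a log link, each coefficient acts multiplicatively: a one-unit increase in a covariate multiplies the expected count by exp(coefficient), which is why small coefficients correspond to small marginal effects.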
Below is a graph showing the fit of the data (blue line) to the training data (purple line) and the predictions for the test data (red line):
The mean absolute percentage error (MAPE) for the training data was 0.30155, and for the test data was 0.096277. The predictions were quite good; however, the fit of the initial model had 30% error and a Pseudo R-Squared of only 0.26.
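For clarity, MAPE expressed as a fraction can be computed as in this small sketch. The values are hypothetical, and skipping weeks with zero actual visits is an assumption made here to avoid division by zero.

```python
def mape(actual, predicted):
    """Mean absolute percentage error as a fraction, ignoring zero actuals."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

# Hypothetical weekly visit counts and model predictions.
actual = [4, 5, 8, 10]
predicted = [5, 5, 6, 9]

error = mape(actual, predicted)
print(round(error, 4))
```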
One reason for the poor fit could have been the use of a Poisson distribution on overdispersed data.
The mean of the weekly visit counts is 5.6 and the variance is 10.9, about double, where Poisson distributed data should have mean ~= variance.
Since mean != variance, I applied a Negative Binomial 2 model, where mean + α * mean^2 = variance.
To correctly calculate the model, we must supply the value for the α (alpha) parameter, which handles the overdispersion of the data.
Steps to calculate α
STEP 1: Fit a Poisson model on the data set, to obtain the vector of fitted rates, λ
STEP 2: Fit the auxiliary OLS regression model on the data set using the λ vector to obtain the value of α
STEP 3: Check the t-value of α to ensure it is statistically significant
STEP 4: Use α to fit the NB2 regression model
Then, make your predictions and test the model's fit.
Reference: Date, S. (2019) https://towardsdatascience.com/negative-binomial-regression-f99031bb25b4
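Steps 2 and 3 can be sketched in plain Python as follows, assuming the fitted Poisson rates λ from Step 1 are already available. The counts and rates are hypothetical. The auxiliary regression follows the approach in the reference: regress ((y - λ)^2 - λ) / λ on λ with no intercept, and read α off as the slope.

```python
import math

def estimate_alpha(counts, fitted_rates):
    """Auxiliary no-intercept OLS for the NB2 alpha, with its t-value."""
    # Dependent variable of the auxiliary regression.
    z = [((y - lam) ** 2 - lam) / lam for y, lam in zip(counts, fitted_rates)]
    sxx = sum(lam ** 2 for lam in fitted_rates)
    # Slope of a regression through the origin: sum(x * z) / sum(x^2).
    alpha = sum(lam * zi for lam, zi in zip(fitted_rates, z)) / sxx
    residuals = [zi - alpha * lam for lam, zi in zip(fitted_rates, z)]
    dof = len(counts) - 1  # one estimated parameter, no intercept
    s2 = sum(r ** 2 for r in residuals) / dof
    t_value = alpha / math.sqrt(s2 / sxx)
    return alpha, t_value

# Hypothetical weekly counts and Poisson fitted rates (lambda) from Step 1.
counts = [2, 9, 1, 12, 4, 15, 3, 10]
fitted_rates = [4, 6, 3, 8, 5, 9, 4, 7]

alpha, t_value = estimate_alpha(counts, fitted_rates)
print(f"alpha={alpha:.4f} t-value={t_value:.3f}")
```

The t-value is then compared against the critical value for the chosen confidence level and degrees of freedom before α is passed to the NB2 model in Step 4.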
STEPS 1-3
A Poisson model was applied to all the covariates with the outcome of visit counts, using a dmatrix supplied by Patsy containing only the training data. The model used maximum likelihood estimation and had 63 degrees of freedom. Pseudo R-Squared was 0.73.
This model produced a lambda vector of length 68. An OLS model with no intercept was fit on this vector to obtain the value of alpha, estimated as -0.09240869904973292 with a t-value of -7.563859. This t-value was significant: its magnitude exceeded the two-sided critical value of +/- 3.454485 for 63 degrees of freedom, which corresponds to the 99.9% confidence level.
The NB2 model was then fit to the data, using the magnitude of this estimate, alpha = 0.092408699.
STEP 4
Once again, visit_count is the endogenous variable and visit_dx_count (diagnosis count), visit_proc_count (procedures count), trend and seasonality are exogenous. The intercept is approximately .5 (+/- 0.27); because the model uses a log link, this is the log of the expected visit_count when all the other variables are zero. The model was estimated by maximum likelihood on 68 observations (observations where the trend was null due to the MA window were of necessity excluded), assuming a NB distribution with log link, and had 63 (=68-4-1) degrees of freedom.
The visit_dx_count variable was significant at the 95% confidence level, and seasonality was just short of significant with a p-value of 0.54. Trend was just short of significant at the 90% confidence level, with a p-value of 0.109. All variables had small coefficient sizes / marginal effects.
Below is a graph showing the fit of the data (blue line) to the training set (purple line) followed by the predictions obtained for the test data (red line):
The mean absolute percentage error (MAPE) for the training data improved to 0.2887 with the NB2 model, and the test MAPE was a similar 0.109 (versus 0.096 for the Poisson model). The fit improved over the Poisson model, with Pseudo R-Squared rising to 0.59, much greater than the Poisson model's 0.26.
Accounting for dispersion to fit a GLM to the most accurate distribution improves model fit whether we're using it in a typical regression problem or time series regression analysis!