My drive to help others and insatiable curiosity about the world have fueled a life-long learning journey.
Aug 2020 - May 2022
University of New Hampshire Graduate School, Durham NH
College of Health and Human Services
Department of Health Management and Policy
GPA: 4.0
Fall 2021 Recipient, Merit-based Dinesh Thakur Health Analytics Scholarship Award
Thesis citation: Rasku, Kyle Partridge, "Using Machine Learning Techniques on Real-World Data to Understand the Characteristics of the Manchester, NH Health Care for the Homeless Patient Population for Risk Factor Identification and Intervention Improvement" (2022). Master's Theses and Capstones. 1572.
Download Paper & References: https://scholars.unh.edu/thesis/1572
Project Summary: Using Machine Learning to Improve Care for Homeless Patients
Code Highlights:
Accomplishments:
Established a new academic / institutional partnership between the University of New Hampshire Department of Health Management & Policy and Catholic Medical Center.
Designed and conducted a retrospective real-world evidence (RWE) study using electronic medical record (EMR) data, claims data and Collective Medical portal data.
Study protocol written and approved in Spring 2020, and Institutional Review Board (IRB) approval received in Fall 2021.
Collaborated with Health Care for the Homeless' (HCH) leadership to identify and outline the research's business case and goals.
Created analysis plan in collaboration with thesis committee.
Coded, evaluated and presented a biostatistics and machine learning analysis capable of identifying and describing patient service groups within the larger clinic population.
Successfully presented the results to key leaders and stakeholders at Catholic Medical Center and Health Care for the Homeless.
Successfully defended my approach, methods and results to thesis committee members and UNH Health Data Science faculty.
This in-depth, collaborative analysis helped the Health Care for the Homeless team by providing actionable insights capable of improving tailored care for homeless patients.
Washington State University, Spokane WA (Remote)
Internship under Dr. Erin Griffin, Director of Evaluation & Scholarly Associate Professor
Jul 2021 - Mar 2022
Mentored analysts in Evaluation and Assessment departments in the R programming language and basic statistical methods
Designed and created an 8-module course with videos, presentations and R notebook assignments to get analysts started with R programming, help them understand the programmer's context (including computing, the internet and the cloud), and teach them the basics of nonparametric statistical computing using the 14th most popular scripting language in the world
Summary and References: Teaching R to Analysts
Course Materials: GitHub repository, R-cs-course
Created, deployed and analyzed qualitative / Likert instruments in E-Flo for measurement of medical students' self-assessment and program assessment under the direction and guidance of the Director of Evaluation
Wrote SQL extraction jobs to create daily .csv extracts from E-Flo system
Wrote Python code to:
Scrape data from the web-based curriculum catalog
Automatically generate graphics for inclusion in compliance reports
Mine comments collected in qualitative instruments to perform network analysis, topic modeling and sentiment analysis with Latent Dirichlet allocation (LDA) and the Natural Language Toolkit (NLTK)
Automate the cleaning, merging, manipulation and summary of results collected from surveys within the E-Flo and Qualtrics systems
Under the direction and guidance of the Director of Evaluation, authored detailed course review reports for both compliance / administration and course designers and instructors, including succinct summary sections synthesizing key analysis findings.
Health Data Science (HDS) 804
Professor: John McInally MBA
Installed and configured Apache cTAKES (clinical Text Analysis Knowledge Extraction System) on both Ubuntu and Windows platforms.
Set up and configured API access to the National Institutes of Health's Unified Medical Language System (UMLS), a compendium of many controlled vocabularies in the biomedical sciences, to extract the ICD-10 vocabulary, then created, configured, and installed a cTAKES dictionary able to derive meaning from extracted ICD-10 codes.
Contributed code improvements to an existing Python library on GitHub for creating and processing pandas DataFrames mined from cTAKES output.
Developed an instructional video and guide for students using the cTAKES and ICD-10 dictionary tools, along with the Python ingestion library, to set up a data pipeline for natural language processing (NLP) of clinical text, including admission notes and discharge summaries.
Created installation procedure documentation so that UNH's IT group could easily install the Data Science Lab instance of cTAKES.
August 2019 - June 2021
University of California, Davis
Continuing and Professional Education
GPA: 4.0
Capstone Citation: Kapoor, D. & Rasku, K. (2021). "Geographical Variations in the United States Health Care System: A Multivariate Analysis of Cost-Related Factors in Aggregated Medicare Claims Data"; UC Davis Continuing and Professional Education.
Capstone Summary & References: Explaining Medicare Costs
Analysis Partner: Dhara Kapoor, MSPT
Code Highlights: GitHub repository, medicare-analysis (currently private, please request access)
Team Accomplishments:
Analyzed Medicare data, including beneficiary demographics, chronic conditions and inpatient and outpatient claims data including costs and ICD-9 codes
Produced Elixhauser scores for each visit, using CMS SAS macro
Performed extensive feature development on the data set with pandas in Python, using all the data to produce detailed aggregate profiles of the health of 2,909 out of 3,143 U.S. counties (92.6%), and added in the RWJF County Health Rankings data as additional measures of county-level health factors
Examined U.S. counties with no Medicare beneficiaries, discovering that in many of these counties average life expectancy is well below the age of eligibility
Produced a risk adjustment Gamma regression model against the outcome of cost per beneficiary, and identified 1,109 counties with higher than expected costs using SAS GLM
Performed a detailed factor analysis in Python using factor_analyzer on 1,042 of the counties identified as having higher-than-expected costs, and discovered three factors accounting for 52% of this higher-than-expected cost:
A hospitalization factor, accounting for the largest amount of cost variance (58%)
An illness / primary care factor, and
A high utilization factor
We produced a wholly innovative analysis of Medicare data, and were able to place our discoveries within the larger context of important previous U.S. health care cost analyses, including the Dartmouth Atlas Project and Steven Brill's The Bitter Pill.
Fall 2023 - Spring 2024 Course Development; Spring 2024 Course Debut
Program: UC Davis CPE Health Professions Post-Baccalaureate
Developing a new online course in Health Care Statistics for UC Davis CPE's Health Professions Post-Baccalaureate program, focusing on how health care professionals can understand, apply, and constructively critique methods commonly used in clinical and health outcomes research.
Developing curriculum and lessons using Canvas in partnership with UC Davis CPE program directors and instructional designers, to include a variety of learning modalities including:
Videos and lectures
Discussion assignments
Interpretation assignments
Self-grading quizzes
Google Colab Notebooks with hands-on Python examples, including:
Bayes' Theorem, applied to a healthcare example
Random Walks and the Central Limit Theorem
The Age Distribution of Cancer Incidence (Distribution Testing)
Descriptive Statistics & Measures of Central Tendency with the Cleveland Heart Attacks data set
T-Testing & ANOVA with the Cleveland Heart Attacks data set
Hypothesis Testing and Confidence Intervals
Correlation and Linear Regression
Multivariate Analysis - Logistic, Poisson Regression, MANOVA
Survival Analysis using the Framingham Data
Categorical Techniques: Chi-Squared, CMH, Correspondence Analysis
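To give a flavor of the first notebook topic above, here is a minimal sketch of Bayes' Theorem applied to a diagnostic test. It is an illustration only, not the course's actual notebook: the function name and the prevalence, sensitivity and specificity values are hypothetical, chosen to show how a positive result on a fairly accurate test can still leave a low probability of disease when the condition is rare.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Return P(disease | positive test) using Bayes' Theorem.

    prevalence  = P(disease)
    sensitivity = P(positive | disease)
    specificity = P(negative | no disease)
    """
    # Numerator: probability of having the disease AND testing positive.
    true_positive = sensitivity * prevalence
    # Probability of NOT having the disease but testing positive anyway.
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    # Denominator is P(positive), summed over both disease states.
    return true_positive / (true_positive + false_positive)


# Hypothetical screening test: 1% prevalence, 90% sensitivity, 95% specificity.
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.90, specificity=0.95)
print(f"P(disease | positive test) = {ppv:.3f}")  # prints 0.154
```

With these assumed inputs, only about 15% of positive results correspond to true disease, because false positives from the large healthy group outnumber true positives from the small affected group.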
Program: UC Davis CPE Healthcare Analytics
Spring 2023 and Fall 2023 (Present)
Continued all duties performed as TA in Spring 2022 (see below)
Created review videos teaching on topics of Regularization, using LASSO for variable selection, Logistic Regression, risk and odds ratio calculation and interpretation, and deriving model metrics from the Confusion Matrix
Created lesson on Algorithmic Bias to provide students with current information on this important topic, including insight into reasons for ongoing bias and possible ways to mitigate bias in both models and data presentations
Summary & References: About Algorithmic Bias
Program: UC Davis CPE Healthcare Analytics
Professor: Dr. Brian Paciotti
Spring 2022
Mentored students in SAS programming and statistical methods via weekly drop-in Zoom tutoring sessions
Coached students during Capstone analysis planning and communication of their statistical model interpretations
Assisted students with cohort creation planning and provided SAS code as needed
Provided example code in SAS for creating training, validation and testing sets and performing model evaluations with confusion matrices
Provided example code in SAS performing descriptive and predictive modeling using a variety of statistical methods including CMH and chi-squared testing, linear and logistic regression, regularization and survival analysis
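The example code described above was written in SAS. As a rough Python illustration of the same two ideas (splitting data into training, validation and testing sets, then evaluating a model from its confusion matrix), a minimal standard-library sketch might look like the following. The function names, split fractions and cell counts are all hypothetical and are not taken from the course materials.

```python
import random


def train_val_test_split(records, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle records reproducibly and split them into three disjoint sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def confusion_metrics(tp, fp, fn, tn):
    """Derive common evaluation metrics from the four confusion-matrix cells."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),   # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),     # positive predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical data: 100 patient IDs split 60/20/20.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))  # prints 60 20 20

# Hypothetical confusion-matrix counts from scoring a held-out set.
metrics = confusion_metrics(tp=40, fp=10, fn=20, tn=130)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Fixing the random seed keeps the split reproducible, which makes it easier to compare models fairly against the same validation and testing sets.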
Program: UC Davis CPE Healthcare Analytics
Professor: Dr. Erin Griffin
July 2020 - Dec 2020
Mentored students in SAS programming and statistical methods via weekly drop-in Zoom tutoring sessions.
Improved assignments and added additional SAS code examples as needed.
Provided feedback for students on how to communicate statistical and model interpretations and improve precision in analytics writing.