Approximate Bayesian model selection as an alternative to classical hypothesis testing: Writing Outside of the Statistical Literature
Thursday, October 26, 2023
Christopher T. Franck, Virginia Tech
By now, statisticians and the broader research community are aware of the controversies surrounding traditional hypothesis testing and p-values. Many alternative viewpoints and methods have been proposed, as exemplified by The American Statistician's recent special issue themed "World beyond p<0.05." However, it seems clear that the broader scientific effort may benefit if alternatives to classical hypothesis testing are described in venues beyond the statistical literature. This paper addresses two relevant gaps in statistical practice. First, we describe three principles statisticians and their collaborators can use to publish about alternatives to classical hypothesis testing in the literature outside of statistics. Second, we describe an existing BIC-based approximation to Bayesian model selection as a complete alternative approach to classical hypothesis testing. This approach is easy to conduct and interpret, even for analysts who do not have fully Bayesian expertise in analyzing data. Perhaps surprisingly, it does not appear that the BIC approximation has yet been described in the context of "World beyond p<0.05." We address both gaps by describing a recent collaborative effort where we used the BIC-based techniques to publish a paper about hypothesis testing alternatives in a high-end biomechanical engineering journal.
Dr. Franck is an Associate Professor in the Department of Statistics at Virginia Tech. He is an application-oriented methodologist who focuses on statistical problems in health applications, behavioral economics, probabilistic forecasting, bioinformatics, and other areas. His work includes Bayesian statistical methodologies that can be implemented automatically and/or with objective prior information, Bayesian model selection and averaging approaches and practical methods by which to assess their sensitivities, and finally, in cases where historical data is inadequate and contemporary information is available, he develops Bayesian methods that allow researchers to formally incorporate subjective information into their analyses and predictions.
Leading with Impact: Enhancing Effective Leadership in Biostatistics Research Teams
Friday, November 3, 2023
Stefanie Robel, University of Alabama at Birmingham
Kendra Sewall, Virginia Tech
Shwetal Mehta, Barrow Neurological Institute
Running an impactful biostatistics research program involves more than a solid plan for developing and applying mathematical and statistical techniques. Effective leadership and team management are critical yet frequently overlooked components, essential for research project success. Biostatisticians who lead or belong to a team often face diverse and challenging personalities, different motivation levels, and varying expertise, leading to potential conflict, miscommunication, and decreased productivity.
"Leading with Impact" is a webinar tailored for biostatisticians aiming to bolster their leadership skills. The session begins with a presentation that comprehensively covers three critical leadership aspects.
Participants will gain practical strategies for navigating these challenges, enhancing team performance and satisfaction.
The session continues with a panel discussion featuring Drs. Kendra Sewall, Stefanie Robel, and Shwetal Mehta. All panelists actively run research programs, hold administrative leadership positions, or act as leadership coaches. They will share their real-world experiences and practical solutions and address audience questions, further empowering attendees to emerge as confident and effective team leaders.
Kendra Sewall is an Associate Professor of Biological Sciences and an affiliate of the School of Neuroscience at Virginia Tech. Research in the Sewall lab addresses the impacts of ecological and social conditions on the brain and behavior using songbirds as a model system. Her work focuses on the impacts of urbanization on the behavior and underlying brain mechanisms of wild song sparrows. Kendra began researching evidence-based approaches to lab management and leadership when she started her lab. Her team is a testing ground for strategies developed in the private sector, by psychologists, and by neuroscientists, allowing her group to refine these strategies in an academic research setting. Her goal is to change the culture of academic science to encompass multiple prototypes and paths for ‘success,’ as a means of supporting healthier approaches to our work and greater diversity in the scientific community.
Dr. Shwetal Mehta is Associate Professor in the Department of Translational Neuroscience at Barrow Neurological Institute. She is also the Deputy Director and Chief Operating Officer of the Ivy Brain Tumor Center at Barrow Neurological Institute. Dr. Mehta received her MS from Tata Institute of Fundamental Research, India in Molecular Biology. Following her PhD in Molecular Genetics and Microbiology from the University of Texas at Austin, she joined Dana-Farber Cancer Institute & Harvard Medical School to pursue postdoctoral work in neurooncology. In 2013, Dr. Mehta was recruited to Barrow Neurological Institute, where her laboratory is focused on understanding the molecular basis of treatment resistance in brain cancer. She is a recognized leader in translational science for brain tumors and has helped develop an array of early-phase pharmacodynamic- and pharmacokinetic-driven studies for patients with malignant brain tumors.
Statistical Challenges in Clinical Trials for COVID-19
Friday, June 2, 2023
10 a.m. to 12 noon
National Institute of Allergy and Infectious Diseases
Lori Dodd, PhD, is a biostatistician and section chief for the Clinical Trials Research Section within the Biostatistics Research Branch, Division of Clinical Research, National Institute of Allergy and Infectious Diseases, where she primarily collaborates on infectious disease clinical trials. Lori served as principal statistician for the Pamoja Tulinde Maisha (PALM; “Together, Save Lives”) randomized controlled trial of Ebola virus disease therapies and the Adaptive COVID-19 Treatment Trial (ACTT) series of randomized controlled trials. Prior to joining NIAID, she worked as a mathematical statistician at the National Cancer Institute. Dr. Dodd earned her PhD from the Department of Biostatistics at the University of Washington.
One Uncertainty Too Many? Managing Unknowns in Clinical Trials of Outbreak Diseases
The West African Ebola virus disease outbreak taught us the importance of starting clinical trials rapidly to find effective interventions before the outbreak ends, while interventions may still be useful to reduce disease burden and suffering. However, during novel infectious disease outbreaks, there are often no precedents for trial design and limited data exist to design an optimal trial. Waiting for more certainty about key design characteristics may delay trial implementation too significantly, creating a dilemma: start now with an imperfect design, or delay study start until the design is better defined? In this talk, I will review my experiences designing Ebola and COVID-19 treatment studies and emphasize the importance of incorporating sufficient design flexibility without compromising scientific rigor.
National Institutes of Health
Dr. Sally Hunsberger has worked at the National Institutes of Health for 30 years and has focused on clinical trials. She began her career at the National Heart, Lung, and Blood Institute and then moved to the National Cancer Institute after 10 years. She worked at the National Cancer Institute for 12 years, specializing in breast cancer and pediatric clinical trial research. Dr. Hunsberger currently works at the National Institute of Allergy and Infectious Diseases. She designs phase I, II, and III studies and analyzes data for natural history studies and phase I, II, and III studies. She is the executive secretary for the co-infections and complications DSMB, which reviews COVID-19 treatment studies, and was the executive secretary for the COVID-19 vaccine DSMB. She also serves on the pediatric heart network DSMB for the NHLBI.
Issues Encountered While Monitoring the US Government-Supported Covid-19 Vaccine Trials
Operation Warp Speed (OWS) was a partnership among vaccine companies, government agencies and academia created during the Covid-19 global pandemic. A primary goal was to accelerate the development of Covid-19 vaccines. A fundamental principle that the OWS program established was that the National Institutes of Health would oversee a single Data and Safety Monitoring Board (DSMB) to review and monitor all OWS vaccine trials. This was implemented by having a statistician from NIAID be the executive secretary for the Board. The formation of OWS and its requirement for a single DSMB played an important role in the rapid development and delivery of effective Covid-19 vaccines. In this talk I will describe the unique issues and challenges faced while monitoring these trials and provide suggestions for future similar endeavors.
Biostatistics Research Branch, NIAID
Michael Proschan received his Ph.D. in Statistics from Florida State University in 1989. He has been a Mathematical Statistician in the Biostatistics Research Branch at the National Institute of Allergy and Infectious Diseases since January of 2006. Prior to coming to NIAID, he spent 16 years at the National Heart, Lung, and Blood Institute. He has co-authored three books: Statistical Monitoring of Clinical Trials: A Unified Approach, with Gordon Lan and Janet Wittes (Springer, 2006); Essentials of Probability Theory for Statisticians, with Pamela Shaw (CRC Press, 2016); and Statistical Thinking in Clinical Trials (CRC Press, 2022). He is a Fellow of the American Statistical Association. Dr. Proschan is also an Adjunct Professor for the Advanced Academic Programs at Johns Hopkins University.
Multiplicity Issues in Platform Trials of COVID-19
The COVID-19 pandemic led to platform trials comparing several active arms to a common control arm. An important question is whether to make a multiple comparison adjustment, as is traditionally done. Pressure to find an effective treatment quickly for a dangerous disease argues against such an adjustment, but statistical concerns seem to support an adjustment. We will take a close look at the statistical arguments and give both sides of the debate. Then we will consider a multiple comparison issue arising from the first trial to show benefit of an intervention in COVID-19, namely the Adaptive COVID-19 Treatment Trial (ACTT-1) of remdesivir versus placebo. There was a statistically significant effect on the primary endpoint of time to recovery and on an important secondary endpoint of the World Health Organization’s 8-point ordinal scale at Day 15. Results for mortality did not quite reach statistical significance, but there was an apparent effect in ordinal scale 5 (OS-5), patients requiring supplemental oxygen (not including high-flow or invasive mechanical ventilation). Is the apparent effect of remdesivir on mortality in OS-5 real, or is it just the play of chance? If it is real, could it be part of an overall benefit of remdesivir on mortality, or is it OS-5-specific? We will address these important questions.
Powerful Data Presentations with Success = (PD)²
Friday, April 28, 2023
1 p.m. to 3 p.m.
Jennifer H. Van Mullekom, PhD
After a 20-year career in industry, Dr. Jennifer Van Mullekom joined Virginia Tech in Fall 2016 as the Director of the Statistical Applications and Innovations Group (SAIG) and an Associate Professor of Statistical Practice. In addition to directing SAIG, she teaches collaboration skills as well as a design of experiments course to graduate students.
Formerly, she was a Senior Consulting Statistician and Certified Six Sigma Master Black Belt in DuPont's Applied Statistics Group, supporting the DuPont Protection Technologies business. At DuPont, she provided statistical leadership to the Tyvek® Medical Packaging Transition Project in the areas of product development, commercialization, and regulatory.
Jen is active in professional societies, holding leadership roles in the American Statistical Association and the American Society for Quality. She holds three US Patents and has also worked at Lubrizol and Capital One. Jen is a regular participant on topics such as communication, collaboration, leadership, and ethics at the Conference on Statistical Practice. She holds an MS and PhD in Statistics from Virginia Tech and a BS in Mathematics and Mathematics Education from Concord University.
The amount of available data has exploded in the past ten years. Those holding quantitative roles in academia, government, and industry have been called upon to analyze and interpret it. We must turn data into information so that researchers can advance science and decision makers can act upon these advances. This is true whether you are assessing the performance of a new therapy, the impact of public health policy, or modeling disease transmission. It is also true for those presenting new statistical analysis methods. The point of a presentation is to synthesize facts in a meaningful, digestible way for the audience. Yet many presentations create confusion because they lack clarity. Even worse, the lack of clarity leads to poor decisions built on erroneous interpretations.
Developing good communication skills around scientific and statistical presentations is essential for those in quantitative fields. This talk will discuss a four-phase process for excelling at these types of presentations. The four phases are Prepare, Design, Practice, and Deliver, or (PD)² for short. In the context of the process, emphasis will be placed on explaining complex concepts, formatting your results for clarity, designing your slides to facilitate interpretation, and engaging a non-quantitative or mixed-level audience. The material will also cover incorporating story into a scientific presentation and effective principles of data visualization. You will leave the talk with an overall framework for tackling your next formal presentation as well as tips and tricks you can immediately use in informal team interactions.
Statistical Issues in Responsible Conduct of Research
Friday, March 3, 2023
10 a.m. to 12 p.m.
Sarah J. Ratcliffe, PhD
University of Virginia School of Medicine
Sarah J. Ratcliffe, PhD, is Professor and Director of the Division of Biostatistics, and Senior Vice Chair for Research in the Department of Public Health Sciences at the University of Virginia School of Medicine. She is Director of the Research Methods core of iTHRIV (CTSA), PI of the U24 data coordinating center grant for the multi-site DIVA trial, and MPI of an R01 developing prediction algorithms in transplant patients. Her background is in statistics and computing, with specific training and expertise in the analysis of correlated data, especially longitudinal, time series and functional data, predictive modeling, missing data, as well as expertise in data and analysis ethics. She was the 2019 ENAR President and currently serves on the IBS Executive Board.
Technology has made it easier to share and analyze data. As statisticians, we are responsible for the data that is in our “custody,” and how it is used. This WebENAR will discuss some of the ethical issues that arise in collaborative research, the importance of reproducible analyses, and the impact that both can have on “good science.”
Rethinking Race-ethnicity: Introducing Novel Survey-based Measures of Lifetime Experience of Discrimination and Stress
Friday, November 4, 2022
2 to 3 p.m. Eastern
Felicity T. Enders, PhD, MPH
Dr. Felicity Enders is a Professor of Biostatistics at Mayo Clinic. She is a consulting statistician with over 170 publications and an h-index of 42. Dr. Enders’ personal research focuses on educating researchers. For about 15 years, this took the form of statistics education, in which she leveraged her award-winning expertise as a statistics educator to develop a national statistics education research team. While this team is ongoing, Dr. Enders’ research interests have evolved to 1) hidden curriculum for research, a topic that provides a novel lens to understand and overcome barriers for research for people who are diverse and 2) life course measures of discrimination and stress, which she has developed and is testing. Both are aligned with Dr. Enders’ joint leadership for education and for diversity. In education, Dr. Enders is the Associate Program Director for Mayo’s TL1 program, Program Director for the Minnesota Learning Health System K12, and Program Director for the Kern Scholars program. Dr. Enders was recently named as the Director of the Mayo Clinic Office for Research Equity, Inclusion, and Diversity. Dr. Enders is well known nationally: she is a Fellow of the American Statistical Association and serves on the Board of Directors for the Association for Clinical and Translational Science.
Though race-ethnicity is not a biological variable, it is included in nearly every medical study and is often both statistically and meaningfully significant. New measures are critically needed that will allow biomedical researchers to disentangle race-ethnicity from the true individual, interpersonal, and structural causes of health disparities. In this talk, we introduce novel measures using simple survey items to capture self-reported experience of discrimination and stress spanning the life course, agnostic to source. Early results from an employee survey will be presented.
Collab: Heterogeneous Causal Effects of Neighborhood Policing in New York City with Staggered Adoption of the Policy and Evaluating Methods to Estimate the Effect of State Laws on Opioid-related Outcomes in the Presence of Confounding
Friday, October 21, 2022
10 a.m. to 12 p.m. Eastern
Joseph L. Antonelli
University of Florida
Joseph L. Antonelli is an assistant professor of statistics at the University of Florida, who works on causal inference, high-dimensional modeling, and Bayesian nonparametric methodology. He is motivated by applications in criminology and air pollution epidemiology.
In New York City, neighborhood policing was adopted at the police precinct level over the years 2015-2018, and it is of interest to both (1) evaluate the impact of the policy, and (2) understand what types of communities are most impacted by the policy, raising questions of heterogeneous treatment effects. We develop novel statistical approaches that are robust to unmeasured confounding bias to study the causal effect of policies implemented at the community level. We find that neighborhood policing decreases discretionary arrests in certain areas of the city, but has little effect on crime or racial disparities in arrest rates.
Beth Ann Griffin
Beth Ann Griffin is a Senior Statistician at RAND where she co-directs the RAND Center for Causal Inference and the RAND-USC Opioid Policy Center for research excellence. She is devoted to finding ways to improve our ability to draw more robust causal inference using observational study data.
The nation is in the midst of an opioid-related public health crisis. In response, states have enacted a heterogeneous collection of policies aimed at reducing mortality and morbidity, producing a state policy landscape that is complex and dynamic. Understanding how best to estimate policy effects is important and several unanswered questions remain, particularly about optimal methods for handling the confounding bias that results when states implementing a policy differ from states that do not. Using simulations, we examined the statistical properties of several statistical methods to estimate the effects of state-level opioid policies to empirically identify the best methods for handling confounding. Findings from these simulations can help identify which models are best to robustly estimate state-level policy effects on opioid-related outcomes. Robust and powerful methods are needed to help ensure future policy decisions are based on results from well-designed evaluations that yield accurate policy effects.
Recent Developments in the Analysis of Adaptive Designs and Their Relevance to Platform Trials
Friday, June 3, 2022
10 a.m. to 12 p.m.
Novartis Pharma AG, Basel, Switzerland
Ekkehard Glimm, PhD, is Senior Director in Biostatistics at Novartis Pharma in Basel, Switzerland. He has a PhD in mathematics from the University of Magdeburg in Germany on a topic from multivariate statistics. Ekkehard joined Novartis in 2005, working first in Oncology Biostatistics and in the Statistical Methodology group since 2006. Since 2021, he has also been an adjunct professor for biostatistics at the Medical Faculty of the University of Magdeburg, Germany.
Since joining Novartis, Ekkehard's work has focused on adaptive clinical trial designs, methods for multiple endpoints in clinical trials and analysis of rare adverse events in clinical projects. He has authored and co-authored around 50 papers published in peer-reviewed scientific journals and is an associate editor of Pharmaceutical Statistics and the Biometrical Journal.
Confirmatory platform trials have recently gained popularity in the pharmaceutical industry. Several such trials have been initiated in the past years; many of these (such as GBM Agile and PANCAN Precision Promise) are multi-sponsor trials. In contrast to similar master protocol designs (such as basket trials) which are well-established in earlier phases of clinical development, the use of platform trial designs for confirmatory studies is still subject to some debate.
In particular, there is currently no consensus on the importance of type I error control. In the spectrum from strict control of the familywise error rate (FWER) across the entire platform to no type I error control at all, few researchers take the extreme positions, but within this range, opinions vary. Among the alternative concepts, compromises such as population-wise and treatment-wise error rate control have been suggested.
This talk will discuss some of these concepts. Subsequently, we will consider how type I error rate control can be guaranteed in platform trials when techniques such as response adaptive randomization or permitting new treatments into an ongoing platform trial are used. It turns out that adaptive design methodology can be adapted to such uses, but a power loss cannot be avoided.
Combining the Three Cultures of Quantitative Decision Making in Drug Development
Friday, May 20, 2022
1-3 pm EST
Dr. David Ohlssen
Dr. David Ohlssen is currently Advanced Exploratory Analytics head, within the Novartis Advanced Methodology and Data Science group, based in East Hanover, New Jersey. Since joining Novartis in 2007, he has developed a broad range of experience in applying novel quantitative approaches within a drug development setting. His current focus involves driving the appropriate application of data science, machine learning and advanced modeling in a drug development setting. As part of the Novartis data digital transformation, he is heavily involved in large scale collaborations with the Oxford Big Data Institute, Carnegie Mellon University, and the Food and Drug Administration. Each of these projects examines databases that comprise a combination of clinical, omics and imaging data, with the aim of gaining a better understanding of disease progression and a more personalized approach to treatment by using combinations of statistics, machine learning and causal inference.
Previously, after completing his PhD in Biostatistics at the University of Cambridge, he worked as a research fellow at the MRC Biostatistics Unit (Cambridge UK), where his interests included: diagnostics for Bayesian models, novel clinical trial design and statistical methods for the profiling of health-care providers. In 2016 he received the Novartis leading scientist award for his contributions to quantitative decision making in drug development and in 2021 he became a Fellow of the American Statistical Association for advancing the role of statistical and data sciences in the pharmaceutical industry.
In this talk we shall argue that approaches to quantitative decision making can be divided into three areas or cultures: First, approaches based on classical statistical thinking that aim to use the tools of statistical inference and experimental design to provide solutions with well understood operating characteristics. Second, those based on modeling to provide a good approximation and then potentially use simulation to propagate uncertainty around a target, leading to a basis for decision making. Finally, those based on machine learning, which is often used to reduce complex high dimensional problems and provide the basis for prediction.
We shall review problems in a drug development setting using a case-study from a psoriatic arthritis drug development program to illustrate the richness and complexity of data collected during clinical development. Next, we shall show how each of the cultures can provide excellent solutions to certain problems. However, with increased use of data from a variety of sources, there is often a need to combine these three cultures to bring an appropriate solution to a problem. To illustrate this, we shall look at a class of problems where two treatments or options need to be compared but a randomized experiment is not possible. We shall review solutions that combine machine learning, statistical inference and potentially modeling. In addition, we shall review the problem of identifying prognostic and predictive factors using the knockoff approach, which combines control of operating characteristics with realistically complex modeling.
How Early Career Survival Skills Often Turn Into Mid-career Bottlenecks: A Researcher’s Journey Perspective
Friday, May 6, 2022
1-3 pm EST
Dr. Morgan Giddings
Dr. Giddings was trained in the fields of Physics and Computer Science at The University of Utah (cum laude) and in Bioinformatics at The University of Wisconsin-Madison for her PhD. She founded a lab focused on proteomics and systems biology at The University of North Carolina-Chapel Hill, which she quickly grew to having two consistently funded R01 grants, with an RC2 (Grand Opportunities) and U24 awards in addition. After building the lab to over 16 people and $1M/year in funding and being promoted with tenure to Associate Professor, she decided to start a business helping other faculty achieve similar successes with grant funding and career development.
In 2010, she left UNC Chapel Hill for a position as a full professor at Boise State University, and in 2013 resigned from that position to focus on her faculty training business full-time. In 2012 she became a #1 Amazon author for her book Four Steps to Funding. She was involved in the ENCODE project which has led to several very highly cited papers. She has been involved in multiple successful commercial ventures, and she has supported faculty clients across the world in obtaining increased grant funding, along with other areas of career success. She has been invited to locales as disparate as Italy, Sweden, and all over the United States to give seminars on grant writing, productivity, and career fulfillment.
“Work (very) hard and you will succeed.” Or is it “Don’t work too hard, because you need balance and don’t want to burn out.” As researchers, we hear many messages, often conflicting, about how to “succeed” in these highly challenging careers. Which do we listen to, and when? What are the most important skills to succeed while still maintaining a semblance of balance between work, family, and other pursuits? At a time when the mental health of researchers is a growing concern, there’s a need to find better answers. A key impediment in that search is that many skills and strategies we adopt to survive at one stage of a career can be counter-productive at later stages of a career. For example, as a graduate student it is important to listen to authority. As a mid-career researcher, it’s important to be the authority. Yet between the common one-size-fits-all advice, and the natural human tendency to keep doing what worked before, we often create our own bottlenecks to balanced and fulfilling advancement in a career. We developed the Researcher’s Journey framework to highlight the major stages of a research career, the key differences in the challenges presented at each stage, and the unique skills needed for overcoming those challenges. Using this framework, we will examine some specific differences between early and mid-career researchers, looking at both the stage-specific required skills, and the shifts in perspective required to address them. Our goal is to provide the audience with a framework to better understand where they are in the journey and what new skills and strategies may be needed, in order to move through the bottlenecks and create a more balanced and fulfilling career trajectory.
A Novel Approach for the Analysis of Randomized Clinical Trials
Friday, April 22, 2022
10 a.m. to 12 p.m.
Devan V. Mehrotra
Biostatistics and Research Decision Sciences, Merck & Co., Inc.
Devan V. Mehrotra, PhD, is Vice President, Biostatistics, at Merck Research Laboratories (MRL). Over the past 30 years, he has made significant contributions towards the research, development and regulatory approval of medical drugs and vaccines across a broad spectrum of therapeutic areas. He was awarded an MRL Presidential Fellowship in 2012. Dr. Mehrotra is also an Adjunct Associate Professor of Biostatistics at the University of Pennsylvania and an elected Fellow of the American Statistical Association. He has served as a subject matter expert for the Bill and Melinda Gates Foundation, the US National Academy of Sciences, the Coalition for Epidemic Preparedness Innovations, and the International Council on Harmonization. His current research focus is on statistical innovation for enabling personalized medicine.
Randomized clinical trials use either stratified or unstratified randomization. For the former, the stratification factors are typically categorical baseline covariates (region, age group, ECOG status, etc.) that are presumed to influence the clinical endpoint of interest. We caution that uncertainty at the trial design stage can contribute to "ineffective" stratification and the corresponding stratified analysis can lead to an adversely biased or imprecisely estimated treatment effect, especially for trials designed to assess whether a test treatment prolongs survival relative to a control treatment. To mitigate this non-trivial risk, we show how “effective” stratification can be achieved using a pre-specified treatment-blinded algorithm applied to the clinical trial outcomes, followed by a power-boosting stratified analysis after treatment unblinding. We illustrate the utility of our proposed ‘5-STAR’ approach relative to current practice using a graphical summary of p-values and hazard ratio estimates from 23 real data examples. We also discuss alignment of our novel proposal with FDA guidance on covariate-adjusted analyses, and with related publications by John Tukey, Stuart Pocock, and others. (An R package to implement 5-STAR is available at https://github.com/rmarceauwest/fiveSTAR)
Collaboration: Pairwise Survival Analysis and Causal Inference for Infectious Disease Epidemiology and Understanding Transmission Dynamics of Emerging Infectious Diseases from Contact-tracing Data
Friday, February 25, 2022
10 a.m. to 12 p.m. ET
Ohio State University
Eben Kenah is an associate professor of biostatistics in the College of Public Health at the Ohio State University in Columbus, Ohio. His research interests include statistical methods for infectious disease epidemiology, epidemiologic methods, survival analysis, causal inference, epidemic models, and networks.
Pairwise survival analysis and causal inference for infectious disease epidemiology: Causal inference for infectious disease transmission is complicated because outcomes in different individuals are inherently dependent, which leads to interference or spillover of treatment effects. For example, individuals who are not vaccinated are partly protected when individuals around them are vaccinated. An established approach to this problem is to define causal effects in populations (e.g., a vaccination program in a village) and then attempt to measure these directly. An alternative approach is to define causal effects in pairs of individuals and estimate them using methods from pairwise survival analysis. This approach is likely to yield results that generalize more easily between populations, and it allows more detailed mechanistic insight into the effects of interventions. These pairwise causal effects can be used as the basis of epidemic models that allow estimation of the causal effect of an intervention in a population. This approach places greater emphasis on the longitudinal study of transmission in close contact groups than has been evident in the ongoing COVID-19 pandemic.
University of Florida
Yang Yang is an associate professor of biostatistics in the College of Public Health and Health Professions as well as the Emerging Pathogens Institute at the University of Florida. His research focuses on statistical methods for disease transmission dynamics, efficacy evaluation, missing data and surveillance bias. He also works on ecological modeling and genetic association for clinical outcomes.
Understanding transmission dynamics of emerging infectious diseases from contact-tracing data: Contact-tracing data provide crucial and reliable information for understanding transmissibility, risk drivers and intervention efficacies for newly emerging infectious diseases. Analysis of such data is often challenging, mainly due to surveillance bias, missing data and lack of biological understanding, which have been further exacerbated by COVID-19. We examine several of these challenges: (1) diagnostic bias towards symptomatic infections; (2) presymptomatic infectivity, i.e., the latent period is shorter than the incubation period; and (3) reporting bias, where only confirmed cases are reported but uninfected close contacts remain unknown. These issues, if left unaddressed, can lead to erroneous estimation of key epidemiological parameters. I will discuss our experiences in the analysis of household transmission of SARS-CoV-2 in Wuhan, China and nosocomial transmission of MERS-CoV in the Kingdom of Saudi Arabia. I will introduce some statistical adjustments we have adopted to address the aforementioned challenges.
Estimands, Estimators, and Estimates: Aligning Target of Estimation, Method of Estimation, and Sensitivity Analysis, with Application to the COVID-19 Pandemic
Friday, November 19, 2021
10 a.m. to 12 p.m. ET
Bharani Dharan is a Global Group Head, Biostatistics in the Oncology development analytics unit at Novartis Pharmaceuticals, East Hanover. He has managed multiple compounds in late-phase clinical trials in oncology and has experience across multiple disease indications. He has more than 20 years of experience in the pharmaceutical industry. Prior to joining Novartis, he was a project statistician at GlaxoSmithKline. In addition to his current role at Novartis, he also leads the internal cross-functional estimand workstream. His areas of interest include estimands, group sequential designs, adaptive designs and multiplicity.
Frank Bretz is a Distinguished Quantitative Research Scientist at Novartis. He has supported the methodological development in various areas of pharmaceutical statistics, including adaptive designs, dose finding, estimands, and multiple testing. Frank currently holds adjunct professorial positions at the Hannover Medical School (Germany) and the Medical University of Vienna (Austria). He was a member of the ICH E9(R1) Expert Working Group on 'Estimands and sensitivity analysis in clinical trials.' Frank is a Fellow of the American Statistical Association.
Kelly Van Lancker
Johns Hopkins University
Kelly Van Lancker recently obtained her PhD in statistics at Ghent University. At the beginning of September, she started a postdoctoral research position at the Johns Hopkins Bloomberg School of Public Health. Kelly's research focuses on the use of causal inference methods in clinical trials.
Stijn Vansteelandt is an expert in statistical methodology for causal inference. He has authored over 200 peer-reviewed publications in international journals on a variety of topics in biostatistics, epidemiology and medicine, such as the analysis of longitudinal and clustered data, missing data, mediation and moderation/interaction, instrumental variables, family-based genetic association studies, analysis of outcome-dependent samples and phylogenetic inference. He has recently finished a term as Co-Editor of Biometrics, the flagship journal of the International Biometric Society, and has previously served as Associate Editor for the journals Biometrics, Biostatistics, Epidemiology, Epidemiologic Methods and the Journal of Causal Inference. In 2020, he joined the editorial board of the Journal of the Royal Statistical Society - Series B. His recent work focuses on strategies for obtaining valid inference for statistical and causal effect estimands when the analysis involves data-adaptive methods, such as variable selection or machine learning. Motivated by applications in (personalised) medicine, additional strands of work focus on intercurrent events in clinical trials, and on causal prediction based on electronic health records.
The ICH E9(R1) Addendum on 'Estimands and Sensitivity Analysis in Clinical Trials' introduced a framework to align planning, design, conduct, analysis, and interpretation of clinical trials. When defining the clinical question of interest, clarity is needed about 'intercurrent events' that affect either the interpretation or the existence of the measurements associated with the clinical question of interest, such as discontinuation of assigned treatment, use of an additional or alternative treatment and terminal events such as death. The description of an estimand should reflect the clinical question of interest in respect of these intercurrent events, and the Addendum introduces strategies to reflect different questions of interest that might be posed. The choice of strategies can influence how more conventional attributes of a trial are reflected when describing the clinical question, for example the treatments, population or the variable (endpoint) of interest.
In this seminar we briefly introduce the estimand framework according to the ICH E9(R1) Addendum and describe various strategies for addressing intercurrent events when defining the clinical question of interest. We then reflect on the experience and lessons learned of implementing the Addendum through an internal cross-functional and cross-divisional working group that encompasses various estimand initiatives. Next, we discuss in detail the hypothetical estimand strategy, where a scenario is envisaged in which the intercurrent event would not occur. The Addendum acknowledges that a wide variety of hypothetical scenarios can be envisaged, but it also clarifies that some scenarios are likely to be of more clinical or regulatory interest than others. We share our experiences and try to provide some guidance on their use in clinical trial practice. Finally, we demonstrate how the estimand framework can usefully be applied to clinical trials impacted by the COVID-19 pandemic to address potential pandemic-related trial disruptions and embed them in the context of study objectives and design elements. We introduce different hypothetical estimand strategies and review various causal inference and missing data methods such as multiple imputation and (augmented) inverse probability weighting for the estimation step. To clarify, we describe the features of a stylized trial in neuroscience, and how it may have been impacted by the pandemic. This stylized trial will then be re-visited by discussing the changes to the estimand and the estimator to account for pandemic disruptions.
Novel Applications of Real-world Data to Support Clinical Trials
Friday, June 11, 2021
1 p.m. to 3 p.m. ET
Ram Tiwari, PhD
Bristol Myers Squibb
Ram C. Tiwari, Ph.D. has been the Head of Statistical Methodology at BMS since February 1, 2021. His prior positions include Director of the Division of Biostatistics at CDRH (2016-2020) and Associate Director for Science and Policy in the Office of Biostatistics, CDER (2006-2016) at FDA, Mathematical Statistician and Program Director at NCI/NIH (2000-2006), and Professor and Chair of the Department of Mathematics at the University of North Carolina at Charlotte (1986-2000). He received his MS and PhD degrees from Florida State University in Mathematical Statistics. He is a Fellow of the American Statistical Association and a past President of the International Indian Statistical Association. Dr. Tiwari has over 200 publications on statistical methods, and a forthcoming book, “Signal Detection for Medical Scientists: Likelihood Ratio Test-based Methodology,” to be published by Taylor & Francis.
Wendy Wang, PhD
Wendy Wang is a Quantitative Scientist at Flatiron Health, where her research focuses on leveraging real-world data to improve cancer care among patients. Her work extends across various areas, including enhanced survival extrapolation, racial disparities in treatment and end-of-life treatment in cancer care. Prior to joining Flatiron, she received her PhD in Epidemiology from the University of Washington in 2017, and completed her post-doctoral training at Fred Hutch, with a focus on statistical genetics and cancer epidemiology.
Devin Incerti, PhD
Devin Incerti is a Principal Data Scientist at Genentech. He received his PhD from Princeton University’s School of Public and International Affairs and worked as an economist specializing in estimating the value of health technologies prior to joining Genentech. He enjoys working across disciplines and has collaborated with researchers in many fields such as bioinformatics, medicine, statistics, computer science, epidemiology, economics, and political science. His research interests generally lie in the application and development of quantitative methods and software for problems in healthcare. He is currently working on a number of topics related to health technology assessment and analyses of real-world data, including software for health economic simulation modeling, causal inference methods for supplementing randomized and single arm clinical trials with observational data, and prognostic survival modeling with high dimensional data.
Katherine Tan, PhD
Katherine Tan, PhD is a Quantitative Scientist at Flatiron Health. She is currently leading projects in real-world control arms, hybrid control arms, endpoints, and imaging (scans), where her work has highlighted ways to apply robust statistical design thinking when working with heterogeneous observational data sources such as real-world healthcare data. She received her PhD in Biostatistics from the University of Washington, Seattle.
Real-world data (RWD) have played an increasingly important role in healthcare decisions, for example supporting the design, analysis, and contextualization of clinical trials. In this webinar, we invite panelists from industry with backgrounds in statistics, clinical trials, data science, epidemiology, and health outcomes research to discuss novel applications where RWD can be used to support clinical trials.
We discuss propensity-score based methods to leverage RWD as an external data source to augment single-arm clinical trials, enhanced extrapolation of long-term clinical trial survival outcomes using electronic health record (EHR)-derived RWD, and enrollment projection for a prospective pragmatic trial design that utilizes RWD. Finally, we tie the three topics together with a panel discussion.
Incorporating Diversity, Equity and Inclusion in Biostatistics Courses
Friday, April 23, 2021
10 a.m. to 12 p.m. Eastern
Dr. Scarlett Bellamy
Dornsife School of Public Health
Dr. Scarlett Bellamy is a Professor of Biostatistics and Director of the Graduate Programs in Biostatistics, Department of Epidemiology and Biostatistics at Drexel University. She also serves as the Associate Dean of Diversity and Inclusion at Drexel’s Dornsife School of Public Health. Prior to her current position at Drexel, she was a Professor of Biostatistics in the Perelman School of Medicine at the University of Pennsylvania. She is also Co-Director of the Biostatistics and Informatics Core (BIC) and serves as a senior biostatistician for the Center for Health Equity Research and Promotion (CHERP) at the Corporal Michael J. Crescenz VA Medical Center. Dr. Bellamy’s research interests are in the design, analysis and implementation of cohort and longitudinal studies, particularly group/cluster randomized trials. She has published in a number of clinical and public health disciplines including: statistical methods; behavioral economics; HIV risk reduction; clinical investigations of HIV, cancer, cardiovascular health, obesity and physical activity, adult lung injury and lung transplantation; and critical care.
Dr. Reneé Moore
Reneé H. Moore, PhD (she/her) is Research Associate Professor and Director of the Biostatistics Collaboration Core at Emory University. She earned a Bachelor of Science in mathematics and completed the secondary mathematics education program at Bennett College and earned her PhD in Biostatistics from Emory University. In her first faculty position at the University of Pennsylvania, Dr. Moore was actively involved in designing and implementing clinical trials via Data Coordinating Centers and was the faculty statistician in the Center for Weight and Eating Disorders. Next Dr. Moore taught up to seven classes per year and continued her obesity research at North Carolina State University, Department of Statistics. In 2015, Dr. Moore returned to Emory University. She spends her time mentoring, teaching, and collaborating with clinical investigators from Penn, UNC, Emory, and beyond. Dr. Moore is a Fellow of the American Statistical Association (2017). She is the current Treasurer of ENAR. Dr. Moore is a past chair of the ASA Committee on Minorities in Statistics (past chair of StatFest), past co-chair of the ENAR Fostering Diversity in Biostatistics Workshop, and remains very active in these and other initiatives within ENAR and ASA.
Andrea Lane (she/her) is a biostatistics PhD candidate at Emory University. Prior to entering the PhD program, Andrea graduated from UNC Chapel Hill with bachelor’s degrees in biostatistics and mathematics. Her dissertation work is in mediation modeling with primary application to DNA methylation data.
As we embrace conversations about improving diversity, equity, and inclusion (DEI) in the field of biostatistics, ideally, these perspectives should appear in every aspect of the profession, including how we teach our trainees. By incorporating DEI into biostatistics pedagogy, instructors and trainees can cultivate a more holistic understanding of both historical background and current challenges in the field by enabling all students to see themselves in the content and how they might make important contributions to both statistical theory and application.
We will begin this WebENAR by putting this into historical context to establish the importance of incorporating DEI into biostatistics training and coursework. We will then introduce practical examples from our own experiences of how to introduce these concepts into courses without compromising course objectives and without requiring additional time for these modifications. The session will conclude with open discussion where we encourage all those attending the WebENAR to share their own experiences and ideas for making biostatistics courses more diverse, equitable, and inclusive.
Slamming the Sham: A Bayesian Model for Adaptive Adjustment with Noisy Control Data
Friday, March 26, 2021
10 a.m. to 12 p.m. ET
Andrew Gelman is a professor of statistics and political science at Columbia University. He has received the Outstanding Statistical Application award three times from the American Statistical Association, the award for best article published in the American Political Science Review, and the Committee of Presidents of Statistical Societies award for outstanding contributions by a person under the age of 40. His books include Bayesian Data Analysis (with John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Don Rubin), Teaching Statistics: A Bag of Tricks (with Deb Nolan), Data Analysis Using Regression and Multilevel/Hierarchical Models (with Jennifer Hill), Red State, Blue State, Rich State, Poor State: Why Americans Vote the Way They Do (with David Park, Boris Shor, and Jeronimo Cortina), A Quantitative Tour of the Social Sciences (co-edited with Jeronimo Cortina), and Regression and Other Stories (with Jennifer Hill and Aki Vehtari).
Andrew has done research on a wide range of topics, including: why it is rational to vote; why campaign polls are so variable when elections are so predictable; why redistricting is good for democracy; reversals of death sentences; police stops in New York City; the statistical challenges of estimating small effects; the probability that your vote will be decisive; seats and votes in Congress; social network structure; arsenic in Bangladesh; radon in your basement; toxicology; medical imaging; and methods in surveys, experimental design, statistical inference, computation, and graphics.
It is not always clear how to adjust for control data in causal inference, balancing the goals of reducing bias and variance. We show how, in a setting with repeated experiments, Bayesian hierarchical modeling yields an adaptive procedure that uses the data to determine how much adjustment to perform. The result is a novel analysis with increased statistical efficiency compared to the default analysis based on difference estimates. We demonstrate this procedure on two real examples, as well as on a series of simulated datasets. We show that the increased efficiency can have real-world consequences in terms of the conclusions that can be drawn from the experiments. We also discuss the relevance of this work to causal inference and statistical design and analysis more generally. This is joint work with Matthijs Vákár. Before attending the talk, people are encouraged to read our paper: http://www.stat.columbia.edu/~gelman/research/unpublished/chickens.pdf
Revisiting ICH E9 (R1) During the COVID-19 Pandemic
Friday, January 22, 2021
10 a.m. to 12 p.m. ET
Yongming Qu, Eli Lilly
Yongming Qu is currently a Sr. Research Fellow at Eli Lilly and Company. He received his PhD in Statistics from Iowa State University. He has provided key leadership in various stages of clinical drug development at Lilly. He is passionate about developing new statistical methods for better clinical trial design and data analysis that impact drug development. He has published more than 70 articles in statistical, medical and mathematical journals, and is an ASA Fellow.
Ilya Lipkovich, Eli Lilly
Ilya Lipkovich is a Sr. Research Advisor at Eli Lilly and Company. Ilya received his PhD in Statistics from Virginia Tech in 2002 and has more than 15 years of statistical consulting experience in the pharmaceutical industry. He is an ASA Fellow and has published on subgroup identification in clinical data, analysis with missing data, and causal inference. He is a frequent presenter at conferences, a co-developer of subgroup identification methods, and a co-author of the books "Analyzing Longitudinal Clinical Trial Data. A Practical Guide" and "Estimands, Estimators and Sensitivity Analysis in Clinical Trials."
The current COVID-19 pandemic poses numerous challenges for ongoing clinical trials and provides a stress-testing environment for the existing principles and practice of estimands in clinical trials. The pandemic may increase the rate of intercurrent events (ICEs) and missing values, spurring a great deal of discussion on amending protocols and statistical analysis plans to address these issues. In this article we revisit recent research on estimands and handling of missing values, especially the ICH E9 (R1) on Estimands and Sensitivity Analysis in Clinical Trials. Based on an in-depth discussion of the strategies for handling ICEs using a causal inference framework, we suggest some improvements in applying the estimand and estimation framework in ICH E9 (R1). Specifically, we discuss a mix of strategies allowing us to handle ICEs differentially based on the causes of ICEs. We also suggest ICEs should be handled primarily by hypothetical strategies and provide examples of different hypothetical strategies for different types of ICEs as well as a road map for estimation and sensitivity analyses. We conclude that the proposed framework helps streamline translating clinical objectives into targets of statistical inference and resolves many issues with defining estimands and choosing estimation procedures arising from unanticipated events such as the current pandemic.
Role of Statisticians in a Pandemic
Friday, November 13, 2020
10 a.m. to 11:30 a.m. Eastern
Bhramar Mukherjee, PhD
Department of Biostatistics, School of Public Health
University of Michigan
Bhramar Mukherjee is John D. Kalbfleisch Collegiate Professor and Chair, Department of Biostatistics; Professor, Department of Epidemiology, University of Michigan (UM) School of Public Health; Research Professor and Core Faculty Member, Michigan Institute of Data Science (MIDAS), University of Michigan. She also serves as the Associate Director for Quantitative Data Sciences, The University of Michigan Rogel Cancer Center. She is the cohort development core co-director in the University of Michigan's institution-wide Precision Health Initiative. Her research interests include statistical methods for analysis of electronic health records, studies of gene-environment interaction, Bayesian methods, shrinkage estimation, and analysis of multiple pollutants. Collaborative areas are mainly in cancer, cardiovascular diseases, reproductive health, exposure science and environmental epidemiology. She has co-authored more than 240 publications in statistics, biostatistics, medicine and public health and is serving as PI on NSF and NIH funded methodology grants. She is the founding director of the University of Michigan's summer institute on Big Data. Bhramar is a fellow of the American Statistical Association and the American Association for the Advancement of Science. She is the recipient of many awards for her scholarship, service and teaching at the University of Michigan and beyond, including the Gertrude Cox Award from the Washington Statistical Society in 2016 and, most recently, the L. Adrienne Cupples Award from Boston University in 2020.
Jeffrey S. Morris, PhD
Department of Biostatistics, Epidemiology and Informatics
Perelman School of Medicine, University of Pennsylvania
Jeffrey S. Morris is Professor and Director of the Division of Biostatistics at the Perelman School of Medicine at the University of Pennsylvania, moving in 2019 after 19 years at the University of Texas M.D. Anderson Cancer Center. He obtained his PhD in Statistics from Texas A&M University under the supervision of Raymond J. Carroll in 2000. His research involves a combination of biomedical collaborative research and statistical methodological research, with a focus on developing flexible methods for integrating information across modern, complex big data including multi-platform genomics data, biomedical imaging data, and wearable devices, with statistical focus in functional data analysis and Bayesian modeling. Additionally, he has become involved in numerous COVID-19 related research projects at the University of Pennsylvania, and authors the website http://covid-datascience.com. This website contains a blog in which he attempts to use his perspective and skills as a statistical data scientist to evaluate constantly emerging COVID-19 information, filter out biases, aggregate information together, identify key insights along with a sense of their uncertainty, and communicate them in an accessible, balanced way. This blog contains more than 160 posts with upward of 100k views.
Xihong Lin, PhD
Department of Biostatistics
Harvard T.H. Chan School of Public Health
Xihong Lin is Professor and Former Chair of Biostatistics, Coordinating Director of the Program in Quantitative Genomics of Harvard TH Chan School of Public Health, and Professor of Statistics at Harvard University, and Associate Member of the Broad Institute of MIT and Harvard. Dr. Lin's research interests lie in development and application of scalable statistical and computational methods for analysis of massive data from genome, exposome and phenome, such as large scale Whole Genome Sequencing studies, integrative analysis of different types of data, biobanks, and complex epidemiological and observational studies. She is an elected member of the US National Academy of Medicine. Dr. Lin received the 2002 Mortimer Spiegelman Award from the American Public Health Association, the 2006 Presidents' Award and the 2017 FN David Award from the Committee of Presidents of Statistical Societies (COPSS). She is the PI of the Outstanding Investigator Award (R35) from the National Cancer Institute, and the contact PI of the Harvard Analysis Center of the Genome Sequencing Program of the National Human Genome Research Institute. She has been active in COVID-19 research.
Usha Govindarajulu, PhD
Center for Biostatistics
Icahn School of Medicine at Mount Sinai
Usha Govindarajulu is an Associate Professor in the Center for Biostatistics in the Department of Population Health Sciences of the Icahn School of Medicine at Mount Sinai. She earned an AB from Cornell University, an MS in Natural Resources from University of Michigan, an MS in Biostatistics from George Washington University, and a PhD in Biostatistics from Boston University. After this she spent two years as a postdoctoral fellow at Harvard School of Public Health. She then worked for a year as research faculty at Yale University before moving back to Boston and working at Brigham & Women's and Harvard Medical School. After being there about 5 years, she moved to New York and took a position as an Assistant Professor of Biostatistics at SUNY Downstate School of Public Health. She was there approximately 7 years before leaving for her current position. Her research interests are in survival analysis, frailty models, causal inference, genetic epidemiology, and machine learning. She is currently the 2020 Chair-Elect of the Section on Statistical Computing of the American Statistical Association.
Natalie Dean, PhD
Department of Biostatistics, College of Public Health & Health Professions
University of Florida
Dr. Natalie Dean is an assistant professor in the Department of Biostatistics at the University of Florida specializing in infectious disease epidemiology and study design. She is principal investigator on an NIH R01 to develop and evaluate innovative trial and observational study designs for assessing the efficacy of vaccines targeting emerging pathogens. Dr. Dean received her PhD in Biostatistics from Harvard University. She has been active in science communications during the COVID-19 pandemic, with recently published pieces in the New York Times, Washington Post, Medscape, Boston Review, and BMJ Opinion.
While the topic is very broad, we shall try to: (1) highlight some specific unique challenges based on the nature of the pandemic, e.g., our lack of knowledge about the virus coming in, and the urgency to learn and act quickly, yet the necessity to think carefully and rigorously to avoid false steps and conclusions; and (2) clearly communicate the importance of our profession and of people with our quantitative skill sets engaging and having a seat at the table so that our perspective is heard, both by policymakers and the media, during this crisis.
People in our profession need to have better communication with policymakers, and many in our field might not recognize their potential or the importance of our skillset and perspective to the big decisions going on in society. We hope our panel discussion can inspire more statisticians to get engaged in this way.
(Almost) All of Entity Resolution
October 2, 2020
10 a.m. to 12 p.m. Eastern
Rebecca C. Steorts
Assistant Professor, Department of Statistical Science
Rebecca C. Steorts received her B.S. in Mathematics in 2005 from Davidson College, her MS in Mathematical Sciences in 2007 from Clemson University, and her PhD in 2012 from the Department of Statistics at the University of Florida under the supervision of Malay Ghosh, where she was a U.S. Census Dissertation Fellow and received Honorable Mention (second place) for the 2012 Leonard J. Savage Thesis Award in Applied Methodology. Rebecca was a Visiting Assistant Professor from 2012 to 2015, during which she worked closely with Stephen E. Fienberg.
Rebecca is currently an Assistant Professor in the Department of Statistical Science at Duke University. She is affiliated faculty in the Departments of Computer Science and Biostatistics and Bioinformatics, the information initiative at Duke (iiD), and the Social Science Research Institute.
Rebecca was named to MIT Technology Review's 35 Innovators Under 35 for 2015 as a humanitarian in the field of software. Her work was profiled in the September/October issue of MIT Technology Review and she was recognized with an invited talk at EmTech in November 2015. In addition, Rebecca is a recipient of an NSF CAREER award, a collaborative NSF award, a collaborative grant with the Laboratory of Analytic Sciences (LAS) at NC State University, a Metaknowledge Network Templeton Foundation Grant, the University of Florida (UF) Graduate Alumni Fellowship Award, the U.S. Census Bureau Dissertation Fellowship Award, and the UF Innovation through Institutional Integration Program (I-Cubed) and NSF for development of an introductory Bayesian course for undergraduates. Her research interests are in large scale clustering, record linkage (entity resolution or de-duplication), privacy, network analysis, and machine learning for computational social science applications.
Whether the goal is to estimate the number of people that live in a congressional district, to estimate the number of individuals that have died in an armed conflict, or to disambiguate individual authors using bibliographic data, all these applications have a common theme - integrating information from multiple sources. Before such questions can be answered, databases must be cleaned and integrated in a systematic and accurate way, commonly known as record linkage, de-duplication, or entity resolution. In this article, we review motivational applications and seminal papers that have led to the growth of this area. Specifically, we review the foundational work that began in the 1940s and 1950s and that has led to modern probabilistic record linkage. We review clustering approaches to entity resolution, semi- and fully supervised methods, and canonicalization, which are being used throughout industry and academia in applications such as human rights, official statistics, medicine, and citation networks, among others. Finally, we discuss current research topics of practical importance.
The Role of Statistics in Transforming EHR Data into Knowledge
Friday, June 19, 2020
10 a.m. to 12 p.m. Eastern
Rebecca Hubbard, PhD
Associate Professor of Biostatistics
University of Pennsylvania
Dr. Rebecca Hubbard is an Associate Professor of Biostatistics at the University of Pennsylvania. Her research focuses on development and application of statistical methods to improve the validity of analyses using real world data sources including electronic health records and claims data. These methods have been applied across a broad range of research areas including health services research, cancer epidemiology, aging and dementia, and pharmacoepidemiology.
The widespread adoption of electronic health records (EHR) as a means of documenting medical care has created a vast resource for research on health conditions, interventions, and outcomes. Informaticians have played a leading role in the process of extracting “real world data” from EHR, with statisticians playing a more peripheral part. However, statistical insights on study design and inference are key to drawing valid conclusions from this messy and incomplete data source. This webinar will describe the basic structure of EHR data, highlight key challenges to research arising from this data structure, and present an overview of some statistical methods that address these challenges. The discussion of issues related to the structure and quality of EHR data will include: data types and methods for extracting variables of interest; sources of missing data; error in covariates and outcomes extracted from EHR and claims data; and data capture considerations such as informative visit processes and medical records coding procedures. The overall goal of this webinar is to illustrate the unique contribution of statistics to the process of generating knowledge from EHR data and equip participants with some tools for doing so.
Spatial Statistics for Disease Ecology
Friday, April 17, 2020
10 a.m. to 12 p.m. Eastern
Lance A. Waller, Ph.D.
Department of Biostatistics and Bioinformatics
Rollins School of Public Health
Lance A. Waller, Ph.D. is a Professor in the Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University. He is a member of the National Academy of Science Board on Mathematical Sciences and Analytics and has served on National Academies Committees on applied and theoretical statistics, cancer near nuclear facilities, geographic assessments of exposures to Agent Orange, and standoff explosive technologies. His research involves the development of statistical methods for geographic data including applications in environmental justice, epidemiology, disease surveillance, spatial cluster detection, conservation biology, and disease ecology. His research appears in biostatistical, statistical, environmental health, and ecology journals and in the textbook Applied Spatial Statistics for Public Health Data (2004, Wiley). Dr. Waller has also led two separate T32 training grants, one from NIGMS and the other from NIEHS, and served as the Director of the NHLBI Summer Institute for Research Training in Biostatistics (SIBS) site at Emory for the past 9 years.
The field of disease ecology involves exploration of the multiple, dynamic interactions between pathogens, hosts, and the environment that result in the transmission of disease. Many of these interactions involve spatial or spatiotemporal components that determine the course of an outbreak and may offer potential interventions to stem the extent and duration of an outbreak at the population level. In this webinar, we provide a brief overview of the field of disease ecology, the motivating questions of interest, the nature of data involved, and the interaction of statistical and mathematical modeling addressing these questions with available data. We offer two illustrations relating to the monitoring and analysis of zoonotic disease, namely: identifying geographic drivers of the spread of raccoon rabies along the Eastern United States and utilizing environmental data to enhance animal surveillance for plague in California. Both examples utilize concepts and analytic tools from spatial statistics to better understand and monitor geographically-referenced zoonotic diseases in wild animal populations.
Collaboration: Applications of RWE in drug development and methodologies for confounding control and A statistical roadmap for journey from real-world data to real-world evidence
Friday, February 28, 2020
10 a.m. to 12 p.m. Eastern
Hongwei Wang, PhD
Dr. Hongwei Wang is currently a Director of Global Medical Affairs Statistics, Data and Statistical Sciences at AbbVie. He holds a PhD in Statistics from Rutgers University and his research interests include designing and analyzing real-world studies, network meta-analysis and advanced analytics. Before AbbVie, he worked at Merck and Sanofi.
Applications of RWE in Drug Development and Methodologies for Confounding Control: Real-world evidence (RWE) is playing an increasingly important role in drug development, from early discovery through the clinical development program to life-cycle management. RWE can augment randomized clinical trials for regulatory approval, establish the effectiveness and safety profile in routine clinical practice to support reimbursement decisions, and constitute an integral part of scientific communication overall. Due to its noninterventional nature, a key challenge of robust RWE generation is to establish a causal relationship between exposure and outcome. This talk focuses on several main methodologies for causal inference, consisting of IPTW, MLE, AIPTW, and TMLE using full data and matched data that is derived from propensity score matching, respectively. Following the RWE roadmap outlined in the first talk, practical considerations are given to facilitate the series of decisions for confounding control, such as defining the estimand, usage of matching, and choice among different analytic frameworks.
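As a rough illustration of the first method named in this abstract, the sketch below computes an inverse probability of treatment weighting (IPTW) estimate of an average treatment effect. It is not taken from the talk: the function name `iptw_ate` is hypothetical, the propensity scores are assumed to have been estimated already by some separate model, and the normalized (ratio) form of the weighted means is one common choice among several.

```python
from typing import Sequence


def iptw_ate(treated: Sequence[int],
             outcome: Sequence[float],
             propensity: Sequence[float]) -> float:
    """Normalized IPTW estimate of the average treatment effect.

    treated    : 1 if the subject received treatment, 0 otherwise
    outcome    : observed outcome for each subject
    propensity : estimated P(treated = 1 | covariates), assumed given
    """
    if not (len(treated) == len(outcome) == len(propensity)):
        raise ValueError("inputs must have the same length")

    num1 = den1 = num0 = den0 = 0.0
    for a, y, e in zip(treated, outcome, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        if a == 1:
            w = 1.0 / e            # treated subjects are weighted by 1/e
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - e)    # controls are weighted by 1/(1 - e)
            num0 += w * y
            den0 += w

    if den1 == 0.0 or den0 == 0.0:
        raise ValueError("need at least one treated and one control subject")

    # Difference of the weighted outcome means in the two arms.
    return num1 / den1 - num0 / den0
```

When every propensity score equals 0.5, as in a balanced randomized trial, all weights are equal and the estimate reduces to the plain difference in arm means.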
Yixin Fang, PhD
Dr. Yixin Fang is director of Global Medical Affairs (GMA) Statistics at AbbVie. Since he joined AbbVie in January 2019, he has focused his research on real-world studies, comparative effectiveness research, and causal inference. After he received his PhD in statistics from Columbia University in 2006, he worked in academia for 12 years, teaching young statisticians and doing research in different fields such as machine learning, high-dimensional data analysis, and big data analysis. Motivated by the research of Professor Mark van der Laan, he is promoting the applications of targeted learning in real-world data research, combining his experiences in both machine learning and causal inference.
A statistical roadmap for journey from real-world data to real-world evidence: Randomized controlled clinical trials (RCTs) are the gold standard for evaluating the safety and efficacy of pharmaceutical drugs, but in many cases their costs, duration, limited generalizability, and ethical or technical feasibility have caused some to look for real-world studies as alternatives. On the other hand, real-world data may be much less convincing due to the lack of randomization and the presence of confounding bias. In this presentation, we propose a statistical roadmap to translate real-world data (RWD) to robust real-world evidence (RWE). The roadmap consists of three main stations: (1) defining an estimand translating the research objective into a precise definition of the treatment effect that is to be estimated, (2) constructing an efficient estimator (minimum-variance unbiased estimator) for the estimation of the estimand, and (3) conducting sensitivity analysis to explore the robustness of the inference to deviation from the underlying no-unmeasured confounding assumption. The Food and Drug Administration (FDA) is working on guidelines, with a target to release a draft by 2021, to harmonize RWD applications and monitor the safety and effectiveness of pharmaceutical drugs using RWE. The proposed roadmap aligns with the newly released framework for FDA's RWE Program in December 2018 and we hope this statistical roadmap is useful for statisticians who are eager to embark on their journeys in the real-world research.
The Central Role of Personalized Solution in the Era of Digital Health
Friday, January 10, 2020
10 a.m. to 12 p.m. Eastern
University of Wisconsin, Madison
Enterprise Lead of Machine Learning and Artificial Intelligence Team
Dr. Haoda Fu is a senior research advisor and an enterprise lead for Machine Learning, Artificial Intelligence, and Digital Connected Care at Eli Lilly and Company. Dr. Fu is a Fellow of the ASA (American Statistical Association). He is also an adjunct professor in the Department of Biostatistics, Indiana University School of Medicine. Dr. Fu received his Ph.D. in statistics from the University of Wisconsin - Madison in 2007 and joined Lilly after that. Since joining Lilly, he has been very active in statistics methodology research. He has more than 90 publications in areas such as Bayesian adaptive design, survival analysis, recurrent event modeling, personalized medicine, indirect and mixed treatment comparison, joint modeling, Bayesian decision making, and rare events analysis. In recent years, his research has focused on machine learning and artificial intelligence. His research has been published in various top journals including JASA, JRSS, Biometrics, ACM, IEEE, JAMA, Annals of Internal Medicine, etc. He has taught machine learning and AI at large industry conferences, including at an FDA workshop. He has served on the boards of directors of statistics organizations and as program chair and committee chair for organizations such as ICSA, ENAR, and the ASA Biopharm section.
Digital health has been an important pharmaceutical industry trend in recent years, and it can bring significant disruptive innovation to transform the healthcare industry. In this talk, we will provide an introduction to digital health and associated analytics challenges and opportunities. In particular, we will focus on the central role of personalized intervention in the era of digital health.
New Statistical Learning Methods for Optimizing Dynamic Treatment Decision Rules Leading Toward Personalized Health Care
Friday, December 6, 2019
10 a.m. to 12 p.m. Eastern
Department of Biostatistics
University of Michigan
Dr. Lu Wang is Associate Professor of Biostatistics at the University of Michigan, Ann Arbor, and Associate Editor for the Journal of the American Statistical Association. She received her Ph.D. in Biostatistics from Harvard University in 2008 and joined the faculty at the University of Michigan in the same year. Dr. Wang's research focuses on statistical methods for evaluating dynamic treatment regimes, personalized health care, nonparametric and semiparametric regressions, missing data analysis, functional data analysis, and longitudinal (correlated/clustered) data analysis. She has been collaborating with investigators at M.D. Anderson Cancer Center, University of Michigan Medical School, and Harvard School of Public Health during the past 12 years.
In this talk, we present recent advances and statistical developments for evaluating Dynamic Treatment Regimes (DTR), which allow the treatment to be dynamically tailored according to evolving subject-level data. Identification of an optimal DTR is a key component for precision medicine and personalized health care. Specific topics covered in this talk include several recent projects with robust and flexible methods developed for the above research area. We will first introduce a dynamic statistical learning method, adaptive contrast weighted learning (ACWL), which combines doubly robust semiparametric regression estimators with flexible machine learning methods. We will further develop a tree-based reinforcement learning (T-RL) method, which builds an unsupervised decision tree that maintains the nature of batch-mode reinforcement learning. Unlike ACWL, T-RL handles the optimization problem with multiple treatment comparisons directly through a purity measure constructed with augmented inverse probability weighted estimators. T-RL is robust, efficient and easy to interpret for the identification of optimal DTRs. However, ACWL seems more robust against tree-type misspecification than T-RL when the true optimal DTR is non-tree-type. At the end of this talk, we will also present a new Stochastic-Tree Search method called ST-RL for evaluating optimal DTRs.
Subgroup Identification for Differential Treatment Effects
Friday, November 8, 2019
10 a.m. to 12 p.m. Eastern
Department of Statistics
University of Wisconsin, Madison
Wei-Yin Loh is Professor of Statistics at the University of Wisconsin, Madison. He received his PhD from Berkeley in 1982. His major research interests are in bootstrap methods and classification and regression trees. He is a fellow of the American Statistical Association and the Institute of Mathematical Statistics and a consultant to government and industry.
Many subgroup identification methods exist but they have not been compared together. To better understand the relative strengths and weaknesses of the methods, we briefly review those with publicly available software (FindIt, GUIDE, Interaction Trees, MOB, Outcome Weighted Estimation, PRIM, ROWSi, Sequential Batting, SIDES, and Virtual Twins) and then compare their performance on seven criteria: (i) variable selection bias, (ii) probability of false discovery, (iii) probability of correct variable identification, (iv) bias in subgroup treatment effect estimates, (v) expected subgroup size, (vi) expected size of subgroup treatment effects, and (vii) subgroup stability. We conclude with a bootstrap solution to performing post-selection inference on the selected subgroups.
Lessons and Strategies for a Career in Academia: A Conversation
Friday, December 14, 2018
10:00 am – 12:00 pm Eastern
Dr. Leslie McClure
Professor & Chair, Department of Epidemiology and Biostatistics
Dornsife School of Public Health at Drexel University
Dr. Elizabeth Stuart
Associate Dean for Education and Professor of Biostatistics, Mental Health, and Health Policy and Management
Johns Hopkins Bloomberg School of Public Health
As a Biostatistician, there are many paths to a successful career. Each has benefits and drawbacks and will depend on an individual's own skills and preferences. In this webinar, Drs. Elizabeth Stuart and Leslie McClure will host a conversation about their academic careers, including providing some strategies for success and describing some of the challenges they've faced. They'll consider important questions, such as: What to look for in a job? How to develop meaningful collaborations (and get out of those that are not productive)? How to prioritize activities with an eye towards promotion (e.g., collaborative and methodological projects)? How to balance teaching, research, and grant requirements? And how to balance all of that with things outside of work? However, the exact direction of the conversation will depend on the questions and engagement from webinar participants.
Machine Learning for Health Care Policy
Friday, November 30, 2018
10:00 am – 12:00 pm Eastern
Dr. Sherri Rose
Associate Professor of Health Care Policy (Biostatistics)
Harvard Medical School
Health care research is moving toward analytic systems that take large health databases and estimate quantities of interest both quickly and robustly, incorporating advances from statistics, machine learning, and computer science. Pressing questions in prediction and causal inference are being answered with machine learning techniques. I will give an overview of the specific challenges related to developing and deploying these statistical algorithms for health policy, including examples from the areas of health plan payment, mental health outcomes, cancer staging, and medical devices. This webinar will be accessible for graduate students, with most technical derivations provided in references.
Biostatistical Methods for Wearable and Implantable Technology
Friday, October 26, 2018
10:00 am – 12:00 pm Eastern
Dr. Ciprian Crainiceanu
Professor, Department of Biostatistics
Johns Hopkins University
Wearable and Implantable Technology (WIT) is rapidly changing the Biostatistics data analytic landscape due to its reduced bias and measurement error as well as the sheer size and complexity of the signals. In this talk I will review some of the most used and useful sensors in Health Sciences and the ever expanding WIT analytic environment. I will describe the use of WIT sensors including accelerometers, heart monitors, glucose monitors and their combination with ecological momentary assessment (EMA). This rapidly expanding data eco-system is characterized by multivariate densely sampled time series with complex and highly non-stationary structures. I will introduce an array of scientific problems that can be answered using WIT and I will describe methods designed to analyze the WIT data from the micro- (sub-second-level) to the macro-scale (minute-, hour- or day-level) data.
Sensitivity analysis in observational research: introducing the E-value
Friday, September 28, 2018
10:00 am – 12:00 pm Eastern
Dr. Tyler VanderWeele
Professor of Epidemiology
Harvard School of Public Health
Sensitivity analysis is useful in assessing how robust an association is to potential unmeasured or uncontrolled confounding. This webinar introduces a new measure called the "E-value," which is related to the evidence for causality in observational studies that are potentially subject to confounding. The E-value is defined as the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need to have with both the treatment and the outcome to fully explain away a specific treatment–outcome association, conditional on the measured covariates. A large E-value implies that considerable unmeasured confounding would be needed to explain away an effect estimate. A small E-value implies little unmeasured confounding would be needed to explain away an effect estimate. The speaker and his collaborators propose that in all observational studies intended to produce evidence for causality, the E-value be reported or some other sensitivity analysis be used. They suggest calculating the E-value for both the observed association estimate (after adjustments for measured confounders) and the limit of the confidence interval closest to the null. If this were to become standard practice, the ability of the scientific community to assess evidence from observational studies would improve considerably, and ultimately, science would be strengthened.
Reference: VanderWeele, T.J. and Ding, P. (2017). Sensitivity analysis in observational research: introducing the E-value. Annals of Internal Medicine, 167:268-274.
Online E-value Calculator: https://mmathur.shinyapps.io/evalue/
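For readers who want to see the definition above in concrete form, the following is a minimal sketch of the E-value calculation on the risk ratio scale, using the closed-form expression from the cited reference, E = RR + sqrt(RR × (RR − 1)), with RR inverted first when it is below 1. The function names `e_value` and `e_value_ci` are illustrative only and are not part of the online calculator.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk ratio estimate (VanderWeele and Ding, 2017)."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    # Protective estimates (RR < 1) are inverted so the same formula applies.
    if rr < 1.0:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_ci(lower: float, upper: float) -> float:
    """E-value for the confidence limit closest to the null (RR = 1)."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if lower <= 1.0 <= upper:
        # The interval already contains the null, so no confounding
        # is needed to move the limit to RR = 1.
        return 1.0
    closest = lower if lower > 1.0 else upper
    return e_value(closest)
```

For example, `e_value(3.9)` is roughly 7.26, meaning an unmeasured confounder would need risk-ratio associations of about 7.26 with both treatment and outcome to fully explain away an observed risk ratio of 3.9.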
Friday, June 22, 2018
10:00 am – 12:00 pm Eastern
Associate Director SMC
Novartis Pharma AG
A Bayesian approach provides the formal framework to incorporate external information into the statistical analysis of a clinical trial. There is an intrinsic interest in leveraging all available information for an efficient design and analysis of clinical trials. External data are nowadays used in earlier phases of drug development (Trippa, Rosner and Muller, 2012; French, Thomas and Wang, 2012; Hueber et al., 2012), occasionally in phase III trials (French et al., 2012), and also in special areas such as medical devices (FDA, 2010a), orphan indications (Dupont and Van Wilder, 2011) and extrapolation in pediatric studies (Berry, 1989). This allows trials with smaller sample sizes or with unequal randomization (more subjects on treatment than control). In addition, the Bayesian statistical paradigm is a natural approach for combining information across heterogeneous sources, such as different trials or adult and pediatric data. In this webinar we'll provide a statistical framework to incorporate trial-external evidence, with real-life examples.
During the first part of the webENAR we will introduce the meta-analytic predictive (MAP) model (Neuenschwander, 2010). The MAP model is a Bayesian hierarchical model which combines the evidence from different sources (usually studies). The MAP model provides a prediction for a future study based on available information while accounting for inherent heterogeneity in the data. This approach can be used widely in different applications of biostatistics.
In the second part of the webENAR we will focus on three key applications of the MAP approach in biostatistics, which are (i) the derivation of informative priors from historical controls, (ii) probability of success and (iii) extrapolation from adult data to pediatrics. These applications will be demonstrated using the R package RBesT, the R Bayesian evidence synthesis tools, which are freely available from CRAN. The aim of the webinar is to teach the MAP approach and enable participants to apply the approach themselves with the help of RBesT.
Friday, April 13, 2018
11:00 am – 1:00 pm Eastern
Adjunct Professor, Statistical Computing Consultant
Department of Statistics
University of California, Berkeley
This webinar will introduce attendees to the NIMBLE system for programming with hierarchical models in R. NIMBLE (r-nimble.org) is a system for flexible programming and dissemination of algorithms that builds on the BUGS language for declaring hierarchical models. NIMBLE provides analysts with a flexible system for using MCMC, sequential Monte Carlo and other techniques on user-specified models. It provides developers and methodologists with the ability to write algorithms in an R-like syntax that can be easily disseminated to users. C++ versions of models and algorithms are created for speed, but these are manipulated from R without any need for analysts or algorithm developers to program in C++.
While analysts can use NIMBLE as a drop-in replacement for WinBUGS or JAGS, NIMBLE provides greatly enhanced functionality in a number of ways. The webinar will first show how to specify a hierarchical statistical model using BUGS syntax (including user-defined functions and distributions) and fit that model using MCMC (including user customization for better performance). We will demonstrate the use of NIMBLE for biostatistical methods such as semiparametric random effects models and clustering models. We will close with a discussion of how to use the system to write algorithms for use with hierarchical models, including building and disseminating your own methods.
Friday, December 1, 2017
11:00 am - 1:00 pm Eastern
Telba Irony, PhD
Center for Biologics Evaluation and Research
Regulatory authorities and patient advocacy groups have been paving the way towards engaging patients in medical product development and regulatory review. These efforts gave rise and relevance to the development of the Science of Patient Input, or SPI. SPI consists of scientifically valid qualitative and quantitative methods for capturing patient perspective information to incorporate it into product development and regulatory decision making. Two types of patient input, Patient Reported Outcomes (PRO) and Patient Preference Information (PPI) are expected to be captured in accordance with applicable scientific and statistical standards and best practices, and statisticians have a large role to play.
A PRO is a measurement based on a report of a patient’s health status that comes directly from the patient, without interpretation of the patient’s report by a clinician or anyone else. Some symptoms or other unobservable concepts known only to the patient, such as pain or fatigue, can only be measured by PRO measures.
PPI is a patient’s expression of desirability or value of one course of action or selection in contrast to others. It focuses on assessing the importance, or preferences, that patients place on the benefits, harms and other aspects of treatments.
In this Webinar we will introduce key elements concerning elicitation and use of patient preferences (PPI) to inform regulatory decision making. As an example, we will present a study commissioned by the FDA to elicit obese patients’ preferences in choosing weight-loss devices and show how these preferences can be used to inform regulatory decision making. We will describe the weight-loss device survey and present the survey results, which have been used to develop a decision-aid tool for regulatory reviewers. The tool provides estimates of patients’ benefit-risk tradeoff preferences and also stratifies patients according to their risk-tolerance. We will conclude the Webinar by sharing experiences in using patient preferences in the regulatory process and talking about best statistical practices for eliciting and using patient preference information.
Friday, November 17, 2017
11:00 am – 1:00 pm Eastern
Dr. Mithat Gönen
Chief, Biostatistics Service
Memorial Sloan Kettering Cancer Center
Cancer clinical trials have traditionally been designed specific to a disease site (breast, lung, colon etc). This paradigm is being challenged by the advent of targeted treatments, regimens targeting molecular alterations in cancer cells. Since targeted treatments are not site-specific the trials evaluating them increasingly include multiple sites where the target is expressed. These trials are often called basket trials. In this WebENAR we will present several possible designs for basket trials: parallel design, aggregation design and hierarchical model-based design; comparing their operating characteristics, strengths and weaknesses. Although their applications have mostly been in oncology so far, basket trials can be used in any disease where targeted treatments can be used in molecularly defined subgroups. We will give examples of publicly available software that can be used to design and analyze basket trials.
Friday, October 20, 2017
10:00 am – 12:00 pm Eastern
Dr. Christopher Jackson
Senior Statistician, MRC Biostatistics Unit, University of Cambridge
School of Clinical Medicine, Cambridge Institute of Public Health
Multi-state models are stochastic processes which describe how an individual moves between a set of discrete states in continuous time. They have been used for two broad classes of data. Firstly, for "panel data": intermittent observations of the state at a finite series of times, for a set of individuals, where transition times are not known. Secondly, for times to multiple events for a set of individuals, so that the state at any time is known. Combinations or slight variants of these two data types are also possible.
Friday, June 30, 2017
10:00 am – 12:00 pm Eastern
Dr. Eric Laber
North Carolina State University
In this webinar we will cover the methodological and computational underpinnings of data-driven decision making with application to precision medicine. Planned topics covered include: (i) formalizing optimal decision making through potential outcomes; (ii) regression and classification-based methods for single-stage decision problems; (iii) approximate dynamic programming and direct-search methods for multi-stage decision problems; (iv) interpretability and the research-practice gap; and (v) current research topics and open problems. We do not assume that participants have any prior exposure to these topics; however, it is assumed that participants have at least a masters-level understanding of biostatistics.
Friday, May 19, 2017
10:00 am – 12:00 pm Eastern
Kenneth G. Kowalski, MS
Kowalski PMetrics Consulting, LLC
Wenping Wang, PhD
Novartis Pharmaceuticals Corporation
This WebENAR will be presented in two parts. The first part will focus on a commentary presented by Ken Kowalski discussing the overlap between statisticians and pharmacometricians working in clinical drug development. Individuals with training in various academic disciplines including pharmacokinetics, pharmacology, engineering and statistics, to name a few, have pursued careers as pharmacometricians. While pharmacometrics has benefitted greatly from advances in statistical methodology, there is considerable tension and skepticism between biostatisticians and pharmacometricians as they apply their expertise to drug development applications. This talk explores some of the root causes for this tension and provides some suggestions for improving collaborations between statisticians and pharmacometricians. The talk concludes with a plea for more statisticians to consider careers as pharmacometrics practitioners. The second part of the WebENAR will highlight a case study presented by Wenping Wang illustrating the application of pharmacokinetic-pharmacodynamic modeling of the time to first flare to support dose justification of Canakinumab in a sBLA submission. The case study will conclude with some observations regarding team interactions between statisticians and pharmacometricians that resulted in a successful sBLA submission.
Friday, April 21, 2017
11:00 am – 1:00 pm Eastern
Dr. Layla Parast
The use of surrogate markers to estimate and test for a treatment effect has been an area of popular research. Given the long follow-up periods that are often required for treatment or intervention studies, appropriate use of surrogate marker information has the potential to decrease required follow-up time. However, previous studies have shown that using inadequate markers or making inappropriate assumptions about the relationship between the primary outcome and the surrogate marker can lead to inaccurate conclusions regarding the treatment effect. Many of the available methods for identifying, validating and using surrogate markers to test for a treatment effect tend to rely on restrictive model assumptions and/or focus on uncensored outcomes. In this course, I will describe different approaches to quantify the proportion of treatment effect explained by surrogate marker information in both a non-survival outcome setting and censored survival outcome setting. One described approach will be a nonparametric method that can accommodate a setting where individuals may experience the primary outcome before the surrogate marker is measured. I will illustrate the procedures using an R package available on CRAN to examine potential surrogate markers for diabetes with data from the Diabetes Prevention Program.
Friday, February 24, 2017
10:00 am – 12:00 pm Eastern
Dr. Michael O'Kelly
Dr. Bohdana Ratitch
Dr. Ilya Lipkovich
Center for Statistics in Drug Development
Most experiments have missing data. When there are missing data, it is useful to provide sensitivity analyses to allow the reader of the account of the research to assess the robustness to the missing data of any conclusions made. Using the pattern-mixture framework, a variety of assumptions can be implemented with regard to categories of missing outcomes. Assumptions that would tend to undermine the alternative hypothesis can be especially useful for assessing robustness of conclusions. Multiple imputation (MI) is one quite straightforward way of implementing such pattern-mixture approaches. While MI is a standard tool for continuous outcomes, recently researchers have come up with ways of implementing MI for other outcomes, such as time-to-event and recurrent-event outcomes. This webinar describes a number of these new applications of the MI idea. The strengths and weaknesses of these approaches are described and illustrated via examples and simulations.
This webinar was not recorded and is not available for on-demand purchase.
Friday, February 3, 2017
10:00 am – 12:00 pm Eastern
Dr. Alex Dmitrienko
Founder & President
This webinar focuses on a broad class of statistical problems related to optimizing the design and analysis of Phase II and III trials (Dmitrienko and Pulkstenis, 2017). This general topic has attracted much attention across the clinical trial community due to increasing pressure to reduce implementation costs and shorten timelines in individual trials and development programs.
The Clinical Scenario Evaluation (CSE) framework (Benda et al., 2010) will be described to formulate a general approach to clinical trial optimization and decision-making. Using the CSE approach, main objectives of clinical trial optimization will be formulated, including selection of clinically relevant optimization criteria, identification of sets of optimal and nearly optimal values of the parameters of interest, and sensitivity assessments. Key principles of clinical trial optimization will be illustrated using a problem of identifying efficient and robust multiplicity adjustment strategies in late-stage trials (Dmitrienko et al., 2009; Dmitrienko, D’Agostino and Huque, 2013; Dmitrienko, Paux and Brechenmacher, 2015).
Software tools for applying optimization methods will be presented, including R software (Mediana package) and Windows applications with a graphical user interface (MedianaFixedDesign application).
Friday, December 9, 2016
10:00 am – 12:00 pm Eastern
Dr. Nilanjan Chatterjee
Bloomberg Distinguished Professor
Department of Biostatistics, Bloomberg School of Public Health
Department of Oncology, School of Medicine
Johns Hopkins University
The decreasing cost of large-scale genotyping and sequencing technologies is fuelling investigation of associations between complex traits and genetic variants across the whole genome using studies of massive sample sizes. Recent genome-wide association studies (GWAS) focused on common variants have already led to the discovery of thousands of genetic loci across a variety of complex traits, including chronic diseases such as cancers, heart diseases and type-2 diabetes. Future studies of less common and rare variants hold further promise for the discovery of new genetic loci and a better understanding of the causal mechanisms underlying existing loci. The webinar will provide a brief review of some state-of-the-art design and analysis issues faced in the field. The topics will include sample size requirements and power calculations, methods for single- and multi-marker association testing, estimation of heritability and effect-size distribution, techniques for pleiotropic and Mendelian randomization analyses, and genetic risk prediction.
Friday, October 28, 2016
10:00 am – 12:00 pm Eastern
Dr. David Dunson
Arts & Sciences Professor of Statistical Science, Mathematics and Electrical & Computer Engineering
This webinar will provide an introduction to the practical use of nonparametric Bayesian methods in the analysis and interpretation of data from biomedical studies. I will start with a very brief review of the Bayesian paradigm, rapidly leading into what is meant by "Nonparametric Bayes." I'll then describe some canonical nonparametric Bayes models, including Dirichlet process mixtures and Gaussian processes. Basic practical properties and approaches for computation will be sketched, and I'll provide a practical motivation through some biomedical applications ranging from genomics to epidemiology to neuroscience. I'll finish up by describing some possibilities in terms of more advanced models that allow the density of a response variable to change flexibly with predictors, while providing practical motivation and implementation details.
Friday, May 20, 2016
11:00 am – 1:00 pm Eastern
David M. Murray, Ph.D.
Associate Director for Prevention
Director, Office of Disease Prevention
Office of the Director
National Institutes of Health
This webinar will review key issues and their solutions for pragmatic trials in public health and medicine. Pragmatic trials are used increasingly in health care settings to help clinicians choose between options for care. They often involve group- or cluster-randomization, though alternatives to randomized trials are also available. Many current trials rely upon electronic health records as the major source for data. These studies face a variety of challenges in the development and delivery of their interventions, research design, informed consent, data collection, and data analysis. This webinar will review these issues both generally and using examples from the Health Care Systems Collaboratory. The HCS Collaboratory is an NIH-funded consortium of nine pragmatic trials that address a variety of health issues and outcomes, all conducted within health care systems, all relying on electronic health records as their primary source of data, and most implemented as group- or cluster-randomized trials.
Friday, April 15, 2016
10:00 am – 12:00 pm Eastern
F. DuBois Bowman
Dr. Daniel Drake
Dr. Ben Cassidy
Department of Biostatistics, Mailman School of Public Health
Brain imaging scanners collect detailed information on brain function and various aspects of brain structure. When used as a research tool, imaging enables studies to investigate brain function related to emotion, cognition, language, memory, and responses to numerous other external stimuli, as well as resting-state brain function. Brain imaging studies also attempt to determine the functional or structural basis for psychiatric or neurological disorders and to examine the responses of these disorders to treatment. Neuroimaging data, particularly functional images, are massive and exhibit complex patterns of temporal and spatial dependence, which pose analytic challenges. There is a critical need for statisticians to establish rigorous methods to extract information and to quantify evidence for formal inferences. In this webinar, I briefly provide background on various types of neuroimaging data (with an emphasis on functional data) and analysis objectives that are commonly targeted in the field. I also present a survey of existing methods aimed at these objectives and identify particular areas offering opportunities for future statistical contribution.
Friday, February 12, 2016
10:00 am - 12:00 pm Eastern
Adrian E. Raftery
Professor of Statistics and Sociology
University of Washington
Projections of countries' future populations, broken down by age and sex, are widely used for planning and research. They are mostly done deterministically, but there is a widespread need for probabilistic projections. I will describe a Bayesian statistical method for probabilistic population projections for all countries. This new method has been used by the United Nations to produce its most recent population projections for all countries.
December 4, 2015
10:00 am - 12:00 pm Eastern
Dr. Mohamed Alosh & Dr. Kathleen Fritsch
Division of Biometrics III, Office of Biostatistics, OTS, CDER, FDA
For a confirmatory clinical trial that established treatment efficacy in the overall population, subgroup analysis aims to investigate the extent of benefits from the therapy for the major subgroups. Consequently, findings from the subgroup analysis play a major role in interpreting the trial results. This presentation focuses on two areas related to subgroup analysis in a confirmatory clinical trial: (i) investigating consistency of treatment effect across subgroups, and (ii) designing a clinical trial with the objective of establishing treatment efficacy in a targeted subgroup in addition to the overall population. The presentation also outlines the regulatory guidelines for subgroup analysis in such trials and provides examples of clinical trials where subgroup analysis played a role in determining the population for treatment use.
Friday, November 20, 2015
10:00 am - 12:00 pm Eastern
Dr. Keith Baggerly
The University of Texas MD Anderson Cancer Center
The buzz phrase "Reproducible Research" refers to studies where the raw data and code supplied are enough to let a new investigator exactly match the reported results without a huge amount of effort. "Replicable Research" refers to studies whose methods, when applied to new data, give rise to qualitatively similar results. Particularly as experiments get bigger, more involved, and more expensive, reproducibility should precede replication. Unfortunately, more attention is now being focused on such issues due to some high-profile failures.
In this talk, we first illustrate the issues with some case studies from oncology showing the types of things that can go wrong, the simple nature of the most common mistakes, and what the implications can be: e.g., treating patients incorrectly. We then give some point estimates of how widespread the problems of reproducibility and replicability are thought to be, and discuss some additional problems associated with replication. We survey tools introduced in the past few years which have made assembling reproducible studies markedly easier, discuss considerations that apply when attempting replication, and give pointers to some resources for further information.
Friday, June 26, 2015
10:00 am to 12:00 pm Eastern
Department of Biostatistics
Johns Hopkins University
Functional Magnetic Resonance Imaging (fMRI) is a non-invasive technique for studying brain activity. During the past two decades fMRI has provided researchers with unprecedented access to the inner workings of the brain, leading to countless new insights into how the brain processes information. The field that has grown around the acquisition and analysis of fMRI data has experienced rapid growth in the past several years and found applications in a wide variety of areas. This webinar introduces fMRI and discusses key statistical aspects involved in the analysis of fMRI data. Topics include: (a) an overview of the acquisition and reconstruction of fMRI data; (b) an overview of the physiological basis of the fMRI signal; (c) common experimental designs; (d) pre-processing steps; (e) methods for localizing areas activated by a task; (f) connectivity analysis; and (g) prediction and brain decoding.
Friday, May 8, 2015
10:00 am to 12:00 pm Eastern
Cynthia Rudin, PhD, Associate Professor of Statistics, MIT CSAIL and Sloan School of Management, Massachusetts Institute of Technology
Possibly *the* most important obstacle in the deployment of predictive models is the fact that humans simply do not trust them. If it is known exactly which variables were important for the prediction and how they were combined, this information can be very powerful in helping to convince people to believe (or not believe) the prediction and make the right decision. In this talk I will discuss algorithms for making these non-black-box predictions including:
Friday, February 20, 2015
11:00 am to 1:00 pm EST
Sharon-Lise Normand, Department of Health Care Policy, Harvard Medical School & Department of Biostatistics, Harvard School of Public Health
Comparative Effectiveness Research (CER) refers to a body of research that generates and synthesizes evidence on the comparative benefits and harms of alternative interventions to prevent, diagnose, treat, and monitor clinical conditions, or to improve the delivery of health care. The evidence from CER is intended to support clinical and policy decision making at both the individual and the population level. While the growth of massive health care data sources has given rise to new opportunities for CER, several statistical challenges have also emerged. This tutorial will provide an overview of the types of research questions addressed by CER, review the main statistical methodology currently utilized, and highlight areas where new methodology is required. Inferential issues in the "big data" context will be identified. Examples from cardiology and mental illness will illustrate methodological issues.
Friday, January 30, 2015
11:00 am to 1:00 pm EST
Presenter: Rafa Irizarry, PhD
Professor of Biostatistics and Computational Biology at the Dana Farber Cancer Center
Professor of Biostatistics at the Harvard School of Public Health
In this webinar I will give an overview of genomics technologies and the challenges arising when analyzing the data they produce. Specifically, I will focus on microarrays and next-generation sequencing technologies. I will cover statistical issues related to preprocessing, normalization, detecting differential expression, and dealing with batch effects.
December 5, 2014
11:00 am to 1:00 pm (EST)
Presenter: Marie Davidian, PhD North Carolina State University
Description: Treatment of patients with chronic diseases or disorders in clinical practice involves a series of decisions made over time. Clinicians adjust, change, modify, or discontinue therapies based on the patient's observed progress, side effects, compliance, and so on, with the goal of "personalizing" treatment to the patient in order to provide the best care. The decisions are typically based on synthesis of the available information on the patient using clinical experience and judgment.
A "dynamic treatment regime," also referred to as an "adaptive treatment strategy," is a set of sequential rules that dictate how to make decisions on treatment of a patient over time. Each rule corresponds to a key decision point at which a decision on which treatment action to take from among the available options must be made. Based on patient information, the rule outputs the next treatment action. Thus, a dynamic treatment regime is an algorithm that formalizes the way clinicians manage patients in practice.
In this presentation, we introduce the notion of a dynamic treatment regime and an appropriate statistical framework in which treatment regimes can be studied. We demonstrate how statistical inference may be made on the effects of different regimes based on data from so-called sequential, multiple assignment, randomized trials (SMARTs). We conclude with a discussion of current challenges, including the development of "optimal" treatment regimes. The material presented is ideal background for the short course on personalized medicine and optimal dynamic treatment regimes to be offered at the ENAR Spring Meeting in March 2015.