Evaluation of a geriatrics primary care model using prospective matching to guide enrollment

https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01360-4

Abstract

Background

Few definitive guidelines exist for rigorous large-scale prospective evaluation of nonrandomized programs and policies that require longitudinal primary data collection. In Veterans Affairs (VA), we identified a need to understand the impact of a geriatrics primary care model (referred to as GeriPACT); however, randomization of patients to GeriPACT vs. a traditional PACT was not feasible because GeriPACT has been rolled out nationally, and the decision to transition from PACT to GeriPACT is made jointly by a patient and provider. We describe the study design used to evaluate the comparative effectiveness of GeriPACT, compared to a traditional primary care model (referred to as PACT), on patient experience and quality-of-care metrics.

Methods

We used prospective matching to guide enrollment of GeriPACT-PACT patient dyads across 57 VA Medical Centers. First, we identified matches based on an array of administratively derived characteristics, using a combination of coarsened exact and distance-function matching on 11 key variables identified as potential confounders. Once a GeriPACT patient was enrolled, matched PACT patients were contacted for recruitment using pre-assigned priority categories based on the distance function; if eligible and consenting, patients were enrolled and followed with telephone surveys for 18 months.
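
The two-stage matching described above can be sketched in Python. This is an illustrative toy, not the authors' implementation: the covariate names (`age`, `comorbidity_score`), bin edges, and the absolute-age-difference distance are hypothetical stand-ins for the 11 administrative variables and the study's actual distance function.

```python
from collections import defaultdict

def coarsen(patient, bins):
    """Map continuous covariates to bin indices, forming an exact-match key."""
    return tuple(sum(patient[var] >= edge for edge in edges)
                 for var, edges in bins.items())

def cem_match(treated, controls, bins, distance):
    """Coarsened exact matching: controls sharing a treated patient's
    coarsened stratum are ranked by `distance` to set contact priority."""
    strata = defaultdict(list)
    for c in controls:
        strata[coarsen(c, bins)].append(c)
    return {t["id"]: sorted(strata.get(coarsen(t, bins), []),
                            key=lambda c: distance(t, c))
            for t in treated}

# Hypothetical covariates standing in for the administrative variables.
bins = {"age": [65, 75, 85], "comorbidity_score": [2, 5]}
treated = [{"id": "G1", "age": 78, "comorbidity_score": 3}]
controls = [
    {"id": "P1", "age": 84, "comorbidity_score": 4},  # same stratum, farther
    {"id": "P2", "age": 79, "comorbidity_score": 3},  # same stratum, closest
    {"id": "P3", "age": 60, "comorbidity_score": 1},  # different stratum
]
matches = cem_match(treated, controls, bins,
                    distance=lambda t, c: abs(t["age"] - c["age"]))
```

The sorted list for each treated patient plays the role of the pre-assigned priority categories: recruiters would contact `P2` first, then `P1`, and `P3` never qualifies because it falls in a different coarsened stratum.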

Results

We successfully enrolled 275 matched dyads in near real-time, with a median of 7 days between enrolling a GeriPACT patient and a closely matched PACT patient. Standardized mean differences of < 0.2 for nearly all baseline variables indicate excellent baseline covariate balance. Exceptional balance on survey-collected baseline covariates not available at the time of matching suggests our procedure successfully controlled for many known, but administratively unobserved, drivers of entry into GeriPACT.

Conclusions

We present an important process to prospectively evaluate the effects of different treatments when randomization is infeasible and provide guidance to researchers who may be interested in implementing a similar approach. Rich matching variables from the pre-treatment period that reflect treatment assignment mechanisms create a high quality comparison group from which to recruit. This design harnesses the power of national administrative data coupled with collection of patient reported outcomes, enabling rigorous evaluation of non-randomized programs or policies.




Sampling strategies to evaluate the prognostic value of a new biomarker on a time-to-event end-point

Abstract

Background

The availability of large epidemiological or clinical databases storing biological samples allows the prognostic value of novel biomarkers to be studied, but efficient designs are needed to select a subsample on which to measure them, for reasons of parsimony and cost. Two-phase stratified sampling is a flexible approach to perform such sub-sampling, but literature on the stratification variables to use in the sampling, and on power evaluation, is lacking, especially for survival data.

Methods

We compared the performance of different sampling designs to assess the prognostic value of a new biomarker on a time-to-event endpoint, applying a Cox model weighted by the inverse of the empirical inclusion probability.
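
A minimal sketch of the weighting step, under the assumption of a two-phase design where stratum membership is known for the full cohort: each phase-two subject is weighted by the inverse of its empirical inclusion probability, i.e. stratum size over number sampled from that stratum. The case/control labels below are illustrative.

```python
from collections import Counter

def ipw_weights(cohort_strata, sampled_ids):
    """Weight each sampled subject by N_h / n_h, the inverse of the
    empirical inclusion probability of its stratum h.

    cohort_strata: dict id -> stratum label for the FULL cohort.
    sampled_ids: ids measured in phase two (the biomarker subsample).
    """
    N = Counter(cohort_strata.values())                 # stratum sizes, full cohort
    n = Counter(cohort_strata[i] for i in sampled_ids)  # stratum sizes, subsample
    return {i: N[cohort_strata[i]] / n[cohort_strata[i]] for i in sampled_ids}

# Toy cohort: 2 cases, 8 controls; all cases and 2 controls are sampled.
strata = {i: ("case" if i < 2 else "control") for i in range(10)}
weights = ipw_weights(strata, sampled_ids=[0, 1, 2, 3])
```

In practice these weights would be passed to a weighted Cox fit (e.g. the `weights` argument of R's `coxph`, or `weights_col` in Python's `lifelines.CoxPHFitter`).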

Results

Our simulation results suggest that case-control sampling stratified (or post-stratified) by a surrogate variable of the marker can yield higher performance than simple random, probability-proportional-to-size, and unstratified case-control sampling. In the presence of a high censoring rate, results showed an advantage of nested case-control and counter-matching designs in terms of design effect, although the use of a fixed ratio between cases and controls might be disadvantageous. On real data on childhood acute lymphoblastic leukemia, we found that optimal sampling using pilot data is highly efficient.

Conclusions

Our study suggests that, in our sample, case-control sampling stratified by a surrogate and nested case-control sampling yield estimates and power comparable to those obtained in the full cohort while strongly decreasing the number of patients required. We recommend planning the sample size and using such sampling designs when exploring novel biomarkers in clinical cohort data.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01283-0

Evaluating complex interventions in context: systematic, meta-narrative review of case study approaches

Abstract

Background

There is a growing need for methods that acknowledge and successfully capture the dynamic interaction between context and implementation of complex interventions. Case study research has the potential to provide such understanding, enabling in-depth investigation of the particularities of phenomena. However, there is limited guidance on how and when to best use different case study research approaches when evaluating complex interventions. This study aimed to review and synthesise the literature on case study research across relevant disciplines, and determine relevance to the study of contextual influences on complex interventions in health systems and public health research.

Methods

Systematic meta-narrative review of the literature comprising (i) a scoping review of seminal texts (n = 60) on case study methodology and on context, complexity and interventions, (ii) detailed review of empirical literature on case study, context and complex interventions (n = 71), and (iii) identifying and reviewing ‘hybrid papers’ (n = 8) focused on the merits and challenges of case study in the evaluation of complex interventions.

Results

We identified four broad (and to some extent overlapping) research traditions, all using case study in a slightly different way and with different goals: 1) developing and testing complex interventions in healthcare; 2) analysing change in organisations; 3) undertaking realist evaluations; 4) studying complex change naturalistically. Each tradition conceptualised context differently—respectively as the backdrop to, or factors impacting on, the intervention; sets of interacting conditions and relationships; circumstances triggering intervention mechanisms; and socially structured practices. Overall, these traditions drew on a small number of case study methodologists and disciplines. Few studies problematised the nature and boundaries of ‘the case’ and ‘context’ or considered the implications of such conceptualisations for methods and knowledge production.

Conclusions

Case study research on complex interventions in healthcare draws on a number of different research traditions, each with different epistemological and methodological preferences. The approach used and consequences for knowledge produced often remains implicit. This has implications for how researchers, practitioners and decision makers understand, implement and evaluate complex interventions in different settings. Deeper engagement with case study research as a methodology is strongly recommended.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01418-3

Pre-statistical harmonization of behavioral instruments across eight surveys and trials

Abstract

Background

Data harmonization is a powerful method to equilibrate items in measures that evaluate the same underlying construct, and multiple measures exist to evaluate dementia-related behavioral symptoms. Pre-statistical harmonization of behavioral instruments in dementia research is the first step in developing a statistical crosswalk between measures. This crucial step entails careful review, documentation, and scrutiny of source data to ensure sufficient comparability between items prior to data pooling, yet studies that conduct pre-statistical harmonization rarely document their methods in a structured, reproducible manner. Here, we document the pre-statistical harmonization of items measuring behavioral and psychological symptoms among people with dementia, and we provide a box of recommended procedures for future studies.

Methods

We identified behavioral instruments that are used in clinical practice, a national survey, and randomized trials of dementia care interventions. We rigorously reviewed question content and scoring procedures to establish sufficient comparability across items as well as item quality prior to data pooling. Additionally, we standardized coding to Stata-readable format, which allowed us to automate approaches to identify potential cross-study differences in items and low-quality items. To ensure reasonable model fit for statistical co-calibration, we estimated two-parameter logistic Item Response Theory models within each of the eight studies.
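
The two-parameter logistic (2PL) model named above has a simple closed form; the sketch below is a generic textbook formulation, not the study's fitted model, with `theta` standing for latent behavioral symptom severity.

```python
import math

def irt_2pl(theta, a, b):
    """Two-parameter logistic IRT model: probability that a person with
    latent trait `theta` (here, behavioral symptom severity) endorses an
    item with discrimination `a` and severity threshold `b`."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

At `theta == b` the endorsement probability is exactly 0.5; a larger discrimination `a` makes the curve steeper around that threshold, which is what makes poorly discriminating (low-quality) items visible after fitting.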

Results

We identified 59 items from 11 behavioral instruments across the eight datasets. We found considerable cross-study heterogeneity in administration and coding procedures for items that measure the same attribute; discrepancies existed in the directionality and quantification of behavioral symptoms even for seemingly comparable items. Prior to estimating item response theory models for statistical co-calibration, we resolved item response heterogeneity, missingness, skewness, and conditional dependency, using several rigorous data transformation procedures, including re-coding and truncation.

Conclusions

This study highlights the importance of each aspect involved in the pre-statistical harmonization process of behavioral instruments. We provide guidelines and recommendations for how future research may detect and account for similar issues in pooling behavioral and related instruments.

https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01431-6

Mediation analysis methods used in observational research: a scoping review and recommendations

Abstract

Background

Mediation analysis methodology underwent many advancements throughout the years, with the most recent and important advancement being the development of causal mediation analysis based on the counterfactual framework. However, a previous review showed that for experimental studies the uptake of causal mediation analysis remains low. The aim of this paper is to review the methodological characteristics of mediation analyses performed in observational epidemiologic studies published between 2015 and 2019 and to provide recommendations for the application of mediation analysis in future studies.

Methods

We searched the MEDLINE and EMBASE databases for observational epidemiologic studies published between 2015 and 2019 in which mediation analysis was applied as one of the primary analysis methods. Information was extracted on the characteristics of the mediation model and the applied mediation analysis method.

Results

We included 174 studies, most of which applied traditional mediation analysis methods (n = 123, 70.7%). Causal mediation analysis was not often used to analyze more complicated mediation models, such as multiple mediator models. Most studies adjusted their analyses for measured confounders, but did not perform sensitivity analyses for unmeasured confounders and did not assess the presence of an exposure-mediator interaction.

Conclusions

To ensure a causal interpretation of the effect estimates in the mediation model, we recommend that researchers use causal mediation analysis and assess the plausibility of the causal assumptions. The uptake of causal mediation analysis can be enhanced through tutorial papers that demonstrate the application of causal mediation analysis, and through the development of software packages that facilitate the causal mediation analysis of relatively complicated mediation models.
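
In the simplest linear setting with no exposure-mediator interaction, the counterfactual natural effects reduce to familiar regression quantities (the direct path for the natural direct effect, the product of coefficients for the natural indirect effect). The sketch below simulates such a model and recovers both; all variable names and effect sizes are invented for illustration, and real analyses need confounding adjustment and sensitivity analyses as the abstract recommends.

```python
import random

random.seed(0)
n = 5000
x = [float(random.random() < 0.5) for _ in range(n)]          # exposure
m = [0.8 * xi + random.gauss(0, 1) for xi in x]               # mediator: a = 0.8
y = [0.5 * xi + 1.2 * mi + random.gauss(0, 1)                 # outcome: c' = 0.5, b = 1.2
     for xi, mi in zip(x, m)]

def center(v):
    mu = sum(v) / len(v)
    return [vi - mu for vi in v]

xc, mc, yc = center(x), center(m), center(y)
dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))

# Mediator model m ~ x: slope a_hat.
a_hat = dot(xc, mc) / dot(xc, xc)

# Outcome model y ~ x + m: solve the 2x2 normal equations (Cramer's rule).
sxx, sxm, smm = dot(xc, xc), dot(xc, mc), dot(mc, mc)
sxy, smy = dot(xc, yc), dot(mc, yc)
det = sxx * smm - sxm * sxm
c_prime_hat = (sxy * smm - smy * sxm) / det   # direct path coefficient
b_hat = (sxx * smy - sxm * sxy) / det         # mediator path coefficient

nde = c_prime_hat        # natural direct effect (no interaction assumed)
nie = a_hat * b_hat      # natural indirect effect
```

With an exposure-mediator interaction or non-linear models this equivalence breaks down, which is exactly why the abstract urges counterfactual-based causal mediation analysis rather than the traditional product-of-coefficients shortcut.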


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01426-3

Major adverse cardiovascular event definitions used in observational analysis of administrative databases: a systematic review

Abstract

Background

Major adverse cardiovascular events (MACE) are increasingly used as composite outcomes in randomized controlled trials (RCTs) and observational studies. However, it is unclear how observational studies most commonly define MACE in the literature when using administrative data.

Methods

We identified peer-reviewed articles published in MEDLINE and EMBASE between January 1, 2010 and October 9, 2020. Studies utilizing administrative data to assess the MACE composite outcome using International Classification of Diseases 9th or 10th Revision diagnosis codes were included. Reviews, abstracts, and studies not providing outcome code definitions were excluded. Data extracted included data source, timeframe, MACE components, code definitions, code positions, and outcome validation.

Results

A total of 920 articles were screened, 412 were retained for full-text review, and 58 were included. Only 8.6% (n = 5/58) matched the traditional three-point MACE RCT definition of acute myocardial infarction (AMI), stroke, or cardiovascular death. None matched four-point (+unstable angina) or five-point MACE (+unstable angina and heart failure). The most common MACE components were: AMI and stroke, 15.5% (n = 9/58); AMI, stroke, and all-cause death, 13.8% (n = 8/58); and AMI, stroke and cardiovascular death 8.6% (n = 5/58). Further, 67% (n = 39/58) did not validate outcomes or cite validation studies. Additionally, 70.7% (n = 41/58) did not report code positions of endpoints, 20.7% (n = 12/58) used the primary position, and 8.6% (n = 5/58) used any position.
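
The importance of code position, which most studies failed to report, can be made concrete with a toy outcome flag. The prefixes below are simplified ICD-10 categories (I21 for acute MI, I63/I64 for stroke) used purely for illustration; a real study should use a validated code definition, as the review recommends.

```python
# Simplified, illustrative ICD-10 prefixes -- not a validated definition.
MACE_CODES = {
    "ami": ("I21",),           # acute myocardial infarction
    "stroke": ("I63", "I64"),  # stroke codes
}

def flags_mace(diagnoses, primary_only=True):
    """Flag a MACE component from an encounter's ordered diagnosis codes.

    Code position matters: with primary_only=True only the first-listed
    (primary) code is considered, so the same encounter can count as an
    event under one definition and not under another.
    """
    codes = diagnoses[:1] if primary_only else diagnoses
    return any(
        code.startswith(prefix)
        for code in codes
        for prefixes in MACE_CODES.values()
        for prefix in prefixes
    )
```

An encounter coded `["E11.9", "I21.4"]` is a MACE event under an any-position definition but not under a primary-position definition, which is one concrete way the heterogeneity documented above prevents aggregation of findings.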

Conclusions

Components of MACE endpoints and diagnostic codes used varied widely across observational studies. Variability in the MACE definitions used and information reported across observational studies prohibit the comparison, replication, and aggregation of findings. Studies should transparently report the administrative codes used and code positions, as well as utilize validated outcome definitions when possible.

https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01440-5


Advancing data science in drug development through an innovative computational framework for data sharing and statistical analysis

Abstract

Background

Novartis and the University of Oxford’s Big Data Institute (BDI) have established a research alliance with the aim of improving health care and drug development by making them more efficient and targeted. By combining the latest statistical machine learning technology with an innovative IT platform developed to manage large volumes of anonymised data from numerous sources and types, we plan to identify novel, clinically relevant patterns that cannot be detected by humans alone, and thereby to identify phenotypes and early predictors of patient disease activity and progression.

Method

The collaboration focuses on highly complex autoimmune diseases and develops a computational framework to assemble a research-ready dataset across numerous modalities. For the Multiple Sclerosis (MS) project, the collaboration has anonymised and integrated phase II to phase IV clinical and imaging trial data from ≈35,000 patients across all clinical phenotypes and collected in more than 2200 centres worldwide. For the “IL-17” project, the collaboration has anonymised and integrated clinical and imaging data from over 30 phase II and III Cosentyx clinical trials including more than 15,000 patients, suffering from four autoimmune disorders (Psoriasis, Axial Spondyloarthritis, Psoriatic arthritis (PsA) and Rheumatoid arthritis (RA)).

Results

A fundamental component of successful data analysis, and of the collaborative development of novel machine learning methods on these rich data sets, has been the construction of a research informatics framework that captures the data at regular intervals, anonymises images and integrates them with the de-identified clinical data, and, after quality control, compiles everything into a research-ready relational database available to multi-disciplinary analysts. Collaborative development by a group of software developers, data wranglers, statisticians, clinicians, and domain scientists across both organisations has been key. The framework is innovative in that it facilitates collaborative data management and makes a complicated clinical trial data set from a pharmaceutical company available to academic researchers who become associated with the project.

Conclusions

An informatics framework has been developed to capture clinical trial data into a pipeline of anonymisation, quality control, data exploration, and subsequent integration into a database. Establishing this framework has been integral to the development of analytical tools.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01409-4

Impact of the COVID-19 pandemic on publication dynamics and non-COVID-19 research production

Abstract

Background

The COVID-19 pandemic has severely affected health systems and medical research worldwide but its impact on the global publication dynamics and non-COVID-19 research has not been measured. We hypothesized that the COVID-19 pandemic may have impacted the scientific production of non-COVID-19 research.

Methods

We conducted a comprehensive meta-research on studies (original articles, research letters and case reports) published between 01/01/2019 and 01/01/2021 in 10 high-impact medical and infectious disease journals (New England Journal of Medicine, Lancet, Journal of the American Medical Association, Nature Medicine, British Medical Journal, Annals of Internal Medicine, Lancet Global Health, Lancet Public Health, Lancet Infectious Disease and Clinical Infectious Disease). For each publication, we recorded publication date, publication type, number of authors, whether the publication was related to COVID-19, whether the publication was based on a case series, and the number of patients included in the study if the publication was based on a case report or a case series. We estimated the publication dynamics with a locally estimated scatterplot smoothing method. A Natural Language Processing algorithm was designed to calculate the number of authors for each publication. We simulated the number of non-COVID-19 studies that could have been published during the pandemic by extrapolating the publication dynamics of 2019 to 2020, and comparing the expected number to the observed number of studies.
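
The final expected-versus-observed step can be sketched with toy numbers. The real analysis extrapolated a LOESS-smoothed 2019 curve; the naive carry-forward and the weekly counts below are invented purely to show the arithmetic of the shortfall estimate.

```python
def expected_shortfall(weekly_2019, weekly_2020_non_covid):
    """Extrapolate 2019 weekly non-COVID output to 2020 (here, a naive
    carry-forward) and compare with the observed 2020 counts, returning
    the relative shortfall in production."""
    expected = sum(weekly_2019)
    observed = sum(weekly_2020_non_covid)
    return (expected - observed) / expected

# Toy numbers (not the paper's data): a steady 10 papers/week in 2019
# falling to 8.2/week in 2020 gives an 18% shortfall.
shortfall = expected_shortfall([10] * 52, [8.2] * 52)
```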

Results

Among the 22,525 studies assessed, 6319 met the inclusion criteria, of which 1022 (16.2%) were related to COVID-19 research. A dramatic increase in the number of publications in general journals was observed from February to April 2020, from a weekly median number of publications of 4.0 (IQR: 2.8–5.5) to 19.5 (IQR: 15.8–24.8) (p < 0.001), followed by a pattern of stability with a weekly median number of publications of 10.0 (IQR: 6.0–14.0) until December 2020 (p = 0.045 in comparison with April). Two prototypical editorial strategies were found: 1) journals that maintained the volume of non-COVID-19 publications while integrating COVID-19 research, and thus increased their overall scientific production, and 2) journals that decreased the volume of non-COVID-19 publications while integrating COVID-19 publications. Using simulation models, we estimated that the COVID-19 pandemic was associated with an 18% decrease in the production of non-COVID-19 research. We also found a significant change in publication type in COVID-19 research compared with non-COVID-19 research, illustrated by a decrease in the proportion of original articles (47.9% of COVID-19 publications vs 71.3% of non-COVID-19 publications, p < 0.001). Lastly, COVID-19 publications had more authors, especially case reports, with a median of 9.0 authors (IQR: 6.0–13.0) in COVID-19 publications compared to a median of 4.0 (IQR: 3.0–6.0) in non-COVID-19 publications (p < 0.001).

Conclusion

In this meta-research gathering publications from high-impact medical journals, we have shown that the dramatic rise in COVID-19 publications was accompanied by a substantial decrease of non-COVID-19 research.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01404-9

Modular literature review: a novel systematic search and review method to support priority setting in health policy and practice

Abstract

Background

There is an unmet need for review methods to support priority-setting, policy-making and strategic planning when a wide variety of interventions from differing disciplines may have the potential to impact a health outcome of interest. This article describes a Modular Literature Review, a novel systematic search and review method that employs systematic search strategies together with a hierarchy-based appraisal and synthesis of the resulting evidence.

Methods

We designed the Modular Review to examine the effects of 43 interventions on a health problem of global significance. Using the PICOS (Population, Intervention, Comparison, Outcome, Study design) framework, we developed a single four-module search template in which population, comparison and outcome modules were the same for each search and the intervention module was different for each of the 43 interventions. A series of literature searches were performed in five databases, followed by screening, extraction and analysis of data. “ES documents”, source documents for effect size (ES) estimates, were systematically identified based on a hierarchy of evidence. The evidence was categorised according to the likely effect on the outcome and presented in a standardised format with quantitative effect estimates, meta-analyses and narrative reporting. We compared the Modular Review to other review methods in health research for its strengths and limitations.
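
The four-module search template lends itself to a small sketch: fixed population/comparison/outcome modules combined with a swappable intervention module, yielding one Boolean query per intervention. The search terms below are invented for illustration and are not the review's actual strategy.

```python
def build_module_search(population, comparison, outcome, interventions):
    """Combine fixed P/C/O modules with a swappable intervention module,
    yielding one Boolean search string per intervention."""
    fixed = " AND ".join(f"({m})" for m in (population, comparison, outcome))
    return {name: f"({terms}) AND {fixed}"
            for name, terms in interventions.items()}

# Hypothetical modules, loosely echoing the antenatal-intervention setting.
queries = build_module_search(
    population="pregnan* OR antenatal",
    comparison="placebo OR control",
    outcome='"preterm birth" OR "birth weight"',
    interventions={
        "iron": "iron OR ferrous",
        "calcium": "calcium",
    },
)
```

Because only the intervention module varies, screening and extraction pipelines can be reused verbatim across all 43 searches, which is what lets the method answer many questions simultaneously.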

Results

The Modular Review method was used to review the impact of 46 antenatal interventions on four specified birth outcomes within 12 months. A total of 61,279 records were found, of which 35,244 were screened by title and abstract. In total, 6272 full articles were reviewed against the inclusion criteria, resulting in 365 eligible articles.

Conclusions

The Modular Review preserves principles that have traditionally been important to systematic reviews but can address multiple research questions simultaneously. The result is an accessible, reliable answer to the question of “what works?”. Thus, it is a well-suited literature review method to support prioritisation, decisions and planning to implement an agenda for health improvement.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01463-y

Feasibility of a hybrid clinical trial for respiratory virus detection in toddlers during the influenza season

Abstract

Background

Traditional clinical trials are conducted at investigator sites, and participants must visit healthcare facilities several times for trial procedures. Decentralized clinical trials offer an attractive alternative: they use telemedicine and other technological solutions (apps, monitoring devices or web platforms) to decrease the number of visits to study sites, minimise the impact on daily routine, and decrease geographical barriers for participants. Little information is available on the use of decentralization in randomized clinical trials with vaccines.

Methods

A hybrid clinical trial may be assisted by parental recording of symptoms using electronic log diaries in combination with home collected nasal swabs. During two influenza seasons, children aged 12 to 35 months with a history of recurrent acute respiratory infections were recruited in 12 primary health centers of the Valencia Region in Spain. Parents completed a symptom diary through an ad hoc mobile app that subsequently assessed whether it was an acute respiratory infection and requested collection of a nasal swab. Feasibility was measured using the percentage of returned electronic diaries and the validity of nasal swabs collected during the influenza season. Respiratory viruses were detected by real-time PCR.
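
The app's swab-triggering step amounts to a symptom-based decision rule. The rule below is entirely hypothetical (the trial's actual algorithm is not given in the abstract): it flags an ARI on fever or on two or more respiratory symptoms, just to make the diary-to-swab workflow concrete.

```python
def is_ari(diary):
    """Hypothetical ARI trigger rule: flag if the diary reports fever,
    or at least two respiratory symptoms. The real app's criteria are
    not specified in the abstract."""
    respiratory = sum(diary.get(s, False)
                      for s in ("cough", "runny_nose", "sore_throat"))
    return diary.get("fever", False) or respiratory >= 2
```

When `is_ari` returns `True`, the app would prompt parents to collect a home nasal swab, which is then couriered for PCR testing.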

Results

Ninety-nine toddlers were enrolled. Parents completed 10,476 of the 10,804 requested electronic diaries (97%). The mobile app detected 188 potential acute respiratory infections (ARIs) and requested a nasal swab for each. A swab was taken in 173 (92%) ARI episodes; 165 (95.4%) of these swabs were collected at home, and 144 (87.3%) of those were considered valid for laboratory testing. Overall, 152 (81%) of the ARIs detected in the study had a corresponding valid sample collected.

Conclusions

The hybrid procedures used in this clinical trial with the influenza vaccine in toddlers were considered adequate, as we diagnosed most ARI cases in a timely manner and obtained a valid swab in 81% of cases. Hybrid clinical trials improve participant adherence to study procedures and could improve recruitment and the quality of life of both participants and the research team by decreasing the number of visits to the investigator site.

This report emphasises that hybrid CTs are a valid alternative to traditional CTs with vaccines; this hybrid CT achieved high participant adherence to the study procedures.

https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01474-9

Impact of vaccine prioritization strategies on mitigating COVID-19: an agent-based simulation study using an urban region in the United States

Abstract

Background

The approval of novel vaccines for COVID-19 brought hope and expectations, but not without additional challenges. One central challenge was understanding how to appropriately prioritize the use of a limited supply of vaccines. This study examined the efficacy of various vaccine prioritization strategies using the vaccination campaign underway in the U.S.

Methods

The study developed a granular agent-based simulation model mimicking community spread of COVID-19 under various social interventions, including full and partial closures, isolation and quarantine, use of face masks and contact tracing, and vaccination. The model was populated with parameters of disease natural history, as well as demographic and societal data for an urban community in the U.S. with 2.8 million residents, and tracks daily numbers of infections, hospitalizations, and deaths for all census age groups. The model was calibrated using parameters for viral transmission and the level of community circulation of individuals, and was validated against published data from the Florida COVID-19 dashboard. Vaccination strategies were compared using a hypothesis test for pairwise comparisons.
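
A drastically reduced sketch of an agent-based epidemic model can illustrate the mechanics (daily random contacts, per-contact infection probability, recovery clock, pre-vaccinated agents). All parameters below are arbitrary toy values, not the study's calibrated ones, and the model omits age structure, hospitalization, and all other interventions.

```python
import random

def run_abm(n_agents=2000, beta=0.2, contacts=10, recover_days=7,
            vaccinated=0.0, days=120, seed=1):
    """Minimal agent-based epidemic sketch: each day, every infectious
    agent meets `contacts` random agents and infects susceptibles with
    probability `beta`. `vaccinated` is the fraction immunised before
    day 0 (all-or-nothing protection). Returns cumulative infections."""
    random.seed(seed)
    S, I, R = 0, 1, 2
    state = [R if random.random() < vaccinated else S for _ in range(n_agents)]
    state[0] = I                     # single index case
    clock = [0] * n_agents           # days spent infectious
    total_infected = 1
    for _ in range(days):
        infectious = [i for i, s in enumerate(state) if s == I]
        for i in infectious:
            for _ in range(contacts):
                j = random.randrange(n_agents)
                if state[j] == S and random.random() < beta:
                    state[j] = I
                    total_infected += 1
            clock[i] += 1
            if clock[i] >= recover_days:
                state[i] = R
    return total_infected
```

Comparing `run_abm(vaccinated=0.0)` with `run_abm(vaccinated=0.3)` under a fixed seed gives the kind of counterfactual contrast the study draws between vaccination scenarios, albeit at toy scale.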

Results

Three prioritization strategies were examined: a minor variant of CDC’s recommendation, an age-stratified strategy, and a random strategy. The impact of vaccination was also contrasted with a no vaccination scenario. The study showed that the campaign against COVID-19 in the U.S. using vaccines developed by Pfizer/BioNTech and Moderna 1) reduced the cumulative number of infections by 10% and 2) helped the pandemic to subside below a small threshold of 100 daily new reported cases sooner by approximately a month when compared to no vaccination. A comparison of the prioritization strategies showed no significant difference in their impacts on pandemic mitigation.

Conclusions

The vaccines for COVID-19 were developed and approved much quicker than ever before. However, as per our model, the impact of vaccination on reducing cumulative infections was found to be limited (10%, as noted above). This limited impact is due to the explosive growth of infections that occurred prior to the start of vaccination, which significantly reduced the susceptible pool of the population for whom infection could be prevented. Hence, vaccination had a limited opportunity to reduce the cumulative number of infections. Another notable observation from our study is that instead of adhering strictly to a sequential prioritizing strategy, focus should perhaps be on distributing the vaccines among all eligible as quickly as possible, after providing for the most vulnerable. As much of the population worldwide is yet to be vaccinated, results from this study should aid public health decision makers in effectively allocating their limited vaccine supplies.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01458-9


Guidance for using artificial intelligence for title and abstract screening while conducting knowledge syntheses

Abstract

Background

Systematic reviews (SRs) are the cornerstone of evidence-based medicine. However, systematic reviews are time consuming and there is growing demand to produce evidence more quickly while maintaining robust methods. In recent years, artificial intelligence and active machine learning (AML) have been implemented in several SR software applications. Because barriers to the adoption of these technologies include the challenges of set-up and uncertainty about how best to use them, we provide different situations and considerations for knowledge synthesis teams to consider when using artificial intelligence and AML for title and abstract screening.

Methods

We retrospectively evaluated the implementation and performance of AML across a set of ten previously completed systematic reviews. Based on these findings, and on the barriers we encountered and navigated over the past 24 months while using these tools prospectively in our research, we developed a series of practical recommendations for research teams seeking to implement AML tools for citation screening in their workflow.

Results

We developed a seven-step framework and provide guidance for when and how to integrate artificial intelligence and AML into the title and abstract screening process. Steps include: (1) Consulting with Knowledge user/Expert Panel; (2) Developing the search strategy; (3) Preparing your review team; (4) Preparing your database; (5) Building the initial training set; (6) Ongoing screening; and (7) Truncating screening. During Step 6 and/or 7, you may also choose to optimize your team, by shifting some members to other review stages (e.g., full-text screening, data extraction).

Conclusion

Artificial intelligence and, more specifically, AML are well-developed tools for title and abstract screening and can be integrated into the screening process in several ways. Regardless of the method chosen, transparent reporting of these methods is critical for future studies evaluating artificial intelligence and AML.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01451-2

Effectiveness of exercise interventions on mental health and health-related quality of life in women with polycystic ovary syndrome: a systematic review

Abstract

Background

Polycystic ovary syndrome (PCOS) is a complex condition, impacting cardio-metabolic and reproductive health, mental health and health-related quality of life. The physical health benefits of exercise for women with PCOS are well-established and exercise is increasingly being recognised as efficacious for improving psychological wellbeing. The aim of this review was to summarise the evidence regarding the effectiveness of exercise interventions on mental health outcomes in women with PCOS.

Methods

A systematic search of electronic databases was conducted in March of 2020. Trials that evaluated the effect of an exercise intervention on mental health or health-related quality of life outcomes in reproductive aged women with diagnosed PCOS were included. Methodological quality was assessed using the modified Downs and Black checklist. Primary outcomes included symptoms of depression and anxiety, and health-related quality of life.

Results

Fifteen articles from 11 trials were identified and deemed eligible for inclusion. Exercise demonstrated positive improvements in health-related quality of life in all of the included studies. Half of the included studies also reported significant improvements in depression and anxiety symptoms. There was large variation in the methodological quality of the included studies and in the interventions utilised.

Conclusions

The available evidence indicates that exercise is effective for improving health-related quality of life and PCOS symptom distress. Exercise also shows some efficacy for improving symptoms and/or prevalence of depression and anxiety in women with PCOS. However, due to large heterogeneity of included studies, conclusions could not be made regarding the impact of exercise intervention characteristics. High-quality trials with well reported exercise intervention characteristics and outcomes are required in order to determine effective exercise protocols for women with PCOS and facilitate translation into practice.


https://bmcpublichealth.biomedcentral.com/articles/10.1186/s12889-021-12280-9

Economic burden of varicella in Europe in the absence of universal varicella vaccination

Abstract

Background

Though the disease burden of varicella in Europe has been reported previously, the economic burden is still unknown. This study estimated the economic burden of varicella in Europe in the absence of Universal Varicella Vaccination (UVV) in 2018 Euros from both payer (direct costs) and societal (direct and indirect costs) perspectives.

Methods

We estimated the country-specific and overall annual costs of varicella in the absence of UVV in 31 European countries (27 EU countries, plus Iceland, Norway, Switzerland and the United Kingdom). To obtain country-specific unit costs and associated healthcare utilization, we conducted a systematic literature review, searching PubMed, EMBASE, NEED, DARE, REPEC, Open Grey, and public health websites (1/1/1999–10/15/2019). The annual numbers of varicella cases, deaths, outpatient visits and hospitalizations were calculated (without UVV) based on age-specific incidence rates (Riera-Montes et al. 2017) and 2018 population data by country. Unit cost per varicella case and disease burden data were combined using stochastic modeling to estimate 2018 costs stratified by country, age and healthcare resource.
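A minimal sketch of the kind of stochastic (Monte Carlo) cost model described, with entirely hypothetical case counts and unit costs (the study's actual inputs and model structure are not reproduced here):

```python
import random

# Illustrative sketch (not the study's model): stochastic estimate of annual
# varicella cost as cases x unit cost, with uncertainty in the unit cost.
def simulate_total_cost(n_cases, unit_cost_low, unit_cost_high,
                        n_sims=10_000, seed=42):
    rng = random.Random(seed)
    draws = [n_cases * rng.uniform(unit_cost_low, unit_cost_high)
             for _ in range(n_sims)]
    draws.sort()
    mean = sum(draws) / n_sims
    # Return the mean and an empirical 95% uncertainty range.
    return mean, draws[int(0.025 * n_sims)], draws[int(0.975 * n_sims)]

# Hypothetical country: 500,000 annual cases, unit cost between EUR 80 and 160.
mean, lo, hi = simulate_total_cost(500_000, 80.0, 160.0)
print(f"mean EUR {mean:,.0f} (95% range {lo:,.0f}-{hi:,.0f})")
```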

Results

Overall annual total costs associated with varicella were estimated at €662,592,061 (range: €309,552,363 to €1,015,631,760) in Europe in the absence of UVV. Direct and indirect costs were estimated at €229,076,206 (range: €144,809,557 to €313,342,856) and €433,515,855 (range: €164,742,806 to €702,288,904), respectively. The total cost per case was €121.45 (direct: €41.99; indirect: €79.46). Almost half of the costs were attributed to cases in children under 5 years, owing mainly to caregiver work loss. The distribution of costs by healthcare resource was similar across countries. France and Germany accounted for 49.28% of total annual costs, most likely due to a combination of high case numbers and high unit costs in these countries.

Conclusions

The economic burden of varicella across Europe in the absence of UVV is substantial (over €600 million annually), primarily driven by caregiver burden, including work productivity losses.


https://bmcpublichealth.biomedcentral.com/articles/10.1186/s12889-021-12343-x

History of drinking problems diminishes the protective effects of within-guideline drinking on 18-year risk of dementia and CIND

Abstract

Objective

To examine the moderating effect of older adults’ history of drinking problems on the relationship between their baseline alcohol consumption and risk of dementia and cognitive impairment, no dementia (CIND) 18 years later.

Method

A longitudinal Health and Retirement Study cohort (n = 4421) was analyzed to demonstrate how older adults’ baseline membership in one of six drinking categories (non-drinker, within-guideline drinker, and outside-guideline drinker groups, divided to reflect absence or presence of a history of drinking problems) predicts dementia and CIND 18 years later.

Results

Among participants with no history of drinking problems, 13% of non-drinkers, 5% of within-guideline drinkers, and 9% of outside-guideline drinkers were classified as having dementia 18 years later. Among those with a history of drinking problems, 14% of non-drinkers, 9% of within-guideline drinkers, and 7% of outside-guideline drinkers were classified with dementia. With non-drinkers with no history of drinking problems as the reference category, being a baseline within-guideline drinker with no history of drinking problems reduced the likelihood of dementia 18 years later by 45%, independent of baseline demographic and health characteristics; being a baseline within-guideline drinker with a history of drinking problems reduced the likelihood by only 13% (n.s.). Similar patterns were obtained for the prediction of CIND.

Conclusions

For older adults, consuming alcohol at levels within validated guidelines for low-risk drinking may offer moderate long-term protection from dementia and CIND, but this effect is diminished by having a history of drinking problems. Efforts to predict and prevent dementia and CIND should focus on older adults’ history of drinking problems in addition to how much alcohol they consume.


https://bmcpublichealth.biomedcentral.com/articles/10.1186/s12889-021-12358-4

Reporting methodological issues of the mendelian randomization studies in health and medical research: a systematic review

Abstract

Background

Mendelian randomization (MR) studies using genetic risk scores (GRS) as an instrumental variable (IV) have increasingly been used to control for unmeasured confounding in observational healthcare databases. However, proper reporting of methodological issues is sparse in these studies. We aimed to review published MR studies and identify reporting problems.

Methods

We conducted a systematic review of clinical articles published between 2009 and 2019. We searched the PubMed, Scopus, and Embase databases. From every MR study we retrieved information including the tests performed to evaluate assumptions and the modelling approach used for estimation. Applying our inclusion/exclusion criteria, we identified 97 studies and conducted the review according to the PRISMA statement.

Results

Only 66 (68%) of the studies empirically verified the first (relevance) assumption, and 40 (41.2%) reported appropriate tests (e.g., R2, F-test) of the association. Only 35.1% clearly stated and discussed theoretical justifications for the second and third assumptions. A two-stage least squares approach was used in 30.9% of the studies and the Wald estimator in 11.3%. Also, 44.3% of the studies conducted a sensitivity analysis to assess the robustness of estimates to violations of the untestable assumptions.
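Two of the quantities mentioned here, the Wald estimator and the first-stage F-statistic used to check the relevance assumption, can be sketched for the single-instrument case; the regression coefficients below are hypothetical:

```python
# Illustrative sketch (not the reviewed studies' code). The Wald estimator
# divides the instrument-outcome effect by the instrument-exposure effect;
# the F-statistic checks the relevance assumption (F > 10 is a common rule
# of thumb for instrument strength).
def wald_estimate(beta_instrument_outcome, beta_instrument_exposure):
    return beta_instrument_outcome / beta_instrument_exposure

def f_statistic(beta_instrument_exposure, se_instrument_exposure):
    # With a single instrument, the first-stage F-stat is the squared
    # z-statistic of the instrument-exposure regression coefficient.
    return (beta_instrument_exposure / se_instrument_exposure) ** 2

# Hypothetical GRS-exposure and GRS-outcome regression coefficients.
print(round(wald_estimate(0.06, 0.30), 2))  # 0.2
print(f_statistic(0.30, 0.05) > 10)         # True (instrument looks relevant)
```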

Conclusions

We found that incomplete justification of the instrumental variable assumptions was a common problem in the selected MR studies. This may undermine the validity of their findings.



Comparisons of statistical distributions for cluster sizes in a developing pandemic

Abstract

Background

We consider cluster size data of SARS-CoV-2 transmissions in a number of different settings from recently published data. The statistical characteristics of superspreading events are commonly described by fitting a negative binomial distribution to secondary infection and cluster size data, as an alternative to the Poisson distribution: it has a longer tail, and its extra parameter allows the variance to exceed the mean. Here we investigate whether other long-tailed distributions, from more general extended Poisson process modelling, can better describe the distribution of cluster sizes for SARS-CoV-2 transmissions.

Methods

We use the extended Poisson process modelling (EPPM) approach with nested sets of models that include the Poisson and negative binomial distributions to assess the adequacy of models based on these standard distributions for the data considered.
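As a simplified illustration of the nested comparison (Poisson versus negative binomial only; EPPM itself is more general and not sketched here), the two standard distributions can be fitted and compared via AIC. The cluster sizes below are invented to show the overdispersion that favours the negative binomial:

```python
import numpy as np
from scipy import stats

# Illustrative sketch (not the paper's EPPM code): compare Poisson and
# negative binomial fits to cluster-size counts via AIC (lower = better).
def poisson_aic(counts):
    lam = np.mean(counts)                      # Poisson MLE
    ll = stats.poisson.logpmf(counts, lam).sum()
    return 2 * 1 - 2 * ll                      # 1 parameter

def nbinom_aic(counts):
    m, v = np.mean(counts), np.var(counts)
    r = m**2 / (v - m)                         # method-of-moments fit
    p = m / v
    ll = stats.nbinom.logpmf(counts, r, p).sum()
    return 2 * 2 - 2 * ll                      # 2 parameters

# Hypothetical overdispersed cluster sizes (one large superspreading cluster).
clusters = np.array([1, 1, 1, 2, 2, 3, 3, 5, 8, 40])
print(nbinom_aic(clusters) < poisson_aic(clusters))  # True for this data
```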

Results

We confirm the inadequacy of the Poisson distribution in most cases, and demonstrate the inadequacy of the negative binomial distribution in some cases.

Conclusions

The probability of a superspreading event may be underestimated by use of the negative binomial distribution as much larger tail probabilities are indicated by EPPM distributions than negative binomial alternatives. We show that the large shared accommodation, meal and work settings, of the settings considered, have the potential for more severe superspreading events than would be predicted by a negative binomial distribution. Therefore public health efforts to prevent transmission in such settings should be prioritised.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01517-9

Statistical methods for evaluating the fine needle aspiration cytology procedure in breast cancer diagnosis

Abstract

Background

Statistical issues arising while evaluating a diagnostic procedure for breast cancer are not rare but are often ignored, leading to biased results. We aimed to evaluate the diagnostic accuracy of fine needle aspiration cytology (FNAC), a minimally invasive and rapid technique potentially used as a rule-in or rule-out test, while handling its statistical issues: suspect test results and verification bias.

Methods

We applied different statistical methods to handle suspect results by defining conditional estimates. To address partial verification bias, the Begg and Greenes method and multivariate imputation by chained equations were applied; to address differential verification bias, a Bayesian approach with respect to each gold standard was used. Finally, we extended the Begg and Greenes method to apply conditionally on the suspect results.

Results

The specificity of the FNAC test, above 94%, was always higher than its sensitivity regardless of the method used. All positive likelihood ratios were higher than 10, with variations among methods. The positive and negative yields were high, indicating precise discriminating properties of the test.
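The accuracy measures reported here derive from a 2x2 table in the standard way; a sketch with hypothetical counts (not the FNAC study's data):

```python
# Illustrative sketch: diagnostic accuracy measures from a 2x2 table.
def diagnostic_measures(tp, fp, fn, tn):
    sens = tp / (tp + fn)             # sensitivity
    spec = tn / (tn + fp)             # specificity
    lr_pos = sens / (1 - spec)        # LR+ > 10 supports rule-in use
    lr_neg = (1 - sens) / spec        # LR- << 1 supports rule-out use
    return sens, spec, lr_pos, lr_neg

# Hypothetical counts: 100 diseased, 100 non-diseased patients.
sens, spec, lr_pos, lr_neg = diagnostic_measures(tp=85, fp=5, fn=15, tn=95)
print(round(spec, 2), round(lr_pos, 1))  # 0.95 17.0
```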

Conclusion

The FNAC test is more likely to be used as a rule-in test for diagnosing breast cancer. Our results contribute to advancing knowledge of the performance of the FNAC test and of the methods to be applied in its evaluation.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01506-y

Assessing transferability in systematic reviews of health economic evaluations – a review of methodological guidance

Abstract

Objective

For assessing cost-effectiveness, Health Technology Assessment (HTA) organisations may use primary economic evaluations (P-HEs) or systematic reviews of health economic evaluations (SR-HEs). A prerequisite for meaningful results of SR-HEs is that the results of existing P-HEs are transferable to the decision context (e.g., the HTA jurisdiction). A particularly pertinent issue is the high variability of costs and resource needs across jurisdictions. Our objective was to review the methods documents of HTA organisations and compare their recommendations on considering transferability in SR-HEs.

Methods

We systematically hand-searched the webpages of 158 HTA organisations for relevant methods documents from 8 January to 31 March 2019. Two independent reviewers performed the searches and selected documents according to pre-defined criteria. One reviewer extracted data into standardised, piloted tables and a second reviewer checked them for accuracy. We synthesised data in tables and narratively.

Results

We identified 155 potentially relevant documents from 63 HTA organisations. Of these, 7 were included in the synthesis. The included organisations have different aims when preparing a SR-HE (e.g. to determine the need for conducting their own P-HE). The recommendations vary regarding the underlying terminology (e.g. transferability/generalisability), the assessment approaches (e.g. structure), the assessment criteria and the integration in the review process.

Conclusion

Only a few HTA organisations address the assessment of transferability in their methodological recommendations for SR-HEs. Transferability considerations relate to different purposes, and the assessment concepts and criteria are heterogeneous. Developing standards for considering transferability in SR-HEs is desirable.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01536-6

Locating and testing the healthy context paradox: examples from the INCLUSIVE trial

Abstract

Background

The healthy context paradox, originally described with respect to school-level bullying interventions, refers to the generation of differences in mental wellbeing amongst those who continue to experience bullying even after interventions successfully reduce victimisation. Using data from the INCLUSIVE trial of restorative practice in schools, we relate this paradox to the need to theorise potential harms when developing interventions; formulate the healthy context paradox in a more general form defined by mediational relationships and cluster-level interventions; and propose two statistical models for testing the healthy context paradox informed by multilevel mediation methods, with relevance to structural and individual explanations for this paradox.

Methods

We estimated two multilevel mediation models with bullying victimisation as the mediator and mental wellbeing as the outcome: one with a school-level interaction between intervention assignment and the mediator; and one with a random slope component for the student-level mediator-outcome relationship predicted by school-level assignment. We relate each of these models to contextual or individual-level explanations for the healthy context paradox.

Results

Neither model suggested that the INCLUSIVE trial represented an example of the healthy context paradox. However, each model has different interpretations which relate to a multilevel understanding of the healthy context paradox.

Conclusions

Greater exploration of intervention harms, especially when those accrue to population subgroups, is an essential step in better understanding how interventions work and for whom. Our proposed tests for the presence of a healthy context paradox provide the analytic tools to better understand how to support development and implementation of interventions that work for all groups in a population.



Detecting the patient’s need for help with machine learning based on expressions

Abstract

Background

Developing machine learning models to support health analytics requires an increased understanding of the statistical properties of the self-rated expression statements used in health-related communication and decision making. To address this, our current research analyzes self-rated expression statements concerning the COVID-19 coronavirus epidemic and, with a new methodology, identifies how statistically significant differences between groups of respondents can be linked to machine learning results.

Methods

We conducted a quantitative cross-sectional study gathering "need for help" ratings for twenty health-related expression statements concerning the coronavirus epidemic on an 11-point Likert scale, along with nine answers about each person's health and wellbeing, sex and age. Respondents (n = 673) were recruited online between 30 May and 3 August 2020 from Finnish patient and disabled people's organizations, other health-related organizations and professionals, and educational institutions. We propose, and experimentally motivate, a new methodology of influence analysis for machine learning, used to evaluate how machine learning results depend on and are influenced by various properties of the data identified with traditional statistical methods.

Results

We found statistically significant Kendall rank-correlations and high cosine similarity values between various health-related expression statement pairs in the "need for help" ratings, and for a background question pair. Using Wilcoxon rank-sum, Kruskal-Wallis and one-way analysis of variance (ANOVA) tests between groups, we identified statistically significant rating differences for several health-related expression statements across groupings based on the answers to background questions, such as differences in the ratings of suspecting and of having the coronavirus infection depending on estimated health condition, quality of life and sex. Our new methodology enabled us to identify how statistically significant rating differences were linked to machine learning results, helping to develop better human-understandable machine learning models.
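The two similarity measures named above can be computed directly; the ratings below are invented examples of two expression statements scored by the same respondents on a 0–10 scale (not the study's data):

```python
import numpy as np
from scipy import stats

# Illustrative sketch: pairwise Kendall rank-correlation and cosine
# similarity between two expression statements' "need for help" ratings.
def cosine_similarity(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

ratings_a = [0, 2, 3, 5, 6, 8, 9, 10]   # statement A, eight respondents
ratings_b = [1, 1, 4, 4, 7, 7, 9, 9]    # statement B, same respondents

tau, p_value = stats.kendalltau(ratings_a, ratings_b)
print(tau > 0.8, cosine_similarity(ratings_a, ratings_b) > 0.9)  # True True
```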

Conclusions

The self-rated "need for help" for health-related expression statements differs statistically significantly depending on the person's background information, such as estimated health condition, quality of life and sex. With our new methodology, statistically significant rating differences can be linked to machine learning results, enabling the development of better machine learning models to identify, interpret and address the patient's needs for well-personalized care.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-021-01502-8

Should samples be weighted to decrease selection bias in online surveys during the COVID-19 pandemic? Data from seven datasets

Abstract

Background

Online surveys have triggered a heated debate regarding their scientific validity. Many authors have adopted weighting methods to enhance the quality of online survey findings, while others have found no advantage in this method. This work aims to compare weighted and unweighted association measures after adjustment for potential confounders, taking into account dataset properties such as the initial gap between the population and the selected sample, the sample size, and the variable types.

Methods

This study assessed seven datasets collected between 2019 and 2021, during the COVID-19 pandemic, through online cross-sectional surveys using the snowball sampling technique. Weighting methods were applied to adjust the online samples to sociodemographic features of the target population.
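The adjustment described can be sketched as simple post-stratification weighting; the sample and population margins below are hypothetical (the study's actual weighting scheme may differ):

```python
# Illustrative sketch of post-stratification weighting: each respondent is
# weighted by the ratio of the population share to the sample share of
# their stratum, so the weighted sample matches the population margins.
def poststrat_weights(sample_strata, population_share):
    n = len(sample_strata)
    sample_share = {s: sample_strata.count(s) / n for s in set(sample_strata)}
    return [population_share[s] / sample_share[s] for s in sample_strata]

# Hypothetical online sample over-representing women: 75% F vs 50% in the
# target population.
sample = ["F"] * 6 + ["M"] * 2
weights = poststrat_weights(sample, {"F": 0.5, "M": 0.5})
print(round(weights[0], 2), weights[-1])  # 0.67 2.0
```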

Results

Despite varying age and gender gaps between weighted and unweighted samples, strong similarities were found for dependent and independent variables. When applied on the same datasets, the regression analysis results showed a high relative difference between methods for some variables, while a low difference was found for others. In terms of absolute impact, the highest impact on the association measure was related to the sample size, followed by the age gap, the gender gap, and finally, the significance of the association between weighted age and the dependent variable.

Conclusion

The results of this analysis of online surveys indicate that weighting methods should be used cautiously, as weighting did not affect the results in some databases, while it did in others. Further research is necessary to define situations in which weighting would be beneficial.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01547-3

Ripple effects mapping: capturing the wider impacts of systems change efforts in public health

Abstract

Background

Systems approaches are currently being advocated and implemented to address complex challenges in Public Health. These approaches work by bringing multi-sectoral stakeholders together to develop a collective understanding of the system, and then to identify places where they can leverage change across the system. Systems approaches are unpredictable: cause and effect cannot always be disentangled, and unintended consequences – positive and negative – frequently arise. Evaluating such approaches is difficult and new methods are warranted.

Methods

Ripple Effects Mapping (REM) is a qualitative method which can capture the wider impacts, and adaptive nature, of a systems approach. Using a case study example from the evaluation of a physical activity-orientated systems approach in Gloucestershire, we: a) introduce the adapted REM method; b) describe how REM was applied in the example; c) explain how REM outputs were analysed; d) provide examples of how REM outputs were used; and e) describe the strengths, limitations, and future uses of REM based on our reflections.

Results

Ripple Effects Mapping is a participatory method that requires the active input of programme stakeholders in data-gathering workshops. It produces visual outputs (i.e., maps) of programme activities and impacts, which are mapped along a timeline to understand the temporal dimension of systems change efforts. The REM outputs in our example were created over several iterations, with data collected every 3–4 months, to build a picture of activities and impacts that had continued or ceased. Workshops took place both in person and online. An inductive content analysis was undertaken to describe and quantify patterns within the REM outputs. Detailed guidance on the preparation, delivery, and analysis of REM is included in this paper.

Conclusion

REM may help to advance our understanding and evaluation of complex systems approaches, especially within the field of Public Health. We therefore invite other researchers, practitioners and policymakers to use REM and continuously evolve the method to enhance its application and practical utility.

Developing a tool to assess the skills to perform a health technology assessment

Abstract

Background

Health technology assessment (HTA) brings together evidence from various disciplines while using explicit methods to assess the value of health technologies. In resource-constrained settings, there is a growing demand to measure and develop specialist skills, including those for HTA, to aid the implementation of Universal Healthcare Coverage. The purpose of this study was twofold: a) to find validated tools for the assessment of the technical capacity to conduct a HTA, and if none were found, to develop a tool, and b) to describe experiences of its pilot.

Methods

First, a mapping review identified tools to assess the skills to conduct a HTA. A medical librarian conducted a comprehensive search in four databases (MEDLINE, Embase, Web of Science, ERIC). Then, incorporating results from the mapping and following an iterative process involving stakeholders and experts, we developed a HTA skills assessment tool. Finally, using an online platform to gather and analyse responses, in collaboration with our institutional partner, we piloted the tool in Ghana, and sought feedback on their experiences.

Results

The database search yielded 3871 records; fifteen of these were selected based on a priori criteria. These records were published between 2003 and 2018, but none covered all the technical skills needed to conduct a HTA. In the absence of an instrument meeting our needs, we developed a HTA skills assessment tool containing four sections (general information, core skills, soft skills, and future needs). The tool was designed to be administered to a broad range of individuals who would potentially contribute to the planning, delivery and evaluation of HTA. It was piloted with twenty-three individuals who completed the skills assessment and shared their initial impressions of the tool.

Conclusions

To our knowledge, this is the first comprehensive tool for assessing the technical skills needed to conduct a HTA. It allows teams to understand where their individual strengths and weaknesses lie. The tool is in the early validation phases and further testing is needed.

https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01562-4

Research monitoring practices in critical care research: a survey of current state and attitudes

Abstract

Background/Aims

In 2016, international standards governing clinical research recommended that the approach to monitoring a research project be based on risk; however, it is unknown whether this approach has been adopted in critical care research in Australia and New Zealand (ANZ). The aims of the project were to: 1) gain an understanding of current research monitoring practices in academic-led clinical trials in the field of critical care research, and 2) describe the perceived barriers and enablers to undertaking research monitoring.

Methods

An electronic survey was distributed to investigators, research co-ordinators and other research staff currently undertaking and supporting academic-led clinical trials in the field of critical care in ANZ.

Results

Of the 118 respondents, 70 were involved in the co-ordination of academic trials; the remaining results pertain to this sub-sample. Fifty-eight (83%) worked in research units associated with hospitals, 29 (41%) were experienced research co-ordinators and 19 (27%) principal investigators; 31 (44%) were primarily associated with paediatric research. Fifty-six (80%) developed monitoring plans, with 33 (59%) of these undertaking a risk assessment; the most commonly reported barrier was lack of expertise. Nineteen (27%) indicated that centralised monitoring was used, noting that technology to support centralised monitoring (45/51; 88%) and support from data managers and statisticians (45/52; 87%) were key enablers. Coronavirus disease 2019 (COVID-19) impacted monitoring for 82% (45/55), increasing remote (25/45; 56%) and reducing onsite (29/45; 64%) monitoring.

Conclusions

Contrary to Good Clinical Practice guidance, risk assessments to inform monitoring plans are not being consistently performed due to lack of experience and guidance. There is an urgent need to enhance risk assessment methodologies and develop technological solutions for centralised statistical monitoring.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01551-7



Estimation of treatment effects in observational stroke care data: comparison of statistical approaches

Abstract

Introduction

Various statistical approaches can be used to deal with unmeasured confounding when estimating treatment effects in observational studies, each with its own pros and cons. This study aimed to compare treatment effects as estimated by different statistical approaches for two interventions in observational stroke care data.

Patients and methods

We used prospectively collected data from the MR CLEAN registry including all patients (n = 3279) with ischemic stroke who underwent endovascular treatment (EVT) from 2014 to 2017 in 17 Dutch hospitals. Treatment effects of two interventions – i.e., receiving an intravenous thrombolytic (IVT) and undergoing general anesthesia (GA) before EVT – on good functional outcome (modified Rankin Scale ≤2) were estimated. We used three statistical regression-based approaches that vary in assumptions regarding the source of unmeasured confounding: individual-level (two subtypes), ecological, and instrumental variable analyses. In the latter, the preference for using the interventions in each hospital was used as an instrument.

Results

Use of IVT (range 66–87%) and GA (range 0–93%) varied substantially between hospitals. For IVT, the individual-level analyses (OR ~ 1.33) yielded significant positive effect estimates, whereas the instrumental variable analysis found no significant treatment effect (OR 1.11; 95% CI 0.58–1.56). The ecological analysis indicated no statistically significant difference in the likelihood (β = − 0.002%; P = 0.99) of good functional outcome at hospitals using IVT 1% more frequently. For GA, the non-significant point estimates of the treatment effect pointed in opposite directions in the individual-level (ORs ~ 0.60) versus the instrumental variable approach (OR = 1.04). The ecological analysis also yielded a non-significant negative association (0.03% lower probability).

Discussion and conclusion

Both the magnitude and the direction of the estimated treatment effects for both interventions depend strongly on the statistical approach, and thus on the assumed source of (unmeasured) confounding. These issues should be understood in relation to the specific characteristics of the data before applying an approach and interpreting its results. Instrumental variable analysis may be considered when unobserved confounding and practice variation are expected in observational multicenter studies.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01590-0

Characterising and justifying sample size sufficiency in interview-based studies: systematic analysis of qualitative health research over a 15-year period

Abstract

Background

Choosing a suitable sample size in qualitative research is an area of conceptual debate and practical uncertainty. That sample size principles, guidelines and tools have been developed to enable researchers to set, and justify the acceptability of, their sample size is an indication that the issue constitutes an important marker of the quality of qualitative research. Nevertheless, research shows that sample size sufficiency reporting is often poor, if not absent, across a range of disciplinary fields.

Methods

A systematic analysis of single-interview-per-participant designs within three health-related journals from the disciplines of psychology, sociology and medicine, over a 15-year period, was conducted to examine whether and how sample sizes were justified and how sample size was characterised and discussed by authors. Data pertinent to sample size were extracted and analysed using qualitative and quantitative analytic techniques.

Results

Our findings demonstrate that provision of sample size justifications in qualitative health research is limited, is not contingent on the number of interviews, and relates to the journal of publication. Sample size was most frequently defended, across all three journals, with reference to the principle of saturation and to pragmatic considerations. Qualitative sample sizes were predominantly – and often without justification – characterised as insufficient (i.e., ‘small’) and discussed in the context of study limitations. Sample size insufficiency was seen to threaten the validity and generalizability of studies’ results, with the latter frequently conceived in nomothetic terms.

Conclusions

We recommend, firstly, that qualitative health researchers be more transparent about evaluations of their sample size sufficiency, situating these within broader and more encompassing assessments of data adequacy. Secondly, we invite researchers to critically consider how saturation parameters found in prior methodological studies and community norms around sample size might best inform, and apply to, their own projects, and to appraise data adequacy with reference to features that are intrinsic to the study at hand. Finally, those reviewing papers have a vital role in supporting and encouraging transparent, study-specific reporting.

Bias amplification in the g-computation algorithm for time-varying treatments: a case study of industry payments and prescription of opioid products

Abstract

Background

It is often challenging to determine which variables need to be included in the g-computation algorithm in the time-varying setting. Conditioning on instrumental variables (IVs) is known to introduce greater bias when there is unmeasured confounding in point-treatment settings, and this is also true for near-IVs, which are only weakly associated with the outcome other than through the treatment. However, it is unknown whether adjusting for (near-)IVs amplifies bias in g-computation algorithm estimators for time-varying treatments compared to estimators that ignore such variables. We therefore aimed to compare the magnitude of bias from adjusting for (near-)IVs across their different relationships with treatments in the time-varying setting.

Methods

After presenting a case study of the association between the receipt of industry payments and physicians’ opioid prescribing rates in the US, we conducted a Monte Carlo simulation to investigate the extent to which bias due to unmeasured confounders is amplified by adjusting for a (near-)IV across several g-computation algorithms.
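The bias-amplification mechanism is easiest to see in a point-treatment analogue; the paper's time-varying g-computation setting is more involved, so the linear sketch below is only illustrative. The true treatment effect is zero, an unmeasured confounder U biases both estimators, and conditioning on a perfect instrument Z inflates that bias; all coefficients are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
z = rng.normal(size=n)                    # instrument: affects treatment only
u = rng.normal(size=n)                    # unmeasured confounder
t = 2.0 * z + u + rng.normal(size=n)      # treatment
y = u + rng.normal(size=n)                # true treatment effect is zero

def ols_coef(y, *cols):
    """OLS coefficients with an intercept prepended."""
    X = np.column_stack((np.ones(len(y)),) + cols)
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_unadj = ols_coef(y, t)[1]       # bias from U alone: about 1/6 here
b_adj_iv = ols_coef(y, t, z)[1]   # conditioning on the IV amplifies bias to ~0.5
```

Adjusting for Z removes the instrument-driven (unconfounded) variation in treatment, leaving the confounded variation to dominate, which is why the adjusted estimate drifts further from the true null.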

Results

In our simulation study, adjusting for a perfect IV of the time-varying treatments in the g-computation algorithm increased bias due to unmeasured confounding, particularly when the IV had a strong relationship with the treatment. Bias also increased when adjusting for a near-IV whose association with the unmeasured confounders between treatment and outcome was very weak relative to its association with the time-varying treatments. In contrast, this bias-amplifying feature was not observed (i.e., bias due to unmeasured confounders decreased) when adjusting for a near-IV with a stronger association with the unmeasured confounders (≥ 0.1 correlation coefficient in our multivariate normal setting).

Conclusion

We recommend avoiding adjustment for a perfect IV in the g-computation algorithm to obtain a less biased estimate of the time-varying treatment effect. On the other hand, including a near-IV in the algorithm may be advisable unless its association with the unmeasured confounders is very weak. These findings can help researchers anticipate the magnitude of bias when adjusting for (near-)IVs and select variables for the g-computation algorithm in the time-varying setting when unmeasured confounding is suspected.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01563-3

Learning from COVID-19 related trial adaptations to inform efficient trial design—a sequential mixed methods study

Abstract

Background

Prior to the COVID-19 pandemic, many clinical trial procedures were undertaken in person; the pandemic forced adaptations to these procedures to enable trials to continue. The aim of this study was to understand whether the adaptations made to clinical trials by UK Clinical Trials Units (CTUs) during the pandemic have the potential to improve the efficiency of trials post-pandemic.

Methods

This was a mixed methods study, initially involving an online survey administered to all registered UK CTUs to identify studies that had made adaptations due to the pandemic. Representatives from selected studies were qualitatively interviewed to explore the adaptations made and their potential to improve the efficiency of future trials. A literature review was undertaken to locate published evidence concerning the investigated adaptations. The findings from the interviews were reviewed by a group of CTU and patient representatives within a workshop, where discussions focused on the potential of the adaptations to improve the efficiency of future trials.

Results

Forty studies were identified by the survey. Fourteen studies were selected and fifteen CTU staff were interviewed about the adaptations. The workshop included 15 CTU and 3 patient representatives. Adaptations were not seen as leading to direct efficiency savings for CTUs. However, three adaptations may have the potential to directly improve efficiency for trial sites and participants beyond the pandemic: a split remote-first eligibility assessment, recruitment outside the NHS via a charity, and remote consent. There was a lack of published evidence to support the former two adaptations; remote consent, however, is widely supported in the literature. Other identified adaptations may offer benefit by improving flexibility for participants. Barriers to using these adaptations include the impact on scientific validity, limitations in the role of the CTU, and participants’ access to technology.

Conclusions

Three adaptations (a split remote-first eligibility assessment, recruitment outside the NHS via a charity, and remote consent) have the potential to improve clinical trials but only one (remote consent) is supported by evidence. These adaptations could be tested in future co-ordinated ‘studies within a trial’ (SWAT).

A flexible approach for variable selection in large-scale healthcare database studies with missing covariate and outcome data

Abstract

Background

Prior work has shown that combining bootstrap imputation with tree-based machine learning variable selection methods can achieve performance comparable to that attainable on fully observed data when covariate and outcome data are missing at random (MAR). This approach, however, is computationally expensive, especially on large-scale datasets.

Methods

We propose an inference-based method, called RR-BART, which leverages the likelihood-based Bayesian machine learning technique, Bayesian additive regression trees, and uses Rubin’s rule to combine the estimates and variances of the variable importance measures on multiply imputed datasets for variable selection in the presence of MAR data. We conduct a representative simulation study to investigate the practical operating characteristics of RR-BART, and compare it with the bootstrap imputation based methods. We further demonstrate the methods via a case study of risk factors for 3-year incidence of metabolic syndrome among middle-aged women using data from the Study of Women’s Health Across the Nation (SWAN).
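Rubin's rule, which RR-BART uses to pool variable importance measures across imputed datasets, can be sketched in a few lines. The estimates and within-imputation variances below are made-up numbers for M = 5 imputations, not values from the paper.

```python
import numpy as np

# illustrative variable-importance estimates and within-imputation
# variances from M = 5 multiply imputed datasets (invented numbers)
q = np.array([0.42, 0.38, 0.45, 0.40, 0.41])
v = np.array([0.004, 0.005, 0.004, 0.006, 0.005])

M = len(q)
qbar = q.mean()                 # pooled point estimate
W = v.mean()                    # within-imputation variance
B = q.var(ddof=1)               # between-imputation variance
T = W + (1 + 1 / M) * B         # Rubin's total variance
se = np.sqrt(T)
```

The total variance T exceeds the average within-imputation variance W whenever the estimates disagree across imputations, which is how missing-data uncertainty propagates into the pooled standard error.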

Results

The simulation study suggests that even under complex conditions of nonlinearity and nonadditivity with a large percentage of missingness, RR-BART can reasonably recover both the prediction and variable selection performance achievable on fully observed data. RR-BART matches the best performance that the bootstrap-imputation-based methods achieve at their optimal selection threshold value. In addition, RR-BART demonstrates a substantially stronger ability to detect discrete predictors. Furthermore, RR-BART offers substantial computational savings. When implemented on the SWAN data, RR-BART adds to the literature by selecting a set of predictors that have been less commonly identified as risk factors but have substantial biological justification.

Conclusion

The proposed variable selection method for MAR data, RR-BART, offers both computational efficiency and good operating characteristics and is utilitarian in large-scale healthcare database studies.

Global prediction model for COVID-19 pandemic with the characteristics of the multiple peaks and local fluctuations

Abstract

Background

With the spread of COVID-19, time-series prediction of the pandemic has become a research hotspot. Unlike previous epidemics, COVID-19 exhibits a new pattern of long time series, large fluctuations, and multiple peaks. Traditional dynamical models are limited to curves with short time series, a single peak, smoothness, and symmetry. Moreover, most of these models contain unknown parameters, which introduce ambiguity and uncertainty. Major shortcomings also remain in integrating multiple factors, such as human interventions, environmental factors, and transmission mechanisms.

Methods

A dynamical model with only infected and removed humans was established. The process of COVID-19 spread was then segmented using a local smoother. The change in infection rate at different stages was quantified using a continuous and periodic logistic growth function to describe the comprehensive effects of natural and human factors. A non-linear variable and NO2 concentrations were then introduced to quantify the number of people prevented from infection through human interventions.
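A minimal sketch of the segmented infection-rate idea: a logistic transition lets the rate fall smoothly between stages, and a bare infected/removed recursion produces a peak once the rate drops below the removal rate. All parameter values are invented for illustration and are far simpler than the paper's full model (no periodicity, no NO2 term).

```python
import numpy as np

def logistic_rate(t, b0, b1, t_mid, k):
    """Infection rate falling smoothly from b0 to b1 around t_mid
    (one logistic 'stage' of a segmented rate curve)."""
    return b1 + (b0 - b1) / (1 + np.exp(k * (t - t_mid)))

gamma = 0.05                  # removal rate (hypothetical)
I, R = [100.0], [0.0]         # infected / removed counts
for t in range(1, 200):
    beta = logistic_rate(t, 0.12, 0.03, 80, 0.1)
    new_I = I[-1] + (beta - gamma) * I[-1]
    R.append(R[-1] + gamma * I[-1])
    I.append(new_I)

I = np.array(I)
peak_day = int(I.argmax())    # peak occurs where beta(t) crosses gamma
```

Chaining several such logistic stages, each with its own midpoint and steepness, is what lets a model of this family reproduce multiple peaks and local fluctuations.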

Results

The experiments and analysis showed that the R2 of the fit for the US, UK, India, Brazil, Russia, and Germany was 0.841, 0.977, 0.974, 0.659, 0.992, and 0.753, respectively. The prediction accuracy for the US, UK, India, Brazil, Russia, and Germany in October was 0.331, 0.127, 0.112, 0.376, 0.043, and 0.445, respectively.

Conclusion

The model can not only better describe the effects of human interventions but also better simulate the temporal evolution of COVID-19 with local fluctuations and multiple peaks, providing valuable decision-support information.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01604-x

A systematic review of methods to estimate colorectal cancer incidence using population-based cancer registries

Abstract

Background

Epidemiological studies of incidence play an essential role in quantifying disease burden, resource planning, and informing public health policies. A variety of measures for estimating cancer incidence have been used. Appropriate reporting of incidence calculations is essential to enable clear interpretation. This review uses colorectal cancer (CRC) as an exemplar to summarize and describe variation in commonly employed incidence measures and evaluate the quality of reporting incidence methods.

Methods

We searched four databases for CRC incidence studies published between January 2010 and May 2020. Two independent reviewers screened all titles and abstracts. Eligible studies were population-based cancer registry studies evaluating CRC incidence. We extracted data on study characteristics and author-defined criteria for assessing the quality of reporting incidence. We used descriptive statistics to summarize the information.

Results

This review retrieved 165 relevant articles. The age-standardized incidence rate (ASR) was the most commonly reported incidence measure (80%), and the 2000 U.S. standard population was the most commonly used reference population (39%). Slightly more than half (54%) of the studies reported CRC incidence stratified by anatomical site. The quality of reporting incidence methods was suboptimal. Of all included studies: 45 (27%) failed to report the classification system used to define CRC; 63 (38%) did not report CRC codes; and only 20 (12%) documented excluding certain CRC cases from the numerator. Concerning denominator estimation: 61% of studies failed to state the source of population data; 24 (15%) indicated census years; 10 (6%) reported the method used to estimate yearly population counts; and only 5 (3%) explicitly explained the population size estimation procedure used to calculate the overall average incidence rate. Thirty-three (20%) studies reported confidence intervals for incidence, and only 7 (4%) documented methods for dealing with missing data.
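For reference, the age-standardized rate reported by most of the reviewed studies is a weighted sum of age-specific rates, with weights taken from a standard population. The sketch below uses hypothetical age-band counts and weights, not data from any reviewed study.

```python
# hypothetical age-band data: cases, person-years at risk, and
# standard-population weights (studies most often used the 2000
# U.S. standard population for the weights)
cases = [5, 40, 180, 300]
person_years = [200_000, 150_000, 100_000, 50_000]
std_weights = [0.4, 0.3, 0.2, 0.1]   # proportions; must sum to 1

rates = [c / p for c, p in zip(cases, person_years)]   # age-specific rates
asr = sum(w * r for w, r in zip(std_weights, rates))   # per person-year
asr_per_100k = asr * 100_000
```

Clear reporting requires stating exactly these ingredients: the case definition behind the numerator, the source of the person-year denominators, and the standard population supplying the weights.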

Conclusion

This review identified variations in incidence calculation and inadequate reporting of methods. We outlined recommendations to optimize incidence estimation and reporting practices. There is a need to establish clear guidelines for incidence reporting to facilitate assessment of the validity and interpretation of reported incidence.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01632-7

Using observational study data as an external control group for a clinical trial: an empirical comparison of methods to account for longitudinal missing data

Abstract

Background

Observational data are increasingly being used to conduct external comparisons to clinical trials. In this study, we empirically examined whether different methodological approaches to longitudinal missing data affected study conclusions in this setting.

Methods

We used data from one clinical trial and one prospective observational study, both Norwegian multicenter studies including patients with recently diagnosed rheumatoid arthritis and implementing similar treatment strategies, but with different stringency. A binary disease remission status was defined at 6, 12, and 24 months in both studies. After identifying patterns of longitudinal missing outcome data, we evaluated the following five approaches to handle missingness: analyses of patients with complete follow-up data, multiple imputation (MI), inverse probability of censoring weighting (IPCW), and two combinations of MI and IPCW.
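Inverse probability of censoring weighting, one of the approaches compared above, can be sketched as follows: patients are weighted by the inverse of their probability of remaining observed, so that the weighted complete cases stand in for the full cohort. For clarity the sketch uses known (rather than modelled) observation probabilities and invented parameters; in practice the probabilities would be estimated, e.g. by logistic regression on baseline covariates.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
x = rng.integers(0, 2, n)                                 # baseline covariate
y = (rng.uniform(size=n) < 0.6 - 0.2 * x).astype(float)   # remission (1/0)

# probability of remaining observed depends on x, so the complete-case
# mean is biased; here the probabilities are known for illustration
p_obs = np.where(x == 1, 0.5, 0.9)
observed = rng.uniform(size=n) < p_obs

cc_mean = y[observed].mean()                    # complete-case estimate
w = 1.0 / p_obs[observed]                       # inverse probability weights
ipcw_mean = np.average(y[observed], weights=w)  # IPCW estimate (true mean 0.5)
```

Because patients with the covariate are censored more often and also remit less often, the complete-case mean is inflated; reweighting restores the full-cohort remission rate.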

Results

We found a complex non-monotone missing data pattern in the observational study (N = 328), while missing data in the trial (N = 188) was monotone due to drop-out. In the observational study, only 39.0% of patients had complete outcome data, compared to 89.9% in the trial. All approaches to missing data indicated favorable outcomes of the treatment strategy in the trial and resulted in similar study conclusions. Variations in results across approaches were mainly due to variations in estimated outcomes for the observational data.

Conclusions

Five different approaches to handle longitudinal missing data resulted in similar conclusions in our example. However, the extent and complexity of missing observational data affected estimated comparative outcomes across approaches, highlighting the need for careful consideration of methods to account for missingness in this setting. Based on this empirical examination, we recommend using a prespecified advanced missing data approach to account for longitudinal missing data, and to conduct alternative approaches in sensitivity analyses.


https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-022-01639-0

Estimating risk ratio from any standard epidemiological design by doubling the cases

Abstract

Background

Despite the ease of interpretation and communication of a risk ratio (RR), and several other advantages in specific settings, the odds ratio (OR) is more commonly reported in epidemiological and clinical research. This is due to the familiarity of the logistic regression model for estimating adjusted ORs from data gathered in cross-sectional, cohort, or case-control designs. The preservation of the OR (but not the RR) in case-control samples has contributed to the perception that it is the only valid measure of relative risk from case-control samples. For cohort or cross-sectional data, a method known as ‘doubling the cases’ provides valid estimates of the RR, and an expression for a robust standard error has been derived, but these are not available in statistical software packages.

Methods

In this paper, we first describe the doubling-of-cases approach in the cohort setting and then extend its application to case-control studies by incorporating sampling weights and deriving an expression for a robust standard error. The performance of the estimator is evaluated using simulated data, and its application illustrated in a study of neonatal jaundice. We provide an R package that implements the method for any standard design.
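The doubling-of-cases device itself is simple: append a copy of every case with the outcome recoded to 0, after which the odds of being a case in the expanded data equal the risk in the original data, so a logistic-regression OR estimates the RR. The sketch below shows only the point estimate for a single binary exposure (via the equivalent 2×2 cross-product ratio); as the paper notes, valid standard errors require a robust variance estimator, which is omitted here, and the simulated cohort is invented.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x = rng.integers(0, 2, n)                     # binary exposure
risk = np.where(x == 1, 0.10, 0.05)           # true risk ratio = 2.0
y = (rng.uniform(size=n) < risk).astype(int)

# double the cases: every case re-appears once with outcome recoded to 0,
# so odds of "case" in the expanded data equal the risk in the original
x2 = np.concatenate([x, x[y == 1]])
y2 = np.concatenate([y, np.zeros((y == 1).sum(), dtype=int)])

# for a single binary exposure the logistic-regression OR reduces to the
# 2x2 cross-product ratio on the expanded data
a = ((x2 == 1) & (y2 == 1)).sum(); b = ((x2 == 1) & (y2 == 0)).sum()
c = ((x2 == 0) & (y2 == 1)).sum(); d = ((x2 == 0) & (y2 == 0)).sum()
rr_doubled = (a * d) / (b * c)

rr_crude = y[x == 1].mean() / y[x == 0].mean()   # direct risk ratio
```

With covariates, the same data expansion followed by an ordinary logistic regression yields an adjusted RR; the duplicated records are why a sandwich (robust) variance is needed.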

Results

Our work illustrates that the doubling-of-cases approach for estimating an adjusted RR from cross-sectional or cohort data can also yield valid RR estimates from case-control data. The approach is straightforward to apply, involving simple modification of the data followed by logistic regression analysis. The method performed well for case-control data from simulated cohorts with a range of prevalence rates. In the application to neonatal jaundice, the RR estimates were similar to those from relative risk regression, whereas the OR from naive logistic regression overestimated the RR despite the low prevalence of the outcome.

Conclusions

By providing an R package that estimates an adjusted RR from cohort, cross-sectional or case-control studies, we have enabled the method to be easily implemented with familiar software, so that investigators are not limited to reporting an OR and can examine the RR when it is of interest.

Automating risk of bias assessment in systematic reviews: a real-time mixed methods comparison of human researchers to a machine learning system

Abstract

Background

Machine learning and automation are increasingly used to make the evidence synthesis process faster and more responsive to policymakers’ needs. In systematic reviews of randomized controlled trials (RCTs), risk of bias assessment is a resource-intensive task that typically requires two trained reviewers. One function of RobotReviewer, an off-the-shelf machine learning system, is an automated risk of bias assessment.

Methods

We assessed the feasibility of adopting RobotReviewer within a national public health institute using a randomized, real-time, user-centered study. The study included 26 RCTs and six reviewers from two projects examining health and social interventions. We randomized these studies to one of two RobotReviewer platforms. We operationalized feasibility as accuracy, time use, and reviewer acceptability. We measured accuracy by the number of corrections made by human reviewers (either to automated assessments or another human reviewer’s assessments). We explored acceptability through group discussions and individual email responses after presenting the quantitative results.

Results

Reviewers were as likely to accept judgements by RobotReviewer as each other’s judgements during the consensus process when measured dichotomously; risk ratio 1.02 (95% CI 0.92 to 1.13; p = 0.33). We were not able to compare time use. Acceptability of the program among researchers was mixed: less experienced reviewers were generally more positive, saw more benefits, and were able to use the tool more flexibly. Reviewers positioned human input and human-to-human interaction as superior to even a semi-automation of this process.

Conclusion

Despite being presented with evidence of RobotReviewer’s equal performance to humans, participating reviewers were not interested in modifying standard procedures to include automation. If further studies confirm equal accuracy and reduced time compared to manual practices, we suggest that the benefits of RobotReviewer may support its future implementation as one of two assessors, despite reviewer ambivalence. Future research should study barriers to adopting automated tools and how highly educated and experienced researchers can adapt to a job market that is increasingly challenged by new technologies.

A progressive three-state model to estimate time to cancer: a likelihood-based approach

Abstract

Background

To optimize colorectal cancer (CRC) screening and surveillance, information regarding the time-dependent risk of advanced adenomas (AA) to develop into CRC is crucial. However, since AA are removed after diagnosis, the time from AA to CRC cannot be observed in an ethically acceptable manner. We propose a statistical method to indirectly infer this time in a progressive three-state disease model using surveillance data.

Methods

Sixteen models were specified, with and without covariates. Parameters of the parametric time-to-event distributions from the adenoma-free state (AF) to AA and from AA to CRC were estimated simultaneously, by maximizing the likelihood function. Model performance was assessed via simulation. The methodology was applied to a random sample of 878 individuals from a Norwegian adenoma cohort.
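A simulation-flavoured sketch of the progressive three-state idea, with invented Weibull parameters (not those estimated from the Norwegian cohort): transition times AF→AA and AA→CRC are drawn separately, only their sum would be indirectly observable (since AA are removed at diagnosis), and cumulative incidence from AA onset falls out directly.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
# invented Weibull(shape, scale) transition times, in years
t_af_aa = 30.0 * rng.weibull(1.5, n)     # adenoma-free -> advanced adenoma
t_aa_crc = 25.0 * rng.weibull(1.2, n)    # advanced adenoma -> CRC

# only the sum is (indirectly) observable in surveillance data,
# which is why the paper estimates both distributions jointly
t_af_crc = t_af_aa + t_aa_crc

# cumulative incidence of CRC within 5 / 10 years of AA onset
ci5 = (t_aa_crc <= 5).mean()
ci10 = (t_aa_crc <= 10).mean()
```

The likelihood-based method in the paper works in the reverse direction: it maximizes a likelihood built from this convolution of the two transition-time distributions to recover their parameters from screening and surveillance observations.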

Results

Estimates of the parameters of the time distributions are consistent, and the 95% confidence intervals (CIs) have good coverage. For the Norwegian sample (AF: 78%, AA: 20%, CRC: 2%), a Weibull model for both transition times was selected as the final model based on information criteria. Among individuals who transitioned to CRC within 50 years of AA onset, the mean time from AA to CRC was estimated to be 4.80 years (95% CI: 0; 7.61). The 5-year and 10-year cumulative incidence of CRC from AA was 13.8% (95% CI: 7.8%; 23.8%) and 15.4% (95% CI: 8.2%; 34.0%), respectively.

Conclusions

The time-dependent risk from AA to CRC is crucial to explain differences in the outcomes of microsimulation models used for the optimization of CRC prevention. Our method allows for improving models by the inclusion of data-driven time distributions.



The effectiveness of hand hygiene interventions for preventing community transmission or acquisition of novel coronavirus or influenza infections: a systematic review

Abstract

Background

Novel coronaviruses and influenza can cause infection, epidemics, and pandemics. Improving hand hygiene (HH) of the general public is recommended for preventing these infections. This systematic review examined the effectiveness of HH interventions for preventing transmission or acquisition of such infections in the community.

Methods

PubMed, MEDLINE, CINAHL and Web of Science databases were searched (January 2002–February 2022) for empirical studies related to HH in the general public and to the acquisition or transmission of novel coronavirus infections or influenza. Studies on healthcare staff, and studies with outcomes of compliance or absenteeism, were excluded. Study selection, data extraction and quality assessment, using the Cochrane Effective Practice and Organization of Care risk of bias criteria or Joanna Briggs Institute Critical Appraisal checklists, were conducted by one reviewer, and double-checked by another. For intervention studies, effect estimates were calculated, while the remaining studies were synthesised narratively. The protocol was pre-registered (PROSPERO 2020: CRD42020196525).

Results

Twenty-two studies were included. Six were intervention studies evaluating the effectiveness of HH education and provision of products, or hand washing against influenza. Only two school-based interventions showed a significant protective effect (OR: 0.64; 95% CI 0.51, 0.80 and OR: 0.40; 95% CI 0.22, 0.71), with risk of bias being high (n = 1) and unclear (n = 1). Of the 16 non-intervention studies, 13 reported the protective effect of HH against influenza, SARS or COVID-19 (P < 0.05), but risk of bias was high (n = 7), unclear (n = 5) or low (n = 1). However, evidence in relation to when, and how frequently, HH should be performed was inconsistent.

Conclusions

To our knowledge, this is the first systematic review of the effectiveness of HH for prevention of community transmission or acquisition of respiratory viruses that have caused epidemics or pandemics, including SARS-CoV-1, SARS-CoV-2 and influenza viruses. The evidence supporting the protective effect of HH was heterogeneous and limited by methodological quality, and thus insufficient to recommend changes to current HH guidelines. Future work is required to identify in what circumstances, how frequently and what product should be used when performing HH in the community, and to develop effective interventions for promoting these specific behaviours in communities during epidemics.


https://bmcpublichealth.biomedcentral.com/articles/10.1186/s12889-022-13667-y

Machine learning approach for the prediction of 30-day mortality in patients with sepsis-associated encephalopathy

Abstract

Objective

Our study aimed to identify predictors as well as develop machine learning (ML) models to predict the risk of 30-day mortality in patients with sepsis-associated encephalopathy (SAE).

Materials and methods

ML models were developed and validated using the public Medical Information Mart for Intensive Care (MIMIC)-IV database. Models were compared by area under the curve (AUC), accuracy, sensitivity, specificity, positive and negative predictive values, and the Hosmer–Lemeshow goodness-of-fit test.
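The discrimination metrics used for model comparison can be computed without any modelling machinery; the sketch below evaluates AUC (via the rank-sum formulation) plus sensitivity and specificity at a 0.5 threshold on toy predictions, not on MIMIC-IV data.

```python
import numpy as np

# toy predicted probabilities and labels (illustrative only)
p = np.array([0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.2, 0.1])
y = np.array([1,   1,   0,   1,   0,   1,    0,   0])

# AUC via the rank-sum (Mann-Whitney) formulation: the probability
# that a random positive is ranked above a random negative
order = p.argsort()
ranks = np.empty(len(p))
ranks[order] = np.arange(1, len(p) + 1)
n1, n0 = y.sum(), (1 - y).sum()
auc = (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

# sensitivity / specificity at a 0.5 classification threshold
pred = (p >= 0.5).astype(int)
sens = ((pred == 1) & (y == 1)).sum() / n1
spec = ((pred == 0) & (y == 0)).sum() / n0
```

(The rank-sum form assumes no tied predictions; ties would need midranks.)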

Results

Of 6994 patients from MIMIC-IV included in the final cohort, a total of 1232 (17.62%) died following SAE. Recursive feature elimination (RFE) selected 15 variables, including acute physiology score III (APSIII), Glasgow coma score (GCS), sepsis related organ failure assessment (SOFA), Charlson comorbidity index (CCI), red blood cell volume distribution width (RDW), blood urea nitrogen (BUN), age, respiratory rate, PaO2, temperature, lactate, creatinine (CRE), malignant cancer, metastatic solid tumor, and platelet (PLT). In the validation cohort, all ML approaches had higher discriminative ability than the bagged trees (BT) model, although the difference was not statistically significant. Furthermore, in terms of calibration performance, the artificial neural network (NNET), logistic regression (LR), and adaptive boosting (Ada) models were well calibrated, with P-values of 0.831, 0.119, and 0.129, respectively.

Conclusions

The ML models, as demonstrated by our study, can be used to evaluate the prognosis of SAE patients in the intensive care unit (ICU). An online calculator could facilitate the sharing of these predictive models.

Machine learning is an effective method to predict the 90-day prognosis of patients with transient ischemic attack and minor stroke

Abstract

Objective

We aimed to investigate factors related to poor 90-day prognosis (mRS ≥ 3) in patients with transient ischemic attack (TIA) or minor stroke, to construct 90-day poor-prognosis prediction models for these patients, and to compare the predictive performance of machine learning models with that of the logistic regression model.

Method

We selected TIA and minor stroke patients from a prospective registry study (CNSR-III). Demographic characteristics, smoking history, drinking history (≥ 20 g/day), physiological data, medical history, secondary prevention treatment, in-hospital evaluation and education, laboratory data, neurological severity, mRS score, and TOAST classification were assessed. Patients were randomly divided into training and test sets in a 70:30 ratio. Univariate and multivariate logistic regression analyses were performed in the training set to identify predictors associated with poor outcome (mRS ≥ 3). The predictors were used to build machine learning models and a traditional logistic regression model; the training set was used to construct the prediction models, and the test set was used to evaluate them. Model evaluation indicators included the area under the curve (AUC) for discrimination and the Brier score (or calibration plot) for calibration.
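The evaluation pipeline described above (70:30 split, discrimination, calibration) can be sketched end-to-end on synthetic data; the logistic model is fitted by Newton's method so the example needs only NumPy, and the event rate and coefficients are invented rather than taken from CNSR-III.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(-2.0 + 1.0 * x)))      # invented risk model
y = (rng.uniform(size=n) < p_true).astype(float)

idx = rng.permutation(n)                          # 70:30 random split
train, test = idx[:7000], idx[7000:]

def fit_logit(x, y, iters=25):
    """Logistic regression (intercept + slope) via Newton's method."""
    X = np.column_stack([np.ones_like(x), x])
    b = np.zeros(2)
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ b))
        b += np.linalg.solve((X * (p * (1 - p))[:, None]).T @ X,
                             X.T @ (y - p))
    return b

b = fit_logit(x[train], y[train])
p_hat = 1 / (1 + np.exp(-(b[0] + b[1] * x[test])))
brier = np.mean((p_hat - y[test]) ** 2)           # calibration index
```

The Brier score is the test-set mean squared difference between predicted probability and observed outcome; lower is better, and a well-calibrated model approaches the irreducible outcome variance.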

Result

A total of 10,967 patients with TIA or minor stroke were enrolled in this study, with an average age of 61.77 ± 11.18 years; women accounted for 30.68%. Factors associated with poor prognosis included sex, age, stroke history, heart rate, D-dimer, creatinine, TOAST classification, admission mRS, discharge mRS, and discharge NIHSS score. All models, whether constructed by logistic regression or by machine learning, performed well in predicting 90-day poor prognosis (AUC > 0.800). The best-performing model in the test set was CatBoost (AUC = 0.839), followed by the XGBoost, GBDT, random forest, and AdaBoost models (AUCs of 0.838, 0.835, 0.832, and 0.823, respectively). CatBoost and XGBoost predicted 90-day poor prognosis better than the logistic regression model, and the difference was statistically significant (P < 0.05). All models, whether constructed by logistic regression or by machine learning, had good calibration.

Conclusion

Machine learning algorithms were not inferior to the logistic regression model in predicting poor 90-day prognosis in patients with TIA or minor stroke. Among them, the CatBoost model had the best predictive performance, and all models provided good discrimination.

Nested and multipart prospective observational studies, flaming fiasco or efficiently economical?: The Brain, Bone, Heart case study

Abstract

Background

Collecting new data from cross-sectional/survey and cohort observational study designs can be expensive and time-consuming. Nested (hierarchically cocooned within an existing parent study) and/or Multipart (≥ 2 integrally interlinked projects) study designs can expand the scope of a prospective observational research program beyond what might otherwise be possible with available funding and personnel. The Brain, Bone, Heart (BBH) study provides an exemplary case to describe the real-world advantages, challenges, considerations, and insights from these complex designs.

Main

BBH is a Nested, Multipart study conducted by the Specialized Center for Research Excellence (SCORE) on Sex Differences at Emory University. BBH is designed to examine whether estrogen insufficiency-induced inflammation compounds HIV-induced inflammation, leading to end-organ damage and aging-related co-morbidities affecting the neuro-hypothalamic–pituitary–adrenal axis (brain), musculoskeletal (bone), and cardiovascular (heart) organ systems. Using BBH as a real-world case study, we describe the advantages and challenges of Nested and Multipart prospective cohort study design in practice. While excessive dependence on its parent study can pose challenges in a Nested study, there are significant advantages to the study design as well. These include the ability to leverage a parent study’s resources and personnel; more comprehensive data collection and data sharing options; a broadened community of researchers for collaboration; dedicated longitudinal research participants; and access to historical data. Multipart, interlinked studies that share a common cohort of participants and pool of resources have the advantage of dedicated key personnel and the challenge of increased organizational complexity. Important considerations for each study design include the stability and administration of the parent study (Nested) and the cohesiveness of linkage elements and staff organizational capacity (Multipart).

Conclusion

Using the experience of BBH as an example, Nested and/or Multipart study designs have both distinct advantages and potential vulnerabilities that warrant consideration and require strong biostatistics and data management leadership to optimize programmatic success and impact.

Sample size recalculation based on the prevalence in a randomized test-treatment study

Abstract

Background

Randomized test-treatment studies aim to evaluate the clinical utility of diagnostic tests by providing evidence on their impact on patient health. However, the sample size calculation is affected by several factors involved in the test-treatment pathway, including the prevalence of the disease. Sample size planning therefore rests on assumptions that carry considerable uncertainty, which may need to be compensated for by adjusting prospectively specified study parameters during the course of the study.

Method

An adaptive design with a blinded sample size recalculation in a randomized test-treatment study based on the prevalence is proposed and evaluated by a simulation study. The results of the adaptive design are compared to those of the fixed design.
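
To make the recalculation step concrete, the following is a minimal sketch (not the authors' actual simulation code) of a blinded sample size recalculation driven by the prevalence. It assumes a simple setting in which the test-treatment effect is confined to diseased patients, so each arm's overall event rate is a prevalence-weighted mixture, and the per-arm sample size comes from the standard two-proportion z-test formula. All numerical values (prevalence 0.30 at planning, 0.20 at the blinded interim, and the event rates) are illustrative assumptions, not figures from the study.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(prevalence, p_event_dis_ctrl, p_event_dis_exp,
              p_event_nondis, alpha=0.05, power=0.8):
    """Per-arm sample size for a two-proportion z-test when the
    test-treatment effect applies only to diseased patients, so each
    arm's event rate is a mixture weighted by the disease prevalence."""
    # Arm-level event rates: diseased and non-diseased strata mixed by prevalence.
    p_ctrl = prevalence * p_event_dis_ctrl + (1 - prevalence) * p_event_nondis
    p_exp = prevalence * p_event_dis_exp + (1 - prevalence) * p_event_nondis
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_b = NormalDist().inv_cdf(power)
    num = (z_a + z_b) ** 2 * (p_ctrl * (1 - p_ctrl) + p_exp * (1 - p_exp))
    return ceil(num / (p_ctrl - p_exp) ** 2)

# Planning stage: assumed prevalence of 0.30 (hypothetical).
n_planned = n_per_arm(0.30, 0.50, 0.30, 0.10)

# Blinded interim: prevalence re-estimated from pooled data across both
# arms (say 0.20); the assumed treatment effect within the diseased
# stratum is left untouched, which is what keeps the recalculation blinded.
n_recalculated = n_per_arm(0.20, 0.50, 0.30, 0.10)
```

A lower-than-assumed prevalence dilutes the arm-level effect, so the recalculated sample size is larger than the planned one; a fixed design would simply be underpowered in that scenario.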

Results

The adaptive design achieves the desired theoretical power, under the assumption that all other nuisance parameters have been specified correctly, whereas incorrect assumptions about the prevalence may lead to an over- or underpowered study in the fixed design. The empirical type I error rate is adequately controlled in both the adaptive and the fixed design.

Conclusion

Incorporating a blinded sample size recalculation already at the planning stage may be advisable in order to increase the likelihood of study success and improve the conduct of the study. However, the application of the method is subject to a number of limitations associated with the study design in terms of feasibility, the sample sizes that need to be achieved, and the fulfillment of necessary prerequisites.


