Paper 40: What do adult outpatients included in clinical trials know about the investigational drugs being assessed: A cross-sectional study in France.

Author

Lee Jones - Senior Biostatistician - Statistical Review

Published

April 5, 2026

References

Fronteau C, Paré M, Benoit P, Tollec S, Hamon C, Schwiertz V, et al. (2019) What do adult outpatients included in clinical trials know about the investigational drugs being assessed: A cross-sectional study in France. PLoS ONE 14(8): e0220383. https://doi.org/10.1371/journal.pone.0220383

Disclosure

This reproducibility project was conducted to the best of our ability, with careful attention to statistical methods and assumptions. The research team comprises four senior biostatisticians (three of whom are accredited), with 20 to 30 years of experience in statistical modelling and analysis of healthcare data. While statistical assumptions play a crucial role in analysis, their evaluation is inherently subjective, and contextual knowledge can influence judgements about the importance of assumption violations. Differences in interpretation may arise among statisticians and researchers, leading to reasonable disagreements about methodological choices.

Our approach aimed to reproduce published analyses as faithfully as possible, using the details provided in the original papers. We acknowledge that other statisticians may have differing success in reproducing results due to variations in data handling and implicit methodological choices not fully described in publications. However, we maintain that research articles should contain sufficient detail for any qualified statistician to reproduce the analyses independently.

Methods used in our reproducibility analyses

There were two parts to our study. First, 100 articles published in PLOS ONE were randomly selected from the health domain and sent for post-publication peer review by statisticians. Of these, 95 included linear regression analyses and were therefore assessed for reporting quality. The statisticians evaluated what was reported, including regression coefficients, 95% confidence intervals, and p-values, as well as whether model assumptions were described and how those assumptions were evaluated. This report provides a brief summary of the initial statistical review.

The second part of the study involved reproducing linear regression analyses for papers with available data to assess both computational and inferential reproducibility. All papers were initially assessed for data availability, and the statistical software used. From those with accessible data, the first 20 papers (from the original random sample) were evaluated for computational reproducibility. Within each paper, individual linear regression models were identified and assigned a unique number. A maximum of three models per paper were selected for assessment. When more than three models were reported, priority was given to the final model or the primary models of interest as identified by the authors; any remaining models were selected at random.

To assess computational reproducibility, differences between the original and reproduced results were evaluated using absolute discrepancies and rounding error thresholds, tailored to the number of decimal places reported in each paper. Results for each reported statistic, e.g., regression coefficient, were categorised as Reproduced, Incorrect Rounding, or Not Reproduced, depending on how closely they matched the original values. Each paper was then classified as Reproduced, Mostly Reproduced, Partially Reproduced, or Not Reproduced. The mostly reproduced category included cases with minor rounding or typographical errors, whereas partially reproduced indicated substantial errors were observed, but some results were reproduced.
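The categorisation of individual statistics can be sketched as follows. The exact thresholds used in the review are not stated in this report, so the rule below is an assumption made for illustration: a value counts as Reproduced when the reproduced estimate rounds to the reported value at the reported number of decimal places, as Incorrect Rounding when it differs by no more than one unit in the last reported decimal place, and as Not Reproduced otherwise. The function name `classify_statistic` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def classify_statistic(reported: str, reproduced: float) -> str:
    """Compare a statistic as printed in a paper with a reproduced value."""
    original = Decimal(reported)
    # One unit in the last reported decimal place, e.g. 0.01 for "1.49".
    unit = Decimal(1).scaleb(original.as_tuple().exponent)
    value = Decimal(str(reproduced))
    # Reproduced: the value rounds to exactly what the paper printed.
    if value.quantize(unit, rounding=ROUND_HALF_UP) == original:
        return "Reproduced"
    # Assumed tolerance: at most one unit in the last reported decimal place.
    if abs(value - original) <= unit:
        return "Incorrect Rounding"
    return "Not Reproduced"


# Reported B = 1.49 against a reproduced B = 1.4396 (values from Model 1 below).
print(classify_statistic("1.49", 1.4396))
```

Passing the reported value as a string preserves the number of decimal places printed in the paper, which is what the threshold is tailored to.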

For models deemed at least partially computationally reproducible, inferential reproducibility was further assessed by examining whether statistical assumptions were met and by conducting sensitivity analyses, including bootstrapping where appropriate. We examined changes in standardized regression coefficients, which reflect the change in the outcome (in standard deviation units) for a one standard deviation increase in the predictor. Meaningful differences were defined as a relative change of 10% or more, or absolute differences of 0.1 (moderate) and 0.2 (substantial). When non-linear relationships were identified, inferential reproducibility was assessed by comparing model fit measures, including R², Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC). When the Gaussian distribution was not appropriate for the dependent variable, alternative distributions were considered, and model fit was evaluated using AIC and BIC.
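As a rough illustration of these thresholds, the sketch below compares a standardized coefficient from an original model with one from a sensitivity analysis. How the relative and absolute criteria were combined in the review is not specified here, so the sketch simply reports both; the function name `classify_change` and the example values are hypothetical.

```python
def classify_change(beta_original: float, beta_sensitivity: float) -> dict:
    """Flag changes in a standardized coefficient against the stated thresholds."""
    absolute = abs(beta_sensitivity - beta_original)
    if beta_original != 0:
        relative = absolute / abs(beta_original)
    else:
        # Relative change is undefined for a zero baseline; treat any change as infinite.
        relative = float("inf") if absolute > 0 else 0.0

    if absolute >= 0.2:
        size = "substantial"
    elif absolute >= 0.1:
        size = "moderate"
    else:
        size = "none"

    return {
        "relative_change": relative,
        "relative_flag": relative >= 0.10,  # relative change of 10% or more
        "absolute_size": size,              # 0.1 = moderate, 0.2 = substantial
    }


# Hypothetical standardized coefficients, not taken from any paper.
print(classify_change(0.50, 0.65))
```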

Results from the reproduction of the Fronteau et al. (2019) paper are presented below. An overall summary of results is presented first, followed by model-specific results organised within tab panels. Within each panel, the Original results tab displays the linear regression outputs extracted from the published paper. The Reproduced results tab presents estimates derived from the authors’ shared data, along with a comprehensive assessment of linear regression assumptions. The Differences tab compares the original and reproduced models to assess computational reproducibility. Finally, the Sensitivity analysis tab evaluates inferential reproducibility by examining whether identified assumption violations meaningfully affected the results.

Summary from statistical review

This paper explored medication knowledge in adult outpatients included in clinical trials. It was an observational study that used linear regression with backward selection to assess medication understanding. Model assumptions, outliers and collinearity were not discussed, and interpretations were based solely on p-values.

Data availability and software used

The authors provided data in a wide-format Excel file with an accompanying data dictionary. SAS was used to fit the linear regression models.

Regression sample

One multivariable linear regression was reported in the results and was used to assess reproducibility. The outcome variable was medication understanding. Backward selection was used, and the final model included the following predictors: patient’s disease, clinical trial phase, number of investigational medication products dispensed, drug name on the prescription, and education.

Computational reproducibility results

This paper could not be computationally reproduced. The data dictionary was provided in a different language from the dataset, making it challenging to interpret the variables. Although the paper appeared to be well reported, with variable reference levels clearly stated, the data were supplied in their original coding and required recoding prior to analysis.

Four of the five independent variables could be identified. However, the variable drug name on the prescription could not be unambiguously matched in the dataset. This variable was reported as binary (Yes/No), and the data dictionary suggested that it might correspond to Etiq_2Criteres (Prescription_2Criteria). While the frequencies were similar, they did not match exactly: the paper reported No = 40, Yes = 187, with 9 missing values, whereas the dataset contained No = 36, Yes = 199, with 1 missing value. It is therefore unclear whether this was the correct variable or whether additional data cleaning or derivation steps were required. Nevertheless, this variable was used as the best available approximation for the attempted reproduction.
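The frequency comparison above can be laid out explicitly. The short sketch below uses only the counts quoted in the preceding paragraph; the dictionary names are illustrative.

```python
# Counts quoted above for "drug name on the prescription".
reported = {"No": 40, "Yes": 187, "Missing": 9}   # as published in the paper
dataset = {"No": 36, "Yes": 199, "Missing": 1}    # Etiq_2Criteres in the shared file

# Difference per category (dataset minus paper).
difference = {level: dataset[level] - reported[level] for level in reported}

# Check whether both sources describe the same number of participants.
same_total = sum(reported.values()) == sum(dataset.values())

print(difference)
print(sum(reported.values()), sum(dataset.values()), same_total)
```

Both sets of counts sum to 236, so the mismatch lies in how participants are distributed across the categories rather than in the overall sample size. This is consistent with an undocumented cleaning or derivation step, although it does not confirm one.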

Inferential reproducibility results

As this model was not computationally reproducible, inferential reproducibility was not assessed: without a reproduced model, the statistical assumptions could not be meaningfully evaluated or compared.

Recommended changes

The authors should update the shared dataset to reflect the final, analysis-ready version used in the study, rather than the raw data alone.
Please include an English data dictionary, and make sure the variable names match those reported in the paper.
Evaluate the assumptions of the linear regression models by examining residuals, identifying influential outliers, and assessing multicollinearity among predictors. If any assumptions are violated, address them using appropriate methods.
Place greater emphasis on the magnitude and practical relevance of regression coefficients, including their confidence intervals, rather than focusing predominantly on statistical significance.

Model 1

Model results for Medication understanding

Term	B	Lower	Upper	p-value
Intercept
Disease:
HIV – Other	1.49	1.04	1.95	<0.0001
Trial_phase:
II/III/IV – I	1.27	0.23	2.31	0.0169
IMP_binary:
1 – >1	0.71	0.12	1.30	0.0182
Prescription_2Criteria:
YES – NO	0.65	0.17	1.14	0.0086
Education:
≥ High school diploma – < High school diploma	0.55	0.17	0.93	0.0050
SE = Standard error; Lower = lower confidence interval; Upper = upper confidence interval.

Fit statistics for Medication understanding

R	R2	R2Adj	AIC	RMSE	F	DF1	DF2	p-value

R2 Adj = Adjusted R2; AIC = Akaike Information Criterion; RMSE = The Root Mean Squared Error; DF1 = Degrees of freedom for the model; DF2 = Degrees of freedom for the residuals.

ANOVA table for Medication understanding

Term	SS	DF	MS	F	p-value
Disease
Trial_phase
IMP_binary
Prescription_2Criteria
Education
Residuals
SS = Sum of Squares; DF = Degrees of freedom; MS = Mean Square.

Model results for Medication understanding

Term	B	SE	Lower	Upper	t	p-value
Intercept	3.269	0.628	2.032	4.507	5.208	<0.0001
Disease:
HIV – Other	1.440	0.231	0.985	1.894	6.240	<0.0001
Trial_phase:
II/III/IV – I	1.201	0.531	0.155	2.247	2.264	0.0246
IMP_binary:
1 – >1	0.854	0.303	0.257	1.452	2.818	0.0053
Prescription_2Criteria:
YES – NO	0.429	0.260	−0.084	0.941	1.650	0.1004
Education:
≥ High school diploma – < High school diploma	0.671	0.190	0.296	1.046	3.524	0.0005
SE = Standard error; Lower = lower confidence interval; Upper = upper confidence interval.
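As a quick internal consistency check on the reproduced table, the sketch below recomputes t as B / SE and backs out the critical value implied by each confidence interval, (Upper - Lower) / (2 × SE). With 208 residual degrees of freedom, a 95% interval implies a multiplier of about 1.97. The values are copied from the table above; the small tolerances allow for rounding to three decimal places and are an assumption of this sketch.

```python
# (B, SE, Lower, Upper, t) for each term in the reproduced model.
rows = {
    "Intercept": (3.269, 0.628, 2.032, 4.507, 5.208),
    "Disease": (1.440, 0.231, 0.985, 1.894, 6.240),
    "Trial_phase": (1.201, 0.531, 0.155, 2.247, 2.264),
    "IMP_binary": (0.854, 0.303, 0.257, 1.452, 2.818),
    "Prescription_2Criteria": (0.429, 0.260, -0.084, 0.941, 1.650),
    "Education": (0.671, 0.190, 0.296, 1.046, 3.524),
}

checks = {}
for term, (b, se, lower, upper, t) in rows.items():
    t_recomputed = b / se                          # t statistic from B and SE
    implied_critical = (upper - lower) / (2 * se)  # multiplier behind the 95% CI
    checks[term] = (t_recomputed, implied_critical)
    print(f"{term}: t = {t_recomputed:.3f} (reported {t}), "
          f"implied critical value = {implied_critical:.3f}")
```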

Fit statistics for Medication understanding

R	R2	R2Adj	AIC	RMSE	F	DF1	DF2	p-value
0.507	0.257	0.240	744.163	1.332	14.417	5	208	<0.0001
R2 Adj = Adjusted R2; AIC = Akaike Information Criterion; RMSE = The Root Mean Squared Error; DF1 = Degrees of freedom for the model; DF2 = Degrees of freedom for the residuals.

ANOVA table for Medication understanding

Term	SS	DF	MS	F	p-value
Disease	71.136	1	71.136	38.941	<0.0001
Trial_phase	9.363	1	9.363	5.126	0.0246
IMP_binary	14.502	1	14.502	7.939	0.0053
Prescription_2Criteria	4.973	1	4.973	2.723	0.1004
Education	22.684	1	22.684	12.418	0.0005
Residuals	379.964	208	1.827
SS = Sum of Squares; DF = Degrees of freedom; MS = Mean Square; Calculated using type III SS.
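Because every predictor in this model has one degree of freedom, each F statistic should equal the term's mean square divided by the residual mean square, and should also equal the square of the corresponding t statistic in the coefficient table. The sketch below checks both relationships using the values reported above; the tolerance of 0.01 is an assumption that allows for rounding.

```python
# Residual sum of squares and degrees of freedom from the ANOVA table.
ss_residual, df_residual = 379.964, 208
ms_residual = ss_residual / df_residual

# (MS, F from the ANOVA table, t from the coefficient table) per term.
terms = {
    "Disease": (71.136, 38.941, 6.240),
    "Trial_phase": (9.363, 5.126, 2.264),
    "IMP_binary": (14.502, 7.939, 2.818),
    "Prescription_2Criteria": (4.973, 2.723, 1.650),
    "Education": (22.684, 12.418, 3.524),
}

results = {}
for term, (ms, f_reported, t_reported) in terms.items():
    f_from_ms = ms / ms_residual   # F = MS_term / MS_residual
    f_from_t = t_reported ** 2     # for a 1-df term, F = t squared
    results[term] = (f_from_ms, f_from_t)
    print(f"{term}: F = {f_reported}, MS ratio = {f_from_ms:.3f}, t^2 = {f_from_t:.3f}")
```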

Visualisation of regression model

The blue line shows the line of best fit, with shading representing the 95% confidence interval, while holding all other covariates constant. The dots show partial residuals, which reflect the observed data adjusted for all other predictors except the one being plotted.

Forest plot showing original and reproduced coefficients and 95% confidence intervals for Medication understanding

Change in regression coefficients

term	O_B	R_B	Change.B	reproduce.B
Intercept		3.2693
Disease:
HIV – Other	1.49	1.4396	−0.0504	Not Reproduced
Trial_phase:
II/III/IV – I	1.27	1.2012	−0.0688	Not Reproduced
IMP_binary:
1 – >1	0.71	0.8545	0.1445	Not Reproduced
Prescription_2Criteria:
YES – NO	0.65	0.4286	−0.2214	Not Reproduced
Education:
≥ High school diploma – < High school diploma	0.55	0.6709	0.1209	Not Reproduced
O_B = original B; R_B = reproduced B; Change.B = R_B - O_B; reproduce.B = whether B was reproduced.

Change in lower 95% confidence intervals for coefficients

term	O_lower	R_lower	Change.lci	Reproduce.lower
Intercept		2.0317
Disease:
HIV – Other	1.04	0.9848	−0.0552	Not Reproduced
Trial_phase:
II/III/IV – I	0.23	0.1552	−0.0748	Not Reproduced
IMP_binary:
1 – >1	0.12	0.2566	0.1366	Not Reproduced
Prescription_2Criteria:
YES – NO	0.17	−0.0835	−0.2535	Not Reproduced
Education:
≥ High school diploma – < High school diploma	0.17	0.2956	0.1256	Not Reproduced
O_lower = original lower confidence limit; R_lower = reproduced lower confidence limit; Change.lci = R_lower - O_lower; Reproduce.lower = whether the lower confidence limit was reproduced.

Change in upper 95% confidence intervals for coefficients

term	O_upper	R_upper	Change.uci	Reproduce.upper
Intercept		4.5070
Disease:
HIV – Other	1.95	1.8944	−0.0556	Not Reproduced
Trial_phase:
II/III/IV – I	2.31	2.2472	−0.0628	Not Reproduced
IMP_binary:
1 – >1	1.30	1.4523	0.1523	Not Reproduced
Prescription_2Criteria:
YES – NO	1.14	0.9408	−0.1992	Not Reproduced
Education:
≥ High school diploma – < High school diploma	0.93	1.0462	0.1162	Not Reproduced
O_upper = original upper confidence limit; R_upper = reproduced upper confidence limit; Change.uci = R_upper - O_upper; Reproduce.upper = whether the upper confidence limit was reproduced.

Change in p-values

Term	O_p	R_p	Change.p	Reproduce.p	SigChangeDirection
Intercept		<0.0001
Disease:
HIV – Other	<0.0001	<0.0001	0.0000	Reproduced	Remains sig, B same direction
Trial_phase:
II/III/IV – I	0.0169	0.0246	0.0077	Not Reproduced	Remains sig, B same direction
IMP_binary:
1 – >1	0.0182	0.0053	−0.0129	Not Reproduced	Remains sig, B same direction
Prescription_2Criteria:
YES – NO	0.0086	0.1004	0.0918	Not Reproduced	Sig to non-sig, B same direction
Education:
≥ High school diploma – < High school diploma	0.0050	0.0005	−0.0045	Not Reproduced	Remains sig, B same direction
O_p = original p-value; R_p = reproduced p-value; Change.p = R_p - O_p; Reproduce.p = whether the p-value was reproduced; SigChangeDirection = change in statistical significance and in the direction of B between the original and reproduced models. Note: p-values reported as <0.0001 were set to 0.000099 for the purposes of comparison.
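The p-value comparison can be sketched as below. The substitution of 0.000099 for "<0.0001" follows the table note; the 0.05 significance threshold and the labels "Non-sig to sig" and "Remains non-sig" are assumptions of this sketch, as they do not appear in the table. The function names are hypothetical.

```python
ALPHA = 0.05  # assumed significance threshold; not stated in the table note


def parse_p(text: str) -> float:
    """Convert a printed p-value to a number, using 0.000099 for '<0.0001'."""
    return 0.000099 if text.strip() == "<0.0001" else float(text)


def compare_p(original: str, reproduced: str) -> tuple:
    """Return the change in p-value (reproduced minus original) and a significance label."""
    o, r = parse_p(original), parse_p(reproduced)
    change = round(r - o, 4)
    if o < ALPHA and r < ALPHA:
        status = "Remains sig"
    elif o < ALPHA:
        status = "Sig to non-sig"
    elif r < ALPHA:
        status = "Non-sig to sig"
    else:
        status = "Remains non-sig"
    return change, status


# Original and reproduced p-values from the table above.
pairs = {
    "Disease": ("<0.0001", "<0.0001"),
    "Trial_phase": ("0.0169", "0.0246"),
    "IMP_binary": ("0.0182", "0.0053"),
    "Prescription_2Criteria": ("0.0086", "0.1004"),
    "Education": ("0.0050", "0.0005"),
}
for term, (o_p, r_p) in pairs.items():
    print(term, compare_p(o_p, r_p))
```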

Bland Altman plot showing differences between original and reproduced p-values for Medication understanding

Results for p-values

One of the five p-values was reproduced.

Conclusion computational reproducibility

This model was not computationally reproducible.