Paper 22: Post-traumatic stress disorder and self reported outcomes after traumatic brain injury in victims of assault

Author

Lee Jones - Senior Biostatistician - Statistical Review

Published

March 15, 2026

References

Bown D, Belli A, Qureshi K, Davies D, Toman E, Upthegrove R (2019) Post-traumatic stress disorder and self-reported outcomes after traumatic brain injury in victims of assault. PLoS ONE 14(2): e0211684. https://doi.org/10.1371/journal.pone.0211684

Bown, Dominic and Belli, Antonio and Davies, David and Toman, Emma (2019) Research data supporting “Post-traumatic stress disorder and self-reported outcomes after traumatic brain injury in victims of assault”. https://doi.org/10.25500/eData.bham.00000293

Disclosure

This reproducibility project was conducted to the best of our ability, with careful attention to statistical methods and assumptions. The research team comprises four senior biostatisticians (three of whom are accredited), with 20 to 30 years of experience in statistical modelling and analysis of healthcare data. While statistical assumptions play a crucial role in analysis, their evaluation is inherently subjective, and contextual knowledge can influence judgements about the importance of assumption violations. Differences in interpretation may arise among statisticians and researchers, leading to reasonable disagreements about methodological choices.

Our approach aimed to reproduce published analyses as faithfully as possible, using the details provided in the original papers. We acknowledge that other statisticians may have differing success in reproducing results due to variations in data handling and implicit methodological choices not fully described in publications. However, we maintain that research articles should contain sufficient detail for any qualified statistician to reproduce the analyses independently.

Methods used in our reproducibility analyses

There were two parts to our study. First, 100 articles published in PLOS ONE were randomly selected from the health domain and sent for post-publication peer review by statisticians. Of these, 95 included linear regression analyses and were therefore assessed for reporting quality. The statisticians evaluated what was reported, including regression coefficients, 95% confidence intervals, and p-values, as well as whether model assumptions were described and how those assumptions were evaluated. This report provides a brief summary of the initial statistical review.

The second part of the study involved reproducing linear regression analyses for papers with available data to assess both computational and inferential reproducibility. All papers were initially assessed for data availability and the statistical software used. From those with accessible data, the first 20 papers (from the original random sample) were evaluated for computational reproducibility. Within each paper, individual linear regression models were identified and assigned a unique number. A maximum of three models per paper were selected for assessment. When more than three models were reported, priority was given to the final model or the primary models of interest as identified by the authors; any remaining models were selected at random.

To assess computational reproducibility, differences between the original and reproduced results were evaluated using absolute discrepancies and rounding error thresholds, tailored to the number of decimal places reported in each paper. Results for each reported statistic, e.g., regression coefficient, were categorised as Reproduced, Incorrect Rounding, or Not Reproduced, depending on how closely they matched the original values. Each paper was then classified as Reproduced, Mostly Reproduced, Partially Reproduced, or Not Reproduced. The mostly reproduced category included cases with minor rounding or typographical errors, whereas partially reproduced indicated substantial errors were observed, but some results were reproduced.

For models deemed at least partially computationally reproducible, inferential reproducibility was further assessed by examining whether statistical assumptions were met and by conducting sensitivity analyses, including bootstrapping where appropriate. We examined changes in standardized regression coefficients, which reflect the change in the outcome (in standard deviation units) for a one standard deviation increase in the predictor. Meaningful differences were defined as a relative change of 10% or more, or absolute differences of 0.1 (moderate) and 0.2 (substantial). When non-linear relationships were identified, inferential reproducibility was assessed by comparing model fit measures, including R², Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC). When the Gaussian distribution was not appropriate for the dependent variable, alternative distributions were considered, and model fit was evaluated using AIC and BIC.

Results from the reproduction of the Bown et al. (2019) paper are presented below. An overall summary of results is presented first, followed by model-specific results organised within tab panels. Within each panel, the Original results tab displays the linear regression outputs extracted from the published paper. The Reproduced results tab presents estimates derived from the authors’ shared data, along with a comprehensive assessment of linear regression assumptions. The Differences tab compares the original and reproduced models to assess computational reproducibility. Finally, the Sensitivity analysis tab evaluates inferential reproducibility by examining whether identified assumption violations meaningfully affected the results.

Summary from statistical review

This study examines the association of assault with self-reported outcomes, including quality of life in patients attending traumatic brain injury clinics. The article is easy to read and uses a mix of non-parametric and linear regression but does not discuss normality, other assumptions, or collinearity. The authors state a cross-sectional design and describe a clinically meaningful difference between groups. Univariate modelling is done with different tests, and the authors talk in terms of significant or borderline variables for inclusion in multivariable models.

Data availability and software used

Data is available from University of Birmingham open access UBRIA eData repository in an Excel in a wide format with no data dictionary. SPSS was used for analysis.

Regression sample

There were eight multivariable model identified in the paper with the main focusing on physical, cognitive and emotional symptoms after traumatic brain injury (TBI). Three linear regressions were randomly sampled these outcomes included Rivermead Post-concussion Questionnaire (RPQ), both the full scale and the subscale RPQ13 and the Impact of Event Scale (IES). All analysis was adjusted for age, ethnicity and extracranial trauma.

Computational reproducibility results

The three models assessed in this paper were not found to be computationally reproducible and were only partially reproduced. A partial data dictionary was provided, with label values for categorical variables and missing data definitions included in the accompanying Excel file. Identification of “extracranial trauma” was unclear; it appeared to correspond to a variable named Polytrauma in the dataset, which was inferred by matching descriptive statistics. Multiple missing value codes were used inconsistently, for example, age was documented as missing if coded 100, but in the dataset, missing values appeared as zero. A filter variable was available to identify included cases, which is good practice.

Although the reproduced sample sizes matched the publication, minor discrepancies in medians and regression results suggested potential differences in the data. These may have arisen from updates to the dataset after results were generated. For instance, the median age for participants experiencing assault was reported as 29.3 (22.1, 36.4), whereas the reproduced median was 29.0 (22.0, 35.0). Similarly, RPQ (full) scores for the same group were reported as 31.0 (13.4, 42.0) but reproduced as 31.0 (14.75, 42.00). These small differences may reflect slight variations in the dataset used or differences in rounding or summary procedures.

Linear regression results were not fully reproduced; coefficients and p-values differed at the decimal level. The authors stated that pairwise deletion was used for missing data or incomplete questionnaires to maximize data usage. It was unclear whether this applied to the regression models. Testing the pairwise option in SPSS yielded results that differed substantially from the reported findings, suggesting that discrepancies between the reproduced and reported analyses were likely due to minor differences in the data rather than analysis strategy. Linear regression results were also difficult to reproduce due to incomplete reporting. Only the regression coefficients and p-values for the primary independent variable were provided, with limited detail on the model specification. While the covariates used for adjustment were listed, it was unclear how categorical variables were handled. For example, Ethnicity, which included levels such as White, Asian, and Other, appeared to have been treated as a continuous variable. In SPSS, the regression function does not automatically handle categorical variables and requires the user to manually create dummy variables.

Inferential reproducibility results

None of the three models assessed in this paper were found to be inferentially reproducible. The primary reasons were violations of model assumptions and issues with data distribution and parameterisation. Notably, ethnicity was treated as a continuous variable, which is inappropriate given that it is a nominal variable. While treating ordinal variables as continuous can be reasonable if the relationship is approximately linear, this does not apply to nominal variables like ethnicity, which should be modelled categorically. Across the three outcome variables, 10–24% of values were zero. This was most pronounced in the IES model, where a strong residual pattern indicated violation of the normality assumption. A Tweedie distribution, which accounts for zero inflation and non-normality, provided a substantially better fit, reducing AIC by 69–159 across all outcomes.

Recommended Changes

A more comprehensive data dictionary should be provided, ensuring that all variable names and definitions exactly match those reported in the paper.
Check that reported results match the data provided.
Include tables with full reporting of statistical analysis.
Include ethnicity in models as a categorical variable.
Use an appropriate distribution to fit the data, such as a Tweedie model, to account for zeros.
Evaluate the assumptions of the linear regression models by examining residuals, identifying influential outliers, and assessing multicollinearity among predictors. If any assumptions are violated, address them using appropriate methods.

Model 1

Model results for RPQ (full)

Term	B	SE	p-value
Intercept
Mechanism_of_TBI:
Assault – Non-assault	5.708	3.495	0.105
Age
Ethicity_short
Extracranial_trauma:
Yes – No
SE = Standard error; Lower = lower confidence interval; Upper = upper confidence interval.

Fit statistics for RPQ (full)

R	R2	R2Adj	AIC	RMSE	F	DF1	DF2	p-value

R2 Adj = Adjusted R2; AIC = Akaike Information Criterion; RMSE = The Root Mean Squared Error; DF1 = Degrees of freedom for the model; DF2 = Degrees of freedom for the residuals.

ANOVA table for RPQ (full)

Term	SS	DF	MS	F	p-value
Mechanism_of_TBI
Age
Ethicity_short
Extracranial_trauma
Residuals
SS = Sum of Squares; DF = Degrees of freedom; MS = Mean Square.

Model results for RPQ (full)

Term	B	SE	Lower	Upper	t	p-value
Intercept	22.782	5.661	11.599	33.965	4.025	<0.001
Mechanism_of_TBI:
Assault – Non-assault	5.718	3.495	−1.186	12.623	1.636	0.1039
Age	−0.141	0.077	−0.294	0.012	−1.814	0.0716
Ethicity_short	2.842	2.836	−2.761	8.446	1.002	0.3179
Extracranial_trauma:
Yes – No	4.679	2.971	−1.191	10.548	1.575	0.1174
SE = Standard error; Lower = lower confidence interval; Upper = upper confidence interval.

Fit statistics for RPQ (full)

R	R2	R2Adj	AIC	RMSE	F	DF1	DF2	p-value
0.308	0.095	0.071	1,352.162	16.812	4.013	4	153	0.0040
R2 Adj = Adjusted R2; AIC = Akaike Information Criterion; RMSE = The Root Mean Squared Error; DF1 = Degrees of freedom for the model; DF2 = Degrees of freedom for the residuals.

ANOVA table for RPQ (full)

Term	SS	DF	MS	F	p-value
Mechanism_of_TBI	781.299	1	781.299	2.677	0.1039
Age	960.764	1	960.764	3.292	0.0716
Ethicity_short	293.130	1	293.130	1.004	0.3179
Extracranial_trauma	723.878	1	723.878	2.480	0.1174
Residuals	44,656.668	153	291.874
SS = Sum of Squares; DF = Degrees of freedom; MS = Mean Square; Calculated using type III SS.

Visualisation of regression model

The blue line shows the best line of fit with shading representing 95% confidence intervals, while holding all other covariates constant. The dots show partial residuals, which reflect the observed data adjusted for all other predictors except the one being plotted.

Checking residuals plots for patterns

Blue line showing quadratic fit for residuals

Testing residuals for non linear relationships

Term	Statistic	p-value	Results
Mechanism_of_TBI
Age	−2.245	0.0262	Linearity violation
Ethicity_short	−2.241	0.0265	Linearity violation
Extracranial_trauma
Tukey test	−0.164	0.8700	No linearity violation
Specification test for predictors using quadratic tests, for fitted values curvature is tested through Tukey's one-degree-of-freedom test for nonadditivity.

Checking univariate relationships with the dependent variable using scatterplots

Blue line shows linear relationship, red line indicates relationship inferred by GAM modelling

Linearity results

Plots and tests show an unaccounted non-linear relationship may be present.

Testing for homoscedasticity

Statistic	p-value	Parameter	Method
1.147	0.8868	4	studentized Breusch-Pagan test

Homoscedasticity results

The studentized Breusch-Pagan test supports homoscedasticity.
There is no distinct funnelling pattern observed, supporting homoscedasticity of residuals.

Model descriptives including cook’s distance and leverage to understand outliers

Term	N	Mean	SD	Median	Min	Max	Skewness	Kurtosis
RPQ (full)	158	23.332	17.728	20.000	0.000	60.000	0.334	−1.092
Mechanism_of_TBI*	158	1.241	0.429	1.000	1.000	2.000	1.203	−0.557
Age	158	42.209	19.564	39.500	16.000	88.000	0.348	−1.123
Ethicity_short	158	1.203	0.502	1.000	1.000	3.000	2.439	5.049
Extracranial_trauma*	158	1.361	0.482	1.000	1.000	2.000	0.574	−1.681
.fitted	158	23.332	5.463	22.976	13.259	36.053	0.210	−0.745
.resid	158	−0.000	16.865	−3.426	−33.351	38.824	0.373	−0.770
.leverage	158	0.032	0.020	0.028	0.013	0.124	2.611	7.962
.sigma	158	17.084	0.063	17.109	16.837	17.140	−1.602	2.174
.cooksd	158	0.007	0.010	0.003	0.000	0.060	2.829	9.293
.std.resid	158	−0.001	1.004	−0.206	−2.003	2.319	0.362	−0.773
dfb.1_	158	0.000	0.083	0.001	−0.224	0.336	0.854	3.262
dfb.M__T	158	0.000	0.087	0.002	−0.329	0.357	−0.149	3.235
dfb.Age	158	0.000	0.074	0.000	−0.253	0.257	0.042	1.011
dfb.Eth_	158	−0.001	0.094	0.000	−0.501	0.380	−1.441	11.776
dfb.Ex_Y	158	0.000	0.081	0.008	−0.266	0.185	−0.388	0.623
dffit	158	−0.005	0.186	−0.032	−0.553	0.478	−0.066	0.157
cov.r	158	1.034	0.043	1.043	0.895	1.156	−0.808	1.605
* categorical variable

Cooks threshold

Cook’s distance measures the overall change in fit, if an observation is removed. Potential influential observations are identified by \(\text{Cook's Distance}_i > \frac{4}{n}\), where n is the number of observations. In practice a threshold of 0.5 is often used to identify influential observations.

DFFIT threshold

DFFIT measures how many standard deviations the fitted values will change when observation is removed. Potential influential observations \(\left| \text{DFFITS}_i \right| > \frac{2\sqrt{p}}{\sqrt{n}}\) where p is the number of predictors (including the intercept) and n is the number of observations. In practice this can result in a large number of points identified, often DFFIT \(\pm 1\) is used to identify highly influential observations.

DFBETA threshold

DFBETA measures the change in a regression coefficient, in units of its standard error, when a particular observation is removed from the model. There is a DFBETA for each parameter in the model. Potential influential observations \(|\text{DFBETA}_{ij}| > \frac{2}{\sqrt{n}}\), where n is the number of observations. In larger datasets this threshold can flag a high number of observations with only minor influence on the model. In practice, DFBETA \(\pm 1\) is often used to identify outliers.

Influence plot

Observations with high leverage (horizontal) and large residuals (vertical, typically at ±2 or ±3 studentized residuals) are concerning, as they may disproportionately influence the model. This combination is reflected by large bubbles with high Cook’s distance indicated by darker shadings of blue.

COVRATIO plot

COVRATIO measures the overall change in the precision (covariance matrix) of the estimated regression coefficients when the ith observation is removed. Values close to 1 indicate little influence on the model’s precision. Values below 1 suggest that an observation inflates the variances and reduces precision, resulting in wider confidence intervals, whereas values above 1 suggest deflated variances and narrower confidence intervals. A commonly cited guideline is \(\left|\mathrm{COVRATIO}_i - 1\right| > \frac{3p}{n}\), where p is the number of parameters and n is the number of observations. A practical cut-off between 0.9 to 1.1 was used to flag observations with meaningful impact on precision, although there is no agreed universal alternative cut-off.

Observations of interest identified by the influence plot

ID	StudRes	Leverage	CookD	dfb.1_	dfb.M__T	dfb.Age	dfb.Eth_	dfb.Ex_Y	dffit	cov.r
21	0.443	0.112	0.005	−0.100	0.048	0.045	0.134	−0.026	0.157	1.156
103	2.225	0.016	0.016	0.005	−0.061	0.129	−0.011	−0.111	0.287	0.895
61	2.353	0.040	0.044	−0.224	−0.134	0.161	0.380	−0.170	0.478	0.900
27	−1.400	0.124	0.055	0.329	0.106	−0.175	−0.491	0.124	−0.528	1.107
172	−1.614	0.105	0.060	0.336	0.107	−0.105	−0.501	−0.118	−0.553	1.061
StudRes = studentized residuals; CookD = Cook's Distance a combined measure of leverage and influence. DFBETAS (dfb.*) measures how much a specific regression coefficient changes (in standard errors) when an observation is removed; DFFITS measures how much the fitted (predicted) value for an observation changes (in standard deviations) when that observation is removed; cov.r = coefficient covariance ratio which measures how much the overall variance (precision) of the coefficients changes when that observation is removed.

Results for outliers and influential points

Cook’s Distance and other influence diagnostics indicated no observations of concern.

Checking for normality of the residuals using a Q–Q plot

Normality of residuals using Shapiro-Wilk and Kolmogorov-Smirnov tests

Statistic	p-value	Method
0.098	0.0945	Asymptotic one-sample Kolmogorov-Smirnov test

Statistic	p-value	Method
0.969	0.0012	Shapiro-Wilk normality test

Normality results

The Kolmogorov-Smirnov supports residuals being normally distributed.
The Shapiro-Wilk normality test indicates residuals may not be normally distributed.
The Q-Q plot shows a minor violation of normality in the middle, so this should be checked.

Assessing collinearity with VIF

Term	VIF	Tolerance
Mechanism_of_TBI	1.208	0.828
Age	1.235	0.810
Ethicity_short	1.089	0.918
Extracranial_trauma	1.102	0.908
VIF = Variance Inflation Factor.

Collinearity results

All VIF values are under three, indicating no collinearity issues.
Overall, when taking into account VIF and SE, the model does not have collinearity issues.

Assessing independence with the Durbin–Watson test for autocorrelation

AutoCorrelation	Statistic	p-value
−0.008	2.009	0.9800

Independence results

The Durbin–Watson test suggests there are no auto-correlation issues.
The study design seems to be independent.

Assumption conclusions

Forest plot showing original and reproduced coefficients and 95% confidence intervals for RPQ (full)

Change in regression coefficients

term	O_B	R_B	Change.B	reproduce.B
Intercept		22.7822
Mechanism_of_TBI:
Assault – Non-assault	5.708	5.7181	0.0101	Not Reproduced
Age		−0.1405
Ethicity_short		2.8423
Extracranial_trauma:
Yes – No		4.6786
O_B = original B; R_B = reproduced B; Change.B = change in R_B - O_B; Reproduce.B = B reproduced.

Change in Standard error

term	O_SE	R_SE	Change.SE	Reproduce.SE
Intercept		5.6607
Mechanism_of_TBI:
Assault – Non-assault	3.495	3.4949	−0.0001	Reproduced
Age		0.0774
Ethicity_short		2.8362
Extracranial_trauma:
Yes – No		2.9709
O_SE = original standard error; R_SE = reproduced standard error; Change.SE = change in R_SE - O_SE; Reproduce.lower = standard error reproduced.

Change in p-values

Term	O_p	R_p	Change.p	Reproduce.p	SigChangeDirection
Intercept		<0.001
Mechanism_of_TBI:
Assault – Non-assault	0.105	0.1039	−0.0011	Not Reproduced	Remains non-sig, B same direction
Age		0.0716
Ethicity_short		0.3179
Extracranial_trauma:
Yes – No		0.1174
O_p = original p-value; R_p = reproduced p-value; Changep = change in p-value R_p - O_p; Reproduce.p = p-values reproduced. SigChangeDirection = statistical significance and B change between original and reproduced models. Note, p-values that were <0.001 were set to 0.00099 for the purposes of comparison.

Results for p-values

The p-value for this model was not reproduced.

Conclusion computational reproducibility

This model was not computationally reproducible, although some aspects were partially reproduced such as Standard error. Minor discrepancies in regression coefficients and p-values suggest there may be small differences in the underlying data.

Methods

While the model was partially reproduced, it is likely that minor differences in the underlying data account for the remaining discrepancies. However, these differences are unlikely to have affected the model assumptions, and inferential reproducibility was assessed. The original model exhibited violations of normality and linearity, with 9.6% of RPQ_full values equal to zero. To account for the excess zeros and the non-normal distribution, a Tweedie model was fitted and compared to the reproduced model using AIC. Ethnicity was included as a categorical variable, and age was assessed for linearity. Age was standardized for the analysis, but RPQ_full was not standardized to maintain compatibility with the Tweedie distribution.

Results

Treating ethnicity as a continuous variable was not appropriate, as the differences between groups were not equally spaced and the relationship with the outcome was not linear. The categorical variable improved model fit, with a decrease in AIC of 3.13.

	Reproduced					Categorical Ethnicity
Characteristic	Beta	SE1	Statistic	95% CI1	p-value	Beta	SE1	Statistic	95% CI1	p-value
(Intercept)	16.66	3.74	4.45	9.27, 24.05	<0.001	19.29	1.99	9.67	15.35, 23.23	<0.001
Mechanism_of_TBI
Non-assault
Assault	5.72	3.49	1.64	−1.19, 12.62	0.104	4.49	3.49	1.29	−2.41, 11.39	0.201
Extracranial_trauma
No
Yes	4.68	2.97	1.57	−1.19, 10.55	0.117	4.63	2.93	1.58	−1.16, 10.42	0.116
z_Age	−2.86	1.58	−1.81	−5.98, 0.25	0.072	−2.54	1.56	−1.63	−5.63, 0.55	0.106
Ethicity_short	2.84	2.84	1.00	−2.76, 8.45	0.318
Ethicity
White
Asian						10.72	4.49	2.39	1.84, 19.59	0.018
Other						−2.20	6.61	−0.33	−15.26, 10.87	0.740
1SE = Standard Error, CI = Confidence Interval
Note: Reproduced model – Adjusted R² = 0.071; AIC = 1352.16; BIC = 1370.54
Categorical Ethnicity model – Adjusted R² = 0.095; AIC = 1349.03; BIC = 1370.47

Fitting a Tweedie model to account for zero’s, the estimated dispersion parameter (φ) was 4.70, indicating greater variability than would be expected under a Gaussian model, and the Tweedie power parameter (p) was 1.36, consistent with a compound Poisson–Gamma distribution. Compared to the linear model, the Tweedie model yielded a substantial AIC reduction of 68.96. Although inclusion of a squared age term marginally improved model fit (AIC and BIC) and residual diagnostics, these improvements were small, and the simpler model was retained.

In this model, the intercept represents the geometric mean of the RPQ_full score for patients who were white, had not been assaulted, had not experienced extracranial trauma, and were at the average age (since age was standardized). This corresponds to the expected RPQ_full score when all predictors are at their reference level or zero. The exponentiated coefficient for Mechanism_of_TBI indicates that patients who had been assaulted had an estimated 19% higher RPQ_full score compared to those who had not (p = 0.271). The coefficient for standardized age was 0.87, suggesting that the RPQ_full decreases by 13% as age increased by one standard deviation.

	Tweedie with linear age			Tweedie with quadratic age
Characteristic	exp(Beta)	95% CI1	p-value	exp(Beta)	95% CI1	p-value
(Intercept)	19.5	16.1, 23.5	<0.001	24.4	19.1, 31.1	<0.001
Mechanism_of_TBI
Non-assault	—	—		—	—
Assault	1.19	0.87, 1.61	0.271	1.16	0.86, 1.57	0.339
Extracranial_trauma
No	—	—		—	—
Yes	1.19	0.91, 1.56	0.202	1.14	0.88, 1.49	0.316
Ethicity
White	—	—		—	—
Asian	1.41	0.98, 2.04	0.068	1.41	0.98, 2.03	0.062
Other	0.90	0.48, 1.67	0.731	0.91	0.50, 1.68	0.770
z_Age	0.87	0.75, 1.01	0.068	0.87	0.75, 1.01	0.074
z_Age2				0.78	0.66, 0.93	0.006
1CI = Confidence Interval
Tweedie with linear age — Dispersion (φ): 4.70; Power (p): 1.360; AIC: 1283.20; BIC: 1307.70
Tweedie with quadratic age — Dispersion (φ): 4.62; Power (p): 1.350; AIC: 1277.59; BIC: 1305.16

The residual diagnostics for the final Tweedie model appeared satisfactory, with the DHARMa package producing appropriate residual plots to assess overdispersion, zero inflation, and uniformity. DHARMa residuals are based on the rank of each observed value among many simulated values from the fitted model. This approach produces discrete residual values (e.g., 0.25, 0.5, 0.75) that reflect the percentile position of the observed value within the simulated distribution.

Inferential Reproducibility conclusion

This model was not found to be inferentially reproducible, due to differences in model distribution and specification. A Tweedie distribution was identified as more appropriate for the data, given the presence of zero inflation and non-normality. In addition, model parameterisation differed, including treating categorical variables as continuous.

Model 2

Model results for IES

Term	B	SE	p-value
Intercept
Mechanism_of_TBI:
Assault – Non-assault	6.550	4.736	0.170
Age
Ethicity_short
Extracranial_trauma:
Yes – No
SE = Standard error; Lower = lower confidence interval; Upper = upper confidence interval.

Fit statistics for IES

R	R2	R2Adj	AIC	RMSE	F	DF1	DF2	p-value

R2 Adj = Adjusted R2; AIC = Akaike Information Criterion; RMSE = The Root Mean Squared Error; DF1 = Degrees of freedom for the model; DF2 = Degrees of freedom for the residuals.

ANOVA table for IES

Term	SS	DF	MS	F	p-value
Mechanism_of_TBI
Age
Ethicity_short
Extracranial_trauma
Residuals
SS = Sum of Squares; DF = Degrees of freedom; MS = Mean Square.

Model results IES

Term	B	SE	Lower	Upper	t	p-value
Intercept	8.742	7.982	−7.081	24.564	1.095	0.2759
Mechanism_of_TBI:
Assault – Non-assault	6.512	4.737	−2.877	15.901	1.375	0.1720
Age	−0.049	0.097	−0.241	0.144	−0.502	0.6169
Ethicity_short	7.784	4.657	−1.447	17.015	1.672	0.0975
Extracranial_trauma:
Yes – No	3.146	3.883	−4.551	10.844	0.810	0.4196
SE = Standard error; Lower = lower confidence interval; Upper = upper confidence interval.

Model fit for IES

R	R2	R2Adj	AIC	RMSE	F	DF1	DF2	p-value
0.271	0.074	0.039	991.981	18.491	2.148	4	108	0.0798
R2 Adj = Adjusted R2; AIC = Akaike Information Criterion; RMSE = The Root Mean Squared Error; DF1 = Degrees of freedom for the model; DF2 = Degrees of freedom for the residuals.

ANOVA table for IES

Term	SS	DF	MS	F	p-value
Mechanism_of_TBI	676.212	1	676.212	1.890	0.1720
Age	90.033	1	90.033	0.252	0.6169
Ethicity_short	999.480	1	999.480	2.794	0.0975
Extracranial_trauma	234.865	1	234.865	0.657	0.4196
Residuals	38,634.804	108	357.730
SS = Sum of Squares; DF = Degrees of freedom; MS = Mean Square; Calculated using type III SS.