The STarT Back Screening Tool (SBT) allocates patients with low back pain (LBP) to three prognostic groups for stratified care. This approach has demonstrated clinical benefit and cost-effectiveness.
To translate and validate the SBT by investigating its psychometric properties among Israelis with acute and sub-acute LBP, and to evaluate its ability to predict disability after three months.
The SBT was cross-culturally adapted into Hebrew following published guidelines. The SBT was administered to 150 patients receiving physical therapy for acute or subacute LBP. Clinical outcomes included the Roland-Morris Disability Questionnaire (RMDQ), the Hospital Anxiety and Depression Scale (HADS), the Fear-Avoidance Beliefs Questionnaire (FABQ) and a numerical pain rating scale (NPRS), collected by an independent interviewer by telephone at the start of physical therapy and after three months.
The test-retest reliability of the SBT total score and psychosocial subscale was excellent (intraclass correlation coefficients 0.89 and 0.82, respectively). Spearman’s correlation coefficients between the SBT total score and the reference measures were: RMDQ 0.82; HADS anxiety 0.66 and depression 0.76; FABQ exercise 0.53; NPRS severe pain 0.48 and average pain 0.53. The baseline SBT score showed excellent ability to discriminate patients with poor disability outcomes after three months (area under the ROC curve = 0.825, 95% CI 0.756–0.894, P < 0.001).
The Israeli translation and cross-cultural adaptation of the SBT is a valid and reliable instrument. The SBT discriminated between low-, medium- and high-risk groups and predicted disability after three months.
The Hebrew translation of the STarT Back Screening Tool (SBT) was found to be a valid and reliable tool.
The translated SBT discriminated low, medium and high-risk groups.
The translated SBT retained its predictive abilities concerning disability after three months.
The SBT can be implemented in the public health system in Israel.
Low back pain (LBP) is the leading cause of disability worldwide, and its burden is growing as the population increases and life expectancy lengthens. Most cases of LBP are non-specific, that is, without any identified spinal pathology. Poor understanding of etiology and prognosis complicates the selection of effective treatment for non-specific LBP. The STarT Back approach, in which patients are selected for treatment by prognostic classification, has been suggested to improve clinical outcomes. The STarT Back is a cost-effective, evidence-based approach that uses a screening tool with demonstrated clinical utility in predicting outcomes of patients with LBP (mainly the tendency to chronicity). The STarT Back screening tool (SBT) supports primary care decision making by allocating patients with acute and subacute non-specific LBP to three prognostic groups (low, medium and high risk) with matched treatment pathways. The SBT has repeatedly proven to be a valid and reliable instrument and has been well accepted by both patients and clinicians. Recently, the NICE guidelines for the assessment and management of LBP and sciatica recommended using risk assessment and risk stratification tools such as the STarT Back.
The original SBT was developed in English and has since been translated into and validated in several languages. The SBT was developed to screen patients with acute LBP at an early stage in order to predict the transition to chronic LBP. Hence, we selected only patients in the early stages (acute and sub-acute) for our sample. This study aimed to translate and validate the SBT by investigating its psychometric properties among Israelis with acute and sub-acute LBP, and to evaluate its ability to predict disability after three months. We hypothesized that the SBT would show high positive correlations with disability, depression, and anxiety, and moderate positive correlations with pain and fear avoidance of exercise.
STarT Back Screening Tool
The SBT is a 9-item questionnaire that classifies patients into three risk categories according to the presence of physical and psychosocial risk factors for persistent LBP symptoms. The nine items are divided into physical and psychosocial subscales. The physical subscale includes four items: referred leg pain, disability (2 items), and comorbid pain. The psychosocial subscale includes five items: bothersomeness, catastrophizing, fear, anxiety, and depression. Each item is scored as positive or negative, and the points are summed to give the total score (range 0–9). Items 5 to 9 form the psychosocial subscale (range 0–5). The scores are used to categorize the patient’s risk level: low risk if the total score is 0–3, high risk if the psychosocial subscale score is 4 or 5, and medium risk for all others.
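The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text: the function name `classify_sbt` and the 0/1 coding of negative/positive items are assumptions made for the example, not part of the published tool.

```python
from typing import Sequence, Tuple


def classify_sbt(items: Sequence[int]) -> Tuple[int, int, str]:
    """Return (total score, psychosocial score, risk group) for one patient.

    `items` holds the nine item scores in questionnaire order,
    coded 1 for a positive item and 0 for a negative one (assumed coding).
    """
    if len(items) != 9 or any(score not in (0, 1) for score in items):
        raise ValueError("expected nine item scores, each 0 or 1")

    total = sum(items)             # all nine items, range 0-9
    psychosocial = sum(items[4:])  # items 5 to 9, range 0-5

    if total <= 3:
        risk = "low"
    elif psychosocial >= 4:
        risk = "high"
    else:
        risk = "medium"
    return total, psychosocial, risk


# Two positive physical items and four positive psychosocial items.
print(classify_sbt([1, 0, 1, 0, 1, 1, 1, 1, 0]))  # (6, 4, 'high')
```

The low-risk check comes first because a total of 3 or less already implies a psychosocial score below 4, so the two rules cannot conflict.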
The translation was done with the permission of the Keele University researchers who developed the SBT. The translation process followed recommendations for best practice in questionnaire translation (“Process of Translation and Adaptation of Instruments” n.d.). The SBT was first translated forward into Hebrew and then back into English by two independent pairs of translators fluent in both English and Hebrew. The forward translators (English to Hebrew) spoke English fluently and had Hebrew as their mother tongue. They were aware of the concepts behind the questionnaire. Discrepancies were resolved through mutual discussion. The back translators (Hebrew to English) had English as their mother tongue, were fluent in Hebrew, and were unaware of the concepts behind the questionnaire. The expert committee consisted of two physiotherapists specializing in LBP and LBP research and one psychologist specializing in research on psychological aspects of pain populations and their measurement. After the original and back-translated versions were compared, the observed differences were discussed and a pre-final Hebrew version was developed. The pre-final version was then discussed with ten patients with LBP, who commented on the burden, ease of understanding, comprehensiveness, and readability of the translated version. No difficulties in comprehension were noted at this stage, and a final version was produced.
We tested the correlations of the following measures with the SBT: 1) the Roland-Morris Disability Questionnaire (RMDQ) as a measure of back disability, 2) the numerical pain rating scale (NPRS) for the most severe and average pain intensity (0 = no pain and 10 = the worst pain), 3) the Hospital Anxiety and Depression Scale (HADS) as a measure of anxiety and depressive symptoms, and 4) the Fear-Avoidance Beliefs Questionnaire (FABQ), which measures fear-avoidance of exercise and work. We excluded the work section of the FABQ because the SBT contains no question about work. These questionnaires were selected because they are considered appropriate for studying the SBT’s construct validity and are frequently used in LBP research. In addition, age, sex, weight, height, smoking habits, occupation, and employment status (employed, unemployed, on sick leave, retired) were included in the baseline data collection questionnaire.
Procedure for recruitment
Patients with LBP were recruited between March 2018 and June 2019, from three large outpatient physical therapy clinics of Clalit Health Services. Inclusion criteria were patients experiencing acute (less than six weeks) or subacute (6–12 weeks) LBP, with or without radicular pain, age of at least 18 years, and the ability to understand the Hebrew language. Exclusion criteria were chronic pain (more than 12 weeks of pain) and suspected red flags.
Patients were recruited by their treating physiotherapist at the initial assessment. Potential participants were asked for their permission to undergo a telephone interview. After patients gave their consent, a researcher telephoned them to conduct the interview and complete the baseline questionnaires. For the test-retest assessment, the same researcher called within one week to complete the SBT questionnaire again. Only patients who reported no change in their condition participated in the test-retest assessment. Three months after the first contact, the same researcher called each patient and completed all the questionnaires again. Overall, the entire validation process, for all versions of the questionnaire, was conducted by telephone. The ethical review board of Clalit Health Services approved the study (No. 0157-17-COM2).
Fifty patients were included in the test-retest investigation. Patients were asked whether they had improved or not over the past week and were included only if they reported ‘no change’ in their symptoms.
The sample size was calculated with G*Power 188.8.131.52 using the z-test family to detect the correlation between two measures, the SBT total score and disability (the primary outcome in the SBT trial). The input parameters were as follows: a two-tailed test, a medium effect size of 0.5, α = 0.05 and power (1 − β) = 0.95. The recommended total sample size was 147 participants.
Data analysis was performed using IBM SPSS Statistics version 25. Characteristics of the sample were described using frequencies, means with standard deviations, and the standard error of measurement. Normality was evaluated by inspecting each variable’s skewness and kurtosis. Equality of variances was examined with Levene’s test, which was non-significant for each variable examined. Internal consistency was measured by calculating Cronbach’s alpha for the SBT scale. Test-retest reliability of the SBT between baseline and the 1-week follow-up was evaluated in 50 patients by calculating the intraclass correlation coefficient (ICC) for the total score, the psychosocial subscale, the corresponding risk groups (i.e., low, medium or high risk; and low or high psychosocial score), and each item individually. A two-way mixed-effects, absolute-agreement ICC was used. ICC values were interpreted as follows: poor <0.40, fair 0.40–0.59, good 0.60–0.74, and excellent 0.75–1.00. Finally, standard error of measurement (SEM) values were calculated based on the differences between times of measurement, as in previous studies.
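To make the test-retest statistic concrete, the following Python sketch computes a two-way, absolute-agreement ICC from paired baseline and retest scores and maps it onto the interpretation bands given above. It is not the SPSS procedure used in the study. The single-measure form of the ICC is an assumption (the text does not say whether single or average measures were reported), and the function names and the small data set are invented for illustration.

```python
from typing import Sequence


def icc_absolute_agreement(scores: Sequence[Sequence[float]]) -> float:
    """Single-measure, two-way, absolute-agreement ICC.

    `scores` has one row per patient and one column per measurement
    occasion (for example baseline and retest).
    """
    n = len(scores)     # number of patients
    k = len(scores[0])  # number of occasions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: patients, occasions, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc: float) -> str:
    """Apply the bands used in the text: poor, fair, good, excellent."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Hypothetical SBT total scores for five patients: (baseline, retest).
pairs = [(2, 2), (5, 4), (7, 7), (3, 4), (8, 8)]
icc = icc_absolute_agreement(pairs)
print(round(icc, 3), interpret_icc(icc))
```

Because the absolute-agreement form includes the between-occasion variance in the denominator, a systematic shift between baseline and retest lowers the coefficient even when patients keep the same rank order.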
Construct validity was assessed by analyzing the correlations between the SBT (total score and psychosocial subscale) and reference questionnaires (NPRS, RMDQ, HADS, FABQ) using Spearman’s correlation coefficients.
Correlation strength was interpreted as weak (<0.30), moderate (0.30–0.59), or strong (≥0.60). Differences between risk groups were examined with a two-way repeated-measures analysis of variance (ANOVA) to verify change in clinical outcomes from baseline to three months. The factors were time (baseline and three months) and baseline risk category (low, medium and high).
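As an illustration of the construct-validity analysis, the sketch below computes Spearman’s correlation coefficient (Pearson’s correlation on average ranks, which handles tied scores) and labels its strength with the thresholds stated above. The study itself used SPSS; the function names here are hypothetical, and applying the thresholds to the absolute value of the coefficient is an assumption.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def correlation_strength(rho: float) -> str:
    """Thresholds from the text, applied to |rho| (assumed)."""
    size = abs(rho)
    if size < 0.30:
        return "weak"
    if size < 0.60:
        return "moderate"
    return "strong"


# The reported coefficient between SBT total score and RMDQ was 0.82.
print(correlation_strength(0.82))  # strong
```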
To assess the ability of the SBT to identify patients with poor disability outcomes after three months, we used receiver operating characteristic (ROC) curve analysis. The area under the ROC curve was calculated for the tool’s total score against a reference standard cut-off for poor disability after three months (RMDQ ≥7). Values were interpreted as follows: not acceptable <0.5, acceptable 0.7–0.8, excellent 0.8–0.9, and outstanding 0.9–1.00.
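The discrimination analysis can be sketched as follows. The example dichotomizes the three-month RMDQ score at the cut-off given above and computes the area under the ROC curve as the proportion of (poor outcome, good outcome) patient pairs in which the patient with the poor outcome had the higher baseline SBT score, counting ties as one half. This pairwise formulation is a standard equivalent of the area under the empirical ROC curve, not necessarily the routine SPSS uses, and the function name and data are hypothetical.

```python
from typing import Sequence

POOR_RMDQ_CUTOFF = 7  # RMDQ >= 7 at three months is treated as a poor outcome


def auc_for_poor_outcome(
    sbt_scores: Sequence[float], rmdq_followup: Sequence[float]
) -> float:
    """Area under the ROC curve of baseline SBT score for poor disability."""
    poor = [s for s, r in zip(sbt_scores, rmdq_followup) if r >= POOR_RMDQ_CUTOFF]
    good = [s for s, r in zip(sbt_scores, rmdq_followup) if r < POOR_RMDQ_CUTOFF]
    if not poor or not good:
        raise ValueError("both outcome groups must contain at least one patient")

    wins = 0.0
    for p in poor:
        for g in good:
            if p > g:
                wins += 1.0   # poor-outcome patient scored higher at baseline
            elif p == g:
                wins += 0.5   # tied scores count as half
    return wins / (len(poor) * len(good))


# Hypothetical baseline SBT totals and three-month RMDQ scores.
sbt = [1, 2, 3, 4]
rmdq = [2, 10, 3, 12]
print(auc_for_poor_outcome(sbt, rmdq))  # 0.75
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two outcome groups.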
During the forward and backward translations, we found minor linguistic differences in the following items: item 1 (“spread down my legs”), items 1 and 2 (“at some time”), item 5 (“it’s not really safe”), item 6 (“worrying thoughts”) and item 9 (“extremely”). Translation of item 1 was challenging because not all Hebrew speakers understand the meaning of pain that “radiates” down the leg. The research team had noted “radiate” a priori as a problematic term that did not translate well, and it was removed before the questionnaire was tested with patients; it was agreed to use the term “spread down my legs” instead. Additionally, the term “at some time” required rewording in Hebrew. In item 5, the translation of “it’s not really safe” required much consideration. After consultation with the developer of the SBT, who recommended emphasizing the severity of the feeling, it was translated as “dangerous.” The same considerations applied to “worrying thoughts” and “extremely.”
Characteristics of the patients who completed the first set of questionnaires (n = 150), stratified by SBT risk group, are described in Table 1. Forty-eight patients (32.5%) were allocated to the low-risk group, 75 (49.7%) to the medium-risk group, and 27 (17.9%) to the high-risk group. ANOVA revealed that the mean scores of NPRS, RMDQ, FABQ, and HADS differed significantly across SBT risk groups at baseline (Table 1). Post-hoc analyses were used to determine significant differences between each pair of risk groups.
| Variable | All patients (N = 150) | Low risk (N = 48, 32%) | Medium risk (N = 75, 50%) | High risk (N = 27, 18%) | P value |
|---|---|---|---|---|---|
| Age | 53.5 ± 19.5 | 51.7 ± 21.4 | 55.7 ± 18.1 | 50.7 ± 19.5 | 0.381 |
| Gender, n (%) women | 87 (58%) | 25 (51%) | 46 (61.3%) | 16 (59.3%) | 0.597 |
| Symptom duration (weeks) | 6.5 ± 3.6 | 7.3 ± 3.6 a | 6.4 ± 3.5 | 5.1 ± 3.6 a | 0.03 a |
| BMI | 26.0 ± 4.4 | 24.5 ± 3.8 a | 27.1 ± 4.4 a | 25.6 ± 4.6 | 0.004 a |
| Smoking, n (%) | 32 (21.3%) | 6 (12.2%) a | 16 (21.3%) | 10 (37%) a | 0.04 a |
| Employed, n (%) | 79 (52.7%) | 32 (65.3%) | 35 (46.7%) | 12 (44.4%) | |
| Unemployed, n (%) | 9 (6%) | 1 (2%) | 5 (6.7%) | 3 (11.1%) | |
| Sick leave, n (%) | 18 (12%) | 2 (4.1%) | 11 (14.7%) | 5 (18.5%) | |
| Retired, n (%) | 44 (29.3%) | 13 (26.5%) | 24 (32%) | 7 (25.9%) | |
| Disability RMDQ (0–23) | 13.05 ± 5.6 | 7.0 ± 3.8 a, b | 15.4 ± 3.8 a | 17.2 ± 3.5 a, b | 0.001 a |
| Severe pain (0–10) | 7.5 ± 2.3 | 6.3 ± 2.2 a, b | 7.9 ± 2.1 a | 8.7 ± 2.1 a, b | 0.001 a |
| Average pain (0–10) | 6.0 ± 2.1 | 4.8 ± 1.8 a, b | 6.3 ± 1.9 a | 7.1 ± 2.1 a, b | 0.001 a |
| Fear avoidance exercise (0–24) | 9.6 ± 6.3 | 5.5 ± 4.8 a | 10.3 ± 5.9 a | 15.2 ± 5.0 a | 0.001 a |
| Anxiety (0–21) | 6.1 ± 5.1 | 2.2 ± 2.9 a | 7.2 ± 5.0 a | 10.0 ± 4.4 a | 0.001 a |
| Depression (0–21) | 5.7 ± 4.7 | 1.8 ± 1.7 a | 6.6 ± 4.5 a | 9.9 ± 4.1 a | 0.001 a |