“CATCH-IT Reports” are Critically Appraised Topics in Communication, Health Informatics, and Technology, discussing recently published eHealth research. We hope these reports will draw attention to important work published in journals, provide a platform for discussion around results and methodological issues in eHealth research, and help to develop a framework for evidence-based eHealth. CATCH-IT Reports arise from “journal club”-like sessions founded in February 2003 by Gunther Eysenbach.

Monday, December 14, 2009

Final CATCH-IT Report: Syndromic Surveillance Using Ambulatory Electronic Health Records

Hripcsak G, Soulakis ND, Li L, Morrison FP, Lai AM, Friedman C, Calman NS, Mostashari F. (2009). Journal of the American Medical Informatics Association, 16(3):354-61. Epub 2009 Mar 4.

Abstract & Blog Comments
Draft
Slideshow - not able to upload

Introduction
Syndromic surveillance is a type of surveillance that uses health-related data to predict or detect disease outbreaks or bioterrorism events. Much of the work in this area of research has been conducted on structured health data (1)(2). However, these systems typically need to be tailored to a particular IT system and institution due to a lack of available data standards. Engineering syndromic surveillance systems in this way is time consuming, and the resulting systems work only locally. On the other hand, utilizing narrative records for syndromic surveillance brings unique challenges, as these data require natural language processing. Several studies have successfully experimented with this approach (3)(4). With the increasing adoption of electronic health records (EHRs) there is an abundance of clinically relevant data that could potentially be used for syndromic surveillance. As a result, creating a generic syndromic surveillance system that could be broadly applied and disseminated across institutions is attractive. This CATCH-IT report is a critique of a research paper detailing an approach to creating such a system (5).

Objectives
The aim of this study was to develop and assess the performance of a syndromic surveillance system using both structured and narrative ambulatory EHR data. The evaluation methodology suggests that the authors may be trying to assess the system’s performance based on its concurrent validity with other existing surveillance systems.

Hypothesis
Not explicitly stated. It appears implicitly that the authors expect the signals from the test systems and the ED data to occur at the same time (no lag).

Methodology

Setting
The Institute for Family Health (IFH) served as the data source for testing the surveillance systems. IFH is comprised of 13 community health centres in New York, all of which use the Epic EHR system.

Query Development
The authors took two different approaches to developing their syndromic surveillance system: a tailored approach on structured data and a generic approach on narrative data. The two syndromes of interest, influenza-like illness (ILI) and gastrointestinal infectious disease (GIID), were defined by two physicians. Both sets of queries were developed based on these definitions. The tailored queries were created specifically for IFH by mapping key terms to available system data and using past influenza season data. The performance of the tailored queries for ILI and GIID was not thoroughly evaluated. The MedLEE natural language processing (NLP) system (6) was utilized to create the generic queries for narrative data. Generic queries were tested on internal medicine ambulatory notes from the Columbia University Medical Centre (CUMC). These queries were evaluated using a gold standard that was produced by a manual review of a subset of clinical notes. Queries were then selected based on their ROC performance.

Evaluation
The resulting queries were tested on 2004-2005 data from the Institute for Family Health (IFH). All structured notes with recorded temperature (124,568) and de-identified narrative notes (277,963) were analyzed. The results of the two test systems were compared with two existing sources, the New York City Emergency Department chief complaint syndromic surveillance system (NYC ED) (7) and the New York World Health Organization (WHO) influenza isolates, using the lagged cross-correlation technique. The NYC ED served as the only comparison source for GIID.
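
To make the lagged cross-correlation technique concrete, the following is a minimal sketch, not the authors' implementation. It assumes two equally long weekly count series and computes the Pearson correlation at each shift; the function names and the toy counts are hypothetical and only illustrate the idea.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_cross_correlation(test, reference, max_lag):
    """Return {lag: r}. A positive lag means the test series trails the reference."""
    results = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            # Pair test[t + lag] with reference[t].
            a, b = test[lag:], reference[:len(reference) - lag]
        else:
            # Pair test[t] with reference[t + |lag|].
            a, b = test[:len(test) + lag], reference[-lag:]
        results[lag] = pearson(a, b)
    return results


# Hypothetical weekly counts: the test signal is the reference delayed by one week.
reference = [2, 3, 5, 9, 14, 9, 5, 3, 2, 2]
test = [2] + reference[:-1]

correlations = lagged_cross_correlation(test, reference, max_lag=3)
best_lag = max(correlations, key=correlations.get)
print(best_lag, round(correlations[best_lag], 2))
```

The lag with the highest correlation estimates how far one signal leads or trails the other. A flat or asymmetric profile across lags makes that estimate hard to interpret.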

Results
For ILI, the lagged cross-correlations showed that the IFH structured and narrative signals correlated strongly both with the NYC ED signal (0.93 and 0.88, respectively) and with one another (0.88). The correlation with WHO isolates was also high (0.89 structured, 0.84 narrative), but it was less precise and produced an asymmetric lagged cross-correlation shape, which hindered interpretation of the true lag.

GIID results were more ambiguous. While IFH structured data correlated relatively well with the NYC ED data (0.81), the IFH narrative data correlated poorly with both the IFH structured data and the NYC ED data (0.51 and 0.47, respectively). This result indicated that there was a particular problem with the generic narrative approach on GIID data. However, across all GIID comparisons the correlations lacked precision (wide confidence intervals) and the true lag was difficult to interpret.

Authors’ Conclusions
The authors concluded that the tailored structured EHR system correlated well with validated measures with respect to both syndromes. While the narrative EHR data performed well only on ILI data, the authors believe this approach is feasible and has the potential for broad dissemination.

Methodological Issues & Questions

Query Development
Both sets of queries were based on syndrome definitions created by two domain experts. While the definitions of ILI and GIID are fairly well established, it is not clear why the authors decided to employ expert opinion for their definitions rather than using a standard definition from the CDC or the WHO. Although the definitions used in this study are valid and likely do not represent a major source of error, any contentions on this issue could have easily been avoided.

The bigger methodological issue with respect to the structured query development is the lack of query evaluation. A crude measure of sensitivity was calculated for the ILI query, but no manual review was undertaken to produce measures of specificity or predictive value. Of even more concern, the performance of the GIID query was not assessed at all. There does not appear to be any logical reason for these methodological omissions, and this flaw calls into question the validity of the analyses of the structured query.
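
For readers less familiar with these measures, here is a minimal sketch of how sensitivity, specificity, and predictive values would follow from a manual review of query results. The counts are hypothetical and are not taken from the study; the function name is also an assumption made for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard accuracy measures from a 2x2 table of
    query result (positive/negative) versus manual review (case/non-case)."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the query flags
        "specificity": tn / (tn + fp),  # share of non-cases the query leaves unflagged
        "ppv": tp / (tp + fp),          # share of flagged notes that are true cases
        "npv": tn / (tn + fn),          # share of unflagged notes that are non-cases
    }


# Hypothetical review of 200 notes.
metrics = diagnostic_metrics(tp=40, fp=10, fn=10, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting sensitivity alone, as was done for the ILI query, leaves the false-positive side of this table unknown.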

While the narrative query development was described in more detail than the structured query development, it is not without its flaws. CUMC notes were used for testing in this phase. However, the lack of context surrounding the CUMC makes it difficult to determine the robustness and generalizability of the query. For instance, it is not mentioned what EHR system is used, what kind of patient population is seen at this institution, or why ambulatory internal medicine notes were used for query testing. It seems counter-intuitive to use notes from a medical specialty to create a query for primary care ambulatory notes. Additionally, only notes generated by physicians who used the EHR were used and we are not told how many physicians encompassed this population.

To their benefit, the authors do conduct a manual review of a subset of the notes to produce a gold standard, but this process is not detailed clearly. It is unknown how many notes were used for this process and why only one reviewer undertook this process. Ideally, at least two reviewers would perform the review and a measure of inter-rater reliability (kappa) would be reported.
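
As an illustration of the inter-rater reliability measure suggested above, the following sketch computes Cohen's kappa for two reviewers. The ratings are hypothetical (1 = note meets the syndrome definition, 0 = it does not), and the function name is an assumption; this is not part of the study.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Observed proportion of notes on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten notes by two reviewers.
reviewer_1 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
reviewer_2 = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 2))
```

In this toy example the reviewers agree on 8 of 10 notes, yet kappa is noticeably lower than 0.8 because part of that agreement is expected by chance.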

Comparison Data
A large part of this study’s value rests on the comparisons made between the test IFH systems and the established systems (NYC ED and WHO). However, the fundamental question here is whether or not these comparisons are valid and appropriate as the patient population and data range may differ greatly.

The NYC ED system utilizes chief complaints, which are timely and produce good agreement for respiratory and GI illness (7). However, patients with GIID or mild respiratory symptoms may not go to the ED. This limitation has come to light as the system has failed in the past to detect GI outbreaks (7), and it indicates that the NYC ED data may not represent a gold standard. Additionally, the authors of the current study propose that the poor performance of the GIID narrative query may be due to the fact that the NYC ED covers a much broader geographical area. The implication here is that GIID are often localized and therefore may not be captured by local IFH community clinics. These distinctions between the NYC ED and IFH may account for some of the ambiguous results obtained and raise doubts about the decision to use the NYC ED as a comparison. However, it is likely that no alternative comparison source exists and that the NYC ED therefore represented the best available data source.

While the hypothesis is not explicitly stated, it appears that the authors would expect that their surveillance system would produce signals concurrent with the emergency signals. However, this assumption may not be valid as there is nothing to suggest that primary care signals would behave in this manner.

WHO isolates are used as the second ILI comparison in this study, but are not described in any noteworthy detail. This hinders the reader in understanding the appropriateness of this source and thereby the meaningfulness of the cross-correlation results. A brief search on the internet revealed that the WHO has a National Influenza Centre (NIC) in NY which is part of a larger WHO Global Influenza Surveillance Network. The NIC samples patients with ILI and submits their biomedical isolates to the WHO for analysis. The WHO in turn uses the information for pandemic planning. The amount of testing that these centres conduct depends on the phase of the flu season, with more testing occurring at the start of the season in order to confirm influenza and less testing occurring at the peak of the season due to practicalities. Because the WHO NIC is very different in motivation, operation, and scale from the NYC ED and IFH syndromic surveillance systems, its appropriateness as a comparison is questionable. In their study of the NYC ED, Heffernan et al. (7) include the WHO isolates as a visual reference to provide context for their own signals, but they do not attempt to use them as a correlation metric. Given the aforementioned reasons, this may be the most appropriate way to use the WHO isolates.

Syndrome Keywords
The last area of discussion is regarding the keywords used to define the syndromes. A breakdown of the keywords used in each study found that only 3 terms (fever, viral syndrome, and bronchitis) were similar between the NYC ED and IFH queries. Interestingly, 4 terms used in the IFH queries (cold, congestion of the nose, sneezing, and sniffles) were exclusion criteria for ILI in the NYC ED. If the queries used across the studies are dissimilar, it may indicate that they are not identifying the same patients and this would raise more issues concerning the meaningfulness of the cross-correlation results.

Surprisingly, the GIID keywords for both NYC ED and IFH were similar. This complicates the interpretation of the GIID results, but may suggest that the issue here is with the geographical range and patient population.

Discussion
Due to the methodological issues discussed above, it is difficult to determine the validity and salience of the conclusions reached by the authors. As this project was exploratory in nature it would be beneficial if the authors took a step back to carefully review their queries and revise them appropriately after thorough evaluation of each query’s sensitivity, specificity, and predictive value. The most salient problem should be to develop internal consistency and reliability between the two IFH systems before attempting to compare their performance with external measures that may or may not be appropriately matched.

Overall, while the study is under practical limitations in its choice of comparative data sources, the authors presented an interesting idea that will likely be of use in the future and should be developed and evaluated more carefully.

Q’s for authors

1) How were abbreviations, misspellings, acronyms, and localized language dealt with in the narrative query development?
2) Why were internal medicine notes chosen for the IFH system if one is trying to create a general approach?
3) Can you please provide more detail about the WHO isolates and why they were chosen as comparison data?

References

1) Cochrane DG, Allegra JR, Chen JH, Chang HG. Investigating syndromic peaks using remotely available electronic medical records. Advances Dis Surveill 2007;4:48.
2) Thompson MW. Correlation between alerts generated from electronic medical record (EMR) data sources and traditional sources. Advances Dis Surveill 2007;4:268.
3) South BR, Gundlapalli AV, Phansalkar S, et al. Automated detection of GI syndrome using structured and non-structured data from the VA EMR. Advances Dis Surveill 2007;4:62.
4) Chapman WW, Dowling JN, Wagner MM. Fever detection from free-text clinical records for biosurveillance. J Biomed Inform 2004;37(2):120-7.
5) Hripcsak G, Soulakis ND, Li L, Morrison FP, Lai AM, Friedman C, Calman NS, Mostashari F. Syndromic surveillance using ambulatory electronic health records. J Am Med Inform Assoc 2009;16(3):354-61. Epub 2009 Mar 4.
6) Friedman C, Shagina L, Lussier Y, Hripcsak G. Automated encoding of clinical documents based on natural language processing. J Am Med Inform Assoc 2004; 11(5):392-402.
7) Heffernan R, Mostashari F, Das D, Karpati A, Kulldorff M, & Weiss D. Syndromic surveillance in public health practice, New York City. Emerging Infectious Diseases 2004; 10 (5): 858-864.

Monday, December 7, 2009

CATCH-IT Final Report: Web-based weight loss in primary care: A RCT

Paper: Bennett GG, Herring SJ, Puleo E, Stein EK, Emmons KM and Gillman MW. Web-based Weight Loss in Primary Care: A Randomized Controlled Trial. Obesity (2009) Advance online publication, 20 August 2009. DOI:10.1038/oby.2009.242

Abstract: click here
Slide presentation: click here
Draft report: click here

Introduction
The purpose of this paper is to review the study by Bennett et al.(1) on their web-based behavior modification intervention for weight loss. With rising obesity rates around the world,(2) there is a need for weight loss interventions that are accessible to a larger number of individuals. Behavior therapy can significantly enhance comprehensive weight loss strategies,(3) but access to lifestyle interventions is limited by the cost and availability of counseling services. The authors present a web-based tool with the potential for wide-scale implementation at low cost.

Objective
The objective of the study was to evaluate the short-term (12-week) efficacy of a web-based intervention in primary care patients with obesity (BMI 30 to 40 kg/m2) and hypertension.

Methods
A total of 101 obese, hypertensive patients were randomized to receive either the web-based intervention (n=51) or usual care (n=50). Intervention participants had access to the comprehensive weight loss website for 3 months, and four counseling sessions (two in-person sessions and two telephone sessions). Counseling was provided by a health coach (registered dietician) trained to use principles of motivational interviewing. The health coach provided counseling on “obesogenic” behavior goals (determined at the start of the intervention). Participants could select new goals at week 6. The primary purpose of the website was to facilitate daily self-monitoring of adherence to behavior change goals.

Participants in the usual care group received the standard care offered by the outpatient clinic. They were also given a copy of the “Aim for a Healthy Weight” document published by the National Heart Lung and Blood Institute.(4)

At baseline, and at 3 month follow-up, participants completed a web-based survey followed by anthropometric measures and blood pressure assessments. Participants were offered $25 for attending each assessment.

Results
Primary outcome: Greater weight loss was reported in the intervention group (-2.28 +/- 3.21 kg) than in the usual care group (0.28 +/- 1.87 kg); mean difference -2.56 kg (95% CI -3.60, -1.53). Intervention participants lost a greater percentage of baseline body weight (-2.6% +/- 3.3%) than usual care participants (0.39% +/- 2.16%); mean difference -3.04% (95% CI -4.26, -1.83). About a quarter of intervention participants (25.6%) lost >5% of their initial body weight at 12 weeks. None of the usual care participants lost >5% of body weight in the study period.

Secondary outcomes: A reduction in BMI was observed in the intervention group (-0.94 +/- 1.16 kg/m2) compared to an increase in BMI in the usual care group (0.13 +/- 0.75 kg/m2); mean difference -1.07 kg/m2 (95% CI -1.49, -0.64). No statistically significant differences were found for waist circumference, systolic blood pressure, or diastolic blood pressure.
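
As a quick consistency check, a difference in group means should equal the intervention mean minus the usual care mean and should fall inside its own confidence interval. The sketch below recomputes the weight and BMI differences from the group means summarized above, rounded to two decimals; the helper name is hypothetical.

```python
def mean_difference(intervention_mean, control_mean):
    """Difference in mean change (intervention minus usual care), rounded to 2 decimals."""
    return round(intervention_mean - control_mean, 2)


# Mean change in body weight (kg) and BMI (kg/m2) over 12 weeks.
weight_diff = mean_difference(-2.28, 0.28)
bmi_diff = mean_difference(-0.94, 0.13)

# Reported 95% confidence interval for the BMI difference.
bmi_ci_low, bmi_ci_high = -1.49, -0.64
bmi_inside_ci = bmi_ci_low <= bmi_diff <= bmi_ci_high

print(weight_diff, bmi_diff, bmi_inside_ci)
```

A point estimate that falls outside its reported interval usually signals a transcription error rather than a real finding.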

Participants meeting the login goal (3 times per week) for at least 6 weeks had greater weight loss (-3.30 +/- 3.78 kg) than those who met the login goal for less than 6 weeks (-0.42 +/- 1.78 kg); mean difference: -2.88 kg (95% CI -1.56, -4.60). Those who met the login goal for 10 weeks (83% of study weeks) demonstrated much greater weight loss (-4.50 +/- 3.29 kg) than those who did not (-0.60 +/- 1.87 kg); mean difference: -3.90 kg (95% CI -2.43, -5.36).

No association was found between participation in four coaching sessions and weight loss.

Discussion
Bennett et al. reported short-term weight loss of -2.28 +/- 3.21 kg using a web-based behavioral intervention. While this result is statistically significant, the small amount of weight loss is of limited clinical significance. Obesity guidelines suggest losing 10% of initial body weight (e.g., 10 kg for a 100 kg person) for clinical benefits.(5,6)

The weight loss reported in the study was somewhat lower than weight loss reported in other internet-based weight loss interventions.(7) The authors suggested that dietary restrictions would be necessary to achieve results of larger magnitude. However, the rationale for treating obese patients with behavior therapy alone is not clear. Canadian obesity guidelines recommend diet and physical activity as the first-line treatment of obesity. Behavior interventions are considered an adjunct to other interventions.(3)

The relatively short study period was also somewhat unusual since long-term weight loss is the main challenge in obesity. The trial period of 12 weeks is much shorter than study periods reported by other studies.(6)

According to the authors, the intervention was developed to overcome the challenge of long-term adherence. Adherence to behavior change strategies often wanes over time. Unfortunately, adherence to the web-based intervention also decreased over the study period: 78% of participants met the login goal at week 1, versus 43.1% at week 12.

The authors suggested that the web-based intervention could be implemented without a health coach. However, it may be less successful without coach support, since a research assistant (health coach) contacting participants can act as a “push” factor that lowers attrition rates.(8)

Some of the references noted in the paper do not support the text. For example, on page 2, the paper states that “research staff subsequently collected anthropometric measures and blood pressure using established procedures (reference 20).” However, reference 20 refers to a 24-page NHANES food questionnaire.(9)

Some of the numbers reported in the paper need revisiting. For example, Table 1 (page 2) indicates that intervention participants (n = 51) had a higher body weight (101.0 kg +/- 15.4) than usual care participants (n = 50) with a body weight of (97.3 kg +/- 10.9). But, the body weight of all participants (n=101) was also reported as (97.3 kg +/- 10.9).

Despite all the weaknesses noted above, the study was well designed. The paper included most items on the CONSORT(10,11) and STARE-HI(12) checklists. Although the authors did not explicitly state that this was a pilot study, it appears that they have already begun other trials using their iOTA approach (see website).

The web intervention included many interesting features, such as the display of the “average performance for other program participants,” regular updates to the behavioral skills needed to adhere to obesogenic behavior change goals, a social networking forum, and recipes.

The obesogenic behavior change goals are very short and simple ("Walk 10,000 steps every day," "Watch 2 h or less of TV every day," "Avoid sugar-sweetened beverages," "Avoid fast food," "Eat breakfast every day," and "No late night meals and snacks"). Christensen(13) suggests that “shorter interventions” could be the primary role of the internet in disease prevention instead of delivery of lengthy therapy that requires hours of online work.

In terms of future research, it would be interesting to compare the web based intervention with other weight loss interventions. Although comparison with “usual care” may be common practice in randomized controlled trials, a “head-to-head” trial with an alternative intervention may contribute more knowledge. For instance, a comparison of the web based intervention with a paper based monitoring tool could be very informative, since it would allow analysis of web enabled features.

Questions
1. In the usual care group, what was standard care? How many visits did the usual care group make to the primary care provider for weight reduction?
2. Of the 124 ineligible participants, what were the reasons for ineligibility?
3. What was the web based survey completed by participants at baseline and at 3 months follow-up? Was the NHANES food questionnaire used as the web based survey? What were the results of the survey?
4. Table 3 (page 4) excluded the one participant who did not login once. Given that the range of logins from week 1 to week 12 includes “0”, why was this data omitted?
5. Intervention participants received “two 20-min motivational coaching sessions in person (baseline and week 6), and two, 20-min biweekly sessions via telephone (week 3 and 9)”. Were there biweekly telephone coaching sessions in addition to the two telephone sessions at week 3 and week 9? What was the impact of the “message feature that allowed for direct communication with the coach”? Was there extended access to the health coach beyond the four sessions throughout the 12 week study period?
6. With regards to the web-based intervention, what was being tracked by the web-based intervention (number of times eat out, number of stairs walked)? How many minutes on average did each session take? What behaviour skills were presented on the website and updated biweekly? What was the impact of the social networking forum?

References
1. Bennett GG, Herring SJ, Puleo E, Stein EK, Emmons KM and Gillman MW. Web-based Weight Loss in Primary Care: A Randomized Controlled Trial. Obesity (2009) Advance online publication, 20 August 2009. DOI:10.1038/oby.2009.242
2. OECD Health Data 2009: How Does Canada Compare. http://www.oecd.org/dataoecd/46/33/38979719.pdf
3. Lau DCW, Douketis JD, Morrison KM, Hramiak IM, Sharma AM, Canadian clinical practice guidelines on the management and prevention of obesity in adults and children. CMAJ 2007;176(8 suppl):Online-1–117 www.cmaj.ca/cgi/content/full/176/8/S1/DC1
4. National Heart, Lung, and Blood Institute, National Institutes of Health. Aim for a Healthy Weight. Washington DC: US Department of Health and Human Services, 2005. http://www.nhlbi.nih.gov/health/public/heart/obesity/aim_hwt.pdf
5. Wadden TA, Butryn ML, Wilson C. Lifestyle modification for the management of obesity. Gastroenterology. 2007 May;132(6):2226-38.
6. Sarwer DB, von Sydow Green A, Vetter ML, Wadden TA. Behavior therapy for obesity: where are we now? Curr Opin Endocrinol Diabetes Obes. 2009 Oct;16(5):347-52. DOI: 10.1097/MED.0b013e32832f5a79
7. Neve M, Morgan PJ, Jones PR, Collins CE. Effectiveness of web-based interventions in achieving weight loss and weight loss maintenance in overweight and obese adults: a systematic review with meta-analysis. Obesity Reviews. 2009 Sep 14. DOI: 10.1111/j.1467-789X.2009.00646.x
8. Eysenbach G. The Law of Attrition. J Med Internet Res. 2005;7(1):e11. doi: 10.2196/jmir.7.1.e11. http://www.jmir.org/2005/1/e11/v7e11
9. National Heart, Lung, and Blood Institute, National Institutes of Health. Aim for a Healthy Weight. Washington DC: US Department of Health and Human Services, 2005. http://www.nhlbi.nih.gov/health/public/heart/obesity/aim_hwt.pdf
10. Moher D, Schulz KF, Altman DG for the CONSORT Group. The CONSORT Statement: Revised Recommendations for Improving the Quality of Reports of Parallel-Group Randomized Trials. Ann Intern Med. 2001;134:657-662. (http://www.consort-statement.org)
11. Altman DG, Schulz KF, Moher D, Egger M, Davidoff F, Elbourne D, Gøtzsche PC, Lang T for the CONSORT Group. The Revised CONSORT Statement for Reporting Randomized Trials: Explanation and Elaboration. Ann Intern Med. 2001;134:663-694. (http://www.consort-statement.org)
12. Talmon J, Ammenwerth E, Brender J, de Keizer N, Nykänen P, Rigby M. STARE-HI--Statement on reporting of evaluation studies in Health Informatics. Int J Med Inform. 2009 Jan;78(1):1-9.
13. Christensen H, Mackinnon A. The Law of Attrition Revisited. J Med Internet Res. 2006;8(3):e20. doi: 10.2196/jmir.8.3.e20. http://www.jmir.org/2006/3/e20/

Friday, December 4, 2009

(Final) CATCH-IT Report: Effectiveness of Active-Online, An Individually Tailored Physical Activity Intervention, in a Real-life Setting: RCT

Wanner M., Martin-Diener E., Braun-Fahrländer C., Bauer G., Martin B.W. (2009). Effectiveness of Active-Online, an Individually Tailored Physical Activity Intervention, in a Real-Life Setting: Randomized Controlled Trial. J Med Internet Res, 11 (3): e23.


Original Post - Abstract Only - Full Text - Slideshow - Draft Report

Introduction

This report is a summary and analysis of the study conducted by Wanner et al. (2009) entitled Effectiveness of Active-Online, An Individually Tailored Physical Activity Intervention, in a Real-life Setting: Randomized Controlled Trial. The focus of the study was a web-based physical activity intervention called Active-Online, which provides users with customized advice on increasing their physical activity levels. The study compared the effectiveness of Active-Online to a non-tailored website in changing physical activity behaviour when delivered in a real-life setting.

The authors found that a tailored web-based intervention is not more effective than a non-tailored website when deployed in an uncontrolled setting.


Objective

The study aimed to answer the following three questions:

  1. What is the effectiveness of Active-Online, compared to a non-tailored website, in increasing self-reported and objectively measured physical activity levels in the general population when delivered in a real-life setting?
  2. Do participants of the randomized study differ from spontaneous users of Active-Online, and how does effectiveness differ among these groups?
  3. What is the impact of frequency and duration of use of Active-Online on changes in physical activity behaviour?


Methods

A randomized controlled trial (RCT) was used to answer the questions posed by the authors. Three groups of participants were observed during the trial—the control group (CG), the intervention group (IG), and the spontaneous users (SU) group. CG and IG participants were recruited through media advertisements and randomized into their respective groups using a computer-based random number generator. Participants in the SU group were recruited directly from the Active-Online website by redirecting them to the study website if they chose to participate in the study. A sub-group of participants volunteered to wear an accelerometer so that their physical activity levels could be objectively measured during the study.

Participants in IG and SU visited the Active-Online website and answered diagnostic questions about their physical activity behaviour to receive customized feedback on how to improve it. Those in CG visited a static website to receive generic tips on physical activity and health.

All groups were followed up via email at 6 weeks (FU1), 6 months (FU2) and 13 months (FU3) after the baseline assessment. There was no face-to-face component in the study.


Measures

Three types of data were collected in the study:

  1. Self-reported subjective measurements of physical activity levels obtained through follow-up questionnaires presented to all groups
  2. Objective measurements obtained from accelerometers worn by the subgroup
  3. Frequency and duration of visits to Active-Online obtained from the Active-Online user database which recorded each log-in to the website


Results

There was a significant increase in subjectively measured levels of physical activity among all groups from baseline to FU3, but no significant differences between randomized groups. However, the increases were more pronounced in the SU group. As for the objective measurements of physical activity obtained from accelerometer readings, there was no increase from baseline values in any of the groups. Measurements of frequency and duration of use of Active-Online showed an increase in self-reported total minutes of physical activity with increasing duration of use. However, this result was no longer significant when adjusted for stages of behaviour change (a concept based on the seven-stage behaviour change model as described by Martin-Diener et al. (2004)).


Limitations

The inclusion of SU as an additional study arm may be seen by some readers as an interesting and exploratory endeavour. However, others may find that it takes away from the clarity of the study. The fact that the SU group is not randomized, is not homogeneous with the two other groups, and represents only 7.4% of all visitors of Active-Online may cause some readers to wonder why it was included in the study at all. Moreover, the measurements obtained from this group were explicitly discounted in the “Discussion” section of the paper. It is suggested that the authors alert readers to the exploratory nature of the SU group early in the “Methods” section of the report, when this group is first introduced. Readers will thus be made aware that the SU group will not be counted towards the results of the study and has only been included to add another dimension of interest.

It is not clear in the report whether or not the authors had a set of eligibility criteria for participants, although this being a web-based study with no face-to-face component, it would have been difficult to enforce any eligibility criteria on participants at all. Furthermore, it is not known from the report how the authors ensured the uniqueness of participants. Participants were identified using unique email addresses, but it is quite likely that a single participant may have registered for the study multiple times using several different email addresses. This could seriously impact the study results if the same user was assigned to more than one user group as a result of using multiple email addresses.

In addition to the eligibility criteria, it is recommended that the authors provide a sample of the advertisement used to recruit participants and the questionnaire used at each follow-up, to comply with the CONSORT standards for reporting RCTs (CONSORT, 2009).

One other limitation of the study is that most of the participants already had high levels of physical activity at baseline, leading to a ceiling effect. As expected from such a large sample size, the results showed a regression towards the mean physical activity level in the population.


Discussion

Overall, the study was very well presented, with sufficient background information, clear writing, and appropriate use of tables and figures. The authors took a bold step in conducting a web-based intervention in an uncontrolled setting over a very long period of time. It is commendable that the authors frankly reported the limited effectiveness of their intervention when some researchers may have hesitated to do so. They did not go beyond their evidence to draw conclusions.

The study clearly answered the three questions set forth in the objective:

  1. The study found a significant increase in physical activity levels between baseline and last follow-up (FU3) in all groups; however, there was no difference in the results between the randomized groups.
  2. Spontaneous users differed from randomized users in baseline characteristics, and also showed a significant increase in physical activity levels after using the intervention, compared to the randomized groups.
  3. The impact of frequency and duration of use of Active-Online on changes in physical activity levels of participants is not clear after the study.

The results from this study resonate with those of similar studies investigating web-based physical activity interventions (Spittaels et al., 2007). The study adds to the existing evidence that the effectiveness of a web-based physical activity intervention may be difficult to demonstrate when delivered in an uncontrolled setting. It also highlights some of the key issues pertaining to web-based studies in real-life settings, including attrition and contamination of the control group. High attrition rates have been recognized as a common problem in Internet-based studies (Eysenbach, 2005), and this was evident in the present study as well. The authors also acknowledged that some members of the control group were familiar with and had used Active-Online at least once during the course of the study. This may have caused a bias towards the null.
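The direction of this bias can be illustrated with a small calculation. The sketch below assumes a deliberately simple model that does not come from the paper: a fraction of the control group receives the full intervention effect, and the rest receive none. The function name and the numbers are hypothetical.

```python
def expected_observed_difference(true_effect: float, contamination_rate: float) -> float:
    """Expected intervention-minus-control difference under contamination.

    Assumption: a fraction `contamination_rate` of the control group gets the
    full `true_effect`, so the control mean rises by that fraction of the effect.
    """
    if not 0.0 <= contamination_rate <= 1.0:
        raise ValueError("contamination_rate must lie between 0 and 1")
    return true_effect * (1.0 - contamination_rate)


# Hypothetical numbers: a true effect of 8 units, a quarter of controls exposed.
diluted = expected_observed_difference(true_effect=8.0, contamination_rate=0.25)
```

Under this model the observed difference shrinks in proportion to the share of exposed controls, so the reported between-group difference would understate the effect the intervention actually had.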

Results from this study will be particularly useful for researchers in the field of healthcare and sports medicine. Further research could include the delivery of web-based physical activity interventions within wider health promotion contexts such as primary care or workplace settings.


Questions to the Authors

  1. How do you account for the contamination of the CG in Internet-based studies such as this one?
  2. Could members of the CG have accessed Active-Online as SU (using a different email address)?
  3. What were the technical difficulties causing 38 participants to be omitted from the study?
  4. How did you validate the uniqueness of the participants?
  5. What was the reason for not measuring the usage of the non-tailored website?


References

CONSORT. (2009). The CONSORT Group. Retrieved November 14, 2009, from The CONSORT Group: http://www.consort-statement.org/

Eysenbach, G. (2005). The Law of Attrition. Journal of Medical Internet Research, 7(1), e11.

Martin-Diener, E., Thuring, N., Melges, T., & Martin, B. (2004). The Stages of Change in three stage concepts and two modes of physical activity: a comparison of stage distributions and practical implications. Health Education Research, 19(4), 406-417.

Spittaels, H., De Bourdeaudhuij, I., & Vandelanotte, C. (2007). Evaluation of a website-delivered computer-tailored intervention for increasing physical activity in the general population. Preventive Medicine, 44(3), 209-217.

Spittaels, H., De Bourdeaudhuij, I., Brug, J., & Vandelanotte, C. (2007). Effectiveness of an online computer-tailored physical activity intervention in a real-life setting. Health Education Research, 22(3), 385-396.

Wanner, M., Martin-Diener, E., Braun-Fahrlander, C., Bauer, G., & Martin, B. (2009). Effectiveness of Active-Online: An Individually Tailored Physical Activity Intervention, in a Real-Life Setting: Randomized Controlled Trial. Journal of Medical Internet Research, 11(3), e23.