Original Paper

Efficacy of Quality Criteria to Identify Potentially Harmful Information: A Cross-sectional Survey of Complementary and Alternative Medicine Web Sites

Muhammad Walji1, MS; Smitha Sagaram1, MBBS, MS; Deepak Sagaram1, MBBS; Funda Meric-Bernstam2, MD; Craig Johnson1, PhD; Nadeem Q Mirza2, MD; Elmer V Bernstam1, MD

1University of Texas Health Science Center at Houston, School of Health Information Sciences, Houston TX, USA
2Department of Surgical Oncology, The University of Texas M.D. Anderson Cancer Center, Houston TX, USA

Corresponding Author:
Elmer V Bernstam, MD
School of Health Information Sciences
University of Texas Health Science Center at Houston
7000 Fannin, Suite 600
Houston TX 77030
USA
Phone: +1 713 500 3901
Fax: +1 713 500 3929
Email:



ABSTRACT

Background: Many users search the Internet for answers to health questions. Complementary and alternative medicine (CAM) is a particularly common search topic. Because many CAM therapies do not require a clinician's prescription, false or misleading CAM information may be more dangerous than information about traditional therapies. Many quality criteria have been suggested to filter out potentially harmful online health information. However, assessing the accuracy of CAM information is uniquely challenging since CAM is generally not supported by conventional literature.
Objective: The purpose of this study is to determine whether domain-independent technical quality criteria can identify potentially harmful online CAM content.
Methods: We analyzed 150 Web sites retrieved from a search for the three most popular herbs: ginseng, ginkgo and St. John's wort and their purported uses on the ten most commonly used search engines. The presence of technical quality criteria as well as potentially harmful statements (commissions) and vital information that should have been mentioned (omissions) was recorded.
Results: Thirty-eight sites (25%) contained statements that could lead to direct physical harm if acted upon. One hundred forty five sites (97%) had omitted information. We found no relationship between technical quality criteria and potentially harmful information.
Conclusions: Current technical quality criteria do not identify potentially harmful CAM information online. Consumers should be warned to use other means of validation or to trust only known sites. Quality criteria that consider the uniqueness of CAM must be developed and validated.

(J Med Internet Res 2004;6(2):e21)
doi:10.2196/jmir.6.2.e21

KEYWORDS

Quality; harm; Internet; medical information; World Wide Web; complementary and alternative medicine



Introduction

Online health information can harm as well as heal. Many quality criteria have been suggested to help consumers identify misleading, inaccurate, or harmful information. Objective quality criteria that offer a limited number of options are particularly promising since they are easier to assess. For example, it is easier to assess whether an author is identified than to determine whether the author is qualified. However, even seemingly objective quality criteria have proven unreliable without specific operational definitions [1]. Further, there is little evidence that these criteria, known as "technical criteria," actually filter out undesirable health information. The few studies that have attempted to evaluate technical criteria reported conflicting results [2-4]. If harmful information can be effectively identified, this should be publicized. If, on the other hand, currently available quality criteria cannot identify potentially harmful information, then we should caution consumers and work on finding other ways of identifying problematic information online.

In this study, we analyze Web sites that display information about complementary and alternative medicine (CAM). CAM includes "diverse medical and healthcare systems, practices and products that are not presently considered to be a part of conventional medicine," such as dietary supplements, aromatherapy, chiropractic, and homeopathy [5]. Assessing accuracy and quality of CAM Web sites poses unique challenges as there is less documented research on the efficacy of CAM products, yet use is common and the potential for harm remains. There is also no gatekeeper to control and monitor access to CAM. Consumers can choose the product and dosage without having to encounter a healthcare professional. In fact, patients often fail to report CAM use to their physicians [6]. On the other hand, consumers frequently turn to the Internet to answer questions about CAM, and trust and act upon what they see online [7]. However, CAM information online has been found to be commercially driven [8], to be poorly referenced [8], and to contain illegal claims [9], and it may therefore be dangerous to consumers [10]. The combination of accessible, unproven CAM therapies and poor quality online CAM information is dangerous.

"Accuracy is a function of whether a site reflects the use of … agreed-upon benchmark[s] such as clinical practice guidelines." [11] The accuracy of CAM information, which is often not evidence-based and lacks support from the peer-reviewed biomedical literature, is not testable. However, we can assess the potential harm of displayed information, even if we cannot verify its accuracy. Further, if information regarding the safety and efficacy of a product is available, it should be displayed.

Our previous work provides preliminary evidence that breast cancer Web sites that meet more technical quality criteria are less likely to contain false statements [12]. Motivated by a desire to help consumers, we sought to determine whether current technical quality criteria can identify potentially harmful CAM information.


Materials and Methods

Selection of Web Sites

Consumers use general-purpose search engines rather than medical sites or portals to find information, and most do not go beyond the first page of search results [13]. Therefore, we chose the ten most popular search engines (Table 1) to select Web sites that consumers are likely to encounter [14]. The three most popular herbs in the United States (in terms of dollars spent) [15], ginseng, ginkgo, and St. John's wort, together with their most common uses, formed the search queries. The following three queries were executed in each search engine on July 15, 2003: "ginseng and cancer," "ginkgo and memory loss," and "St. John's wort and depression." All Web sites listed on the first results page, including sponsored or paid links, were analyzed.

Table 1. Search engines used to select Web sites
Search Engine
1. Google
2. Yahoo
3. MSN
4. AOL
5. Ask Jeeves
6. Overture
7. Infospace
8. Netscape
9. AltaVista
10. Lycos

A Web site was included if it contained at least one sentence or phrase of health information on the search topic. Health information was defined as "information intended to be used to maintain or improve health, including to understand disease processes, health care issues, etc… to prevent, diagnose, or treat health problems, to be rehabilitated from the effect of diseases, or treatments, and to seek and select health care plans, providers, and other resources." [16] Duplicate URLs were removed. HTTrack [17], a Web site copier, was used to permanently capture each Web site and every directly linked page.

Assessing Technical Quality Criteria

In prior work, we assessed inter-rater agreement for popular technical quality criteria [1]. Using a sample of 21 CAM Web sites, we assessed the degree to which two raters agreed upon the presence or absence of 22 quality criteria selected from Eysenbach's systematic review [18]. Our preliminary analysis showed poor inter-rater agreement on 10 of the 22 criteria. Therefore, we created operational definitions for each of the criteria, decreased the allowed choices, and defined a location to look for the information. As a result, 15 out of the 22 quality criteria had acceptable inter-rater agreement (kappa > 0.6).

For this study, one evaluator (MW) analyzed all Web sites for compliance with 15 technical quality criteria (Table 2) that we previously determined to be reliably assessable. Therefore, in this study we did not re-calculate inter-observer reliability for these technical criteria.

Assessing Potential Harm

First, a set of critical facts for each of the three herbs was determined by consensus of two clinically trained reviewers (SS, DS); see Appendices 1-3. The sets of critical facts were extracted from two independent sources of CAM information: the Physicians' Desk Reference (PDR) for Herbal Medicines [19] and the Sloan Kettering database of herbs [20]. After the sets of critical facts were determined, the CAM content displayed on each Web site was independently evaluated by both reviewers. Cases where reviewers disagreed were resolved by consensus. In order to minimize bias, materials identifying each Web site's origin, such as organization name, logo, footers, URLs, and hyperlinks, were removed. However, no changes were made to the design or layout.

In order to verify the concordance between reviewers, two additional clinically trained evaluators (validation reviewers), who were not aware of the study hypothesis or quality criteria tested, were given 30 sites randomly selected from the sample assessed by the primary reviewers (SS, DS). The validation reviewers were given the same critical facts documents as the primary reviewers, and each validation reviewer assessed every site independently. Inter-rater agreement was then calculated between the two validation reviewers. Subsequently, cases of disagreement were resolved by consensus. A second inter-observer agreement measure was calculated between the pairs of reviewers (primary reviewers vs. validation reviewers) based on the consensus data.

Content on each page was scrutinized for the presence of misleading statements likely to cause physical harm (acts of commission) and for vital information that was missing (acts of omission). Commission may be thought of as a surrogate for accuracy, while omission has been referred to as completeness, coverage, or comprehensiveness [21]. We based our evaluation on the following framework, adapted from Markman [22]:

    Commission
    1. Direct toxicity
    2. Interaction with conventional medical therapy
    3. Delay in diagnosis or conventional treatment
    4. Avoidance of conventional treatment

    Omission
    1. Warnings
    2. Drug interactions
    3. Contraindications
    4. Side effects

Statements that suggest use of higher doses of herbs than recommended in the critical facts documents (appendices 1-3) were categorized as causing "direct toxicity." Statements suggesting that the herb protects against disease and encouraging patients to self-medicate instead of seeing a physician were placed in the "delay in diagnosis or conventional treatment" category. Statements that project herbs as an "alternative to conventional treatment" (for example, "the herb is the first choice of treatment for the disease") were categorized as potentially causing "avoidance of conventional treatment." Statements that suggested using herbs with medications known to have drug interactions (for example, using St. John's wort with monoamine oxidase inhibitors) were classified as causing potential harm due to "interaction with conventional therapy." However, while evaluating potential physical harm due to omission of information about interactions, we did not expect Web sites to list all the drug interactions listed in the critical facts documents. Web sites that noted at least one drug interaction were considered not to omit drug interaction information. Web sites with vague statements such as "there are many interactions," were categorized as having "omitted drug interactions." Potential physical harm was present if any error of commission or omission was found.

We recognize that in addition to physical harm due to either commission or omission, CAM information on the Internet may cause other types of harm, such as emotional and financial. Emotional harm may occur because of inaccurate perception of disease or conventional therapy such as exaggeration of side effects of conventional treatment and presentation of alternative treatment as a "natural cure." Financial harm may be caused by the purchase of ineffective or harmful yet expensive CAM products. However, we did not evaluate emotional and financial harm in this study because of the inherent subjectivity involved, and difficulty in quantifying and assessing such measures.

Statistical Analyses

The dichotomous (yes/no) dependent variables were: 1) physical harm from commission and 2) physical harm from omission. The independent variables were also dichotomous and consisted of the 15 technical quality criteria listed in Table 2. In addition, these 15 criteria were grouped into 5 categories [23]: authority, transparency and honesty, updating of information, editorial policy, and other. Web sites were classified into two groups based on whether they complied with the median number of quality criteria. The first group complied with six or fewer technical quality criteria; the second group complied with more than six.

Inter-observer agreement measures were calculated to assess a) the degree to which validation reviewers agreed among themselves in their assessments of these dichotomous dependent variables (Table 3) and, b) the degree to which the validation reviewers agreed with the primary reviewers (Table 4). Cohen's kappa (K) is a commonly used measure of inter-observer agreement between two observers for dichotomous data. However, because K is affected in complex ways by the presence of bias between observers and by the distributions of data across the categories [24], we computed the prevalence-adjusted bias-adjusted kappa (PABAK), the bias index (BI) and the prevalence index (PI), as well as K, as recommended by Byrt et al [24].

The bias index (BI) is defined as the difference between the proportions of "Yes" for the two raters. The prevalence index (PI) is defined as the difference between the probability of "Yes" and the probability of "No." A BI close to 0 indicates less bias, while values closer to 1 (absolute value) indicate greater bias. Similarly, a PI close to 1 (absolute value) indicates high prevalence, while a PI closer to 0 indicates lower prevalence. The BI then measures the degree to which one reviewer tends to identify more or fewer occurrences than the other, while the PI measures the degree to which "Yes" agreements or "No" agreements predominate. The PABAK index of agreement between two observers is a measure that adjusts for both bias and prevalence. Although the derivation of the PABAK index is somewhat more complex, in practice it can be calculated as 2P0 - 1, where P0 is the proportion of observed agreement. Consequently, PABAK ranges from -1 to +1 and like K, a value of 0 represents no better than chance agreement, while magnitudes approaching 1 indicate maximal agreement.
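The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure used in the study: the function name `agreement_indices` and the class `Agreement` are hypothetical, and the cell counts in the usage line (a = 1, b = 2, c = 2, d = 25) are not study data. They were back-calculated so that they reproduce the values reported in the first row of Table 3.

```python
from typing import NamedTuple, Optional


class Agreement(NamedTuple):
    p0: float               # proportion of observed agreement
    bi: float               # bias index
    pi: float               # prevalence index
    kappa: Optional[float]  # Cohen's kappa; None when chance agreement is 1
    pabak: float            # prevalence-adjusted bias-adjusted kappa


def agreement_indices(a: int, b: int, c: int, d: int) -> Agreement:
    """Agreement indices for two raters judging a yes/no variable.

    a = both raters say "Yes", d = both raters say "No",
    b = rater 1 "Yes" / rater 2 "No", c = rater 1 "No" / rater 2 "Yes".
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the 2x2 table is empty")
    p0 = (a + d) / n   # observed agreement
    bi = (b - c) / n   # difference in the raters' proportions of "Yes"
    pi = (a - d) / n   # "Yes" agreements minus "No" agreements
    # Agreement expected by chance, from the raters' marginal totals.
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    # Kappa is left undefined (None) when pe is 1 to avoid dividing by zero.
    kappa = None if pe == 1 else (p0 - pe) / (1 - pe)
    return Agreement(p0, bi, pi, kappa, 2 * p0 - 1)


# Hypothetical counts for 30 sites, chosen to match the first row of Table 3.
example = agreement_indices(a=1, b=2, c=2, d=25)
print(example)
```

With these counts the observed agreement is high (26 of 30 sites), yet K is only about 0.26 because almost all agreements are "No" agreements; PABAK (about 0.73) is unaffected by that imbalance.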

Chi-square was calculated for each pairing of an independent variable with a dependent variable. Given the large number of statistical tests performed, significance was set at α<0.01. All analyses were performed using SPSS 11.0 statistical software.
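As an illustration of one such test, the sketch below computes a Pearson chi-square statistic and its P value for a single 2x2 table using only the Python standard library. It is an assumption that the uncorrected Pearson statistic (no continuity correction) corresponds to what SPSS reported; the function name `chi_square_2x2` is hypothetical. The counts in the usage lines are those for the "sources clear" criterion versus harm from commission: 31 of the 38 sites with commission harm and 69 of the 112 sites without it met the criterion.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) and P value for the
    2x2 table [[a, b], [c, d]], which has 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: commission harm present / absent.
# Columns: "sources clear" criterion met / not met.
chi2, p = chi_square_2x2(31, 7, 69, 43)
print(f"chi-square = {chi2:.2f}, P = {p:.3f}")
```

Under these assumptions the statistic is about 5.09 and the P value about 0.02, which falls below 0.05 but not below the 0.01 threshold used in this study.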


Results

A total of 546 Web sites were retrieved. After removing duplicates and checking for eligibility, 150 Web sites remained: 54 for the query "ginseng and cancer," 46 for "ginkgo and memory loss," and 50 for "St. John's wort and depression."

Table 2. Compliance of CAM Web sites with technical quality criteria. Criteria are grouped into 5 categories (category names are flush left, with their criteria indented beneath). Values are counts (percentages)
Quality criteria                            Number of Web sites (%)
Authority
    Disclosure of authorship                41 (27)
    Author's credentials disclosed          17 (11)
    Credentials of physicians disclosed     2 (1)
    Author's affiliation disclosed          17 (11)
Transparency and Honesty
    Sources clear                           100 (67)
    General disclosures                     147 (98)
    References provided                     54 (36)
    Disclosure of ownership                 144 (96)
Currency/Updating of information
    Date of creation disclosed              31 (21)
    Date of last update disclosed           21 (14)
    Date of creation or update disclosed    49 (33)
Editorial Policy
    Editorial review process                9 (6)
Others
    Internal search engine present          78 (52)
    Feedback mechanism                      132 (88)
    Copyright notice                        105 (70)

Technical Quality Criteria

Most Web sites did not comply with technical quality criteria. On average, a Web site complied with 6.3 (SD±2.6) of 15 criteria. One site failed to comply with any criteria, while three sites complied with 13 criteria. Only 27% of sites disclosed authorship, 36% provided references and 6% mentioned an editorial review process. Table 2 shows the number of Web sites that complied with each of the 15 quality criteria.

Assessing Potential Harm: Agreement among Reviewers

As shown in Table 3, agreement between the two validation reviewers was high (all PABAK ≥ 0.67). Although there was little bias, there was a strong prevalence effect, which lowers K even when observed agreement is high; PABAK is therefore the more informative measure here. By that measure, the two validation reviewers had a high degree of agreement for all measures of harm from commission and omission. Similarly, as shown in Table 4, consensus agreement between the primary and validation reviewers was also high (all PABAK ≥ 0.73).

Table 3. Agreement among validation reviewers on a sample of 30 Web sites
Type of harm                            P0      BI      PI      K          PABAK
A. Physical Harm-Commission*            0.87    0       -0.8    0.259      0.73
    Direct Toxicity                     0.93    0.07    0.93    Undefined  0.87
    Interactions                        0.97    -0.03   0.97    Undefined  0.93
    Delay in diagnosis                  1       0       -1      Undefined  1
    Avoidance of conventional therapy   0.97    -0.03   -0.9    0.651      0.93
B. Physical Harm-Omission*              0.97    0.03    0.97    Undefined  0.93
    Omission of Warnings                0.93    0.07    0.8     0.634      0.87
    Omission of Drug Interactions       0.97    -0.03   0.7     0.87       0.93
    Omission of Contraindications       1       0       0.8     1          1
    Omission of Adverse Reactions       0.83    -0.17   0.77    0.242      0.67

P0 = observed agreement, BI = bias index, PI = prevalence index, K = Cohen's kappa, PABAK = prevalence-adjusted bias-adjusted kappa. Undefined = SPSS did not compute value due to zero variability in a variable.


Table 4. Agreement between primary and validation reviewers on a sample of 30 Web sites
Type of harm                            P0      BI      PI      K          PABAK
A. Physical Harm-Commission*            0.93    0.07    -0.73   0.71       0.87
    Direct Toxicity                     0.93    0.07    -0.8    0.63       0.87
    Interactions                        1       0       -1      Undefined  1
    Delay in diagnosis                  0.93    0.07    -0.93   Undefined  0.87
    Avoidance of conventional therapy   1       0       -0.93   1          1
B. Physical Harm-Omission*              0.97    0.03    0.97    Undefined  0.93
    Omission of Warnings                0.9     0.03    0.83    0.35       0.8
    Omission of Drug Interactions       0.97    -0.03   0.7     0.87       0.93
    Omission of Contraindications       0.97    -0.03   0.77    0.84       0.93
    Omission of Adverse Reactions       0.87    0.07    0.73    0.43       0.73

P0 = observed agreement, BI = bias index, PI = prevalence index, K = Cohen's kappa, PABAK = prevalence-adjusted bias-adjusted kappa. Undefined = SPSS did not compute value due to zero variability in a variable.


Table 5. Number of CAM Web sites that display potentially harmful information. Values are counts (percentages)
Type of Harm                            Number of Web sites (%)
A. Physical Harm-Commission*            38 (25)
    Direct Toxicity                     19 (13)
    Interactions                        12 (8)
    Delay in diagnosis                  5 (3)
    Avoidance of conventional therapy   10 (7)
B. Physical Harm-Omission*              145 (97)
    Omission of Warnings                121 (81)
    Omission of Drug Interactions       124 (83)
    Omission of Contraindications       134 (89)
    Omission of Adverse Reactions       125 (83)

*Note: Totals in these rows are calculated if any of the four categories of commission or omission were found on the Web site.


Potential Harm

Potential physical harm from omission was more prevalent than from commission (97% vs. 25%, Table 5). However, a substantial number of Web sites (25%) displayed statements that could lead to physical harm. Statements that may cause toxicity if acted upon (direct toxicity) were present in 13% of CAM Web sites, while 7% of Web sites included statements encouraging the avoidance of conventional therapies. Eight percent of sites included information that may lead to harm from interactions if the advice were followed. Most CAM Web sites (97%) omitted vital information such as contraindications (89%) and drug interactions (83%).

Technical Quality Criteria

We found that individual technical quality criteria did not identify sites with the potential to cause physical harm from commission or omission (Table 6). Similarly, when technical criteria were grouped into categories (such as authority, transparency and honesty, etc.), no significant association was found with potential physical harm (Table 7). Even when Web sites were classified into two groups, those complying with more criteria (≥ 6) versus fewer criteria (<6), there was no significant relationship. Overall, 44 hypotheses were tested but none were significant at the α<0.01 level, despite our study having 0.80 power to detect significance. Surprisingly, for the two quality criteria where an association was found at α<0.05 ("sources clear" and "editorial review process"), the presence of the criterion indicated a greater chance of potential harm, the reverse of the criteria's intent. However, these two results may be due to chance since we conducted numerous statistical analyses.

Table 6. Association between individual quality criteria and potential harm. Values are counts (percentages of Web sites complying with that criterion)
Total = total number of Web sites complying with criterion; Present/Absent = physical harm present or absent; P = P value
                                              Physical harm by commission    Physical harm by omission
Quality criterion                     Total   Present    Absent      P       Present     Absent    P
                                              (n = 38)   (n = 112)           (n = 145)   (n = 5)
Disclosure of authorship              41      11 (29)    30 (27)     0.80    39 (27)     2 (40)    0.52
Author's credentials disclosed        17      3 (8)      14 (12)     0.44    16 (11)     1 (20)    0.53
Credentials of physicians disclosed   2       1 (3)      1 (1)       0.42    2 (1)       0 (0)     0.79
Author's affiliation disclosed        17      6 (16)     11 (10)     0.32    17 (12)     0 (0)     0.42
Sources clear                         100     31 (82)    69 (62)     0.02    95 (65)     5 (100)   0.11
Date of creation disclosed            31      8 (21)     23 (20)     0.95    30 (21)     1 (20)    0.97
Date of last update disclosed         21      5 (13)     16 (14)     0.86    20 (14)     1 (20)    0.69
Date of creation or update disclosed  49      13 (34)    36 (32)     0.81    47 (32)     2 (40)    0.72
General disclosures                   147     37 (97)    110 (98)    0.75    142 (98)    5 (100)   0.75
References provided                   54      14 (37)    40 (36)     0.9     51 (35)     3 (60)    0.26
Disclosure of ownership               144     35 (92)    109 (97)    0.16    139 (96)    5 (100)   0.64
Internal search engine present        78      21 (55)    57 (51)     0.64    75 (52)     3 (60)    0.72
Feedback mechanism                    132     32 (84)    100 (89)    0.41    128 (88)    4 (80)    0.58
Copyright notice                      105     31 (82)    74 (66)     0.07    100 (69)    5 (100)   0.13
Editorial review process              9       5 (13)     4 (4)       0.03    9 (6)       0 (0)     0.57

Table 7. Association between groups of technical quality criteria and potential harm. Values are counts (percentages of Web sites complying with that criterion)
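Total = total number of Web sites complying with criterion; Present/Absent = physical harm present or absent; P = P value
                                              Physical harm by commission    Physical harm by omission
Category of criteria                  Total   Present    Absent      P       Present     Absent    P
                                              (n = 38)   (n = 112)           (n = 145)   (n = 5)
Authority                             41      11 (29)    30 (27)     0.80    39 (27)     2 (40)    0.52
Transparency and honesty              149     37 (97)    112 (100)   0.09    144 (99)    5 (100)   0.85
Currency/updating of information      51      13 (34)    38 (34)     0.98    49 (34)     2 (40)    0.77
Editorial policy                      9       5 (13)     4 (4)       0.03    9 (6)       0 (0)     0.57
Others                                139     34 (90)    105 (94)    0.38    134 (92)    5 (100)   0.52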

Top Level Domain

We also explored the relationship between top level domain and potential harm. Seventy-seven percent of the 150 Web sites analyzed were commercial (.com), 10% organizational (.org), 7% network (.net), 3% educational (.edu), 2% governmental (.gov) and 1% unknown (numerical IP address only). Fisher's exact test statistic was calculated as expected values in some cases were <5, and significance was set at α = 0.05 level. Only the network top level domain had a significant relationship with physical harm from omission (Table 8). Of the 10 Web sites with the network top level domain, 20% did not contain harm from omission. In contrast, only 2% of sites that had a top level domain other than network did not have harm from omission (p<0.04). However, there was no statistically significant relationship between network and non-network sites with respect to physical harm from commission. Although there were few educational and government sites in our study, it is notable that there were no identified cases of potential harm by commission in these sites. As most Web sites were commercial, it is difficult to draw meaningful conclusions from this analysis.
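The network versus non-network comparison can be reproduced approximately with a small standard-library implementation of Fisher's exact test. This is a sketch under stated assumptions: the function name `fisher_exact_two_sided` is hypothetical, and the two-sided P value is computed by summing the probabilities of all tables that are no more likely than the observed one, which may differ from the rule SPSS applied. The counts follow from the text: 2 of the 10 network sites and 3 of the 140 other sites had no harm from omission.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # The small relative tolerance guards against floating-point ties.
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Rows: network (.net) sites / all other sites.
# Columns: omission harm absent / present.
p_net = fisher_exact_two_sided(2, 8, 3, 137)
print(f"P = {p_net:.3f}")
```

Under these assumptions the P value is about 0.036, consistent with the value of 0.04 reported for the network top level domain.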

Table 8. Association between top level domain and potential harm. Values are counts (percentages of Web sites complying with that top level domain)
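Total = total number of Web sites with top level domain; Present/Absent = physical harm present or absent; P = P value
                                              Physical harm by commission    Physical harm by omission
Top level domain                      Total   Present    Absent      P       Present     Absent    P
                                              (n = 38)   (n = 112)           (n = 145)   (n = 5)
Commercial (.com)                     116     31 (82)    85 (76)     0.65    114 (79)    2 (40)    0.07
Network (.net)                        10      3 (8)      7 (6)       0.71    8 (6)       2 (40)    0.04*
Educational (.edu)                    4       0 (0)      4 (4)       0.57    4 (3)       0 (0)     1.0
Organizational (.org)                 16      4 (11)     12 (11)     1.0     15 (10)     1 (20)    0.43
Governmental (.gov)                   3       0 (0)      3 (3)       0.57    3 (2)       0 (0)     1.0
Unknown (numerical IP address only)   1       0 (0)      1 (1)       1.0     1 (1)       0 (0)     1.0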

Note: Fisher's exact test calculated as expected values in some cases were <5


Intent to Sell Products

In order to explore whether Web sites that sold products differed from those that did not, two evaluators independently revisited each Web site and identified Web sites that allowed the ordering of products. Agreement between reviewers was high (K=0.95). Fifty-three percent of Web sites (n=79) sold products. At the α = 0.01 level, there was no significant relationship between selling products and potential harm due to omission (P=0.56) or commission (P=0.02). Although not statistically significant at that level, selling products was actually related to less harm from commission, the reverse of what we would expect. In fact, 63% (n=24) of the Web sites with potential harm from commission did not sell products, while 37% (n=14) did. Therefore, in our sample there does not appear to be more harmful information on sites that sell products.


Discussion

We found that most CAM Web sites were potentially harmful either by displaying statements which could cause harm, or by omitting vital information. However, our data suggest that available technical quality criteria fail to identify potentially harmful information online.

We found that one quarter of CAM Web sites present information that may cause physical harm if acted upon. These sites encouraged consumers to avoid conventional therapy, presented information on products that may be directly toxic, or presented information on products that may cause interactions with conventional medications. This is potentially dangerous because consumers have easy access to CAM products online, act upon what they see on the Internet [7], and often do so without the knowledge or advice of clinicians [25].

Almost all (97%) CAM Web sites omitted vital warnings, drug interactions, contraindications, or adverse reactions. This is concerning because many consumers perceive "natural" products as safe. Further, many herbs that may be safe when used alone interact with conventional medications.

Previous studies have found scientific references [4], absence of financial interest [4], display of copyright [2], and display of editorial policy [3] to correlate with information accuracy. Technical quality criteria evaluated in this study may be unsuitable for CAM information as they seek to identify accuracy, which is difficult to determine for CAM. Surprisingly, even generally accepted measures of content quality such as disclosure of authorship and updating of information had no relationship to potential harm. Other researchers have also encountered difficulty in developing guidelines to evaluate CAM information [26].

Our previous study of breast cancer information online found that sites which complied with >3 JAMA benchmarks [27] (authorship, references, currency, and disclosure) were more accurate than lower quality sites (<3 JAMA benchmarks) [12]. However, in this sample of CAM Web sites we found no such relationship for potential harm resulting from commission (p=0.31) or omission (p=0.21). We are forced to question the assumption, at least for CAM information, that consumers can be taught to discern good content from bad by looking at domain-independent quality criteria. Recommending such criteria may convey a false sense of security, inadvertently causing consumers to trust harmful CAM websites. Although the technical criteria we assessed had no relationship to potential harm, other criteria or tools not tested may have some value.

Table 9. Web sites that contained no errors (neither commission nor omission)
Company/OrganizationSelling ProductsTop Level Domain
American Cancer SocietyNo.org
About IncNo.com
Pagewise IncNo.com
Natural PharmacyYes.net
Vitamin TraderYes.com

Five Web sites contained no harmful information from either commission or omission at the time of our study (Table 9). Four of the five best performing Web sites were retrieved from a search for St. John's wort, and one from a search on ginseng. One of these Web sites was from the American Cancer Society. However, the remaining four Web sites were from commercial or for-profit entities, two of which sold products. We note that Web site content changes frequently. Therefore, it is difficult to endorse any list of Web sites.

The major limitation of our study is the inherently subjective domain. Whether or not information has the potential to harm a consumer is a subjective clinical judgment which defies strict definition. However, relatively high inter-observer agreement among clinically trained reviewers suggests that our definitions were consistent.

Our study was also limited by our sample, which was restricted to Web sites displaying information about three popular herbs. Searches on other herbs or different alternative therapies may have different results. Also, we did not evaluate all possible technical quality criteria. Instead, we evaluated only criteria that were used in three or more studies as reviewed by Eysenbach et al [18] and were found to be reliably assessable using pre-determined operational definitions [1]. It is possible that other quality criteria will be more effective.

Since the primary reviewers (SS, DS) were aware of the study hypotheses, they may have been biased by this knowledge. However, inter-observer agreement between the primary and validation reviewers (who were unaware of the hypotheses) was high. Therefore this potential bias appears to have minimal effect on the results.

As we search for quality measures, we must keep in mind that some potentially useful criteria are easily manipulated. For example, one study found sites that claimed copyright were more accurate [2]. Such very specific and objective criteria are appealing since they may be automatically assessed using software, and evaluated by consumers by simply searching for the word "copyright" or © symbol. However, it is easy for site builders to claim copyright without changing the health information displayed on their site.

Although we restricted our analysis to individual sites, consumers may not make health-care decisions on the advice of one site, but rather on the collective information learned, confirmed or refuted from a multitude of online sources. Future work can assess the degree to which confirmatory evidence present on a range of sites can screen out undesirable information. In addition, it would also be important to understand why consumers search for CAM information. After all, some may turn to CAM only after conventional treatment fails, whereas others may reject traditional therapies.

The Internet provides a constantly changing, endless variety of information from innumerable sources. Ideally, we would like to empower consumers to evaluate health information for themselves. Currently available technical quality criteria, however, are not adequate to evaluate CAM information. For the time being, it may be prudent to recommend that consumers looking for CAM information online rely on known, authoritative providers of information. With this in mind, we must continue to search for ways of alerting consumers to potentially harmful information without restricting them to known sources.


Acknowledgements

Special thanks to Kalyan C. Kanneganti MBBS, School of Public Health, University of Massachusetts at Amherst and Swapna Muppuri MBBS, School of Public Health, University of Texas at Houston, who served as validation reviewers in this study.

Supported in part by a training fellowship from the Keck Center for Computational and Structural Biology of the Gulf Coast Consortia (NLM Grant No. 5T15LM07093) (M.W.), and a grant from the Robert Wood Johnson Foundation Health-e-Technologies Initiative (E.V.B., F.M-B.).


Conflicts of Interest

None declared.


Appendix 1

Critical Facts: St. John's Wort (Hypericum perforatum)

INTRODUCTION

Also known as Saint John's wort, hypericum, goatweed, God's wonder plant, witches' herb. It is generally used for depression, seasonal affective disorder, and anxiety. St. John's wort should not be used in patients with severe depression. Studies also show possible efficacy in the management of anxiety and premenstrual syndrome, although additional research is necessary.

INDICATIONS AND USAGE

WARNINGS

CONTRAINDICATIONS

ADVERSE REACTIONS

DRUG INTERACTIONS

DAILY DOSE


Appendix 2

Critical Facts: Ginkgo (Ginkgo biloba)

INTRODUCTION

PURPORTED USES

WARNINGS

PRECAUTIONS AND ADVERSE REACTIONS

DRUG INTERACTIONS

CONTRAINDICATIONS


Appendix 3

Critical Facts: Ginseng

A) GINSENG*

DAILY DOSE

INDICATIONS AND PURPORTED USES

PRECAUTIONS AND ADVERSE REACTIONS

DRUG INTERACTIONS

B) ASIAN GINSENG (Panax ginseng)*

INTRODUCTION

PURPORTED USES

WARNINGS

DRUG INTERACTIONS

CONTRAINDICATIONS

ADVERSE REACTIONS

[Usually well tolerated.]

C) AMERICAN GINSENG

INTRODUCTION

PURPORTED USES

ADVERSE REACTIONS

DRUG INTERACTIONS

D) SIBERIAN GINSENG (Eleutherococcus senticosus, Acanthopanax senticosus)

PURPORTED USES

WARNINGS

CONTRAINDICATIONS

ADVERSE REACTIONS

DRUG INTERACTIONS

*We evaluated Web sites with content on ginseng using the general ginseng critical facts and Web sites with content on the specific types of ginseng (Asian, American, and Siberian) with the critical facts on the specific types of ginseng.


References

  1. Sagaram S, et al. Inter-observer agreement for quality measures applied to online health information. In: Fieschi M, Coiera E, Li YC, editors. Medinfo. Amsterdam: IOS Press; 2004. [in press].
  2. Fallis D, Frické M. Indicators of accuracy of consumer health information on the Internet: a study of indicators relating to information for managing fever in children in the home. J Am Med Inform Assoc 2002 Jan;9(1):73-79. [PMC] [Medline]
  3. Griffiths KM, Christensen H. Quality of web based information on treatment of depression: cross sectional survey. BMJ 2000 Dec 16;321(7275):1511-1515. [FREE Full text] [PMC] [Medline] [CrossRef]
  4. Martin-Facklam M, Kostrzewa M, Schubert F, et al. Quality markers of drug information on the Internet: an evaluation of sites about St. John's wort. Am J Med 2002 Dec 15;113(9):740-745. [Medline] [CrossRef]
  5. National Center for Complementary and Alternative Medicine, National Institutes of Health. What is complementary and alternative medicine (CAM)? 2002.   URL: http://nccam.nih.gov/health/whatiscam/ [accessed 2004 Jun 24]
  6. Eisenberg DM, Kessler RC, Van Rompay MI, et al. Perceptions about complementary therapies relative to conventional therapies among adults who use both: results from a national survey. Ann Intern Med 2001 Sep 4;135(5):344-351. [FREE Full text] [Medline]
  7. Fox S, Rainie L. Vital decisions: how Internet users decide what information to trust when they or their loved ones are sick. Pew Internet & American Life Project. 2002 May 22.   URL: http://www.pewinternet.org/pdfs/PIP_Vital_Decisions_May2002.pdf
  8. Sagaram S, Walji M, Bernstam E. Evaluating the prevalence, content and readability of complementary and alternative medicine (CAM) web pages on the internet. Proc AMIA Symp 2002:672-676. [Medline]
  9. Morris CA, Avorn J. Internet marketing of herbal products. JAMA 2003 Sep 17;290(11):1505-1509. [CrossRef] [Medline]
  10. Ernst E, Schmidt K. 'Alternative' cancer cures via the Internet? Br J Cancer 2002 Aug 27;87(5):479-480. [CrossRef] [Medline]
  11. Risk A, Petersen C. Health information on the internet: quality issues and international initiatives. JAMA 2002;287(20):2713-2715. [Medline] [CrossRef]
  12. Meric F, Bernstam EV, Mirza NQ, et al. Breast cancer on the world wide web: cross sectional survey of quality of information and popularity of websites. BMJ 2002 Mar 9;324(7337):577-581. [FREE Full text] [PMC] [Medline] [CrossRef]
  13. Eysenbach G, Köhler C. Does the internet harm health? Database of adverse events related to the internet has been set up. BMJ 2002 Jan 26;324(7331):239. [Medline] [CrossRef]
  14. Sullivan DE. Nielsen netratings search engine ratings.   URL: http://searchenginewatch.com/reports/netratings.html [accessed 2003 Feb 7]
  15. Most popular herbs and supplements in the United States.   URL: http://yoga.about.com/library/weekly/aa022501a.htm [accessed 2004 Jun 24]
  16. Health Improvement Institute. HIIQA definitions and abbreviations.   URL: http://www.hii.org/343definitions.htm [accessed 2004 Jun 24]
  17. HTTrack.   URL: http://HTTrack.com/ [accessed 2004 Jun 24]
  18. Eysenbach G, Powell J, Kuss O, et al. Empirical studies assessing the quality of health information for consumers on the world wide web: a systematic review. JAMA 2002;287(20):2691-2700. [Medline] [CrossRef]
  19. Medical Economics Company. PDR for Herbal Medicines. 2nd edition. Montvale, NJ: Medical Economics Company; 2000.   URL: http://dsc.ucsf.edu/view_pdf.php?pdf_id=23
  20. Sloan Kettering AboutHerbs website.   URL: http://www.mskcc.org/aboutherbs [accessed 2003 Apr 25]
  21. Berland GK, Elliott MN, Morales LS, et al. Health information on the Internet: accessibility, quality, and readability in English and Spanish. JAMA 2001;285(20):2612-2621. [Medline] [CrossRef]
  22. Markman M. Safety issues in using complementary and alternative medicine. J Clin Oncol 2002 Sep 15;20(18 Suppl):39S-41S. [FREE Full text] [Medline]
  23. Commission of the European Communities, Brussels. eEurope 2002: Quality Criteria for Health Related Websites. J Med Internet Res 2002 Nov 29;4(3):e15. [FREE Full text] [Medline]
  24. Byrt T, Bishop J, Carlin JB. Bias, prevalence and kappa. J Clin Epidemiol 1993 May;46(5):423-429. [Medline] [CrossRef]
  25. Adler SR, Fosket JR. Disclosing complementary and alternative medicine use in the medical encounter: a qualitative study in women with breast cancer. J Fam Pract 1999 Jun;48(6):453-458. [Medline]
  26. Cooke A, Gray L. Evaluating the quality of internet-based information about alternative therapies: development of the BIOME guidelines. J Public Health Med 2002 Dec;24(4):261-267. [Medline] [CrossRef]
  27. Silberg WM, Lundberg GD, Musacchio RA. Assessing, controlling, and assuring the quality of medical information on the Internet: Caveant lector et viewor--Let the reader and viewer beware. JAMA 1997 Apr 16;277(15):1244-1245. [Medline] [CrossRef]


Abbreviations

BI: Bias index
CAM: Complementary and alternative medicine
MAOI: Monoamine oxidase inhibitor
NSAIDs: Non-steroidal anti-inflammatory drugs
PABAK: Prevalence-adjusted bias-adjusted kappa
PI: Prevalence index
SPSS: Statistical Package for the Social Sciences
SSRI: Selective serotonin reuptake inhibitor


Submitted 20.11.03; peer-reviewed by S Bhavnani, J Seidman, J Fogel; comments to author 28.11.03; revised version received 07.05.04; accepted 21.05.04; published 29.06.04

Please cite as:
Walji M, Sagaram S, Sagaram D, Meric-Bernstam F, Johnson C, Mirza NQ, Bernstam EV
Efficacy of Quality Criteria to Identify Potentially Harmful Information: A Cross-sectional Survey of Complementary and Alternative Medicine Web Sites
J Med Internet Res 2004;6(2):e21
<URL: http://www.jmir.org/2004/2/e21/>