J Am Med Inform Assoc. 2006 Sep–Oct; 13(5): 485–487.
doi: 10.1197/jamia.M2084.
PMCID: PMC1561795
Improving Efficacy of PubMed Clinical Queries for Retrieving Scientifically Strong Studies on Treatment
Salvatore Corrao, MD, a,* Daniela Colomba, MD, b Sabrina Arnone, MD, c Christiano Argano, MD, PhD, d Tiziana Di Chiara, MD, e Rosario Scaglione, MD, f and Giuseppe Licata, MD g
a Biomedical Department of Internal Medicine, University of Palermo, Palermo, Italy
b Biomedical Department of Internal Medicine, University of Palermo, Palermo, Italy
c Clinical Methodology, Epidemiology and Statistics Unit, National Relevance Hospital Trust Civico e Benfratelli, Palermo, Italy
d Biomedical Department of Internal Medicine, University of Palermo, Palermo, Italy
e Biomedical Department of Internal Medicine, University of Palermo, Palermo, Italy
f Biomedical Department of Internal Medicine, University of Palermo, Palermo, Italy
g Biomedical Department of Internal Medicine, University of Palermo, Palermo, Italy
* Correspondence and reprints to: Salvatore Corrao, MD, Associate Professor, Biomedical Department of Internal Medicine, Piazza delle Cliniche 2, 90127 Palermo, Italy. (Email: s.corrao@tiscali.it).
Received February 16, 2006; Accepted April 23, 2006.
Abstract
The authors evaluated the retrieval power of the PubMed “Clinical Queries” narrow search string for therapy in comparison with a modified search string designed to avoid possible retrieval bias. The PubMed search strategy was compared with a slightly modified string that included the British English term “randomised.” The authors tested the two strings joined onto each of four terms concerning topics of broad interest: hypertension, hepatitis, diabetes, and heart failure. In particular, precision was computed for not-indexed citations. The added word “randomised” improved total citation retrieval in every case. Total retrieval gain for not-indexed citations ranged from 11.1% to 21.4%. A significant number of additional randomized controlled trials (RCTs) (9.1-18.2%) was retrieved for each of the selected topics. These were often recently published RCTs. The authors think that correction of the Clinical Queries filter (when it focuses on therapy and uses the narrow search) is necessary to avoid biased search results with loss of relevant and up-to-date scientifically sound information.
Introduction

PubMed is a free web resource and a powerful bibliographic citation search engine widely used by health-care professionals.1 It was developed and is maintained by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM), located at the National Institutes of Health.2 It covers many fields of biomedical knowledge: medicine, nursing, dentistry, veterinary medicine, the health care system, and preclinical sciences. At present, you can search more than 16 million bibliographic citations and abstracts, accessing both the MEDLINE database and articles in selected life sciences journals not included in MEDLINE. All the articles in PubMed Central (free full-text journals) are included. PubMed offers clinicians the “Clinical Queries” page, a specific web page for clinical searches.3 You can access it by clicking “Clinical Queries” in the PubMed sidebar. Then, you can choose one of the clinical study categories, enter your search term in the search box, and select a scope: either a narrow, specific search or a broad, sensitive one. The Clinical Study Categories correspond to search filters based on the paper of Haynes and colleagues.4

Clinical Queries are widely used by clinicians. If you focus your search on therapy and select a narrow search, your term is joined onto a search string from the PubMed filter table,5 as shown below:

(randomized controlled trial[Publication Type] OR (randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract])).

This is because randomized controlled trials (RCTs) are the sound study design for identifying strong evidence about therapy.

However, the term “randomized,” used after the Boolean operator OR, may introduce a bias because it may exclude studies about therapy in which the word is written exclusively as “randomised.” With the second part of the search string (after the Boolean operator OR), Haynes and colleagues4 address the need to find citations not indexed in PubMed. However, precisely in this case, not-indexed bibliographic citations could be lost when the authors used British English exclusively. Moreover, we think that a bug in a search filter, even a small one, could distort the clinician's awareness of the searched topic. In our opinion, a slightly modified search string that includes the word “randomised” could avoid the loss of relevant information.
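
The spelling problem described above can be illustrated with a small local sketch. It is not part of the original study and does not reproduce PubMed's actual matching logic: the function name `matches_narrow_filter` and both sample titles are hypothetical, and the word-level matching is only an assumed stand-in for the Title/Abstract half of the filter.

```python
import re


def matches_narrow_filter(text, spellings=("randomized",)):
    """Rough local stand-in for the Title/Abstract half of the filter.

    Returns True when the text contains one of the given spellings plus
    the words "controlled" and "trial". This is a word-level
    approximation (an assumption), not PubMed's own matching.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    has_spelling = any(s in words for s in spellings)
    return has_spelling and "controlled" in words and "trial" in words


# Hypothetical titles, invented for illustration only.
us_title = "A randomized controlled trial of drug A in hypertension"
uk_title = "A randomised controlled trial of drug A in hypertension"

original = ("randomized",)                # spelling used by the current filter
modified = ("randomized", "randomised")   # spelling set proposed in this paper

print(matches_narrow_filter(us_title, original))  # True
print(matches_narrow_filter(uk_title, original))  # False
print(matches_narrow_filter(uk_title, modified))  # True
```

Under these assumptions, a not-yet-indexed citation that uses only the British spelling is missed by the original spelling set and recovered by the modified one.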

The aim of this paper was to evaluate the retrieval power of “Clinical Queries” specific/narrow search string about therapy in comparison with a modified search string that includes the word “randomised.”

Methods

The search strategy utilized by PubMed “Clinical Queries” is:

(randomized controlled trial[Publication Type] OR (randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract])).

We compared the retrieval efficacy of this string with that of the following one:

(randomized controlled trial[Publication Type] OR ((randomized[Title/Abstract] OR randomised [Title/Abstract]) AND controlled[Title/Abstract] AND trial[Title/Abstract])).

The proposed search strategy joins the term “randomised” to “randomized” with the Boolean operator OR within round brackets. This term should improve the recall of the PubMed string, resolving a significant matter for clinicians: achieving adequate recall with the string used to narrow the search.
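
As a minimal sketch of how the two strings could be assembled for each topic, the snippet below stores both filters as constants and joins a topic term onto them. The helper name `join_topic` and the exact joining format (`(topic) AND filter`) are assumptions made for illustration; the filter strings are taken from the text above, with the space before the second `[Title/Abstract]` tag removed for consistency.

```python
# Filter strings as quoted in the Methods section.
ORIGINAL_FILTER = (
    "(randomized controlled trial[Publication Type] OR "
    "(randomized[Title/Abstract] AND controlled[Title/Abstract] "
    "AND trial[Title/Abstract]))"
)

MODIFIED_FILTER = (
    "(randomized controlled trial[Publication Type] OR "
    "((randomized[Title/Abstract] OR randomised[Title/Abstract]) "
    "AND controlled[Title/Abstract] AND trial[Title/Abstract]))"
)

# The four topics of broad interest tested in this paper.
TOPICS = ["hypertension", "hepatitis", "diabetes", "heart failure"]


def join_topic(topic, search_filter):
    """Join a topic term onto a filter string with AND (assumed format)."""
    return f"({topic}) AND {search_filter}"


for topic in TOPICS:
    print(join_topic(topic, ORIGINAL_FILTER))
    print(join_topic(topic, MODIFIED_FILTER))
```

The only difference between the two constants is the added `OR randomised[Title/Abstract]` clause, so any change in retrieval can be attributed to that single term.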

To compare the effects of our change on recall, we tested the two strings joined onto each of four terms concerning topics of broad interest: hypertension, hepatitis, diabetes, and heart failure. Moreover, we used the modified search strategy to retrieve, on the one hand, citations indexed for MEDLINE and, on the other hand, citations as supplied by publisher, in process, or not-indexed-for-MEDLINE. The search was performed on 28 December 2005 at 1:18 p.m. We calculated absolute and percentage variations of citations retrieved by the modified search strategies compared with the PubMed ones. Moreover, we explored differences in total retrieved citations for both the MEDLINE subset (citations included only in MEDLINE) and the premedline, “as supplied by publisher,” and “not in medline” subsets. To perform this analysis, we used the PubMed capability of modifying search strategies with subset keywords. In the first case, we joined “AND medline [sb]” onto each string. In the second case, we joined “AND (“Premedline”[sb] OR publisher[sb] OR pubmednotmedline[sb])” onto each string. Two independent observers (DC and SA) evaluated the retrieved citations and formally checked them for randomized (or randomised) controlled trials. A third person (SC) reviewed the entire process. Precision was computed, as reported by Haynes and colleagues,4 only for strings used to retrieve citations in process, as supplied by publisher, or not-indexed-for-MEDLINE. Moreover, we considered variations in the number of retrieved RCTs.
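
The subset restriction described above can be sketched as follows. The subset keywords are those quoted in the text; the helper name `restrict_to_subset` and the short example query are hypothetical and serve only to show how the clauses are appended.

```python
# Subset clauses quoted in the Methods section.
MEDLINE_SUBSET = "medline[sb]"
NOT_INDEXED_SUBSET = (
    '("Premedline"[sb] OR publisher[sb] OR pubmednotmedline[sb])'
)


def restrict_to_subset(query, subset_clause):
    """Append a subset clause to an existing query with AND."""
    return f"{query} AND {subset_clause}"


# Hypothetical example query: one topic joined onto the modified filter.
query = (
    "(hypertension) AND "
    "(randomized controlled trial[Publication Type] OR "
    "((randomized[Title/Abstract] OR randomised[Title/Abstract]) "
    "AND controlled[Title/Abstract] AND trial[Title/Abstract]))"
)

indexed_query = restrict_to_subset(query, MEDLINE_SUBSET)
not_indexed_query = restrict_to_subset(query, NOT_INDEXED_SUBSET)

print(indexed_query)
print(not_indexed_query)
```

Running the same topic and filter through both subset clauses gives the two groups compared in this study: citations indexed for MEDLINE, and citations in process, as supplied by publisher, or not indexed for MEDLINE.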

Results

The introduction of the word “randomised” improved total citation retrieval in every case. However, the main gain was achieved when searching not-indexed-for-MEDLINE or in process citations. In this case, total retrieval gain ranged from 11.1% to 21.4%. Hypertension had the greatest gain among considered topics (19.4%). When searching total or indexed-for-MEDLINE citations, the gain was about 1% compared with the classic PubMed “Clinical Queries” strings. Table 1 shows comparisons between search strings used to retrieve in process, as supplied by publisher, or not-indexed-for-MEDLINE citations, each of them joined onto the four main topics. Our data show that precision did not significantly change, but a significant number of RCTs was retrieved by the modified string for each of the selected topics. RCT retrieval gain ranged from 9.1% to 18.2% (Table 1). The hypertension topic had the greatest gain also in this case. The extra retrieved RCTs were often recently published citations.
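
The quantities reported above (absolute variation, percentage variation, and precision) can be computed as in the sketch below. The function names are hypothetical, and the counts are invented for illustration; they are not the values of Table 1. Precision is taken here as the proportion of retrieved citations that are RCTs.

```python
def absolute_variation(original_count, modified_count):
    """Extra citations retrieved by the modified string."""
    return modified_count - original_count


def percentage_variation(original_count, modified_count):
    """Retrieval gain of the modified string, as a percentage."""
    if original_count == 0:
        raise ValueError("original_count must be greater than zero")
    return 100.0 * (modified_count - original_count) / original_count


def precision(rcts_retrieved, total_retrieved):
    """Proportion of retrieved citations that are RCTs."""
    if total_retrieved == 0:
        raise ValueError("total_retrieved must be greater than zero")
    return rcts_retrieved / total_retrieved


# Hypothetical counts, NOT taken from Table 1.
original_total, modified_total = 50, 60   # citations retrieved
original_rcts, modified_rcts = 20, 24     # of which RCTs

print(absolute_variation(original_total, modified_total))    # 10
print(percentage_variation(original_total, modified_total))  # 20.0
print(percentage_variation(original_rcts, modified_rcts))    # 20.0
print(precision(original_rcts, original_total))              # 0.4
print(precision(modified_rcts, modified_total))              # 0.4
```

With these invented counts, the modified string retrieves more citations and more RCTs while precision stays the same, which is the pattern described in the paragraph above.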

Table 1. Comparisons between Search Strings Used to Retrieve Citations in Process, as Supplied by Publisher or Not-Indexed-for-MEDLINE, Each of Them Joined onto the Four Main Topics (PubMed Clinical Queries Vs. Modified Search String—see at the (more ...)
Discussion

Health care professionals widely use PubMed, one of the most relevant and powerful bibliographic citation search engines.6 Clinical Queries is a clinician-oriented section of the PubMed web site. Its purpose is to make it easier for clinicians to search for scientifically sound literature. However, some inaccuracies could act as confounders for unskilled users or cause a loss of information.7 Indeed, we found a bias in the Clinical Queries search filter that focuses on therapy and narrows the search: it simply does not consider the British English variant of the term “randomized” (that is, “randomised”). We suggest a slight change in the search string by adding the term “randomised.” Our results show that the modified search strategy has a wider retrieval capacity than the PubMed Clinical Queries filter for retrieving scientifically strong studies on treatment. The modified search strategy always retrieved a larger number of citations and, in particular, of RCTs. These are methodologically sound studies that evaluate evidence about treatment. Nevertheless, precision did not significantly differ between the two strategies (Clinical Queries versus modified string) joined onto each topic. When citations are in process, as supplied by publisher, or not-indexed-for-MEDLINE, the PubMed filter showed a significant loss of information. We think this observation is to be expected. Indeed, the term “randomised” is used by journals that adopt British English as their standard language. In this case, the PubMed filter systematically fails to retrieve the citation. In contrast, “randomized” is always recognized by the PubMed filter after the NLM indexing process is completed (you have to consider that some citations will never be indexed). Therefore, up-to-date and relevant information can be lost when PubMed searches are focused on studies about treatment by means of the “Clinical Queries” filter.
In conclusion, we think that clinicians have to be warned of the actual risk of losing relevant information when they focus a PubMed search on therapy and select the narrow search. In this case, correction of the Clinical Queries filter is necessary to avoid biased search results with loss of relevant and up-to-date scientifically sound information.

Footnotes
The authors thank Dr. Armando Gregorini and Dr. Mariastella Colomba (University of Urbino) for English language revision of this manuscript.

S Corrao had the idea and dealt with planning, conducting, reporting, and writing the work. He is the guarantor, together with G Licata, of the overall content. G Licata contributed to validating the idea and gave substantial advice about planning, conducting, and reporting the work. D Colomba and S Arnone conducted the literature review and participated in the analysis and interpretation of data. C Argano and T Di Chiara contributed to manuscript revision and data reporting. D Colomba and R Scaglione validated the idea and dealt with critical revision of the manuscript. All the authors approved the final version of the manuscript.

References
1. Haynes RB, McKibbon KA, Wilczynski NL, Walter SD, Werre SR, Hedges Team. Optimal search strategies for retrieving scientifically strong studies of treatment from Medline: analytical survey. BMJ 2005;330(7501):1179.
2. National Center for Biotechnology Information (NCBI) at the National Library of Medicine. Available at: http://www.ncbi.nlm.nih.gov/. Accessed Jan 22, 2006.
3. PubMed Clinical Queries. Available at: http://www.ncbi.nlm.nih.gov/entrez/query/static/clinicals.html. Accessed Jan 22, 2006.
4. Haynes RB, Wilczynski N, Mckibbon KA, Walker CY, Sinclair K. Developing optimal search strategies for detecting clinically sound studies in Medline. J Am Med Inform Assoc 1994;1:447-458.
5. National Center for Biotechnology Information (NCBI). Clinical Queries Filter Table. Available at: http://www.ncbi.nlm.nih.gov/entrez/query/static/clinicaltable.html. Accessed Jan 22, 2006.
6. McEntyre J, Lipman D. PubMed: bridging the information gap. CMAJ 2001;164(9):1317-1319.
7. Corrao S. Some explanations on search strategies for retrieving systematic reviews from Medline via Pubmed. Available at: http://bmj.bmjjournals.com/cgi/eletters/330/7482/6893135. Accessed Jan 22, 2006.