Volume: Vol1, Issue 4/2012

Published on: Dec 2012

Contains 20 articles

Exploring the temporo-spatial trends of keyword occurrence using the Medline/Pubmed database

 

Authors

ALEXANDRU DAN CORLAN *,1

 

Affiliation

1Spitalul Universitar de Urgență București

 

 

We announce a new version of the online data mining tool “Medline Trend” (http://dan.corlan.net/medline-trend.html) that extends the yearly trends with spatial (geographic) statistics. In a nutshell, the user enters an ordinary query, such as the name of a disease, and obtains counts of PubMed entries (papers) for each country and time interval. The country of origin is detected from the address (AD) field. Unlike the date of publication, the presence and significance of at least one recognisable country name in the address field are variable, and the address field itself seems to have been introduced in Medline only about 20 years ago. For example, at the date of writing, in the 2008–2012 interval, only about 3.34 million entries out of 4.052 million had a recognisable country name. For earlier years, the rate is even lower. In this paper, we propose a number of indices that are not directly influenced by the address field variability. They are based on the value of one spatial index relative to others computed for the same time interval or region. The annualised rate of change of the number of papers for a keyword and region over a time interval is the compound annual rate of change of the number of entries fulfilling the search criteria, fitted to the actually observed counts. The relative interest is the proportion of entries on a topic (such as ‘tuberculosis’) in a geographical region and over a specified time interval, compared to all entries originating from that region during the same time interval. We found that, at least for tuberculosis, there is a strong and consistent log-log relationship between the relative interest and the prevalence of the disease in that area and period. The interest–size correlation index is the Spearman (ρ) correlation between the absolute output of a country and the relative interest for a given keyword.
It might roughly indicate whether a topic attracts more interest in countries with more developed rather than less developed scientific systems. It is, for example, 0.26 for ‘echocardiography’, -0.48 for ‘tuberculosis’, 0.72 for ‘stem cell’, 0.15 for ‘diabetes’ and -0.06 for ‘malaria’. Spatiotemporal trends might sometimes provide insightful clues into the quantitative mechanisms that lead to the adoption of a particular research theme, but their application requires attention to numerous limitations and caveats, in addition to the usual limitations of paper count statistics.

 

Pages

281-292

 



*Corresponding author