Someone else is fooling around with numbers.

Via Peter Suber, I came across this editorial in the Journal of Vision:

Measuring the impact of scientific articles is of interest to authors and readers, as well as to tenure and promotion committees, grant proposal review committees, and officials involved in the funding of science. The number of citations by other articles is at present the gold standard for evaluation of the impact of an individual scientific article. Online journals offer another measure of impact: the number of unique downloads of an article (by unique downloads we mean the first download of the PDF of an article by a particular individual). Since May 2007, Journal of Vision has published download counts for each individual article.
The author goes on to compare downloads with citations (counts and rates, and downloads or citations over time). It's a pretty good analysis of an important topic, but something vital is missing:

Where are the data? Can I have them? What can I do with them?1
In fact, the data are approximately available here. Why "approximately"? Well, I can get a range of predigested overviews: a DemandFactor Top 20 (DemandFactor being, roughly, downloads/day/first 1000 days), a total downloads Top 20, and article distributions by DemandFactor and by total downloads. I can also get the download information for any given article -- one article at a time, and once again predigested in the form of a graph from which I have to guesstrapolate if I want raw, reusable data.

This is disappointing, for both general and specific reasons. It's always disappointing to see data locked away in a graph or a PDF or some similar digital or paper oubliette, there to languish un(re)used. It's also disappointing to see a journal getting way out ahead of the curve on something as important and valuable as download metrics (is there another journal besides J Vis that provides this information, even predigested?), and then missing an opportunity to continue to innovate by providing real Open Data.

It's also disappointing in this specific instance, because I have a question: why is Figure 1 plotted on a log scale and, more importantly, was the correlation coefficient calculated from log-transformed data? I could understand showing the log scale for aesthetic reasons, but I can't think of a reason to take logs of that kind of data -- and doing so can alter the apparent correlation. For instance, remember Fig 1 from this post? Here it is again, together with a plot of log-transformed data, both shown on natural and log scales:


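[Figure (logarithmssarehard.PNG): the data from Fig 1 of the earlier post alongside the log-transformed data, each plotted on natural and log scales.]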
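To make the worry concrete, here is a minimal sketch in Python of how the same set of points can give two different Pearson correlation coefficients depending on whether you take logs first. Everything in it is an assumption for illustration: the numbers are invented by `make_fake_counts` (a hypothetical helper, not Journal of Vision data), and I have no idea whether the editorial's coefficient was calculated this way.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def make_fake_counts(n=200, seed=0):
    """Invented (downloads, citations) pairs. NOT real journal data."""
    rng = random.Random(seed)
    # Skewed, strictly positive "download" counts.
    downloads = [rng.lognormvariate(6.0, 1.0) for _ in range(n)]
    # "Citations" loosely tied to downloads, with multiplicative noise.
    citations = [1.0 + 0.01 * d * rng.lognormvariate(0.0, 1.0)
                 for d in downloads]
    return downloads, citations


downloads, citations = make_fake_counts()

# Correlation on the counts as they are.
r_raw = pearson_r(downloads, citations)

# Correlation after log-transforming both variables.
r_log = pearson_r([math.log(d) for d in downloads],
                  [math.log(c) for c in citations])

print(f"r on raw counts:      {r_raw:.3f}")
print(f"r on log-transformed: {r_log:.3f}")
```

The two printed values will generally differ, because Pearson's r measures linear association and taking logs changes which relationships look linear. Note that merely drawing the axes on a log scale changes nothing about r; only transforming the data before the calculation does.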



I could answer my own question quickly and easily if I could get my hands on the underlying data -- which leads me right back to one of the primary general arguments for Open Data. If I, statistical ignoramus and newcomer to these sorts of analyses, have questions after a brief skim through the paper, what questions might a better equipped and more thorough reader have? It's simply not possible to know -- the only way to find out is to make the data openly available!

I realise it's not possible for journals to demand Open Data from their authors -- that's what funder-level mandates are for, though there's much discussion still to be had regarding whether Open Data mandates would be a good idea. Nonetheless, when journals publish analyses of their own data, it would be great to see them leading the way by providing unrestricted access to those data.

-------------
1 Astute readers, both of you, will remember that howl of anguish refrain from this post.


CC0
To the extent possible under law, I have waived all copyright and related or neighboring rights to this weblog. This work is published from the United States. Further information.

