Does the "green road" lead off a cliff?
Further to my complaints about the copyright thicket in which data are being lost, Charles W Bailey Jr points out that, in fact, it's worse than that: a good deal of the potential functionality of existing Open Access archives is jammed up in the same thicket:

    If... repositories could not be trusted, then libraries would have to attempt to archive the postprints in question themselves; however, since postprints are not by default under copyright terms that would allow this to happen (e.g., they are not under Creative Commons Licenses), libraries may be barred from doing so.

(Emphasis mine.) Charles is talking about the question of whether or not self-archiving of scholarly articles (the "green road" to Open Access) will cause libraries to cancel journal subscriptions. I touched on this issue in an earlier entry, and don't want to revisit it here. What interests me here is the fact -- which I initially had trouble grokking, as you'll see if you read the comments on Charles' entry, where he patiently explains it -- that digital objects in Open Access repositories carry their own copyrights, rather than being covered by a blanket license provided by the repository. For instance, PubMed Central refers to Open Access (using the Bethesda Statement), and then says:

    Note that this definition of open access goes beyond the simple free access that applies to all full-text content viewable directly in PubMed Central (PMC) from the National Institutes of Health (NIH).

So PMC is OAI-PMH-compliant, but contains digital objects that are not themselves Open Access. I suspect the same is also true of the majority of institutional and centralized repositories (though I only checked ePrintsUQ, arXiv.org and Cogprints, none of which make any mention of copyright at all).
To get an idea of what that actually means, read carefully this brief discussion by Peter Suber of the BBB definition of Open Access:

    The best-known part of the BBB definition is that OA content must be free of charge for all users with an internet connection. However, the BBB definition doesn't stop at free online access. It adds an extra dimension that isn't as easy to describe, and consequently is often dropped or obscured. This extra dimension gives users permission for all legitimate scholarly uses. It removes what I've called permission barriers, as opposed to price barriers. The Budapest statement puts the extra dimension this way:

        By "open access" to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.

    The Bethesda and Berlin statements put it this way: For a work to be OA, the copyright holder must consent in advance to let users "copy, use, distribute, transmit and display the work publicly and to make and distribute derivative works, in any digital medium for any responsible purpose, subject to proper attribution of authorship".

Because each digital object carries its own copyright, e-print repositories do not remove permission barriers. Here's Peter Suber again:

    Permission barriers are more difficult to discuss than price barriers. First, there are many kinds of them, some arising from statute (copyright law), some from contracts (licenses), and some from hardware and software (DRM). They are not like prices, which differ only in magnitude. Second, their details are harder to discover and understand. Third, different users in different times, places, institutions, and situations can face very different permission barriers for the same work. Fourth, authors who deposit their articles in open-access archives bypass permission barriers even if they also publish the same articles in conventional journals protected by copyright, licenses, and DRM.

As far as I can tell, that fourth point is simply not true of any existing archives. If you want to do anything with an article in, say, PubMed Central, other than simply read it -- if you want to copy it and distribute the copies, if you want to make a derivative work, if you want to pass it to text-mining or other software -- you will have to determine, on an article-by-article basis, whether you are allowed to do that. Take, for example, the following paper from the lab I work in, available free from PubMed Central:

    Deletion of Mnt leads to disrupted cell cycle control and tumorigenesis.

Right above the title on the linked page is a copyright notice: "Copyright © 2003 European Molecular Biology Organization". The link provided goes to a PMC page which makes it very clear that an article's presence in PMC tells you nothing about what rights the copyright holder(s) reserve or waive. Searching the EMBO site for "copyright" brings up nothing useful, but the EMBO Journal (which is actually part of Nature Publishing Group) has this to say:

    Nature Publishing Group does not require authors of original research papers to assign copyright of their published contributions. Authors grant NPG an exclusive licence to publish, in return for which they can re-use their papers in their future printed work. NPG's author licence page provides details of the policy and a sample form.
    Authors are encouraged to submit their version of the accepted, peer-reviewed manuscript to their funding body's archive, for public release six months [2] after publication. In addition, authors are encouraged to archive their version of the manuscript in their institution's repositories (as well as on their personal web sites), also six months after the original publication.

Apart from the foul six-month embargo (Do you have any idea how many experiments I can do in six months? But I digress.), this seems reasonable, and it leaves permissions up to the authors. So "copyright EMBO" is misleading, and it's likely that EMBO J authors, having reposited their articles, wish them to be fully Open Access. As it happens, in this case the corresponding author is my boss, so I can assure you that he knows about Open Access and is all in favour. The point, though, is that you have to dig around to find out that it's up to Peter (the corresponding author), and then you have to contact him to find out that he fully intends you to have the permissions you need. You are not going to be able to do that for more than a handful of papers; it certainly puts an effective brake on text-mining.

I think this brief example makes clear that, in practice, you cannot do anything much with repository content but read it ("fair use", of course, still applies). You simply don't have the time to uncover the necessary permissions for anything else. Which in turn means that there are no, or very few, actual Open Access repositories currently in existence. I'll say it again: e-print repositories do not provide Open Access. They provide free access to human eyes, one paper at a time; as the accepted definitions make clear, that's not at all the same thing. Since self-archiving in such repositories is the current focus of many, if not most, efforts to provide 100% Open Access to the world's scholarly literature, this is a big deal.
There are two obvious solutions: 1, ignore the whole issue; and 2, start applying labels to digital objects.

In the short term and for individual researchers, solution 1 has considerable appeal. There's even precedent: a recent study pointed out that patents do not slow research down much, mostly because researchers ignore them. The majority of e-prints are probably in a repository because their authors want Open Access; the likelihood of running afoul of copyright and actually being called to account for it seems pretty low.

I think, however, that this head-in-the-sand approach is a very bad idea. What authors want is not always what counts, as when the copyright is actually owned by a publisher. I've been trying to think of the kinds of things you might do with a body of OA literature -- build a text-mining robot that offers novel ways to look for deep connections between ideas and among data, make a local database of papers on your research specialty, and so on -- but in fact, much of the point of Open Access is to make possible things I cannot think of. Look what the Web has made possible, and ask yourself: how much of that could I have predicted in 1991? It seems to me that anything which makes use of a substantial number of papers, or relies on being able to mine an entire corpus, runs the risk of being shut down or co-opted just when it starts to get interesting and useful. Suppose, for instance, that I write that text-mining robot: while I am using it to feed ideas into my own benchwork, I'm OK, but the minute I give that robot to someone (or, as is my preference, everyone) else, I run the risk of being sued for copyright violations.

Nonetheless, the fundamental OA definitions include rights beyond simple reading access for good reason.
As I discussed in my earlier entry, rights management is going to be at the heart of Open Data, and I have argued elsewhere that licensing and standards/metadata are also going to be crucial to bringing the "openness" of Open Access to science as a whole. I think the Open Science field is headed for some serious problems if permissions barriers are not given more attention.

I might concede that the most important thing to achieve right now is removal of access barriers to human eyeballs, but why make trouble for ourselves by -- as seems to be happening [3] -- ignoring the rights issue? There's no reason why the process of encouraging authors to self-archive, and building tools to make that easier, should not include information and tools that focus on rights management. At the very least, we should be making authors who are already on-side, who are self-archiving and using the SPARC Author Addendum and so on, aware of the issue -- and giving them the tools to label their own papers with clear statements of the rights they wish to retain or waive. At least then the rate of growth of the backlog problem will begin to slow down, and should approach zero as we approach 100% OA (even on the green road) rather than continuing to grow unchecked.
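To make the labelling idea a little more concrete, here is a minimal sketch of what a text-mining robot could do if each record carried a machine-readable rights statement. It assumes the statement lives in the Dublin Core `dc:rights` element (the metadata format that OAI-PMH repositories expose) and that a Creative Commons license URL counts as permission to mine; both are assumptions for illustration, not a description of what any repository does today. The `<records>`/`<record>` wrapper, the sample titles and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Real Dublin Core element namespace; everything else below is illustrative.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical, simplified records (not a real OAI-PMH response envelope).
SAMPLE = """<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record>
    <dc:title>Paper with an explicit license</dc:title>
    <dc:rights>http://creativecommons.org/licenses/by/2.5/</dc:rights>
  </record>
  <record>
    <dc:title>Paper with a bare copyright notice</dc:title>
    <dc:rights>Copyright (c) 2003 Some Publisher</dc:rights>
  </record>
  <record>
    <dc:title>Paper with no rights statement</dc:title>
  </record>
</records>"""


def rights_of(record):
    """Return every non-empty dc:rights statement attached to a record."""
    return [el.text.strip() for el in record.findall(DC + "rights") if el.text]


def is_openly_licensed(record):
    """Assumption: a Creative Commons license URL signals permission to reuse."""
    return any("creativecommons.org/licenses/" in r for r in rights_of(record))


def triage(xml_text):
    """Split record titles into those safe to mine and those needing a human."""
    root = ET.fromstring(xml_text)
    minable, unclear = [], []
    for record in root.findall("record"):
        title = record.findtext(DC + "title", default="(untitled)")
        (minable if is_openly_licensed(record) else unclear).append(title)
    return minable, unclear


minable, unclear = triage(SAMPLE)
print("OK to mine:", minable)
print("Permission unknown:", unclear)
```

The point of the sketch is the second list: without a label, every paper in it means digging around and contacting someone, which is exactly the article-by-article work that does not scale.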
[2] In fact, EMBO J deposits papers in PubMed Central for free access after 12 months, and most authors probably do not place copies in PMC, or anywhere else, at the 6-month mark. I know that these authors didn't, and I bet NPG is relying on this behavior to get an effective 12-month embargo. Bastards.

[3] I am, of course, re-inventing the wheel here. In comments on the entry that sparked all of this, Charles notes that he was debating this issue with OA movers and shakers before the BOAI went public. Here's a 2005 article by Richard Poynder covering the same ground and then some. I mentioned Project RoMEO above. Pretty much anything I have to say about OA will have been said before, since I'm a newcomer to the field, but I write out my thoughts here in order to collect and organize them.

Comments

Thanks for that link to the ASOA thread -- I note that it's from 2003; I wasn't kidding in my footnote (3) about covering well-trodden ground! I don't know what I can say that wasn't said by Mike Eisen, Peter Suber, Jan Velterop and others in that thread. Like them, I think the distinction between free-as-in-beer access and Open Access is important and necessary.
At the end of the Green Road lies nothing but 100% OA
There is no permissions problem or barrier whatsoever for self-archived full-texts deposited by their authors in their institutional repositories, free for all. All the rest of the legitimate uses come with the territory: harvesting, indexing, linking, finding, downloading, reading, storing, data-crunching, printing-off (own use). Nothing else is needed. Course packs can list URLs. And no one ever said OA referred to anything but the online draft: distributing multiple printed copies would be a whole 'nother matter. So would "republishing" (though I can't think why anyone would want to, once it's already OA). (See the "Free Access vs. Open Access" thread, begun Aug 23.)