Thursday, 12 August
self portrait (another cameraphone snap)
Saturday, 19 June
Juneteenth
I'm ridiculously busy (at work today, Saturday, for instance), but I wanted briefly to mark the occasion. It's Juneteenth, an emancipation anniversary which -- if I understand it correctly -- focuses on the joy without needing to downplay the grim realities, and is for everyone who feels that, for all its unresolved legacies, the end of US slavery is something to celebrate. I'm so white I'm actually pink, but Juneteenth makes me happy too. Fortunately, my friend Abel has done a great job of introducing and explaining the day and its background, so I'll just point you there: What is this Juneteenth of which you speak?
Sunday, 18 April
more phone snaps
advantage, schmantage
In a recent post at The Scholarly Kitchen, Philip Davis takes issue with a recent article by Alma Swan regarding the controversial Open Access citation advantage, the idea that any given paper is, ceteris paribus, more likely to be cited if published under an Open Access model than it would be if published behind a paywall. The FUD merchants want to claim that, if no citation advantage exists, there is no point to Open Access: that unless OA papers are currently garnering more citations than their TA equivalents, current levels of access must be adequate; or that if OA papers, which presumably are read more, are not cited more, then OA must be a repository for the second rate. Hence the controversy: it's an easy way to obscure the debate, sending up a cloud of statistical argument like a fleeing cuttlefish squirting ink. "Look over there, OA proponents are wrong about this, surely they must be wrong about everything, pay no attention to the massive profits behind the curtain." Considering that:
it's something of a miracle if any OA citation advantage shows up anywhere. More importantly, though, the citation advantage was always a minor point in the list of reasons to prefer Open to Toll Access:
(1) Not everyone who needs to read the primary literature is going to write anything citing it. That doesn't make providing them with access to the literature any less important, and no payment or institutional affiliation is required to read Open Access information.
(2) Toll Access confines data- and textmining to isolated, artificial commercial sections of the body of knowledge, hindering progress on mining methodologies, restricting the reach of existing work and precluding any idea of a comprehensive protocol.
(3) OA provides better value for money than Toll Access. Regardless of where the money comes from, OA is a one-time up-front expense that covers all subsequent use: pay the midwife, but keep the baby. Peter Suber has written a careful exposition of this argument from the taxpayer perspective, but most if not all of his points map readily onto any research funder.
(4) Open Access scales where Toll Access doesn't; my own recent estimate (caveat lector!) is that library access, even at the best funded libraries, runs to around half of the total available scholarly journal literature. What use is a system that enables publication without enabling access?
The subscription model divorces (part of) the cost of dissemination from the overall cost of production of scholarly information, which has allowed research funders to overlook that part of the cost of their mission. It's been historically picked up by libraries, but that's easily revealed as a shell game when you look at where library funding comes from. Who loses the shell game? Academics whose work is less widely available than it should be, and anyone who wants to read the primary literature. Who wins? Publishers, whose prices have been allowed to escalate because they have largely escaped scrutiny (except by librarians, who for no good reason that I can see have been largely ignored, at least until relatively recently, by academic and political decision makers).
Wednesday, 14 April
from my commute this morning
On the Train
Monday, 15 March
estimating ullage
Ullage, the word for the empty space at the top of a wine bottle, is Peter Suber's term for the gap between a library's actual holdings and its patrons' access needs. That's a difficult thing to measure, but I might have found a way to estimate it with reference not to patron needs but to all published journals, as follows.
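For anyone who wants to check the division behind the figures quoted below, here is a minimal sketch. The budget and the two per-journal prices are the ones given in this post. Applying the same budget at both prices is an assumption, and so is the total journal count: it is not stated here, so the code only backs it out of the "about 4,800 titles is less than 23%" figure. The function and variable names are made up for illustration.

```python
def affordable_titles(budget: int, price_per_title: int) -> int:
    """Whole subscriptions a budget covers at a flat per-title price."""
    return budget // price_per_title


budget = 5_800_000  # dollars, the budget figure quoted in the post

titles_at_1200 = affordable_titles(budget, 1200)
titles_at_700 = affordable_titles(budget, 700)

# Assumption: the total number of active refereed journals is not given in
# this post, so infer the smallest total consistent with "less than 23%".
implied_total = titles_at_1200 / 0.23

# Upper bound on coverage at the lower price, given that inferred total.
share_at_700 = titles_at_700 / implied_total

print(titles_at_1200)          # 4833
print(titles_at_700)           # 8285
print(round(share_at_700, 2))  # 0.39
```

Because the true journal count is larger than the implied total, 39% is a ceiling rather than an estimate, which sits comfortably with "less than half".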
At $1200/journal, $5.8 million [1] would buy subscription access to about 4,800 titles, which is less than 23% of the number of active, refereed, academic/scholarly journals. At $700/journal, ARL members -- some of the largest and best funded libraries in America (indeed, in the world) -- are able to afford access to less than half of the scholarly literature. This seems reasonably consistent with the earlier LANL estimate, given that Varjabedian looked only at the top 100 most-cited journals, which must surely be at the top of any research library's "must-have" list.
It's important to point out that what I'm estimating here is not ullage sensu Suber, but rather library holdings relative to all possible holdings. But I would argue that the access needs of all the scholars and other patrons served by ARL libraries are surely a decent proxy for "all possible journals", if not a significantly larger body of information! Put another way, here I am estimating the gap between current access levels and the information availability of a 100% Open Access world.
Sunday, 14 March
an interesting mind
This entry is especially for those of my readers who do not work in science or related fields (librarians, publishers, etc), and who are not quite sure why I am so obsessed with Open Science. (Hi, Mom and Dad!) This is Pawel Szczesny at TED Warsaw, describing for the lay public what Open Science is, and what it can mean. Pawel's is the interesting mind to which I refer in the title. I finally met him in person at Science Online earlier this year, but I have been following him around online for years. He never fails to come at a question or problem from an interesting and useful angle, and his TED talk is just the latest example. What if? Do yourself a favour, watch the whole thing.
Sunday, 07 March
Where indeed?
AJ Cann has a post up that neatly summarizes the dilemma facing Open Science advocates/enthusiasts, and asks useful questions arising therefrom. In the current competition-focused environment, says Alan:
"Open science is an iterated prisoner's dilemma, which is a messy and unpredictable business. Too unpredictable for most people to try to build a career on. Thinking about strategies which are likely to be successful leads me towards the concept of an open science community rather than unilateral complete openness - a long term multiplayer collaboration. Does such a community already exist? If not, how do we build one?"
Having taken a job in biotech, I feel a bit cut off from any such community -- industry is notoriously protective of IP and fond of secrecy besides. I feel a bit of a fraud, for instance, taking part in discussions of Open Science issues on FriendFeed (such as the conversation kicked off by Alan's blog post), knowing that I can't talk openly about my own work. It doesn't keep me from shooting off my yap, of course, but it's a nagging icky feeling -- and I keep getting the meta-feeling that it doesn't have to be this way. Just as secrecy in academia only makes sense within the existing reward structure, secrecy in industry could be at least partly offset by policy decisions that recognize the gains in efficiency that collaboration can bring.
I've heard multiple times from multiple sources that industry may close itself off from the rest of the world, but within a company, the teamwork ethic is amazing. Clearly, the value of co-operation is recognized. Why shouldn't that also work for (larger and larger) groups of companies? What you lose by not being the only company to know something from which profit can be made (call it X) is offset by the fact that you might never have learned X without the collaboration -- and in the meantime, the world gets X that much faster.
It seems clear, though, that such top-down decisions are more likely to be made in academia, and perhaps the nonprofit sector, than in profit-driven industry -- at least until there are enough concrete examples of success to tip the perceived balance of risk. If I'm -- if we Open Foo types are -- right, it's actually riskier to compete than to cooperate in the long term. Better to own a share of X sooner than to delay any return on your investment in the hope of owning X outright later. This is especially true when the resources required to try to own X could be used to get you shares in multiple other projects at the same time.
Even then, openness in industry seems to me unlikely to go beyond consortia. Complete openness (open notebook science) precludes patent protection, and in the dog-eat-dog world of business driven by the insatiable demands of disconnected shareholders, I don't think we are ever going to wean the beancounters off their patents. (We could improve the situation by overhauling the patent process so that teeny incremental changes were not granted full protection, of course; but I digress, and don't get me started.)
So to return to Alan's analogy, "multiplayer" means different things in academia (and perhaps the nonprofit sector) and in business. In business, it means defined communities of co-operation; in academia, I see no good reason why it shouldn't mean everyone (except, perhaps, where the two intersect and academics enter a business-defined collaboration [1]).
In academia, communities with an open science focus are beginning to form. The best example is still the one which continues to coalesce around Jean-Claude Bradley's UsefulChem initiative, but it's no longer the only one as it was just a few years ago. Chemist Mat Todd has funding for an open science project to improve synthesis of the anti-schistosomiasis drug, praziquantel. Biophysicist Steve Koch has a labful of open science enthusiast grad students.
And so on; there's a list of Open Notebook practitioners on wikipedia, and my own feeling is that technical rather than philosophical barriers are keeping quite a few labs from that list. By being discoverable on the public web, all of these labs can do what Jean-Claude is doing: accumulate collaborators and get more work done. Try searching Google for "DNA tweezers kinesin" -- the second and fifth hits will hook you up with Steve Koch. "Praziquantel synthesis" -- the third hit will take you to the schisto community on The Synaptic Leap, where you'll soon meet Mat Todd, and the seventh hit will take you to a brief discussion of Mat's project on the UsefulChem blog. "Antimalarial Ugi" -- most of the first ten hits will introduce you to UsefulChem. If you're doing something that's in any way related to the work that goes on in these labs, you're one Google search away from a collaboration.
In business, too, more and more companies are recognizing the benefits of wider sharing. Details of private collaborations are hard to come by, but just try searching for "precompetitive sharing" -- even Big Pharma can see that they stand to make net gains from sharing their datasets. For an even better example, check out Sage Bionetworks. I was lucky enough to hear Stephen Friend speak at the Science Commons Symposium a couple of weeks ago, and one of the points he made was that the really big questions in biology require such immense amounts of data that the only way to collect them is to do it in the open. Any impediment at all, be it CC-BY attribution requirements or IP protections, will derail the whole process; the only answer in the end is the public domain.
So, the seeds are there. I think continued crystallization is inevitable, but it's certainly worthwhile to try to monitor and direct the process -- by way of questions like those Alan is asking.
-------------
Monday, 01 March
no art without
I remember reading somewhere about a school of philosophical thought which holds that there can be no art without the resistance of the medium -- that the art is in the difficulty the artist overcomes when trying to make the medium express his or her message. I don't know that I buy the idea, but I do notice that my cell phone camera doesn't have a very broad color or contrast palette, so it tends to blow out highlights and lose shadow detail -- and that I'm starting to recognize opportunities to exploit those weaknesses:
[image]
I'm not sure I like being trained to a particular visual style like that, though. I picked up a camera in the first place in order to see differently, and I've been very pleased with the change in my world that this practice has rendered. I don't think I want to put blinders on it.
Friday, 19 February
Panton Principles for Open Data in Science
The Open Knowledge Foundation has just announced the Panton Principles for Open Data in Science. Here's the point-form version of the Principles (but do go and read the whole thing, including the concise but important preamble; and please consider endorsing):
I've written elsewhere about my feeling that Open Data/Open Science will eventually need a set of core Declarations to do for the wider movement what the BBB definitions have done for Open Access. A set of widely accepted terms and definitions provides a framework within which ongoing discussions can be much more efficient, focused and useful, as well as a point of reference and a standard introduction for newcomers to a field. Kudos to OKF and partners for making a strong start in this direction. I do have one small quibble. Following Peters Suber and Murray-Rust, I want Open licenses to be three things:
The Panton Principles come right out and say "explicit", and "machine-readable" is largely covered because the recommended licenses are available in machine-readable versions (though I'd have preferred to see that actual phrase in the text of the Principles). What's missing, to my mind, is "conspicuous". The point of Open licensing is to enable and promote re-use, so it's important to make your license as obvious as possible to potential users. This might seem trivial, but I think it bears spelling out. My own Open Data mantra is:
and again, the PPs are 2 for 3 by my count. The licensing covers what I can have and what I can do with it, but there's no mention of where I can find it in the first place. When we're talking about a database, the question doesn't arise since the license is in the same place as the data. But if we're talking about data which underlie a published paper, those data are very often not in the same place as the paper, even if the license is there. So it's important to make sure that your data are available: find or build them a stable online home and then let potential users know where it is. There's not much point in placing something in the Public Domain if the only copy is on your desktop. I'd have liked to see an explicit discussion of storage, access and signposting in the Principles... though come to think of it, this is really a different (and enormous) set of questions. So perhaps "conspicuous" covers this as well, and the missing Principle is simply that there should be a highly visible link to the license and the data themselves in every place where they are used, mentioned or otherwise likely to be encountered. Of course, there are always unresolved questions no matter how carefully you craft your Declarations and Statements and Principles -- which is why the OKF has wisely built a companion tool, the Is It Open Data? web service. This is a brilliant way to remove ambiguity once and for all, on a case by case basis, by making public enquiry into the openness or otherwise of specific data sets. You can browse previous enquiries, so as to avoid redundant questioning of data owners; and naturally, recipients of multiple enquiries can use the service in a different way, simply linking to the record of their first response by way of answer to subsequent queries. Searchability might be a concern once the database of enquiries starts to grow, but that functionality can be added as needed. 
A central public service for asking questions about data availability and archiving the answers could go a long way towards improving access to data, simply by making clear the level of demand for Openness, and the degree to which supply falls short.