At the World Wide Web Conference, in the “Building a Semantic Web in Which Our Data Can Participate” panel. A few notes, loosely joined.
OpenStreetMap generates and annotates street maps from open sources of data. In the UK and Canada, unlike in the US, street map data is protected by Crown Copyright, so folks who want to annotate maps generally can’t. Can we compare the range of map-based products available in the US with those in the UK and Canada to see whether openness or closure is better for this data, and for the public? It would cost CAN$400,000 to collect all the maps of Canada from official sources, an audience member says, and even then you wouldn’t be allowed to post and annotate them. In the US, $30 buys them all on a CD, in the public domain.
Freebase aims to create a meta-database of free information that can connect multiple sources of information. Jamie Taylor positions free information in Geoffrey Moore’s terminology of core versus context. If data is not your core competency, then you should open it up and let the community share the costs of maintaining it and help you find new uses for it. Along the business lifecycle, opening (or modularising) your data can allow you to focus on the core where you have comparative advantage, and force weaker competitors to move there too.
With collaborative databases, questions of the trustworthiness of the data come to the fore. Metadata becomes even more important, particularly metadata about origin, as well as validation by corroboration among multiple datasets. Freebase uses internal foreign keys to trace the source of datasets.
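To make the idea of origin metadata plus corroboration concrete, here is a minimal sketch in Python. It is not how Freebase does it; the `Claim` record, the `corroborate` function, and the dataset names are all made up for illustration. Each assertion carries the name of the dataset it came from, and we then ask which sources back each value.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One assertion plus its provenance (hypothetical record layout)."""
    subject: str
    prop: str
    value: str
    source: str  # which dataset asserted this


def corroborate(claims):
    """Group claims by (subject, prop) and list the sources behind each value."""
    support = defaultdict(lambda: defaultdict(set))
    for c in claims:
        support[(c.subject, c.prop)][c.value].add(c.source)
    return {
        key: {value: sorted(sources) for value, sources in values.items()}
        for key, values in support.items()
    }


# Illustrative data only: three invented datasets, one of which disagrees.
claims = [
    Claim("Banff", "country", "Canada", "dataset_a"),
    Claim("Banff", "country", "Canada", "dataset_b"),
    Claim("Banff", "country", "USA", "dataset_c"),
]
report = corroborate(claims)
```

A value backed by several independent sources is easier to trust than one that only a single dataset asserts, and because the source travels with every claim, a disputed value can be traced back to whoever contributed it.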
And thinking about the validity of contributed data can make us think about better ways to validate internally sourced data too. Can we trace its origins, compare it to others’ measurements? Can we build in the metadata fields that allow us to rate the trustworthiness of elements and collaborate to focus on the weak spots? Defensive programming is good for everyone’s data, even our own.
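The same habit can be turned on our own data. As a rough sketch (the `weak_spots` function, the 5% tolerance, and the sample figures are assumptions for illustration, not anything shown at the panel), we could compare an internal measurement with others’ measurements of the same thing and list the elements that most need attention:

```python
from statistics import median


def weak_spots(measurements, tolerance=0.05):
    """Return element names that deserve a second look, worst first.

    An element is flagged when its sources disagree by more than
    `tolerance` relative to the median, or when only one source
    exists and nothing corroborates it.
    """
    flagged = []
    for name, by_source in measurements.items():
        values = list(by_source.values())
        if len(values) < 2:
            flagged.append((float("inf"), name))  # uncorroborated
            continue
        gap = max(values) - min(values)
        mid = median(values)
        if gap == 0:
            spread = 0.0
        elif mid == 0:
            spread = float("inf")  # relative spread undefined around zero
        else:
            spread = gap / abs(mid)
        if spread > tolerance:
            flagged.append((spread, name))
    return [name for _, name in sorted(flagged, reverse=True)]


# Invented figures: one internal value per element, plus outside sources.
measurements = {
    "boiling_point_c": {"internal": 100.0, "external_a": 100.1},
    "density_g_ml": {"internal": 1.00, "external_a": 1.20, "external_b": 0.99},
    "melting_point_c": {"internal": 0.0},
}
to_review = weak_spots(measurements)
```

The ranking is deliberately crude. The point is only that once each value records where it came from, disagreement and lack of corroboration become things we can query for rather than stumble upon.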
Peter Murray-Rust, talking about open access to scientific data, gives the example of PubChem. Before PubChem, each chemical supplier claimed copyright and proprietary interests in its catalogues. Now, if you’re not in PubChem, you might as well not exist, so suppliers have opened up, widening access to chemical information as well as expanding their own markets.
Just found a PDF of presentations.