andypowe11

Wednesday 3 February 2010: More famous than Simon Cowell

Bear with me on the title...

I don't usually write about work-related stuff here but am going to make a brief exception just this once. I am supposed to be up in London today at a meeting to discuss 'persistent identifiers' - such is the exciting life I lead. However, I now have a hospital appointment as well, to get my ears fixed, and hospital appointments are not that easy to come by, so I decided to miss the London meeting and go to the hospital instead.

Persistent identifiers are a regular topic of conversation for me, particularly as they pertain to the academic and cultural heritage sectors. An identifier is a name or label for something (e.g. a research paper, a museum object, or the digitised versions of those things) that you can use to refer to that thing even when you don't have it immediately to hand. For identifiers to work, there has to be some level of agreement about what is being identified, at least by all the parties that need to make use of it. This is usually achieved by having some widely agreed way of moving from the identifier to the thing being identified (sometimes known as 'dereferencing'). The persistence aspect comes about because we want to be able to refer to things for very long periods of time - 100s of years in some cases, particularly in the cultural heritage sector. In the context of relatively new technology (like the Internet), thinking about how to make things work persistently for 100s of years is a non-trivial task.

Are you still reading?

This blip entry has an identifier. So did yesterday's. So does the whole Blipfoto website. Many of you will know these identifiers as URLs (or Uniform Resource Locators to use the full name). But it is worth noting that in a detailed technical sense, the term URL is somewhat disputed. In technical discussions I now tend to prefer to use the term 'http URI' - the 'http' bit referring to the way the URL starts and the 'URI' bit referring to Uniform Resource Identifier. This is usually, but not always, less contentious.

Let's look at the 'identifier' for my blip yesterday:

http://www.blipfoto.com/view.php?id=464924&month=2&year=2010

What makes this URL persistent or not? How long will it remain useful as an identifier? A year? Yes definitely. Two years? Yup. Five years? Very likely. Ten years? Probably. 20 years? Maybe. 50 years? Who knows! (I'll be dead anyway). 100 years? Haven't got a clue! In a sense, paying the Blipfoto membership fee is a vote of confidence in the persistence of Blipfoto identifiers for a reasonable time into the future - but I'm not sure what 'reasonable' actually means here.

So long-term persistence isn't guaranteed by any means. But there are practical steps we can take to help things run more smoothly. The URL above is what the W3C (the body that looks after the Web) would call 'uncool' (as opposed to it being a cool URI). That means that it is potentially not as persistent as it could be. Why? Because it contains the string '.php'. PHP is the programming language used to build the Blipfoto website. If, in 10 or 20 years' time, the Blipfoto team decide to stop using PHP to deliver the site (if use of the PHP programming language dies out for example) then they will either have to make the '.php' URLs work using some other language, or they will 'break' all the existing identifiers. It won't be an impossible task to work around this... but it will be a factor in making Blipfoto identifiers persistent into the future.
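For the technically minded, here is a rough sketch in Python of what 'uncool' means in practice. It pulls my blip URL apart, flags any file extension that gives away how the site is built, and shows one way the old identifier could be mapped to a technology-neutral one. The list of extensions, the function names and the '/entry/...' path are all my own assumptions for illustration; I have no idea how Blipfoto would actually do it.

```python
from urllib.parse import urlsplit, parse_qs
import posixpath

# Assumed (hypothetical) list of extensions that reveal server-side technology.
IMPLEMENTATION_EXTENSIONS = {".php", ".asp", ".aspx", ".jsp", ".cgi", ".pl"}


def implementation_hints(url):
    """Return the extensions in the URL path that expose how the site is built."""
    path = urlsplit(url).path
    hints = []
    for segment in path.split("/"):
        ext = posixpath.splitext(segment)[1].lower()
        if ext in IMPLEMENTATION_EXTENSIONS:
            hints.append(ext)
    return hints


def neutral_path(url):
    """Map a legacy view.php URL to a hypothetical technology-neutral path.

    Returns None if the URL is not a view.php URL carrying an 'id' parameter.
    """
    parts = urlsplit(url)
    entry_id = parse_qs(parts.query).get("id", [None])[0]
    if parts.path != "/view.php" or entry_id is None:
        return None
    # "/entry/<id>" is an invented path, used purely for illustration.
    return "/entry/" + entry_id


url = "http://www.blipfoto.com/view.php?id=464924&month=2&year=2010"
print(implementation_hints(url))  # the '.php' that makes the URL 'uncool'
print(neutral_path(url))          # a path that says nothing about PHP
```

If the site ever moved away from PHP, a lookup along these lines (old URL in, new path out, served as a redirect) is roughly what would be needed to keep the existing identifiers working.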

There are also other aspects to the persistence of this identifier - the domain name for example, i.e. the 'www.blipfoto.com' part of the URL. Domain names are essentially rented, and, like any rented commodity, agreements about the right to use a particular domain name can lapse for various reasons. In extreme cases (remembering that we are talking about very long periods of time here) the whole infrastructure of the Internet might change.

Now, of course, the persistence of Blipfoto URLs isn't a major worry for most of the world - even for most Blipfoto members. But in an academic context, the identifiers used to refer to academic research papers, or the research data that underpins those papers, are important - and they're important over the long term. For example, one can imagine that researchers in 20 years' time will want to be able to refer back to the papers being used now to predict trends in global warming.

And so to today's blip - which is of an identifier. It's the identifier for a railway bridge in the UK. I assume (though I don't know for sure) that all UK railway bridges have such an identifier. I don't know how long such identifiers have been in use and it is somewhat hard to predict how long they will work into the future but it's probably fair to say that it will be for quite some time.

The community of use for this identifier is fairly small - extending not much beyond the people who work on the railways and the emergency services I would guess. If I said to you, "go to bridge 179-40", you wouldn't have a clue what I was talking about and you probably wouldn't have any way of finding out.

Likewise, if I said to you, "go and read doi:10.1037/0003-066X.59.1.29", I'm guessing that most of you wouldn't know what to do. (This is an example of the kind of identifier, called a DOI, typically used to refer to academic papers - it's actually the identifier for a paper called "How the Mind Hurts and Heals the Body" by Oakley Ray, published in American Psychologist, Vol 59(1), Jan 2004, 29-40.)
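As it happens, a DOI can be turned into something a browser understands by sticking a resolver address in front of it. Here is a minimal sketch in Python, assuming the public doi.org resolver; the function name and the very rough validity check are mine, not part of any official DOI toolkit.

```python
from urllib.parse import quote

# Assumption: the public DOI resolver at doi.org is used for dereferencing.
DOI_RESOLVER = "https://doi.org/"


def doi_to_http_uri(identifier):
    """Turn 'doi:10.xxxx/yyyy' (or a bare DOI) into an http URI."""
    doi = identifier.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    # Very rough sanity check: DOIs start with the '10.' directory indicator.
    if not doi.startswith("10."):
        raise ValueError("not a DOI: " + identifier)
    # Percent-encode anything unusual, but keep the '/' separator readable.
    return DOI_RESOLVER + quote(doi, safe="/")


print(doi_to_http_uri("doi:10.1037/0003-066X.59.1.29"))
```

The result is an ordinary http URI that you can paste into a browser, which is about as good an illustration as any of why http URIs are so handy as identifiers.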

What makes the bridge identifier persistent? It's essentially a social construct. It's not a technical thing (primarily). It's not the paint the number is written in, or the bricks of the bridge itself, or the computer system at head office that maps the number to a map reference. These things help... but it's mainly people that make it persistent.

What's interesting, and so powerful, about http URIs is how widely they are understood. I don't know what proportion of the world's population you could show 'http://www.blipfoto.com' to and they would understand it and know what to do with it. I'm guessing it would be more than 50% - well over 50% probably. More than would recognise a picture of Simon Cowell, David Beckham or Michael Jackson? I guess so.

That's fame that is!


DSLR-A200 : f/3.5 : 1/80" : 50mm : ISO 400

viewed 2319 times : 11 comments

Tags

identifiers jiscpid bridges

Comments

  1. Well that was a very satisfying read. I'm fascinated as much by the future as I am by the past. I think having blipfoto journals accessible in 100 years' time would be of immense value to others, of course, and things like URLs, php, jpg and such are certainly the limiting factors for that to happen. In fact the whole digital medium becomes a point of concern. It's so flexible, dynamic and mutable but they are not attributes that lend themselves well to longevity. I'm impressed with the images on shorpy that are over a hundred years old but I feel they, the negatives, have so much more longevity than their digital counterparts. The digital medium leaves no fossilized remains I fear.

    ~ constant

  2. This sort of thing is hard enough on the personal level - I still haven't sorted out how my grandchildren are going to be able to view the photos I'm taking now (print them out and put them in a cupboard like my grandparents did, probably) - so I can only imagine the complexity involved in trying to archive and identify the many millions of published documents.

    ~ drcraig

  3. Yeah, Constant makes a good point that digital mediums have no fossilised remains - it's a worrying thought that in a hundred years we might not be able to view the millions of photos on flickr etc - we'll have so little to show if everything is stuck on computers we can no longer access.
    That's the good thing about analog technology I guess - negatives, film strips, vinyl.
    Interestingly, this is exactly what I'm basing my current Culture project on - this move to more digital technology that might not be as long lasting.

    Oh, and my way to combat the fact that in yeeeaaaarrrs to come blipfoto might be long gone, I started a visual diary this year - one photo a day stuck on a page. Sor'ed.

    ~ paulransom

  4. There is a saying in the Finnish language, "pudota kelkasta"; the word-for-word translation would be "fall from the sledge". I just fell.

    ~ nina54

  5. Andy..................................... you are your Father's son.

    ~ jamjaragain

  6. Found blipfoto for the first time after reading your efoundations blogpost this afternoon, thanks for that, it looks great! Trying to convince institutions, higher education and cultural, of the need for persistent identifiers isn't easy, keep up the fight!

    ~ newfolder

  7. I have just read it .... but I think I'll have to try again tomorrow!

    ~ jayneebevan

  8. Are you glad you got all that off your chest now? I hope you went away and had a stiff drink, because that's what I'm about to do!

    Funnily enough, I nearly blipped a railway bridge URI the other day!

    ps yes, the card arrived on time; and how did your appointment go?

    ~ nicky

  9. there are signs near these that ask you to call in the event of an accident involving a bridge - i guess the identifiers are used then. you'll see similar identifiers all along the sides of motorways and all our roads have numbers. i suspect there are plenty of other examples of identifiers used (and maintained) by society for its benefit (albeit created by a single organisation and, as you suggest, probably not widely shared).

    i don't know if there was a long debate in a smokey back room at Paddington Station, where men in top hats discussed the form of the bridge identifier for days, but i suspect it more likely someone made a unilateral decision.

    the Web's blessing, and its curse, is that it is something of a free for all and i wonder does this fact prevent the people-based commitment to persistence you identify? (for i'd argue that the bridge identifiers persist on account of an organisation rather than people in general).

    would it be better to hand Web identifiers to a single, unilateral authority that provides similar organisation and thus ensures the public good of Web identifiers? and is that what DOI (misguidedly though not using http URIs) is attempting?

    ps. sorry to bring work to blipfoto - you started it! :-) i'll get my coat now.

    also hope your ears are better!

    ~ pixelatedpete

  10. Fascinating read - really. I think I have learnt something today. And you were right about my proposal I was writing. It only came together two days before hand in!

    ~ seaurchin

  11. Never mind ears, I need an eye appointment after reading all of that! It started off in English, then turned into gobble-dy-gook language for a while, before ending with a paragraph in English again! The title makes sense now! :-)

    ~ Fi
