Sunday 5 October 2008

The Internet doesn't know everything any more...

...not that it ever did, but for a long time there were lots of people like me around the world, spending time getting as much information into the Internet as possible (with the added aim of making as much of it as accessible as possible). But now there are gaps. There have always been gaps, but where they existed and were important to someone or anyone, they were filled with information by enthusiasts of model trains, language quirks (big nod to Tylor Jones and his June29 language site), research papers (CiteSeer) and a whole host of things that most people didn't know existed until then. But that doesn't happen so often any more. I was an early adopter of the Internet (see under Geek Code) but I think I blinked, and in the time that I was blinking, the Internet went from being an underground collective dedicated to freedom of information to being a series of newsletters, with editors and rules and regulations.

I think I've been aware of this change happening, but it really hit home last week. I tried to look up my road race times over the last few years. I run - slowly but steadily - and I needed proof of that for a race entry. And so I looked myself up, in the comfortable expectation of finding at least 10 years of staggering round courses all over the country at just under 10kph. And nothing. Or nearly nothing. I've run some very big races: the Great South, Great North, London; and some very small races: little local things that one man and his dog turned out to watch (and the dog was more interested in the local rabbits). And I thought, naively, that the times and field for each of these races would still be held somewhere on the 'net, easily accessible for me to check whether I came in at 6704th or 6705th. Nope. Not there. Now back in the day, runners were an enthusiastic bunch: you ran, the results got posted and stayed on the net for you to look at later. Not any more.

I did find a couple of little local results back from -ahem- 1999, which was nicely parish-magazine. But I was quite shocked at the lack of big-race data. F'instance: the Great X series. One year's data only. Now given how little it costs to store data for a 10,000-person race (name and time) compared to even a small video... heck, let's do the thought experiment on that one here... a race field of 10,000 people: give them 30 characters each for their name and two bytes for the time they finish in: I make that 320,000 bytes. Now compare that with a single 480*640 pixel digital camera image: 307,200 bytes in black-and-white (at one byte per pixel), and three times that for colour (each pixel in a typical colour image has one byte each for red, green and blue). So a whole race's results take up about the same space as one small black-and-white photo. So if storage isn't the issue, what is? Is it access costs? Publication rights? If the race organisers put up the results for each year, they must actively remove the old ones every year, and I'm puzzled about why. I mean, it's bad enough with people selling rights to view public information without that information being completely removed as well. What's going on, folks?
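
For anyone who wants to check my sums on that thought experiment, here's a rough sketch in Python. The sizes are just the guesses from the paragraph above (one byte per character of a name, two bytes for a finishing time, one byte per pixel per colour channel, no compression), and the function names are made up for illustration; real results files and real camera images will differ.

```python
# Back-of-envelope storage comparison. Every size here is an assumption
# taken from the thought experiment, not a measurement of real files.
RUNNERS = 10_000
NAME_BYTES = 30   # assumed: 30 characters per name, one byte each
TIME_BYTES = 2    # assumed: finishing time stored in two bytes

WIDTH, HEIGHT = 480, 640  # pixel dimensions of the example image


def race_results_bytes(runners=RUNNERS):
    """Bytes needed to store a name and a time for every runner."""
    return runners * (NAME_BYTES + TIME_BYTES)


def image_bytes(channels):
    """Bytes in an uncompressed image with one byte per pixel per channel."""
    return WIDTH * HEIGHT * channels


if __name__ == "__main__":
    print("race results:", race_results_bytes(), "bytes")
    print("black-and-white image:", image_bytes(1), "bytes")
    print("colour image:", image_bytes(3), "bytes")
```

Even on these crude numbers, a full year's results for a big race come out at roughly the size of one small uncompressed photo.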