Sunday, 5 October 2008

The Internet doesn't know everything any more...

...not that it ever did, but for a long time there were lots of people like me around the world, spending time getting as much information into the Internet as possible (with the added aim of making as much of it as accessible as possible). But now there are gaps. There have always been gaps, but where they existed and mattered to anyone, they were filled with information by enthusiasts of model trains, language quirks (big nod to Tylor Jones and his June29 language site), research papers (citeseer) and a whole host of things that most people didn't know existed til then. But that doesn't happen so often anymore. I was an early adopter of the Internet (see under Geek Code) but I think I blinked, and in the time that I was blinking, the Internet went from being an underground collective dedicated to freedom of information to being a series of newsletters, with editors and rules and regulations.

I think I've been aware of this change happening, but it really hit home last week. I tried to look up my road race times over the last few years. I run - slowly but steadily - and I needed proof of that for a race entry. And so I looked myself up, in the comfortable expectation of finding at least 10 years of staggering round courses all over the country at just under 10kph. And nothing. Or nearly nothing.

I've run some very big races: the Great South, Great North, London; and some very small races: little local things that one man and his dog turned out to watch (and the dog was more interested in the local rabbits). And I thought, naively, that the times and field for each of these races would still be held somewhere on the 'net, easily accessible for me to check whether I came in at 6704th or 6705th. Nope. Not there. Now back in the day, runners were an enthusiastic bunch: you ran, the results got posted and stayed on the net for you to look at later. Not any more.

I did find a couple of little local results back from -ahem- 1999, which was nicely parish-magazine. But I was quite shocked at the lack of big-race data. F'instance: the Great X series. One year's data only. Now given how much it costs to store data for a 10000-person race (name and time) compared to even a small video... heck, let's do the thought experiment on that one here... a race field of 10000 people: give them 30 bytes each for their name and two bytes for the time they finish in: I make that 320,000 bytes. Now compare that with a single 480*640 pixel digital camera image: 307,200 bytes in black-and-white (at one byte per pixel), and three times that for colour (each pixel in a typical colour image has one byte each for red, green, blue). So if storage isn't the issue, what is? Is it access costs? Publication rights? If the race organisers put up the results for each year, they must actively remove the old ones every year, and I'm puzzled about why. I mean, it's bad enough with people selling rights to view public information without that information being completely removed as well. What's going on, folks?
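For anyone who wants to check my sums, here is the thought experiment as a few lines of Python. It's a rough sketch only: it assumes uncompressed storage at one byte per name character, and the function names (results_size, image_size) are made up for the occasion.

```python
# Back-of-envelope storage comparison from the thought experiment above.
# Assumptions: uncompressed data, one byte per name character,
# one byte per pixel per colour channel.

RUNNERS = 10_000      # size of the race field
NAME_BYTES = 30       # bytes allowed for each runner's name
TIME_BYTES = 2        # bytes allowed for each finishing time
WIDTH, HEIGHT = 640, 480  # digital camera image size in pixels


def results_size(runners=RUNNERS, name_bytes=NAME_BYTES, time_bytes=TIME_BYTES):
    """Bytes needed to store one name and one time per runner."""
    return runners * (name_bytes + time_bytes)


def image_size(width=WIDTH, height=HEIGHT, bytes_per_pixel=1):
    """Bytes needed for an uncompressed image (1 = greyscale, 3 = RGB)."""
    return width * height * bytes_per_pixel


if __name__ == "__main__":
    print("race results:", results_size(), "bytes")
    print("greyscale image:", image_size(), "bytes")
    print("colour image:", image_size(bytes_per_pixel=3), "bytes")
```

So a whole year's results for a big race take about the same space as one small greyscale snapshot, and about a third of a colour one.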

Monday, 16 June 2008

SQL, the -erm- sequel

And so to SQL. I've found a lovely little site called SqlZoo to brush the rust off my SQL with: it's very enthusiastic children's television meets quite gentle learning. I like. Thank you chaps.

Blog tracking

Well, I did try Google Analytics, but didn't really get on with it. So now it's time to play with IceRocket's tracker; link courtesy of a visibly shrinking (courtesy of the diet) Gemma. It will be interesting to see what happens, and whether I really do have an audience of 1.

Saturday, 7 June 2008

c sharp

Okay, time to learn C#. So far, all I know about it is that it's a C++ and Java-like hybrid that's nanny-like in its control of variable use (in Fortran, implicit none was a good idea; removing the ability to shadow global variables with local ones seems to be very market-specific, i.e. if you have to hack in this language, it's going to be even messier than usual).

Anyway, onwards: onto a tutorial or two, then back to TopCoder to test out my new skills...

Friday, 30 May 2008

Teeny weeny 'puter things

Gad but I have a cool job sometimes. Right now, it's playing with teeny-tiny components, some of which I want to remember here (just in case I feel like building some strange bots of my own). So... cool little gumstix computer, C328RS camera, and a cute little 5dof imu. Not that this is fun or anything, but... well, it's fun okay. Although the soldering-related screaming and sticking-plaster bonanza are bound to start soon...

Monday, 26 May 2008

First python

Okay, so I didn't break into Google (not in the conventional sense anyways. There was this misunderstanding with an oversimple instruction and an unlocked door, but we'll gloss over that...), but I did get the building-stuff bug back again, which can only (as long as you don't have to share any space with me when I'm coding at 3am) be good.

So. Today. My first Python program. Had bought the book and skimmed the language ready for the above, and was just having a snuffle round the big G's website when I fell over their coding pages. And the Google App Engine. It's a cute little beastie (so far), but we'll see how I do when I get past the 'Wahey! Hello World!' stage. And I really need to sort my path out asap; it's not good typing in the whole thing every time...

Things to remember include: http://localhost:8080/
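For the record, the 'Wahey! Hello World!' stage looks roughly like the sketch below. This is not the App Engine framework itself, just a plain WSGI app from the Python standard library (wsgiref) that answers on the same local address; hello_app and serve are names invented for this example.

```python
# A minimal "Hello World" web app using only the standard library.
# Assumption: this stands in for the App Engine development server;
# it is a generic WSGI sketch, not App Engine's own API.
from wsgiref.simple_server import make_server


def hello_app(environ, start_response):
    """WSGI application: reply to every request with a plain-text greeting."""
    body = b"Hello World!"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(port=8080):
    """Serve hello_app on http://localhost:<port>/ until interrupted."""
    with make_server("localhost", port, hello_app) as httpd:
        httpd.serve_forever()
```

Calling serve() and then visiting http://localhost:8080/ in a browser should show the greeting; Ctrl-C stops it.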

Monday, 5 May 2008

What do Microsoft need now?

I'd never really thought about the internet advertising market til the Microsoft-Yahoo thing came up. I mean, I know that when I search on G, a load of adverts appear above and beside my search results and there's an order to the search results that may or may not have a commercial bias (at least it will if the companies are canny about things like keywords and Dmoz), but I'd never really thought about how they got there and what it means.

So. The Microsoft-Yahoo marriage is off, the bride's run away, the Maid of Honour (News Corp) won't be forgiven because she took her dress off and played best man instead, and the groom's still on the hunt for a pretty girl to hook up with (but choose wisely my friend; divorce costs). Except it's all gone a bit Babylonian bride market.

So again. If Microsoft wants to take on the internet advertising market, what does it need? In one word: eyeballs. In more words: eyeballs, search histories, processing and trust. And that's the hard part. People stay in business because they understand transactions (or because they genuinely do have a product that a) people want and b) not many other people can provide, but that's rare). And there are at least three sets of people in these transactions: the companies paying to gain access to people who might want to view their sites, the viewers, and the matchmakers (staying on the marriage theme for a moment) between them.

Google's internet advertising (and this is, at base, all about Google) works because it brings in an incredible volume of traffic (through both search and all the other tools it punts, e.g. Google Earth), knows what that traffic thinks it wants (search histories and profiles) and doesn't chuck too many unsuitable matches at the user (does Google have a dating service? Maybe it should... silly me; it's horrifying). Microsoft wants to match this. Although it could be argued that a more lateral approach than building its own Google would be more sensible, I'll indulge that idea for a moment. So who would I buy if I were Steve B at the moment?

Well. Potentially Murkysoft has its own eyeballs and processing, but not much in the way of search histories or trust. So a search engine or information site, preferably something large and generic (Ask? Dogpile?) or failing that a set of more topic-specific ones (Kelkoo? Multimap? About?). Trust is a difficult one; to pull this off, Microsoft would either have to get its own teams working closer together (it seems it suffers from the same internal markets that made NASA fail) or front its efforts through a more-trusted third party (in a hand-sweepy way, nobody trusts big business but everyone likes a friendly puppy). It has enough access to ideas (through its own research teams and university links); it just needs the front, the stringy bit that holds them together. Me, I'd be kicking myself for not having started a small esoteric search site, if I hadn't been doing so many cool things instead. But it will be fun to watch all the other girls applying their makeup and practicing their winks. I just hope they don't fight too badly if the answer is a harem.