Simon Willison’s Weblog

May 2009

May 20, 2009

Google Maps Data API (via) I’m disappointed by this one—it’s really just a CRUD store for the KML files used in Google MyMaps. It would be a lot more useful if it let you perform geospatial calculations against your stored map data using some kind of query API—a cloud service alternative to tools like PostGIS.

# 9:07 pm / postgis, google-maps, google-maps-api, googlemapsdataapi, gdata, apis, gis, kml

Yahoo! Geo: Announcing GeoPlanet Data. The Yahoo! WhereOnEarth geographic data set is fantastic, but I’ve always felt slightly uncomfortable about building applications against it in case the API went away. That’s not an issue any more—the entire dataset is now available to download and use under a Creative Commons Attribution license. It’s not entirely clear what the attribution requirements are—do you have to put “data from GeoPlanet” on every page or can you get away with just tucking the attribution away in an “about this site” page? UPDATE: The data doesn’t include latitude/longitude or bounding boxes, which severely reduces its utility.

# 9:12 pm / attribution, creativecommons, data, geoplanet, geospatial, gis, whereonearth, yahoo

Yahoo! Placemaker. Really exciting new API from Yahoo!—Placemaker accepts a block of text (or a URL to HTML or RSS) and extracts and returns geographical locations mentioned in the text. I just ran my djng blog entry through it and it pulled out “Prague” as the only location mentioned. This should be really useful for adding geodata to existing textual content.

# 9:34 pm / yahoo, geo, gis, placemaker, geocoding

May 21, 2009

AWS Import/Export: Ship Us That Disk! Andrew Tanenbaum said “Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway”, and now you can ship your storage device direct to Amazon and have them load the data into an S3 bucket for you.

# 11:22 am / amazon, aws, s3, andrew-tanenbaum, bandwidth

If you review your first site version and don’t feel embarrassment, you spent too much time on it.

Reid Hoffman

# 9:56 pm / embarrassment, startups, reidhoffman

Working with Python and RabbitMQ. Nathan Borror eliminates the boilerplate needed to talk to RabbitMQ (or any other AMQP queue server) from Python.

# 11:10 pm / nathan-borror, rabbitmq, amqp, message-queues, python

TwitterAlikeExample—redis. Excellent example of how you design a moderately complex system against a scalable key-value store (in this case redis). Most “how to build Twitter” code examples fail to address the hard problem of scaling user inboxes, but this one tackles it head on.
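
One common answer to the inbox problem is fan-out on write: each new post ID is pushed onto a capped list for every follower, so reading a timeline is a single list fetch. The sketch below uses plain Python dicts and deques as a stand-in for the key-value store. It is an assumption about the general technique rather than a description of the linked example, and `TinyTimeline` and its method names are hypothetical.

```python
from collections import defaultdict, deque
from itertools import count


class TinyTimeline:
    """In-memory stand-in for a key-value store holding per-user inboxes."""

    def __init__(self, inbox_limit=800):
        self.posts = {}                     # post_id -> (author, text)
        self.followers = defaultdict(set)   # author -> users following them
        # Each inbox is a capped list of post IDs, newest first.
        self.inboxes = defaultdict(lambda: deque(maxlen=inbox_limit))
        self._ids = count(1)                # incrementing post ID source

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post(self, author, text):
        post_id = next(self._ids)
        self.posts[post_id] = (author, text)
        # Fan-out on write: copy the ID into every follower's inbox
        # (and the author's own), dropping the oldest entry when full.
        for user in self.followers[author] | {author}:
            self.inboxes[user].appendleft(post_id)
        return post_id

    def timeline(self, user, limit=10):
        # Reading is cheap: one inbox lookup, then one fetch per post.
        ids = list(self.inboxes[user])[:limit]
        return [self.posts[post_id] for post_id in ids]
```

The trade-off is that writes cost one push per follower, in exchange for reads that never need to merge many users' posts.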

# 11:14 pm / redis, keyvaluepairs, twitter, scaling

May 22, 2009

Dinky pocketbooks with WebKit transforms. Nat used 90 degree CSS transform rotations in print stylesheets for WebKit and Safari to create printable cut-out-and-fold pocketbooks from A4 pages. Very neat.

# 12:33 am / natalie-downe, css, csstransforms, webkit, safari, printstyles, rotation, pocketbooks

Fake Reviews. Now now kids, play nice... Not at all surprised to hear this—nefarious iPhone app developers (in this case the team behind “London Tube”, an inferior version of Malcolm Barclay’s marvellous “Tube Deluxe”) have been caught leaving fake negative reviews on rival applications in the App Store. This is an excellent argument for adding friends/followers or importing an existing social graph—I’d much rather see reviews from people in my social network than strangers who may turn out to be sock puppets.

# 12:49 am / socialgraph, iphone, apple, socialnetworks, malcolmbarclay, tubedeluxe, londontube, appstore, sockpuppets

Flickr Shapefiles Public Dataset 1.0. Another awesome Geo dataset from the Yahoo! stable—this time it’s Flickr releasing shapefiles (geometrical shapes) for hundreds of thousands of places around the world, under the CC0 license which makes them essentially public domain. The shapes themselves have been crowdsourced from geocoded photos uploaded to Flickr, where users can “correct” the textual location assigned to each photo. Combine this with the GeoPlanet WOE data and you get a huge, free dataset describing the human geography of the world.

# 6:12 pm / flickr, shapefiles, geospatial, geo, geoplanet, yahoo, maps, creativecommons, crowdsourcing

Muck Rack: Links posted by Guardian Journalists on Twitter. I’m rather impressed by the Sawhorse Media collection of Twitter aggregation sites (Muck Rack aggregates journalists)—a simple idea very well executed. Here’s a nice example—this page shows links posted to Twitter by known Guardian journalists, but goes a step further and scrapes in the favicon, the real title of the page and resolves the domain from any shortened links.

# 10:02 pm / twitter, aggregation, mashups, journalists, guardian, sawhorsemedia, muckrack, favicon

Introducing Yardbird. I absolutely love it: an IRC bot built on top of Twisted that passes incoming messages off to Django code running in a separate thread. Request and Response objects are used to represent incoming and outgoing messages, and Django’s regex-based URL routing is used to dispatch messages to different handling functions based on their content.
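
The regex dispatch idea can be sketched in a few lines of plain Python. This is not Yardbird's code and it uses neither Twisted nor Django: `route`, `dispatch`, and the message dictionary shape are all hypothetical names chosen for illustration.

```python
import re

ROUTES = []  # ordered list of (compiled pattern, handler) pairs


def route(pattern):
    """Register a handler for messages whose text matches `pattern`."""
    def register(handler):
        ROUTES.append((re.compile(pattern), handler))
        return handler
    return register


@route(r"^hello(?: (?P<name>\w+))?$")
def hello(message, name=None):
    # Greet the named person, or the sender if nobody was named.
    return "Hello, %s!" % (name or message["nick"])


@route(r"^add (?P<a>\d+) (?P<b>\d+)$")
def add(message, a, b):
    return str(int(a) + int(b))


def dispatch(message):
    """Send the message to the first handler whose pattern matches."""
    for pattern, handler in ROUTES:
        match = pattern.search(message["text"])
        if match:
            # Named groups become keyword arguments, as in URL routing.
            return handler(message, **match.groupdict())
    return None  # no handler claimed the message
```

Calling `dispatch({"nick": "nat", "text": "add 2 3"})` returns `"5"`, and a message that matches no pattern returns `None`.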

# 11:13 pm / irc, django, twisted, yardbird, threading, regular-expressions

May 23, 2009

JS-Placemaker—geolocate texts in JavaScript. Chris Heilmann exposed Placemaker to JavaScript (JSONP) using a YQL execute table. Try his examples—I’m impressed that “My name is Jack London, I live in Ontario” returns just Ontario, demonstrating that Placemaker’s NLP is pretty well tuned.

# 12:36 am / placemaker, yahoo, christian-heilmann, javascript, jsonp, yql, yqlexecute, nlp, geospatial, geocoding

iPlayer usage, for streaming, peaks about 10pm - just a little later from TV. But interestingly, iPlayer on the iPhone peaks at about midnight. So people are clearly going to bed with their iPhone and watching in bed. And we also see on the weekends, there's a peak of Saturday and Sunday morning usage at about 8 to 10am in the morning on iPhone.

Anthony Rose

# 12:42 am / iplayer, iphone, bbc

May 24, 2009

On the Anonymity of Home/Work Location Pairs. Most people can be uniquely identified by the rough location of their home combined with the rough location of their work. US Census data shows that 5% of people can be uniquely identified by this combination even at just census tract level (1,500 people).

# 1:14 pm / bruce-schneier, privacy, location, census

May 25, 2009

uuidd.py. Neat implementation of an ID server from Mike Malone—it serves up incrementing integers over a socket (using Python’s asyncore for fast IO) and records state to a file only after every 10,000 IDs served, so most of the time it’s not reading or writing to disk at all. If the server crashes it doesn’t matter because it can start up again at an integer it’s sure hasn’t been used before.
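
A minimal sketch of the block-allocation idea, leaving out the socket server and asyncore entirely. It is not Mike Malone's code: `BlockIdAllocator` is a hypothetical name, and this version reserves each block on disk before serving from it, which is one way to get the crash guarantee and may differ from the original.

```python
import os


class BlockIdAllocator:
    """Hands out increasing integers, touching disk once per block."""

    def __init__(self, state_path, block_size=10_000):
        self.state_path = state_path
        self.block_size = block_size
        # The file stores the upper bound of the last reserved block.
        # After a crash we resume from that bound, skipping any unused
        # IDs in the block rather than risking a duplicate.
        self.next_id = self._read_ceiling()
        self.ceiling = self.next_id

    def _read_ceiling(self):
        try:
            with open(self.state_path) as f:
                return int(f.read().strip() or 0)
        except FileNotFoundError:
            return 0  # first run: start counting from zero

    def _reserve_block(self):
        # Persist the new ceiling before handing out any ID below it.
        self.ceiling = self.next_id + self.block_size
        tmp_path = self.state_path + ".tmp"
        with open(tmp_path, "w") as f:
            f.write(str(self.ceiling))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.state_path)  # atomic swap

    def next(self):
        if self.next_id >= self.ceiling:
            self._reserve_block()
        value = self.next_id
        self.next_id += 1
        return value
```

With a block size of 10,000 this writes to disk once per 10,000 calls to `next()`. The cost of a crash is a gap of up to one block in the sequence, never a repeated ID.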

# 9:34 pm / uuid, idserver, python, mike-malone, asyncore, scaling

The Web vs. the Fallacies. Tim Bray on how the architecture of the Web helps developers handle the Fallacies of Distributed Computing.

# 11:49 pm / tim-bray, fallacies, web

May 27, 2009

Testing Django Views for Concurrency Issues. Neat decorator for executing a Django view under high concurrency in your unit tests, to help spot errors caused by database race conditions that should be executed inside a transaction.
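
The general shape of such a decorator can be sketched without Django. This is an illustration rather than the linked implementation, and `run_concurrently` is a hypothetical name: it calls the wrapped function from several threads at once and re-raises the first exception any of them hit.

```python
import threading
from functools import wraps


def run_concurrently(times):
    """Decorator: call the wrapped function `times` times in parallel threads."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            errors = []

            def call():
                try:
                    func(*args, **kwargs)
                except Exception as exc:
                    # Collect failures so the calling thread can report them.
                    errors.append(exc)

            threads = [threading.Thread(target=call) for _ in range(times)]
            for thread in threads:
                thread.start()
            for thread in threads:
                thread.join()
            if errors:
                raise errors[0]
        return wrapper
    return decorator
```

In a test, the decorated function would typically make a request to the view and the assertions would run after all threads finish. In a real Django test each thread opens its own database connection, which would need closing when the thread is done.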

# 10:01 am / django, testing, concurrency, raceconditions, python, threadsafety

geocoders. A fifteen minute project extracted from something else I’m working on—an ultra simple Python API for geocoding a single string against Google, Yahoo! Placemaker, GeoNames and (thanks to Jacob) Yahoo! Geo’s web services.

# 10:02 am / geocoders, github, projects, geocoding, placemaker, google, yahoo, geonames, jacob-kaplan-moss, python, web-services

You ask, they answer: Neal’s Yard Remedies. After reading the comments, something tells me Neal’s Yard Remedies may be regretting their decision to answer questions from Guardian readers.

# 10:35 am / guardian, nealsyardremedies, homeopathy

Dice-O-Matic hopper and elevator. An outstanding piece of applied geekery, now generating dice rolls for GamesByEmail.com. “It is a 7 foot tall, 104 pound, dice-eating monster, capable of generating 1.3 million rolls a day.”

# 7:32 pm / randomness, dice, gamesbyemail, random

May 28, 2009

Changes in Opera’s user agent string format (via) How depressing... Opera 10 will ship with 9.80 in the User-Agent string because badly written browser sniffing scripts can’t cope with double digits.
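
To illustrate the kind of sniffing bug involved, here is a sketch of a version check that only reads one digit. The function names and the shortened User-Agent strings are made up for the example and are not taken from any real script.

```python
import re


def naive_major_version(user_agent):
    """Buggy: captures a single digit, so version 10 is read as 1."""
    match = re.search(r"Opera/(\d)", user_agent)
    return int(match.group(1)) if match else None


def major_version(user_agent):
    """Captures every digit before the dot."""
    match = re.search(r"Opera/(\d+)", user_agent)
    return int(match.group(1)) if match else None


hypothetical_ua = "Opera/10.00 (Windows NT 6.0; U; en)"
shipped_style_ua = "Opera/9.80 (Windows NT 6.0; U; en)"

print(naive_major_version(hypothetical_ua))   # 1
print(major_version(hypothetical_ua))         # 10
print(naive_major_version(shipped_style_ua))  # 9
```

A script using the naive check would treat a browser announcing version 10 as version 1, while a string starting with 9.80 keeps such scripts working.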

# 1:16 am / opera, browsersniffing, browsers, useragent

Announcing Google Maps API v3. Sounds like a complete rewrite, with performance as the key goal. Only a developer preview at the moment, but my favourite feature is that API keys are no longer required.

# 1:22 am / google, api-keys, google-maps, googlemaps3, mapping

TiddlyPocketBook. Paul Downey took Nat’s dinky pocketbooks CSS and combined it with TiddlyWiki to create a single page pocketbook editor.

# 1:24 am / pocketbook, natalie-downe, paul-downey, tiddlywiki, css, javascript

optfunc. Command line parsing libraries in Python such as optparse frustrate me because I can never remember how to use them without consulting the manual. optfunc is a new experimental interface to optparse which works by introspecting a function definition (including its arguments and their default values) and using that to construct a command line argument parser. Feedback and suggestions welcome!
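
A rough sketch of the introspection idea, not optfunc's actual code or API: `build_parser` and `run` are hypothetical helpers, and the mapping rules (no default means positional, a boolean default means a flag) are assumptions made for this example.

```python
import inspect
from optparse import OptionParser


def _optparse_type(default):
    # Map a default value to one of optparse's built-in type names.
    if isinstance(default, int):
        return "int"
    if isinstance(default, float):
        return "float"
    return "string"


def build_parser(func):
    """Build an OptionParser from func's signature.

    Parameters without defaults become required positional arguments;
    parameters with defaults become --options (booleans become flags).
    """
    parser = OptionParser()
    positional = []
    for name, param in inspect.signature(func).parameters.items():
        if param.default is inspect.Parameter.empty:
            positional.append(name)
        elif isinstance(param.default, bool):  # check bool before int
            action = "store_false" if param.default else "store_true"
            parser.add_option("--" + name, dest=name, action=action,
                              default=param.default)
        else:
            parser.add_option("--" + name, dest=name, default=param.default,
                              type=_optparse_type(param.default))
    return parser, positional


def run(func, argv):
    """Parse argv according to func's signature, then call func."""
    parser, positional = build_parser(func)
    options, args = parser.parse_args(argv)
    if len(args) != len(positional):
        parser.error("expected arguments: " + " ".join(positional))
    return func(*args, **vars(options))


def upper(filename, verbose=False, repeat=1):
    return (filename.upper(), verbose, repeat)
```

With these definitions, `run(upper, ["notes.txt", "--verbose", "--repeat", "3"])` returns `("NOTES.TXT", True, 3)`. In a script, `argv` would be `sys.argv[1:]`.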

# 7:38 pm / optfunc, github, introspection, commandlines, optparse, projects, python

PostgreSQL Development Priorities. The top two for 8.4 are “Simple built-in replication” and “Upgrade-in-place”. Josh Berkus is seeking feedback on priorities for future work on 8.5.

# 8:08 pm / postgresql, replication, josh-berkus, databases, open-source

Perl 6: The MAIN sub (via) "Calling subs and running a typical Unix program from the command line is visually very similar: you can have positional, optional and named arguments." - that's exactly what I was thinking when I came up with optfunc.

# 9:32 pm / perl, optfunc, commandlines, perl6, python, unix

May 30, 2009

Knockbrex Castle. I’m off to a Scottish castle with 11 fellow geeks for /dev/fort—offline for six days, back next Saturday.

# 10:51 am / devfort, castle, sedf, scotland, holiday, offline, knockbrex
