Simon Willison’s Weblog

October 2009

Oct. 17, 2009

Whenever you build a security system that relies on detection and identification, you invite the bad guys to subvert the system so it detects and identifies someone else. [...] Build a detection system, and the bad guys try to frame someone else. Build a detection system to detect framing, and the bad guys try to frame someone else framing someone else. Build a detection system to detect framing of framing, and well, there's no end, really.

Bruce Schneier

# 4:55 pm / bruce-schneier, security, framing

Oct. 19, 2009

Our industry has collectively taught average people over the last few decades that computers should be feared and are always a single misstep from breaking. We’ve trained them to expect the working state to be fragile and temporary, and experience from previous upgrades has convinced them that they shouldn’t mess with anything if it works. [...] The upgrade market for average PC owners is dead. We killed it.

Marco Arment

# 8:30 pm / marcoarment, upgrading, pc

This shouldn’t be the image of Hack Day

I love hack days. I was working in the vicinity of Chad Dickerson when he organised the first internal Yahoo! Hack Day back in 2005, and I’ve since participated in hack day events at Yahoo!, Global Radio and the Guardian. I’ve also been to every one of Yahoo!’s Open Hack Day events in London. They’re fantastic, and the team that organises them should be applauded.

[... 445 words]

Oct. 20, 2009

High-end Varnish-tuning. Tuning the Varnish HTTP cache to serve 27K requests/second on a single-core 2.2GHz Opteron.

# 9:25 am / varnish, performance, caching, http

With ubiquitous mobile broadband not far over the horizon, a hyper-connected society might also turn out to be a hyper-indignant one.

Martin Belam

# 3:27 pm / martinbelam

Oct. 21, 2009

Comcast: Twitter Has Changed The Culture Of Our Company. “Frank Eliason (@Comcastcares on Twitter) now has 11 people working under him simply to respond to information about Comcast being broadcast on Twitter.”

# 9:56 am / comcast, twitter, frank-eliason

You count the "value" that is lost by people who would have made money selling rival goods, but can't now because they can't compete with free. But you don't count the value that is created by people who build upon the freely given goods. [...] In other words, you only look at the first-order effects. It's the same mistake a lot of people make when they accuse open source developers of "dumping" and ruining the market for competing software. That's true, in a very narrow sense, but it ignores all the other people who took that software and used it to create something else of value.

Mark Pilgrim

# 9:59 am / mark-pilgrim, open-source, free

Introducing Cloudera Desktop. It’s a GUI for Hadoop, and under the hood is a whole stack of open source software, including Python, Django, MooTools, Twisted, lxml, CherryPy, Mako, Java and AspectJ.

# 6:48 pm / hadoop, open-source, cloudera, python, django, mootools, twisted, lxml, cherrypy, mako, java, aspectj

How We Made GitHub Fast. Detailed overview of the new GitHub architecture. It’s a lot more complicated than I would have expected—lots of moving parts are involved in ensuring they can scale horizontally when they need to. Interesting components include nginx, Unicorn, Rails, DRBD, HAProxy, Redis, Erlang, memcached, SSH, git and a bunch of interesting new open source projects produced by the GitHub team such as BERT/Ernie and ProxyMachine.

# 9:14 pm / github, scaling, nginx, unicorn, rails, drbd, haproxy, replication, redis, erlang, memcached, ssh, git, proxymachine, ruby, ernie

Introducing BERT and BERT-RPC. Justification for inventing a brand new serialisation protocol: Thrift and Protocol Buffers both use IDLs and code generation, XML “is not convertible to a simple unambiguous data structure in any language I’ve ever used” and JSON lacks support for unencoded binary data. The result is BERT—Binary ERlang Term—which extracts a format from Erlang in much the same way that JSON extracted one from JavaScript.

# 10:11 pm / protocolbuffers, json, erlang, javascript, serialisation, thrift, xml, github
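
The JSON complaint in the BERT entry above can be illustrated with a short sketch. It uses only Python's standard `json` and `base64` modules and does not implement BERT itself; the `payload` value and the `"data"` key are made up for illustration. It shows that raw bytes are rejected by the JSON encoder, and that the usual base64 workaround inflates the data by roughly a third.

```python
import base64
import json

# Hypothetical binary payload: every possible byte value once (256 bytes).
payload = bytes(range(256))

# JSON has no binary type, so the standard encoder refuses raw bytes.
try:
    json.dumps({"data": payload})
    raised = False
except TypeError:
    raised = True

# Common workaround: base64-encode the bytes into an ASCII string first.
encoded = base64.b64encode(payload).decode("ascii")
document = json.dumps({"data": encoded})

# The receiver has to know, out of band, that "data" is base64 and undo it.
decoded = base64.b64decode(json.loads(document)["data"])

print(raised, len(payload), len(encoded), decoded == payload)
```

A format with a native binary type avoids both the size overhead and the out-of-band agreement about which strings are really encoded bytes.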

Oct. 22, 2009

Why I like Redis

I’ve been getting a lot of useful work done with Redis recently.

[... 900 words]

Oct. 23, 2009

Remember when blogs were more casual and conversational? Before a post's purpose was to grab search engine clicks or to promise "99 Answers to Your Problem That We're Telling You You're Having". Yeah. I'd like to get back to that here.

Dan Cederholm

# 4:17 pm / dan-cederholm, blogging

Oct. 25, 2009

Bits of Evidence (via) A slide deck from Greg Wilson: “What we actually know about software development, and why we believe it’s true”.

# 12:13 pm / evidence, software-engineering, development, gregwilson

Play framework for Java. I’m genuinely impressed by this—it’s a full stack web framework for Java that actually does feel a lot like Django or Rails. Best feature: code changes are automatically detected and reloaded by the development web server, giving you the same save-and-refresh workflow you get in Django (no need to compile and redeploy to try out your latest changes).

# 11:21 pm / play, java, frameworks, web, django, rails

Twisted inlineCallbacks and deferredGenerator. inlineCallbacks is a brilliant (but seemingly under-promoted) feature of Twisted which uses the ability to return a value from a yield statement to make asynchronous callbacks look much more like regular sequential programming.

# 11:30 pm / python, twisted, async, callbacks, generators, yield
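
The trick described in the inlineCallbacks entry above, sending a value back into a generator at the point where it yielded, can be sketched with the standard library alone. This is not Twisted's implementation: `Pending`, `inline` and `fetch` are hypothetical stand-ins for a Deferred, the decorator and an asynchronous call, and error handling is left out.

```python
class Pending:
    """Hypothetical stand-in for an async result (not Twisted's Deferred)."""

    def __init__(self):
        self._callbacks = []
        self._done = False
        self._value = None

    def add_callback(self, fn):
        # Call immediately if the result is already known, else queue it.
        if self._done:
            fn(self._value)
        else:
            self._callbacks.append(fn)

    def fire(self, value):
        # Record the result and notify everyone who was waiting for it.
        self._done = True
        self._value = value
        for fn in self._callbacks:
            fn(value)


def inline(gen_func):
    """Drive a generator: each yielded Pending resumes it with its result."""

    def wrapper(*args):
        gen = gen_func(*args)
        final = Pending()

        def step(value):
            try:
                # send() makes `value` the result of the paused yield.
                pending = gen.send(value)
            except StopIteration as stop:
                final.fire(stop.value)  # the generator's return value
                return
            pending.add_callback(step)

        step(None)  # the first send(None) starts the generator
        return final

    return wrapper


requests = {}


def fetch(key):
    # Hypothetical async call: returns at once, result arrives later.
    requests[key] = Pending()
    return requests[key]


@inline
def total():
    a = yield fetch("a")  # reads sequentially, but pauses here
    b = yield fetch("b")
    return a + b


results = []
total().add_callback(results.append)
requests["a"].fire(2)  # simulate the first result arriving
requests["b"].fire(3)  # simulate the second result arriving
print(results)
```

The body of `total` contains no explicit callbacks; the driver in `inline` registers them behind the scenes each time the generator yields.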

Oct. 26, 2009

Toiling in the data-mines: what data exploration feels like. Useful advice from Tom Armitage on the exploratory development approach required when starting to build a project against a large, complex dataset. Tips include making sure you have a REPL to hand and using tools like gRaphael to generate graphs against pretty much everything, since until you’ve seen their shape you won’t know if they are interesting or not.

# 9:34 am / data, tom-armitage, repl, exploratoryprogramming, programming, graphael, graphing, berg

Django 1.2 planned features. The votes are in and the plan for Django 1.2 has taken shape - features are split into high, medium and low priority. There's some really exciting stuff in there - outside of the things I've already talked about, I'm particularly excited about multidb, Model.objects.raw(SQL), the smarter {% if %} tag and class-based generic views.

# 10:38 am / django, multidb, python, classbasedviews, orm

I was thinking the other day how long it had been since I used the acronym "IRL" or the expanded phrase "In Real Life." It used to be the thing we'd say when we meant "not on the internet", and I'm glad that it has become gradually obsolete over the years, now that the internet is accepted as part of life.

Meg Pickard

# 9:59 pm / megpickard, slang, irl, internet

Oct. 28, 2009

PostgreSQL 8.5 alpha 2 is out. “P.S. If you’re wondering about Hot Standby and Synchronous Replication, they’re still under heavy development and still (at this point) expected to be in 8.5.”—Hot Standby is PostgreSQL-speak for MySQL-style master/slave replication for scaling your reads.

# 9:02 am / scaling, postgresql, replication, hotstandby, databases

Underscore.js. A new library of functional programming primitives for JavaScript—each, map, all, any, inject, detect etc. Unlike some similar libraries this one doesn’t extend the built-in objects, instead opting to bind the new functions to the underscore symbol. A jQuery-style noConflict() option is available if even that is too much namespace pollution for you.

# 5:08 pm / underscore, javascript, documentcloud, functional, jquery, noconflict

JSLitmus. “A lightweight tool for creating ad-hoc JavaScript benchmark tests”. Includes an ingenious hack for graphing the results—it generates a Google Chart, then provides a TinyURL for viewing that chart in the future. The TinyURL is generated by pointing an inconspicuous iframe at the TinyURL API and letting the user copy-and-paste the resulting shortened URL directly out of the iframe.

# 5:11 pm / jslitmus, benchmarking, javascript, tinyurl, google-charts, iframes

Oct. 29, 2009

memcache-top. Useful self-contained Perl script for interactively monitoring a group of memcached servers.

# 8:32 am / perl, memcached, monitoring

Oct. 30, 2009

The Secret Identity of the Peep Show Tweeter. Like many others, I had assumed the Peep Show character accounts were “official”—especially when they started live-tweeting their thoughts in real time as the episodes were aired. Turns out it was actually a very clever fan.

# 6:46 pm / peepshow, twitter
