Simon Willison’s Weblog

11 posts tagged “metadata”

2025

RDF has the same problems as the SQL schemas with information scattered. What fields mean requires documentation.

There - they have a name on a person. What name? Given? Legal? Chosen? Preferred for this use case?

You only have one ID for Apple eh? Companies are complex to model, do you mean Apple just as someone would talk about it? The legal structure of entities that underpins all major companies, what part of it is referred to?

I spent a long time building identifiers for universities and companies (which was taken for ROR later) and it was a nightmare to say what a university even was. What’s the name of Cambridge? It’s not “Cambridge University” or “The university of Cambridge” legally. But it also is the actual name as people use it. [It's The Chancellor, Masters, and Scholars of the University of Cambridge]

The university of Paris went from something like 13 institutes to maybe one to then a bunch more. Are companies locations at their headquarters? Which headquarters?

Someone will suggest modelling to solve this but here lies the biggest problem:

The correct modelling depends on the questions you want to answer.

IanCal, on Hacker News, discussing RDF

# 6th September 2025, 6:41 am / hacker-news, metadata, rdf, sql

2019

Microbrowsers are Everywhere (via) Colin Bendell introduces a new-to-me term, “microbrowsers”, to describe the user-agents which hit websites to generate unfurled link previews in messenger apps. Twitter and Facebook first popularized them, but today you’re likely getting far more preview-generating traffic from chat clients such as iMessage, WhatsApp and Slack (which won’t execute script and ignore cookies, and hence won’t show up in Google Analytics). Lots of great tips here—one example: if you provide three og:image meta tags iMessage will render them as a collage.
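The og:image tip is easy to try out. Here is a minimal sketch that emits one og:image meta tag per image URL. The function name and the example.com URLs are my own placeholders, not anything from Colin's article, and how a given microbrowser lays the images out is entirely up to that client.

```python
from html import escape


def og_image_tags(image_urls):
    """Return one <meta property="og:image"> tag per URL, in the order given."""
    return "\n".join(
        f'<meta property="og:image" content="{escape(url, quote=True)}">'
        for url in image_urls
    )


# Hypothetical image URLs, purely for illustration.
tags = og_image_tags([
    "https://example.com/a.jpg",
    "https://example.com/b.jpg",
    "https://example.com/c.jpg",
])
print(tags)
```

The output would go in the head of the page. Escaping the URL matters because query strings containing & would otherwise produce invalid attribute values.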

# 18th December 2019, 8:32 am / 24-ways, metadata, urls

2010

Facebook’s Open Graph Protocol from a Web Developer’s Perspective. Best explanation I’ve seen yet of what the Open Graph protocol actually does. Add the RDFa-inspired metadata and a Like button to a standard web page representing a place, group, product, website or one of a limited set of other object types and people can “Like” it just like they might join a fan page within Facebook itself. You can then send news feed updates to all of that page’s subscribers. The bootstrapped metadata can then benefit other services as well.

# 26th April 2010, 1:21 pm / dare-obasanjo, facebook, metadata, opengraph, opengraphprotocol

WildlifeNearYou can now tag your Flickr photos for you. I’m really excited about this feature: if you opt in, WildlifeNearYou will now write name and Latin name tags to your Flickr photos after you’ve marked the species in the photo. This is even more interesting when you combine it with our suggest-a-species feature (the photo won’t get tagged until you’ve approved the suggestion). We also set the location on photos which don’t yet have one, but the real fun is the machine tags we’ve added, which allow developers to use the Flickr API to find photos by their WildlifeNearYou metadata (trip, species and place IDs). As a neat extra touch, the identifiers we use in the machine tags are the same as the ones used by our custom wlny.eu URL shortener, so it’s trivial to turn a machine tag into the URL for that page on the main site.

# 4th February 2010, 5:01 pm / flickr, machinetags, metadata, tagging, wildlifenearyou

2009

Revving up. Jeremy Keith advocates adding the revcanonical attribute to regular A elements as well as / instead of hiding it in the head of the document, following the microformats design principle that invisible metadata is less valuable than augmenting visible links. I’ve updated my shorten bookmarklet to handle this case.
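To illustrate the idea (this is not the bookmarklet itself, which is JavaScript), here is a rough Python sketch that collects rev="canonical" URLs from both link elements in the head and regular A elements in the body. The class and function names, and the example.com markup, are hypothetical.

```python
from html.parser import HTMLParser


class RevCanonicalFinder(HTMLParser):
    """Collect href values from <link> and <a> elements carrying rev="canonical"."""

    def __init__(self):
        super().__init__()
        self.short_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attrs = dict(attrs)
        # rev is a space-separated list of link types, so split before matching.
        rev_values = (attrs.get("rev") or "").lower().split()
        if "canonical" in rev_values and attrs.get("href"):
            self.short_urls.append(attrs["href"])


def find_short_urls(html):
    finder = RevCanonicalFinder()
    finder.feed(html)
    return finder.short_urls


# Hypothetical page with one invisible and one visible short link.
page = (
    '<head><link rev="canonical" href="https://example.com/s/1"></head>'
    '<body><a rev="canonical" href="https://example.com/s/2">short link</a>'
    '<a href="/other">unrelated</a></body>'
)
print(find_short_urls(page))
```

Treating both element types the same way means a page can keep the short URL as a visible link without a separate copy hidden in the head.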

# 12th April 2009, 12:29 pm / jeremy-keith, metadata, microformats, revcanonical

Specify your canonical. You can now use a link rel=“canonical” to tell Google that a page has a canonical URL elsewhere. I’ve run into this problem a bunch of times (on some sites it really does make sense to have the same content shown in two different places) and this seems like a neat solution that could apply to much more than just metadata for external search engines.
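As a sketch of how a consumer other than a search engine might use this, the snippet below reads the rel="canonical" link from a page and resolves it against the page's own URL. Falling back to the page URL when no canonical is declared is my assumption, not something the announcement specifies, and the names and example.com URLs are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attrs.get("href")


def canonical_url(page_url, html):
    """Return the declared canonical URL, or page_url if none is declared."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return page_url  # assumption: an undeclared page is its own canonical
    return urljoin(page_url, finder.canonical)


# The same content served at a second, hypothetical URL.
duplicate = '<head><link rel="canonical" href="/posts/1/"></head>'
print(canonical_url("https://example.com/archive/2009/posts/1/", duplicate))
```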

# 14th February 2009, 11:28 am / canonical, google, metadata, relcanonical, search-engines, seo, urls

2008

freebase-suggest (via) A jQuery plugin that performs auto-completion against the Freebase JSONP API, and allows the results to be limited to specific categories or subsets.

# 24th September 2008, 11:58 pm / autocomplete, freebase, freebasesuggest, javascript, jquery, jsonp, metadata

2007

Amazon SimpleDB overview. Attribute values are limited to 1,024 bytes; Amazon suggest that you store larger fields in S3 and use SimpleDB to query metadata about those objects.
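The suggested split can be sketched without touching either service. This example only decides where each field should live, using the 1,024 byte limit quoted above. The function name, the "_key" suffix and the object key scheme are all hypothetical, and no real SimpleDB or S3 calls are made.

```python
MAX_ATTRIBUTE_BYTES = 1024  # attribute value limit quoted in the overview


def split_item(item_id, fields):
    """Split fields into small queryable attributes and large blobs.

    Returns (attributes, blobs). Values that fit within the limit stay as
    attributes; larger values move to blobs, and the attribute set keeps a
    pointer to each blob so it can still be found by a metadata query.
    """
    attributes, blobs = {}, {}
    for name, value in fields.items():
        data = value.encode("utf-8")  # the limit is in bytes, not characters
        if len(data) <= MAX_ATTRIBUTE_BYTES:
            attributes[name] = value
        else:
            key = f"{item_id}/{name}"  # hypothetical object key scheme
            blobs[key] = data
            attributes[f"{name}_key"] = key
    return attributes, blobs


attributes, blobs = split_item("doc1", {"title": "Hello", "body": "x" * 2000})
print(attributes)
print(list(blobs))
```

In a real system the attributes dictionary would be written to SimpleDB and each blob uploaded to S3 under its key.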

# 14th December 2007, 11:39 am / amazon, metadata, s3, simpledb, web-services

Audio Fingerprinting for Clean Metadata. Last.fm have started using audio fingerprints to help clean up misspelled artists and duplicate track information.

# 13th September 2007, 5:46 pm / audio, audiofingerprinting, lastfm, metadata, mp3

Harper’s Magazine (via) The site with the best metadata on the Web just relaunched, with even MORE metadata.

# 14th April 2007, 12:05 am / harpers, metadata, paul-ford

2003

A better definition of Metadata

Ned Batchelder: Metadata is nothing new. Ned includes a far better definition of metadata than the standard “data about data” phrase:

[... 77 words]