OEmbed is a pretty neat idea. A site that serves up content, like YouTube or Flickr, decides that it wants to let users elsewhere easily integrate with it. The site sets up an endpoint that accepts RESTful requests for information about its content and responds with JSON or XML metadata. This makes it a snap for web developers to integrate third-party content into their own sites.

So providers are responsible for an endpoint that, when given a URL and any optional parameters (format, maxwidth, etc.), returns an encoded response with metadata (title, author, thumbnail_url, etc.) and, in the case of the video and rich resource types, the HTML necessary to embed the object.
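
To make the request side concrete, here is a minimal sketch of how a consumer might build the request URL for a provider endpoint. The helper name build_oembed_request is made up for illustration, and the endpoint and page URL are the placeholder blah.com addresses used in this post, not a real provider.

```python
from urllib.parse import urlencode


def build_oembed_request(endpoint, url, **params):
    """Return the full request URL for an OEmbed endpoint.

    `url` is the page being asked about; `params` holds optional
    parameters such as format or maxwidth. (Hypothetical helper.)
    """
    query = {"url": url}
    # Leave out optional parameters that were not supplied.
    query.update({k: v for k, v in params.items() if v is not None})
    return endpoint + "?" + urlencode(query)


request_url = build_oembed_request(
    "http://www.blah.com/oembed/",  # placeholder endpoint
    "http://www.blah.com/pics/a-picture-of-something/",
    format="json",
    maxwidth=400,
)
print(request_url)
```

The consumer would then fetch that URL and decode the JSON or XML body to get at fields like title, author and thumbnail_url.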

Consumers work with the providers to parse URLs, match them against valid endpoints, and retrieve metadata.
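
On the consumer side, the matching step can be as simple as a hand-maintained table of URL patterns and endpoints. This is only a sketch under assumptions: the PROVIDERS table, the find_endpoint helper and the blah.com pattern are all invented for illustration.

```python
import re

# Hypothetical hand-maintained registry: (URL pattern, endpoint) pairs.
PROVIDERS = [
    (
        re.compile(r"^http://www\.blah\.com/pics/[^/]+/$"),
        "http://www.blah.com/oembed/",
    ),
]


def find_endpoint(url):
    """Return the endpoint whose pattern matches `url`, or None."""
    for pattern, endpoint in PROVIDERS:
        if pattern.match(url):
            return endpoint
    return None


print(find_endpoint("http://www.blah.com/pics/a-picture-of-something/"))
print(find_endpoint("http://www.example.org/unknown/"))
```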

So what's missing?

The discovery part of OEmbed is tricky. Currently the spec has providers put a link tag in the head of any HTML document where an oembed-able object lives. If I visited the page http://www.blah.com/pics/a-picture-of-something/, I would find this in the head of the HTML document:

<link rel="alternate" type="application/json+oembed"
    href="http://www.blah.com/oembed/?url=http%3A//www.blah.com/pics/a-picture-of-something/" />

I could extrapolate from this that the OEmbed provider lives at http://www.blah.com/oembed/ and that it seems to match URLs of pictures, but I have no way of knowing what other site content is provided or what URL patterns the provider will match. If I wanted to automate this kind of behavior for any link, I would have to fetch and scrape the page looking for the link tag pointing me to the OEmbed provider, which is not very efficient. Sure, I could pre-configure my consumer to work with the providers I know about, but if OEmbed takes off (as I hope it will!) that list will become difficult to maintain.
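
For reference, the scraping approach looks roughly like this. It is a minimal sketch that uses only Python's standard library and assumes the page HTML has already been downloaded. The class and function names are made up, and the sample page is the placeholder blah.com example from above.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Link types the OEmbed spec uses for discovery.
OEMBED_TYPES = {"application/json+oembed", "text/xml+oembed"}


class OEmbedLinkFinder(HTMLParser):
    """Collect the href of every OEmbed discovery link in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        if attrs.get("rel") == "alternate" and attrs.get("type") in OEMBED_TYPES:
            if attrs.get("href"):
                self.links.append(attrs["href"])


def discover_endpoints(html):
    """Return (full href, endpoint without query string) pairs."""
    finder = OEmbedLinkFinder()
    finder.feed(html)
    return [(h, urlsplit(h)._replace(query="").geturl()) for h in finder.links]


PAGE = """<html><head>
<link rel="alternate" type="application/json+oembed"
    href="http://www.blah.com/oembed/?url=http%3A//www.blah.com/pics/a-picture-of-something/" />
</head><body></body></html>"""

found = discover_endpoints(PAGE)
print(found)
```

Even in this small form, the consumer has to download and parse a whole HTML page per link just to learn where the endpoint is.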

A couple ideas on improving the spec

OpenSearch has sites put the following in their head element:

<link rel="search" type="application/opensearchdescription+xml" 
    href="/opensearch.xml" title="Flickr" />

Perhaps there could be a convention of putting OEmbed provider data in /oembed.xml at the site root, while allowing the location to be explicitly defined in the above manner. What sort of data would go in that XML file? A list of URL patterns, the endpoint(s) they map to, and the type of resource returned. As one of my coworkers pointed out, it might also be useful to specify a time-to-live for the oembed definition so consumers could update themselves. It could be something like:

<oembed expires="86400">
    <provider match="/pics/*/" endpoint="/oembed/" type="photo" />
    <provider match="/videos/*/" endpoint="/oembed/" type="video" />
</oembed>
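
A consumer could read a file like that with very little code. The sketch below assumes the proposed format exactly as written above, which is not part of any spec. It treats match as a shell-style wildcard on the URL path and expires as a number of seconds. Both are guesses at the semantics, and parse_definition and resolve are hypothetical names.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase
from urllib.parse import urljoin, urlsplit

SAMPLE = """<oembed expires="86400">
    <provider match="/pics/*/" endpoint="/oembed/" type="photo" />
    <provider match="/videos/*/" endpoint="/oembed/" type="video" />
</oembed>"""


def parse_definition(xml_text, site_root):
    """Return (ttl, rules) from the proposed /oembed.xml format."""
    root = ET.fromstring(xml_text)
    ttl = int(root.get("expires", "0"))  # assumed to be seconds
    rules = [
        # Endpoints are resolved relative to the site root.
        (p.get("match"), urljoin(site_root, p.get("endpoint")), p.get("type"))
        for p in root.findall("provider")
    ]
    return ttl, rules


def resolve(url, rules):
    """Return (endpoint, type) for the first rule matching the URL path."""
    path = urlsplit(url).path
    for pattern, endpoint, kind in rules:
        # Note: fnmatch's "*" also matches "/", so this is a loose match.
        if fnmatchcase(path, pattern):
            return endpoint, kind
    return None


ttl, rules = parse_definition(SAMPLE, "http://www.blah.com/")
print(ttl)
print(resolve("http://www.blah.com/pics/a-picture-of-something/", rules))
print(resolve("http://www.blah.com/about/", rules))
```

With something like this, a consumer fetches one small file per site, caches the rules until the time-to-live runs out, and never has to scrape individual pages.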

I'm very interested in seeing this technique get adopted by the sites we use. It seems like a lot of developer energy goes into building pipelines between disparate tech, creating unnecessary duplication of infrastructure. Content should only have to be created once, and how it is proliferated can be greatly simplified by standards like OEmbed. I've been working on an OEmbed provider/consumer and plan on open-sourcing it soon! In the meantime, enjoy one of my favorite videos:
