Jan 08 2006

Of the use of standards: a look at the Flickr API authentication mechanism

Filed under: Written in English | hugo @ 20:25

I wrote my first Flickr desktop application, Offlickr, but something has been bothering me in the way the authentication is done.

The Flickr Authentication API Desktop Applications How-To explains how user authentication works. Basically, an application has an API key, which is sent as the api_key parameter, and a shared secret, which is used to sign every call; the signature is sent as the api_sig parameter. If a call needs to be authenticated, it also carries an authentication token (which Flickr gives you when you, as a user, authorize an application to access your data) as the auth_token parameter, and again the application signs the whole thing with the shared secret.
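To make the signing step concrete, here is a minimal sketch in Python. It assumes the scheme as I understand it from the Flickr documentation: the parameters are sorted by name, concatenated as name followed by value, prefixed with the shared secret, and hashed with MD5. The function names and the sample values are placeholders of mine, not part of any toolkit.

```python
import hashlib


def sign_call(shared_secret, params):
    """Return the api_sig for a call, assuming the MD5-based scheme
    (shared secret + sorted name/value pairs) described in the lead-in."""
    # Sort by parameter name, then glue each name to its value.
    pairs = "".join(name + params[name] for name in sorted(params))
    payload = shared_secret + pairs
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def signed_params(shared_secret, params):
    """Return a copy of params with the api_sig parameter added."""
    out = dict(params)
    out["api_sig"] = sign_call(shared_secret, params)
    return out


if __name__ == "__main__":
    # Placeholder values for illustration only.
    call = signed_params("SECRET", {
        "api_key": "0123456789abcdef",
        "method": "flickr.photosets.getList",
        "auth_token": "token-issued-by-flickr",
    })
    print(call["api_sig"])
```

Note that the authentication token is just one more parameter here: it is covered by the signature, but it still travels in the clear with the rest of the query.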

A closer look at the authentication procedure

So the security relies on the api_sig signature, i.e. on the shared secret. Only the application with the shared secret can generate the right signature to do a call.

However, replay attacks are easy: everything is contained in the query parameters. Just do the HTTP request again. This is probably why people use HTTP POST instead of HTTP GET even for things like getList: to try to hide what’s going on and not have the authentication token show up in a log or in a Web cache.

In addition, someone who obtains somebody’s authentication token (e.g. from a Web proxy log file) could reuse it with the same application (after all, the api_key identifies the application) and pretend to be that person. But this limits the attacker to using one particular application, since the so-called shared secret is needed to sign the HTTP requests and therefore prevents arbitrary operations.

This is where another problem lies: the shared secret is not so secret. It doesn’t appear on the wire, but if you have an open-source application, you can easily find it in the source. If the code is compiled, maybe the developer can hide it from the public, but I was writing a program in Python. I spent some time wondering whether or not one should include the secret in the code, but looking at other programs, the answer seems to be that shared secrets are always included, as you don’t want everybody to have to register the program with Flickr before they start using it.

So, if you can get hold of somebody’s authentication token for, say, an uploader tool whose shared secret you can also get (e.g. because you have the source code), then you have full read-write access to that person’s account, since the uploader has read-write access. And again, the authentication token is sent in the clear on the wire, so you just need to go to a conference, snoop on the WiFi network, wait for somebody to use the uploader, and bingo.

The real secret here, since you already have the API key and the shared secret of the application, is therefore the authentication token that the user has gotten from Flickr. This is a piece of information that only the user who authorized the application has. It stays a secret until it gets on the wire as a query parameter. Well, it’s currently returned in a plain HTTP response, so one could technically get hold of it when the token is issued, but that’s a minor problem as it could easily be fixed by having flickr.auth.getToken use SSL.

The question is therefore: why send the authentication token on the wire then? As it seems to be the real secret, it should be kept hidden.

I am assuming that Flickr doesn’t want to use SSL as it would be expensive CPU-wise. It seems that a stronger authentication technique could be used fairly easily and would improve security greatly.

One way to keep this authentication token secret would be to use it to sign the call, i.e. have a parameter auth_sig, akin to api_sig, to hold a signature generated with the authentication token.
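A rough sketch of what such an auth_sig could look like follows. Everything in it is hypothetical: Flickr has no auth_sig parameter, and the choice of HMAC-SHA1 keyed with the token is my own substitution, since the idea above does not name an algorithm. The sketch also leaves out how the server would know whose token to verify against, and a real design would want a timestamp or nonce among the signed parameters so that a captured request cannot simply be replayed.

```python
import hashlib
import hmac


def auth_sig(auth_token, params):
    """Hypothetical auth_sig: HMAC-SHA1 over the sorted name/value
    pairs, keyed with the authentication token (an assumption, not
    anything Flickr defines)."""
    message = "".join(name + params[name] for name in sorted(params))
    return hmac.new(
        auth_token.encode("utf-8"),
        message.encode("utf-8"),
        hashlib.sha1,
    ).hexdigest()


def build_call(auth_token, params):
    """Return the parameters to send: the token signs the call but is
    never added to the query itself."""
    out = dict(params)
    out["auth_sig"] = auth_sig(auth_token, params)
    return out


if __name__ == "__main__":
    # Placeholder values for illustration only.
    call = build_call("token-issued-by-flickr", {
        "api_key": "0123456789abcdef",
        "method": "flickr.photosets.getList",
    })
    print(sorted(call))  # the token does not appear among the parameters
```

With this shape, somebody snooping on the network sees a signature derived from the token, but never the token itself.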

Using standards

It may seem that I’m excessively picking on the Flickr API, when actually this is a general HTTP problem. After all, basic authentication in HTTP is weak, and the current practice of using cookies is no better; capturing just a little unencrypted HTTP traffic is guaranteed to give you access to something you shouldn’t have. Plus I like Flickr, which is why I started playing with it and looking at the API in detail; I’m sure similar comments could be made of others. Qui aime bien châtie bien ("who loves well chastises well"), as one would say in French.
However, it seems that, with an API, we have a greater power and therefore can achieve better solutions as we’re not limited to the solutions supported by the intersection of what the major browsers implement.

Another way to solve this problem, besides the auth_sig solution described above, would be to use the authentication token obtained from Flickr as the password in HTTP digest authentication (RFC 2617). It would even be standard, so people wouldn’t need to do much in their code.

I admit I’m a little surprised that, with all the standardization work happening on the Web, ad-hoc solutions are used. Digest authentication has failed to be adopted by browsers, so it sadly cannot be used for Web applications targeted at browsers (maybe that will change), but it certainly can be used in the context of APIs (e.g. in Python, we have urllib2.HTTPDigestAuthHandler, Jakarta has an AuthPolicy.DIGEST, etc.).
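As an illustration of how little client code digest authentication would take, here is a minimal sketch using urllib.request.HTTPDigestAuthHandler, the Python 3 counterpart of urllib2.HTTPDigestAuthHandler. It is purely hypothetical: it assumes an API endpoint that answers with a digest challenge, which Flickr does not offer as far as I know, and the URL, user name and token are placeholders.

```python
import urllib.request


def build_digest_opener(api_url, username, auth_token):
    """Build an opener that answers HTTP digest challenges for api_url,
    using the authentication token as the password (hypothetical setup)."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None means: use these credentials for any realm at this URL.
    password_mgr.add_password(None, api_url, username, auth_token)
    handler = urllib.request.HTTPDigestAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    # Placeholder endpoint and credentials; no request is made here.
    opener = build_digest_opener(
        "http://api.example.invalid/services/rest/",
        "some-user",
        "token-issued-by-flickr",
    )
    # opener.open(url) would then respond to a 401 digest challenge
    # with a hash derived from the token, not with the token itself.
    print(type(opener).__name__)
```

The point of the design is that the token only ever feeds a hash computed over a server-supplied nonce, so it stays off the wire and the library handles the whole exchange.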

Also, the Flickr API, like many others, exists in several flavors: REST, XML-RPC, and SOAP. But it’s basically the exact same messages, with a different envelope. It’s no wonder that people prefer REST APIs: they’re basically identical to the SOAP ones, except that you need fewer libraries in your code and your messages are simpler.

In this particular case, SOAP-based Web services bring you WS-Security that could solve this authentication problem nicely. Then the SOAP API would show the advantages of SOAP, and people may actually see the point of all this standardization (including Web services) work.


Dec 29 2005

Offlickr: backing up metadata and photos from Flickr

Filed under: Written in English | hugo @ 1:28

I started uploading my pictures to Flickr and spending time writing descriptions and titles and adding tags. Other people have also contributed by adding notes, etc. So I really want to be able to get this data back, as I don’t want to rely solely on Flickr to keep my data.

I also wanted to play with the Flickr API. I looked into the toolkits provided, and they are all built on the REST API. I ended up choosing Beej’s Python Flickr API. However, it uses an HTTP POST for all its calls. That is especially weird for Offlickr, which only performs a series of information retrieval calls (the permission requested is read-only). As there is some authentication information in the parameters sent, maybe it makes sense. It might have been better to use another authentication technique (digest auth?).

Anyway, I ended up with a small Python script which does the trick for now: Offlickr. I probably will improve it, but I now feel confident that I can get my data back from Flickr.

I called it Offlickr as once you have the metadata in XML form and the pictures, it should be fairly easy with some simple XSLT to have an offline copy of your Flickr space.