May 09 2008

Looking for HTTP / Web services people

Filed under: Written in English | hugo @ 18:18

My group at Yahoo! is looking for talented people to join our team. We build tools and infrastructure for developing Web services in the company (the HTTP kind, not the SOAP one), and we also set standards and provide guidance to developers when designing them. We are part of the Yahoo! Open Strategy group.

We’re looking for different profiles:

  • People with knowledge of Web services technologies and concepts: XML, JSON, HTTP, resources, etc.
  • Developers coding in C/C++ who understand HTTP; knowing how to write PHP, Perl, or Java extensions is a plus
  • Developers coding in C/C++ who understand HTTP and in particular authentication; knowledge of Apache internals is a plus
  • A product manager to own those tools and work with the rest of the company

If you’d like to join the fun, drop me an email.

If you want to learn more about Yahoo! Open Strategy, here’s the presentation from our CTO at the Web 2.0 Expo:

And below is a deeper look provided by Neal Sample:


Apr 30 2007

Some activity on the HTTP Web services front in the .Net framework

Filed under: Written in English | hugo @ 6:11

Omri Gazitt writes about the new .Net framework:

Deeper Support for the Web
A big focus area has been to support more web protocols and formats out of the box.  In .NET FX 3.0, we focused on nailing some key enterprise scenarios, like reliable exchange, transaction flow, end-to-end security, and queuing transports, to name a few.  For .NET FX 3.5, we offer some nice features for public web services as well:

  • Syndication: we have some classes for publishing and consuming RSS and Atom feeds.  These formats (Atom especially) are quickly emerging as payload formats for all kinds of schematized data, not just blog entries or newsfeeds.  You can use our classes independently from WCF for a simple OM on top of either format, and we integrate into WCF by implementing the obvious serialization interfaces, so that you can pass SyndicationFeeds into and out of service operations.  I used an early version of this feature to create WCF-based RSS/Atom endpoints for my dasBlog instance.
  • webHttpBinding: a new standard binding that has all the right defaults for “web” services - including support for GET as a verb, and a “bare” encoding (losing the SOAP envelope so that you have a POX message).  Stay tuned, beta2 has MUCH more in store for the Web programmer… including deeper support for URL-based dispatch, more declarative support for GET and other verbs, and support for content-types beyond text/xml :-)
  • […]

It looks like Microsoft is going to come and play in the world of non-SOAP-based Web services. They’re not giving up on SOAP (which is not surprising), but are starting to give more credit to alternative ways of doing things, as they should.

Something which is very interesting to me is the following: deeper support for URL-based dispatch. Part of the whole EPR / reference properties discussion a couple of years ago was, according to their proponents (which included Microsoft), about dispatch: dispatching on XML elements was supposedly much better, and the EPR address alone was basically not enough. And now that they’re looking beyond SOAP, guess what, it looks like the URL (the EPR’s address) may be useful for dispatching in the end. Since this is presented as a binding, it could presumably be used with their SOAP stuff as well. Well, this is supposing that my interpretation of binding is right, since binding is one of the most overused terms in Web services land. I’m staying tuned indeed for this one!

I have to admit that I love the quotes around Web. Since (SOAP) Web services don’t have much to do with the Web in the end, or at least not the way they are being used (for example because of the EPR mess), “Web services” in the sense of services on the Web, the HTTP ones, cannot be referred to as Web services in the minds of people with a SOAP mindset; they have to be (insert air quotes here as you read this) “Web” services. Ironic, isn’t it?

Anyway, it’s good to see new tools coming for HTTP Web services. Read Omri’s post for the full story.


Jan 08 2006

Of the use of standards: a look at the Flickr API authentication mechanism

Filed under: Written in English | hugo @ 20:25

I wrote my first Flickr desktop application, Offlickr, but something has been bothering me in the way the authentication is done.

The Flickr Authentication API Desktop Applications How-To explains how user authentication works. Basically, an application has an API key, which appears as an api_key parameter, and a shared secret, which is used to sign every call; the signature is included as an api_sig parameter. If a call needs to be authenticated, an authentication token (which Flickr gives you when you, as a user, authorize an application to access your data) is included as an auth_token parameter, and again, the application signs the whole thing with the shared secret.
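
To make this concrete, here is a minimal Python sketch of the signing step. It assumes the scheme as I understand it from the Flickr documentation (an MD5 hash over the shared secret followed by the parameter names and values, sorted by name); the API key, secret, and token values are made up, and sign_call is just a name I picked for illustration.

```python
import hashlib


def sign_call(params, shared_secret):
    """Return a copy of params with an api_sig parameter added.

    Assumed scheme: MD5 over the shared secret followed by every
    parameter name and value, sorted by parameter name.
    """
    base = "".join(name + str(params[name]) for name in sorted(params))
    digest = hashlib.md5((shared_secret + base).encode("utf-8")).hexdigest()
    signed = dict(params)
    signed["api_sig"] = digest
    return signed


# Made-up credentials, for illustration only.
SHARED_SECRET = "not-so-secret"
call = sign_call(
    {
        "method": "flickr.photosets.getList",
        "api_key": "0123456789abcdef",
        "auth_token": "made-up-token",
    },
    SHARED_SECRET,
)
```

Note that auth_token ends up in the query parameters as is: the signature proves knowledge of the shared secret, not of the token.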

A closer look at the authentication procedure

So the security relies on the api_sig signature, i.e. on the shared secret. Only the application with the shared secret can generate the right signature to do a call.

However, replay attacks are easy: everything is contained in the query parameters. Just do the HTTP request again. This is probably why people use HTTP POST instead of HTTP GET even for things like getList: to try to hide what’s going on and not have the authentication token show up in a log or in a Web cache.

In addition, one can get somebody’s authentication token somewhere (e.g. in a Web proxy log file), and then could reuse the same application (after all, the api_key identifies the application) and pretend to be this somebody. But this limits the attacker to using one particular application, as the so-called shared secret prevents you from doing arbitrary operations as you need it to sign the HTTP requests.

This is where another problem lies: the shared secret is not so secret. It doesn’t appear on the wire, but if you have an open-source application, you can easily find it in the source. If the code is compiled, maybe the developer can hide it from the public, but I was writing a program in Python. I spent some time wondering whether or not one should include the secret in the code, but looking at other programs, it seems that shared secrets are always included, as you don’t want everybody to have to register the program with Flickr before they start using it.

So, if you can get a hold of somebody’s authentication token for say an uploader tool for which you can get the shared secret (e.g. because you have the source code), as the uploader will have read-write access, then you have full read-write access on somebody’s account. And again, the authentication token is shown in the clear on the wire, so you just need to go to a conference and snoop on the WiFi network to wait for somebody to use the uploader for example, and bingo.

The real secret here, since you already have the API key and the shared secret of the application, is therefore the authentication token that the user has gotten from Flickr. This is a piece of information that only the user who authorized the application has. It stays a secret until it gets on the wire as a query parameter. Well, it’s currently returned in a plain HTTP response, so one could technically get hold of it when the token is issued, but that’s a minor problem as it could easily be fixed by having flickr.auth.getToken use SSL.

The question is therefore: why send the authentication token on the wire then? As it seems to be the real secret, it should be kept hidden.

I am assuming that Flickr doesn’t want to use SSL as it would be expensive CPU-wise. It seems that a stronger authentication technique could be used fairly easily and would improve security greatly.

One way to keep this authentication token secret would be to use it to sign the call, i.e. have a parameter auth_sig, akin to api_sig, to hold a signature generated with the authentication token.
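
A rough sketch of what such an auth_sig could look like in Python. Everything here is hypothetical: Flickr has no auth_sig parameter, the choice of HMAC-SHA1 is mine, and the user_id parameter is only there because the server would need some non-secret way to find which token to check the signature against.

```python
import hashlib
import hmac


def sign_with_token(params, auth_token):
    """Add a hypothetical auth_sig parameter; the token itself is never sent."""
    # Same canonical form as for api_sig: names and values sorted by name.
    base = "".join(name + str(params[name]) for name in sorted(params))
    sig = hmac.new(
        auth_token.encode("utf-8"), base.encode("utf-8"), hashlib.sha1
    ).hexdigest()
    signed = dict(params)
    signed["auth_sig"] = sig
    return signed


# Made-up values, for illustration only.
request = sign_with_token(
    {
        "method": "flickr.photosets.getList",
        "api_key": "0123456789abcdef",
        "user_id": "hypothetical-user-id",  # non-secret lookup key (assumption)
    },
    auth_token="made-up-token",
)
```

The server, which issued the token in the first place, can recompute the same HMAC and compare; somebody snooping on the wire only ever sees the signature.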

Using standards

It may seem that I’m excessively picking on the Flickr API, when actually this is a general HTTP problem. After all, basic authentication in HTTP is weak, and the current practice of using cookies is no better; capturing just a little unencrypted HTTP traffic is guaranteed to give you access to something you shouldn’t have. Plus I like Flickr, which is why I started playing with it and looking at the API in detail; I’m sure similar comments could be made of others. Qui aime bien châtie bien (roughly: we are hardest on those we love), as one would say in French.
However, it seems that, with an API, we have a greater power and therefore can achieve better solutions as we’re not limited to the solutions supported by the intersection of what the major browsers implement.

Besides the auth_sig solution I described, another way to solve this problem would be to use the authentication token obtained from Flickr as the password in HTTP digest authentication (RFC 2617). It would even be standard, so people wouldn’t need to do much in their code.
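
To show why the token would stay off the wire, here is a minimal sketch of the response computation from RFC 2617 for the qop=auth case. The user name, realm, nonce, and token values are made up; only the formula comes from the RFC.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 of a string, as used throughout RFC 2617."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(user, realm, token, method, uri, nonce, nc, cnonce):
    """Compute the RFC 2617 'response' value for qop=auth.

    The Flickr authentication token plays the role of the password.
    """
    ha1 = md5_hex(f"{user}:{realm}:{token}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:auth:{ha2}")


# Made-up values, for illustration only.
response = digest_response(
    user="alice",
    realm="api.example.com",
    token="made-up-token",
    method="GET",
    uri="/services/rest/?method=flickr.photosets.getList",
    nonce="server-chosen-nonce",
    nc="00000001",
    cnonce="client-chosen-nonce",
)
```

Only the response hash goes into the Authorization header, and since the server picks the nonce, a captured request cannot simply be sent again once the nonce has changed.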

I admit I’m a little surprised that with all the standardization work happening on the Web, ad-hoc solutions are used. Digest authentication has failed to be adopted by browsers, so it sadly cannot be used for Web applications targeted to browsers (maybe that will change), but it certainly can be used in the context of APIs (e.g. in Python, we have urllib2.HTTPDigestAuthHandler, Jakarta has an AuthPolicy.DIGEST, etc.).
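
On the client side this would take very little code. Here is a sketch using the Python standard library; I’m writing it against urllib.request, which is where urllib2’s HTTPDigestAuthHandler lives in Python 3. The URL and user name are made up, and of course it assumes a server that answers with a Digest challenge, which the Flickr API does not do.

```python
import urllib.request

# Hypothetical endpoint, for illustration only.
API_URL = "http://api.example.com/services/rest/"


def build_digest_opener(api_url, user, token):
    """Return an opener that answers Digest challenges for api_url.

    The authentication token is used as the Digest password.
    """
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None means "whatever realm the server announces".
    passwords.add_password(None, api_url, user, token)
    handler = urllib.request.HTTPDigestAuthHandler(passwords)
    return urllib.request.build_opener(handler)


opener = build_digest_opener(API_URL, "alice", "made-up-token")
# A call would then look like:
#   opener.open(API_URL + "?method=flickr.photosets.getList")
```

The handler takes care of the challenge, the nonce, and the hashing, so the application code never has to build a signature by hand.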

Also, the Flickr API, like many others, exists in several flavors: REST, XML-RPC, and SOAP. But it’s basically the exact same messages, with a different envelope. It’s no wonder that people prefer REST APIs: they’re basically identical to the SOAP ones, except that you need to use fewer libraries in your code and your messages are simpler.

In this particular case, SOAP-based Web services bring you WS-Security, which could solve this authentication problem nicely. Then the SOAP API would show the advantages of SOAP, and people might actually see the point of all this standardization work (including Web services).