07/16/08 :: [Other] Protocol Buffers :-)...
I have said it many times: the remoting bunch could circle for 50 years, perpetually come back to its roots, and then bitch about it. Of course the REST inquisition got totally offended that the Grand Priests of Google could be so RESTless, but how could you resist such arguments?
Protocol buffers have many advantages over XML for serializing structured data. Protocol buffers:
- are simpler
- are 3 to 10 times smaller
- are 20 to 100 times faster
- are less ambiguous
- generate data access classes that are easier to use programmatically
"XML for serializing structured data", yeah, that's really so remoting, using XML just to serialize data would be like using an airplane to drive (not fly) from point A to point B.
Less ambiguous? What's so ambiguous about XML? Have you tried to do this kind of thing with PBs?
Generate Data Access Classes...? I mean, you can't do that from an XML Schema Definition? How about SDO, DataSets... I could not find any reference to functionality equivalent to a "change summary" in Protocol Buffers.
Does anyone at Google understand what forward compatibility is, and how essential XML's extensibility is to connected systems?
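To make the forward-compatibility point concrete, here is a minimal sketch in Python, using only the standard library's `xml.etree.ElementTree`. The order documents, the element names, and the `read_order_v1` helper are all made up for illustration. The only point is that a consumer written against an older version of a document keeps working when a newer producer adds elements it has never heard of.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: v2 adds elements the v1 consumer knows nothing about.
ORDER_V1 = "<order><id>42</id><total>19.90</total></order>"
ORDER_V2 = (
    "<order>"
    "<id>42</id>"
    "<currency>EUR</currency>"
    "<total>19.90</total>"
    "<shipping><carrier>ACME</carrier></shipping>"
    "</order>"
)


def read_order_v1(xml_text):
    """Consumer written against v1: it only asks for the elements it knows."""
    root = ET.fromstring(xml_text)
    return {
        "id": int(root.findtext("id")),
        "total": float(root.findtext("total")),
    }


# The same v1 code reads both versions and simply ignores the additions.
print(read_order_v1(ORDER_V1))
print(read_order_v1(ORDER_V2))
```

The consumer addresses elements by name rather than by position, which is what lets the v2 additions pass through unnoticed.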
I came up with this little comparison table:
| Protocol Buffers | XML |
| --- | --- |
| are simpler | can be transformed |
| are 3 to 10 times smaller | can be queried (semantically, without full knowledge of the entire schema) |
| are 20 to 100 times faster | support a stream based processing model |
| | support "change summary" functionality |
| | can be validated (with a lot richer set of rules than generated classes) |
| | is extensible both at the schema and instance level |
| (in all fairness, it looks like they introduce a bit of that in PBs though it is not clear how from the introduction) | support forward compatible versioning schemes |
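Two of the XML properties in the table, querying without full knowledge of the schema and stream-based processing, can be sketched in a few lines of Python with the standard library's `xml.etree.ElementTree`. This is only an illustration under stated assumptions: the catalog document, its element names, and the `all_prices` and `total_streaming` helpers are hypothetical, not taken from any real system.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document: <item> elements sit at different depths, inside
# containers (<section>, <shelf>, <promo>) the consumer does not need to know.
DOC = """<catalog>
  <section name="books">
    <item><title>XML in a Nutshell</title><price>30</price></item>
    <shelf><item><title>RESTful Web Services</title><price>40</price></item></shelf>
  </section>
  <promo><item><title>Sticker</title><price>1</price></item></promo>
</catalog>"""


def all_prices(xml_text):
    """Query: every <price> under an <item>, wherever the item is nested."""
    root = ET.fromstring(xml_text)
    return [int(price.text) for price in root.findall(".//item/price")]


def total_streaming(stream):
    """Stream: add up prices as elements close, without keeping their content."""
    total = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "price":
            total += int(elem.text)
        elem.clear()  # drop text and children of elements already handled
    return total


print(all_prices(DOC))
print(total_streaming(io.BytesIO(DOC.encode("utf-8"))))
```

Neither function knows about `section`, `shelf`, or `promo`; the query only names the part of the structure it cares about.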
Sure enough, if the properties on the left are important to you, why not? (I am not religious about XML, but I take issue when people try to trash XML with "speed" or "size" arguments.) Now, if you think you can build connected systems without the properties on the right...
Protocol buffers are now Google's lingua franca for data – at time of writing, there are 48,162 different message types defined in the Google code tree across 12,183 .proto files. They're used both in RPC systems and for persistent storage of data in a variety of storage systems.
...good luck!
In many ways, this is why I don't understand Tim Bray pushing the (other) REST. XML is the essence of connected systems: it is the essence of a contract-first approach that, unlike remoting technologies, does not break at the first change. Promoting contract-less approaches, or using XML solely to "serialize" data, misses 90% of what XML was designed for.
Now you gotta love developers at Google (a little bit of coupling and CRUD never hurts, right?):
people started to use protocol buffers as a handy self-describing format for storing data persistently (for example, in Bigtable).
I think Google just created the technology that is going to kill it, because it violates all principles of loose coupling! It's like pouring concrete over your connected systems. Google killed Overture with an epic Business Process Innovation. It is clear from what I read that if Microsoft or anyone else innovates at the process level, Google will be unable to follow, just like Overture in its time. With PBs, Google has simply handed its competitors the keys to compete with it. Let's talk about PBs in a couple of years...
