This post is a synthesis of two posts I originally published on my other blog "unRelated".

One of the key foundations and most attractive principles of Agile or Lean methodologies is that "Everyone can help each other remain focused on the highest possible business value per unit of time".

I am certainly a strong supporter of that principle. However, value is often difficult to assess. I would actually argue that it is easier to identify what has little or no value: what we think of as valuable can lead to many false positives, or simply be "business-as-usual" and hide the broader structure of the solution.

"User Stories" are the cornerstone of identifying and delivering value:

An argument can be made that the user story is the most important artifact in agile development, because it is the container that primarily carries the value stream to the user, and agile development is all about rapid value delivery.

In practice, very few people focus on the benefits part of a user story. All user stories I see are either what we used to call "requirements" (just phrased slightly differently but isomorphically) or "tasks" needed to advance the state of the project.

However, there is a fundamental flaw in the construction of user stories, even when they are properly written: they make an assumption about the shape of the solution and drive the author to switch almost immediately into solution mode, leaving no room for creative, out-of-the-box thinking.

Let's compare the metamodel of a User Story to the formal definition of a Problem. The metamodel of a User Story looks like this (using the BOLT notation):

As a <role> I want to <action> so that <benefit>


I define a problem formally as a non-existent transition between two known states [1]. The metamodel of a problem looks like this:



A solution is a way to transition between these two states. Please note that both the actors and the actions are part of the solution:



This is where the problem lies when using User Stories: you are specifying the requirements with the solution in mind. There is, of course, a general relationship between some of the actors and entities of the system and the "start" and "end" states of the problem. The problem states are always defined in terms of the states of those actors and entities (possibly as a composite state), but it is a mistake to think that the actors and entities that perform the actions, as part of the solution, are always the same as the actors and entities related to the (problem) states.
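A minimal Python sketch of that separation, using the vehicle example from footnote [1]. The class names, the "at rest" start state and the `propel` helper are my own illustrative assumptions, not part of BOLT: the point is only that the problem carries two states and nothing else, while actors and actions live in the solution.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Problem:
    """A missing transition between two known states. No actor, no action."""
    start: str
    end: str


@dataclass(frozen=True)
class Solution:
    """One way to make the transition; the actor and the action live here."""
    problem: Problem
    actor: str
    action: Callable[[str], str]  # maps a state to the next state

    def solves(self) -> bool:
        # Verification: applying the action to the start state
        # must reach the end state of the problem.
        return self.action(self.problem.start) == self.problem.end


def propel(state: str) -> str:
    """Hypothetical action: sets a vehicle at rest in motion."""
    return "in motion" if state == "at rest" else state


# One problem, stated without any reference to how it is solved...
move_vehicle = Problem(start="at rest", end="in motion")

# ...and two different solutions to that same problem.
horse = Solution(move_vehicle, actor="horse", action=propel)
engine = Solution(move_vehicle, actor="engine", action=propel)

print(horse.solves(), engine.solves())
```

Swapping the horse for the engine changes the solution only; the problem statement is untouched.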

Hence, an action is solution centric and should not be part of the problem definition. As soon as you pick one, you have put a stake in the ground towards the direction you are going to take to solve the underlying problem. The other issue is that the start and end states are never clearly identified in a user story, leading to confusion in the solutioning and verification process, since the problem is not defined with enough precision. Benefits could sometimes align with the target/desirable state, but the definition is often too fluffy and more goal centric, not effectively representing that (problem) state.

Ultimately, the relationship between problems and solutions is a graph (states, transitions as problems, actions as solutions), and this is where the coupling between the problem space and the solution space at the User Story level becomes unfortunate. This means that User Stories cannot be effectively nested and clearly cannot fit in hierarchical structures (which are common to most Agile tools I know). This problem is quite acute as teams struggle to connect business level user stories and system level or solution level user stories. The concept of having a single parent directly conflicts with the possibility of having multiple possible transitions into a single state, and with decomposition principles where the same problem appears in the decomposition of several higher level problems.
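To make the "graph, not tree" point concrete, here is a small Python sketch. The problem names and states are invented for illustration; the only claim is structural: one state can have several inbound transitions, and one sub-problem can appear under several parents, which a single-parent hierarchy cannot represent.

```python
from collections import defaultdict

# Problems as edges between states: name -> (start state, end state).
# All names here are illustrative.
problems = {
    "authenticate user": ("anonymous", "authenticated"),
    "renew session": ("session expired", "authenticated"),
    "ship order": ("order placed", "order delivered"),
    "process refund": ("refund requested", "refund paid"),
}

# Decomposition: the sub-problems each higher-level problem relies on.
decomposition = {
    "ship order": ["authenticate user"],
    "process refund": ["authenticate user"],
}


def inbound_transitions(problems):
    """Group problem names by the state they lead into."""
    inbound = defaultdict(list)
    for name, (_start, end) in problems.items():
        inbound[end].append(name)
    return dict(inbound)


def parents_of(decomposition):
    """Invert the decomposition: sub-problem -> set of parent problems."""
    parents = defaultdict(set)
    for parent, children in decomposition.items():
        for child in children:
            parents[child].add(parent)
    return dict(parents)


inbound = inbound_transitions(problems)
parents = parents_of(decomposition)

# Two transitions lead into the same state, and one sub-problem has two parents.
print(inbound["authenticated"])
print(sorted(parents["authenticate user"]))
```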

I feel that distinction is profound because we can now clearly articulate:

a) the problem statements with respect to each other (as a graph of states and transitions)

b) we can articulate the solution in relation to the problem statements

c) we can articulate the verification (BDD) in relation to the problem and solution [2]

d) we can actually articulate the Business Strategy [3], the Problem Statement, the Solution and the Verification with the same conceptual framework

e) derive the Organizational IQ from the problems being solved on an every day basis

To the best of my knowledge none of these articulations have been suggested before and no one has ever provided a unified framework that spans such a broad conceptual view from the Business Strategy to the Verification. In the proposed framework the business strategy is simply a higher level and specialized view of the problem and solution domains, but using the exact same semantics (which are described here). In other words the enterprise is a solution to a problem, which is a composition of smaller problems and more fine grained solutions, etc. This has an extremely important implication for the execution of the strategy because now both the Strategy and its Execution are perfectly aligned, at the semantic level: the strategy, problem, solution and verification graph represent a map that everyone in the organization can refer to. 

To take advantage of this new conceptual framework, I suggest that we make a very simple and easy change to Agile and replace "user stories" by "problem statements". Each problem must be "solutioned", either by decomposing it into simpler problems or solutioning it directly. Value can still be used to prioritize which problems are addressed first (that part of the Agile and Lean movement is very valuable, so to speak), but the focus on problems and solutions opens a new flexibility in how we handle the long range aspects of the solution while enabling the highest level of creativity and ultimately a direct articulation with the IQ of the organization.
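The "solutioned" rule can be sketched in a few lines of Python. This is only one possible reading, with hypothetical field names (`solution`, `subproblems`, `value`): a problem counts as solutioned when it has a direct solution, or when it has been decomposed and every sub-problem is itself solutioned. Value is kept as a separate attribute, used only to order the backlog.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProblemStatement:
    start: str
    end: str
    solution: Optional[str] = None  # direct solution, if any
    subproblems: List["ProblemStatement"] = field(default_factory=list)
    value: int = 0                  # used only for prioritization

    def is_solutioned(self) -> bool:
        # Solutioned directly...
        if self.solution is not None:
            return True
        # ...or decomposed into sub-problems that are all solutioned.
        return bool(self.subproblems) and all(
            p.is_solutioned() for p in self.subproblems
        )


def backlog_order(problems):
    """Highest value first, as with user stories."""
    return sorted(problems, key=lambda p: p.value, reverse=True)


deploy = ProblemStatement("tested", "in production")  # not solutioned yet
release = ProblemStatement(
    start="committed",
    end="in production",
    value=8,
    subproblems=[
        ProblemStatement("committed", "tested", solution="automated test suite"),
        deploy,
    ],
)

print(release.is_solutioned())  # one sub-problem is still open
deploy.solution = "scripted rollout"
print(release.is_solutioned())
```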

As problems are decomposed, we will eventually reach a point where the subproblems will be close to or isomorphically related to the solution. But it would be a mistake to not clearly delineate the problems from solutions, simply because at the lowest level, they appear isomorphic. 

If we start drawing some BOLT diagrams, a problem lifecycle can be defined as:

The fact that the lifecycle is pretty much identical to that of a user story enables most of the Agile processes and tools to work nearly unchanged.

You may want to know "How do I write a Problem Statement?". Personally, I don't like canned approaches. Obviously here, the mere definition of the two states (low value and high value) is enough to describe the problem. If a solution already exists (i.e. it is possible to transition between these two states) you may want to describe some characteristics of the new solution. I googled "How to write a Problem Statement?" and I felt there was already a good alignment between the results and the abstract definition provided above. For instance:

We want all of our software releases to go to production seamlessly, without defects, where everyone is aware and informed of the outcomes and status. (Vision)

Today we have too many release failures that result in too many rollback failures. If we ignore this problem; resources will need to increase to handle the cascading problems, and we may miss critical customer deadlines which could result in lost revenue, SLA penalties, lost business, and further damage to our quality reputation. (Issue Statement)

Here we see two states for the releases: the initial state (low value), tested, and the high value state, in production. There is also an undesirable state (failure) that the new solution will prevent the releases from reaching. For me the most important thing is that the problem statement must avoid, at all costs, referring to the solution. Even if the people specifying the problem statement have an idea about the solution, they should capture it separately.

This new focus on problem & solution provides a rich conceptual framework to effectively organize the work of a team. After all, we have been innovating, i.e. creating solutions to problems, for thousands of years, so it is no surprise that our vocabulary is quite rich. Here are a few concepts that could be used:

Goal: a goal is not a problem, but you often need to solve problems to reach goals, so it's important to keep them in mind

Fact: a fact often constrains the solution, so they need to be clearly surfaced and accounted for

Assumption: assumptions are very important because they also constrain the solution, but in a more flexible way. Assumptions can be changed, facts generally cannot.

Statement: the problem statement is what physically replaces the user story.

Hurdle: During the decomposition of a problem, hurdles might be identified. They are not problems per se, but they impact the solution. It could be for instance that a resource is not available in time to meet the deadline.

Snag: A problem can be downgraded to a snag when the solution is obvious to the team and represents a low level of effort. It can also be a small unexpected issue that needs to be quickly resolved.

Dilemma: A problem can be upgraded to a dilemma when several solutions are possible and it is not clear which one to choose

Setback: The team can suffer a setback when it thought it had found the solution but it hadn't, or could not find a solution and needs to reassess either the problem or the approach

On the solution side, we can also capture different elements and stages of the solutioning process:

Answer: Findings related to a question raised in the problem statement.

Result: A validation that the solution conforms to a Fact

Resolution: The choice made after reaching a dilemma

Fix: a temporary solution to a problem or a snag to make progress towards the solution to the greater problem

Development: An element of the solution, usually the solution to a subproblem or a snag

Breakthrough: The solution found after reaching a setback

Way out: A solution was not found, nevertheless, the project reached a satisfactory state to meet some or all of the initial goals


From a management perspective, the Solution or Delivery Manager can escape the bureaucracy that Agile has created. Ironically, moving stickers around is a zero value activity, with zero impact on the organizational IQ. The solution manager can and should be responsible for the IQ of the project, which rolls up to and benefits from the IQ of the organization. That means keeping track of the elements that are incorporated in the solution as problems are solved, encouraging team members to be creative when necessary and to shamelessly adopt existing solutions when it makes sense, and helping to resolve dilemmas and push for breakthroughs.

The PMO organization becomes the steward of the Organization's IQ.

As we define problems and solutions in terms of entities, state, transitions and actions, the BOLT methodology provides a unified conceptual framework that spans from Business Strategy to Problem and Solution Domains to Verification (BDD).

To summarize,

1) We have provided a formal model of a problem and a solution, and how they relate to each other

2) This formal model offers the ability to compose problems and solutions at any scale, over the scope of the enterprise

3) Problems and Solutions can be composed from Business Strategy down to Verification

4) We suggest that Agile methodologies replace User Stories by Problem Statements

5) With the renewed focus on "problems", we can also integrate the work of Prof. Knott on Organizational IQ in the whole framework

Last, but not least, decoupling problem definition and solution yields a tremendous benefit in the sense that both can evolve independently during the construction process. 


[1] For instance, if you build a vehicle, obviously you want the vehicle to transition to the "in motion" state. Different "actions" will lead the vehicle to reach that state (a horse pulling, an engine, transmission and wheels, a fan, ...).

[2] BDD Metamodel (Scenario):



[3] Living Social Business Strategy mapped using the same conceptual framework (Source: B = mc2)




Subbu wrote a post earlier this month on the Architect role, here he goes:

  • You always talk about the big picture.
  • You think you know how the system ought to be built.
  • You are unhappy that the team is not executing your ideas the way you want them.
  • You don’t have a working build.
  • You spend a lot of time on documents that are not code.
  • You can prototype – but your code is not production worthy.
  • You spend too much time in meetings.
  • The best code you wrote is a few years old.
  • When asked for opinions you tend to speak in general terms.
  • Your team members secretly joke about you.
  • You start to take analysts and tech blogs too seriously.
  • You are a dinosaur.

It's hard to disagree, though I find the list wildly incomplete, especially when it comes to "everyone but your mum has an opinion on what the architecture should be", "vendors tell your boss you suck at architecture (unless you buy their product)"...

I am a bit more at odds with his conclusion: Code. Don’t wiki. Don’t powerpoint.

To code or not to code, that seems to be the question: in our industry these days, you code, therefore you are.

Just take any interview with Microsoft, Amazon, Facebook, Google ... and you'll face insipid Comp Sci 101 questions such as: how do you compare two binary trees? So, you bring 25 years of experience and passion, and the best and first thing these companies can do is to have a 25 year old kid ask you to write 5 lines of code. Everything else you have done is irrelevant. That is Subbu's world.

Allow me to take a real example to illustrate how a "coder" architects. A few years ago I was invited to Microsoft for a public event, and in a rare insight, a Microsoft architect was proudly explaining why Windows 2003 (R1 if I recall correctly) did not scale as a web server. The developer in charge of the connection manager had used a ... linked list to manage the connections. I bet this guy had brilliantly passed all the interview CS 101 questions ... Shall we also speak about the kids at Facebook who code your privacy away? Isn't a primary key a convenient way to fetch your favorite piece of information?

The reality is that Subbu makes the deeply and totally erroneous assumption that we have reached a level of maturity that gives us perfect communication tools, and that, with that in mind, writing code is of course the answer.

If you think that a) communicating, b) how and c) when you communicate is irrelevant, I would strongly suggest you watch this presentation from start to finish. Don't get me wrong, coding is great, I love coding, especially metaprogramming, but the question still remains: is an architect JUST-A developer, hence Archaic, or is his or her role more essential (as in essence), i.e. an Archetype?

Why not think out of the box slide for once? beyond the trees, and the queues, the arrays and the lists, the maps and the sets? Why not ask how Model Driven Engineering could shape the role of an architect and establish a strong and precise articulation between architecture and coding? and with it, enable architecture refactoring ...

Let's review Subbu's list in the light of MDE:

  • "You always talk about the big picture": In MDE, you talk about the Solution Model with the highest degree of precision possible. There is no "big" picture, there is an objective, traceable representation of the blueprint of what you are building.
  • "You think you know how the system ought to be built": You don't know. With MDE you rapidly try different technical architectures and enable architecture refactoring, just in case.
  • "You are unhappy that the team is not executing your ideas the way you want them": A solution model can go a very long way in bringing people onto the same page, and the quicker you get on the same page, the faster you execute. Heck, you can even implement part of the solution model yourself.
  • "You don’t have a working build": With MDE, the build starts with the solution model.
  • "You spend a lot of time on documents that are not code": With MDE, you spend all your time having a meaningful impact on the code.
  • "You can prototype – but your code is not production worthy": Your solution model is production worthy and drives the code of your team.
  • "You spend too much time in meetings": You don't have to spend much time in meetings. With a clear solution model, you facilitate meetings and rapidly drive to the right conclusion, be it technical or functional. How many people "scrumly" hash and rehash the same arguments for weeks, months and eventually request to build something that is not what they had in mind while cutting the scope to the bone, just to claim they shipped something? Who doesn't want more precision in meetings? Are PPTs or Wikis bringing that precision? Is code bringing that precision?
  • "The best code you wrote is a few years old": Your best models are still ahead, you love what you do and 24 hours per day are not enough.
  • "When asked for opinions you tend to speak in general terms": When asked for your opinion, you can articulate your ideas and those of others with the highest degree of precision, creating a clear understanding of their architectural impact and the corresponding level of effort.
  • "Your team members secretly joke about you": You feel part of the team and you communicate your ideas effectively. The code developers write is 10x more interesting to write.
  • "You start to take analysts and tech blogs too seriously": Analysts and tech bloggers have no idea what MDE is; by the time they catch up, you'll be retired.
  • "You are a dinosaur": Yeah right ...

Shall I also mention that the ultimate beauty of MDE is "No Middleware"?

So, for a RESTafarian, who recently admitted:

Hypertext made a lot of sense when I was looking from the server. A few months spent writing real-world client code changed all that.

I am not really sure that Subbu is in a position to teach us lessons about what to do, or not to do. Very few "Coders" seek to understand the context in which they code, or poke up to the solution model, and I am not even talking about the problem model. So yes, I am disappointed, but not surprised, to see someone of Subbu's caliber thinking that we have to return to the early sources (so to speak) of software engineering. The RESTafarians have already thrown us back 15 years, what's another 15 years on top of that?

It's time to let the Architect surface the Arche (In ancient Greek Philosophy, Aristotle foregrounded the meaning of arche as the element or principle of a thing, which although undemonstrable and intangible in itself, provides the conditions of the possibility of that thing.)


I must admit, MDE is a joy to work with, especially as an architect. MDE allows you to express how a certain class of solutions should be built, far away from the intricacies of general purpose languages (GPLs), conventional type systems and SDKs. It allows you to solve problems in ways that simply would never have any room in a GPL world. I may never know if MDE would have changed the face of software engineering, but for an industry that has looked for the holy grail of programming languages for so long, a paradigm shift seems inevitable. If for nothing else, all our software development technologies and processes have been aligned along a "monolithic" assumption, when today solutions are composite and often require several architecture variants to support different use cases.

Let me take 3 examples from the Canappi 1.2 release to show where and how MDE makes a powerful difference.

1) MDE enables solution architecture variants : one of the key features of 1.2 is an enhanced level of support for Universal Binary iPad/iPhone applications. You can now define device specific layouts which will be displayed when the application runs on that type of device. The kitchensink demo has been updated to take advantage of that feature. Here Canappi uses tablet variants, but it would be just as easy to add it for different orientations.

layout sessionDetailLayout_iPad {
    // iPad specific layout
}

layout sessionDetailLayout {
	text  sessionId	'' (-20,-20, 10,20) //hidden parameter
	text  sessionTitle '' (7, 45, 305, 65) { Left ; }
	text  sessionPresenter '' (7, 110,290, 20) { Left ; }
	tablet sessionDetailLayout_iPad ;
}

There is simply no other mechanism that allows you to do that so elegantly, because you are otherwise constrained by the underlying architecture of the SDK you are building your solution on. When you look at Apple and Android SDKs, you can clearly see that the support for different form factors simultaneously is an "after-thought", not a key architectural foundation.

Furthermore, any architectural deviation from a given SDK translates into lots of boilerplate code, scattered across the entire solution's code base. MDE completely hides this boilerplate code from the solution model.
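For readers who do not know Canappi, here is a rough Python sketch of what a variant declaration amounts to at run time. It is not Canappi's generated code; the `layouts` table and the `resolve_layout` helper are hypothetical, and the device class names are assumptions. The idea is simply that the model declares the variant once and the lookup falls back to the base layout.

```python
# Hypothetical in-memory view of the two layouts declared in the model:
# the base layout names its tablet variant, the variant has none.
layouts = {
    "sessionDetailLayout": {"variants": {"tablet": "sessionDetailLayout_iPad"}},
    "sessionDetailLayout_iPad": {"variants": {}},
}


def resolve_layout(name: str, device: str) -> str:
    """Return the variant registered for this device class, else the base layout."""
    return layouts[name]["variants"].get(device, name)


print(resolve_layout("sessionDetailLayout", "tablet"))
print(resolve_layout("sessionDetailLayout", "phone"))
```

Adding an orientation variant would be one more key in the same table, which is the kind of change that is cheap in a model and scattered in hand-written SDK code.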

2) MDE enables the creation/addition of new logic semantics. Mobile apps often need to call more than one API to get the data needed to populate a user interface. For instance, when you need to display the thumbnails of a Flickr gallery, you would think Yahoo's Web API GET /gallery/{id} would return all you need. Wrong: it only returns some gallery information with just the photo ids. You then need to make a second call for each photo_id to get the thumbnail URLs and download the thumbnail image files.

So in this case, not only do you need a solution architecture variant (call 2-N APIs given a call to a single API), but you also need to express the logic of that call. No mystery here, that kind of logic is probably a few thousand years old, and in modern day computing, this is simply called a join:

connection flickr {

	operation init getPhotos GET '' {
		resultSet 'photos' ;
		join getSizes on photo_id = id where _label = 'Square';
	}

	operation getSizes  GET '' photo_id {
		resultSet 'sizes' ;
	}
}

Yes, it's that simple ! With just a bit of code written once for Flickr and abstracted a tiny bit, I can implement most mashups.
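For comparison, this is roughly what the join declared above means when written by hand, as a Python sketch. The two functions are stubs standing in for the two Web API calls; their names, the field names (`id`, `photo_id`, `label`, `source`) and the URLs are assumptions made for the example, not the actual Flickr response format or the code Canappi generates.

```python
def get_photos():
    """Stub for the first call: gallery information with photo ids only."""
    return [{"id": "1"}, {"id": "2"}]


def get_sizes(photo_id):
    """Stub for the per-photo call: one entry per available size."""
    return [
        {"photo_id": photo_id, "label": "Square",
         "source": f"https://example.invalid/{photo_id}_sq.jpg"},
        {"photo_id": photo_id, "label": "Large",
         "source": f"https://example.invalid/{photo_id}_b.jpg"},
    ]


def join(outer_rows, inner_call, outer_key, where):
    """For each outer row, call the second API with its key and keep
    the inner rows that satisfy the `where` predicate."""
    joined = []
    for row in outer_rows:
        for inner in inner_call(row[outer_key]):
            if where(inner):
                joined.append({**row, **inner})
    return joined


# join getSizes on photo_id = id where _label = 'Square'
thumbnails = join(get_photos(), get_sizes, "id", lambda s: s["label"] == "Square")

for t in thumbnails:
    print(t["id"], t["source"])
```

The `join` helper is the "bit of code written once"; each new mashup is then a one-line declaration.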

But that's not all: a "side effect" of MDE, in this case, is the ability to refactor your architecture. I happen to believe that mashups should mostly run on the client, unless there is some reason not to bring the data to the client (e.g. the end user is not allowed to see part of the data, bandwidth constraints...), but this point is irrelevant: there will always be a need to do some mashups on the server and some on the client. Subbu's team has come up with a nice way to do that on the server side.

So how would I enable such a massive architecture refactoring? With a simple keyword and a tiny bit of code in the code generator (that generates scripts and some code on the client that invokes the mashup). I'll try to illustrate that for the next release.

join getSizes on photo_id = id where _label = 'Square' with qlio ;

3) MDE enables you to create solution oriented type systems. Many would argue that the main purpose of a computer is not to "compute", but to manage state persistently. One essential aspect of managing state is data structures and the building blocks of data structures are described by a type system.

One of the challenges when building apps and mobile apps in general is that the UI is quite varied. Take the example of a Map for instance, it often comes with points of interest. How do you bind a data set to a Map such that it displays push pins? just like you want to display rows of values in a table or pictures and picture information in a gallery?

In the case of Canappi we built a simple binding framework that lets you express that the result set from a Web API call is bound to a "layout" (a set of controls).  You can optionally use a fromKey / toKey mapping when the attribute names of the result set are different from the names of the control.

layout myMap bindings pointOfInterest with mapping aSimpleDataMapping ; 

In this statement, pointOfInterest points to one (or more) APIs that gather the data that is automatically bound to the map control, which supports binding to a single push pin (lat, long, title and subtitle) or an array of pushpins. The reason why we make it that simple is that we took the decision early to normalize the response before we bind it to the UI, i.e. we built an implicit type to which Web API response formats map on one end, and to which UI views (with any control) bind on the other.

Ok, I understand this is quite basic, but it is incredibly productive for at least 95% of the data that moves from a server to a client and back: I don't know any programming language that allows this type of concision (e.g. a twitter app in 30 lines of code). Yet, hundreds of thousands of developers, every day, write some code that picks up some data from some kind of back-end API call and puts it in some kind of UI element.
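A small Python sketch of the normalization step, under my own assumptions: the `bind` function, the response field names and the sample data are hypothetical, and only the pin attributes (lat, long, title, subtitle) come from the text. The fromKey / toKey mapping renames response attributes to control names, and attributes that already match pass through unchanged.

```python
def bind(result_set, mapping=None):
    """Normalize rows to control names using an optional fromKey -> toKey mapping."""
    mapping = mapping or {}
    return [{mapping.get(k, k): v for k, v in row.items()} for row in result_set]


# Hypothetical mapping from API attribute names to push pin attribute names.
a_simple_data_mapping = {"latitude": "lat", "longitude": "long", "name": "title"}

# Hypothetical normalized result set returned by the pointOfInterest call.
points_of_interest = [
    {"latitude": 47.6, "longitude": -122.3, "name": "Seattle", "subtitle": "WA"},
]

pins = bind(points_of_interest, a_simple_data_mapping)
print(pins)
```

The same `bind` call would feed a table or a gallery; only the mapping changes.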

These examples are the bread and butter of MDE, they are very easy to implement and I would argue that they add less than 20-30% overhead to your first implementation of a variant, type of business logic or data binding. That means on your second implementation you already gained overall 20-30% of your time. So next time you look at the architecture of your solution why not ask these questions:

a) Is there any variant in your solution architecture?

b) Does your architecture deviate from the architecture of the underlying SDK / Platform on which it is built?

c) Are you using some business logic that is not easy to express with the programming language that you are using?

d) Would you benefit from extending the type system of your underlying programming model?

If you answer yes to one or more of these questions, I would certainly take a good look at MDE.


Is HTML 5 losing mindshare?

Robert Scoble thinks so.

The new Path? The one that won a Crunchie last night for great design? It’s not done in HTML 5.

This morning I saw something new coming soon from Storify. Not done in HTML 5. This morning I visited Foodspotting which just shipped hot new apps on iOS, Android, and Blackberry. Not done in HTML 5.

More and more I’m hearing that designers and developers are ignoring HTML 5, especially for the high end of apps.

After building Canappi, I can only wholeheartedly agree with Bob. HTML5 certainly has a place in the mobile landscape, if for nothing else to "mobilize" your web site, but when it comes to user experience, I don't see how it can compete with Native Frameworks, at least in the years to come: for every successful mobile web app, someone will build a superior native experience that users will adopt. There is simply no room to compete.

I am currently building Canappi's jQuery Mobile code generator and from what I can tell, it is extremely limited compared to the native iOS and Android SDKs.

The fundamental reason for Web Apps to emerge in the late 90s and 2000s was the "client update" problem and to a lesser degree the number of platforms on which the app should run. In the 90s, assistants used to roam the buildings with CD-ROM (and Floppy disks) to update every single client.

Today, App Stores have completely solved the update client problem (for mobile and desktop apps), and running on multiple platforms really means running on iOS and Android (although the fragmentation of Android is a bit of a concern).

Web Apps have many fundamental flaws, from flowing layouts which simply don't work in small form factors, to a clunky offline experience, not to mention bandwidth and power consumption.

If you combine a horrible developer experience with a horrible user experience, (Mobile) Web Apps have simply no room to grow from here.

AT&T released last week at CES a new API platform that exposes a number of Network APIs:

  • Location
  • Device Capability
  • In-App Payment
  • WAP Push
  • ...

The APIs implement OAuth 2.0 and capture user consent wherever applicable. I think the in-app payment (that goes directly to the subscriber's bill) is a game changer for mobile applications, in particular Web based mobile apps.

The APIs come with an SDK and some code samples.

In order to use these APIs, you need to join the developer program and create a developer account.

You would then create an "application" which basically gives an application id, a shared secret and a short code (yes ! your own short code). These credentials are used to get an access token (See figure below).
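As a rough illustration of how an application id and a shared secret are exchanged for an access token, here is a Python sketch of an OAuth 2.0 token request. It assumes the client credentials grant from the OAuth 2.0 specification, which the SDK may or may not use; the endpoint URL, the scope value and the function name are placeholders of mine, not AT&T's. The request is only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint: the real token URL comes from the developer program.
TOKEN_URL = "https://api.example.invalid/oauth/token"


def build_token_request(app_id: str, shared_secret: str, scope: str) -> Request:
    """Build (but do not send) a form-encoded token request."""
    body = urlencode({
        "grant_type": "client_credentials",  # assumption: client credentials grant
        "client_id": app_id,
        "client_secret": shared_secret,
        "scope": scope,                      # hypothetical scope name
    }).encode("ascii")
    return Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_token_request("my-app-id", "my-shared-secret", "SMS")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

With the Java SDK this exchange is handled for you once the credentials are in the configuration file.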

If you use the Java SDK (there are others available: PHP, Ruby...), all you have to do is to insert these values in the conf/ file and build the project with Eclipse or Ant directly.

Once it is built, you just start the server by executing and you deploy the Sencha client application in your web server. The API server acquires access tokens automatically.

Here is how I deployed my server:

SDK -> /var/www/att/server -> sh

Sample app -> /var/www/client -> open

If you want to change the location of the application, just update the ResourceBase value in the com.sencha.jetty.EmbeddedServer class. In my case, it was set to webapp.setResourceBase("../../client");

That's it! And in case you wonder, with your very own shortcode, you can also receive SMS.

I have a few promo codes from AT&T that give you free access to the platform for one year.


SQL, NoSQL and Web APIs



I am truly amazed at the degree of innovation of our industry today. In a way, it's quite scary: Are we building tools just in search of a problem? are we really innovating or is it just another "hyped-stand still cycle" ? In the midst of massive scalability improvements and seamless operational environments it is easy to lose track of the semantics view.

I built a "Composite Application Framework" at Attachmate between 2003 and 2005. Even though our team did get to the 1.0 release. The product was shelved before it could get to its first customers when Attachmate was acquired. A composite application is an application which is capable of interacting with systems of record that are beyond its control (typically 3rd party Web APIs, as we call them these days). As we architected the framework, we were faced to a big philosophical question: should we group these Web APIs (they were known as services and operations back then) under a facade that expose a query language interface or should we rather "bind" these APIs to our UI and develop orchestration-based mashup logic in the API consumer (middle-tier or client). I was softly voting for the former, while my team prefered the later. I didn't push back because I knew that implementating a query engine would delay the project and I didn't have a strong rationale for it. It was just cool. At the time, we had done some experiments with a couple virtual database technologies which allowed an in-memory SQL engine to connect to a number of different databases (not APIs) and make them appear as a single database from the client perspective. These product were interesting but they never really got traction and their API bindings were nascent to say the least.

Hence, I have been quite intrigued by Subbu's project. It is still incomplete since it does not handle updates yet, but it is already quite mature to look like what I had in mind then. I don't want to make too many conclusions until I use it, or other people provide some feedback. When I designed Canappi, I decided to take a binding approach, rather than building a SQL-like facade. So far I have been quite happy with that choice mainly because composite applications rarely need the full power of a query language. The Views map reasonably well to the model (unless it is the other way around ...) and applications follow a "navigational" pattern which is generally well supported by Web APIs. If you need to create some reports, of course, the answer would be quite different.

As REST has forced most of us to CRUD our way to the data, I think that our industry has reached a point where we need to answer objectively: what is different about SQL, NoSQL and (Web) APIs? We have built a "composite world", great, but I don't think we will be able to make much progress without establishing a clear articulation between the way we store, relate and access information.

In his introduction to, Subbu points to a very interesting paper from Erik Meijer and Gavin Bierman, in which they argue: "Contrary to popular belief, SQL and noSQL are really just two sides of the same coin".

I like their analysis, but I am a semantic guy, I really like to see at the semantic level what's new, so I created a simple metamodel:

On the left-hand side, you have the traditional RDBMS / SQL model; on the right-hand side, the new "NoSQL" model (both key-value pairs and document oriented). The color coding is used to (roughly) map concepts from one world to the other. I have adopted a JSON metamodel for the structure of the values of NoSQL databases.

As Erik and Gavin pointed out, the key difference is that in the NoSQL world, an identity is generally a "key":

In the object-graph model, the identity of objects is intensional—that is, object identity is not part of the values themselves but determined by their keys in the store. In the relational model, object identity is extensional—that is, object identity is part of the value itself, in the form of a primary key.

It is of course true, but does it really matter? Did we really create new semantics by making an identity a subclass of a key? (Incidentally, it would be very interesting to ask the opposite question: what happens when a "key" IS-AN "identity" within a given scope (row, collection, database)?) So, my answer is no: claiming that an identity IS-A key does not change the semantics of what an identity is, regardless of the position of an identity in a result set or in the store. I have argued many times that this was true 8000 years ago when man invented writing and hence data. I have also argued that, for instance, in the absence of a reasonable numbering system, ancient writers were left to techniques such as acrostic poetry to enable random access and even sorting.
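To make the point concrete, here is a minimal sketch (my own illustration, not taken from the paper) of the same two records stored both ways. The names `rows`, `store`, `rows_to_store` and `store_to_rows` are hypothetical. Moving the identity in or out of the value is a lossless round trip, which is why I would argue nothing changes semantically.

```python
# Relational style: the identity (primary key) is part of the value itself.
rows = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
]

# Key-value style: the identity is the key in the store, outside the value.
store = {
    1: {"name": "Ada"},
    2: {"name": "Grace"},
}


def rows_to_store(rows, key="id"):
    """Move the identity out of each value and use it as the store key."""
    return {row[key]: {k: v for k, v in row.items() if k != key} for row in rows}


def store_to_rows(store, key="id"):
    """Fold the store key back into each value as a primary key."""
    return [{key: k, **value} for k, value in store.items()]
```

Both directions preserve every piece of information; only the position of the identity differs.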

The authors also argue that another key difference is the "open" data structure of NoSQL databases, but is this openness inherent to NoSQL, or is it inherent to the design of the storage engine (and the query language designed to harness it)? First, a "link" is a link semantically speaking: it relates two pieces of information using one or more identities. Nothing would have prevented us from creating a "unique id" concept in RDBMSs and allowing for querying at the database level (with a statement like SELECT * WHERE ID = '123'). As a matter of fact, if you look at MongoDB which is not a key-value pair store per se but a document store, its data structure includes the concept of collections (~Tables) and the identity of a document is stored separately from the document itself. Second, all these data structure concepts are simply man made, for good reasons of course, but they do not introduce or uncover any new semantics. These "open data structures" simply provide a better opportunity to physically associate data that is often queried together. For instance a purchase order has line items or a prescription has a number of treatments, but that does not remove the need to uniquely identify all individual line items or treatments to associate additional details or independently reference these pieces of information.
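As a rough sketch of what a database-level "unique id" could look like, assume a toy in-memory store where every record, whatever its table, is registered in one global index. The `Database` class and its `insert` and `find` methods are hypothetical, written for illustration only; no real engine is implied.

```python
class Database:
    """Toy store with tables plus a database-wide identity index."""

    def __init__(self):
        self.tables = {}  # table name -> {record id: record}
        self.index = {}   # record id -> table name (the global "unique id")

    def insert(self, table, record_id, record):
        if record_id in self.index:
            raise KeyError(f"id {record_id!r} already used in {self.index[record_id]!r}")
        self.tables.setdefault(table, {})[record_id] = record
        self.index[record_id] = table

    def find(self, record_id):
        """Rough equivalent of the hypothetical SELECT * WHERE ID = '...'."""
        table = self.index[record_id]
        return table, self.tables[table][record_id]


db = Database()
db.insert("purchase_orders", "po-1", {"customer": "ACME", "line_items": ["li-1", "li-2"]})
db.insert("line_items", "li-1", {"sku": "A", "qty": 2})
db.insert("line_items", "li-2", {"sku": "B", "qty": 1})

# Follow the links from the purchase order to its line items.
_, order = db.find("po-1")
skus = [db.find(item_id)[1]["sku"] for item_id in order["line_items"]]
```

The purchase order could just as well embed its line items physically; each line item would still need its own identity to be referenced independently.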

Hence, the only semantics that matters in a data store are identity and link, which are common to all data stores, even the ones man used 8000 years ago when writing was invented. Incidentally, we can also notice that would-be data centric technologies like JSON, XML or even ATOM do not reflect correctly the importance and semantics of "links". As a matter of fact, the Web confuses identity and link. A major faux-pas if you ask me.

In the end, the "link" or "relation", or however you want to call it, is the most important concept of a data store and is unlikely to change for another million years, at least. Even the Web or the RESTafarians didn't change that, though its scope world-widened at the expense of bi-directionality. So I don't really view SQL or NoSQL as being semantically different. I don't really see (semantically) an inversion in the association between an identifier and the information being identified (this is merely a visual attribute and an implementation decision of the query engine). Of course, I am abstracting all the non-functional differences between each type of storage engine; I am just claiming that semantically, we are standing still (and that's good, it would be shocking otherwise).

Web APIs don't change the semantics of a link either. If a given API call were to return any kind of identifier, as long as we know which API(s) can consume it to produce the information associated with it, just like Tables or Collections, we will be in a position to navigate it, even though we might not have an engine that can do it automatically for us. However, the API world introduces a significant asymmetry between the way we read and write data. It is clear that you can easily map a Table or a Collection to a set of data returned by a read-only API, but the converse is not true. A set of APIs which all have a side effect will rarely, if ever, map to the ability to write freely to a table or a collection. Again, let's go back to the invention of writing to better understand that point:
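The navigation half of that argument can be sketched as follows, assuming two made-up read-only API calls (`get_order` and `get_customer` are local stand-ins, not real endpoints) and a hypothetical `RESOLVERS` table that records which API can consume which kind of identifier.

```python
# Hypothetical stand-ins for two read-only Web API calls.
def get_order(order_id):
    orders = {"o-1": {"customer_id": "c-9", "total": 120}}
    return orders[order_id]


def get_customer(customer_id):
    customers = {"c-9": {"name": "ACME"}}
    return customers[customer_id]


# Which API can consume which kind of identifier.
RESOLVERS = {"order_id": get_order, "customer_id": get_customer}


def follow(record, field):
    """Navigate a link: resolve the identifier stored in `field`."""
    return RESOLVERS[field](record[field])


order = get_order("o-1")
customer = follow(order, "customer_id")
```

Nothing here needs a query engine: knowing which call resolves which identifier is enough to navigate, by hand if necessary.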

Farmers needed to keep records. The Sumerians were very good farmers. They raised animals such as goats and cows. Because they needed to keep records of their livestock, food, and other things, officials began using tokens. Tokens were used for trade. Clay tokens came in different shapes and sizes. These represented different objects. For example, a cone shape could have represented a bag of wheat. These tokens were placed inside clay balls that were sealed. If you were sending five goats to someone, then you would put five tokens in the clay ball. When the goat arrived, the person would open the clay ball and count the tokens to make sure the correct number of goats had arrived. The number of tokens began to be pressed on the outside of the clay balls. Many experts believe that this is how writing on clay tablets began.

Well, obviously, we made some progress on idempotency, but no matter how flexible the data structure is, the semantics of writing information (in that case changing the state of the clay ball to "delivered") at the API level are widely different. Sure, a query language such as SQL and the like in the NoSQL world, would allow full control at the data structure level, just like a scribe would have had on a piece of papyrus thousands of years ago, but who in their right mind would give such privileges to just about anyone?

So I remain unconvinced, today, that a query engine on top of APIs is the right model to adopt in general. It is a little bit the same argument that RESTafarians used to debunk the WS-Resource Framework specification: how many layers of query language can you stack? How could a "composite" query engine convey business exceptions as updates fail behind the Web API layer, for instance?

IMHO, in the context of Composite Applications (not analytics for instance), it would be far more efficient to define information entities (with an adequate, shall I risk -modern-, data structure) and associate an access layer composed of entity-specific queries on the read side and an action model on the write side, and bind these queries and actions to specific APIs. Hence it is far more important to understand how Web APIs and modern ("open") data structures can be bound to the programming model (i.e. business logic and views). It seems to me that problems like transparently tracking the identities of the pieces of information, transparent off-line / on-line operation, ... should be far more interesting to support at the programming model level than offering a general query language.
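Here is a minimal sketch of such an access layer, under stated assumptions: `FakeOrderApi` stands in for a remote Web API, `OrderAccess` is the entity-specific layer, and `BusinessError` is a made-up exception type. All of these names are hypothetical. The read side exposes a query, the write side exposes a named action rather than free-form updates, and a failure behind the API surfaces as a business exception.

```python
class BusinessError(Exception):
    """A failure reported by the API behind an action."""


class FakeOrderApi:
    """Hypothetical stand-in for a remote Web API."""

    def __init__(self):
        self._orders = {"o-1": {"status": "open", "total": 120}}

    def get(self, order_id):
        return dict(self._orders[order_id])  # return a copy, as a remote call would

    def cancel(self, order_id):
        order = self._orders[order_id]
        if order["status"] != "open":
            raise BusinessError(f"order {order_id} is {order['status']}, cannot cancel")
        order["status"] = "cancelled"


class OrderAccess:
    """Entity-specific access layer: queries to read, actions to write."""

    def __init__(self, api):
        self._api = api

    def by_id(self, order_id):   # query bound to a read API
        return self._api.get(order_id)

    def cancel(self, order_id):  # action bound to a write API
        self._api.cancel(order_id)
        return self._api.get(order_id)


orders = OrderAccess(FakeOrderApi())
```

A view would call `orders.by_id(...)` to display an order and `orders.cancel(...)` to act on it, catching `BusinessError` to report a refused cancellation. There is deliberately no generic "update" on the write side.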

As a matter of fact, API2MOL is a project led by Javier Luis Canovas Izquierdo that reverse engineers a set of APIs into a set of classes that describe the information model behind the APIs. I think it would be nice if API2MOL could focus on establishing a clear binding between specific, i.e. concrete, APIs and the views in which these entities are represented and manipulated.
