I am very excited to announce the SCM project, an open source STAR-based Component Model. SCM is derived from the TLA+ Specification Language and is the direct product of email conversations Dr. Lamport had the kindness to have with me during the month of January. 

The SCM project is open source (Apache 2.0) and is currently implemented in Java only. Javascript will be next. 

Formal methods are increasingly used in our industry to solve complex problems. For instance, the AWS team recently published this article summarizing their experience using TLA+:

Dangerous bugs turn out to be errors in design; the code faithfully implements the intended design, but the design fails to correctly handle a particular “rare” scenario. We have found that testing the code is inadequate as a method for finding subtle errors in design, as the number of reachable states of the code is astronomical. 

According to the AWS team, the key to solving that problem lies in increasing the precision of its designs. They found in the TLA+ Specification Language an ability to express all the legal behavior of the system. 

The SCM project delivers some of the semantics of TLA+ directly at the programming level (with a couple of noticeable deviations) and as such, enables a broad community of developers to harness the power of formal methods. What kind of power can you expect? Here is how the AWS teams sees the benefits of TLA+:

In industry, formal methods have a reputation for requiring a huge amount of training and effort to verify a tiny piece of relatively straightforward code, so the return on investment is justified only in safety-critical domains (such as medical systems and avionics). Our experience with TLA+ shows this perception to be wrong. At the time of this writing, Amazon engineers have used TLA+ on 10 large complex real-world systems. In each, TLA+ has added significant value, either finding subtle bugs we are sure we would not have found by other means, or giving us enough understanding and confidence to make aggressive performance optimizations without sacrificing correctness. Amazon now has seven teams using TLA+, with encouragement from senior management and technical leadership.

Just like in the case of TLA+, SCM allows you to express:

  • Safety. What the system is allowed to do. 
  • Liveness. What the system must eventually do.

Even though, SCM amounts to writing code, which could potentially be deployed in a running system (as a Component), it is not its primary intent. An SCM Component is optimized for rapidly converging to its most precise design, which in turn, may not be the highest performance implementation possible.

SCM is based on the STAR conceptual framework (State, Type, Action, Relation):

STAR framework

Fig. 1 - The STAR conceptual framework

STAR is a style where every concept we express is clearly positioned as a State, Type, Action or Relationship. STAR has applications in Strategy/Innovation, Problem/Solution Specification and today with SCM, in Programming/Verification as well. 

Combining the foundations of STAR and TLA+, I derived the following SCM Metamodel: 

Fig. 2 - The SCM Metamodel

What the model tells us is that an SCM component has a set of behaviors which is defined by a combination of States, Action, Types (and Relationships). The models deviates somewhat from TLA+ in the sense that the semantics of State (in STAR) would be called "Control State" in TLA+. In TLA+ a state is simply an assignment of variables (which is embodied by the Type) and control states appear as a variable (generally named pc) which is assigned values which are the labels of the control states. However, unlike TLA+, the Type and not the Action is responsible for the determination of the resulting state. That is why the second association between Actions and State is dashed, because it is infered, not specified. In SCM, Actions are purely functional and their results are presented to the Type which may or may not accept them, based on internal constraints. Once the Type accepts the new values, a determination of the new state is made, based on the mapping between a set of Ranges of values and States. (Control) States seem to be Ranges of values, not "assignments of values".

This is how the model operates:

Of course, these semantics can be implemented in TLA+ but with a substantial complexity, even though the notion of "Range of Values" clearly belongs to mathematics. When the action is not in the position to deterministically determine the target state, at design time (which is a very common case), people are forced to implement a synthetic state (generatlly called "test"), which purpose is to make the the determination. This can be seen in the Factorial example of Dr. Lamport's paper on "Computation and State Machines" (see page 12 , TLA+ definition).

Let's explore now how we use SCM. A couple of examples are provided in the SCM repository (the Factorial and DieHard algorithms). 

Based on Dr. Lamport's article, the design of the Factorial algorithm can be summarize as (BOLT Notation):

Note that:

  • Start is a standard action of the component model which determines the initial state of the component based on the values
  • Multiply and Set are associated to an automatic transition, i.e. as long as the control state of the system is "mult", the multiply action is invoked automatically
  • We chose not to implement a Zero state and rather forbid any initial value which is less than one

The initialize action illustrate well the relationship between action-type-state since the resulting state can only be determined once the action has been completed. Of course the action could make that decision, but that would be an obvious and unwanted coupling between the type and the action, since the resulting state could be influenced by parameters outside the scope of the action.

It is that decoupling between action and types which confer to SCM a balanced articulation between Functional Programming (Action), Object Orientation (Types/Actions) and State Machine (State/Actions). 

The design of the component is expressed in the Component constructor's, this is where elements of the metamodel are assembled. 

public Factorial(int i) throws Exception {
		
	//Type
	fact = new FactorialType(i) ;
	
	//Actions
	multiply = new MultiplyAction("multiply()",fact) ;
	initialize = new InitializeAction("initialize()",fact) ;

	//States
	mult = new State("mult", multiply,true) ;
	def = new State("default", initialize) ;
	forbidden = new ForbiddenState("inputLessThanOne") ;
	desired = new DesiredState("lastMultState") ;
		
	//Ranges
	//SCM comes with a couple of configurable ranges
	//SimpleRange and Interval
	SimpleRange defaultRange = new SimpleRange("i",Operator.equals,new Integer(1)) ;
	SimpleRange multRange = new SimpleRange("i",Operator.greaterThan,new Integer(1)) ;
	//Concrete ranges are OK too
	LessThanOneRange fRange = new LessThanOneRange() ;
	EqualsTwoRange desiredRange = new EqualsTwoRange() ;
		
	fact.addRange(mult, multRange)
	    .addRange(def,defaultRange)
	    .addRange(forbidden, fRange) 
	    .addRange(desired, desiredRange) ;

	behavior = new FactorialBehavior(fact) ;
	behavior.add(mult) 
	        .add(def) ;
		
	this.add(behavior) ;

	State currentState = fact.currentState() ;

	//Add trace
	Element.setWarnOnErrroneousAction(true);

	trace = new Trace("fact_trace") ;
		
	trace.setTrace(this)
	     .setTrace(mult)
	     .setTrace(def)
	     .setTrace(forbidden)
	     .setTrace(desired)
	     .setTrace(multiply)
	     .setTrace(initialize)
		     
	     .trace(currentState);				
}

To facilitate the study of the Component's behavior, SCM generates PlantUML activity and state diagrams both as traces of execution or after a "walk". For instance the Factorial algorithm will generate that state diagram:

Factorial Algorithm State Diagram

Please note that the "d1" state is a desired state, not a control state. Here is corresponding PlantUML descriptor (generated by SCM):

@startuml

[*]->mult:initialize
mult->mult:multiply
mult->d1:multiply
d1->default:multiply
default->[*]

@enduml

I'd be more than honored if you would give SCM a try. If you have any question, feel free to reach me at jdubray@xgen.io

jdubray
03/01/15

The Essence of (my) Life

Today was a beautiful day in the Puget Sound. I took this picture in downtown Bellevue and it got me thinking. 

 

The Essence of (my) Life

My kids are going to be 14 and 18 in just a few months, and I just crossed 50. So it's probably a good time to tell them the handful of things that I learned in life. 

1. There is nothing more powerful than Human Dignity

When you hesitate about which way to go, pick Human Dignity above all. There is never a time you will regret traveling that path, ever.

2. Be an Heartist

Saint-Exupery wrote in the Little Prince that "It is only with the heart that one can see rightly; what is essential is invisible to the eye". There is simply no amount of rationale that will ever weigh as much as a heart beat. 

3. Don't let "things" dictate what you do

There is no possession that's worth anything: a house, a car, a boat, jewelry whatever it is you think you should own, it's not worth the time you spend working for it. Yes, you need some money to live and raise a family, but there is always a way to make it work. Connecting with someone’s heart is priceless.

4.  Stay away from Sociopaths

It took me a while to understand that Sociopaths are actually quite common in our society. They are easy to spot once you understand how they operate. When you meet someone and they behave in a sociopathic way, run, run as fast and as far as you can

5. Live free

You can introspect everything you do, every action, every choice you make and figure out if it is dedicated to reach its purpose. There is nothing you are not in control of as long as you don’t react to what someone else does. Freedom is a state of mind, it is not about being able to do anything you want, it is about choosing everything you do, with purpose, heart and human dignity, in otherwords it is about becoming. 

I know that's pretty simple, perhaps even pathetically naive, but I don't have stronger certainties today. I know that's all I would have needed to know to guide my life properly.

____________

I'd like to add a couple of notes:

1) Seattle is a pretty interesting place with people like Bill Gates or Jeff Bezos who are constantly in the news. What's fascinating about them is that we can now see that their fortune means nothing. Bill Gates can buy Da Vinci's manuscripts, but he’ll never be his equal, not even a fraction. Bezos may have built the best commerce engine in the world and funded Cloud Computing, but where's the human component in Amazon? is Bezos shaping the future of humanity as algorithms, drones and robotic workers who help us buy more crap?

Even Bill Gates Fortune is puny when you put it in human terms. Bill Gates, the richest man in the world, who was able to collect for himself, one way or another, $70/Microsoft customer (~1 Billion) can only ... hire the extended city of Seattle (Everett, Bellevue, Tacoma) for one year! (700,000 people at $100,000 GDP per worker = $70B). That's it, 1 person out of every 1000 on earth would work for him for a year and he'll run out of money. In the end he'll just be remembered as the guy who brought us Windows 95, or Clippy, not to forget bullet point thinking. 

On the other hand there are also some very successful people who seem to live these values every day like Howard Schultz or companies like REI. What can be more satisfying than giving decent, human-centric, sustainable jobs across the world with health care and education?

2) At the time of this writing, I wanted to add that I am not able to see my children when they need me. I suggested their mother that a teenager has a complex enough life between school and friends that a parental schedule should not be a document that can dictate his life. He should be in charge of the schedule, not me, not his mother, he should be free to follow his heart, and become a fully functioning human on his term. His mother replied to me:

"You cannot decide when you will see [your son] or not.  I do not want that sort of thing for [him] as it can hurt him. I want him to have clear expectations. There is a schedule for that reason. All divorcee parents work with it. So we will do as everybody else. You will not rule. Bringing the argument that I don't want you to see [him] WILL NOT work."

This is what the parental plan my ex-wife demanded that I sign so she could maximize the amount of child support she could get:

"The father may have up to 8 hours of residential time with the children every weekend, provided that he gives mother 24 hours’ notice that he intends to exercise the time.

The father may also see the children every Wednesday, or another mutually agreed upon weekday, from 5 pm to 8 pm provided that he gives the mother 24 hours’ notice that he intends to exercise the time."

As part of the divorce settlement, my ex-wife, a finance manager in a large Telco, got our two houses and condo, with a rental income of $30,000/year. She also demanded to have full decision control over the children. 

During that beautiful day, yesterday, my son was left alone at home until 2pm as her mother had things to do. That is the kind people our society rewards, people who take everything, have no heart, no empathy and want to control everyone else's actions.

I am free.

 

Following my post last summer detailing how Chorus.js would solve the problem of expressing message formats in relation to a "Common Information Model", I am happy to report that I have completed the development of a DSL, which I believe is a major advance in the way we manage API, Resource and Service contracts (Yes Cord includes an equivalence between Resource-Orientation and Operation-Orientation, how could they not be equivalent?). 

This DSL solves a key problem of schema technologies (XSD, JSON-Schema,...) which cannot properly manage the relationship between a data model and message formats. These technologies are capable of either defining the structure of a data model, or the validation rules of message formats, but they are not capable of managing both at the same time. The flaw in these schema languages stems from the import mechanism which forces any change to the data model to become visible to message format definitions without any possible isolation. For more details, please read my earlier post.

The solution I developed, not only decouples the evolution of the data model from the message formats, but the DSL explicitely manages the versions of both the entities of the data model and the message formats, which is unheard of in most software technologies (Object Orientation, Functional Programming, ...). 

I always found it quite ironic for an industry which is proud to enable "change" that none of its technologies, languages and frameworks can deal with versioning (i.e. change) explicitely, let alone effectively. Versioning code, schemas, config files... ends up as a nightmare of epic proportion. This is particularly true of message formats.

The CIM DSL is part of the "Cord" language, developed as part of the Chorus.js project. The DSL was developed using Eclipse Xtext. Currently the plugin generates XSDs, WSDLs and PlantUML class diagrams. I am currently working on generating Swagger API definitions as well.

There are only four steps you need to do to use this tools:

  1. Download Eclipse Xtext here
  2. Install the Chorus.js plugin into Eclipse: http://www.chorusjs.com/latest/site.xml
  3. Create a new project 



  4. Add a new file to the project (with a .cord extension) and copy the code below (please copy/paste the code below in the .cord file, the WSDL/XSD generations happens automatically when you save the file): 
 
package cim {
	 
	native integer 
	native string 
	native boolean 
	native any 
	 
	entity Entity1 {
	       identifier id : integer { 
                 ## 'unique identifier' 
               } 
               prop1 : string 
               prop2 : string? 
               prop3 : integer 
               prop4 : string* 
               version 2.0 { 
                     ## 'Here are the properties of version 2' 
                     prop5_v2 : integer 
                     prop6_v2 : integer 
               }
        }
 
        entity Entity2 { 
               identifier id : integer 
               prop1 : integer 
               prop2 : Entity1 

               version 2.0 { 
                       //This version substitutes the Entity1 type 
 //for its 2.0 version 
 //note: the _v2 suffix is not required, it is just added for clarity 
                       alter prop2 { cast Entity1(2.0) } 
                       prop3_v2 : string 
               } 
     } 
     projection QueryEntity1[Entity1] { 
             // We alter the multiplicity and remove properties of Entity1 
             alter id { min 0 } 
             alter prop1 { min 0 } 
             - prop2 
             - prop3 
             - prop4 
     } 

     projection CreateEntity2[Entity2] { 
             alter id {min 0} 
      } 

      projection CreateEntity2_v2[Entity2(2.0)] { 
             alter id {min 0} 
      } 

     message QueryEntity1Request { 
             part input : QueryEntity1 
     } 
 
   message Entity1Response { 
       part output : Entity1 
   } 

   operation Entity1Response queryEntity1(QueryEntity1Request) { 
        GET 
   } 

   message CreateEntity2Request { 
        part input : CreateEntity2 
              
        //message versions match service interface versions 
 //this message type will be used to generate 
 //the v1.1 wsdl 
        version 1.1 { 
            //projections cannot be versioned 
 //we create a new projection  
 //which is based on Entity2 v2.0 
            part input : CreateEntity2_v2 
        } 
  } 

  message CreateEntity2Response { 
        part output : Entity2  version 1.1 { 
              part output : Entity2(2.0) 
        } 
  } 

  operation CreateEntity2Response createEntity2(CreateEntity2Request) 

  service example { 

       version 1.0 { 
              interface crud { 
                    queryEntity1 
                    createEntity2 
              } 
              port test { address 'v1/entities/' } 
       } 

       version 1.1 { 
              interface crud { 
                   queryEntity1 
                   createEntity2 
              } 
              port test { address 'v1.1/entities/' } 
        } 
   }
} 

Here is the PlantUML diagram which is generated from the CIM definition:

The cim.cord file can also be downloaded here.

The generated WSDLs file can be downloaded here (v1.0) and here (v1.1)

Here is the full documentation of the DSL:

 

I don't think there is an industry that is more pompous than the Information industry. Yet, the hype went up a notch lately: "I" no longer stands for Information but for Intelligence. I can already see the who's who of Gartner's new shiny IT magic quadrant!

But who could be so stupid as to believe that Artificial Intelligence is around the corner?

You want some tangible proof? just look at Google, the company who mastered Information, now touts its "Intelligence" in just about every network packet.

What happened this week is beyond me, and a sad reminder that we, as humans, need to understand what we are so "intelligently" designing.

I have run my blog ebpml.org since 2001. I am proud to be one of the oldest blog talking about Service Oriented Architecture. I never engaged in SEO for myself or someone else. I am a scientist, I respect people's ideas and work, and provide references (i.e. links) religiously. This is not true for everyone, recently someone took one of my posts to build half of a conference's presentation without any acknowledgement. I have no respect for that kind of behavior whatsoever. His name is not even worth mentioning here. Strangely enough, Google's algorithms don't pick up on such a despicable behavior. They might even reward it...

I just received the following response from Google after they had sent me an email complaining about a suspicious link on my blog (the name of the page was pointing clearly at a hack, and it was not even there), so after checking that the link was not there I asked Google for a "reconsideration request":

 

 

Yes, that was not good enough, the fact that the link was not there, was not enough to return to a clean status: I was "violating" Google's rules and 15 years of work needed to be thrown into darkness "to preserve the quality of Google's Search Engine".

A few days earlier, I had also received a panicked email from a web site I reference for the simple reason that I use their content for free, per their licensing policy.

 

 

And this is what happened to Jordi Cabot who runs a popular and very useful blog on Model-Driven Engineering

 

What's next? self-driving cars texting drivers up the road to Get out of the way, or else?

So here is what I have to say -if there is anyone from Google who reads this post:

Dear Google,

Do you think that as software engineers we can believe more than a nanosecond that humans should have to take any action to facilitate the work of your algorithms?

Do you think, as humans, it is fair that you tell us that "something is wrong with our online presence and you'll censor it if we don't change it"? When you say "violated", are you referring to some kind of crime? (bonus question: are you some kind of sociopath?)

While we are at it, I also wanted to thank you for giving us a glimpse at the world to come, we'll no longer have to deal senseless "processes" or even cruel "dictators", we'll have to deal with your "intelligent" algorithms.

Congratulations, you've just earned the #1 spot in the "superficial intelligence" magic quadrant. Your depth of vision and market leadership is simply in a class of its own.

Sincerely,

Jean-Jacques Dubray

P.S.: I never used Bing before, but it is now my default search engine. Bing Maps are actually a log better than Google Maps.

 

 

Abstract

The goal of this post is to start developing the metamodel of computing, i.e. when you write an algorithm, or any piece of code, what are the foundamental elements that you need to build and assemble to achieve the desired goal(s). The post itself was inspired by, and heavily relies on, Dr. Lamport's article on "Computation and State Machines".

The major issue discussed here is the concept of state. The general view of "state" in our industry is "an assignment of variables". "Control state" (the state that enables or disables actions) is thus reified as a variable assignment. That definition appears erroneous, and the reification unnecessary. We suggest that assignment of variables are just that, and inherently controlled by  a "Type", while (control) state can be more effectively associated to ranges of values of specific properties. From there, we derive a simple metamodel for computing where State, Type, Actions are precisely articulated, without reification, and provide a DSL to illustrate it.

_________

I have always wondered about the relationship between Mathematics and Computing. This spectacular paper from Dr. Lamport: Computation and State Machines provides clear insights about that question. Anyone who writes software should read it. 

The core thesis of the paper, and the foundation of TLA+, is that an expression (in a programming language) can be viewed as a transition from one state to another. I am at odd with that thesis and its main corollary:

a state is an assignment of values to all possible variables.

I understand that it is difficult to disagree with such a statement, however, there is an example in the paper, the Factorial function example, that illustrates well the point I am trying to make.

Beyond this axiomatic definition of state, Dr. Lamport uses that example to ask several important questions:

we must decide what constitutes an [action]. How many [actions] are performed in executing a statement such as f = i ∗ f?

Does a single [action] change the value of f to the product of the current values of i and f ?

Are the evaluation of i ∗ f and the assignment to f two [actions]? Are there three actions? the reading of i, the reading of f, and the assignment to f ?

[Since] there is no standard way to describe a relation on the set of states, [should we] use a transition predicate, which is [defined as] a Boolean-valued formula N whose free variables are primed and/or unprimed state variables?

 
Fig. 1 - The Factiorial Algorithm (a- Program, b- Corresponding State Machine, c- TLA+ implementation)

 

Dr. Lamport uses labels (Fig. 1a) to align the computation of the algorithm with the states of the corresponding state machine (incidentally, I have a new appreciation for the infamous goto statement...). 

The state machine (Fig. 1b) is my representation of the algorithm's state machine. Nothing out of the ordinary: an initialize action bootstraps the process and the transitions operate until the test automatically transitions to done (i.e.  when i < 2). The multiply action performs the computation:

            f = i-- * f; 

The question that comes to mind is, assuming a state to be "an assignment of all possible variables": what (control) state are you in at the end of that computation? When Dr. Lamport translates the factorial algorithm in a TLA+ definition (Fig. 1c), he is forced to introduce a new variable (pc) which possible values are exactly the set of (control) states of the corresponding state machine (test, mult and done). In one of the simplest case possible, the definition of a state (as an assignment of values), fails to be correct, unless you introduce a new variable which possible values are the (control) states themselves. I believe that's a reification and that reification leads to a definition of state that is incorrect. If that definition is incorrect, the key implication is that an expression is not a transition. I would actually argue that an expression is an action.

In the TLA+ example, we can see that variables (i.e. types), states, actions and transitions are fully coupled. This is an unwanted and unnecessary coupling. It may well be mathematically correct (i.e. it does not violate the semantics of mathematics), but fails to represent the semantics of states, types, actions and transitions.

As I suggested in my previous post, the semantics of the system should include a "type" which is solely responsible for deciding which state the state machine is in once an action has been applied, and new values have been assigned to the variables.

Fig. 2 - Proposed Metamodel of Computation (BOLT notation)

With these kinds of semantics, the factorial algorithm could be represented as:

/// Types
type Factorial { 
     f : int  
     i : int  
     map{ 
         default:done, ///This is the default state
         mult: i>1   /// A State is defined as a range of values on properties
    }
}  

/// States
state mult { 
   ///action(s) enabled in the mult state
   multiply
}
state done { 
   ///action(s) enabled in the done state
   initialize
}

/// Actions
action initialize { 
      f' = 1
      i' = 7
} 
action multiply {
     f' = f*i
     i' = i-1
}

The system is by default in state "done" which allows a single action: initialize (which in the real world would be parameterized), when the system is in the state "mult" the action multiply can be applied, until it reaches the state "done", in which case, the system can only be reinitialized. Incidentally, I prefer the notion of default state to init state. 

You may also have noticed that the synthetic state "test" is no longer part of the definition of the system. I strongly question the fact that "test" could be a state at all since its meaning is quite different in nature from "done" or "mult". If it is not, then why this value can be assigned to the same variable as "done" and "mult"?

This new (meta)model no longer requires that "IF ... THEN ... ELSE" statements "be added to mathematics" as it was necessary when Dr. Lamport designed TLA+. As a matter of fact, they can be re-interpreted, mathematically, as "ranges", the predicate i > 1 simply means, when i is in the range [2,], we are in the "mult" state, i.e. they are the very fabric of "state".  

The next question is then: what is the best (hence lowest) coupling between action and types? In other words, are actions only coupled to a subset of the type properties, or are they requests to type operations? For instance, to align with Abstract Data Types, I could have written the system's description as:

/// Types
type Factorial { 
     f : int  
     i : int  
     map{ default:done, mult: i>1}
 operation _multiply {
        f' = f*i
        i' = i -1
     }
 operation _initialize { 
        f' = 1
        i' = 7
     } 
}  

/// States
state mult {
   multiply 
}
state done { 
   initialize 
}

/// Actions
action initialize { 
      _initialize
} 
action multiply {
     _multiply
}

I prefer the former, which is far better aligned with Functional Programming, and provides the lowest coupling with a type (since I could reuse actions over different types which share the same set of properties).  The resulting property values are passed to the type (as a request to change its state) which validates a set of constraints (not represented here) before it decides to accept (or refuse) the new values.

With the exponential growth of Compute, Network and Data resources (e.g. Cloud, Mobile, IoT...) the complexity of distributed systems is about to increase by an order of magnitude or more. I suggest that it may be time to realign the foundation of computing towards a proper articulation of its core concepts (Fig.3): State, Type, Action and Relationship (STAR). As Dr. Lamport suggests:

Computer Science should be about concepts, not languages.

Fig. 3 The Double-Symmetry of the core concepts of Computation

1 3 4 5 6 7 8 9 10 11 ... 24

Search

 Subscribe







blog engine