Elegant Coding: January 2012

29 January 2012

Abstract Algebra: The Logic of Odd and Even

Steven Strogatz wrote a series of math-related articles for the New York Times website, one of which is called "Rock Groups". In it he looks at adding integers visually, with respect to odd and even, ventures into the territory of triangles and squares, and visually portrays the closed formula for computing triangular numbers. My previous post on triangular numbers in some ways parallels his "Rock Groups" article, at least with respect to the visual portrayal of triangular numbers. This post also parallels ideas in that article, but where his observations are geometric, these are algebraic observations about, among other things, adding and multiplying even and odd numbers, which work as follows:

+	*
Even + Even = Even	Even * Even = Even
Odd + Even = Odd	Odd * Even = Even
Odd + Odd = Even	Odd * Odd = Odd
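
The six rules above can be checked mechanically. The following is only a minimal sketch: the names `parity`, `ADD_RULES`, `MUL_RULES` and `rules_hold`, as well as the sample range, are illustrative choices and not part of the original discussion.

```python
from itertools import product


def parity(n: int) -> str:
    """Return "Even" or "Odd" for an integer (hypothetical helper)."""
    return "Even" if n % 2 == 0 else "Odd"


# Expected parity of a sum and of a product, keyed by the operand parities.
ADD_RULES = {
    ("Even", "Even"): "Even",
    ("Odd", "Even"): "Odd",
    ("Even", "Odd"): "Odd",
    ("Odd", "Odd"): "Even",
}
MUL_RULES = {
    ("Even", "Even"): "Even",
    ("Odd", "Even"): "Even",
    ("Even", "Odd"): "Even",
    ("Odd", "Odd"): "Odd",
}


def rules_hold(limit: int = 20) -> bool:
    """Check both rule tables for every pair of integers in [-limit, limit]."""
    numbers = range(-limit, limit + 1)
    return all(
        parity(a + b) == ADD_RULES[(parity(a), parity(b))]
        and parity(a * b) == MUL_RULES[(parity(a), parity(b))]
        for a, b in product(numbers, repeat=2)
    )


print(rules_hold())
```

Checking a finite range is of course not a proof, but it makes it easy to see that the parity of the result depends only on the parities of the operands.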

Now this doesn’t really give one a sense of what is going on here. It becomes more interesting if we use a Cayley Table, which can also be called an operation table in that it shows the details of an operator. We now have a better depiction of the Group for Addition (+) with Even as the Identity Element and the Monoid for Multiplication (*) with Odd as the Identity Element:

+	Even	Odd
Even	Even	Odd
Odd	Odd	Even

*	Even	Odd
Even	Even	Even
Odd	Even	Odd

Now that still may not be very intuitive. However, if you convert Even to Zero (0) and Odd to One (1) you get the following, possibly more familiar, Group with 0 as the identity and Monoid with 1 as the identity, respectively:

+	0	1
0	0	1
1	1	0

*	0	1
0	0	0
1	0	1
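
Tables like these can be generated rather than written by hand. This is a small sketch under the assumption that the operations are addition and multiplication modulo 2; `format_table`, `add_mod2` and `mul_mod2` are hypothetical names introduced for illustration.

```python
from typing import Callable, Sequence


def add_mod2(a: int, b: int) -> int:
    """Addition in Z2."""
    return (a + b) % 2


def mul_mod2(a: int, b: int) -> int:
    """Multiplication in Z2."""
    return (a * b) % 2


def format_table(symbol: str, op: Callable[[int, int], int],
                 elements: Sequence[int]) -> str:
    """Render a tab-separated Cayley table: row element op column element."""
    header = "\t".join([symbol] + [str(e) for e in elements])
    rows = [
        "\t".join([str(a)] + [str(op(a, b)) for b in elements])
        for a in elements
    ]
    return "\n".join([header] + rows)


Z2 = (0, 1)
print(format_table("+", add_mod2, Z2))
print()
print(format_table("*", mul_mod2, Z2))
```

The same function works for any finite set and any binary operation, so it can also be used to tabulate the Boolean operators discussed later.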

In the Group, 1 is its own inverse: 1 + 1 = 0. In general these "types" of Algebraic structures consist of a set and a binary operator under which the set is closed, so I am using the notation (Operator, {Set}) to denote them. Since the sets here are small I can just list the elements, but you could also write (+, Z) where Z is the set of integers. This is in fact a well known group, addition on the set of integers, but in that case the set is infinite, in contrast to our finite sets. We can describe the above as the Group (+, Z₂) and the Monoid (*, Z₂), where Z₂ is the set {0,1} of the Integers Modulo 2, or simply the Binary digits of Base 2.

Now let’s do another transformation on our Monoid and Group: we will map Zero (0) to False (F) and One (1) to True (T). We will also change our operators: + now becomes Exclusive-Or (Xor) and * now becomes Logical Conjunction, aka Logical And.

Xor	F	T
F	F	T
T	T	F

And	F	T
F	F	F
T	F	T

All of these transformations are actually Isomorphisms, and the two structures combine to form a Ring. This Ring is also a Field, the smallest Finite Field, GF(2). I have a follow-up for this post in which I will get into these and other ideas in more detail.
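
To make "isomorphism" concrete: a mapping f between two structures is an isomorphism when it is one-to-one and f(a op b) = f(a) op' f(b) for every pair of elements. The sketch below checks this by brute force for the 0/1 to F/T mapping. The function `is_isomorphism` and the dictionary `TO_BOOL` are assumptions made for illustration, and the target set is simply taken to be the image of the mapping.

```python
import operator


def add_mod2(a: int, b: int) -> int:
    return (a + b) % 2


def mul_mod2(a: int, b: int) -> int:
    return (a * b) % 2


def is_isomorphism(mapping, op_a, op_b) -> bool:
    """Check that mapping is one-to-one and carries op_a onto op_b."""
    elements = list(mapping)
    one_to_one = len(set(mapping.values())) == len(elements)
    preserves_operation = all(
        mapping[op_a(x, y)] == op_b(mapping[x], mapping[y])
        for x in elements
        for y in elements
    )
    return one_to_one and preserves_operation


TO_BOOL = {0: False, 1: True}

print(is_isomorphism(TO_BOOL, add_mod2, operator.xor))   # + corresponds to Xor
print(is_isomorphism(TO_BOOL, mul_mod2, operator.and_))  # * corresponds to And
print(is_isomorphism(TO_BOOL, add_mod2, operator.or_))   # + does not correspond to Or
```

The third check fails because 1 + 1 = 0 maps to F, while T Or T is T.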

Some basic algebraic structures are:

Magma or Groupoid – A closed binary operation on a set.

Semigroup – A closed binary operation on a set, with associativity. (An associative Groupoid)

Monoid – A closed binary operation on a set, with associativity and an identity element. (A Semigroup with an identity element)

Group – A closed binary operation on a set, with associativity and an identity element. Each element has an inverse element. (A Monoid with inverses)

Commutative (Abelian) – used to describe any of the above with the commutative property. These two terms are interchangeable, but the term abelian¹ in abstract algebra is used mostly just for groups.

	Closed	Associative	Identity	Inverse
Magma	X
Semigroup	X	X
Monoid	X	X	X
Group	X	X	X	X
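
For a finite set, the table above can be read as a checklist and applied by brute force. The following is a sketch only: `classify` is a hypothetical helper, and the example operations are chosen just to land one structure in each row.

```python
from itertools import product


def classify(op, elements) -> str:
    """Return the strongest structure from the table that (op, elements) satisfies."""
    elements = list(elements)
    if not all(op(a, b) in elements for a, b in product(elements, repeat=2)):
        return "not closed"
    associative = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    if not associative:
        return "Magma"
    identities = [
        e for e in elements
        if all(op(e, a) == a and op(a, e) == a for a in elements)
    ]
    if not identities:
        return "Semigroup"
    identity = identities[0]
    has_inverses = all(
        any(op(a, b) == identity and op(b, a) == identity for b in elements)
        for a in elements
    )
    return "Group" if has_inverses else "Monoid"


def nand(a: bool, b: bool) -> bool:
    return not (a and b)


def left_projection(a: int, b: int) -> int:
    """Always returns its first argument: associative, but has no identity."""
    return a


print(classify(lambda a, b: (a + b) % 2, (0, 1)))  # addition mod 2
print(classify(lambda a, b: (a * b) % 2, (0, 1)))  # multiplication mod 2
print(classify(left_projection, (0, 1)))
print(classify(nand, (False, True)))
```

Commutativity is deliberately left out of `classify`, since it is a separate property that any of these structures may or may not have.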

Now Commutative is more of a property of the above: as we saw, the String Monoid is Non-Commutative, whereas the Group (+, Z₂) and the Monoid (*, Z₂) are both Commutative.

Going back to Boolean Logic, if you have had any exposure to it things may now look amiss with regard to operators: what about the Or operator? Actually Xor can be defined in terms of Logical Conjunction and Logical Disjunction, with the help of negation, as follows:

(A Xor B) = (A Or B) And Not (A And B)

Our Cayley Table for Logical Disjunction (Or) is:

Or	F	T
F	F	T
T	T	T

Now this table is almost the same as Xor with one exception: (T Or T) = T vs. (T Xor T) = F. We can still call F an identity since it combines with T to give T (and with F to give F), but you can see that with that one change we lost our inverse for T, so this is now "downgraded" to Monoid status.
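
Both points, that Xor can be built from Or, And and Not, and that T loses its inverse under Or, can be confirmed over the four possible input pairs. This is a minimal sketch; the function names are illustrative and the particular Xor expression used is one standard choice among several equivalent ones.

```python
from itertools import product

BOOLEANS = (False, True)


def logical_or(a: bool, b: bool) -> bool:
    return a or b


def logical_xor(a: bool, b: bool) -> bool:
    return a != b


def xor_from_and_or_not(a: bool, b: bool) -> bool:
    """Xor written as (a Or b) And Not (a And b)."""
    return (a or b) and not (a and b)


def has_inverse(op, element: bool, identity: bool) -> bool:
    """True if some other element combines with `element` to give the identity."""
    return any(op(element, other) == identity for other in BOOLEANS)


xor_agrees = all(
    xor_from_and_or_not(a, b) == logical_xor(a, b)
    for a, b in product(BOOLEANS, repeat=2)
)
print(xor_agrees)
print(has_inverse(logical_xor, True, False))  # T Xor T = F, so T has an inverse
print(has_inverse(logical_or, True, False))   # nothing Or-ed with T gives F
```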

Now if we add Logical Conjunction (And) back we get the familiar values in Cayley tables:

Or	F	T
F	F	T
T	T	T

And	F	T
F	F	F
T	F	T

Above we noted that the Group (+, Z₂) and the Monoid (*, Z₂) formed a Ring. Since Or is only a Monoid, Or and And now combine to form a slightly different structure, a Semiring.

Another interesting observation about these sets of operators comes from a logical standpoint. The combination {And, Or} is not functionally complete, which means that there exist truth functional statements that cannot be expressed by these two operators alone. This can be fixed by adding negation, which yields a truth functionally complete set of two dyadic and one monadic² operators: {And, Or, Not}, a pretty well known and familiar set of logical operators. The Ring operators from above, {Xor, And}, become functionally complete once the constant True is available, since (A Xor T) gives Not A. It turns out that just {And, Not} and just {Or, Not} are also each functionally complete sets, as are the one element operator sets {Nand} and {Nor}, which is why digital logic circuits can be constructed solely with Nand Gates.
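
As an illustration of what functional completeness means for {Nand}, here is a sketch that rebuilds Not, And, Or and Xor using nothing but a `nand` function and then compares each against Python's built-in Boolean behaviour. The constructions are standard gate identities; the function names are hypothetical.

```python
from itertools import product

BOOLEANS = (False, True)


def nand(a: bool, b: bool) -> bool:
    return not (a and b)


def not_(a: bool) -> bool:
    return nand(a, a)


def and_(a: bool, b: bool) -> bool:
    return not_(nand(a, b))


def or_(a: bool, b: bool) -> bool:
    return nand(not_(a), not_(b))


def xor_(a: bool, b: bool) -> bool:
    shared = nand(a, b)
    return nand(nand(a, shared), nand(b, shared))


def not_via_xor(a: bool) -> bool:
    """Negation from Xor, which needs the constant True."""
    return a ^ True


def all_match() -> bool:
    """Compare every Nand-built operator with the built-in equivalent."""
    unary_ok = all(
        not_(a) == (not a) and not_via_xor(a) == (not a) for a in BOOLEANS
    )
    binary_ok = all(
        and_(a, b) == (a and b)
        and or_(a, b) == (a or b)
        and xor_(a, b) == (a != b)
        for a, b in product(BOOLEANS, repeat=2)
    )
    return unary_ok and binary_ok


print(all_match())
```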

The thing that bothered me the most was that And and Xor had this nice mapping from Boolean Logic with {T,F} to Binary Integer Arithmetic with {0,1}, but Or seemed to be an oddball. Then I realized that you can remap both Logical Conjunction and Logical Disjunction to the Numerical functions Min and Max, respectively:

Max(0,0) = 0

Max(0,1) = Max(1,0) = 1

Max(1,1) = 1

Min(0,0) = 0

Min(0,1) = Min(1,0) = 0

Min(1,1) = 1

In Cayley table form, which better shows the isomorphisms (≅ means "is isomorphic to") with our Cayley tables above, (And, {T,F}) ≅ (Min, Z₂) and (Or, {T,F}) ≅ (Max, Z₂):

Max	0	1
0	0	1
1	1	1

Min	0	1
0	0	0
1	0	1

Interestingly this mapping is effectively the definition of Fuzzy Logic restricted to the discrete case (two valued logic). Boolean Logic also maps to Lattices, with conjunction being meet and disjunction being join; the min and max ideas give a nice intuition in relation to ordering.
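
The Min/Max correspondence is easy to confirm on {0, 1}, and the same two functions extend to truth values between 0 and 1, which is the sense in which they relate to fuzzy logic. The sketch below assumes the common min/max formulation of fuzzy And and Or; `fuzzy_and` and `fuzzy_or` are illustrative names.

```python
from itertools import product

BITS = (0, 1)

# On {0, 1}, Min behaves exactly like And and Max exactly like Or.
min_is_and = all(
    bool(min(a, b)) == (bool(a) and bool(b))
    for a, b in product(BITS, repeat=2)
)
max_is_or = all(
    bool(max(a, b)) == (bool(a) or bool(b))
    for a, b in product(BITS, repeat=2)
)


def fuzzy_and(x: float, y: float) -> float:
    """Min-based conjunction for truth values in the interval [0, 1]."""
    return min(x, y)


def fuzzy_or(x: float, y: float) -> float:
    """Max-based disjunction for truth values in the interval [0, 1]."""
    return max(x, y)


print(min_is_and, max_is_or)
print(fuzzy_and(0.3, 0.8), fuzzy_or(0.3, 0.8))
```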

To recap, we saw a number of Group and Monoid Isomorphisms (remember Z₂ = {0, 1}):

(+, Z₂) ≅ (+, {Odd, Even})

(*, Z₂) ≅ (*, {Odd, Even})

(+, Z₂) ≅ (Xor, {F, T})

(*, Z₂) ≅ (And, {F, T})

(Or, {F, T}) ≅ (Max, Z₂)

(And, {F, T}) ≅ (Min, Z₂)

We also saw Ring Isomorphisms and a Semiring Isomorphism:

[(+, Z₂), (*, Z₂)] ≅ [(+, {Odd, Even}), (*, {Odd, Even})]

[(+, Z₂), (*, Z₂)] ≅ [(Xor, {F, T}), (And, {F, T})]

[(Or, {F, T}), (And, {F, T})] ≅ [(Max, Z₂), (Min, Z₂)]
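
What ties each pair of operations together into a Ring or Semiring is distributivity: the "multiplication" must distribute over the "addition". A brute-force check, with `distributes` as a hypothetical helper, might look like this:

```python
import operator
from itertools import product

BOOLEANS = (False, True)


def distributes(mul, add, elements) -> bool:
    """True if mul distributes over add from both the left and the right."""
    return all(
        mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
        and mul(add(b, c), a) == add(mul(b, a), mul(c, a))
        for a, b, c in product(elements, repeat=3)
    )


print(distributes(operator.and_, operator.xor, BOOLEANS))  # the Ring pairing
print(distributes(operator.and_, operator.or_, BOOLEANS))  # the Semiring pairing
print(distributes(operator.xor, operator.and_, BOOLEANS))  # roles swapped
```

With the roles of Xor and And swapped the check fails, which is why And plays the part of multiplication and Xor the part of addition, and not the other way around.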

I have always felt that examples are helpful and I think these give some sense of the diversity that can be found in just a small corner of math. In researching this I came across some interesting facts which also highlight this diversity. For example, the Nand operation is not associative but is commutative, so it’s a commutative Groupoid. The Octonions, which are hypercomplex numbers, are neither associative nor commutative under multiplication but have an identity of 1, making them a non-commutative Groupoid with an identity. My point here is that the above table of basic algebraic structures, which makes everything look so hierarchical, is deceiving, as the diversity does not seem to be constrained in any way, as far as I can tell.

¹ This term is named after the Norwegian mathematician Niels Henrik Abel, who developed Group Theory independently of Galois and, like Galois, tragically died young (<30).

² Dyadic means takes two operands, e.g. And and Or, and monadic means takes one operand, in this case the negation (Not) operator. This should not be confused with other uses of the term Monad.

24 January 2012

Real Software Engineering: The Good, The Bad, and The Ugly

Deconstructing Glenn Vanderburg’s "Real Software Engineering"

Many years ago when I started my career, fresh out of school with a Bachelor's in Computer Science, I chose to answer that ubiquitous question "What do you do?" with the title of "Software Engineer". Now you must realize that this was back at a time when any technical title, including Programmer, was as rare and mysterious as any engineering field to the average person, who had probably never used a computer, unlike today when everyone seems to have one in their hand at all times. When I was younger I always felt weird calling myself a programmer, and I had a number of associations with other engineering disciplines, including starting my college studies in Chemical Engineering. Actually I was a chemistry nerd before I was a computer nerd, which you might have guessed if you read my previous post on Software Engineering. I also felt weird and perhaps even a bit phony using the title Software Engineer, and I still do, so now I mostly go by Software Developer.

My original foray into the contemplation of Software Engineering led me to do some more research, which yielded an enlightening and informative presentation by Glenn Vanderburg called "Real Software Engineering" (PDF slides are also available), which attempts to explore ideas of what a real discipline of Software Engineering might be. I strongly recommend watching it. It’s about 52 minutes long. I feel that his presentation has some very good points but it also has some flaws and biases.

The good

He starts with a critique about the state of software engineering today, which includes the following points:

Doesn’t reliably control costs.

Doesn’t reliably produce quality software.

Sometimes doesn’t produce software at all.

Software Engineering doesn’t work.

The term engineering is reserved for practices that work, a good capsule definition of engineering independent of any particular discipline.

The set of practices and techniques that have been determined to work reliably through experience.

Software engineering is a caricature of engineering.

In software we have these practices that don’t work and that is called engineering

I find myself agreeing with these, and sadly Software Engineering does seem to be a ridiculous farce that often seems to resemble a real engineering practice only minimally.

He then goes on to talk about the NATO Software Engineering Conference in 1968, and traces the origin of the waterfall process to a misinterpretation of the paper "Managing the Development of Large Software Systems" by Dr. Winston W. Royce, who it seems inadvertently started the waterfall process through what might be called "Poor Information Design".

He continues with a quote from Parnas that captures the document oriented path into engineering:

In Engineering ... people design through documents.

-Parnas

This is followed by a list that might be considered failed attempts at such an approach:

CASE Tools

OOSE

VDM

UML

MDA

OMT

Booch

He refers to this as:

Modelling and what I call ‘Math Envy’

While I agree with this last statement in practice, I feel that he goes too far with this opinion, more on that later.

His presentation provides a good historical survey of Software Engineering with respect to the 1968 NATO Software Engineering Conference and is worth watching for that alone. He continues with some great observations about engineering in general:

Engineering is a creative activity it’s about designing and building and creating new things and there are always blind alleys and missteps and mistakes and discoveries and reactions to those reactions and adjustments.

I especially like the trial and error aspect in relation to creativity. I think this is a fundamentally important and potentially overlooked aspect of software development.

The following succinctly sums up what might be the kernel of general engineering:

Different Engineering Disciplines are Different

Different materials, physical effects, forces

Different degrees of complexity in requirements, designs, processes, and artifacts

Varied reliance on formal modeling and analysis vs. experimentation, prototyping, and testing

Varied use of defined and empirical processes

This leads to a comparison between the empirical vs. formal predefined models:

The defined process control model requires that every piece of work be completely understood. A defined process can be started and allowed to run until completion, with the same results every time.

The empirical process control model provides and exercises control through frequent inspection and adaptation for processes that are imperfectly defined and generate unpredictable and unrepeatable outputs.

He then takes the following ideas from Structural Engineering:

Structural engineering is the science and art of designing and making, with economy and elegance, […] structures so that they can safely resist the forces to which they may be subjected.

—Structural Engineer’s Association

Science and art, it employs the findings of science and yet is done with creativity "designing and making" engineers don’t sketch something on paper and validate it with math and throw it over the wall and never look at it again they participate in the building of what they design structural engineers have hard hats in their office because they visit the site and talk to the construction people and work with them during the process. "With economy and elegance" cost is always an object.

From these ideas he derives a brief description of Software Engineering:

Software engineering is the science and art of designing and making, with economy and elegance, systems so that they can readily adapt to situations to which they may be subjected.

With the following caveat:

Software engineering will be different from other kinds of engineering.

We know that software engineering will be different from other kinds of engineering we know that because other kinds of engineering are different from each other. Software systems are usually more complex in terms of number of moving parts and things like that than artifacts in other engineering disciplines we’re not working with physical materials we have different laws we work with and it is rare for a software project to run up against the limitations of physical materials, very rare.

He reiterates this from the Structural Engineering piece:

Cost is always an object.

This is more eloquently stated by Arthur Mellen Wellington:

Engineering is not the art of constructing. It is rather the art of not constructing: or, it is the art of doing well with one dollar what any bungler can do with two.

—Arthur Mellen Wellington

The following are elements that in part drive the agile philosophy:

The source code is the model and the document.

Source code is the detailed design

The design and construction of software is tightly coupled.

He draws the following inferences from the above:

Encourage continued innovations in processes

And do like other engineering disciplines do. Take the processes that work and call those engineering.

Agile software development, far from being anti-engineering, represents the responsible application of engineering ideals to the field of software.

I know that was a lot of quotes with sparse commentary, but it seemed like the best way to capture the really good points that he makes in his talk and again I am trying to create a written artifact that I can reference and build upon in the future.

The Bad

One thing that I really don’t like about his presentation is the comparison of the two bridge builders: Robert Maillart who designed the Valtschielbach Bridge in Switzerland among others and Leon Moisseiff.

The distinction being that Maillart used testing, and all but one of his bridges are still standing, while Moisseiff used modeling, most notably Deflection Theory, and was responsible for designing the ill-fated Tacoma Narrows Bridge, aka Galloping Gertie, possibly one of the most famous engineering disasters in history. Moisseiff was also involved in the design of the Manhattan Bridge, which is still standing, and his Deflection Theory was used in the design of the Golden Gate Bridge, which was later determined to be of sound design and is also still standing. The Frommer's travel guide considers the Golden Gate Bridge "possibly the most beautiful, certainly the most photographed bridge in the world". What bridge did Maillart design?

I will not dispute that the collapse of the Tacoma Narrows Bridge was probably one of the most famous, if not the most famous, engineering disasters of all time, and Moisseiff certainly suffered as a result: his career was shattered and he died shortly thereafter. Wikipedia states: "Though he [Moisseiff] had designed many other famous spans, the Narrows collapse overshadowed them all. It became a symbol of failed engineering and the dangers of arrogance in design." I am in no position, nor do I care, to debate the soundness of deflection theory or the reasons for the collapse. Actually it seems that the reasons for the collapse are pretty complicated.

From my limited understanding of Civil Engineering it seems that wind is a continual problem for Civil and Structural Engineers. I know that the John Hancock Tower in Boston was plagued by problems where wind-induced twisting caused large plate glass windows to careen hundreds of feet to the street, making it a deadly hazard. These were replaced with plywood, earning it nicknames like the "Plywood Palace" and causing it to be condemned as a fire hazard. Eventually these problems were resolved and it won awards for enhancing the Boston skyline. More recently, only twelve years ago in 2000, the Millennium Bridge in London exhibited resonance problems similar to those of Galloping Gertie; it was shut down and reinforced. So while all of this makes for interesting history of engineering fodder and good showmanship, I am not sure how directly relevant it is to software engineering. Again, I agree with his assertion that testing, though not necessarily TDD in my opinion, is most likely a very important component in creating a real software engineering discipline, but I feel this particular comparison is flawed, as it is completely irrelevant to software engineering and in no way can be used to conclude that TDD should be used in software engineering. Also he contradicts his own argument with the statement, from above:

Software engineering will be different from other kinds of engineering.

He also contradicts this in his slides:

Software is very unlike bridges and buildings.

He ends his presentation abruptly with:

Today software engineering is called Agile.

I hate to say it, but the way he states it sounds a bit like "Mission Accomplished" to me. I may be misinterpreting him here, but to me it sounds like his whole presentation is designed to conclude that Agile and TDD are the solution to the problem. I agree that they are probably the best solutions to the problem today, but I think they are severely lacking as an effective and precise framework for a real engineering discipline. I think we still need more cowbell.

The Ugly

I feel that his presentation has anti-math and anti-academic undertones and I have to say: "No Sir I don’t like it." The two main themes that I take issue with are:

He mostly dismisses the work of non practitioner academics.
He dismisses Math and views it as only a cost saving tool.

At various points he states the following:

Advances come from practitioners not from academia.

"Advances come from practitioners." And then are brought into academia.

Important advances often from practitioners not necessarily from academia and they are brought into the academic world and are refined and so forth. But they come from practitioners.

They were practitioners … and there were a few academics, but the academics had good things to say as well.

Perhaps I am misinterpreting his tone here, but in the last statement he seems surprised that the academics had something of value to contribute. Actually I feel that he has this somewhat backwards. For example, I think the work that companies like Google are doing and the general trend in analytics fundamentally contradict this sentiment. I think the way that both the fields of Computer Science and Software Engineering will inevitably move forward in both efficiency and efficacy will be through more collaboration between practitioners and academicians, with both groups refining each other’s ideas. I anticipate and hope this is a trend that we will see more of. In fact some really interesting things are happening in that regard. One interesting example is the work that Mike McCandless has done with Lucene using Finite State Transducers, which was based on academic work, not to mention the Entropy-Framework relationship that we have already looked at.

He goes on to say:

Mathematical Models were introduced to engineering as a cost saving tool. If you hear software engineering people talk about math and modeling, it would be easy to get the idea that it’s the only way to do things right, that it is a tool for robustness, and that you don’t know it really works unless you have math to prove it. I’ve heard all of those statements. But that’s not why it was introduced to engineering at all. It was introduced to save cost, because in other engineering fields they are working with real materials and things that require people and labor to go out and build, and building prototypes and testing them, especially at scale, is extremely costly. If you can build a model, an approximation of reality, you can save money.

Ask your friendly neighborhood aerospace engineer how much math he would do if building a model of his design and testing it were effectively instantaneous and free. The answer would probably not be none, but a lot less than I do now.

I am not sure if I agree with the idea, that seems to be implied, which is that mathematical models are strictly a cost saving measure.

Towards the end he states the following are approaches needed to advance software engineering:

Not Math

Not Models

Not Documents

Not copying other disciplines

Learn From practitioners

Bias towards empirical processes

Ironically he mentions the following:

Programming languages are formal languages some of them even have mathematically specified semantics ... and we understand the math behind them at least as well as your average structural engineer understands the math behind the procedures that he was taught to do to validate his designs.

Programs themselves are models of the solution to the problem, our very tools are mathematical in nature.

In the slides he has the following quote, which I think he skips in his presentation due to time constraints, but I believe it is intended to reinforce his anti-math argument:

What is needed is not classical mathematics, but mathematics. Systems should be built at levels and modules, which form a mathematical structure.

—Friedrich Bauer

I was unable to find it elsewhere, so I don’t know the context of the Friedrich Bauer quote, but I find the quote interesting. I feel the key here is: "not classical mathematics, but mathematics". My interpretation of this is that he is saying that math is important, but not the math many people think about, hence "not classical mathematics".

In Conclusion

I hope I didn’t come across as too harsh on Glenn Vanderburg but I am just being honest and as he said about Bruce Eckel, "He said it so I get to pick on him".

The real strength of his presentation is that he really does step back from software and give some perspective on general engineering. While we can all debate the utility of comparing different engineering fields, it does hold that they all have several common threads, including the application of artistry and creativity to design and construction, the objective to design and build things, working within certain constraints including cost, and the organization of people and resources to accomplish the tasks of design and construction.

I admit that my experience with real Agile is limited, and while I like what I have seen, I still doubt that it truly provides much beyond an empirical approach. It seems that its focus is primarily on the process of creating software, which is not necessarily a bad thing. The structural aspect of software is not ignored by Agile, but I do feel that the structural aspects of software are where we currently lack a real engineering framework. This is really what needs to become more established. Ideas like Coupling, Cohesion, Refactoring, Patterns, Anti-patterns, etc. are the state of the art in terms of structure, but they seem too "squishy", which leads me to ask the question: is it not possible to have more comprehensive math backed concepts like the types of things you find in other engineering disciplines?

The answers to these questions in the past have been dominated by ideas like formal methods and metrics, ideas which have mostly failed to live up to their promise. However, that does not mean that we should completely discount these and similar ideas in the future. These and other more math oriented approaches may represent a failed lineage that will go extinct, or they may represent the nascent ideas from which we are only now developing the theoretical and mathematical ideas to truly conquer these problems. For example, the NATO Conference of 1968 is often cited as a milestone in the development of software engineering. It was only about twenty-three years prior, in 1945, that Category Theory was introduced. I wonder how many attendees or general practitioners knew about it, unlike today where it seems that almost every software blogger writes about Monads and many also mention Category Theory. I feel that the new era of software engineering will be forged from math, but it won’t be "Classical Math" like Calculus and Algebra over the Field of Real Numbers. It will be from areas like Graph Theory, Lattice Theory, Category Theory, Information Theory and other related disciplines, and not only will it be built on those disciplines but also on the interrelation of those disciplines. I may be wrong about this, but I have seen research that seems to point in these directions, ideas I will explore in future posts. It seems crazy to assume that the failures of the past imply failures of the future. If math can make software cheaper and more reliable to produce, it will become an inevitability.