19 April 2011

Math You Can Use




Refactoring If Statements with De Morgan's Laws

When I was in college, years ago, I discovered a little trick to use De Morgan's Laws, a basic principle from Discrete Math, to manipulate logical statements while Programming and I employ it whenever it fits. I started doing this so long ago that I do not recall if I came up with it on my own, I did take a lot of extra Logic classes in college, or if I picked it up from someone else. So naturally I was excited to write a blog entry about it and did, however, as I was finishing it up it occurred to me to Google it, and although I was not really surprised to find other mention of the idea, I was, however, dismayed to find out that Steve McConnell mentions it in his book, Code Complete, which is highly recommended and also has been on my list of books to read, I guess I am disheartened because now it might seem like I am just taking an idea from a fairly prominent book, and in looking through the book I see many good ideas, some that I think have been expressed by others as well, so the re-expressing a good software idea seems both justifiable and one that is consistent with the goals of my blog, so I am glad that I did not Google it earlier which might have discouraged me from writing this.

In Code Complete, this idea is treated, consistent with the rest of the book, very practically and concisely. It is discussed solely as a refactoring technique, and it is a good one to know, but to me it is much more than that, it is the actual application of Math to that act of Programming! And this presents an opportunity to use this technique to help Programmers understand an interesting Mathematical concept.

So let’s run through how this works. To apply this to Programming mechanics we can look at this from a symbol manipulation point of view, after all Programming in some senses is the manipulation of lexical symbols. Strictly speaking, it involves symbolic variants, if you will, of the distributive property and of factoring, in regards to the negation symbol.

The two forms of Demorgan’s Law, described using standard Java/C++ logical operators are:

!(a || b) == (!a && !b)

!(a && b) == (!a || !b)

Looking at either of the above logical forms of the De Morgan’s statements, identities actually, you can see that going left to right, the negation symbol is "distributed" across the two terms and going right to left, the negation symbol is "factored" out from the two terms and applied to the whole statement. Each operation causes the "and" symbol to be swapped with the "or" symbol or vice versa. A simple mnemonic might be "either distribute and flip or factor and flip." Please note that flip means switching between these two operators, and that these are not inverse operations.

Ok, so let’s put this to use, a recommended refactoring, also mentioned in Code Complete, is the "Reverse Conditional," . If you were to come across the following Java/C++ code, you would look at it and see that it is negative, if you have been using the Demorgan’s Law refactoring trick, you will instantly know how to refactor it:

if((a != b) || (c != d)) {

statementA1;

...

statementAn;

} else {

statementB1;

...

statementBn;

}

So we need to get the positive conditional which means we need to negate the current if statement’s logic, let’s walk through it:

First we’ll pull the negative’s out, using the following rule:

(x != a) == !(x ==a), I don’t know what this is called, but it should be pretty obvious that it is true.

So now we have:

if(!(a == b) || !(c == d)) {

Now we factor and flip:

if(!((a == b) && (c == d))) {

Negating gets rid of the "!" because !!x == x, which is called double negation, and the final refactored code with the if and else blocks reversed is:

if((a == b) && (c == d)) {

statementB1;

statementBn;

} else {

statementA1;

statementAn;

}

So now, you either already knew this or now know it, if this is new to you I would recommend that practice it. Perhaps write some test code and play with the various forms translating each form back and forth and seeing how the logic works.

Now let’s look at this from a more formal perspective:

¬ (p ∨ q) ⇔ (¬ p) ∧ (¬ q)

¬ (p ∧ q) ⇔ (¬ p) ∨ (¬ q)

Just a couple quick notes on the notation here, you most likely already know these ideas, but in case you are not familiar with this notation: The symbol ¬ means logical negation, the symbol ∧ is logical conjunction, aka "and," the symbol ∨ is logical disjunction aka "or," and the double arrow ⇔ means logical equivalence which can be read as "if and only if." The parentheses serve the usual purpose.

De Morgan’s laws also apply to set theory, and the Set Theoretic forms of De Morgan's Laws are:

Where ∪ is set union, ∩ is set intersection and X is set complementation.

If you received a CS degree then most likely you took some form Discrete Math which probably included Set Theory and possibly some basic Propositional Logic, if for whatever reason you didn’t learn them, I highly recommend learning them, and if you’ve forgotten them, I highly recommend reviewing them.

Set theory and Propositional Logic are intertwined, with a number of concepts which run parallel between the two disciplines and De Morgan's Law is not the only one. Additionally it seems that De Morgan's Law shows up a few other places as well.

Logic is a central construct in Programming and Math, in many respects it is one of the foundations of all of Math, of course like all things in Math it can also become very involved and complex once you dig just slightly below the surface, but this is really a good thing because it will never get boring. Also the interrelationship between Logic and Set Theory seems, to me, to be pretty intricate and extremely profound.

If you are a non Math oriented Programmer but want to be, my recommendation is that you take this refactoring trick and its associated Logic related Math and study it and make it part of your repertoire of skills. To put in mantra form: Learn it. Know it. Live it.

3 comments:

  1. You converted an equation with negative attitude "if((a != b) || (c != d))" into an equation with positive attitude "if((a == b) && (c == d))" :) . I did the same an year back, without knowing any maths. Let me share my experience with you.

    I am very poor at maths and there was this old code at work where there are 6 or 8 variables spread over 3 lines combined in various odd ways with && and || along with weird negations (!). The whole conditional-statement was totally unclear, clumsy and complete mess when it comes to one question: What this conditional check is doing ? (How it is doing something was a bigger mess). I was supposed to add some feature to that software and that conditional-statement was coming in the way, not to mention previous maintainers faced the same issue but that code still lied there for years. And then I decided to clean it up out of frustration.

    Guess What, I read your article only today but I did the same thing at that time, plus I grouped several statements into one, learned the "why" behind the conditional by reading code and even removed some statements. I did this without any knowledge of logic, set theory or De Morgan's laws. In fact, I learned all of this technique from comp.lang.c --> "positivity is easier to understand than negation". I have been to comp.lang.c, wrote several thousand lines of code at work and I stil gang in there to learn.

    My question you just used Maths as language to do that while I read the same English from comp.lang.c . Does that mean Maths gives same benefits as experience in programming ? That it is actually not Maths but a certain kind of thinking required to solve problems in programming ?

    ReplyDelete
  2. I think this specific example isn't a strong case for the importance of these mathematical theories in programming. Your refactored code merely makes sense; it doesn't require anyone to understand the theory to be able to simply understand, by pure logic, that the refactored code has the same effect as the original.

    In the original code we basically have a Pass/Fail scenario where Condition 1 is Fail and Condition 2 is Pass: If A!=B OR C!=D, Fail, otherwise, Pass (meaning A=B AND C=D because either statement returning false would result in Failure).

    It seems only logical to reform the statement so your pass conditions are strictly met and your failure conditions are met via "else" aka "any other result." So if A=B & C=D, Pass, otherwise, Fail.

    ReplyDelete
    Replies
    1. Note: The issue here may be that I simply understand what you're putting out but am simply unfamiliar with the formal representation of it. I would be remiss to assume it makes sense to everyone.

      Delete