
11 March 2013

Programming and Order Theory





Covariance, Contravariance and Order Theory


In this post I make the observation that covariance and contravariance in programming are what are known as order duals.  I am not the first person to make this observation; these ideas tend to be buried in academic research papers like "Adding Axioms to Cardelli-Wegner Subtyping" by Anthony J H Simons.  Don’t get me wrong, I love these types of papers, they give me hope and inspiration that software engineering will someday become a first class engineering citizen.  Unfortunately, such papers tend to be too theoretical and thus not very accessible to the average developer.  That is a shame, because viewing covariance and contravariance as order duals both puts these concepts into the mathematical context of order theory and gives programmers some native context for order theory.  So hopefully this post will make these ideas more programmer friendly.


I previously wrote a post about lattice theory, which is part of the more general field of order theory, where I talked about some basic order theory ideas such as duality.  Order theory turns up quite a lot in software and programming, and this is part of a series of posts about those occurrences.


Covariance and contravariance receive a fair amount of attention, as they should, in software blogs, and some of those posts include interesting observations. One, perhaps slightly off topic, is "Liskov Substitution Principle is Contravariance", which is an interesting post if you can overlook its disdainful tone towards OO.  Another more relevant one is a nice post about "Covariance and Contravariance in Scala" that relates these ideas to category theory, which is relevant especially since apparently you can think of "Category Theory as Coherently Constructive Lattice Theory" (warning: heavy going in that paper).


Defining Covariance and Contravariance


To me one of the most striking and perhaps apropos examples of order theory in software is that of covariance and contravariance which Eric Lippert defines on his blog as:


The first thing to understand is that for any two types T and U, exactly one of the following statements is true:


  • T is bigger than U.
  • T is smaller than U.
  • T is equal to U.
  • T is not related to U.

For example, consider a type hierarchy consisting of Animal, Mammal, Reptile, Giraffe, Tiger, Snake and Turtle, with the obvious relationships. (Mammal is a subclass of Animal, etc.) Mammal is a bigger type than Giraffe and smaller than Animal, and obviously equal to Mammal. But Mammal is neither bigger than, smaller than, nor equal to Reptile, it’s just different.


He has an eleven part series on covariance and contravariance.  His posts cover some C# implementation details, but the ideas are generally applicable, and looking at one language’s details can help with comparing and contrasting other languages.


Wikipedia includes the following definition, the animal example is pretty popular:


  • Covariant: converting from wider (Animals) to narrower (Cats).
  • Contravariant: converting from narrower (Triangles) to wider (Shapes).
  • Invariant: Not able to convert.

Including this is slightly redundant, but this definition captures the conversion aspect and defines the relationships explicitly.


Covariance and Contravariance as Order Duals


The above definitions have order theory written all over them.  In fact that is pretty much a textbook definition of an order relation in that it is reflexive, transitive, and antisymmetric.  It is reflexive since Animal = Animal, transitive since Animal ≤ Mammal ≤ Cat implies Animal ≤ Cat, and antisymmetric since, for distinct types, Animal ≤ Mammal implies not Animal ≥ Mammal.  The animal example also has cases of both comparability and incomparability, as you would find in a partial order.


As you can see from the above definitions, the two pairs of terms, wider/narrower and bigger/smaller, which amount to the same thing, define an order and its dual for comparing types.  To write it more formally, let C be the set of classes in an OO hierarchy.  Covariance is then represented by less than or equals (≤) and contravariance by greater than or equals (≥), and the set of classes with these order relations can be written in mathematical notation as (C, ≤) = (C, ≥)^d.
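To make the duality concrete in code, here is a minimal Scala sketch of my own; the Source and Sink traits are invented purely for illustration, while Animal, Mammal and Cat stand in for Lippert’s example hierarchy.  Scala’s + and - variance annotations follow the subtype order and its dual:


class Animal
class Mammal extends Animal
class Cat extends Mammal

// Covariant position: a Source[Cat] can be used where a Source[Animal] is expected
trait Source[+A] { def get: A }

// Contravariant position: a Sink[Animal] can be used where a Sink[Cat] is expected
trait Sink[-A] { def put(a: A): Unit }

object VarianceDemo {
  val catSource: Source[Cat] = new Source[Cat] { def get = new Cat }
  val animalSource: Source[Animal] = catSource   // allowed: covariance follows the subtype order

  val animalSink: Sink[Animal] = new Sink[Animal] { def put(a: Animal): Unit = () }
  val catSink: Sink[Cat] = animalSink            // allowed: contravariance reverses the order, the order dual
}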


Types as Sets of Fields


It was in researching this post that I came across the paper "Adding Axioms to Cardelli-Wegner Subtyping". These kinds of discoveries are one of the reasons I write these posts.  In that paper they quote another paper "On understanding types, data abstraction and polymorphism" by Luca Cardelli and Peter Wegner:


a type A is included in, or is a subtype of another type B when all the values of type A are also values of B, that is, exactly when A, considered as a set of values, is a subset of B


The ideas about types and subtypes covered in these papers extend beyond which fields a class or object has; however, I thought it would be interesting and beneficial to limit the discussion to that case.  One reason is that if you take the fields of an object or class, then every subtype’s collection of fields will be found in the powerset of those fields, each subtype relation becomes a subset relation, and these can be drawn as my favorite lattice, yes, I have a favorite lattice, the powerset lattice.  Also in this case covariance and contravariance are now defined as the subset and superset relations on the powerset lattice.


Types as Sets of Fields in the Real World


Now I always feel that a real example helps quite a bit, so I have created a set of example classes, Scala traits actually, which illustrate the above ideas using a quasi-real-world example.  Please note that these code examples are designed for the purpose of illustrating these ideas and may contain design issues that one would not implement in the real world, but they should be close enough to bridge the conceptual gap, if you will.  Also this first example should be applicable to dynamically typed languages that might use duck typing or structural typing.


The following Scala traits define a possible domain object hierarchy that could be used to persist data to a database and render it back to a web page among other possible uses:



trait BaseDomain {
  override def equals(that: Any): Boolean
  override def hashCode: Int
}

trait PersonInfo extends BaseDomain {
  var firstName: String
  var lastName: String
}

trait Address extends BaseDomain {
  var street: String
  var street2: String
  var city: String
  var state: String
  var country: String
  var zipCode: String
}

trait PhoneNumber extends BaseDomain {
  var phoneNumber: String
  var extension: String
}

trait Person extends BaseDomain {
  var personInfo: PersonInfo
  var address: Address
  var phoneNumber: PhoneNumber
}



These traits yield the following field powerset lattice:




Now suppose we want to define a type for each of the above lattice points, which we probably would not do in practice, though there may be cases where one would do similar things in the real world.  Let’s define the following Scala traits that wrap the above domain object hierarchy elements:



trait PersonInfoTrait extends BaseDomain {
  var personInfo: PersonInfo
}

trait AddressTrait extends BaseDomain {
  var address: Address
}

trait PhoneNumberTrait extends BaseDomain {
  var phoneNumber: PhoneNumber
}

trait PersonInfoAddressTrait extends AddressTrait with PersonInfoTrait

trait AddressPhoneNumberTrait extends AddressTrait with PhoneNumberTrait

trait PersonInfoPhoneNumberTrait extends PhoneNumberTrait with PersonInfoTrait

trait PersonTrait extends PersonInfoAddressTrait with AddressPhoneNumberTrait with PersonInfoPhoneNumberTrait



Since Scala traits support a form of multiple inheritance (mixin composition), we can define the above type hierarchy, which can be drawn as the powerset lattice that it is:



Again we can see covariant and contravariant types defined on this lattice, and each relation is actually the subset/superset relation on fields.
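As a small usage sketch, with the renderAddress and renderPerson functions below being hypothetical and written only to show the relation in use, a function that needs only the AddressTrait slice of fields will accept a PersonTrait, because PersonTrait’s field set is a superset of AddressTrait’s:


object FieldLatticeDemo {
  // Requires only the AddressTrait fields
  def renderAddress(a: AddressTrait): String =
    a.address.city + ", " + a.address.state

  // A PersonTrait carries a superset of those fields, so it is a subtype and can be passed along
  def renderPerson(p: PersonTrait): String =
    p.personInfo.firstName + " " + p.personInfo.lastName + ": " + renderAddress(p)
}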


I feel the basic observations and the above examples on order duals and covariance and contravariance make the ideas pretty straightforward, at least in the field set context.  In writing this I delved into a number of papers on types in programming and type theory, adding some of them to my "to read list", and I feel there are probably some deeper insights and implications in all of this.

22 October 2012

The Math of the First Scala Assignment

Math and Scala with a Little Bit of Clojure




Martin Odersky is teaching Functional Programming Principles in Scala on Coursera and I attempted to complete this course.  I completed the first assignment on recursion before the hard deadline, which was hard given my work schedule at the time; however, I decided not to submit it.  I remembered how much I hate school, or at least being a student in a school, even a virtual one; I have my own timeline and my own interests.  While doing the problems I realized, and I know this is not a major revelation, that these first three problems are set in the context of math.  However, some of that math may not be immediately obvious to everyone, so I thought it would make a nice post.  I was originally going to publish my solutions in this post, after the hard deadline of course, but there was recently some unpleasantness about publishing solutions and cheating, and it seems that it is still possibly a violation of the honor code.  I feel having some code for two of the problems increases the quality of this post, so I will provide my examples for those solutions in Clojure; the non-solution code will be in Scala, so it’s now a hybrid posting.  Also the ideas for this post forced me to finally write my previous post on Pascal’s Triangle, which this post will draw on.


Exercise 1: Pascal’s Triangle


The first problem is to recursively compute a binomial coefficient within Pascal’s triangle.  This is fairly trivial if you have exposure to recurrence relations and recognize it as such; it is expressed as follows:




Some other common recurrence relations that are popular in CS educational circles include Fibonacci numbers, triangular numbers, and of course the factorial function; these are expressed as:





I chose a term based representation of the factorial function over the more traditional:



These can be implemented recursively in Scala as:


 

def fibonacci(n: Int): Long = {
  if (n <= 1)
    1
  else
    fibonacci(n - 1) + fibonacci(n - 2)
}

def triangularNumber(n: Int): Long = {
  if (n <= 1)
    1
  else
    n + triangularNumber(n - 1)
}

def factorial(n: Int): Long = {
  if (n <= 1)
    1
  else
    n * factorial(n - 1)
}

 


All of the above examples are inefficient and in many cases considered to be naïve solutions.  The Fibonacci example is terrible as it incurs an exponential number of function calls; fortunately there are ways to optimize these algorithms.  One potential approach would be to use the closed formulas for triangular numbers or Binet’s formula for Fibonacci numbers.  I will ignore those as I have covered them previously and they are not what I am going for here.  I want to look at two optimizations: one is the use of tail call optimization and the second is to reduce the number of function calls.


Tail call optimization, put simply, is a mechanism that allows certain recursive functions, written in a tail recursive form, to be converted to iteration when they are executed.  It is my understanding that this requires some extra work on the JVM, at least in the older versions; I think versions 7 and 8 make changes relevant to this.  Languages like Lisp and Scheme do this automatically for tail recursive code; in Scala this involves the @tailrec annotation, which has the compiler verify and enforce the optimization.  Using this form our functions get converted as follows:


import scala.annotation.tailrec   // @tailrec lives in scala.annotation

def triangularNumber(n: Int): Long = {
  @tailrec
  def triangularNumberAcc(acc: Long, n: Int): Long = {
    if (n <= 1) acc
    else triangularNumberAcc(n + acc, n - 1)
  }
  triangularNumberAcc(1, n)
}

def factorial(n: Int): Long = {
  @tailrec
  def factorialAcc(acc: Long, n: Int): Long = {
    if (n <= 1) acc
    else factorialAcc(n * acc, n - 1)
  }
  factorialAcc(1, n)
}


As you can see the tail recursive forms take on an accumulator parameter.  The other optimization I mentioned was to get rid of the extra function call for the Fibonacci numbers; this is done by calculating the series using a second accumulator parameter, so that terms F(n) and F(n+1) are passed as accumulator parameters.  This example came out really nice, as the base function sequenceGenerator can also generate the Lucas numbers if seeded appropriately:


@tailrec
def sequenceGenerator(n: Int, t0: Long, t1: Long): Long = {
  if (n <= 0)
    t0
  else
    sequenceGenerator(n - 1, t1, t0 + t1)
}

def fibonacci(n: Int): Long = sequenceGenerator(n, 0, 1)

def lucas(n: Int): Long = sequenceGenerator(n, 2, 1)


Going back to Pascal’s Identity if we look at the example from my last post:



We can represent the values that get calculated for this example visually as:



This gives some sense of how the recursive solution moves up through the triangle to sum the values.  The example code in Clojure, the naïve solution, is:


(defn bc [r c]
  (if (or (<= c 0) (<= r 0) (<= r c))
    1
    (+ (bc (dec r) (dec c)) (bc (dec r) c))))


Additionally the Clojure version of our Binomial Coefficient can be similarly single call/tail call optimized:


(defn bc [rr cc]
  (loop [r rr c cc acc 1]
    (if (or (= c 0) (= r 0) (= r c))
      acc
      (recur (dec r) (dec c) (/ (* acc r) c)))))


As mentioned, most of these have closed form formulas, like Binet’s formula for Fibonacci numbers.  Also, if we take the formula for the binomial coefficient, then since r ≥ c we can get the following cancellation of c!:



Which can be implemented in Scala, using the factorial function defined above, as:


def partialFactorial(s: Int, e: Int): Long = {
  // computes the "partial factorial" s * (s - 1) * ... * e, assuming s >= e
  @tailrec
  def partialFactorialAcc(s: Int, e: Int, acc: Long): Long = {
    if (e >= s) acc
    else partialFactorialAcc(s - 1, e, s * acc)
  }
  partialFactorialAcc(s, e, e)
}

def binomialCoefficient(r: Int, c: Int): Long = {
  if ((c <= 0) || (r <= 0) || (r <= c))
    1
  else
    partialFactorial(r, c + 1) / factorial(r - c)
}


Both of these “optimized” solutions for the binomial coefficient introduce division, which can be problematic due to remainders and floating point rounding errors; this would also be the case for the closed form approaches to calculating these numbers.  There is a variant for the binomial coefficient discussed in a paper which uses a gcd calculation to try to deal with these issues.


Another interesting observation about how values are calculated in the naïve recursive approach versus the tail recursive accumulator approach is that there seems to be some order duality in how the calculations are performed.  In the naïve approaches the recursion goes to the bottom and accumulates the result as it comes back up the call stack.  In the accumulator versions the values are accumulated on the way down and the result is returned from the bottom of the call stack.


Exercise 2: Parentheses Balancing


The next problem is not so much about the solution as it is about the combinatorial properties of the input values.  The problem was to take a string and determine if the parentheses in the string matched; this is also probably pretty common CS educational fodder.  So there will be no code for this problem.


The math that we are interested in here relates to the central binomial coefficient; it is a number sequence known as the Catalan numbers, after the Belgian mathematician Eugène Charles Catalan.  It is given by the following equations:




In the original problem, text containing parentheses was the input; since a parenthesis counting algorithm would need to filter the text or implement some type of get-next-parenthesis function, we will assume that we have a string that contains only parentheses.  The first thing to notice is that valid input strings need to be of even length.  Each term of the Catalan numbers describes how many possible sets of matching parentheses you can have for each number n.  The table below illustrates the possibilities for n = 0 through 5:


n      Cn        Matching parentheses set                                      Total possible strings (2^2n)
n=0    1 way     λ                                                             1
n=1    1 way     ()                                                            4
n=2    2 ways    ()(), (())                                                    16
n=3    5 ways    ()()(), ()(()), (())(), (()()), ((()))                        64
n=4    14 ways   ()()()(), ()()(()), ()(())(), ()(()()), ()((())),             256
                 (())()(), (())(()), (()())(), ((()))(), (()()()),
                 (()(())), ((())()), ((()())), (((())))
n=5    42 ways   ()()()()(), ()()()(()), ()()(())(), ()()(()()), ()()((())),   1024
                 ()(())()(), ()(())(()), ()(()())(), ()((()))(), ()(()()()),
                 ()(()(())), ()((())()), ()((()())), ()(((()))), (())()()(),
                 (())()(()), (())(())(), (())(()()), (())((())), (()())()(),
                 (()())(()), ((()))()(), ((()))(()), (()()())(), (()(()))(),
                 ((())())(), ((()()))(), (((())))(), (()()()()), (()()(())),
                 (()(())()), (()(()())), (()((()))), ((())()()), ((())(())),
                 ((()())()), (((()))()), ((()()())), ((()(()))), (((())())),
                 (((()()))), ((((()))))


The table above also includes the number of possible strings using only the two parenthesis characters, which grows at the rate 2^2n.  This table is based on one from a paper by Tom Davis on Catalan numbers.  It is quite an interesting paper, as Catalan numbers also describe diagonal avoiding paths, “mountain ranges”, noncrossing partitions, multiplication orderings, and the number of different ways a convex polygon with n + 2 sides can be cut into triangles, to mention a few.  Another interesting thing about Catalan numbers is that they can be described by a recurrence relation:







I confess that this may not be that pragmatic in terms of solving the problem, but I think it is interesting and possibly gives a larger context to the problem. 
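To make that recurrence concrete, and since the Catalan numbers are background math rather than the assignment’s solution, here is a small Scala sketch of my own of the standard recurrence C(0) = 1, C(n+1) = Σ C(i)·C(n−i):


def catalan(n: Int): BigInt = {
  // c(m) holds the m-th Catalan number, built up from the recurrence
  val c = new Array[BigInt](n + 1)
  c(0) = BigInt(1)
  for (m <- 1 to n)
    c(m) = (0 until m).map(i => c(i) * c(m - 1 - i)).sum
  c(n)
}

// (0 to 5).map(catalan) gives 1, 1, 2, 5, 14, 42, matching the table above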


Exercise 3: Counting Change


The last problem, which is probably something of a CS classic as it appears in SICP, is the problem of counting change: given some numerical amount and some set of coin denominations, we need to count the number of ways that we can represent the amount given the set of denominations.  I confess to having cheated a little bit, well ok a lot: I found this post by Bill the Lizard, and I also found this post with a more elegant solution, and translated these into both Scala and Clojure; well ok, I guess it’s not cheating since I didn’t submit it.  The Clojure version is:


(defn count-change [amount coin-values]
  (cond (= amount 0) 1
        (or (< amount 0) (empty? coin-values)) 0
        :else (+ (count-change amount (rest coin-values))
                 (count-change (- amount (first coin-values)) coin-values))))


The math here is from number theory and combinatorics; actually it might fall under combinatorial number theory.  In number theory there are two pretty big areas.  One is multiplicative number theory, which dates back at least to the Greeks and deals with how numbers divide, with remainders, and with prime numbers, and includes things like the well known fundamental theorem of arithmetic.  A relatively newer area is additive number theory, and the study of number partitions within it is mostly attributed to Euler; in fact it was Euler who initiated the study of number partitions in his book Introduction to the Analysis of the Infinite in 1748.  It is the math of number partitions that our change problem falls under.  We are basically counting a subset of the partitions of a number’s full partition set.  In fact we can create a function that uses a Clojure list comprehension to turn our change counting function into a number partition function:


(defn number-partition [n]
  (count-change n (for [x (range 1 (inc n))] x)))


Functions to count partitions have been studied over the years and Euler invented a generating function that gives rise to, you guessed it, a recurrence equation to count partitions:



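As a sketch of where that leads, hedged in that this is the standard pentagonal number recurrence rather than anything specific to the course, Euler’s generating function yields p(n) = Σ for k ≥ 1 of (−1)^(k+1) [ p(n − k(3k−1)/2) + p(n − k(3k+1)/2) ], with p(0) = 1 and p(m) = 0 for m < 0.  In Scala that looks something like:


def partitions(n: Int): BigInt = {
  val p = new Array[BigInt](n + 1)
  p(0) = BigInt(1)
  for (m <- 1 to n) {
    var sum = BigInt(0)
    var k = 1
    var g1 = k * (3 * k - 1) / 2   // generalized pentagonal numbers k(3k-1)/2 and k(3k+1)/2
    while (g1 <= m) {
      val g2 = k * (3 * k + 1) / 2
      val sign = if (k % 2 == 1) 1 else -1
      sum += p(m - g1) * sign
      if (g2 <= m) sum += p(m - g2) * sign
      k += 1
      g1 = k * (3 * k - 1) / 2
    }
    p(m) = sum
  }
  p(n)
}

// partitions(5) gives 7, agreeing with (number-partition 5) above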
School’s out


So there it is, some (hopefully) interesting math context for these problems.  I guess it’s kind of a gray area, but I am pretty sure I didn’t cross the line on the whole honor code thing, and even if I did I believe it’s covered by the People's Freedom of Choices and Voices Act.


Well, it was a fun post to write.  I got some code in a post, something I haven't done lately, which is only fitting for a blog called Elegant Coding.  I think it came out pretty decent, but remember I am still learning Scala and Clojure, so my code examples here may not be as elegant as they could be.  I guess another objective of this post, and my blog in general, is to explore ways to mix programming and math education; while I think this post makes a nice start along these lines, it could probably be improved, which would be nice.



29 April 2012

What is Generic Programming?



Over the course of my career I have strived to write reusable code, always looking for more general patterns and trying to refactor code into more flexible, more generic versions, sometimes with mixed success.  Lately it has become something of an obsession of mine in that I both try to employ it and have been endeavoring to learn more about it, what it really is, and how to do it right; as I have previously pondered, I believe there is a complexity cost with increased reuse.  Over the last few years, during which I have been programming in Java, generics has become a central tool in achieving reuse.  So such a concept comes to mind, but it is just one idea in a larger framework of ideas on generic programming.


There is a fair amount of research and discussion of the ideas pertaining to generic programming; the title of this post is taken directly from a paper on the topic, which perhaps not so coincidentally was presented at the same OOPSLA (2005) as the paper on Software Reuse and Entropy that I previously discussed.  It just seems like every idea I explore is super-interrelated to everything else, and this post will be no exception; I can conceptually connect these two papers, using a Kevin Bacon like approach, through just two other research papers, ideas I have for other posts.  I will try to keep this one as focused as possible.  Also the referenced paper is, like the Software Framework paper, very math heavy, but different math: Category Theory, Order Theory, Universal Algebra, etc.  This post will focus on the higher level concepts and not the details of math that I wish I could claim I fully understand.


So back to our question: what is generic programming?  For me it is about reuse, and while the information entropy approach to understanding reuse was about reuse from the perspective of measurement, and how that (measurable) reuse varies across different domains, I feel that the generic programming concept is about the structure of reuse and how it is implemented in various programming paradigms, though the concept itself transcends language specific interpretations and implementations.


As you can tell I am not sure how exactly to answer this question, so I will defer to the experts.


"What is generic programming?" in part outlines the paradigmatic distinctions between mainstream OO languages (Java Generics, C++ with the STL) and functional languages (Haskell, Scheme):


As for most programming paradigms, there are several definitions of generic programming in use. In the simplest view generic programming is equated to a set of language mechanisms for implementing type-safe polymorphic containers, such as List<T> in Java. The notion of generic programming that motivated the design of the Standard Template Library (STL) advocates a broader definition: a programming paradigm for designing and developing reusable and efficient collections of algorithms. The functional programming community uses the term as a synonym for polytypic and type-indexed programming, which involves designing functions that operate on data-types having certain algebraic structures.
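As a rough illustration of that last sense, functions that operate over an algebraic structure, here is a minimal Scala sketch of my own (not from the paper): a Monoid type class and a single function, written once, that works for any type carrying that structure:


trait Monoid[A] {
  def empty: A
  def combine(x: A, y: A): A
}

object Monoid {
  implicit val intAddition: Monoid[Int] = new Monoid[Int] {
    def empty = 0
    def combine(x: Int, y: Int) = x + y
  }
  implicit val stringConcat: Monoid[String] = new Monoid[String] {
    def empty = ""
    def combine(x: String, y: String) = x + y
  }
}

// Written once against the algebraic structure, not against any particular type
def combineAll[A](as: List[A])(implicit m: Monoid[A]): A =
  as.foldLeft(m.empty)(m.combine)

// combineAll(List(1, 2, 3))       == 6
// combineAll(List("a", "b", "c")) == "abc"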

In "Generic Programming", written in 1988, David A. Musser and Alexander A. Stepanov use Ada and Scheme and describe generic programming as:


By generic programming, the definition of algorithms and data structures at an abstract or generic level, thereby accomplishing many related programming tasks simultaneously.  The central notion is that of generic algorithms, which are parameterized procedural schemata that are completely independent of the underlying data representation and are derived from concrete efficient algorithms.

...

A discipline that consists of the gradual lifting of concrete algorithms abstracting over details, while retaining the algorithm semantics and efficiency.


That last sentence has a very Agile/Refactoring feel to it.


Ralf Hinze in "Generic Programs and Proofs" describes it as: 


A generic program is one that the programmer writes once, but which works over many different data types.

...

Broadly speaking, generic programming aims at relieving the programmer from repeatedly writing functions of similar functionality for different user-defined data types. A generic function such as a pretty printer or a parser is written once and for all times; its specialization to different instances of data types happens without further effort from the user. This way generic programming greatly simplifies the construction and maintenance of software systems as it automatically adapts functions to changes in the representation of data.

...

So, at first sight, generic programming appears to add an extra level of complication and abstraction to programming. However, I claim that generic programming is in many cases actually simpler than conventional programming.  The fundamental reason is that genericity gives you "a lot of things for free" ...


This last statement resonates with me as it parallels my thoughts on the desirability of using software frameworks and I feel that I continuously have this conversation (argument) with junior level programmers and also senior level cargo cult coders who don’t get it.


Design patterns are a common generic concept in mainstream programming.  It turns out that a number of design patterns can be described in the framework laid out by these papers, and a number of papers do exactly that, including "Design Patterns as Higher-Order Datatype-Generic Programs", part of extensive research on the topic by Jeremy Gibbons:


Design patterns, as the subtitle of the seminal book has it, are ‘elements of reusable object-oriented software’. However, within the confines of existing mainstream programming languages, these supposedly reusable elements can only be expressed extra-linguistically: as prose, pictures, and prototypes. We believe that this is not inherent in the patterns themselves, but  evidence of a lack of expressivity in the languages of today. We expect that, in the languages of the future, the code parts of design patterns will be expressible as directly-reusable library components. The benefits will be considerable: patterns may then be reasoned about, typechecked, applied and reused, just as any other abstractions can. Indeed, we claim that the languages of tomorrow will suffice; the future is not far away. All that is needed, in addition to what is provided by essentially every programming language, are higher-order (parametrization by code) and datatype-generic (parametrization by type constructor) features. Higher-order constructs have been available for decades in functional programming languages such as ML and Haskell. Datatype genericity can be simulated in existing programming languages [2], but we already have significant experience with robust prototypes of languages that support it natively [2]. We argue our case by capturing as higher-order datatypegeneric programs a small subset ORIGAMI of the Gang of Four (GOF) patterns. (For the sake of rhetorical style, we equate ‘GOF patterns’ with ‘design patterns’.) These programs are parametrized along three dimensions: by the shape of the computation, which is determined by the shape of the underlying data, and represented by a type constructor (an operation on types); by the element type (a type); and by the body of the computation, which is a higher-order argument (a value, typically a function).  Although our presentation is in a functional programming style, we do not intend to argue that functional programming is the paradigm of the future (whatever we might feel personally!).  Rather, we believe that functional programming languages are a suitable test-bed for experimental language features - as evidenced by parametric polymorphism and list comprehensions, for example, which are both now finding their way into mainstream programming languages such as Java and C#.  We expect that the evolution of programming languages will continue to follow the same trend: experimental language features will be developed and explored in small, nimble laboratory languages, and the successful experiments will eventually make their way into the outside world. Specifically, we expect that the mainstream languages of tomorrow will be broadly similar to the languages of today — strongly and statically typed, object-oriented, with an underlying imperative mindset — but incorporating additional features from the functional world — specifically, higher-order operators and datatype genericity
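To give a rough feel for “higher-order and datatype-generic”, in Scala rather than the paper’s Haskell and emphatically not the paper’s actual Origami code, here is my own loose sketch of a Composite-style tree whose traversal is a single fold, parametrized by the element type and by a higher-order body:


sealed trait Tree[A]
case class Leaf[A](value: A) extends Tree[A]
case class Node[A](left: Tree[A], right: Tree[A]) extends Tree[A]

// The "code part" of a Visitor-like traversal captured once as a higher-order function
def fold[A, B](t: Tree[A])(leaf: A => B)(node: (B, B) => B): B = t match {
  case Leaf(a)    => leaf(a)
  case Node(l, r) => node(fold(l)(leaf)(node), fold(r)(leaf)(node))
}

// "Visitors" become ordinary arguments
def sum(t: Tree[Int]): Int    = fold(t)((a: Int) => a)(_ + _)
def depth[A](t: Tree[A]): Int = fold(t)(_ => 1)((l, r) => 1 + (l max r))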

Lately I have been making more of an effort to learn Scala.  Bruno Oliveira and Jeremy Gibbons make the case in "Scala for Generic Programmers" that it might be one of, or even the, best language for expressing generic programming concepts, in part due to its hybridization of the object oriented and functional paradigms.  Here is the abstract in its entirety:


Datatype-generic programming involves parametrization by the shape of data, in the form of type constructors such as ‘list of’.  Most approaches to datatype-generic programming are developed in the lazy functional programming language Haskell. We argue that the functional object-oriented language Scala is in many ways a better setting. Not only does Scala provide equivalents of all the necessary functional programming features (such as parametric polymorphism, higher-order functions, higher-kinded type operations, and type- and constructor-classes), but it also provides the most useful features of object-oriented languages (such as subtyping, overriding, traditional single inheritance, and multiple inheritance in the form of traits). We show how this combination of features benefits datatype-generic programming, using three different approaches as illustrations.
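A tiny sketch, mine rather than the paper’s, of what “parametrization by the shape of data, in the form of type constructors” looks like with Scala’s higher-kinded type parameters:


import scala.language.higherKinds

// F is a type constructor ("the shape of data"), such as List or Option
trait Functor[F[_]] {
  def map[A, B](fa: F[A])(f: A => B): F[B]
}

object Functor {
  implicit val listFunctor: Functor[List] = new Functor[List] {
    def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
  }
  implicit val optionFunctor: Functor[Option] = new Functor[Option] {
    def map[A, B](fa: Option[A])(f: A => B): Option[B] = fa.map(f)
  }
}

// Written once, usable for any shape F that has a Functor instance
def incrementAll[F[_]](fa: F[Int])(implicit F: Functor[F]): F[Int] = F.map(fa)(_ + 1)

// incrementAll(List(1, 2, 3)) == List(2, 3, 4)
// incrementAll(Option(41))    == Some(42)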

So far, for me, Scala has been a joy.  Its features and flexibility are beautifully elegant; of course I am tainted, coming from the clumsy, clinking, clanking, clattering caliginous world of Java.  BTW Java, we need to talk, it’s not you, it’s me. I’ve met someone else.


Jeremy Gibbons makes the following point in "Datatype-Generic Programming":


Generic programming is about making programming languages more flexible without compromising safety. Both sides of this equation are important, and becoming more so as we seek to do more and more with computer systems, while becoming ever more dependent on their reliability.

This idea is similar to ideas that Gerald Sussman explores in his talk at Strange Loop 2011, "We Really Don’t Know How To Compute".  However, his general and perhaps philosophical approach to the idea of generic programming is very different from the previous ideas:


[8:40]We spend all of our time modifying existing code, the problem is our code is not adequately evolvable it is not adequately modifiable to fit the future in fact what we want is to make systems that have the property that they are good for things that the designer didn’t even think of or intend.  ... That’s the real problem. ...  The problem is that when we build systems whatever they are we program ourselves into holes. ...  It means that we make decisions early in some process that spread all over our system the consequences of those decisions such that the things that we want to change later are very difficult to change because we have to change a very large amount of stuff.  ... I want to figure ways that we can organize systems so that the consequence of the decisions we make are not expensive to change, we can’t avoid making decisions but we can try to figure out ways to minimize the cost of changing decisions we have made.

...

[11:40]Dynamically extensible generic operations  ...  I want to dynamically extend things while my program is running  ... I want a program to compute up what it is going to do.


He shows an example[19:00] which is "the definition of the method of computing Lagrange Equations from a Lagrangian given a configuration space path". This is apparently easy stuff, to him at least.  Fortunately you don’t really need to understand the details of this math to understand his points here:


[20:45]This is a little tiny feeling of what is it needed to make things more flexible consider the value here, the value is that I am dynamically allowed to change the meaning of all my operators to add new things that they can do. Dynamically. Not at compile time, at runtime.

...

[21:15]It gives me tremendous flexibility. Flexibility because the program I wrote to run on  a bunch of numbers with only a little bit of tweaking suddenly runs on matrices so long as I didn’t make mistake of commuting something wrong. ... So this makes proving the theorems very hard In fact maybe impossible the cost and the benefits are very extreme. I can pay  correctness or proofs of correctness or belief in correctness, I can pay that for tremendous flexibility. ... Is correctness the essential thing I want, I am not opposed to proofs, I love proofs I am an old mathematician. The problem is that putting us into the situation that Mr. Dijkstra got us into ... where you are supposed to prove everything to be right before you write the program getting into that situation puts us in a world where we have to make the specifications for the parts as tight as possible because it’s hard to prove general things, except occasionally it’s easier but it’s usually harder to prove a general theorem about something so you make a very specialized thing that works for a particular case you build this very big tower of stuff and boy is that brittle change the problem a little and it all falls over.


His ideas tend towards adaptable and self modifying code; at one point he talks about self modifying code in both assembly language and the use of eval in Lisp.  He also touches on a number of parallels in the biological world, including genetics and the adaptability of the human immune system to fight off mutating parasites.  Interestingly, some of his ideas seem similar to work done by Stephanie Forrest on using evolutionary/genetic algorithms for program repair, which she speaks about in her keynote at SPLASH 2010, formerly known as OOPSLA.  The genetic repair approach at the conceptual level is striving for the same results as what Gerald Sussman describes.  Also, did you notice that the Levenshtein distance between "generic programming" and "genetic programming" is 1 (one substitution)?  Pretty weird!


Seriously though, the objectives of all of these approaches are the same: building better more flexible software that handles more situations. These ideas are reflected in how Glenn Vanderburg describes software engineering:


Software engineering is the science and art of designing and making, with economy and elegance, systems so that they can readily adapt to situations which they may be subjected.

I feel this software engineering insight reinforces my opinion that we will eventually see a real math backed discipline of software engineering, and I think that the research work discussed here both supports my conjecture on this point and most likely lays part of that foundation.  I also think some of these ideas reflect ideas I discussed in "The Software Reuse Complexity Curve", which implies that programmers will need to become more aware of and more adept at advanced theoretically based concepts, for example GADTs (generalized algebraic datatypes), and of course math in general, such as abstract algebra.


Clearly the idea of generic programming has some very specific ideas and some very general objectives.  There seem to be two approaches here, which might be described as predictive vs. adaptive, much like the idea of Separation of Design and Construction described by Martin Fowler.  The Gibbons et al. more calculational approach is predictive, whereas the Sussman/Forrest ideas describe an adaptive approach.  I suspect that this will just be another dimension of program design that will have to be considered and assessed as a tradeoff in its application, which Sussman also mentions.