Elegant Coding: Information Entropy

Showing posts with label Information Entropy. Show all posts

05 February 2012

The Software Reuse Complexity Curve

I have come to the conclusion that Complexity and Reuse have a nonlinear relationship that has a parabolic shape, shown below, Complexity increases on the Y axis and reuse increases on the X axis. Code complexity is high both with very low reuse and very high reuse forming a minima which might be viewed as an optimization point for reusability, now remember this is a very simple two dimensional graph, a real optimization scenario for this type of problem may include many dimensions and probably would not have a nice clearly defined single minima.

Scenario #1 Low Reuse

In a low reuse situation everything that could have been implemented using reusability is duplicated and may also be implemented inconsistently, so the code base will be larger which would require a maintainer to learn and know more code, also since there will be variations of varying degree between the similar components the maintainer would have to be constantly aware of these differences, which might be subtle, one might even find them confusing at times.

As we have previously seen reuse relates to entropy, so a low reuse system would have a large random, “uncompressed” codebase, which has high entropy, I have seen this type of system in my career and it is something I hate to work on.

Skill vs. Complexity

Before I go to the opposite end of the spectrum, which is probably more interesting, and perhaps more speculative on my part, I’d like to talk about the relationship of code complexity to maintainer developer skill.

I believe, and this really shouldn’t be a stretch, that there are some types of complex code that require certain higher level proficiencies and theoretical knowledge, an obvious choice is parsers and compiler construction. A developer who works in this area needs to have a good understanding of a number disciplines, which include but are not limited to: Formal Language Theory and Automata Theory and possibly some Abstract Algebra, which sadly many developers lack. If you had a project that had components that are any type of compiler or parser and needed someone to create and subsequently work on the maintenance of the code, they would need to know these things, they would need to understand this complexity in an organized theoretical way. The alternative would be an ad hoc approach to solving these problems which would probably yield significant complexity issues of its own.

These types of things can be seen at simpler scale, a system that makes use of state machines or heavy use of regular expressions will exhibit more compact and more complex code. The maintainers have to have a skill level to deal with this more complex approach, sadly not all developers do.

Now you can probably find numerous code bases where these types of approaches and others were not used where they should have been, these codebases will be larger and will require a case by case effort to learn something that could have been “compressed” and generalized with better theoretical understanding. It’s a trade off if you cannot hire or retain people who have the advanced skills an advanced codebase can also be a maintenance liability.

Scenario #2 high reuse

Now let’s create a hypothetical example, say we have a system that has many similar components and a brilliant programmer manages to distill this complexity into compact highly reusable code. The problem is that our brilliant programmer is the only one who can really understand it. Ideas similar to this are pondered by John Cook in his post "Holographic Code". Now it might be the case that it code that is really only understandable to our brilliant programmer, or he may be the only one in the organization with that level of proficiency, also in this case we will assume that complexity was not artificially introduced, i.e. not unnecessary complexity, which is also a real problem that arises in software systems. Also we will assume that we know that a less complex solution that is more verbose with less reuse exists, and in fact the solution of the maintainer developer might be to completely rewrite it in a less complex fashion.

As we have previously seen, the complete compressibility of a code base can never be achieved which relates heavily to aspects of both Information Theory and Algorithmic Information theory this results in diminishing returns in terms of shrinking an algorithm or codebase. My speculation here, based on observation, is that as a codebase is "compressed" its complexity increases as shown above. Essentially you need to be increasingly clever to continue to increase reuse. Now some of these reuse complexity issues may be due to our lack of understanding of complexity in relation to software construction. It is feasible that things that we find hard to gain reuse with today may be easier in the future with advances in this understanding.

Another idea from the research covered in my previous post is that different domains exhibit different degrees of Favorability/Resistance to reuse. I would also suspect that the types of reusable components would also vary across different domains this makes me think that perhaps in the future reusability in different domains can be guided both qualitatively and quantitatively by either empirical or theoretical means or both. This means that you might have fore-knowledge of the "compressibility" of a "domain solution" and then you would be able to guide the development effort appropriately.

I feel the ideas here are a step back from a possible absolutist thinking to reuse. Reuse like any important approach in software, for example test coverage, which can also exhibit diminishing returns, needs to be tempered and viewed as an optimization as to the extent of its practice. In many respects that’s what good engineering is, working within constraints and optimizing your outcome within those constraints.

27 November 2011

Eleven Equations True Computer Science Geeks Should (at Least Pretend to) Know

This idea is a complete rip off an article that appeared in Wired a little while ago and it got me thinking what would my list for Computer Science look like? Plus I thought it might be a fun post and unlike the Wired list this one goes to eleven. So here they are in no particular order:

Binomial Coefficient

The Binomial Coefficient equation generates Pascal’s Triangle and gives you the coefficients for the Binomial Theorem these ideas are often attributed to Pascal but in fact they have been known in part for over a millennia.

As I mentioned that this list is no particular order and I don’t wish to play favorites but if there is one equation here that you should really consider learning and committing to memory this is it. It is central to Combinitorics which has connections to many areas of math, I guess I should qualify that if you are serious about computer science and math related programming topics then you should be able to recite and apply it from memory, even when you are drunk, sleep deprived or being held at gun point. Ok that might be a bit much but you get the idea. If you are programmer and you haven’t learned it, or need a refresher, I have a post that relates it to the Java Collections API.

Demorgan’s Laws

Logical form:

Set Form:

Ok so that’s like four "equations" for DeMorgan’s Laws, one can’t help but to struck by the similarity between the two sets and this is not a coincidence these two sets of formulas are essentially the same thing, they are special cases of complemented distributive lattices^† which means technically it’s really just two formulas:

In this case the ∨ symbol means lattice join operation and the ∧ symbol is the lattice meet operation and the dash with the right downward mark means lattice complementation, I used this to differentiate from the tilde for Boolean complementation. Also these two formulas are categorical duals of each other, which means that if one were smart and had a good grasp of Category Theory we could probably be really OCD about it and get it down to one formula.

Understanding Set Theory and Boolean Algebra are very important basic concepts in our profession and the Boolean version can make you better programmer, as you can use it to refactor logical if statements.

Eigenvector and Eigenvalue

This equation and these concepts are not only central to Linear Algebra but these Linear Algebraic ideas and others are both used and extend into many other areas of math including Linear Transformations and Graphs in terms of matrices like the Adjacency Matrix and much more.

Linear Algebra is becoming increasingly central to the types of things we are doing in our profession, it is used heavily in Statistics and Machine Learning not to mention the Page Rank Algorithm is based in part on this equation.

Pumping Lemma for Regular Languages

No Comp Sci list would be complete with at least one formula from either Formal Language Theory or Automata Theory. The problem is that these two are pretty different from other areas of math in terms of "Equations" I tried to think of some basic Ideas and The Pumping Lemma came to mind, and I found the above formula concisely expressed on Wikipedia. This version of the Pumping Lemma is a more limited version of Another Theory which is used to check whether a Language is Regular. Also this is maybe not the best choice as it is pretty dense, if you want a better explanation I recommend this.

While this "equation" is pretty specialized, it is actually a special case of the more general Iteration Theorem, the real point here is really about more general domain knowledge. It is central to what you do every day you are programming such as compilers and regular expressions not to mention the almost ubiquitous Kleene Star.

Information Entropy

I confess after all my machinations to try to reduce Demorgan’s Laws down to less equations I am totally cheating here by pulling a twofer in including two formulas, both Shannon’s Information Theory:

And the formula for Chaitin’s Constant, which is associated with Algorithmic Information Theory and Kolmogorov Complexity:

Interestingly this area has a parallel in the physical word, in terms of Thermodynamic Entropy, which also parallels the Boltzman Equation mentioned in the Wired Article. Seth Loyd draws some interesting connections between computation and the physical world including Entropy and Boltzman’s Equation.

In programming and computer science information is our business and these ideas are central to many things that we deal with especially the storage, transmission, computational transformation (like compression), and computation of information. We’ve already seen the possible relationship between both of these theories and Reusability and Software Frameworks.

Bayes' Theorem

Of all of equations on the list, Bayes' Theorem may stand out on its own merits as being one of the most recognizable and it might even qualify as a poster child for the statistical revolution sweeping our industry. It rose from obscurity to being used for spam detection and with such applications as classifiers, Machine Learning, Data Mining and more.

Mathematically it is part of a rift in statistics and probability and I confess I may not yet call myself "an initiate of the bayesian conspiracy" but it hard to deny its utility plus there seems to be a more axiomatic basis that relates to Boolean Logic Set Theory, which makes it all the more alluring.

Fermat’s Little Theorem

These two equations of Fermat’s Little Theorem are really the same equation written in two different forms, where (a ⊥ p) means coprime and (p ∈ P) means prime.

This is a beautiful little theorem in Number Theory which can be generalized and written even more succinctly using Eulers Totient Function which is represented by φ (n):

These ideas are crucial for understanding encryption algorithms like RSA and there’s this cool result.

Natural Join

Natural Join, like all of the Relational Algebra Join Operations is a composition of the more basic, operations: Selection (σ), Projection (π), Cartesian Product (×), Set Difference (-) and Union (∪). Natural Join is the Cartesian Product of two tables selecting matching rows and eliminating duplicate columns. (In the formula above the projection (π) is the set (A) of attributes with duplicates removed and then selects (σ) the set (C) of common attributes which match in both sets and are equal in both sets.)

The Relational Algebra is in my opinion a bit of an odd beast when it comes to algebraic structures, but if you use SQL you are already using an implementation of it. As data persistence becomes more heterogeneous I think understanding these concepts, will become more important. Additionally Relational Algebra also has strong ties to Abstract Algebra[ and Category Theory[.

The Fixed-Point (Y) Combinator

Just as we need a "equation" from Formal Language Theory or Automata Theory we really should have one that is related to Lambda Calculus, in this case this equation is in the untyped lambda calculus. Even though the Fixed-point Combinator stands up on its own, it is fairly well known, well at least in name due to the use of it by Paul Graham for his incubator (Y-Combinator).

The Fixed-Point Combinator allows one to implement anonymous recursion which is a powerful programming technique. It also ties into some deeper theoretical aspects of computer science and logic such as Combinatory Logic.

O(n)

Many CS majors may cringe at this, and another possible reaction is that’s not much of an equation. There’s actually quite a bit to this for example did you know there are some substantial and quite beautiful equations associated with this seemingly simple formula. They are:

Also the following is a definition of (lim sup) Limit Superior:

Where inf is infimum and sup is supremum two concepts that seem to show up quite a bit especially when you are concerned with bounds like with Lattices and they also seem to show up in relation to Topology.

And we all know why we should care about his one: Algorithms, Algorithms, Algorithms.

Euler’s Identity

Euler’s Identity is also listed in Wired’s article and I have to agree that it is one the most beautiful and intriguing formulas in math.

Now this isn’t really as relevant to CS as the other formulas but if you a true computer geek you are hopefully some kind of math geek, and this is such a great formula, not to mention that it appears in The Simpsons' Homer³. It’s actually a specific case (where θ = π) of Euler’s Formula:

These formulas will give one a better understanding complex numbers which are needed in other areas of advanced math also these formulas relate to Fourier Analysis which does have applications in Computer Science.

I am curious to see how well this list holds up, how will I feel about it in a year? Also in writing this I realized that I still have quite a bit more to learn about at least half of this list, if not all of it, which made it hard to write as it was quite a little (unfinished) research project. I have already written individual articles on several of these equations and have at least two in the works for Linear Algebra, at least two on the Relational Algebra and one planned for Pascal’s Triangle, and those are just the ideas I’ve already had, who knows what the future holds. I hope you enjoyed it as much as I did writing it.

^† The Set version may not apply to all cases in set theory, this assumes Union and Intersection over the Power Set of a Set Universe, this would be closed under these operations and would include complements of all elements. The Set Universe is the upper bound and the empty set is the lower bound.

31 July 2011

Software Frameworks: Resistance isn’t Futile

As I have previously discussed, in my opinion there are three main framework components that can be described succinctly as, libraries, rules, and templates. It is the library component that I wanted to talk about here, perhaps in a context that might be seen as more evidence in support of frameworks in certain cases. Now to recap, building a framework does not make sense for all projects, the two main scenarios that I have seen that are extremely conducive to it are an organization with multiple similar projects with similar problem domains. The second case is a medium to large project with a lot of commonality that would favor reusability across the project. One of my complaints about the Framework debate is that it is debated as a black and white argument. Either frameworks are absolutely required or the worst thing you can do. Now I am sure that many developers can anecdotally cite either side of this argument, which is what I feel really drives this debate and there is no doubt that I do this as well, but the goal, in my opinion, is to step back and look at this problem from a bigger perspective.

One place that I have found some interesting perspective is in a paper by Todd L. Veldhuizen titled: "Software Libraries and Their Reuse:Entropy, Kolmogorov Complexity, and Zipf’s Law", there is a slide version here, now a word of caution this paper is very math intensive, but it should be possible to read it and gain some insights without understanding the math, for example in the paper he states the following:

A common theme in the software reuse literature is that if we can only get the right environment in place— the right tools, the right generalizations, economic incentives, a "culture of reuse" — then reuse of software will soar, with consequent improvements in productivity and software quality. The analysis developed in this paper paints a different picture: the extent to which software reuse can occur is an intrinsic property of a problem domain, and better tools and culture can have only marginal impact on reuse rates if the domain is inherently resistant to reuse.

I think this is a good observation, many projects that I have worked on have exhibited characteristics that are favorable to reuse, but I have read a number of counter arguments especially by people in fast paced startups where the flux of the system evolution is potentially resistant to reusability, actually in this case it’s probably the SDLC, not necessarily the problem domain that is resistant. Also I would suspect the probability of reuse is in part inversely proportional to system size, so for small systems it’s less likely or at least will have a smaller set of reusable components, so an investment in reuse may not be seen as justified.

Another interesting observation from the paper is:

Under reasonable assumptions we prove that no finite library can be complete: there are always more components we can add to the library that will allow us increase reuse and make programs shorter. To make this work we need to settle a subtle interplay between the Kolmogorov complexity notion of compressibility (there is a shorter program doing the same thing) and the information theoretic notion of compressibility (low entropy over an ensemble of programs).

This is especially interesting if you have some familiarity with Information Theory, and if you don’t I recommend learning more about it. Here he is comparing characteristics of both Algorithmic Information Theory [Kolmogorov complexity] and Shannon’s Information Theory [information theoretic notion]. Roughly, Algorithmic Information Theory is concerned with the smallest algorithm to generate data and Shannon’s Information Theory is about how to represent the data in the most compact form. These concepts are closely related to data compression and in the paper this is paralleled to the idea that reusing code, will make the system smaller in terms of lines of code, or more specifically: symbols, which effectively "compresses" the codebase. In Algorithmic Information theory you can never really know if you have the smallest algorithm so I may be taking some liberty here, but I think the takeaway is that when trying to create reuse you can probably do it forever so one needs to temper this desire with practicality. In other words there is probably a point where any subsequent work towards reuse is a diminishing return.

I find the paper compelling and I confess that perhaps I am being bamboozled by math that I still do not fully understand, but intuitively these ideas feel right to me. Also the application of Zipf’s law is interesting and should be pretty intuitive, once again roughly, Zipfs law relates to power curve distributions, also related to the 80/20 rule, the prime example is the frequency of English words in text, words like [and, the, some, in, etc.] are much, perhaps orders of magnitude, more common than words like [deleterious, abstruse, etc.]. This distribution shows up in things like the distribution of elements in the universe, think hydrogen vs. platinum, the wealth distribution of people you vs. Bill Gates, how many followers people have on twitter, etc. and to a smaller scale, the curve is scale invariant, in software, often some components will have a fair amount of reuse, things string copy functions, entity base classes, etc., whereas others may only have a couple of reuses.

On the lighter side, Power curves relate to Chaos Theory, I have seen a number of people including Andy Hunt draw parallels between Agile and Chaos theory, although these are usually pretty loose, it does strike me that one way to model chaos is through iterated maps, which is reminiscent of the iterative process of agile, also the attractor and orbit concepts seem to parallel the target software system as well.

Another place, and this is one that most developers will find more accessible, I would have lead with this but the title and flow leaned the other way, is Joshua Bloch’s "How To Design A Good API and Why it Matters". Actually I think this a presentation that every developer should watch especially if you are involved in high level design and architecture related work, and don’t worry, there is no math to intimidate the average developer. A summary article version can be found here, but I would still recommend watching the full video at least once if not multibple times. Slides to an older version of the talk can be found here.

In his presentation he talks about getting the API right the first time because you are stuck with it. I think this helps illuminate one very important and problematic aspect of framework design and software development. The problem can be illustrated with a linguistic analogy, in linguistics there are two ways to define rules for a language: prescriptive, is where you apply (prescribe) the rules on to the language vs. descriptive where the rules describe the language as it exists, I strongly favor the descriptive. One of the most famous English language rules: "not to end sentences with prepositions" is a prescriptive rule thought to be from Latin which was introduced by John Dryden and famously mocked by Winston Churchill "This is the sort of nonsense up with which I will not put.", pointing out that it really doesn’t always fit a Germanic language like English. I know I’m a bit off topic again with my "writer’s embellishment", not to mention that the same idea is discussed in depth by Martin Fowler which he terms "Predictive versus Adaptive".

It is a common problem which Agile attempts to address and it is common in general software design and construction, it also occurs API and framework design. Software construction as we know is an organic process and I feel that frameworks are best developed in part out of that process though Martin Fowler’s Harvesting which can be termed as descriptive framework creation. What Joshua Bloch in part describes and to some degree cautions against can be described as a prescriptive approach to API/Frameworks. I think many developers including me have attempted to create framework components and API’s early in a project usually driven by a high level vision of what the resulting system will look like only to find out later that certain assumptions were not valid or certain cases were not accounted for¹. What he talks about is a pretty rigid prescriptive model which is in many ways at odds with an adaptive agile approach, I feel that the more adaptive agile approach is really what is needed for the framework approach and we do see this via versioning, for example the differences between Spring 1.x and Spring 3.x are substantial and no one would want to use 1.x now, but there are apps that are tied to it now. Also this approach of complete backwards compatibility was used with Java Generics, specifically Type Erasure which has lead to a significant weakening of the implementation of that abstraction. It is my understanding that Scala has recently undergone some substantial changes from version to version leading some to criticize its stability while others cite that it is the only way to avoid painting yourself into a corner like with Java. The harvesting approach will often involve refinement and changes to the components which are extracted and this can lead to the need for refactorings that can potentially affect large amounts of already existing code. It’s a real chicken and egg problem.

He starts his presentation with the following Characteristics of a Good API:

Characteristics of a Good API
Easy to learn

Easy to use, even without documentation

Hard to misuse

Easy to read and maintain code that uses it

Sufficiently powerful to satisfy requirements

Easy to extend

Appropriate to audience

He also makes the following point:

APIs can be among a company's greatest assets

I think this is sentiment is reflected in Paul Graham’s "Beating the Averages" of course that is more about Lisp but underlying principle is the same, actually an interesting language agnostic point comes from Peter Norvig, I can’t find the reference, but he said he had a similar attitude towards Lisp until he got to Google and saw good programmers who were incredibly productive in C++. I feel that this is all just the framework argument, maximizing reusability by building reusable high level abstractions within your problem domain that allow you to be more productive and build new high level components more quickly, it’s all about efficiency.

In regards to API’s he adds:

API Should Be As Small As Possible But No Smaller

To which he adds:

When in doubt leave it out.

Conceptual weight is more important than the bulk - The number of concepts.

The most important way to reduce weight is reusing interfaces.

He attributes this sentiment to Einstein, but he wasn’t sure about it, I did some follow up the paraphrasing is "Everything should be made as simple as possible, but no simpler." or "Make things as simple as possible, but not simpler." More about that can be found here. These are good cautions about over-design or over-engineering APIs, Frameworks, and Software in general, I have definitely been guilty of this at times, once again this is something that needs to be balanced.

The following is what I consider to be seminal advice:

All programmers are API designers because good programming is inherently modular and these inter modular boundaries are API’s and good API’s tend to get reused.

As stated in his slides:

Why is API Design Important to You?
If you program, you are an API designer
Good code is modular–each module has an API

Useful modules tend to get reused
Once module has users, can’t change API at will

Good reusable modules are corporate assets

Thinking in terms of APIs improves code quality

The next two quotes are pretty long, and it was no easy task transcribing them, he talks really fast, I tried to accurately represent this as best as possible, also I feel that he really nails some key ideas and I wanted to have a written record of it to reference, since I am not aware of any other:

Names matter a lot, there are some people that think that names don’t matter and when you sit down and say well this isn’t named right, they say don’t waste your time let’s just move on, it’s good enough. No! Names, in an API, that are going to be used by anyone else that includes yourself in a few months mater an awful lot. The idea is that every API is kind of a little language and people who are going to use your API needs to learn that language and then speak in that language and that means that names should be self explanatory, you should avoid cryptic abbreviations so the original Unix names, I think, fail this one miserably.

He augments these ideas by adding consistency and symmetry:

You should be consistent, it is very important that the same word means the same thing when used repeatedly in your API and you don’t have multiple words meaning that same thing so let us say that you have a remove and a delete in the same API that is almost always wrong what’s the difference between remove and delete, Well I don’t know when I listen to those two things they seem to mean the same thing if they do mean the same thing then call them both the same thing if they don’t then make the names different enough to tell you how they differ if they were called let’s say delete and expunge I would know that expunge was a more permanent kind of removal or something like that. Not only should you strive for consistency you should strive for symmetry so if you API has two verbs add and remove and two nouns entry and key, I would like to see addEntry, addKey, removeEntry, removeKey if one of them is missing there should be a very good reason for it I am not saying that all API’s should be symmetric but the great bulk of them should. If you get it right the code should read like prose, that’s the prize.

From the Slides:

Names Matter–API is a Little Language
Names Should Be Largely Self-Explanatory
Avoid cryptic abbreviations

Be consistent–same word means same thing
Throughout API, (Across APIs on the platform)

Be regular–strive for symmetry

Code should read like prose

I feel that this hits some essential concepts which really resonate with me, in fact I have follow on posts planned to further deconstruct and develop these ideas more generally. From the framework perspective this also gets at some of the variety of the Framework code components, they can be the Paul Graham’s functional Lisp abstractions, they can be DSL’s, they can be Object Oriented like Spring, Hibernate and Java API, etc. Any framework built for a domain will have their conceptual vocabularies or API languages that are a higher level abstraction of the problem domain, Domain Specific Abstractions, and they all benefit from concepts like consistency and symmetry as appropriate to the domain.

The following is a very common and widespread problem that often inhibits reuse and leads to less efficient production of lower quality software:

Reuse is something that is far easier to say than to do. Doing it requires both good design and very good documentation. Even when we see good design, which is still infrequently, we won’t see the components reused without good documentation.
- D. L. Parnas, Software Aging. Proceedings of the 16^th International Conference on Software Engineering, 1994.

He adds:

Example Code should be Exemplary, one should spend ten times as much time on example code than production code.

He references a paper called "Design fragments" by George Fairbanks also here which looks interesting but I have not had time to read it yet.

This is also, I believe, to be a critical point that is possibly symptomatic of problems with many software projects. I feel that projects never explicitly allow for capturing reuse in terms of planning, schedule and developer time. Often reuse is a lucky artifact if you have proactive developers who make the extra effort to do it and it can often be at odds with the way I have seen projects run (mismanaged). I have some follow up planned for this topic as well.

In terms of design he adds this interesting point:

You need one strong design lead to that can ensure that the api that you are designing is cohesive and pretty and clearly the work of one single mind or at least a single minded body and that’s always a little bit of a trade off being able to satisfy the needs of many costumers and yet produce something that is beautiful and cohesive.

I have to confess that I do not recall how I came across Todd Veldhuizen’s paper or Joshua Bloch’s talk, but I felt that they were really about similar ideas, in writing this and finding all of the references again I realized that my association of these two was not coincidental at all. For they are both part of the Library-Centric Software Design LCSD'05 workshop for Object-Oriented Programming, Systems, Languages and Applications (OOPSLA'05) with Joshua Bloch delivering the same talk as the Keynote Address.

Now I admit that one goal of mine is to put these ideas, much of which was borrowed from the two referenced works, into a written form that I can reference back to, I hope this stands up by itself but is really written for my future entries. Also, for the record this is not the first time I have "leached" off of Joshua Bloch’s work.

Ultimately my ideas will diverge from some of those in regards to API design, Joshua Bloch speaks from a position of greater responsibility in terms of APIs. I see them as part of a framework continuum used to construct software and I think many of his ideas apply directly and more generally to that. Also I see framework and the system as interlocked and feel that frameworks can drive structure and consistency for example Object-Oriented API’s aka frameworks can in turn drive the structure of software, the Spring Framework is a good example of this, Spring is build heavily around IOC which is one of the SOLID principles.

There will always be arguments against frameworks, the classic one is it creates more code that is more complex and it requires more learning, my counter argument to this is twofold: First any code that is in production probably has the need to be known and understood which might require it to be learned regardless of how efficiently it is created. Also if a framework creates reusability and consistency the initial learning curve will be higher but each subsequent encounter with a codebase that constructed this way should be easier. Also highly redundant inconsistent code is potentially (much) more difficult to learn and maintain because there is more of it. The second is if your framework API is well defined and well documented it should make the resultant code much easier to understand and maintain aka "read like prose". This will be due to the fact that much of the "generic" complexity is "abstracted" downwards into the framework level. For example compare the code to implement a Spring Controller to the underlying classes that do the work such as AnnotationMethodHandlerAdapter Now if you have defects and issues at that lower level they will be harder to fix and changing common code can have side effects to other dependant code, it’s not a perfect world.

I think the issue with reuse and the framework approach is asking the right question: How resistant (or favorable) is you domain and your SDLC to reuse? I think most domains have some if not a fair amount of favorability to reuse and I see reuse as increased efficiency in software construction.

¹Perhaps: "...certain cases for were not accounted." Dryden blows.