24 May 2016

Software Development is a Risky Business

Software projects are inherently risky.  Some statistics show that roughly 50% to 70% of software projects get classified as "challenged," where challenged can mean cost overruns and/or delivery delays, or it can mean complete failure and cancellation of the project.  Some statistics indicate that as the size of a project increases, so does the probability of problems and failure.  Many articles and blog posts have been written about the long history of software project failure.  At the time of writing this post the most recent and very public software project debacle was the healthcare.gov project.  It seems that as IT becomes more pervasive and essential, the problems with projects may be getting worse, not better.

In spite of these numbers, and all too often in my experience, I have found that individuals and organizations seem to have a very optimistic outlook about their own projects, especially at the beginning.  Perhaps it’s the classic "that won’t happen to me" attitude.  Even as projects start to run into problems, I have seen people, especially managers, not take issues seriously or ignore them altogether.  In some of the analysis of healthcare.gov, this was cited as one of the problems.

Risk management and risk assessment/analysis are developed areas in project management and engineering and are often discussed in software projects.  Many risks fall into two categories.  One is actual risks to the end product, such as unforeseen security vulnerabilities or other engineering issues that lead to flaws or critical failures of the system or its components.  The other is risks in the management and execution of constructing the product or system.

Risk management is one of those areas where the engineering aspects and the engineering management aspects can become tightly coupled.  An example can be derived from the XKCD comic "Tasks".  The point of this comic is to demonstrate the difference between an easy problem and a hard one, and presumably the conversation takes place between a manager and a developer.  In a risk setting, perhaps solving the hard problem is the goal of the project.  So one risk might be that the research team fails to solve the problem within the five years, or that it takes ten.  In that sense the task itself is high risk, but there are also team-related risks: maybe the budget is too low and the team is subpar due to low salaries, so they can never solve it at all.  Another possible scenario is that the conversation never happened and the project management and team think the problem is solvable in six months.  This second scenario is the risk I have seen throughout most of my career: the project management and team are overly optimistic or really don’t have a good understanding of the complexity of the task at hand.  In some cases these issues seem to be due to pluralistic ignorance, usually where the developers didn’t want to contradict management.

Sadly, it seems like the risks that get identified are almost peripheral to the real risks on projects.  They are usually things like "the hardware delivery will be delayed, which will delay setting up our dev/test/prod environments and delay the start of development."  While these are legitimate risks, there is always the risk of all the unknowns that will cause delays and cost overruns; even things like developer turnover never seem to be taken into account, at least in my experience.  There also seems to be no cognizance of long-term cost risks, such as a project that was poorly managed, rushed, or done on the cheap, causing the delivered codebase to be an unmaintainable big ball of mud that incurs a much higher long-term cost.

In order for software engineering to grow up and become a real engineering discipline, it will need formal risk assessment and risk management methodologies, both on the project management side and on the work product side.  This is also an area where ideas and methodologies can likely be borrowed from existing work in other engineering disciplines.  These methodologies will also draw from probability and game theory.
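
To sketch what the probability side of such a methodology might look like, here is the classic risk exposure calculation used in formal risk management: expected loss as probability times impact, applied to some hypothetical project risks.  All names and numbers below are invented for illustration.

```python
# Illustrative sketch: rank identified project risks by their
# risk exposure (probability of occurring times cost if it occurs).
# The risks, probabilities, and dollar figures are hypothetical.

risks = [
    # (description, probability, impact in dollars)
    ("Hardware delivery delayed",         0.30,  50_000),
    ("Key developer turnover",            0.25, 120_000),
    ("Task complexity underestimated",    0.60, 400_000),
    ("Unmaintainable codebase delivered", 0.40, 500_000),
]

def risk_exposure(probability, impact):
    """Expected loss: probability of the risk occurring times its cost."""
    return probability * impact

# Rank risks by expected loss, highest first.
ranked = sorted(risks, key=lambda r: risk_exposure(r[1], r[2]), reverse=True)

for description, p, impact in ranked:
    print(f"{description}: expected loss ${risk_exposure(p, impact):,.0f}")
```

Even this crude arithmetic makes the point of the post: the "peripheral" risks that usually get tracked, like hardware delays, can rank far below the optimism- and maintainability-related risks that rarely appear in a risk register at all.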

References and Further Reading

30 April 2015

What is Software Engineering?

The terms software engineering, software engineer, and sometimes just engineer, as in "full stack engineer," appear a lot in the software development field.  Some people, including myself, find the use of the term "software engineering" to be somewhat dubious.  I also feel that there are too many people calling themselves engineers who really shouldn’t be.  As a result of my skepticism I have been trying to devise a way to determine exactly what software engineering is.

The happy path is to search the web for it or just find a book.  Both will lead to many answers, but that is really just getting someone else’s opinion.  A little later we will look at other people’s opinions, as there are well-established authors who readily answer this question.

To get some perspective, let’s step back and ask a more basic question. What is engineering?  For this I will rely on the ultimate oracle of our time. Some top responses from the Google are:

The Google dictionary definition:

the branch of science and technology concerned with the design, building, and use of engines, machines, and structures.

• the work done by, or the occupation of, an engineer.

• the action of working artfully to bring something about.

"if not for Keegan's shrewd engineering, the election would have been lost"

The Georgia Tech School of Engineering provides the following:

Engineering is the practical application of science and math to solve problems, and it is everywhere in the world around you. From the start to the end of each day, engineering technologies improve the ways that we communicate, work, travel, stay healthy, and entertain ourselves.

What is Engineering? | Types of Engineering:

Engineering is the application of science and math to solve problems.  Engineers figure out how things work and find practical uses for scientific discoveries.

Wikipedia defines it as:

Engineering (from Latin ingenium, meaning "cleverness" and ingeniare, meaning "to contrive, devise") is the application of scientific, economic, social, and practical knowledge in order to invent, design, build, maintain, research, and improve structures, machines, devices, systems, materials and processes.

So to summarize and synthesize the above for a working definition, we get:

Engineering is a branch of science and technology where mathematical, scientific, economic, and social knowledge are applied artfully, creatively, and efficiently to invent, design, build, maintain, research, and improve structures, machines, devices, systems, materials and processes.

Another aspect of engineering that should be considered is its diversity.  Just searching around, I came up with the following list of types of engineering, which should by no means be considered complete: Aeronautical, Agricultural, Audio, Biomedical, Biomechanical, Building, Ceramics, Chemical, Civil, Computer, Electrical, Electronics, Environmental, Food, Geological, Industrial, Marine, Mechanical, Military, Mineral, Nuclear, Optical, Petroleum, Photonics, Systems, and Textile.

Discussions of software engineering often draw parallels to construction and civil engineering, and it is easy to see why: most people see the results of these disciplines and even live in them.  Not to trivialize these disciplines, but bricks and mortar, along with wood, glass, and rebar, are physical things that support our lives and are thus somewhat easy to conceptualize.  But all engineering disciplines now touch and perhaps run our lives in ways that we rarely think about.  I previously posited that chemical engineering, prior to the software era, was probably the most pervasive engineering in our lives, not to mention that the computer hardware we use wouldn’t exist without it.  It also serves as an interesting point of comparison for software engineering, as it is a relatively recent discipline and has some parallels, like the control of processes and "flows," in its case of materials.  Perhaps the closest "relative" to software engineering is industrial engineering.  Like software, it deals with human factors.  It also deals with stochastic processes, something that is becoming increasingly important in software systems.

Now that we have some general engineering perspective, let’s revisit the question "What is software engineering?" by looking at what established organizations and authors have to say.

Wikipedia gives the following definition:

Software engineering is the study and an application of engineering to the design, development, and maintenance of software.

Typical formal definitions of software engineering are:

  • the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software
  • an engineering discipline that is concerned with all aspects of software production
  • the establishment and use of sound engineering principles in order to economically obtain software that is reliable and works efficiently on real machines.

K.K. Aggarwal provides us with a bit of history:

At the first conference on software engineering in 1968, Fritz Bauer defined software engineering as "The establishment and use of sound engineering principles in order to obtain economically developed software that is reliable and works efficiently on real machines".

Stephen Schach defined the same as "A discipline whose aim is the production of quality software, software that is delivered on time, within budget, and that satisfies its requirements".

The ACM weighs in with:

Software engineering (SE) is concerned with developing and maintaining software systems that behave reliably and efficiently, are affordable to develop and maintain, and satisfy all the requirements that customers have defined for them. It is important because of the impact of large, expensive software systems and the role of software in safety-critical applications. It integrates significant mathematics, computer science and practices whose origins are in engineering.

There are a number of books on software engineering which have an obligation to address the question.

The answer in Ian Sommerville’s 8th edition of Software Engineering is:

Software engineering is an engineering discipline that is concerned with all aspects of software production from the early stages of system specification to maintaining the system after it has gone into use. In this definition, there are two key phrases:

1. Engineering discipline Engineers make things work. They apply theories, methods and tools where these are appropriate, but they use them selectively and always try to discover solutions to problems even when there are no applicable theories and methods. Engineers also recognise that they must work to organisational and financial constraints, so they look for solutions within these constraints.

2. All aspects of software production Software engineering is not just concerned with the technical processes of software development but also with activities such as software project management and with the development of tools, methods and theories to support software production.

What Every Engineer Should Know about Software Engineering by Philip A. Laplante offers the following:

Software engineering is "a systematic approach to the analysis, design, assessment, implementation, test, maintenance, and reengineering of software, that is, the application of engineering to software. In the software engineering approach, several models for the software life cycle are defined, and many methodologies for the definition and assessment of the different phases of a life-cycle model" [Laplante 2001]

In Software Engineering: A Practitioner’s Approach, 8th Edition, Roger S. Pressman and Bruce R. Maxim provide a terse explanation:

Software Engineering encompasses a process, a collection of methods [practice] and an array of tools that allow professionals to build high-quality computer software.

Roger Y. Lee describes it in Software Engineering: A Hands-On Approach as:

Software engineering is the process of designing, developing and delivering software systems that meet a client’s requirements in an efficient and cost effective manner. To achieve this end, software engineers focus not only on the writing of a program, but on every aspect of the process, from clear initial communication with the client, through effective project management, to economical post-delivery maintenance practices. The IEEE Standard glossary of software engineering terminology defines software engineering as the application of a systematic, disciplined, quantifiable approach to the development, operation and maintenance of software (Ghezzi et al. 1991). As made clear by the above definition, software engineering came about in order to reverse early development practices by establishing a standard development method that could unify and normalize development operations, regardless of the people involved or the goals sought.

The laborious effort of wading through all of this erudition gives me the following distillation:

Software Engineering is:

  • An engineering discipline
  • Application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software
  • Software engineering (SE) is concerned with developing and maintaining software
  • Software engineering is the process of designing, developing and delivering software systems
  • A standard development method that can unify and normalize development operations, regardless of the people involved or the goals sought.

Software Engineering Includes:

  • All aspects of software production from the early stages of system specification to maintaining the system after it has gone into use
  • Non-technical activities such as software project management and the development of tools, methods and theories to support software production
  • The integration of significant mathematics, computer science and practices whose origins are in engineering.
  • The application of theories, methods and tools where these are appropriate
  • Many methodologies for the definition and assessment of the different phases of a life-cycle model
  • Several models for the software life cycle
  • A process, a collection of methods [practice] and an array of tools that allow professionals to build high-quality computer software.
  • The application of sound engineering principles to the analysis, design, implementation (development), assessment, test, maintenance and reengineering of software that is reliable and works efficiently on real machines

Software Engineering Produces:

  • Quality software
  • Software that is delivered on time
  • Cost effective production of software (within budget)
  • Software that satisfies its requirements.
  • Software systems that behave reliably and efficiently
  • Software that is affordable to develop and maintain
  • Solutions that conform to organizational and financial constraints

I tried to break the multitude of ideas into three categories, but they are admittedly a bit fuzzy, and you could probably move some of them around.  The point of the categorization exercise is to separate the definition from the broader idea of what the discipline encompasses and from the attributes of the structures and systems it produces; again, things can get fuzzy here.  In thinking about these and my own experience with software development, I came up with a list of things that I feel should be defined to make software engineering a real discipline:

  1. What is the definition of the engineering discipline we are describing?
  2. What are the produced artifacts, e.g. structures and processes that the discipline creates?
  3. What non-managerial processes and methodologies are used to create the produced artifacts?
  4. What types of documentation (descriptive artifacts) are used to describe the produced artifacts and the processes to create those produced artifacts?
  5. How does the underlying math and science support the engineering discipline and how do these foundational disciplines allow one to reason about the produced artifacts that the engineering discipline is trying to construct?  How do these disciplines define and drive the descriptive artifacts?
  6. How does one organize and manage the effort including various roles people will play to actually carry out the work to bring the system or structure to fruition?

So coming back to the original purpose of this post, my answer to the question of what is software engineering, number one above, is:

Software engineering is the branch of science and technology where mathematical, computer science, linguistic, social knowledge and system relevant domain specific knowledge are applied artfully, creatively, and efficiently to invent, design, build, maintain, research, and improve software systems.

That’s the easy part.  The hard part is answering the other five questions.  I know that there are a number of industry standard answers for these that include things like UML, Agile, Scrum, etc., but are those really valid, and are they really working?  I don’t think so, so maybe it’s time to look for other answers.

23 March 2015

What is Software Correctness?

Software correctness, which is really software quality, is not one thing.  It is comprised of a number of different and sometimes conflicting attributes.  As always, Gerald J. Sussman provides interesting insights, in this case in his "We really don’t know how to compute" talk.  He makes the point that "correctness" may not be the most important attribute of software; the example he gives is that if Google’s page craps out on you, you will probably try again and it will probably work.  Actually this example might be more about reliability, but it got me thinking, as I had always thought that "software correctness" meant the combination of accuracy and reliability, and that this "correctness" was the most important feature of software.  But this correctness is really two software quality attributes, and the Google search software exemplifies both: one is availability and the other is search accuracy, or the fulfillment of the user’s expectations.  One could argue that availability is not a software attribute, but it is affected by the quality of the software, and it is an attribute of a software system even if some of it is driven by hardware.  These two attributes for Google search do not have to be perfect: the search quality needs to be at least better than the competition, and the availability has to be high enough that people won’t be discouraged from using it; for Google both are very high.  These parameters of quality vary in other domains.  A machine administering a dose of radiation to a cancer patient needs stringent requirements as to the correctness of the dose, and in fact there have been deaths due to exactly this type of problem.
Another example is software guiding a spacecraft or a Mars rover, which needs to be extremely reliable and correct, and when it hasn’t been, well, there are a few space bricks floating around the solar system due to software errors.  In the real world, especially now, software is everywhere and lives in a variety of contexts.  So which quality attributes are important will vary based on a system’s domain and context.

So what makes software good?  Google first conquered search with Map-Reduce and other processes running on custom commodity hardware that allowed for flexible fault tolerant distributed massive data crunching.  Amazon has commoditized its server infrastructure to sell spare capacity that can be easily used by anyone.  Apple created easy to use attractive interfaces for the iPhone and iPad that consumers loved with similar usability now being offered on Android.  NASA has successfully landed on the frigid world of Titan, touched the edge of the heliosphere, and has run rovers on Mars.  While there are obviously hardware components to these systems, it was the various types of software quality that played a pivotal role in these and the many other great software successes.

It is easy to look back at these successes.  With the exception of NASA, I doubt there was a deliberate effort to choose which quality aspects to focus on; most were probably chosen intuitively based on business needs.  Software quality is essential to any organization that creates software.  It is an established subdomain, or perhaps a related sibling field, of software development.  It has consortiums, professional organizations, and many books written on the subject.  So I delved into some of these, and you find mention of Agile and TDD.  You also start to see things like CMMI, ISO 9000, and Six Sigma.  Some of the resources seem to be written in some alien language, perhaps called ISO900ese, that seems related to software development but leaves me feeling very disconnected from it.  I think the problem might be that these attempts to formalize software quality as a discipline suffer from the same problem that plagues the attempt to formalize software engineering as a discipline: they conflate the development process with the final product, and the software quality formalization might be further exacerbated by the influence of the domain of manufacturing and its jargon.  I think these established methodologies have value, but in general there is a lot to weed through and the good parts are hard to pick out.  So I thought it might be fruitful to look at the problem from another angle that might provide a better way to reason about the quality of a system not only after and during its development but also before development begins.

A list of "quality parameters" for software is a requirement for software engineering to be a real engineering discipline.  These parameters could then be viewed qualitatively and quantitatively, and perhaps even measured.  In engineering disciplines various parameters are tradeoffs: you can make something sturdier, but you will probably raise the cost and/or the weight.  Consider plastic bumpers on cars; metal would be more durable but heavier and more costly.  If one were to think of some parameters in software as tradeoffs, you can see that improvements in some parameters may cause others to decline.  In my post "The Software Reuse Complexity Curve" I posit that reuse and complexity are interrelated and that the level of reuse can affect understandability and maintainability.  Another example is performance: increasing the performance of certain modules or functions might increase their complexity, which, without proper documentation, could reduce understandability and cause a higher long-term maintenance cost.
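
The performance tradeoff can be made concrete with a toy sketch (hypothetical code, not from any post referenced here): two versions of the same function, where the faster one no longer mirrors its specification and so costs something in understandability.

```python
# Illustrative sketch of the performance-vs-understandability tradeoff.
# Both functions compute Fibonacci numbers; the code is a made-up example.

def fib_clear(n):
    """Direct transcription of the mathematical definition: trivial to
    verify against the spec, but exponential in n."""
    if n < 2:
        return n
    return fib_clear(n - 1) + fib_clear(n - 2)

def fib_fast(n):
    """Linear time, but the rolling-pair state is an implementation
    trick that no longer mirrors the definition; without this comment,
    a maintainer has to reverse-engineer the intent."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

assert fib_clear(10) == fib_fast(10) == 55
```

Scaled up from a toy to a real codebase, this is exactly the kind of parameter tradeoff, performance bought at the price of maintainability, that a quantitative list of quality parameters would let a team see and weigh explicitly.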

So to come back to the question, and to rephrase it: what are the attributes that constitute software quality?  I am not the only person to ask this; quite a few people have asked it and compiled good lists.  One post by Peteris Krumins lists some code quality attributes that Charles E. Leiserson, in his first lecture of MIT's Introduction to Algorithms, mentions as being more important than performance.  Additionally, Bryan Soliman’s post "Software Engineering Quality Aspects" provides a nice list with a number of references.  I thought I’d create my own list derived from those sources and my own experience.  I am categorizing mine into three groups: requirements based (including nonfunctional requirements), structural, and items that fall into both categories.

Requirements Based (including nonfunctional requirements):
  • Performance
  • Correctness (Requirements Conformance)
  • User Expectations
  • Functionality
  • Security
  • Reliability
  • Scalability
  • Availability
  • Financial Cost
  • Auditability


Structural:
  • Modularity
  • Component Replaceability
  • Recoverability
  • Good Descriptive Consistent Naming
  • Maintainability/Changeability
  • Programmer's time
  • Extensibility/Modifiability
  • Reusability
  • Testability
  • Industry standard best practices (idiomatic code)
  • Effective library use, leveraging open source (minimal reinvention)
  • Proper Abstractions and Clear code flow
  • Good and easy to follow architecture/Conceptual Integrity
  • Has no security vulnerabilities
  • Short low complexity routines

Both Requirements Based and Structural:
  • Robustness/Recoverability
  • Usability
  • Adaptability
  • Efficiency
  • Interoperability/Integrate-ability
  • Standards
  • Durability
  • Complexity
  • Installability

This is by no means intended as a definitive list.  The idea is to put forth the notion of enumerating and classifying these attributes as a way to reason about, plan, and remediate the quality of software projects: basically, to create a methodical way to address software quality throughout the software development lifecycle.  In planning, software quality attributes could potentially be rated with respect to priority.  This could also allow conflicting attributes to be more effectively evaluated.
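
As a sketch of what rating attributes by priority might look like, here is a hypothetical priority-weighted scorecard for comparing two design alternatives.  The attribute names come from the lists above; the weights, scores, and function are invented for illustration.

```python
# Hypothetical sketch: rate quality attributes by priority so that
# conflicting attributes can be compared explicitly during planning.
# All weights and scores below are invented examples.

priorities = {  # 1 (nice to have) .. 5 (critical), set during planning
    "Availability": 5,
    "Performance": 4,
    "Maintainability": 3,
    "Installability": 1,
}

def weighted_score(assessment):
    """Combine per-attribute assessments (0.0-1.0) into a single
    priority-weighted score for comparing design alternatives."""
    total_weight = sum(priorities.values())
    return sum(priorities[attr] * score
               for attr, score in assessment.items()) / total_weight

# Two hypothetical designs that trade availability against maintainability:
design_a = {"Availability": 0.9, "Performance": 0.8,
            "Maintainability": 0.5, "Installability": 0.9}
design_b = {"Availability": 0.6, "Performance": 0.8,
            "Maintainability": 0.9, "Installability": 0.9}

# Because Availability carries the highest priority here, design A
# scores higher despite its weaker maintainability.
assert weighted_score(design_a) > weighted_score(design_b)
```

The specific arithmetic matters less than the exercise: forcing the priorities to be written down is what makes the conflicts between attributes visible and negotiable.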

The attribute of availability is fairly easy to quantify and often gets addressed in some type of Service Level Agreement (SLA).  Although availability cannot be predicted, the underlying system that "provides" it can be tested; Netflix developed Chaos Monkey in part to verify (test) this quality attribute of their production system.  Tests and QA testers are a way to verify requirements correctness and the correctness of system-level components and service contracts.  Static analysis tools like PMD, Checkstyle, and SonarQube can help with code complexity, code conventions, and even naming.
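
To illustrate why availability quantifies so easily, the standard back-of-the-envelope calculation maps an SLA target directly to allowed downtime per year (the function here is my own sketch, not from any SLA standard):

```python
# Sketch: an availability percentage maps directly to the downtime
# budget a system may spend while still meeting its SLA.

def allowed_downtime_minutes_per_year(availability_pct):
    """Minutes per year a system may be down while still meeting
    the given availability percentage."""
    minutes_per_year = 365 * 24 * 60
    return minutes_per_year * (1 - availability_pct / 100)

for target in (99.0, 99.9, 99.99):
    print(f"{target}% availability -> "
          f"{allowed_downtime_minutes_per_year(target):.0f} min/year allowed")
```

Running this shows the familiar "nines" ladder: roughly 5256 minutes a year at 99%, about 526 at 99.9%, and about 53 at 99.99%, which is why an SLA target is such a concrete, testable requirement compared to most of the other attributes on the list.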

Complexity is an elusive idea in this context, and it crosses many of the above ideas.  Code can be overly complex, as can UIs, usability (think command line), installability, etc.  Another cross-cutting concept is understandability, which can affect code, UIs, etc.; it is in part related to complexity but can also manifest in the use of bad or unconventional nomenclature.  There may be other cross-cutting concerns, and it seems that the spectrum of quality attributes might be better reasoned about by separating out these concepts and developing general ways to reason about them.

I should note that I include financial cost as one of the attributes even though it might not be a real quality attribute; higher quality may necessitate higher cost, but not necessarily.  If one were to develop two systems that had the same level of quality across all of the chosen attributes, but one system had twice the cost, the cheaper one would be the better value, hence more desirable.  So cost is not a quality attribute per se, but it is always an important factor in software development.

In summary my vision is a comprehensive methodology where the quality attributes of a software system are defined and prioritized and hopefully measured.  They are planned for and tracked throughout the life cycle and perhaps adjusted as new needs arise or are discovered.  The existing tools would be used in a deliberate manner and perhaps as thinking about the quality attributes becomes more formalized new tools will be developed to cover areas that are not currently covered.  It will no longer be about code coverage but quality coverage.

References and Further Reading

Anthony Ferrara, Beyond Clean Code,  http://blog.ircmaxell.com/2013/11/beyond-clean-code.html

David Chappell,  The Three Aspects of Software Quality: Functional, Structural, and Process, http://www.davidchappell.com/writing/white_papers/The_Three_Aspects_of_Software_Quality_v1.0-Chappell.pdf

Rikard Edgren, Henrik Emilsson and Martin Jansson, Software Quality Characteristics,

Bryan Soliman, Software Engineering Quality Aspects, http://bryansoliman.wordpress.com/2011/04/14/40/

Ben Moseley and Peter Marks, Out of the Tar Pit, http://shaffner.us/cs/papers/tarpit.pdf

Brian Foote and Joseph Yoder, Big Ball of Mud, http://laputan.org/mud/

Wikipedia, Software quality, http://en.wikipedia.org/wiki/Software_quality

Franco Martinig, Software Quality Attributes, http://blog.martinig.ch/quotes/software-quality-attributes/

MSDN, Chapter 16: Quality Attributes, http://msdn.microsoft.com/en-us/library/ee658094.aspx

Advoss, Software Quality Attributes, http://www.advoss.com/software-quality-attributes.html

Eric Jacobson, CRUSSPIC STMPL Reborn, http://www.testthisblog.com/2011/08/crusspic-stmpl-reborn.html

14 August 2014

A Brief and Incomplete History of Computing from Ancient to Modern

I recently attended an event here in DC where Woz was the guest of honor.  I was unaware of his book at the time.  So leading up to it, I thought it would be fun to re-watch a couple of old documentaries in which he appears (I forewent his foray into reality TV).  I chose "The Machine that Changed the World" (youtube), a PBS documentary from 1992 that is farsighted enough to not feel completely dated, and its chronicling of the early history of computing is nicely done.  The other was Robert X. Cringely’s 1996 "Triumph of the Nerds" (youtube), a fun account of the rise of the personal computer.

ENIAC took up 1800 square feet

Watching these rekindled an idea I had to do a history of events, from ancient to modern, that got us to where we are now.  So I set forth to collect and list notable events that propelled both the science and industry of computing to its present state.  My goal was to include the people and events that not only shaped the technology but also shaped the industry and the language we use to describe things.  One example is the term "bug," which many people may take for granted.  It is derived from a problem caused by an actual insect: a moth stuck in a relay.  The other and perhaps more profound case is that the word "computer," a name we also take for granted, was a historical accident.  Prior to 1940 a computer was a person who did computations or calculations.  Many of the first machines, with the exception of the decryption work done at Bletchley Park, were developed for doing computations, specifically ENIAC, which was initially designed to calculate artillery firing tables.  ENIAC became the first well-known computer and captured the public’s imagination when the "Giant Brain" was displayed to the American press in 1946.  Not long after that, the word computer went from describing a person to describing a machine.  Meanwhile the Colossus machine from Bletchley Park, where Alan Turing did his wartime codebreaking work, remained a secret until the 1970s.  Had that work been made public immediately following the war, we might be calling these machines decoders or decrypters, and the history of the computer might have taken some different turns; who knows, maybe even Silicon-on-Avon, okay, that’s probably a stretch.

As I chronicled these events, in which I tried to include when certain ideas and inventions occurred, I noticed some interesting trends.  For example, some ideas, like the algorithm and robotics (or mechanical automata), are quite ancient, going back over a thousand years.  Many of the technologies we use today date back about 30 to 50 years, and some are even older.  Magnetic storage was patented in 1878, making it 19th century technology, and if you think about it, it really is fairly primitive; I’ve heard hard disks described as "spinning rust."  Admittedly, today’s hard disks contain very sophisticated electronics and electromechanics, but they are still technologically primitive (effectively sophisticated 19th century technology) compared to SSD technology, which is 20th century technology, with DRAM from 1966 and NAND flash invented in 1980.  The real advances, not surprisingly, are the advances in materials science, fabrication, and manufacturing that take 1970s technological developments like active matrix display technology and make them small, cheap, and ubiquitous today.

There are many interesting people who played very important and pivotal roles yet seem to be largely forgotten.  One of my biggest pet peeves is the cult of personality effect that shows up in our industry, and the worst case of it is probably Steve Jobs.  I remember working with one young fanboy who was telling me how great Steve Jobs was and that Steve Jobs thought of the iPad in the 1970s, to which I replied that it was depicted in a movie called 2001 from the late 1960s, which I would wager Jobs had seen. I include that movie in my list of events because it does depict advanced computer technology.  While Steve Jobs seems to garner more attention than he deserves, and I do credit Apple with certain events that shaped the industry as it is today, I think there are several more interesting people who have made much more substantial contributions. One is Robert Noyce, who cofounded Intel and invented the practical version of the integrated circuit that we use today; he was kind of a Jobs and Wozniak combo in terms of technical brilliance and business acumen.   Another interesting story surrounds ARPANET, the proto-internet.  While Al Gore may have tried to take credit for its invention, J. C. R. Licklider might have actually been the one who effectively invented it.  He convinced Bob Taylor of the idea that became ARPANET.  Taylor is another of those people who deserves recognition: after he created the initial proposal for ARPANET he went on to lead Xerox PARC for about 15 years from its inception.   Also, Xerox PARC often seems to get the credit for much of our modern computing interfaces, but Douglas Engelbart really deserves that credit, as he demonstrated most of those ideas in 1968, two years before Xerox established PARC. He of course was a big influence on that work at PARC.

A big driving force for the creation of practical computer technology was the need for machines to do tedious, laborious mental work, especially large quantities of calculations.  A prime example is the creation of mathematical tables giving the values of various computations such as logarithms, which were used extensively before the advent of affordable desktop and pocket scientific calculators in the 1970s.  Babbage’s work was aimed at solving this problem.  During World War II the need for artillery tables exceeded the available human labor, which drove the military to fund the creation of ENIAC, which was not delivered until after the war.  The war also drove the Bletchley Park work, again an attempt to automate the tedious work of manually deciphering the coded messages of the Enigma machine.

Big data circa 1950s. ~ 4GB of punched cards.

Big data is a common term today, but in many respects it has a history going back over a hundred years, and it could also be considered a driver of the development of practical computer technology.  One continual source of big data was, and probably still is, the collection of census data.  In 1889 Herman Hollerith developed an electromechanical tabulating machine that was used to process data for the 1890 US census.  His 1889 patent had the title "Art of Compiling Statistics" and he later established a company to sell the machines, which were used worldwide for census and other applications. He is credited with laying the foundations of the modern information processing industry. His company was eventually merged with three others into the firm that became IBM.  In 1948 the US Census Bureau had a big data problem that was outgrowing the existing electromechanical technology.  John Mauchly, one of the inventors of ENIAC and cofounder of the Eckert–Mauchly Computer Corporation, sold the Census Bureau on ENIAC's successor, named UNIVAC, to handle census data.  The company survives today as Unisys.

Grace Murray Hopper at the UNIVAC keyboard

A huge problem with software projects today is project failure, and the history of computing has its share as well.  Software projects sometimes outright fail and all too often are late and over budget.  Babbage’s Difference Engine project was cancelled after 15 years and an expenditure roughly ten times the original amount granted.  The failure was due to bad management on his part and what might be called scope creep, in that he got distracted by designing the more general Analytical Engine.  UNIVAC is another notable early project that was delivered late and over budget.

The real challenge of a post like this is what to include. The Machine that Changed the World documentary starts with writing, which was our first precise technique for recording information. I went back farther to include the recording of numerical information. A big influence on the way I view history comes from a PBS show that aired in the eighties called Connections, created by James Burke. In it he traces, in a very engaging way, various technological developments as cause and effect events. In a sense he demonstrated that history was an interesting living topic, not the banal languid mimeographs that I experienced in school. I tried to choose events that were either the inception of something or the point at which it caused a significant change. In the spirit of Connections there are definite threads here: the thread of calculation and computation, including number representation for those purposes; the thread of encryption; and the thread of dealing with information, in terms of storage, replication and processing. Some of the threads are obvious and some are more subtle.

History of Computing from Ancient to Modern

Pre 3000 BCE: Tally Marks, a unary numeral system, are used to keep track of numerical quantities such as animal herd sizes.

c. 3200 BCE:  Mesopotamian Cuneiform appears.  Writing may have been independently developed in other places as well. Although protowriting systems previously existed, society now had a more concise mechanism to record information.

c. 3000 BCE:  The development of the Egyptian hieroglyphic number system and the Sumerian and Babylonian base-60 (sexagesimal) number system marks the start of non-unary number systems.

2800 BCE: The appearance of the I Ching which included the Bagua or eight trigrams is possibly the first occurrence of binary numbers.

2700–2300 BCE: The Sumerian Sexagesimal Abacus allows human mechanical calculations.

c. 500 BCE: Pāṇini describes the grammar and morphology of Sanskrit.

c. 300 BCE: Euclid's Elements describes, among other things, an algorithm to compute the greatest common divisor.
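
Euclid's method survives essentially unchanged in modern software; here is a minimal sketch in Python (the function name and example values are mine, not Euclid's):

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: repeatedly replace the pair (a, b)
    with (b, a mod b) until the remainder is zero."""
    while b:
        a, b = b, a % b
    return a

print(gcd(252, 105))  # 21
```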

c. 100 BCE: The Greeks develop the Antikythera mechanism, the first known mechanical computation device.

c. 5th century BCE:  The scytale transposition cipher was used by the Spartan military, although ciphers probably existed earlier.

c. 1 - 300 CE:  Hindu numeral system, later to become the Hindu–Arabic numeral system, is invented by Indian mathematicians.

c. 201 - 299 CE: Diophantus authors a series of books called Arithmetica, which deal with solving algebraic equations.

c. 801 - 873 CE: Al-Kindi develops cryptanalysis by frequency analysis.
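
Al-Kindi's insight can be sketched in a few lines; this toy example (my own construction, not his) recovers the shift of a Caesar-style cipher by assuming the most frequent ciphertext letter stands for 'e', the most common letter in English:

```python
from collections import Counter

def caesar(text: str, shift: int) -> str:
    """A toy Caesar cipher: shift each letter by `shift` positions."""
    return ''.join(
        chr((ord(c) - ord('a') + shift) % 26 + ord('a')) if c.isalpha() else c
        for c in text.lower())

def guess_caesar_shift(ciphertext: str) -> int:
    """Frequency analysis: assume the most common ciphertext
    letter stands for plaintext 'e' and derive the shift."""
    letters = [c for c in ciphertext.lower() if c.isalpha()]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ord(most_common) - ord('e')) % 26

plain = "the quick brown fox jumps over the lazy dog and seeks the green trees"
ct = caesar(plain, 3)
print(guess_caesar_shift(ct))  # recovers the shift of 3
```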

c. 825 CE: Muḥammad ibn Mūsā al-Khwārizmī writes On the Calculation with Hindu Numerals, which helps spread the Indian system of numeration throughout the Middle East and Europe. It is translated into Latin as Algoritmi de numero Indorum; the Latin rendering of al-Khwārizmī's name, Algoritmi, gives rise to the term "algorithm".

850 CE: The Banu Musa publish The Book of Ingenious Devices, a large illustrated work on mechanical devices, including automata.

1202: Leonardo of Pisa (Fibonacci) introduces Hindu–Arabic numerals to Europe in his book Liber Abaci.

c. 1283: The verge escapement is developed in Europe, allowing the development of all-mechanical clocks and other mechanical automata.

1450:  Johannes Gutenberg invents the movable-type printing press in Europe. While printing was actually invented earlier in Asia, Gutenberg's press mechanizes the duplication of information and creates an industrial boom around the selling and dissemination of information.

c. 1624: The slide rule, based on the recently invented logarithm, is developed to simplify multiplication and division.

1642: Blaise Pascal invents a mechanical calculator known as Pascal's calculator. It is later discovered that Wilhelm Schickard had described a similar device in letters to Johannes Kepler in 1623.

1654: Pierre de Fermat’s and Blaise Pascal’s correspondence leads to mathematical methods of probability.

1679: Gottfried Leibniz invents the modern binary number system.
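
Leibniz's binary system is the arithmetic of every modern computer; a small illustrative sketch (the function is mine):

```python
def to_binary(n: int) -> str:
    """Repeated division by two, collecting remainders: the binary
    representation Leibniz described."""
    bits = ""
    while n > 0:
        bits = str(n % 2) + bits
        n //= 2
    return bits or "0"

print(to_binary(42))  # '101010'
```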

1694: Gottfried Wilhelm Leibniz completes the Stepped Reckoner, a digital mechanical calculator that can perform all four arithmetic operations: addition, subtraction, multiplication and division.

1725: Basile Bouchon first uses perforated paper tape to control a loom.

1736: Leonhard Euler lays the foundation of graph theory and topology in the Seven Bridges of Königsberg problem.

1763: Two years after Thomas Bayes dies his "An Essay towards solving a Problem in the Doctrine of Chances" is published in the Philosophical Transactions of the Royal Society of London.
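
Bayes's essay underlies the rule now named for him; a worked example with made-up numbers (a hypothetical test that is 99% sensitive for a condition with 1% prevalence and a 5% false positive rate):

```python
def posterior(prior: float, sensitivity: float, false_positive: float) -> float:
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E), where the
    evidence P(E) sums over both ways of testing positive."""
    evidence = sensitivity * prior + false_positive * (1 - prior)
    return sensitivity * prior / evidence

# Even with a 99%-sensitive test, a positive result only implies
# about a 17% chance of actually having the rare condition.
print(round(posterior(0.01, 0.99, 0.05), 3))  # 0.167
```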

1786: J. H. Müller, a Hessian army engineer, publishes a description of a difference engine.

1801: Joseph Marie Jacquard demonstrates a working mechanical loom, the Jacquard Loom, that uses punch cards based on previous work by Jacques Vaucanson and Basile Bouchon.

1823: The British government gives Charles Babbage £1700 to start work on the Difference Engine proposed the previous year.  In 1842, after a total expenditure of £17,000 without a deliverable, the project was canceled.

1832: Évariste Galois is killed in a duel leaving behind a legacy in algebraic theory to be discovered decades later.

1837: Charles Babbage describes a successor to the Difference Engine, the Analytical Engine, a proposed mechanical general-purpose computer and the first design for a computer that could be described in modern terms as Turing-complete.  Babbage receives assistance on this project from Ada Byron, Countess of Lovelace. It also serves to distract Babbage from completing the Difference Engine; neither project is ever completed.

1844: Samuel Morse demonstrates the telegraph to Congress by transmitting the famous message "What hath God wrought" in Morse code over a wire from Washington to Baltimore.

1854: English mathematician George Boole publishes An Investigation of the Laws of Thought, laying the foundations of Boolean algebra.

1878: Oberlin Smith patents magnetic storage in the form of wire recording.

1889: Herman Hollerith develops an electromechanical tabulating machine using punched cards for the United States 1890 census.

1889: Giuseppe Peano presents "The principles of arithmetic", based on the work of Dedekind; the notion of primitive recursion begins formally with the Peano axioms.

1897: Karl Ferdinand Braun builds the first cathode-ray tube (CRT) and cathode ray tube oscilloscope.

1900: German mathematician David Hilbert publishes a list of 23 unsolved problems.

1907: The thermionic triode, a vacuum tube, propels the electronics age forward, enabling amplified radio technology and long-distance telephony.

1910: Alfred North Whitehead and Bertrand Russell publish Principia Mathematica which introduces the idea of types.

1914: Spanish engineer Leonardo Torres y Quevedo designs an electro-mechanical version of the Analytical Engine of Charles Babbage which includes floating-point arithmetic.

1918: German engineer Arthur Scherbius applies for a patent for a cipher machine using rotors, which gives rise to the Enigma machine, a family of related electro-mechanical rotor cipher machines used for enciphering and deciphering secret messages.

1925: Andrey Kolmogorov publishes "On the principle of the excluded middle".  He supports most of Brouwer's results but disputes a few, and proves that under a certain interpretation all statements of classical formal logic can be formulated as those of intuitionistic logic.

1928: David Hilbert formulates the Entscheidungsproblem, German for 'decision problem', from his 2nd and 10th problems.

1931: Kurt Gödel publishes "On Formally Undecidable Propositions of "Principia Mathematica" and Related Systems".

1933: Andrey Kolmogorov publishes Foundations of the Theory of Probability, laying the modern axiomatic foundations of probability theory.

1936: Alonzo Church and Alan Turing publish independent papers showing that a general solution to the Entscheidungsproblem is impossible.  Church’s paper introduces the lambda calculus and Turing’s paper introduces the ideas that become known as Turing machines.

1938: Konrad Zuse builds the Z1, a binary electrically driven mechanical calculator. It is the first freely programmable computer to use Boolean logic and binary floating point numbers.

1943: Warren McCulloch and Walter Pitts publish "A Logical Calculus of the Ideas Immanent in Nervous Activity", making significant contributions to the study of neural networks, automata theory, the theory of computation and cybernetics.

1944: The first Colossus computer, designed by British engineer Tommy Flowers, is operational at Bletchley Park. It was designed to break the complex Lorenz ciphers used by the Nazis during WWII and reduced the time to break Lorenz messages from weeks to hours.

1945: Samuel Eilenberg and Saunders Mac Lane introduce categories, functors, and natural transformations as part of their work in algebraic topology.

1945: John von Neumann publishes the "First Draft of a Report on the EDVAC", the first published description of the logical design of a computer using the stored-program concept, which came to be known as the von Neumann architecture.

1945: Grace Hopper records the first actual computer "bug" - a moth stuck between relays.

1946: J. Presper Eckert and John Mauchly found the Electronic Control Company, soon renamed the Eckert–Mauchly Computer Corporation, the first commercial computer company.

1947: John Bardeen, Walter Brattain, and William Shockley invent the first point-contact transistor. Their Bell Labs colleague John R. Pierce coins the term transistor, a portmanteau of "transfer resistor".

1947: Freddie Williams and Tom Kilburn develop the Williams–Kilburn tube, the first random-access memory: a cathode ray tube used as a computer memory to electronically store binary data.

1948: Claude Shannon publishes "A Mathematical Theory of Communication" as a two-part series in the Bell System Technical Journal, describing information entropy as a measure of the uncertainty in a message and essentially inventing the field of information theory.
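
Shannon's entropy measure is easy to compute; a small sketch (the example strings are mine):

```python
import math
from collections import Counter

def entropy_bits(message: str) -> float:
    """Shannon entropy H = sum p * log2(1/p): the average number
    of bits needed per symbol of the message."""
    counts = Counter(message)
    total = len(message)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

print(entropy_bits("aaaa"))  # 0.0 -- a constant message carries no information
print(entropy_bits("abcd"))  # 2.0 -- four equally likely symbols need 2 bits each
```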

1948: Bernard M. Oliver, Claude Shannon and John R. Pierce publish "The Philosophy of PCM", following patent applications by Oliver and Shannon in 1946 and by Pierce in 1945, both titled "Communication System Employing Pulse Code Modulation".  All three are credited as the inventors of PCM.

1948: The world's first stored-program computer, Manchester Small-Scale Experimental Machine (SSEM), nicknamed Baby, runs its first program.

1951: David A. Huffman devises Huffman coding for a term paper, assigned by his MIT information theory professor Robert Fano, on the problem of finding the most efficient binary code.
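
Huffman's greedy construction is short enough to sketch: repeatedly merge the two least frequent subtrees until one tree remains (this compact dictionary-based variant is my own phrasing of it):

```python
import heapq
from collections import Counter

def huffman_codes(text: str) -> dict:
    """Build a Huffman code by merging the two least frequent
    subtrees; codes are read off as the merge prefixes accumulate."""
    # Heap entries: (frequency, tiebreak, {symbol: code-so-far}).
    heap = [(freq, i, {sym: ""})
            for i, (sym, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

codes = huffman_codes("abracadabra")
# Frequent symbols get shorter codes: 'a' (5 uses) beats 'd' (1 use).
print(len(codes["a"]) < len(codes["d"]))  # True
```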

1952: Tom Watson, Jr. becomes president of IBM and shifts the company’s direction away from electromechanical punched card systems towards electronic computers.

1953: Jay Forrester installs magnetic core memory on the Whirlwind computer.

1956: Stephen Cole Kleene describes regular languages using his mathematical notation called regular sets, thereby inventing regular expressions.
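
Kleene's regular sets live on in every regex engine; a tiny demonstration using Python's re module (the pattern is just an illustrative example):

```python
import re

# The regular expression a+b denotes Kleene's regular set
# { "ab", "aab", "aaab", ... }: one or more 'a's followed by 'b'.
pattern = re.compile(r"a+b")

print(bool(pattern.fullmatch("aaab")))  # True
print(bool(pattern.fullmatch("ba")))    # False
```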

1956: Wen Tsing Chow invents Programmable Read Only Memory (PROM) at American Bosch Arma Corporation at the request of the United States Air Force to come up with a more flexible and secure way of storing the targeting constants in the Atlas E/F ICBM's airborne digital computer.

1956: IBM introduces the IBM 305 RAMAC the first commercial computer using a moving-head hard disk drive (magnetic disk storage) for secondary storage.

1956: William Shockley moves back to California to be close to his aging mother in Palo Alto and starts Shockley Semiconductor Laboratory in a small commercial lot in nearby Mountain View, setting the agrarian area, with its cheap land and fruit orchards, on a path to becoming known as Silicon Valley.

1957: Robert Noyce, Gordon Moore and six others known as the "traitorous eight" leave Shockley Semiconductor Laboratory to start Fairchild Semiconductor.

1957: Noam Chomsky publishes Syntactic Structures, which includes the ideas of phrase structure grammar and a finite-state, communication-theoretic model based on a conception of language as a Markov process.

1957: John Backus creates the first FORTRAN compiler at IBM.

1958: John McCarthy develops Lisp, which is based on Alonzo Church’s lambda calculus. It also includes the new idea of the conditional expression.

1958: Jack Kilby develops the integrated circuit made from germanium and gold wires at Texas Instruments.

1959: Jean Hoerni develops the planar transistor and the planar manufacturing process at Fairchild Semiconductor.

1959: Robert Noyce develops a more practical silicon version of the integrated circuit which has the "wires" between components built directly into the substrate.

1959: Noam Chomsky publishes "On Certain Formal Properties of Grammars" which describes the Chomsky Hierarchy which includes its relation to automata theory.

1959: John Backus publishes "The Syntax and Semantics of the Proposed International Algebraic Language of the Zurich ACM-GAMM Conference" in the UNESCO Proceedings of the International Conference on Information Processing. It introduces a notation to describe the syntax of programming languages, specifically context-free grammars; the notation later becomes known as Backus–Naur Form (BNF).

1959: Arthur Samuel defines machine learning as a "Field of study that gives computers the ability to learn without being explicitly programmed".

1962: The Radio Sector of the EIA introduces RS-232.

1963: American Standards Association publishes the first edition of ASCII - American Standard Code for Information Interchange.

1964: The CDC 6600 is released. Designed by Seymour Cray, it is dubbed a supercomputer.

1964: MIT, General Electric and Bell Labs form a consortium to develop the Multics operating system.

1964: John G. Kemeny and Thomas E. Kurtz create the BASIC language to enable students in fields other than science and mathematics to use computers.

1965: Gordon Moore describes a trend in the paper "Cramming more components onto integrated circuits" which becomes known as Moore’s Law.

1966: Robert Dennard invents DRAM at the IBM Thomas J. Watson Research Center.

1968: Douglas Engelbart gives "The Mother of All Demos", a demonstration that introduces a complete computer hardware/software system called the oN-Line System (NLS). It demonstrates many fundamental elements of modern personal computing: multiple windows, hypertext, graphics, efficient navigation and command input, video conferencing, the computer mouse, word processing, dynamic file linking, revision control, and a collaborative real-time editor (collaborative work).

1968: Edsger Dijkstra’s letter "Go To Statement Considered Harmful" is published in Communications of the ACM, helping pave the way for structured programming.

1968: The Apollo Guidance Computer makes its debut orbiting the Earth on Apollo 7. The Cold War and the Apollo program drive the demand for lighter more reliable computers using integrated circuits.

1968: ARPA approves Bob Taylor’s plan for a computer network.  Taylor, along with Ivan Sutherland, was originally convinced of the idea by J. C. R. Licklider, who initially referred to the concept as an "Intergalactic Computer Network".  Sutherland and Taylor envisioned a computer communications network that would, in part, allow researchers using computers provided to various corporate and academic sites by ARPA to make new software and other computer science results quickly and widely available.  The project became known as ARPANET.

1968: Arthur C. Clarke and Stanley Kubrick release 2001: A Space Odyssey a science fiction movie that features advanced computing technology including an AI computer named the HAL 9000 with a voice interface that manages ship operations. The astronauts use multimedia tablet technology (NewsPad) to catch up on family and worldwide events from Earth.

1968: Gordon E. Moore and Robert Noyce found Intel.

1969: Willard Boyle and George E. Smith invent the charge-coupled device (CCD) at AT&T Bell Labs.

1969: Ken Thompson creates his game Space Travel on the Multics Operating System.

1969: Bell Labs pulls out of Multics consortium.

1970: Ken Thompson and Dennis Ritchie create a new lighter weight "programmer's workbench" operating system with many of the robust features of Multics.  The new operating system runs its first application: Space Travel.  Brian Kernighan is often credited with coining the name Unics as a play on the Multics name; it later becomes known as Unix.

1970: Edgar F. Codd publishes "A Relational Model of Data for Large Shared Data Banks" which establishes a relational model for databases.

1970: Xerox opens the Palo Alto Research Center (PARC), which attracts some of the United States’ top computer scientists. Under Bob Taylor’s guidance as associate manager it produces many groundbreaking inventions that later transform computing, including the laser printer, computer-generated bitmap graphics, the WYSIWYG text editor, Interpress (a resolution-independent graphical page-description language and the precursor to PostScript), and the model–view–controller software architecture.

1971: IBM introduces the first commercially available 8-inch floppy diskette.

1971: Intel introduces the first commercially available microprocessor, the 4-bit Intel 4004.

1971: Raymond Tomlinson implements the first email system able to send mail between users on different hosts connected to the ARPAnet.  To achieve this, he uses the @ sign to separate the user from their machine, which has been used in email addresses ever since.

1972: Peter Brody's team at Westinghouse produces the first active-matrix liquid-crystal display panel.

1972: Hewlett-Packard introduces the HP-35 the world's first scientific pocket calculator.

1973: The Eckert–Mauchly ENIAC patent is invalidated.

1973: Robert Metcalfe devises the Ethernet method of network connection at Xerox PARC.

1973: Gary Kildall develops CP/M, which stands for "Control Program/Monitor".

1974: Vint Cerf and Bob Kahn publish a paper titled "A Protocol for Packet Network Intercommunication" describing an internetworking protocol for sharing resources using packet switching among the nodes, which was henceforth called the Internet Protocol Suite and informally known as TCP/IP.

1975: The January edition of Popular Electronics features the MITS Altair 8800 computer kit, based on Intel’s 8080 microprocessor, on its cover.

1975: Bill Gates and Paul Allen found Microsoft to develop and sell BASIC interpreters for the Altair 8800.

1975: Softlab Munich releases Maestro I the world's first integrated development environment (IDE).

1975: Frederick Brooks publishes The Mythical Man-Month: Essays on Software Engineering, a book on software engineering and project management.

1976: Kenneth Appel and Wolfgang Haken prove the Four Color Theorem by computer, the first major mathematical theorem to be proved with the aid of a computer.

1977: Ron Rivest, Adi Shamir, and Leonard Adleman publicly describe the RSA cryptosystem, the first practicable public-key cryptosystem.
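
The scheme is simple enough to show end to end with toy numbers; this sketch uses the classic textbook example with tiny primes (real RSA keys are hundreds of digits long):

```python
def toy_rsa_demo() -> int:
    """A toy RSA round trip: encrypt with the public key (e, n),
    decrypt with the private exponent d. Illustration only."""
    p, q = 61, 53
    n = p * q                      # public modulus: 3233
    phi = (p - 1) * (q - 1)        # 3120
    e = 17                         # public exponent, coprime to phi
    d = pow(e, -1, phi)            # private exponent: inverse of e mod phi
    message = 65
    ciphertext = pow(message, e, n)
    return pow(ciphertext, d, n)   # decrypting recovers the message

print(toy_rsa_demo())  # 65
```

Note that `pow(e, -1, phi)` (modular inverse via the built-in pow) requires Python 3.8 or later.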

1977: Apple introduces the Apple II. Designed primarily by Steve Wozniak, it is one of the first highly successful mass-produced microcomputer products.

1977: Atari, Inc. releases the Atari 2600, a microprocessor-based hardware video game console with ROM cartridges containing game code.

1978: Brian Kernighan and Dennis Ritchie publish The C Programming Language ("K&R"), establishing a de facto standard for the C programming language; in 1989 the American National Standards Institute publishes a standard for C (generally called "ANSI C" or "C89").  Development of C was started by Dennis Ritchie and Ken Thompson in 1969 in conjunction with the development of the UNIX operating system.

1978: University of North Carolina at Chapel Hill and Duke University publicly establish Usenet.

1979: Dan Bricklin and Bob Frankston found Software Arts, Inc. and begin selling VisiCalc.

1980: Smalltalk (Smalltalk-80), a Xerox PARC project led by Alan Kay since the early 1970s, is released to the public.

c. 1980: Fujio Masuoka, while working for Toshiba, invents both NOR and NAND types of flash memory.

1981: IBM introduces the IBM PC.

1984: Apple Computer launches the Macintosh. Based on the Motorola 68000 microprocessor, it is the first commercially successful mouse-driven computer with a graphical user interface.

1984: Abraham Lempel, Jacob Ziv, and Terry Welch publish the Lempel–Ziv–Welch (LZW) compression algorithm.
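
LZW is compact enough to sketch in full: the compressor builds a dictionary of substrings on the fly, and the decompressor rebuilds the same dictionary while decoding (a standard implementation, with the variable names my own):

```python
def lzw_compress(text: str) -> list:
    """LZW: extend the current phrase while it is in the table,
    emitting its code and adding a new entry when it is not."""
    table = {chr(i): i for i in range(256)}
    phrase, out = "", []
    for ch in text:
        if phrase + ch in table:
            phrase += ch
        else:
            out.append(table[phrase])
            table[phrase + ch] = len(table)
            phrase = ch
    if phrase:
        out.append(table[phrase])
    return out

def lzw_decompress(codes: list) -> str:
    """Rebuild the same dictionary while decoding the code stream."""
    table = {i: chr(i) for i in range(256)}
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        # Tricky case: the code may refer to the entry being built.
        entry = table[code] if code in table else prev + prev[0]
        out.append(entry)
        table[len(table)] = prev + entry[0]
        prev = entry
    return "".join(out)

data = "TOBEORNOTTOBEORTOBEORNOT"
print(lzw_decompress(lzw_compress(data)) == data)  # True
```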

1985: Bjarne Stroustrup publishes The C++ Programming Language.

1989: Tim Berners-Lee proposes the "WorldWideWeb" project and builds the first Web Server and Web Client using HTTP and HTML all of which lays the foundation for what becomes the World Wide Web.

1991: Linus Torvalds releases Linux to several Usenet newsgroups.

1991: Eugenio Moggi publishes "Notions of Computation and Monads", which describes the general use of monads to structure programs.

1992: The first JPEG standard and the MPEG-1 standard are released.

1993: Marc Andreessen and Eric Bina release the Mosaic Web Browser.

1994: Jeff Bezos founds Amazon.com.

1994: Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, known as the "gang of four" publish Design Patterns: Elements of Reusable Object-Oriented Software.  Subsequently in 1999 are found guilty by The International Tribunal for the Prosecution of Crimes against Computer Science by a two thirds majority.

1995: NSFNET is decommissioned, removing the last restrictions on the use of the Internet to carry commercial traffic.

1996: Larry Page and Sergey Brin develop the PageRank algorithm and found Google two years later.
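
The idea behind PageRank can be sketched as a few lines of power iteration: a page's score is the chance a random surfer lands on it, following links with a damping probability (the tiny "web" below is hypothetical):

```python
def pagerank(links: dict, damping: float = 0.85, iters: int = 50) -> dict:
    """Power-iteration PageRank sketch: each page splits its rank
    evenly among its outlinks, plus a uniform teleport term."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            for target in outlinks:
                new[target] += damping * rank[page] / len(outlinks)
        rank = new
    return rank

# A tiny hypothetical web: everyone links to 'hub', so it ranks highest.
web = {"a": ["hub"], "b": ["hub"], "hub": ["a", "b"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))  # hub
```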

1997: Nullsoft releases Winamp.

1998: Diamond Multimedia releases the Rio PMP300, the first commercially successful portable consumer MP3 digital audio player.

1999: Napster is released as an unstructured centralized peer-to-peer system.

1999: Darcy DiNucci coins the term Web 2.0 to describe World Wide Web sites that use technology beyond the static pages of earlier Web sites, including social networking sites, blogs, wikis, folksonomies, video sharing sites, hosted services, Web applications, and mashups. The term is later popularized by Tim O'Reilly at the O'Reilly Media Web 2.0 conference in late 2004.

1999:  The Wi-Fi Alliance is formed as a trade association to hold the Wi-Fi trademark under which most products are sold.

2001: Jimmy Wales and Larry Sanger launch Wikipedia.

2004: Google acquires Sydney-based company Where 2 Technologies and transforms its technology into the web application Google Maps.

2006: Amazon launches Amazon Web Services.

2007: Apple releases the iPhone, setting the design of the modern smartphone.

2010: Apple releases the iPad, giving tablet computer technology, whose development started over 20 years earlier, its current character.

2010: Siri Inc. releases Siri on iOS; Apple subsequently acquires the company.

2012: The LHC Computing Grid becomes the world's largest computing grid, comprising over 170 computing facilities in a worldwide network across 36 countries.

2012: The Raspberry Pi Foundation ships the first batch of Raspberry Pis, a credit-card-sized single-board computer.

2013: The Univalent Foundations Program at the Institute for Advanced Study organizes 40 authors to release The HoTT Book on homotopy type theory.

If you made it this far you’ve survived my brief, relatively speaking, trip through five millennia of the human history of computing.  It was hard to pick more recent things since we don’t know which events will prove important for the future.  I chose to end with homotopy type theory since it is supposed to be the next big thing.  Also you might have noticed that my title is similar to, perhaps an homage to, or just a rip-off of "A Brief, Incomplete, and Mostly Wrong History of Programming Languages".  I had this idea prior to seeing that post, and for a long time it deterred me from writing this, but I finally got over the perception that he did it first since my original idea was quite different.

If you are still hungry for more including various references and influences on this post:

History of Computing

Columbia University Computing History

List of IEEE Milestones

History of Lossless Data Compression Algorithms

John von Neumann, EDVAC, and the IAS machine

Computer History Museum Timeline of Computer History

Wikipedia History of Computing

Wikipedia Timeline of Computing

PBS American Experience: Silicon Valley

PBS Nova’s Decoding Nazi Secrets

James Burke’s The Day the Universe Changed