10 April 2017

The Problem with Today’s Software Thought Leaders

Albert Einstein once said:  "What is right is not always popular and what is popular is not always right."   The current zeitgeist in software development has this exact same feel.  Our lives as software developers have been shaped by the ideas of many people.  Some of those ideas are right and some are popular.  In time the wrong ideas will fade and the right ideas will dominate.  The problem is: How do we make that determination in the present?  What are the good ideas and what are the bad ideas? 

As software developers our working days usually consist of two types of professional activities. One is obviously creating software (software development) and related technical tasks such as configuration management, troubleshooting, and deployments; the other is being managed, along with duties that relate to providing information (status) to managers (software development management). The software industry is very young and constantly changing. At any given moment there are dominant de facto standards that are followed for both software development and its management. Although it varies across industry sectors, many development practices and tools have become standards: IDEs, continuous integration, the use of tests and testing frameworks, and so on. Some of these have been influenced by Agile and XP. On the management side the standard seems to be largely dominated by Agile methodologies like Scrum and associated practices from XP.

The people who dominate our industry with their ideas could be described as our thought leaders. Wikipedia has the following definition for thought leader:

A thought leader can refer to an individual or firm that is recognized as an authority in a specialized field and whose expertise is sought and often rewarded.

The creation of Agile was done in a ceremonious fashion, depicted in an odd, ritualistic-looking, and now famous photograph, by signatories to a manifesto. Let's call these people the Agile Thought Leaders or just the Agilistas (since brevity is our thing). The Agilistas have come to have their ideas dominate the industry and in many cases dictate how we now spend our days.

It seems that Agile’s rise in popularity is directly proportional to my increasing skepticism about it.  I find myself internalizing that skepticism into frustration and even feelings of anger towards it and the Agilistas.  These feelings become exacerbated by the various incarnations of Agile created by legions of certified Agile consultants that roll into my life to tell me how to spend my days.

Now those feelings are not constructive, and one really needs to step back and ask: Why is what I see in the industry wrong, and how can I form a constructive criticism of it?

I was lucky to come across Greg Wilson's talk: "What We Actually Know About Software Development, and Why We Believe It's True". This is a very enlightening talk which I highly recommend. In it he looks objectively at how we have arrived where we are and how we can possibly better move forward. I got the term Agilistas from his talk, although I don't know if he coined it.

The talk touches on issues that I feel plague both my career and our industry. The problem manifests itself in how we make our decisions about what we do on projects, from the day to day up to the larger industry practices. Many of these decisions seem to be influenced all too often by people who claim some authority. Sometimes it's positional authority, sometimes it's due to some form of gravitas, and all too often, right or wrong, it boils down to someone's opinion. To better illustrate this about our thought leaders, let's look at Greg Wilson's critique of one of Martin Fowler's publications about DSLs (Domain Specific Languages). The relevant quotes with time stamps are as follows:

11:26 [Martin Fowler] is a very deep and careful thinker about software and software design and last summer in IEEE Software he had an article on Domain Specific Languages, this notion that you build a tiny language in which the solution to your original problem is easy to express. And in that he said using DSLs leads to improved programmer productivity and better communication with domain experts.

11:51 I want to show you what happened here. One of the smartest guys in our industry made two substantive claims of fact in an academic journal and there is not a single citation in the paper. Not a single footnote or reference. There is no data to back up this claim. I am not saying that Martin is wrong in his claims. What I am saying is that what we have here is a Scottish verdict. Courts in Scotland are allowed to return one of three verdicts: guilty, innocent, or not proven. Arguments in favor of DSLs, arguments in favor of functional programming languages making parallelism easier, or Agile development leading to tenfold return on investment or whatever, are unproven. It doesn't mean they are wrong, it doesn't mean they are right; it means that we don't know and we should be humble enough to admit that.

12:38 Carrying on in that same article: "Debate still continues about how valuable DSLs are in practice. I believe that debate is hampered because not enough people know how to develop DSLs effectively." Crap. I think the debate is hampered by low standards of proof.

12:55 Listening to computer scientists argue, listening to software engineers in industry argue, it seems that the accepted standard of proof is: I've had two beers and there's this anecdote about a tribe in New Guinea from one of Scott Berkun's books that seems to be vaguely applicable, therefore I'm right. Well no, sorry, that's not proof and you should have higher standards than that.
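As an aside, since the quotes above hinge on it: a DSL in the sense Fowler describes, a tiny language in which the solution to the original problem is easy to express, can be as small as a fluent API embedded in a host language. A hypothetical Python sketch (the Rule class and its methods are invented purely for illustration):

```python
# A toy "internal" DSL: a fluent validation-rule builder (hypothetical example).
class Rule:
    def __init__(self, field):
        self.field, self.checks = field, []

    def at_least(self, n):
        # bind n as a default argument so each lambda keeps its own bound
        self.checks.append(lambda v, n=n: v >= n)
        return self  # returning self is what enables the fluent chaining

    def at_most(self, n):
        self.checks.append(lambda v, n=n: v <= n)
        return self

    def ok(self, record):
        return all(check(record[self.field]) for check in self.checks)

# the "tiny language": reads close to the domain statement it encodes
age_rule = Rule("age").at_least(18).at_most(120)
print(age_rule.ok({"age": 42}))   # True
print(age_rule.ok({"age": 5}))    # False
```

This is an internal DSL hosted in Python; external DSLs with their own parsers are the other common variety. Whether either style actually improves productivity is, of course, exactly the unproven claim at issue.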

I think these are excellent points. I do disagree with Greg Wilson on one thing: I feel that Martin Fowler is not one of the smartest guys in our industry, although he may be one of the most influential. I also question whether he can be considered a careful thinker. I think his approach, like that of many of the other Agilistas, is self-serving and shows very little humility or willingness to be self-critical. This is something that Bertrand Meyer commented on in conjunction with his book Agile! The Good, the Hype and the Ugly: when he reached out to the Agilistas with his draft, almost all of them treated him with "radio silence". The problem is that the Agile business is just too lucrative in consulting and on the speaking circuit. A great example of this business model is "Uncle Bob" Martin's infomercial-esque site promoting his wisdom for sale. I refuse to link to it, so Google it if you want to see that spectacle. All of these guys make a lot of money selling Agile, and if they started being critical of it that would be a risk to their revenue streams.

Also, how many of these guys still do significant software development work? Ironically I think Uncle Bob still codes, but I am talking about opening up an IDE or text editor after checking out a big code base from version control and fixing bugs or doing actual coding or refactoring. Now maybe it's not fair to ask that, or is it? After all, they are driving how we spend our days being managed on software projects.

Another point of contention I have can be found on Martin Fowler's website under the FAQ section:

Why is your job title "Chief Scientist"?

It's a common title in the software world, often given to someone who has a public facing technical role. I enjoy the irony of it - after all I'm chief of nobody and don't do any science.

He openly admits he does no science. Therein lies the problem with all of the Agilistas: there seems to be no scientific method involved in many if not all of their claims. These claims, as we have seen, are sometimes made in "academic" journals without any citations. Of course I am skeptical about much that is written about software in journals published by the ACM or IEEE, and of the organizations themselves. I think Greg Wilson's observation validates my skepticism and further draws attention to the lack of scientific method in software. We need better science and people who are intellectually honest about the fact that they are just expressing unproven and in some cases unfounded opinions.

It seems to me a more conscientious approach would be for the Agilistas to promote and fund studies, using the scientific method to back up their claims and help move the industry forward. Whether they like it or not, that is what will most likely eventually happen, and they will either need to get on board or be left behind.

The irony here is that while we need more of a scientific method in defining software methodologies, much of our infrastructure comes from mathematics, another area that seems to be lacking with the Agilistas. For example, Noam Chomsky's formalization of grammars was adapted by John Backus and further refined algebraically by Donald Knuth to give us modern compiler theory and compilers. There's a long list of people who have laid down the mathematical foundations: Alan Turing, John von Neumann, Alonzo Church, Stephen Cole Kleene, Haskell Curry, and the list goes on. Unfortunately I think too many people in our industry are ignorant of our field's true mathematical underpinnings. I have also encountered a large degree of anti-intellectualism from programmers when it comes to applying modern math to programming. Again, I think this is an area that one ignores only at one's own peril.

If you want to see the work that should and probably will drive our future I would recommend looking at what the researchers are doing, people like:

These are just a few. There are many people doing research that will eventually find its way into our industry. It has been happening this way all along. The problem with these people's work is that it requires real intellectual effort, as opposed to the Agile approach which, while in some cases it may be common sense, all too often seems to come across as a bunch of platitudinous management speak.

I think we need to hold the Agilistas to higher standards of proof!

Disclaimer: I still code personal projects but have not been on a dev team doing real dev work in about 3-4 years after doing it for over 25 years.  Actually from what I have seen of Agile management techniques and war room/open office layouts, I am in no hurry to do any of that any time soon.

References and Further Reading

08 February 2017

Data Science DC: Deep Learning Past, Present, and Near Future

Here in Washington DC we are lucky to have a robust technical community, and as a result we have many great technical meetups, including a number that fall under the local data science community. I recently attended the Data Science DC (DSDC) talk "Deep Learning Past, Present, and Near Future" (video) (pdf slides) presented by Dr. John Kaufhold, a data scientist and managing partner of Deep Learning Analytics. This is the second DSDC post I have written; the first is here. Parts of this post will be cross-posted as a series on the DSDC blog (TBA); references to my previous blog posts, either here or there, refer to my blog elegantcoding.com. My previous DSDC post was pretty much just play-by-play coverage of the talk. This time I decided to take a slightly different approach. My goal here is to present some of the ideas and source material in John's talk while adding and fleshing out some additional details. This post should be viewed as "augmentation", as he talks about some things that I will not capture here.

I am guessing that some proportion of the audience that night was like me in feeling the need to learn more about deep learning, and machine learning in general. This is a pressure that seems to be omnipresent in our industry right now. I know some people who are aggressively pursuing machine learning and math courses. So I would like to add some additional resources, vetted by John, for people like myself who want to learn more about deep learning.

I decided to make this a single post with two parts, essentially paralleling John's presentation, although I will change the order a bit. The first part will summarize the past, present, and future of deep learning; I will be doing some editorializing of my own on the AI future. The second part will be an augmented list of resources, including concepts, papers, and books, in conjunction with what he provided for getting started and learning more in depth about deep learning.

I will give a brief history of deep learning based in part on John's slide, well actually Lukas Masuch's slide. I am a fan of history, especially James Burke Connections-style history, and it is hard for me to look at any recent history without thinking of the greater historical context. I have a broader computer history post that might be of interest to history of technology and history of science fans. Three earlier relevant events might be the development of the least squares method that led to statistical regression, Thomas Bayes's "An Essay towards solving a Problem in the Doctrine of Chances", and Andrey Markov's discovery of Markov chains. I will leave those in the broader context and focus on the relatively recent history of deep learning.

An Incomplete History of Deep Learning

1951: Marvin Minsky and Dean Edmonds build SNARC (Stochastic Neural Analog Reinforcement Calculator) a neural net machine that is able to learn. It is a randomly connected network of Hebb synapses.

1957: Frank Rosenblatt invents the Perceptron at the Cornell Aeronautical Laboratory with naval research funding. He publishes "The Perceptron: A Probabilistic Model for Information Storage and Organization in The Brain" in 1958. It is initially implemented in software for the IBM 704 and subsequently implemented in custom-built hardware as the "Mark 1 Perceptron".

1969: Marvin Minsky and Seymour Papert publish Perceptrons which showed that it was impossible for these classes of networks to learn an XOR function.

1970: Seppo Linnainmaa publishes the general method for automatic differentiation of discrete connected networks of nested differentiable functions. This corresponds to the modern version of back propagation which is efficient even when the networks are sparse.

1972: Stephen Grossberg publishes the first of a series of papers introducing networks capable of modeling differential, contrast-enhancing and XOR functions: "Contour enhancement, short-term memory, and constancies in reverberating neural networks (pdf)".

1973: Stuart Dreyfus uses backpropagation to adapt parameters of controllers in proportion to error gradients.

1974: Paul Werbos mentions the possibility of applying back propagation to artificial neural networks. Back propagation had initially been developed in the context of control theory by Henry J. Kelley and Arthur E. Bryson in the early 1960s. Around the same time Stuart Dreyfus published a simpler derivation based only on the chain rule.

1974–80: First AI winter which may have been in part caused by the often mis-cited 1969 Perceptrons by Minsky and Papert.

1980: The Neocognitron, a hierarchical, multilayered artificial neural network, is proposed by Kunihiko Fukushima. It has been used for handwritten character recognition and other pattern recognition tasks and served as the inspiration for convolutional neural networks.

1982: John Hopfield popularizes Hopfield networks, a type of recurrent neural network that can serve as content-addressable memory systems.

1982: Stuart Dreyfus applies Linnainmaa's automatic differentiation method to neural networks in the way that is widely used today.

1986: David E. Rumelhart, Geoffrey E. Hinton and Ronald J. Williams publish "Learning representations by back-propagating errors (pdf)". This shows through computer experiments that this method can generate useful internal representations of incoming data in hidden layers of neural networks.

1987: Minsky and Papert publish "Perceptrons - Expanded Edition" where some errors in the original text are shown and corrected.

1987–93: Second AI winter occurs. Caused in part by the collapse of the Lisp machine market and the fall of the expert system. Additionally training times for deep neural networks are too long making them impractical for real world applications.

1993: Eric A. Wan wins an international pattern recognition contest. This is the first time backpropagation is used to win such a contest.

1995: Corinna Cortes and Vladimir Vapnik publish the current standard (soft margin) incarnation of the support vector machine in "Support-Vector Networks". The original SVM algorithm was invented by Vladimir N. Vapnik and Alexey Ya. Chervonenkis in 1963. SVMs take on a dominant role in AI.

1997: Sepp Hochreiter and Jürgen Schmidhuber invent Long-short term memory (LSTM) recurrent neural networks greatly improving the efficiency and practicality of recurrent neural networks.

1998: A team led by Yann LeCun releases the MNIST database, a dataset comprising a mix of handwritten digits from American Census Bureau employees and American high school students. The MNIST database has since become a benchmark for evaluating handwriting recognition. LeCun, Bottou, Bengio, and Haffner publish "Gradient-Based Learning Applied to Document Recognition (pdf)".

1999: NVIDIA Invents the GPU

2006: Geoffrey Hinton and Ruslan Salakhutdinov publish "Reducing the Dimensionality of Data with Neural Networks (pdf)". This is an unsupervised learning breakthrough that now allows for the training of deeper networks.

2007: NVIDIA launches the CUDA programming platform. This opens up the general purpose parallel processing capabilities of the GPU.

2009: NIPS Workshop on Deep Learning for Speech Recognition discovers that with a large enough data set, the neural networks don't need pre-training, and the error rates drop significantly.

2012: Artificial pattern-recognition algorithms achieve human-level performance on certain tasks. This is demonstrated by Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton using the ImageNet dataset and published as "ImageNet Classification with Deep Convolutional Neural Networks (pdf)".

2012: Google's deep learning algorithm discovers cats. AI now threatens the cat cabal control of the internet.

2015: Facebook puts deep learning technology - called DeepFace - into operation to automatically tag and identify Facebook users in photographs. Algorithms perform superior face recognition tasks using deep networks that take into account 120 million parameters.
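To connect a few of these milestones: the XOR function that Minsky and Papert showed a single-layer perceptron cannot learn becomes learnable by a two-layer network trained with backpropagation, that is, repeated application of the chain rule. A minimal sketch, assuming only NumPy; the layer sizes, learning rate, and iteration count here are arbitrary illustrative choices, not anything from the talk:

```python
import numpy as np

# A tiny two-layer network trained with backpropagation to learn XOR,
# the function a single-layer perceptron provably cannot represent.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # hidden layer
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # output layer
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: the chain rule applied layer by layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient descent updates
    W2 -= 0.5 * (h.T @ d_out); b2 -= 0.5 * d_out.sum(0)
    W1 -= 0.5 * (X.T @ d_h);   b1 -= 0.5 * d_h.sum(0)

print(np.round(out.ravel()))   # the four XOR outputs, learned
```

Nothing here is deep by modern standards, but it is exactly the mechanism that Linnainmaa's automatic differentiation and the 1986 Rumelhart, Hinton, and Williams paper made practical.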

A Brief and Incomplete History of Game Playing Programs and AIs

1951: Alan Turing is the first to publish a program capable of playing a full game of chess, developed on paper during a whiteboard coding interview.

1952: Arthur Samuel joins IBM's Poughkeepsie Laboratory and begins working on some of the very first machine learning programs, first creating programs that play checkers.

1992: Gerald Tesauro at IBM's Thomas J. Watson Research Center develops TD-Gammon, a computer backgammon program. Its name comes from the fact that it is an artificial neural net trained by a form of temporal-difference learning.

1997: IBM's Deep Blue beats Garry Kasparov, the world champion at chess.

2011: IBM's Watson, using a combination of machine learning, natural language processing and information retrieval techniques, beats two human champions on the TV game show Jeopardy! Later versions of Watson successfully banter with Alex Trebek, thus further surpassing human capabilities.

2013: DeepMind learns to play 7 Atari games, surpassing humans on three of them, using Deep Reinforcement Learning and no adjustments to the architecture or algorithm. It eventually learns additional games taking the number to 46.

2016: AlphaGo, Google DeepMind's algorithm, defeats the professional Go player Lee Sedol.

2017: DeepStack becomes the first computer program to beat professional poker players in heads-up no-limit Texas hold'em.

The Present and Future

John also included the idea of Geoffrey Hinton, Yann LeCun, and Yoshua Bengio being the triumvirate that weathered the AI winter, a theme that he used in his first DSDC talk as well. Since it was one of Homer's vocabulary words, I couldn't resist creating my own graphic for it:

While the triumvirate researchers are definitely major players in the field, there are many others, and there seems to be some drama about giving credit where credit is due. John goes into some of this drama. Unfortunately it seems to be human nature that we sometimes have drama surrounding scientific discoveries, and the deep learning community today is no exception. Geoffrey Hinton also catches some flak on reddit (ok, no surprise there) for not properly crediting Seppo Linnainmaa for backpropagation. It is natural to want credit for one's ideas; however, these ideas are never conceived in a vacuum and sometimes they are discovered independently, calculus springs to mind. As they say, history is written by the winners. John made a remark about Rosalind Franklin's Nobel Prize for the double helix. I asked him how many people in the audience he thought got that remark and he replied: "One, apparently :) I'll take it!" Lise Meitner is another interesting example of someone who failed to be recognized by the Nobel committee, in her case for her work on nuclear fission.

A major name in the field is Jeff Dean. John mentioned Jeff Dean facts, apparently based on the Chuck Norris facts. Jeff Dean is somewhat of a legend, and it is hard to tell what is real and what is made up about him. Do compilers really apologize to him? We may never know. As I understand it he was one of the crucial early engineers who helped create the success of Google. He was instrumental in the conception and development of both MapReduce and Bigtable, not only solidifying Google's successful infrastructure but also laying down much of the foundation of major big data tools like Hadoop and Cassandra. There is some serendipitous timing here in that Jeff Dean published "The Google Brain team - Looking Back on 2016" on the Google Research Blog a few days after John's talk. His post lays out a number of areas where deep learning has made significant contributions, as well as other updates from their team.

The present and near future for deep learning is very bright. Many deep learning companies have been acquired, and many of these are small, averaging about 7 people. John describes this as a startup stampede. Most likely a lot of these are acqui-hires, since the labor pool for deep learning is very small. Apparently the top talent is receiving NFL-equivalent compensation.

It's fairly obvious that deep learning and machine learning are the current hotness in our industry. One metric might be that at any given moment you can go on Hacker News and will most likely see a deep learning article, if not a machine learning article, on the front page. As if this were not enough, Kristen Stewart, famous for the Twilight movies, has now coauthored an AI paper: "Bringing Impressionism to Life with Neural Style Transfer in Come Swim". I am now waiting for an AI paper from Jesse Eisenberg.

As for perspectives on the future of machine learning, John recommends chapters 2, 8, and 9 of the free O'Reilly book The Future of Machine Intelligence. He also points out that deep learning does not necessarily mean neural nets: "deep" refers to the number of operations between the input and the output. This means that machine learning methods other than neural nets can have a deep architecture; it is the depth itself that learns the complex relationships. For this he cites "Learning Feature Representations with K-means (pdf)" by Adam Coates and Andrew Y. Ng.
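That point can be illustrated without neural nets at all: stacking k-means feature extractors gives a deep architecture in the spirit of the Coates and Ng paper. The sketch below is a simplified, hypothetical version; the "triangle" activation loosely follows their paper, and the data is just random noise to show the shapes flowing through two learned layers:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    # plain Lloyd's algorithm: returns k centroids
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - C) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                C[j] = X[labels == j].mean(0)
    return C

def features(X, C):
    # "triangle" activation: how much closer each point is to a centroid
    # than to the average centroid (negative values clipped to zero)
    d = np.sqrt(((X[:, None] - C) ** 2).sum(-1))
    return np.maximum(0, d.mean(1, keepdims=True) - d)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))

C1 = kmeans(X, 8)        # layer 1: learn features from the raw data
H1 = features(X, C1)
C2 = kmeans(H1, 4)       # layer 2: learn features of the layer-1 features
H2 = features(H1, C2)
print(H1.shape, H2.shape)
```

Each layer is a learned transformation of the previous layer's output, so the architecture is "deep" in exactly the sense John describes, even though no neuron appears anywhere.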

Christopher Olah in his post: "Neural Networks, Types, and Functional Programming" talks about how deep learning is a very young field and that like other young fields in the past things are developed in an ad hoc manner with more of the understanding and formalisms being discovered later. As he points out we will probably have a very different view of what we are doing in 30 years. Although that last point is a bit obvious, it still helps to put things in context. So it is probably hard to tell what will be the most fruitful AI algorithms even ten years from now.

At the end of his talk John touches on a bigger issue that is becoming more ubiquitous in conversations about deep learning and AI: What are the larger social implications for society? So I thought I would contemplate some possibilities.

The first and perhaps most imminent and frightening concern is the loss of jobs due to automation. Many people argue that this is the same fear that was raised during the first two industrial revolutions, which occurred in the late 18th to early 19th century and the late 19th to early 20th century. Of course how you classify industrial revolutions can vary, and can easily be off by one if you include the Roman innovation of water power. These industrial revolutions displaced certain skilled jobs by mechanizing work and created new, perhaps less skilled factory jobs. The third industrial revolution, in the mid to late 20th century, can be thought of as the rise of more factory automation and the development of the microprocessor, which can be described in simple terms as a general electronic controller, though of course it is far more than that. This continued automation led to things like specialized robots for manufacturing, which meant still fewer jobs for people. Now we are on the cusp of what is being called the fourth industrial revolution, with generalized AI algorithms and multipurpose robotics.

I have often heard the argument that this has all happened before. John brings up a valid point here: when it happened before, there was a lot more time for society to recover, and new jobs were created. There was an article on NPR identifying truck driver as one of the most common jobs in America. There is some dispute of this, in that the most common jobs are retail jobs. Regardless of which is the most common job, there are by some estimates 3.5 million truck driving jobs in the US. Self-driving vehicles are very possibly less than five years away, which means those jobs, not to mention taxi jobs, will be lost. How long before you click purchase on an ecommerce website like Amazon and no humans are involved in that purchase arriving at your house? That might be on about the same timeframe. This is an unprecedented level of automation, and it affects many other professions; for example, some paralegal and radiology jobs are potentially now under threat.

There is of course an upside. If AIs become better at medical jobs than humans, that could be a good thing. They won't be prone to fatigue or human error. In theory people could be liberated from having to perform many tedious jobs as well. AIs could usher in new scientific discoveries. Perhaps they could even make many aspects of our society more efficient and more equal. In that case I for one would welcome our AI overlords.

It is hard to contemplate the future of society without thinking of the work of Orwell and Huxley; this comic sums up some of those concerns and raises the question of how much of this we currently see in our society. From the Orwellian perspective, programs like PRISM and the NSA's Utah Data Center allow for the collection and storage of our data. AIs could comb through this data in ways humans never could, creating the potential for a powerful surveillance state. Of course, it is how this power is used or abused that will determine whether we go down the trajectory of the Orwellian dystopia. The other scary aspect of AIs is the idea of automated killer drones, which is of course the Skynet dystopia. I think we should also fear the Elysium dystopia, with both aerial and Elysium-style enforcement drones.

Maybe we go down the other path, with a universal income that frees us from having to perform daily labor to earn our right to live. How do we then fill our days? Do we all achieve whatever artistic or intellectual pursuits we are now free to explore? What happens if those pursuits can also be done better by the AIs? Is there a point in painting a picture or writing and singing a song if you will just be outcompeted by an AI? Some of this is already possible. ML algorithms have written reasonable songs, created artistic images, and written newspaper articles that are mostly indistinguishable from the work of human writers. Even math and science may fall under their superior abilities; for instance, Shinichi Mochizuki's ABC conjecture proof seems to be incomprehensible to other humans. What if math and science become too big and complex for humans to advance, and they become the domain of the AIs? So in this case do we fall into Huxley's dystopia? Actually many people would probably find themselves, as some do now, exclusively pursuing drugs, alcohol, carnal pleasures, porn, video games, TV, movies, celebrity gossip, etc. So do we all fall into this in spite of our aspirations? The AIs will most likely be able to create all of our entertainment desires, including the future synthetic Kim Kardashian style celebs.

Ok, so that's enough gloom and doom. I can't really say where it will go. Maybe we'll end up with AI hardware wired directly to our brains and thus be able to take our human potential to new levels. Of course, why fight it? If you can't beat them, join them, and the best way to do that is to learn more about deep learning.

Deep Learning 101/Getting Started

Online Learning Resources

John mentions a few resources for getting started. He mentions Christopher Olah's blog, describing him as the Strunk and White of deep learning for his clear communication using a visual language. I have had his "Neural Networks, Manifolds, and Topology" post on my todo list and I agree that he has a bunch of excellent posts. I admit to being a fan of his visual style from his old blog, as my post on Pascal's Triangle was partially inspired by his work. His newer work seems to be posted on http://distill.pub/. John also mentions Andrej Karpathy's blog, as well as the comprehensive course on convolutional neural nets that he teaches at Stanford, CS231n: Convolutional Neural Networks for Visual Recognition, which can additionally be found here.

Additionally, John did a write-up, "Where are the Deep Learning Courses?", after his first DSDC talk.

If you are local, we also have the DC-Deep-Learning-Working-Group, which is currently working through Richard Socher's Stanford NLP course.

Some Additional Resources

I wanted to include some additional resources, so I came up with a list and asked John's opinion on it. These are the ones he endorses, with a particularly high recommendation for Richard Socher's NLP tutorial.

Neural Networks and Deep Learning By Michael Nielsen http://neuralnetworksanddeeplearning.com/

Deep Learning by Ian Goodfellow and Yoshua Bengio and Aaron Courville http://www.deeplearningbook.org/

UDACITY : Deep Learning by Google

ACL 2012 + NAACL 2013 Tutorial: Deep Learning for NLP (without Magic), Richard Socher, Chris Manning and Yoshua Bengio

CS224d: Deep Learning for Natural Language Processing, Richard Socher

These are the additional resources that looked interesting to me:

Deep Learning in a Nutshell by Nikhil Buduma

Deep Learning Video Lectures by Ruslan Salakhutdinov, Department of Statistical Sciences, University of Toronto:

Lecture 1

Lecture 2

Nuts and Bolts of Applying Deep Learning (Andrew Ng)(video)

Core Concepts

I wanted to list out some core concepts that one might consider learning. I reached out to John with a list of concepts. He replied with some that he thought were important and pointed out that such a list can grow rapidly and it is better to take a more empirical approach and learn as you go by doing. I think a brief list might be helpful to people new to machine learning and deep learning.

Some basic ML concepts might include:

These are some that are more specific to Neural Nets:

Some additional areas that might be of interest are:

Additionally, a couple of articles that lay out some core concepts are The Evolution and Core Concepts of Deep Learning & Neural Networks and Deep Learning in a Nutshell: Core Concepts.

Git Clone, Hack, and Repeat

John mentions the approach of Git Clone, Hack, and Repeat. I asked him if there were any specific GitHub repos he would recommend. I am just using his reply here, with some of the links and descriptions filled in:

This is really a choose-your-own-adventure, but when I look in my history for recent git clone references, I see:

Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation. github

Efficient, reusable RNNs and LSTMs for torch

Torch interface to HDF5 library

Computation using data flow graphs for scalable machine learning

This repository contains an IPython Notebook with sample code, complementing the Google Research blog post about Neural Network art.

Neural Face uses Deep Convolutional Generative Adversarial Networks (DCGAN)

A tensorflow implementation of "Deep Convolutional Generative Adversarial Networks"

Among others--playing with OpenAI's universe and pix2pix both seem full of possibilities, too.

Image-to-image translation using conditional adversarial nets

Openai Universe

To learn, I'm a big advocate of finding a big enough dataset you like and then trying to find interesting patterns in it--so the best hacks will usually end up combining elements of multiple repos in new ways. A bit of prosaic guidance I seem to find myself repeating a lot: always try to first reproduce what the author did; then start hacking. No sense practicing machine learning art when there's some dependency, sign error, or numerical overflow or underflow that needs fixing first. And bugs are everywhere.
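John's warning about sign errors and numerical overflow or underflow is worth internalizing early. As my own illustration (not from his talk), the classic example is a naive softmax, which overflows for large inputs; subtracting the maximum first gives the mathematically identical but numerically stable version you will find in most deep learning codebases:

```python
import numpy as np

def softmax_naive(x):
    # np.exp(1000) overflows to inf, so inf/inf yields nan.
    e = np.exp(x)
    return e / e.sum()

def softmax_stable(x):
    # Subtracting the max leaves the result unchanged mathematically
    # but keeps every exponent <= 0, avoiding overflow.
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([1000.0, 1001.0, 1002.0])
print(softmax_naive(x))   # all nan, with overflow warnings
print(softmax_stable(x))  # well-behaved probabilities summing to 1
```

Bugs like this are exactly why reproducing the author's results first matters: a nan creeping out of one layer looks like a modeling problem but is often just arithmetic.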

Data Sets

I asked John to recommend some data sets for getting into deep learning; he recommended the following:




Andrej Karpathy's blog post "The Unreasonable Effectiveness of Recurrent Neural Networks" provides some examples using a Shakespeare corpus, Wikipedia data, and Linux kernel code that illustrate a lot of really interesting properties of RNNs.
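One nice thing about these character-level datasets is that the preprocessing is tiny. As a sketch of the first step in any char-rnn-style experiment (my own illustration; Karpathy's actual code differs), you just map characters to integer ids and shift the sequence by one to get next-character targets:

```python
# Minimal character-level dataset prep for a char-rnn style model.
text = "to be or not to be"

# Build a character vocabulary and lookup tables in both directions.
chars = sorted(set(text))
char_to_ix = {c: i for i, c in enumerate(chars)}
ix_to_char = {i: c for i, c in enumerate(chars)}

# The model learns to predict the next character, so inputs and
# targets are the same sequence shifted by one position.
inputs = [char_to_ix[c] for c in text[:-1]]
targets = [char_to_ix[c] for c in text[1:]]

print(len(chars), "distinct characters")  # → 7 distinct characters
```

Swap `text` for the Shakespeare corpus or the Linux kernel source and the same three lines of vocabulary-building still apply, which is part of why these datasets are such good starting points.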

Deep Learning Frameworks

He also mentions a number of deep learning frameworks to look at for getting into deep learning; the top three are TensorFlow, Caffe, and Keras.

As I was writing this I saw the announcement "Google Tensorflow chooses Keras". This makes Keras the first high-level library added to core TensorFlow at Google, which will effectively make it TensorFlow's default API.
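Part of Keras's appeal is how little code a working model takes. A minimal sketch (layer sizes and dataset are arbitrary here, and the exact import paths can vary by Keras/TensorFlow version):

```python
# A small fully-connected classifier in Keras, e.g. for 28x28 images
# flattened to 784 features and 10 output classes.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(32, activation="relu", input_shape=(784,)),  # hidden layer
    Dense(10, activation="softmax"),                   # class probabilities
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()
```

With Keras folded into core TensorFlow, this same high-level style becomes the default way to define TensorFlow models rather than a third-party add-on.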

Building Deep Learning Systems

One area of interest to me is how to build better software. In traditional software development, tests are used to help verify a system: you can usually expect known inputs to yield known outputs. This is something software developers strive for, and it is reinforced by things like referential transparency and immutability. But how do you handle systems that behave probabilistically instead of deterministically? As I was researching how to successfully test and deploy ML systems, I found these two resources: "What's your ML test score? A rubric for ML production systems" and "Rules of Machine Learning: Best Practices for ML Engineering" (pdf). I asked John his thoughts on this and he recommended "Machine Learning: The High-Interest Credit Card of Technical Debt" as well as "Infrastructure for Deep Learning", although he said the latter is more of a metaphor for us normal humans who don't have Altman/Musk-funded budgets. He also pointed out that building these types of systems still falls under the standard best-practices approach to any software, and maintaining high software development standards is just as helpful here as in any project.
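One practical tactic (my own sketch, not taken from the papers above) is to pin every source of randomness so that a probabilistic pipeline becomes deterministic for the duration of a test, recovering the known-inputs-to-known-outputs property that ordinary unit tests rely on:

```python
import numpy as np

def train_tiny_model(X, y, seed=0):
    # A stand-in "training" routine with random initialization.
    # Threading an explicit seed through makes the otherwise
    # stochastic result bit-for-bit reproducible.
    rng = np.random.RandomState(seed)
    w = rng.randn(X.shape[1])
    # One crude gradient-style correction step toward fitting y.
    w += X.T @ (y - X @ w) / len(y)
    return w

X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([2.0, 3.0])

# Same seed -> identical weights; a unit test can assert on this.
w1 = train_tiny_model(X, y, seed=42)
w2 = train_tiny_model(X, y, seed=42)
assert np.array_equal(w1, w2)
```

This does not test statistical quality, which is what the ML test-score rubric addresses, but it does make regressions in the surrounding plumbing catchable with ordinary assertions.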

Final Thoughts

My contemplation of the future is pretty pessimistic from the societal perspective, but from the AI perspective it is pretty optimistic about what can be done. News articles about AI achievements like "Deep learning algorithm does as well as dermatologists in identifying skin cancer" are constantly appearing. But then you see things like "Medical Equipment Crashes During Heart Procedure Because of Antivirus Scan". Although this is not related to deep learning, it touches on a well-known issue: our lives are effectively run by software, and we still don't really know how to truly engineer it. From what I understand, we don't know exactly how some of these deep learning algorithms work, which makes the future of our software and AI seem even more uncertain. Hopefully we will overcome that.

Lastly, I wanted to thank John for being very generous with his time in helping me with additional resources for this post.

References and Further Reading

Some Leading Deep Learning Researchers' Links

26 September 2016

Software Development is a Risky Business

Software projects are inherently risky.  Statistics suggest that roughly fifty to seventy percent of software projects end up classified as "challenged", where challenged can mean cost overruns and/or delivery delays, or complete failure and cancellation of the project.  Some statistics indicate that as the size of a project increases, so does the probability of problems and failure.  Many articles and blog posts have been written about the long history of software project failure.  At the time of writing this post, the most recent and very public software project debacle was the healthcare.gov project.  It seems that as IT becomes more pervasive and essential, the problems with projects may be getting worse, not better.

In spite of these numbers, and all too often in my experience, individuals and organizations seem to have a very optimistic outlook about their own projects, especially at the beginning.  Perhaps it's the classic "that won't happen to me" attitude.  Even as projects start to run into problems, I have seen people, especially managers, not take issues seriously or ignore them altogether.  In some of the analysis of healthcare.gov, this was cited as one of the problems.

Risk management and risk assessment/analysis are well-developed areas in project management and engineering, and are often discussed in software projects.  Most risks fall into two categories.  One is risk to the end product itself, such as unforeseen security vulnerabilities or other engineering issues that lead to flaws or critical failures of the system or its components.  The other is risk in the management and execution of constructing the product or system.

Risk management is one of those areas where the engineering aspects and the engineering management aspects become tightly coupled.  An example can be derived from the XKCD comic "Tasks": https://xkcd.com/1425/.  The point of the comic is to demonstrate the difference between an easy problem and a hard one, and presumably the conversation takes place between a manager and a developer.  In a risk setting, perhaps solving the hard problem is the goal of the project.  One risk is that the research team fails to solve the problem in five years, or that it takes ten.  So the task itself is high risk, but there are also team-related risks: maybe the budget is too low and the team is subpar due to low salaries, so they can never solve it at all.  Another possible scenario is that the conversation never happened, and the project management and team think the problem is solvable in six months.  This second scenario is the risk I have seen throughout most of my career: the project management and team are overly optimistic or don't really have a good understanding of the complexity of the task at hand.  In some cases these issues seem to be due in part to pluralistic ignorance, often where developers don't want to contradict management.  There's a great dramatization of this behavior in the show Silicon Valley when the Hooli developers are working on the Nucleus project: the information is conveyed up the management chain, but nobody wants to tell Gavin the truth.

Sadly, the risks that do get identified are almost peripheral to the real risks on projects.  They are usually things like "the hardware delivery will be delayed, which will delay setting up our dev/test/prod environments and delay the start of development."  While these are legitimate risks, there is always the risk of all the unknowns that will cause delays and cost overruns; even things like developer turnover never seem to be taken into account, at least in my experience.  There also seems to be no cognizance of long-term cost risks, such as a project that was poorly managed, rushed, or done on the cheap, leaving a delivered codebase that is an unmaintainable big ball of mud and will incur much higher long-term costs.

In order for software engineering to grow up and become a real engineering discipline, it will need formal risk assessment and risk management methodologies, both on the project management side and on the work product side.  This is also an area where ideas and methodologies can potentially be borrowed from existing work in other engineering disciplines, and they will most likely also draw from probability and game theory.

References and Further Reading

31 May 2016

My Ant Symbiosis Stories

This post is a little off topic from my normal posts.  Although, since I consider my blog to be about programming, software engineering, math, and computer science, perhaps a brief foray into the natural sciences is warranted: it is science, I am a science nerd, and of course it is my blog.  The book I am referencing for this post is The Ants by E.O. Wilson and Bert Hölldobler.  It has many great examples of data visualization.  There is also a CS-relevant area of study in ant colony algorithms.  Additionally, I believe that the study of sociobiology is basically applied game theory, which again is a topic that comes up in CS.

The natural world is fascinating, and it teems with interesting life-and-death stories that play out every minute of every day in the theatre of nature.  Some of the most fascinating plots involve the various types of symbiosis that occur between species.  Mutualistic symbiosis is a cooperative relationship between two species that sometimes includes an exchange of "survival favors".  A fairly famous example of this type of interdependence is the story of why Brazil nut trees produce very poorly in plantations.  It turns out the Brazil nut tree, Bertholletia excelsa, is dependent on a pollinator insect, an orchid bee of the genus Euglossa, which is attracted by an orchid, Coryanthes vasquezii, that does not grow on the tree but is present in undisturbed forest habitat.  Finally, the tough outer shell needs large rodents like the agouti to gnaw it open; much like squirrels, they consume some of the seeds and bury others for later, which allows some of the forgotten seeds to germinate.  There are many examples of symbiotic relationships in rainforests and other habitats.

Ants are notorious for having symbiotic relationships with plants and other insects, and many such cases are well documented.  One is the Neotropical Cecropia tree, which provides hollow stems for Azteca ants to inhabit and even food in the form of glycogen-rich Müllerian bodies.  There was another story about a relatively recent discovery of a plant growing on the harsh cliffs of the Pyrenees that depends on three species of ants for pollination and seed dispersal.  These stories are not uncommon in the natural world; however, one usually reads of them playing out in faraway, often exotic and/or tropical locations, so I was rather surprised to discover this type of drama transpiring literally inches from my living room window in suburban Arlington, Virginia.

The story I am going to tell happened during the summer of 2011, and I had originally labored under the assumption that it was a somewhat unique, undocumented story.  However, my subsequent research has turned up a number of pictures depicting many of the events I will describe and show here, often with better photographic quality than what I captured.  Not that my photos are bad, but they lack the fidelity of others I have seen.  I make up for this in quantity, capturing various events over an extended period of time.

One of my hobbies is identifying eastern forest native plants and experimenting with growing them in my yard.  This activity, probably to the chagrin of my mcmansioned neighbors, has turned my yard into a small native plant and wildlife habitat.  A few years ago I collected some seeds of a plant known as Yellow Leafcup, Bearpaw, or Bearsfoot, which I believe goes by the Latin name Smallanthus uvedalia, formerly Polymnia uvedalia.  It is in the aster family (Asteraceae).  I planted some in pots and scattered some seeds in my yard; the potted ones germinated but did not make it, but some took root right in front of my living room window.  To me this plant really stands out: for a temperate-zone plant it has very large leaves, 12 to 18 inches long, and as a result it almost completely obscures my living room window, shrouding it in giant green-veined leaves as you can see in the picture at the top of this post.  The location of these leaves put me in an unexpected and unique position to watch an incredible and probably ancient symbiotic drama of nature play out between ants (Hymenoptera) and Hemiptera.

In June 2011, as the stand of Large Flowered Leafcup reached about eight or ten feet in height and completely blocked my living-room window, providing a nice screen of green privacy, an odd looking insect, pictured below, showed up.  As you can see, it is shaped like a thorn, which has garnered its ilk the general common name of thorn bug; it is also referred to as a treehopper, a group of insects that falls under the order Hemiptera.  I believe it is the species Entylia carinata, which does not have a common name.  I think it needs one, and since in that picture I see a rhinoceros-looking insect, I am coining it the rhinoceros thorn bug, or just rhino thorn bug for short.  If you look closely at the picture below you can see its eyes and legs, which are hard to see in many of the photos; my guess is this is by design, as thorns don't have eyes and legs.

Shortly after its arrival the ants showed up to tend to it:

I contacted a biologist who specializes in ants, and he identified them as a species of carpenter ant in the genus Camponotus.  He told me the possible species for this area are pennsylvanicus, nearcticus, and subbarbatus.

Over the following months I watched an interesting drama play out between these two species.  Owning a copy of The Ants gave some perspective on their traits and behaviors.  Over time I captured over one thousand pictures of what played out.  This collection is really a dataset of photos, and it revealed some interesting observations which, while not novel, still make for an interesting story.

Rhino Thorn Bugs: Masters of Camouflage

First, let's start with how the Rhino Thorn Bug disguises itself.  A number of the photos were taken in different lighting conditions.  In the above photos the light is more direct, and the yellow and green "veins" of both the plant and the insect mesh, so it looks like a thorn on the stem.  In the following lighting, the Rhino Thorn Bug appears to have purplish veins that blend with the background of plant veins.

Another outcome of the Rhino Thorn Bug living on Smallanthus uvedalia is the damage to the leaves, which also makes for a nice abstract photo:

I'm making a leap here, but it seems that this damage extends the camouflage range of the rhino thorn bug: it can hang out on the leaf and look like the damaged parts, which more closely match its brownish color and pseudo-veins in backlit conditions:

I also noticed that it seemed to stay in one place and was able to feed by making multiple bites into the thick main leaf vein.  I suspect the major vein is a bundle of veins and it was tapping into an individual sub-vein each time.  This is visible in a subsequent photo.

Protection and Feeding

Ant symbiosis with other insects takes a number of forms.  A common one is known as trophobiosis, where ants feed on what is known as honeydew, a mixture of sugars that the partner insect excretes from its posterior.  The provider of this mixture often breaks down more complex polysaccharides into simpler sugars the ants can digest, including glucose, fructose, etc.  The not-so-great image below shows this: an ant feeding from the posterior of a rhino thorn bug:

This relationship is not unfamiliar to humans; it mirrors our use of livestock, especially dairy cows, although we get the fluid we consume from a different organ.  The behavior of protecting, caring for, and in some cases herding these insects is exactly what we do with our livestock.  The ants have been doing this for millions of years prior to our existence, as the ant-Hemiptera symbiosis is thought to have originated in the Eocene.  There is evidence of this, shown below, from about 15 million years ago, found in Dominican amber: a queen of the extinct species Acropyga glaesaria on her mating flight, carrying a "seed" mealybug in her mandibles for use when she establishes her new colony.

Another ant behavior I believe I caught is ants transferring food to each other, which is known as trophallaxis.  In the image below you can see the two ants at the top with their mandibles interlocked; in a previous picture one of them was feeding, and it now seems to be transferring food to the other.


Ants have different castes of workers, and the image below shows two that I observed.  The one at the top is a major worker.  Ants of this caste are sometimes referred to as soldiers, as in some species they take on roles like acting as "doormen" at colony entrances to ensure that only ants of that colony are allowed in.  It is easy to see that the larger body with the large head and mandibles makes them more formidable, although I don't know whether the soldier distinction truly applies in this case.  The other two are minor workers, which have smaller bodies and heads.

Rhino Thorn Bugs, the Next Generation

As the story unfolded it got even more interesting.  I recall, although I don't have photos of it, that the ants seemed to escort the thorn bugs to meet each other for mating.  The evidence of mating was soon obvious, as can be seen in the photo below, with the ants in attendance of a brood of recently hatched nymphs.  The ants tended to the nymphs, which included chasing them down as they wandered off; some of the nymphs strayed anyway, and many of them died.  Also visible in this photo are the multiple bites left by what was presumably the female rhino thorn bug over her extended stay in one location.

Pretty soon the nymphs are providing the same trophobiosis as their parents as shown below:

As the nymphs grow up you can see that they start to take on their adult shape:

The Next Year, New Ants and Aphids

I was curious to see if the thorn bugs would return the following year.  They did not.  However, I was surprised to see a whole new ant drama play out with new actors.  I reached out to the ant biologist who helped me the previous year, and he identified these ants as Tapinoma sessile.  I was not able to identify the aphids; I was told they would be plant specific.

Their activity was concentrated around the flowers and spanned the time they were in bloom.  The "infestation" of the flower heads can be seen in the following photo:

Another difference was the ratio of ants to honeydew producers: as you can see, there are many aphids per ant, in contrast to the rhino thorn bugs, which often had multiple ants attending each one.  The biologist I contacted said the ants looked fat and happy; their abdomens, or gasters in ant-biology speak, seemed to be bursting and almost glistening, presumably with honeydew.  I once saw David Attenborough eat a honeypot ant, and it got me thinking I should have tried it with these guys.  Well, that was until I learned that their common names include odorous house ant and stink ant.

The photo below shows how the flower’s sepal seems to serve as a protected staging area for the aphids to develop to full maturity:

All of this happened in 2012, and I have not seen the aphids since.  I do still see rhino thorn bugs, at least one or two each year over the last few years, but I have never seen the level of activity I saw back in 2011.  I am always hopeful that each new year will bring some new drama; actually, last year it was an Opiliones explosion, but that's a whole other story.

The story about the Brazil nut symbiosis is interesting, and it was also a little bit of a setup to show my cute seed disperser below.  Although the plant is obscured by another (Wingstem, Verbesina alternifolia), it is collecting Large Flower Leafcup seeds; the flower heads are visible slightly right of and below it.

References and Further Reading