
23 July 2014

A Survey of Graph Theory and Applications in Neo4J


A while ago I had the idea to do a graph theory talk at our local Neo4J Meetup, the Baltimore Washington Graph Database Meetup, especially since Neo4J and graph databases are based on graph theory.  So I finally sat down and spent quite a bit of time putting a talk together. I decided to structure it as an overview of graph theory, with a primary focus on the theoretical aspects, along with some applications like network theory and some specific examples of applications to Neo4J.  My talk is titled "A Survey of Graph Theory and Applications in Neo4J" and I presented it on June 24; see below for video, slides, and references. This post is my thoughts on it and the things leading up to its creation.

Some of my blog posts are created from a desire to make math more accessible and more relevant especially those areas of math that seem to be neglected by the traditional educational system.  One huge gaping hole in my math education both in our American institutional learning facilities and in my college computer science curriculum was that I was never exposed to graph theory.   Once you start studying graph theory you realize that many of these concepts could be taught in middle and elementary schools.  As I was preparing my presentation, I saw this: "Math for seven-year-olds: graph coloring, chromatic numbers, and Eulerian paths and circuits", reaffirming what I already knew.  Sadly many of these concepts were new to me as an adult, and I’d guess the same was probably true of much of my audience.  

Graph theory is a topic I have covered on this blog before.  I have written two posts about it: one is a general math post, "Triangles, Triangular Numbers, and the Adjacency Matrix", and the other, "The Object Graph", attempts to put it, in part, in a programming context.  One post, or series of posts, that I have wanted to do is an introduction to and overview of graph theory, and I was thinking that my talk could be turned into those posts.  However, I am not sure if and when I will get a chance to do that, so I figured until then I will make the talk itself that post, and that post is this post.

I found myself torn about publishing the video of my presentation: on one hand I feel that it could be much better, but on the other, as a friend pointed out, people might find it informative and enjoy watching it.  That being said, I reserve the right to replace it with a better version if I ever do one.

Writing a post about the talk provides me with an opportunity to make a few corrections, add some more source material, and mention a couple of things that I would have liked to include but that got cut in the interest of time.

I’ll start by adding some additional source material.  I followed the introductory remarks with a brief historical overview.  For that, especially the 20th century contributors, I made use of these PowerPoint slides from John Crowcroft’s "Principles of Communications" course.  I also referenced Graph Theory, 1736-1936 by Norman L. Biggs, E. Keith Lloyd, and Robin J. Wilson.  I mentioned the Watts and Strogatz Model (pdf of the original Nature paper).  I also included Fan Chung, who is married to Ronald Graham and is known for her work in graph theory, especially spectral graph theory.   I mentioned snarks, not to be confused with Lewis Carroll‘s snark, from which the name was coined.  When I was talking about the topology part I mentioned Euler's Gem.  Since the World Cup was going on during the talk I had to include these topology links: "The Topology and Combinatorics of Soccer Balls (pdf)" and some more "soccer ball math".

Due to time constraints I cut a few topics.  Two of them were the ideas of graph isomorphisms and graph automorphisms, along with related ideas like graph families defined by their automorphisms and the automorphism group.  Another area that I didn’t really get into, except for the linear algebra parts, was algebraic graph theory.

I did want to offer up a couple of corrections: I called a truncated icosahedron a dodecahedron, how embarrassing. I called the Travelling Salesman Problem the Travelling Salesman Program, doh! When I said a bipartite graph has no connections between edges, it should have been vertices.  And finally, I misstated Kirchhoff’s Theorem: it should be the n-1 non-zero eigenvalues, obviously you only use the non-zero ones!  There are probably more, and if I decide to replace these videos in the future, I will just replace this paragraph as well.

On a lighter note, when I was creating my "Graphs are Everywhere" images and sharing with my friend Wes, I created this one for fun:


Finally I’d like to thank our meetup sponsor Neo Technology for the refreshments. The part where we thanked them at the meetup didn’t get recorded.


Videos


Part 1




Part 2





References and Further Reading



04 December 2012

Data Science DC: Implicit Sentiment Mining in Twitter Streams

Summary and more



I have been attending the Data Science DC meetup pretty regularly as it’s an interesting meetup, often with quite good talks; the most recent was a very interesting presentation called "Implicit Sentiment Mining in Twitter Streams" by Maksim (Max) Tsvetovat.  He discussed a number of ideas that relate to semantic discovery, which are of interest to me as I am doing research into related areas, including applying semantic ideas to software naming.  So I thought it would be nice to do a little review augmented with links to references that were made and additional ones that I found in the course of researching what was discussed.  I am also including some more introductory links and material, as it helps me and hopefully others who are not fully versed in the world of NLP.  The meetup was held at Google’s downtown DC office; the irony of a meetup about Twitter being held at Google was pointed out humorously by David Lieber of Google as he introduced Harlan for the usual Data Science DC Harlan-Mark introduction.


Max starts by talking about a system that his team at George Mason built to map sentiment during the 2012 presidential election, which was then used to mine sentiment from current news, in this case the media coverage of the recent conflict in Gaza. This work has yielded an algorithm to show media bias.


He points out that there are a number of things people are trying to mine and predict using Twitter; the example he cites is the Wired article "Twitter Can Predict the Stock Market".  He sees Twitter not as a social network but as a wire, an analogue of the physical broadcast ether.  It’s not a media carrier but a wire that other things go into, with a real-time nature where things can change very quickly.


He moves on to sentiment analysis, mentioning a paper called "Sentiment Analysis is Hard but Worth it" by Michelle deHaaff.   He contrasts this title with what he describes as an easy "old school" sentiment analysis: you want to know what people think, so you take a corpus of words and a stream of data and you look for occurrences of good words vs. bad words, then use an average or apply some formula to create a measure of sentiment.  This is a naïve approach that might be used in a CS curriculum, but it does not really work in practice due to the complexity of human emotions and of language, which can have double and triple entendres (a minimal sketch of this naive word-counting approach follows the examples below).  He refers to a computational linguistics paper about "She said" jokes, which I believe is this: "That’s What She Said: Double Entendre Identification".  Some examples he gives of possibly deceptive and/or ambiguous statements in terms of sentiment are:


  • This restaurant would deserve highest praise if you were a cockroach (a real Yelp review ;-)
  • This is only a flesh wound! (Monty Python and the Holy Grail)
  • This concert was f**ing awesome!
  • My car just got rear-ended! F**ing awesome!
  • A rape is a gift from God (he lost! Good ;-)
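
To make the "old school" approach concrete, here is a minimal sketch of the naive word-counting idea in Java; the word lists and scoring scheme are my own illustration, not anything from the talk:

import java.util.*;

// Toy lexicon-based sentiment scorer: +1 for each "good" word, -1 for each "bad" word.
// The word lists are made up for illustration.
public class NaiveSentiment {

    static final Set<String> POSITIVE = new HashSet<>(Arrays.asList("awesome", "great", "good", "love"));
    static final Set<String> NEGATIVE = new HashSet<>(Arrays.asList("terrible", "bad", "hate", "awful"));

    public static int score(String text) {
        int score = 0;
        for (String token : text.toLowerCase().split("\\W+")) {
            if (POSITIVE.contains(token)) score++;
            if (NEGATIVE.contains(token)) score--;
        }
        return score;
    }

    public static void main(String[] args) {
        System.out.println(score("This concert was awesome!"));   // 1
        System.out.println(score("My car just got rear-ended!")); // 0, misses the sarcasm entirely
    }
}

As the examples above show, this kind of scoring has no way to pick up sarcasm, context, or modifiers.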

He summarizes the challenges these pose for machine learning:


  • Ambiguity is rampant
  • Context matters
  • Homonyms are everywhere
  • Neutral words become charged as discourse changes, charged words lose their meaning

The field of computational linguistics has developed a number of techniques to handle some of the complexity issues above by parsing text using POS (parts-of-speech) identification, which helps with homonyms and some ambiguity. He gives the following example:


Create rules with amplifier words and inverter words:


  • This concert (np) was (v) f**ing (AMP) awesome (+1) = +2
  • But the opening act (np) was (v) not (INV) great (+1) = -1
  • My car (np) got (v) rear-ended (v)! F**ing (AMP) awesome (+1) = +2??

Here he introduces two concepts which modify the sentiment, and which might fall under the heading of sentiment "polarity classification" or detection.  One is the amplifier (AMP), which makes the sentiment stronger, and the other is the inverter (INV), which creates the opposite sentiment.  I found this idea of "sentiment modification" intriguing, did a little searching, and came across a paper called "Multilingual Sentiment Analysis on Social Media" which describes these ideas [page 12] and a few more, including an attenuator, which is the opposite of an amplifier.  It also describes some other modifiers that control sentiment flow in the text.  These are pretty interesting concepts; actually the paper looks quite interesting, though I only read the first few pages.
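
A rough sketch of how amplifier and inverter rules might modify a base sentiment score; again, the word lists and weights are my own illustration of the idea, not the actual implementation:

import java.util.Set;

// Extends the naive scorer with AMP (amplifier) and INV (inverter) modifier words.
public class ModifierSentiment {

    static final Set<String> POSITIVE   = Set.of("awesome", "great");
    static final Set<String> AMPLIFIERS = Set.of("f**ing", "really", "very");
    static final Set<String> INVERTERS  = Set.of("not", "never");

    public static int score(String[] tokens) {
        int score = 0;
        int multiplier = 1;                    // carries the effect of the last AMP/INV seen
        for (String t : tokens) {
            String token = t.toLowerCase();
            if (AMPLIFIERS.contains(token)) {
                multiplier *= 2;               // amplify the next sentiment word
            } else if (INVERTERS.contains(token)) {
                multiplier *= -1;              // invert the next sentiment word
            } else if (POSITIVE.contains(token)) {
                score += multiplier;           // "f**ing awesome" -> +2, "not great" -> -1
                multiplier = 1;
            }
        }
        return score;
    }
}

Even with such rules, the rear-ended car example still comes out positive, which is exactly the "+2??" problem above.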


He cites a paper "Cognitive map dimensions of the human value system extracted from the natural language" by Alexei Samsonovich and Giorgio Ascoli.  This paper defines the following dimensions:


  • Valence (good vs. bad)
  • Relevance (me vs. others)
  • Immediacy (now/later)
  • Certainty (definitely/maybe)
  • And about 9 more less-significant dimensions

One result which is quite interesting is that these dimensions are pretty much language independent. While searching this out I also came across "Computing Semantics of Preference with a Semantic Cognitive Map of Natural Language: Application to Mood Sensing from Text" and "Principal Semantic Components of Language and the Measurement of Meaning" by the same authors.


Max’s work seems to run pretty heavy in social network theory, and includes an O'Reilly book: Social Network Analysis for Startups.  He also mentions having quite a bit of exposure to social psychology, consisting of "half a degree" as he put it, which also shows in his work.  He mentions a couple of human psychological aspects, somewhat NLP related but also somewhat divergent: the ideas of mirroring and marker words.


Mirroring is the idea that when people interact, and the interaction is positive (the example given was a successful flirtation), one person will mimic the other's body language.  He extends this concept to the language used by various parties, in this case the tweets they emit.


Marker words are unique words an individual speaker tends to use. The idea can also extend to common expressions between speakers. His description of marker words is:


  • All speakers have some words and expressions in common (e.g. conservative, liberal, party designation, etc)
  • However, everyone has a set of trademark words and expressions that make him unique.

He extends this idea to linguistic mannerisms; the examples he cites are that calling health care "Obama care" would mark you as conservative, and calling Hamas "freedom fighters" would mark you as siding with Hamas.  He uses these to observe mirroring:


  • We detect marker words and expressions in social media speech and compute sentiment by observing and counting mirrored phrases

The next part of the talk gets into the details of how to do the analysis of the raw text.   One idea that he talks about is text cleaning, pointing out that Twitter data is very noisy.  The text is cleaned in part using stop words, which are words that are common and have little lexical meaning, some examples being {a, on, the, to}. His full list, which he pilfered from WordNet, is here.
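
A minimal sketch of that cleaning step, filtering out stop words; the tiny stop list here is just for illustration, not his full WordNet-derived list:

import java.util.*;
import java.util.stream.Collectors;

// Lower-case the text, split on non-word characters, and drop stop words.
public class StopWordFilter {

    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "on", "the", "to", "is", "was"));

    public static List<String> clean(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                     .filter(t -> !t.isEmpty() && !STOP_WORDS.contains(t))
                     .collect(Collectors.toList());
    }
}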


Another important NLP concept is stemming, a concept related to linguistic morphology, given by his example:


  • Stemming identifies the root of a word, stripping away suffixes, prefixes, verb tense, etc.
  • "stemmer", "stemming", "stemmed" ->> "stem"
  • "go","going","gone" ->> "go"

He takes his stemming code from the Python project: Natural Language Toolkit.


Since the data being mined is coming from the internet, which is used by people all over the globe, language detection is important. While the semantic concepts outlined in the above work by Samsonovich and Ascoli may be language independent, the stemming and stop words are not; the techniques apply to most other languages but the specific tools and data do not, so the goal is to filter out other languages.  He sums this up as:


  • Language identification is pretty easy...
  • Every language has a characteristic distribution of tri-grams (3-letter sequences);
    • E.g. English is heavy on "the" trigram
  • Use open-source library "guess-language"

The Python library he uses is guess-language, which is based on some other implementations.  There is also a Java library, language-detection, on Google Code, which was written by Nakatani Shuyo.  All of these use a trigram approach to language detection, which uses n-grams of characters and their probabilities to identify languages; this approach is described in "N-Gram-Based Text Categorization".
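
To illustrate the trigram idea, here is a sketch of the general technique, not of guess-language itself: build a character-trigram frequency profile per language from sample text, then score new text against each profile.

import java.util.*;

// Count character trigrams per language, then guess by crude profile overlap.
public class TrigramLanguageGuesser {

    private final Map<String, Map<String, Integer>> profiles = new HashMap<>();

    public void train(String language, String sampleText) {
        profiles.put(language, trigramCounts(sampleText));
    }

    public String guess(String text) {
        Map<String, Integer> input = trigramCounts(text);
        String best = null;
        long bestScore = -1;
        for (Map.Entry<String, Map<String, Integer>> entry : profiles.entrySet()) {
            long score = 0;
            for (String trigram : input.keySet()) {
                score += entry.getValue().getOrDefault(trigram, 0);  // crude overlap score
            }
            if (score > bestScore) {
                bestScore = score;
                best = entry.getKey();
            }
        }
        return best;
    }

    private static Map<String, Integer> trigramCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        String s = text.toLowerCase().replaceAll("[^\\p{L} ]", " ");
        for (int i = 0; i + 3 <= s.length(); i++) {
            counts.merge(s.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }
}

A real implementation ranks trigrams and compares rank profiles, but even this crude overlap makes English text light up on the "the" trigram.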

After the text is filtered to English, cleaned, and stemmed, what is left are the roots of the big words, the words that carry more meaning.  These are used to create term vectors.  Term vectors are a way to map documents into a vector space; this is known as the Vector Space Model (VSM) and is a fairly common approach, used in Lucene and its derivatives like Solr and ElasticSearch.  Term vectors can be built with different levels of granularity; in Lucene this is generally done at the document level, but it can also be done at the sentence level.
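
A small sketch of the term-vector idea: represent each document (or sentence) as a map from term to count and compare documents with cosine similarity. This is the generic VSM technique, not the code from the talk.

import java.util.*;

// Build a term-frequency vector and compare two vectors with cosine similarity.
public class TermVectors {

    public static Map<String, Integer> termVector(List<String> terms) {
        Map<String, Integer> vector = new HashMap<>();
        for (String term : terms) {
            vector.merge(term, 1, Integer::sum);
        }
        return vector;
    }

    public static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}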

I was hoping to better describe what he is doing with the term vectors and how they relate to the graphs that he creates, but I am unclear as to whether his term vectors are built at the document (tweet) level or the sentence level; I believe it is the sentence level, as he refers to a common word in two sentences being the intersection of two different term vectors.  He then starts talking about bigrams and linking speakers to bigrams; I am not sure how these relate to the term vectors.  In this case the bigrams, n-grams of order 2, refer to words, as opposed to the trigrams mentioned above for language detection, which were over letters.

Regardless of how they are created, the system he describes uses bigrams of words linked to speakers, which form a two-mode network, a concept that I was unfamiliar with and which is described in "2-Mode Concepts in Social Network Analysis".  This two-mode graph technique drives the final graphs for the sets, in the cases of {Santorum, Gingrich, Romney} and {IDF, Hamas}.   He also points out that, by counting the occurrences of bigrams, the most common bigrams give the nature of the discourse structure.

Counting bigrams also enables a technique of throwing out bigrams that only occur once in a certain time period; purging single occurrences cuts out the noise. The number of co-occurrences is power-law distributed, which reduces this from a big data problem to something that runs on an Amazon micro instance.  Also, dates were recorded for each occurrence, which allowed noncurrent topics to be purged from the current data over time.
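
A sketch of that bigram bookkeeping as I understand it from the description, not the actual system: extract word bigrams, count occurrences, and purge the bigrams seen only once in the window.

import java.util.*;

// Count word bigrams and drop those that occur only once, which removes most of the noise.
public class BigramCounter {

    public static Map<String, Integer> countBigrams(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            counts.merge(tokens.get(i) + " " + tokens.get(i + 1), 1, Integer::sum);
        }
        return counts;
    }

    public static Map<String, Integer> purgeSingletons(Map<String, Integer> counts) {
        Map<String, Integer> kept = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }
}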

The algorithm to detect media bias, which he warned is naïve, yielded:

  • NPR: 58% favorable to IDF
  • Aljazeera: 53% favorable to IDF
  • CNN: 59% favorable to IDF
  • BBC: 54% favorable to IDF
  • FOX: 51% favorable to Hamas
  • CNBC: 60% favorable to IDF

I hope others find this useful; I sure learned a lot digging into the presentation and researching the ideas presented.  This post ran longer than I had originally thought, which I attribute to the broad subject area that this talk covered.  Semantics is a complex and deep topic with many facets and approaches; I was hoping to throw some order theory related ideas in as well, as they are quite applicable, but that will have to wait for another time.

References

The following references are a mix of works referenced in the presentation and works that I came across while writing this; many are linked above but not all are:

Data Science DC meetup

Data Science DC: Implicit Sentiment Mining in Twitter Streams

Event Audio

Code and Slides

Maksim Tsvetovat Publications

The Math of Search and Similarity, Part One: Lucene, the Boolean Model, tf*idf, and the Vector Space Model

Sentiment Analysis is Hard but Worth it by Michelle deHaaff.

That’s What She Said: Double Entendre Identification by Chloe Kiddon and Yuriy Brun

WordNet

Stanford NLP

Natural Language Toolkit, github

Multilingual Sentiment Analysis on Social Media by Erik Tromp

Opinion mining and sentiment analysis by Bo Pang and Lillian Lee

Tracking Sentiment Analysis through Twitter by Thomas Carpenter and Thomas Way

Sentiment Analysis: An Overview by Yelena Mejova

N-Gram-Based Text Categorization (1994) by William B. Cavnar , John M. Trenkle

2-Mode Concepts in Social Network Analysis by Stephen P. Borgatti

Basic notions for the analysis of large two-mode networks by Matthieu Latapy, Clemence Magnien, and Nathalie Del Vecchio

19 August 2011

The Object Graph

In the ORM world, e.g. Hibernate, the term Object Graph is used quite a bit and it is a powerful notion; however, I suspect that Object Graph often gets used without much thought to some of the underlying concepts.  Generally the term refers to an Object, usually a Domain Object, which often contains other Domain Objects and/or Collections of other Domain Objects via composition.  With an ORM like Hibernate very complex Object Graphs can be read, altered, and persisted with very little work; it is an effective concept that can make building applications both easy and elegant, especially if you have the proficiency and forethought to design your mappings properly.

In order to talk more intelligently about some of the underlying concepts regarding the Object Graph, it is helpful to get a few fundamentals of Graph Theory out of the way; if you know them feel free to skip ahead.  Actually this is a good opportunity to put them in a context that may be more intuitive for many developers.  Graph Theory is an abstraction, a framework if you will, built on top of Set Theory; however, we don’t need to get bogged down in a Set based definition here, though I do encourage you to pursue this on your own as it is extremely seminal and becoming increasingly essential.  We just need to get four fairly simple graph concepts out of the way, which can be explained easily and visually: the difference between a simple graph and a multigraph, the difference between a cyclic graph and an acyclic graph, the definition of a directed graph, and the definition of a tree.   Let’s start with the graphs:

As you can see from the above diagram we have several different types of graphs, and it should be noted that these attributes can overlap: the Undirected Graph contains cycles, so it is cyclic, as does the Directed Graph; the Multi-Graph1 by definition contains cycles, so it is cyclic, and this one is also undirected.  The Acyclic Graph is undirected.  So these attributes can combine, except for an acyclic multi-graph, which by definition can’t exist, I think.

In order to limit complexity for our discussion we are going to limit our object graph to the case of a directed acyclic graph, which is also known as a DAG.  More specifically we will be considering a specific case of this known as a directed rooted tree, where edges have a direction away from the root. The following example shows a "rooted" DAG aka a tree:

As you can see, the above tree is also a domain object graph; it is also a directed graph where the direction indicates containment via composition, which can be thought of as a "has-a" relation.  Now in some cases, using ORMs like Hibernate, one might include a mapping from Address back to Person so the ORM knows how to manage the relation, and this is a cycle, but it is an artifact created by the ORM since all accesses to this object graph by the application are from the root (Person node).  That is not to say that there might not be other structural reasons for such a cycle: perhaps an application needs to hold a list of addresses but also needs to get some of the Person attributes; in that use case the Address node becomes the "relative" root from which the application access occurs, so the object graph would then have two roots and need a cycle to maintain that duality.

So in many cases, in regards to the ORM entity model, the object graph can be thought of as a tree, a rooted directed acyclic graph (DAG).  There is one important divergence from traditional graph theory, which is the presence of collections in these graphs; a collection forms a special kind of node that can have combinatorial, enumerative properties in the case of the various types of collections, and can also have certain mapping properties in the case of a Map object. An example of a modified Person object that might hold a collection of addresses is:
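
For instance, a hypothetical sketch of such an object graph in Java; the class names and fields are illustrative, not taken from a real mapping:

import java.util.ArrayList;
import java.util.List;

// Root of the object graph: a Person that "has" a collection of Addresses via composition.
public class Person {

    private Long id;
    private String name;
    private List<Address> addresses = new ArrayList<>();  // the collection node in the graph

    public List<Address> getAddresses() { return addresses; }

    public void addAddress(Address address) { addresses.add(address); }
}

class Address {

    private Long id;
    private String street;
    private String city;

    // No reference back to Person here, which keeps the graph a rooted tree;
    // an ORM mapping might add one, introducing the cycle discussed above.
}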

There are a number of ways to expand on these ideas further but I think I will leave those for future posts. 

 

1 This is the graph that Euler came up with for the famous Seven Bridges of Königsberg problem. The bridges are shown in the top image.

24 July 2011

Triangles, Triangular Numbers, and the Adjacency Matrix

One of the cool things about math is that there are many interconnections and interrelations between the various ideas and concepts and looking at things from a broader perspective can often be enlightening.

The area of a triangle is something that we learn in elementary school; its formula, often stated as one half the base times the height, is:
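
A = \frac{1}{2} b h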

It’s pretty basic stuff, perhaps even boring.

In Number Theory and Discrete Math there is a sequence of numbers known as the Triangular numbers; each triangular number (T_n) is the sum of all the integers from 1 up to and including n. For example:

T_1 = 1

T_2 = 1 + 2 = 3

T_3 = 1 + 2 + 3 = 6

T_4 = 1 + 2 + 3 + 4 = 10

T_n = 1 + 2 + 3 + ... + (n - 1) + n

These can be generalized by the following formula:
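
T_n = \sum_{k=1}^{n} k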

The capital Sigma notation is Summation, which is equivalent to the following simple Java code snippet:

public int triangularNumber(int n) {
    int result = 0;
    for (int k = 1; k <= n; k++) {
        result += k;
    }
    return result;
}

Triangular numbers are the additive analog of the factorial function, i.e. the addition of 1..n vs. the multiplication of 1..n, and can also be computed recursively:

T_n = n + T_{n-1}

T_n = n + ((n - 1) + T_{n-2})

. . .

T_n = n + (n - 1) + ... + 2 + 1

Which can be implemented in Java as:

public int triangularNumber(int n) {
    if (1 == n) {
        return n;
    } else {
        return n + triangularNumber(n - 1);
    }
}

Triangular numbers can be drawn visually as a triangle made up of dots, the first 5 triangular numbers are 1, 3, 6, 10 and 15:

There is another, better way to compute triangular numbers: a closed formula that Gauss1 came up with when he was a child:
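
T_n = \frac{n(n+1)}{2}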

If you assign (n + 1) to be the base and (n) to the height, the formula bears a striking resemblance to the formula for the area of a triangle, seen visually as:

As you can see, the base of the triangle ends up being one larger than the actual size of the triangle, and the result is an "area" that is larger than the corresponding continuous area. I think this might be doing the equivalent of what the ceiling function does, rounding up to acquire the "extra bits" of the discrete case. This triangle was referred to by Fermat as the Collateral Triangle, although I think the term "discrete triangle" is better, and this concept relates to the differences between discrete and continuous mathematics. This is explored in more detail by David J. Pengelley, specifically in the paper "The bridge between the continuous and the discrete via original sources" and the partial chapter "The Bridge Between Continuous and Discrete". Actually his work, especially in regards to the Collateral Triangle aka discrete triangle, was something of a missing link for my ideas here.  The following graph was derived from a graph in one of his papers.  I refactored it to use the function y = x and pimped it out a bit to better demonstrate my triangle theme; it visually shows the differences between the discrete and continuous "areas" in terms of Summation and Integration:

The "area" of discrete vs. continuous triangles

The diagram shows how the discrete area picks up more in that jagged edge than the continuous area does and ends up being a larger quantity, n/2 to be exact, as seen in the formulas. This is the type of graph that is usually used as an introduction to integral calculus to show how integration works, where in the continuous world the rectangular sections are infinitesimal.  In this example the function y = x creates an area under its curve that is a unique2 triangle, drawn in red, the famous 45-45-90 triangle, which is both right and isosceles; in this case the height and base are equal, both having the value n, so the area is (n·n)/2, which is the result of the above integral measuring the area of the red triangle where height = n and base = n.  The discrete triangle, Fermat’s Collateral Triangle, is superimposed into each 1 x 1 cell of the graph as light blue circles, with its area represented by the summation below the diagram and to the left.  Also note that integration and Summation are continuous and discrete analogs of each other.  I think there’s probably a lot more to this, but that’s all I need for my purposes, so I’m going to have to leave it there.

Triangular numbers have the property that if you add two consecutive triangular numbers, the sum is a perfect square, which can be expressed as the following equation:

T_{n-1} + T_n = n^2

This addition can be seen visually as:

T_4 + T_5  =  10 + 15  =  5^2  =  25

Since, from our recursive definition, T_n = n + T_{n-1}, we can rewrite (T_{n-1} + T_n = n^2) from above as:

T_{n-1} + T_{n-1} + n  =  2 ∙ T_{n-1} + n  =  n^2

This can be seen visually as:

2 ∙ T_4 + 5  =  2 ∙ 10 + 5  =  5^2  =  25

Another pattern in drawing triangular numbers is to replace the dots with ones (hold this thought for a moment):

Triangular numbers show up in Graph Theory: the Complete Graph, the graph where every vertex is connected to every other vertex by exactly one edge, will have T_{n-1} edges for a graph with n vertices; the complete graph is denoted K_n.  The first three examples are:

T_1, T_2, and T_3 are the edge counts for K_2, K_3, and K_4. So if we define the function E(G), where G is a graph and E(G) is the edge count, i.e. the number of edges in the graph, then E(K_n) = T_{n-1}; this only applies to the Complete Graphs K_n, not to all graphs. 

The formula for E(K_n) and T_{n-1} is:
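
E(K_n) = T_{n-1} = \frac{n(n-1)}{2}

(each of the n vertices pairs with the n - 1 others, and each edge gets counted twice, hence the division by 2)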

Let’s consider the graph K_6:

Now, looking at the fully constructed graph, we don’t really get a true sense intuitively of the relation to triangular numbers, so let’s look at the construction from an inductive and combinatorial perspective:

So let’s build K_6. We’ll start with vertex 1, which is blue, and draw an edge (a blue line) to each other vertex, which results in 5 blue edges.

 

Now we move (clockwise and numerically increasing) to vertex 2, in green. Since we already have a blue line back to vertex 1 we don’t draw another; remember: no repeated edges. So we will only draw edges to vertices that we have not visited yet, which leaves 4 remaining, drawn in green.

As we continue this we add 3 red edges, 2 yellow, 1 black, and 0 purple since we have connected to all vertices upon reaching this last vertex.

As shown in the above diagram in the appropriate colors, the edges are added as 5 + 4 + 3 + 2 + 1, which is the 5th Triangular Number, for K_6 with 6 vertices.  Also this inductive construction is similar to the above recursive function; remember, recursion and induction are related, except you would call it with (n - 1), e.g. triangularNumber(n - 1).

The Adjacency Matrix is a way to represent the edges between vertices in a Matrix. For a graph of n vertices it has the form:

Each position a_{r,c}, where r is the row and c is the column, represents an edge, or possibly a count of edges, that connects vertex number r to vertex number c. The following gives the template structure for the matrix for K_6:

As you can see from the above matrix, each colored edge maps to the corresponding adjacency element: edge 1 –> 2 is represented by the blue a_{1,2}, and all of the 1 -> n and n -> 1 edge elements appear in blue; this pattern repeats for all the colored edges. For this example we are only going to consider Simple Graphs, usually just referred to as Graphs, which have at most one edge between any two vertices, as opposed to Multigraphs, which allow edges to repeat between vertices, so all elements of the matrix can only be either 0 or 1.  Also we are not considering graphs which have Loops, so the diagonal, where the row equals the column, a_{i,i}, will always be 0.  So our template will now take the (très triangular) form:

We are only considering the Adjacency Matrix for the Undirected Graph, which has the property that the upper right triangular portion is symmetric to the lower left triangular portion; in other words the terms a_{i,j} and a_{j,i} are equal: a_{i,j} = a_{j,i}, which is referred to as a Symmetric Matrix.  From our colorful K_6 example above we have the following adjacency matrix.  The coloring, which maps the edges to their representation in the matrix, also helps illuminate the symmetry; if you visualize rotating the matrix around the diagonal of zeros you can see that the two halves are the same.

Also you may have noticed, and this is not a coincidence, that this structure is very similar to our square constructed from n and the two T_{n-1} triangles (T_{n-1} + T_{n-1} + n). If you rotate the above square by 90 degrees, either clockwise or counterclockwise, and replace the black dots with zeros and the blue dots with ones, you get the adjacency matrix for the K_n graph.
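
To tie this back to the code above, here is a small sketch (my own illustration) that builds the adjacency matrix of K_n and checks that its edge count is the triangular number T_{n-1}:

import java.util.Arrays;

// Build the adjacency matrix of the complete graph K_n (no loops, no multi-edges)
// and verify that its edge count equals T_(n-1) = n(n-1)/2.
public class CompleteGraphAdjacency {

    public static int[][] adjacencyOfKn(int n) {
        int[][] a = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                a[i][j] = (i == j) ? 0 : 1;  // 0 on the diagonal, 1 everywhere else
            }
        }
        return a;
    }

    public static int edgeCount(int[][] a) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {  // upper triangle only, since a[i][j] == a[j][i]
                count += a[i][j];
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] k6 = adjacencyOfKn(6);
        System.out.println(Arrays.deepToString(k6));
        System.out.println(edgeCount(k6));  // 15, the 5th triangular number
    }
}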

Well, this is my first, but not last, "pure" math post, except perhaps for the two code snippets.  I feel that these concepts are important and relevant to programmers and probably will only be more so in the future.  The Adjacency Matrix is pretty important for anything relating to Graph Theory, like the PageRank algorithm or Social Network Theory; there is a lot more to say about it and I have at least one follow-on post planned in this area as well.  The goal here was to present the Adjacency Matrix in a context that makes it more interesting and intuitive, and to convey some of the beauty of these structures; hopefully I succeeded in some small measure.  I admit my approach here is somewhat unconventional and I may have taken some artistic license with some ideas, such as the terms combinatorial and inductive, but I think they fit. Also I am not sure I fully understand the concept of area in the discrete context.  I hope you found this as enjoyable to read as I found it to write.  When I first had the idea I realized that I needed to draw a number of images; Processing really came in handy and was a blast to use for it, and it has given me a whole new perspective on geometric/graphical programming and visualization. I hope to do more with it, including writing about it.  Additionally, LaTeX has been great for creating math expressions.

1 This was also known to ancient mathematicians.

2 Up to similarity, only one right isosceles triangle (the 45-45-90 triangle) exists in Euclidean Geometry.