11 March 2012

Are you sure that you are editing in two dimensions?


Editing the Positive Orthant and The Topology of Text Editing





In my previous post on editing I mentioned column cutting and pasting, a powerful feature that many programmers I have worked with don’t use or even know about.  I started thinking about it in terms of math and seemed to me that this creates a dimensional difference between the standard Eclipse Editor and the editor that I use.  My observation is that the standard Eclipse Editor only lets you edit in one dimension, of course such a statement might seem absurd but I think it’s true and to explain it I am going to take a little tour of the concept of dimension.


From what I understand the idea of Dimension is not completely cut and dry and comes in a variety of forms, most people only know of or think of Spatial Dimension of the Cartesian Coordinate System in Euclidean Space for example the screen on which your editor is displayed.  But dimension both comes in many other flavors and allows the representation of so many things.  For example Dimensions can be in the form of Fractal Dimension, Hausdorff Dimension, Topological aka Lebesgue Covering Dimension, and the list goes on.


Another deceptive property of Dimension comes from the topological concept of surfaces or more generally Manifolds.  An interesting example of this is the older video game Asteroids or any game where you move from the bottom edge to reappear on the top edge of the screen or from top to bottom of the screen, the same applies to the left to right wrapping motion.  You are actually moving as shown above around on the surface of a torus along the two different axes shown in red and blue.  This idea can extend to higher dimensions, imagine, instead of a square video game screen, a cube where each face was connected to the opposite face (3 connections), if you moved in any direction, in three dimensions, far enough you would always end up at the same place, actually the universe could have this shape, I am not saying that it does though, interestingly there is a three dimensional version of asteroids.  This cube torus is known as a 3-torus which is one many 3 manifolds (three dimensional manifolds). Manifolds are a higher dimensional topological generalization of surfaces like spheres and tori.  In the case of the torus and 3-torus, you need to step up to a higher dimension to really see the structure that you are in.



Ok so some of these ideas may seem a little broad and not necessarily directly relevant but they really are and they are interesting and give a larger picture of the complexity of the idea of dimension so bear with me.  The next important and hopefully more obviously relevant idea is the concept of an Orthant or n-hyperoctant, these are just fancy and more general terms for a concept you probably already know.  An Orthant is the division of space around the origin, so in two dimensions there are four orthants also known as Quadrants or 2-hyperoctants.   The simplest case is two rays forming two 1-dimensional orthants, one positive and one negative.  The number of orthants grows by powers of 2, i.e. 2n, so one dimension (line) has 21= 2, two dimensions (plane) has 22= 4, three (space): 23= 8 and so on.  Each orthant has all combinations of positive and negative coordinates a set that will be of size 2n, so one dimension has {(+),(-)}, two dimensions has {(+,+),(-,+),(+,-),(-,-)},  and so on,  each dimension will have exactly one orthant that has only positive coordinates and exactly one orthant that has only negative coordinates.



The orthant concept is very relevant to text editing, when you edit you are working in a two dimensional Euclidean space, and you are working in the Positive Quadrant (Orthant).  When you edit, you start with line one, column one and count up to the right and down, this is actually a reflection around the x-axis of the traditional plane with the positive y to the left or counter clockwise from the x axis which is the way one usually represents a Cartesian plane.  Also we are working strictly with positive non-zero integer positions so our space is (n,n) ∈ (N x N)  n ∈ N > 0 the positive quadrant of the two dimensional integer plane.



At this point, if you are still with me, you might be saying: “Aha, you just demonstrated that editing is two dimensional”, but I’m not done yet and I want to take a small digression into Fractal Geometry first. The Koch or Snowflake curve shown above has a Fractal Dimension of log(4)/log(3) =log34= 1.2619.  But its Topological Dimension is 1 since it is a one dimensional continuous everywhere but differentiable nowhere curve.   In Topology you can stretch and bend things so imagine the curve is made of string and you pick it up and it loses its shape. You can then lay it down as a circle which is a one dimensional curve, a 1-torus.


The Koch Curve can be created using a Fractal L-System. The important thing to realize here is that Fractal L-Systems have a strong similarity to the Chomsky Hierarchy, now there are some substantial differences between the two rewriting systems, but I will focus only on one of the similarities here. The Koch Curve can be defined by the following L-System:


Alphabet : F

Constants : +, −

Axiom : F++F++F

Production rules:

F → F−F++F−F


Here, F means "draw forward", + means "turn right 60°", and − means "turn left 60°" (see turtle graphics).


The first four iterations, which correspond to the images above are:


F++F++F

F-F++F-F++F-F++F-F++F-F++F-F

F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F

F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F

Each iteration results in a new longer more complex one [topological] dimensional strand of characters.  The striking similarity here is that much as an L-System will give you a one dimensional strand of characters a Context Free Grammar which will describe most programming languages will give you a similar one dimensional “strand” of terminal symbols which would be lexemes corresponding to a linear sequence of programming language statements.


In the asteroids example we are looking at two dimensional screen that is really a two dimensional surface on a three dimensional torus, here we are looking at one dimensional strand represented on a two dimensional screen and this is the key observation that ties all of this together and shows the nuances of these ideas.  One dimensional editing is when you are editing on the positive ray, 1-hyperoctant, cursor movement follows the strand in one dimension, in both directions along the strand, and cutting will cut one dimensional strands and only allow them to and pasted into any single position within the strand.  The default behavior of the Eclipse editor offers one dimensional editing. Now let’s compare these specific operations to another editor which allows two dimensional movement and cutting and pasting.


Cursor Movement


The first example shows the how the cursor behaves at the end of the line in Eclipse. In the following image the cursor is at the end of line 21, which is a coordinate of (21,37) as shown at the bottom of the image:


Hitting the right arrow to move cursor right by one, moves you to the beginning of the next line, coordinate (22,1) as shown below:


This movement is along the one dimensional stand of the code itself, not the plane.  The confusing thing here is that we are looking at the text in two dimensions and using two dimensional coordinates to describe it, the next example will make it clearer.  Here is the same image from a different editor that supports two dimensional operations:


 

This example shows two dimensional cursor behavior at the end of line 21, which is a coordinate of (21,37) as shown at the bottom of the image, hitting the right arrow to move cursor right by one, moves you right by one position to coordinate (21,38) as shown above.


Cutting and Pasting



The one dimensional line cut:


Two dimensional cutting and pasting:



In both cases above we are selecting blocks of text with starting coordinate (21,12) to ending coordinate (26,18) however the Eclipse version (top) only allows the selection of a one dimensional strand of text, as you move in either direction you will only be able to grab complete lines between the start and end coordinates and if you paste this text you will be inserting a one dimensional strand into a single position in another one dimensional strand.  The two dimensional operation offers more flexibility beyond the only the one dimensional operation, actually the editor offers both one and two dimensional cutting, which is what you really want in an editor.  The resulting cuts are as follows, now admittedly the two dimensional cut in this case is not really practical.


The two dimensional column cut:



Hopefully this give a sense of the difference from editing on a one dimensional ray vs. a two dimensional plane.


If we revisit the one dimensional strand of characters idea, we notice a problem.  While the strand is one dimensional in that the position of each character can be given by a single integer index into the string, which is a single axis or ray.  Each position can have a different value. In our fractal each position could contain one of three characters {F,+,-} which we could map to {0,1,2} respectively.  Now each iteration of the L-System production generates a number in base three and each position can have one of three “pixels” in the strand. Now we need another dimension, or at least part of one, to represent this additional information.  A more practical example would be a 24 bit RGB image where each pixel has a position, remember this can be stored as a single one dimensional strand of pixels but displayed in two dimensions by cutting up the strand into rows.  Similarly a program can be “pixilated” where each pixel could be an ascii character (byte), a lexical token or lexeme, or a binary zero or one bit along the one dimensional strand.  The length of the strand would vary in accordance to height of the pixels a binary pixel representation/strand would have more positions than an ascii pixel representation/strand than would the lexeme pixel representation/strand.


Ok this is probably my zaniest post about my crazy ideas which I am pretty sure are correct. Hopefully I did a decent job explaining these ideas, I find them to be quite mind blowing, at least that's the way I felt when I first understood the basic ideas about topology and manifolds, and there is still so much I don't know. Also if you feel that you both have a better understanding of some of these ideas and at the same time feel more confused about them, I think that might be normal and I think a big part of learning math is just getting used to feeling that way all of the time, because the more you learn the stronger both of these feelings become, at least that's my experience.


An n-torus can mean either an n-dimensional torus or a two dimensional toriodal surface with n holes. In this post it always means n-dimensional torus.

4 comments:

  1. Eclipse supports column editing now (just found out):
    ALT-SHIFT-A in windows/linux, and Command-Option-A on mac
    to toggle on and off.

    ReplyDelete
    Replies
    1. One annoyance is it doesn't seem to work while the Aptana plugin is loaded, which I use for javascript/html view pages within eclipse. Hmm.

      Delete
  2. This comment has been removed by the author.

    ReplyDelete