Another way to scale computing

I was pondering the future of scaling computation, since we’re pretty much out of hertz now… okay, I was reading this paper: http://research.cs.wisc.edu/multifacet/papers/tr2011-1coherencestays.pdf . Great to hear about cache coherency’s bright future, but what about when you need to move a lot of data around? Caches don’t help so much then.

And so I thought of DMA-style transfers, and it struck me that it would be very useful to do DMA… with inline computation. If you played it just right, you could maintain full bandwidth with only a slight delay. You could use a feedback approach (or the stream cache idea below) with a local cache to do multiple computations over the same data and across data.

One possible application would be to handle VPN encryption and such by sitting between the CPU and network interface.

You could do memory-to-memory copy operations with computation: take an entire array in main memory, send it through this programmable data processing engine, and it goes right back to main memory with no CPU involvement at all and at full memory bandwidth (plus the computation delay). I imagine scientific computing could benefit from that.

These could be “stream”-like processors where the cache is just there to store a certain amount of the data. So, you’d have a moving window of data available via the local cache. From the programmer’s perspective, you’d have an array available each clock that represented up to a certain amount of data back in the stream.
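To make the programmer’s view concrete, here’s a purely hypothetical sketch (these interfaces don’t exist anywhere; they’re just to illustrate the moving-window idea):

// Hypothetical illustration: each clock, the engine exposes a window over the
// most recent elements of the stream held in its local cache.
interface StreamWindow<T> {
  int capacity();      // how far back in the stream the cache reaches
  T get(int offset);   // offset 0 = newest element, capacity() - 1 = oldest
}

interface StreamKernel<T> {
  // Invoked once per element as data flows through at memory bandwidth.
  T process(StreamWindow<T> window);
}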

I have an application I’m working on now where it would be very useful to have network traffic go through one of these processors and then on to both main memory and disk. Logging and auditing could be offloaded to these processors so the CPU never has to consider it.

It would be interesting to be in a place where the “CPU” was just a manager of arrays of other processors, directing traffic but not doing that much number crunching itself.

Notes on a Light Table Post

It appears Chris Granger has discovered the beauties of Reactive Programming in his latest post, The IDE as a Value, though he doesn’t call it that. It’s cool that he recognizes that an entire application can be considered a single composite time-varying value. That’s what Reactive Programming is all about.

I still think we need to condense all the useful development approaches into something everyone learns right away when they start out as developers, so we don’t have so much rediscovering and rehashing of the same things again and again. Oh well. Each person needs to learn it their own way, but… it seems really inefficient.

He also seems to be doing a very limited form of something I have planned for my IDE with the tags he talks about… I suppose I’d better write it up sometime soon. The IDE is one or two projects down the road, though, so… not sure how soon I’ll get to it.

Dependent Types in Java

No, you generally can’t do dependent types in Java. However, there are some techniques one can use to get some of the same benefits. The thing with any nominative type system is that you can always put more stuff in the name. It’s clunky and sometimes absurd, but it can work in many cases. Let’s take the classic situation of a List/Vector of a certain size.

Here’s an interface that we’ll have them all use:

package test.list;
import fj.F;
import fj.F2;
public interface List<A> extends Iterable<A> {
  <B> List<B> map(F<A, B> f);
  <B> B fold(F2<A, B, B> f, B init);
  List<A> add(A value);
}

Note that there is no way to get a value from the list. That is deliberate. If you don’t know what size the list is, then you don’t know which indexes are safe to use to get an element. So, you can’t.

Fortunately, Java does have first-order generic types, so we don’t have to put the type of the elements into the name. But Java can’t parameterize a type by its length, so we do have to put that in the name.

Here’s an example of how one might implement a list of length 2:

package test.list;

import java.util.Arrays;
import java.util.Iterator;
import fj.F;
import fj.F2;

public class List2<A> implements List<A> {
  private A v0;
  private A v1;

  public List2(A v0, A v1) {
    this.v0 = v0;
    this.v1 = v1;
  }

  @Override public Iterator<A> iterator() {
    return Arrays.asList(v0, v1).iterator();
  }

  @Override public <B> List2<B> map(F<A, B> f) {
    return new List2<B>(f.f(v0), f.f(v1));
  }

  @Override public <B> B fold(F2<A, B, B> f, B init) {
    return f.f(v1, f.f(v0, init));
  }

  @Override public List3<A> add(A v2) {
    return new List3<A>(v0, v1, v2);
  }

  public A getV0() {
    return v0;
  }

  public A getV1() {
    return v1;
  }

  public void setV0(A newV0) {
    v0 = newV0;
  }

  public void setV1(A newV1) {
    v1 = newV1;
  }
}
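
As a quick illustration of what the length-in-the-name buys you (this assumes a List3 written in the same style as List2, with a getV2() accessor; that class isn’t shown here):

List2<String> two = new List2<String>("a", "b");
List3<String> three = two.add("c");   // the new length shows up in the result type
String last = three.getV2();          // fine: a List3 has a third element
// String oops = two.getV2();         // would not compile: List2 has no getV2()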

Some things to note:

  • Note that this is a mutable list. I did that because I think it’s less obvious that one can do a mutable list in this manner. An immutable one would be simpler to implement.
  • If one did this for a significant sized list, one would probably want to use a persistent data structure (PDS) to store the elements.
  • The map and fold implementations could be done in an abstract base class, using a reference to an array or PDS assigned in the subclass constructors (a sketch of that idea follows this list).
  • Element access is direct as methods, or it could be direct field access. You cannot access elements by passing in a numeric index because Java has no way to check at compile time that it is within the bounds of the list.
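
Here’s a rough sketch of that base-class idea (my own illustration, not from the original code). Note that map still needs per-subclass help, since only the subclass knows which fixed-length class to construct for the result, so the sketch only lifts iterator and fold:

// Sketch: subclasses hand their elements to the base class, and iteration and
// fold are written once against that array. Uses the same imports as List2.
abstract class AbstractSizedList<A> implements List<A> {
  protected final A[] elements;   // assigned via the subclass constructor

  protected AbstractSizedList(A[] elements) {
    this.elements = elements;
  }

  @Override public Iterator<A> iterator() {
    return Arrays.asList(elements).iterator();
  }

  @Override public <B> B fold(F2<A, B, B> f, B init) {
    B acc = init;
    for (A a : elements) {
      acc = f.f(a, acc);   // same element order as List2's fold
    }
    return acc;
  }
}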

Some more examples would be useful. If there’s interest, I’ll perhaps do that in a followup post.

Constructive Type Theory is Hard but People Want It

“What every software developer should know”: I’m not going to list them here, but in my own current exploration of the various crevices of computer science, I’m finding it amazing how the software development industry keeps reinventing things that have been around a long time.

The latest experience of this was the post “I want to fix programming”, which I found via a response to it, “Yep, Programming Is Borked”.

There is a comment that mentions Coq, which is in the right direction, since both of those posts are hinting at what is ultimately constructive type theory, or intuitionistic type theory. So, yet again, we have someone posting “we need X” when X has been worked on by researchers for 30 years.

My guess is that he doesn’t realize just how hard it is to do what he suggests. Having a “compiler” construct something to match his description of the result of a sort would be extremely difficult. There’s a 24-page writeup of how to get the proofs (not automated; it requires user interaction) just to handle addition.

I wonder if it would raise the overall state of the industry if everyone had a basic knowledge of all the things going on out there. Maybe there is just too much. I’m guessing that university programs don’t have survey-type classes that cover the breadth of computer science, so we have graduates coming out with large holes in their knowledge. Which is unfortunate, since many who might contribute more take years to stumble across what they have been missing.

It’s kind of sad: in spite of being generally considered a “brainy” industry, the software world is extremely faddish and rehashes the same concepts again and again… You’d think we could find a way to push steadily forward instead of moving so much sideways and in circles.

Maybe I should put up a website that tries to just condense the high level topics and concepts that every developer should be aware of.

Compiling on the Fly with the Eclipse Compiler

There is this question about how to compile code on the fly using the Eclipse compiler. A useful link was posted, but it didn’t deal with cascading compilation, which is something I found out (in the comments here) that the Eclipse compiler can support.

So, I thought it would be useful to put up a sample of how to do that.

Download ECJ by starting from this page, clicking on the latest release, then finding and downloading the file ecj-[version].jar. For this, I’m using 4.2.1. Reference that jar on your classpath.

You use org.eclipse.jdt.internal.compiler.Compiler. Most of the constructor arguments have defaults available. You just give it a callback for the results in the form of an ICompilerRequestor. The example below uses a simple byte class loader to test the results. To do cascading compilation, you create a subclass of FileSystem, overriding the methods from INameEnvironment (a sketch of that follows the example).



package test.eclipse.compiler;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import org.eclipse.jdt.internal.compiler.ClassFile;
import org.eclipse.jdt.internal.compiler.CompilationResult;
import org.eclipse.jdt.internal.compiler.Compiler;
import org.eclipse.jdt.internal.compiler.DefaultErrorHandlingPolicies;
import org.eclipse.jdt.internal.compiler.ICompilerRequestor;
import org.eclipse.jdt.internal.compiler.batch.CompilationUnit;
import org.eclipse.jdt.internal.compiler.batch.FileSystem;
import org.eclipse.jdt.internal.compiler.batch.FileSystem.Classpath;
import org.eclipse.jdt.internal.compiler.env.ICompilationUnit;
import org.eclipse.jdt.internal.compiler.env.INameEnvironment;
import org.eclipse.jdt.internal.compiler.impl.CompilerOptions;
import org.eclipse.jdt.internal.compiler.problem.DefaultProblemFactory;
import org.eclipse.jdt.internal.compiler.util.Util;

public class TestCompile {
  static class ByteClassLoader extends ClassLoader {
    private Map<String, byte[]> classMap;

    public ByteClassLoader(Map<String, byte[]> classMap) {
      super();
      this.classMap = classMap;
    }

    protected Class<?> findClass(String name) throws ClassNotFoundException {
      byte[] bytes = classMap.get(name);
      if (bytes == null) {
        return super.findClass(name);
      } else {
        return defineClass(name, bytes, 0, bytes.length);
      }
    }
  }

  public static void compile(String code, String filename) {
    ArrayList<Classpath> cp = new ArrayList<Classpath>();
    Util.collectRunningVMBootclasspath(cp);
    INameEnvironment env = new NameEnv(cp.toArray(new FileSystem.Classpath[cp.size()]), null);
    ICompilerRequestor requestor = new ICompilerRequestor() {
      @Override public void acceptResult(CompilationResult result) {
        ClassFile[] cf = result.getClassFiles();
        HashMap<String, byte[]> classMap = new HashMap<String, byte[]>();
        classMap.put("Test", cf[0].getBytes());
        ByteClassLoader cl = new ByteClassLoader(classMap);
        try {
          Class<?> c = cl.loadClass("Test");
          Method m = c.getMethod("test");
          m.invoke(null);
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    };
    Compiler compiler = new Compiler(env, DefaultErrorHandlingPolicies.exitAfterAllProblems(), new CompilerOptions(), requestor, new DefaultProblemFactory());
    ICompilationUnit[] units = new ICompilationUnit[] {
      new CompilationUnit(code.toCharArray(), filename, null)
    };
    compiler.compile(units);
  }

  public static void main(String[] args) {
    compile("public class Test { public static void test() { System.out.println(\"Hello, world.\"); } }", "Test.java");
  }
}
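
The NameEnv used above isn’t shown in the post; it’s the FileSystem subclass mentioned earlier. A minimal sketch of the idea might look roughly like this (it assumes the protected two-argument FileSystem constructor that the call above implies, and the caching strategy in the comments is just my reading of “cascading compilation”):

// Sketch: a FileSystem subclass used as the INameEnvironment, so classes
// produced by earlier compile() calls can be made visible to later ones.
static class NameEnv extends FileSystem {
  NameEnv(FileSystem.Classpath[] paths, String[] initialFileNames) {
    super(paths, initialFileNames);
  }
  // For cascading compilation, override the findType(...) methods declared in
  // INameEnvironment to first look up classes produced by earlier compilations
  // (for example, the bytes handed to acceptResult) before delegating to super.
}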

Comments on Code Testing Presentation at Strange Loop

This video was referenced on the Shen mailing list. Here’s sort of an outline I wrote as I watched. My response, summarized: it’s useful information, but it didn’t seem to go anywhere. The whole presentation seemed to leave it as: it’s all hopeless. They even show “abandon hope all ye who enter here” at some point in the video. I find that a little sad.

14:00 Tony Hoare introduced
17:00 Gödel introduced
21:30 Here they talk about “taking it too far”: higher-kinded types, blah blah. It seems to me they are just saying they don’t comprehend it: perhaps a sign that useful information is being ignored?
22:10 They mention correct-by-construction. This is something I am working on. I would really like to know why they just dismiss it and move on. Is there some work on this that has determined it to be intractable/impossible?
23:30 They mention the Curry-Howard isomorphism and its monster in the closet: recursion. With general recursion you can “prove” anything. They also mention Gödel’s incompleteness theorem: you can’t prove all true statements within a system.
26:00 Introduce strong normalization
27:00 Example of strong normalization is System F, but it has no recursion and is not Turing complete
29:00 They say: “unless you’re using Agda, in which case more power to you”. Again, more dismissal right where it starts getting interesting. What if we are using Agda? Does it solve the problems mentioned?
30:50 “Haskell’s type system is Turing complete”. Umm, yeah, though I’m not sure it’s reasonable to use it that way.
31:29 This slide looks very similar to what Shen can do
They suggest functional programming to restrict what can be done, which resolves some of the issues

Q&A section:
Question about optional soft typing:
40:35 if your type system is removable from the language, it’s a verification system
“you can do something more powerful than a type system precisely because you have separated it from the language”
41:55: Qi/Shen mentioned

Constructor driven development

Again a misnomer; it’s more like datatype-driven development, but the constructor part is what just clicked in my head, linking data types and functions.

I had been thinking along the lines of specifying entire applications by types. But what I couldn’t quite link together was: you define these types, but where do functions come in? When does something actually happen?

And so it struck me: the purpose of functions in this approach is to construct the types. So, the entire application is just a big complex function that constructs the overall application type.

Let’s take a very simple example of a something that echoes whatever is typed at the keyboard. The types would break down as:
Echo Application -> (Echo Output (Echo Input)) -> (Output to Console (Keyboard Events))

Or something like that. You decompose the types down to the point where there is pretty much only one reasonable way to populate each type. For implementation, you just have to write constructors for those types. To write a constructor for the entire Echo Application, you just have to write a constructor that combines the two values: Echo Input and Echo Output (Echo Input). To create those, you create further constructors, until you’re down at the level of Keyboard Events (a time-varying value that fires on any keyboard event) and Output to Console (an action that writes a time-varying value to the console). So, given an appropriate framework already established, the whole application could be implemented with:
(bind (keyboard) (console))
where bind is a generic constructor that hooks up a value with a value listener. In a more likely case we might need a cast/conversion:
(bind (bind (keyboard) (keyboard->console)) (console))
where keyboard->console is the equivalent of a cast or value conversion. It casts/converts the keyboard value to the value expected by the console.
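
For what it’s worth, here’s a rough Java analogue of bind, with entirely made-up names, just to make “a generic constructor that hooks up a value with a value listener” concrete:

import java.util.function.Consumer;

// Made-up illustration: a Source is a time-varying value that pushes new
// values to its listeners; bind wires a source to a consumer of its values.
interface Source<T> {
  void subscribe(Consumer<T> listener);
}

final class Framework {
  static <T> void bind(Source<T> source, Consumer<T> sink) {
    source.subscribe(sink);   // every new value flows straight to the sink
  }
}

// Usage, mirroring (bind (keyboard) (console)):
//   Framework.bind(keyboard, console);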

Auto typing Java things in Shen

I want to auto-type Java things so integration with Java can all be type checked, too. I can use reflection at compile time to get information about the calls, but doing the appropriate thing with that information will be more of a challenge. Let’s look at this expression:

(.setVisible (javax.swing.JFrame. "title":java.lang.String):javax.swing.JFrame true)

First, this part: (javax.swing.JFrame. "title":#String). That is a constructor call. It needs to have type:
{ java.lang.String --> javax.swing.JFrame }

I think that can be handled by automatically evaluating the call:
(declare javax.swing.JFrame. [java.lang.String --> javax.swing.JFrame])

We get a little trickier with the .setVisible call. If I understand right, Shen does not support dispatching on argument types. A function name can only be bound to a single function, and thus a single type signature. Or can the type of an application depend on the types of its arguments?

In Java, there could be any number of classes that have a setVisible method. So, I’m guessing I’m going to have to diverge from Java dot notation and include the type in the symbol for the call, something like:
(.javax.swing.JFrame.setVisible Receiver true)

That would remove the requirement of specifying the type after a : for the receiver, so it would result in the exact same number of characters.

For the above, then, I could auto evaluate the call:
(declare .javax.swing.JFrame.setVisible [boolean --> ()])

Shen doesn’t have a void/nil/null type, right? So I’ll just have to choose something to return for void methods.
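
On the Java side, the reflection lookups that would drive generating those declare forms might look roughly like this (the mapping to declare forms in the comments is just my reading of the scheme above):

import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.util.Arrays;

public class ReflectSignatures {
  public static void main(String[] args) throws Exception {
    Class<?> frame = Class.forName("javax.swing.JFrame");

    // JFrame(String title): would generate
    //   (declare javax.swing.JFrame. [java.lang.String --> javax.swing.JFrame])
    Constructor<?> ctor = frame.getConstructor(String.class);
    System.out.println(Arrays.toString(ctor.getParameterTypes()));

    // setVisible(boolean), return type void: would generate something like
    //   (declare .javax.swing.JFrame.setVisible [boolean --> ()])
    Method setVisible = frame.getMethod("setVisible", boolean.class);
    System.out.println(setVisible.getReturnType());
  }
}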

Implementing Echo App in Shen

Implementing the echo app described in my previous post poses a few challenges in Shen as it currently is. For example, my intention was that every keystroke would be echoed, but consoles don’t generally provide that level of interaction. So, let’s expand it a little into a very simple GUI application.

Echo App -> Echo Frame -> (Frame (Echo Label)) ->
(Frame (Label (text: Accumulator (Keyboard Events (Label)))))

Here, the Accumulator is there to accumulate the characters so we don’t just see the most recent character but all of the characters that were typed.

The following code won’t compile, but it should give the general idea. One of the reasons it won’t compile is that you can’t type 0-place functions in Shen. I haven’t decided yet what I’m going to do for those cases, which are quite common in Java.


(define construct-echo-frame { --> #JFrame } ->
  (let Frame (#JFrame.) Label (construct-echo-label) (do (.add Frame:#JFrame Label) Frame)))

(define construct-echo-label { --> #JLabel } ->
  (let Label (#JLabel.) Events (construct-keyboard-events Label) Accumulator (construct-string-accumulator Events) (bind Accumulator Label:text)))

(define construct-keyboard-events { #JComponent --> event-source #KeyEvent } Comp ->
  (/. X (.addKeyListener Comp:#JComponent (subclass-#KeyAdapter. keyTyped:X))))

(define construct-string-accumulator { event-source string --> event-source string } Source ->
  (fold (/. Acc X -> (cn Acc X)) "" Source))

Obviously, this needs some refinement, and the details still need implementing, especially fold and bind, but I think it’s a step in the right direction.

Convenient “optional” args in functions

An inaccurate title, but… especially when working with persistent data structures, you often want a copy of a structure with one piece of it changed. Instead of creating a new function for each field, you could create one using pattern matching. I had a (vector * vector * list symbol) structure with meanings (alias * suffix * imports), so maybe I could:

(define struct-with
  (@p _ Suffix Imports) alias:Alias -> (@p Alias Suffix Imports)
  (@p Alias _ Imports) suffixes:Suffix -> (@p Alias Suffix Imports)
  (@p Alias Suffix _) Imports -> (@p Alias Suffix Imports))

Then to use it, you can call:

(struct-with Existing-structure alias:New-alias)

(struct-with Existing-structure suffixes:[suffix1 suffix2 suffix3])

(struct-with Existing-structure some.type.of.import)

I’m curious what people think of this pattern.

Of course, one would probably use a macro for this, and since separate methods (struct-with-alias, struct-with-suffixes, etc.) would perform faster, maybe there’s no point to this, but it was interesting.