Tuesday, January 29, 2008

Lazy streams in mathematics and vision

ShapeLogic 0.9 contains new functional and declarative constructs
  1. Lazy streams
  2. Simplified and expanded lazy calculations
The functional and declarative construct of ShapeLogic can be used independently of the image processing code. It only requires a 200 KB jar file to use this in other application, and there is no dependency of third party jars.

ShapeLogic 0.9 letter recognition example is still using a different system for declarative programming. More on that later in this post.

To test lazy streams in pure form I tried them on the 10 first mathematical problems in Project Euler. I think that ShapeLogic streams provided for simple solutions to the mathematical problems.
However they are more verbose that say solutions in the Scala language:
http://scala-blogs.org/2007/12/project-euler-fun-in-scala.html

Solutions to a few of the Project Euler mathematical problems

Project Euler is a list of 178, mathematical problems, that can be solved by computers.

1 Add all the natural numbers below 1000 that are multiples of 3 or 5

NaturalNumberStream naturalNumberStream = new NaturalNumberStream(1,999);
ListFilterStream<Integer> filter = new BaseListFilterStream<Integer>(naturalNumberStream) {
public boolean evaluate(Integer object) {return object % 3 == 0 || object % 5 == 0;}
};
SumAccumulator accumulator = new SumAccumulator(filter);
System.out.println(accumulator.getValue());

2 Find the sum of all the even-valued terms in the Fibonacci sequence which do not exceed one million

BaseListStream1<Object,Integer>fibonacci = new BaseListStream1<Object,Integer>(){
{ _list.add(1); _list.add(2);}
public Integer invoke(Object input, int index) {return get(index-2) + get(index-1);}
};
ListFilterStream<Integer> filter = new BaseListFilterStream<Integer>(fibonacci) {
public boolean evaluate(Integer object) { return object % 2 == 0; }
};
SumAccumulator accumulator = new SumAccumulator(filter) {
{_inputElement = 0;}
public boolean hasNext(){ return _inputElement <= theNumber; }
};
System.out.println(accumulator.getValue());

More solutions to Project Euler


Here are the next 8 solutions:
10 first solutions to Project Euler in ShapeLogic
10 first solutions to Project Euler as Java unit tests

Comparison between streams and old declarative constructs

In ShapeLogic 0.8 a cornerstone in declarative programming was a goal driven system of tasks with sub tasks. The letter recognition example reads in a rule database and translated it into goals/tasks with sub tasks. Somewhat like the programming language Prolog.
In case of uncertainty the different choices would live in an artificial intelligence choice tree that is tied to the tree of tasks and sub tasks.

Integrating old declarative constructs with lazy streams

Based on what is needed in the current letter match example. It seems like the stream based approach is simpler and able to handle all the problem that the goal driven approach could.
I expect the stream approach to supersede the goal approach.
The rule database will be read and translated to streams. So the rule for the letter A would be translated into a filter that could filter a stream of polygons into a stream of polygons that represent the letter A.

Also currently the rules now are using JEXL to translate a text rule into something executable.
Example:
Rule for letter A: polygon.holeCount == 1.
In ShapeLogic 1.0, you should be able to use Java 6 Scripting, JSR 223, to be able to define these rules, in one of the 25 supported scripting languages. In ShapeLogic 0.9: Groovy, JRuby, and JavaScript was tested with streams, but not with the rule databases.

Using streams for concurrent programming

Streams can also support parallel or concurrent programming, which is important with the CPU intensive operations in image processing and computer vision. Especially with the advent of cheap multi processor machines.

Example: Find polygons in a stack of images

You define a lazy data stream for this and set a stream property
randomAccess = true
This indicates that individual elements can be calculated independently. The factory creating the stream could create a parallel version of the stream and assign each operation its own thread. Note that the result would be a stream of polygons for each image.

-Sami Badawi

Wednesday, January 23, 2008

ShapeLogic 0.9 with lazy stream library released

Here are the release notes, I will soon describe the changes in more details.

This is the first release where ShapeLogic is moving beyond current parameters as a plugin library for ImageJ, currently only used in a letter recognition example. The improved system will be for declarative programming where the user can define rules in either a database or flat file. The focus will still be on image processing and computer vision, but the system will be more broadly applicable. There has been no new work on image processing or letter recognition in this release. ShapeLogic 1.0 will combine these new changes with the current image processing code.

Changes

  • Introduce new functional, declarative and query constructs to Java
  • Implement lazy streams like Haskell, Scala or Scheme
  • These functional constructs are very lightweight and you only need one 200KB jar file to use it in other applications
  • Test streams by solving the first 10 mathematical problems from Project Euler
  • Enabled Java 6 Scripting for evaluating expressions.
  • Tested with Groovy, JRuby, JavaScript, but should work with other supported Scripting languages, currently that are 25 of these. This makes it possible for users to add rule, formulas and queries in real time using text format. They can interact with a running Java application, which can be useful in science, finance or web applications.
-Sami Badawi

Saturday, January 12, 2008

Lazy streams in Java, Groovy, JavaScript, JRuby

This is a follow up to my last blog: Functional constructs Java 7, Groovy, Commons

I have finished coding the first part of the lazy streams for ShapeLogic. They are going to be key for the declarative programming query interface to ShapeLogic. As a proof of concept I wanted to see how easy it would be to implement a lazy stream of Fibonacci numbers. So far I have tested my framework with definitions written in:
  1. Groovy
  2. JavaScript
  3. JRuby
  4. Java
It is essential that lazy streams are are easy to define so they can be put in a flat file or a database. I was very satisfied with the concise definitions of the Fibonacci streams in the different languages:

Lazy Fibonacci stream in Groovy

new FunctionStream("fibo","def fibo_FUNCTION_ = { fibo.get(it-2) + fibo.get(it-1) };",1,1);

fibo is the name the lazy stream has in the context / name space
fibo_FUNCTION_ it is a naming convention that the function that is use to create the stream has this name
1,1 is the first part of the lazy stream

Lazy Fibonacci stream in JRuby

new FunctionStream("fibo","jruby",null,"def fibo_FUNCTION_(it) return $fibo.get(it-2) + $fibo.get(it-1) end",1,1)

Lazy Fibonacci stream in JavaScript

new FunctionStream("fibo","javascript",null,"function fibo_FUNCTION_(it) { return parseInt(fibo.get(it-2) ) + parseInt(fibo.get(it-1))};",1,1);

Scripting using JSR 223

I added a formula language for users to an enterprise Java system a few years back using Antlr and BeanShell. JSR 223 is dramatically easier to work with, but there are still some problems.

The only scripting language that works out of the box with Java 6 is JavaScript. To get others to work you have to download jsr223-engines.zip from:
https://scripting.dev.java.net/servlets/ProjectDocumentList
For each language you want to use there is a engine jar file.
E.g. jruby-engine.jar.
They have to be on your path. So does the jar file implementing the language.

JSR 223 and Maven 2 problems

JSR 223 does not work well with Maven 2. The engine jar file does not reside in the Maven repository so you have to separately install them into you local Maven repository. Here is the command to install JRuby:
~/bin/maven-2.0.8/bin/mvn install:install-file -Dfile=jruby-engine.jar -DgroupId=org.jruby -DartifactId=jruby-engine -Dversion=1.0.1 -Dpackaging=jar

Scripting languages currently under JSR 223

beanshell, browserjs, ejs, freemarker, groovy, jacl, jaskell, java, javascript, jawk, jelly, jep, jexl, jruby, jst, judo, juel, jython, ognl, pnuts, scheme, sleep, velocity, xpath, xslt.
These languages should in theory work with ShapeLogic, without any additional code.

Other implementations of lazy streams

  1. BaseStream: That is the abstract base class with most of the lazy stream functionality
  2. IteratorStream: Generates elements using Java Iterator
  3. TransformerStream: Generates elements using Java interface
  4. FunctionStream: Generates elements using JRS 223 as described above

Next step for functional constructs for ShapeLogic

  1. Filter that is easy to define in external text using scripting
  2. Transformer transforming one lazy stream to the next
  3. Query interface from where a result can be retrieved
-Sami Badawi

Wednesday, January 9, 2008

Functional constructs Java 7, Groovy, Commons

What functional constructs to use for ShapeLogic

I have started to code the lazy streams for ShapeLogic. They are going to be key for the query interface to ShapeLogic. I need some functional constructs to work with these. The top sources candidates are:
  1. Apache Commons
  2. Groovy
  3. Java 7
  4. Hand coding
None of these are a perfect fit. This post list some of the advantages and disadvantages of these.

Apache Commons functional constructs

I am currently using Apache Commons JEXL for user expressions.
  1. Apache Commons do not use templates
  2. Apache Commons Functor are still in the sandbox and not actively developed
  3. The code is not uniform
  4. You cannot make user define functions in Apache Commons JEXL

Groovy functional constructs

I need to use a scripting language, to define user functions.
  1. Groovy contains all of Java's constructs
  2. Groovy comes with good functional constructs
  3. I like the Groovy syntax for using these
  4. I cannot use these expressions directly since Groovy is not a lazy language

Java 7 functional constructs

Java 7 comes with good functional constructs.
  1. There are a lot of interest for functional constructs in Java 7
  2. I would rather use Java 7 than compete with it
  3. Java 7 still seem to be pretty far away
  4. It is not sure that closures will make it into Java 7

Thoughts on immutable constructs

I will probably make the convention that a lazy stream like the Fibonacci numbers are immutable, but not enforce this by making a LISP list.

Scala envy

Now I suffer from Scala language envy. These Scala language features would come in handy now:
  1. Lazy streams
  2. List comprehension, with same syntax for lists, iterators and streams
  3. First class function
  4. Closures
  5. Lazy calculation
  6. Uniform access to functions and lists
I could of cause try to include the scala.jar and directly use the Scala constructs in ShapeLogic, but there lay the road to madness.

Hand coding functional constructs

This might not be too much work, but I would rather not create yet another implementation of functional constructs. Apache Commons Functor never took off, despite some good press. That is probably an indication that Java was not that well suited for elegant functional constructs when this library was made, and maybe still isn't.

Current work plan

  1. I will start by implementing a lazy Fibonacci stream
  2. Then I will move the polygon finder to a lazy stream, before it could only find one polygon
I will try to get ShapeLogic 0.9 out soon, even if it is a very limited version.

If you know of any libraries that would fit my need and not be too heavyweight, please let me know.

Two Fibonacci implementations

The Haskell language has an incredibly elegant lazy stream implementation:
fibs :: [Int]
fibs = 1 : 1 : [ a + b | (a, b) <- zip fibs (tail fibs)]

Drools Fibonacci implemented. This is not a good fit for my computer vision project.

-Sami Badawi

Saturday, January 5, 2008

Good resources about generating a Site and Documentation in Maven 2

The ShapeLogic project site is generated with Maven site generation tool.

I am trying to customize the ShapeLogic site a little more. Unfortunately the documentation for using Maven site generation tool is very spotty, but Eric Redmond has made 2 great resources about this:

His online Maven 2 book:
http://propellors.net/maven/book/site-generation.html

His blog entry:
過労死 Death by Overcoding: Generating a Site and Documentation in Maven

Now the whole ShapeLogic site is generated cleanly with this command:
mvn clean site
Before you manually had to copy the directories: images and css

-Sami Badawi

Wednesday, January 2, 2008

ShapeLogic for general declarative programming

ShapeLogic 0.9 is going to be the first release where users define rule sets read from either databases or flat files. I have decided to broaden the scope of the project.

First I hoped that I would be able to make minor adjustment to ShapeLogic's letter match example so that it would read the rules from an external source and gradually expand. But since backward compatibility is not a problem yet, I came to the conclusion that it was better to try to construct a solid foundation for declarative programming. While it will be primarily geared toward computer vision problem solving, the system will have other, broader, applications as well.

Do we need another system for declarative programming?


SQL rules supreme in the field of simple data, but once you go outside this field there are many different directions that you can take.

The declarative logic programming system that I had highest expectations for was CYC. It is a large, hand coded knowledge base trying to capture common sense, using many different inference techniques. Strangely, it never got a big following even after they released part of the engine as open source.

I hope to find a sweet spot where a simple system will be able to do some real work and be simple to learn. My work is geared toward a domain where there are a limited number of objects with somewhat constant features.

A few of the changes in ShapeLogic 0.9


Lazy data streams


Implement lazy data streams like Scala or Scheme.

Decouple annotation


Make the annotation of shapes more loosely coupled to classes in ShapeLogic. I was not super happy with the first solution that I came up with of how to annotate point, lines and polygon.
If ShapeLogic should be used for more general problem this is completely unacceptable.

Better interpreter


I order to save user defined rule in a database or a flat file, I need to be able to compile or interpret this code. This code in the rules should mainly be fairly straightforward.

The top contenders are:


  • Java 6 Scripting interface, that should give access to different scripting languages for the JDK

  • Java 6 Compiler interface, the problem is that Java is pretty verbose

  • Groovy intepreter

  • Do the parsing in ShapeLogic and write a little interpreter

  • Stick with JEXL for now, and make the move in a later release



-Sami Badawi