Wednesday, December 26, 2007

Language and tutorials for the external rule base

Language


I have spent a couple of weeks thinking about how to organize the external rule database for ShapeLogic, and have done more reading about Java 6 Scripting. That should be adequate for my needs, but I am reluctant to add more dependencies to ShapeLogic than absolutely necessary.

I am now debating what scripting language would be best for Java 6 Scripting:
Groovy: Maybe the strongest contender. I have tried Groovy twice before, only to find that it was not ready for prime time yet, but maybe with version 1.5 it is finally there.
BeanShell 2: It has been in beta for over 2 years and does not seem to be in active development.
Jython: I have been a big fan of Python for almost 10 years now, but the whitespace indentation does not work so well with code stored in a database or flat file.
JavaScript/Rhino: I like it and people know it, but it would be better if it were a language that used native Java types.

Tutorial


After seeing a 20 minute screencast for Ruby on Rails by David Heinemeier Hansson, my new test for whether a programming library or language is worth spending time on is whether it has a 20 minute screencast where they can do something non-trivial. I do not adhere to this rigorously.
Given that I have a thicker Danish accent than David Hansson, I have been looking for other options.
One of my friends, Joe Orr (see Joe's blog 3DTree Notebook), has created a very interesting alternative to screencasts called Screenbook Maker. It is a program that takes screenshots of a demonstration and adds text to them, turning the demonstration into a searchable tutorial.
Joe has promised to help me make a Screenbook tutorial when the external rule database is released.

-Sami Badawi

Wednesday, December 19, 2007

Declarative programming using Java 6 Scripting

I am working on moving the declarative programming in ShapeLogic into an external rule database now.

Currently the rules in ShapeLogic are parsed from strings using Apache Commons JEXL library.
So the letter A would have a rule saying:
polygon.holeCount == 1

This is not trivial, since a variable such as polygon.holeCount could have different values in different contexts.
E.g. if there was a choice of 2 different threshold levels, then in one part of the choice tree we could have
polygon.holeCount == 1 and in another we could have
polygon.holeCount == 2.

I am considering changing from JEXL to Java 6 Scripting instead.
JEXL has not had a release for over 1 year, and it is a little awkward to handle static fields and functions with it.
It might also be better to let the user choose what scripting language they want to use.
Currently these languages should be available for scripting: JavaScript, BeanShell, Jython, Groovy and JRuby.
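
To make the idea concrete, here is a minimal sketch of how a rule could be evaluated through Java 6 Scripting (the javax.script package). The class name ScriptedRule, the method names and the simplified variable name holeCount are made up for illustration and are not part of ShapeLogic. Which languages show up depends on which script engines are installed for the JVM, so the sketch returns null when the requested language is not found.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.Bindings;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper, not ShapeLogic code.
class ScriptedRule {

    // Names of the scripting languages this JVM can find at run time.
    static List<String> availableLanguages() {
        List<String> names = new ArrayList<String>();
        for (ScriptEngineFactory factory : new ScriptEngineManager().getEngineFactories()) {
            names.add(factory.getLanguageName());
        }
        return names;
    }

    // Evaluates a rule string with holeCount bound as a script variable.
    // Returns null if no engine is installed for the requested language.
    static Object evaluate(String language, String rule, int holeCount) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(language);
        if (engine == null) {
            return null;
        }
        Bindings bindings = engine.createBindings();
        bindings.put("holeCount", holeCount);
        try {
            return engine.eval(rule, bindings);
        } catch (ScriptException e) {
            throw new IllegalArgumentException("Could not evaluate rule: " + rule, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(availableLanguages());
        System.out.println(evaluate("JavaScript", "holeCount == 1", 1));
    }
}
```

Passing a fresh Bindings object to each eval call is one way to keep the variables of one rule evaluation apart from the engine's global scope.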

One issue is that I cannot just use variable binding in a global scripting context.
In my example from above, if the variable polygon.holeCount does not exist in the top context, I will have to make sure that it is taken from the right context. This was relatively easy in JEXL, since a context there is mainly just a map that you store your key-value pairs in. I am not sure whether this becomes a problem when you are dealing with a whole dynamic scripting language. I am also a little concerned about performance.
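
The kind of lookup I have in mind can be sketched in plain Java as a map with an optional parent, where a name is resolved in the nearest context that defines it. This is only an illustration under my own assumptions: RuleContext and its methods are hypothetical names, not the JEXL or ShapeLogic API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a context is a map of key-value pairs plus a parent.
class RuleContext {
    private final Map<String, Object> values = new HashMap<String, Object>();
    private final RuleContext parent; // null for the top context

    RuleContext(RuleContext parent) {
        this.parent = parent;
    }

    void put(String name, Object value) {
        values.put(name, value);
    }

    // Look in this context first, then walk up toward the top context.
    // Returns null if no context on the way up defines the name.
    Object lookup(String name) {
        for (RuleContext context = this; context != null; context = context.parent) {
            if (context.values.containsKey(name)) {
                return context.values.get(name);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RuleContext top = new RuleContext(null);
        RuleContext firstThreshold = new RuleContext(top);
        RuleContext secondThreshold = new RuleContext(top);
        firstThreshold.put("polygon.holeCount", 1);
        secondThreshold.put("polygon.holeCount", 2);

        System.out.println(firstThreshold.lookup("polygon.holeCount"));  // prints 1
        System.out.println(secondThreshold.lookup("polygon.holeCount")); // prints 2
        System.out.println(top.lookup("polygon.holeCount"));             // prints null
    }
}
```

Each branch of the choice tree gets its own child context, so the same rule string can see a different polygon.holeCount depending on where it is evaluated.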

I might make a release of ShapeLogic 0.9 where you can just select another rule database stored in a flat file or a database, but still using the current system, in order not to drag the next release out too long. This should allow users to define rules for matching a separate alphabet, say Greek.
But it is far from what I want ShapeLogic to be able to do.

-Sami Badawi

Tuesday, December 18, 2007

Declarative and Object Oriented programming

One of the main objectives for ShapeLogic is to make a good hybrid of Declarative programming and Object Oriented programming. This is not specific to computer vision, but is a general problem. It is a daunting task: many people have tried, and the state of the art still leaves a lot to be desired.

I have started to work on the first release of ShapeLogic with an external rule based engine, but I have not worked through all the problems yet. I think that this is a case of evolutionary programming, where you have to try out an approach without knowing if it will lead to anything useful. Hopefully I will have ShapeLogic 0.9 ready pretty soon.

I think that the key is to keep it simple and keep the syntax easy to work with. Let me just give my 2 cents on a few projects that combine Declarative programming and Object Oriented programming.

Approaches that impressed me


Prova is a Java Prolog hybrid. I was very impressed by the simplicity and how well it managed to integrate queries with normal database access. Unfortunately I do not think that Prolog is applicable to the approach to computer vision that I am pursuing in ShapeLogic now.

List Comprehension in Python and Haskell. It is somewhat limited, but it is very convenient to work with.

Microsoft LINQ: I think that it is great that you can use the same simple syntax to query databases, XML and collections.

Hibernate and ORM tools: While I do not think that the dust has settled yet as to how feasible they are for production systems with large databases, I think they are very promising. This was the reason that I included Hibernate in ShapeLogic, despite it not being used much yet.

Promising approaches that I found hard to work with


Drools: An open source RETE engine for the JVM. It comes with a lot of cool features, but I thought that the example program that sets up a rule to calculate Fibonacci numbers was too complicated.

OWL: Works with XML / RDF. It is a standard. It comes with good open source tools, but it just seems too heavyweight for my purpose.

-Sami Badawi