Tuesday, March 22, 2016

Static vs. Dynamic Functional Languages

You can divide functional programming languages into 2 groups: Static and dynamic.

Dynamic functional languages: Clojure, Common Lisp, Racket and Scheme. They have few types often only known at run time.

Statically typed functional languages: F#, ML, Haskell, Idris and Scala. They have advanced types that are known at compile time.

Functional programming languages have similarities but are very different from one another. Some are quite hard to learn. What should you pick?

Slogans

Here are two extreme positions in functional programming reduced to slogans:

Lisp: Everything is Data

Lisp has a great story: Everything is data.

Lisp is homoiconic. There is one datatype: The S-expression in Lisp or an EDN in Clojure. This encodes:
  • Records
  • List
  • Map
  • Stream
  • Programs
Everything is unified and first class. This makes Lisp very elastic and adaptable to handle open ended problems like AI. It also leaves a lot of room for mistakes when dealing with complex data structures since everything sticks to everything.

Haskell: Everything is a Computation

Computation sounds like an equally strong unifying foundation. This is a strong counter argument to Lisp.

Haskell turns the world into mathematics by giving strong guarantees and the ability to reason about programs. It is fast, elegant and remarkably safe.

Most of the world is messy so programming in Haskell is both an art and a science.

A Type System is a Must For Production Code

In my experience:

A complex production server application demands a static type system for stability

The type system is doing at least half of my work when I work alone and prevents total anarchy when working in teams.

History of Static and Dynamic Type System

Like many other programmers, I have gone back and forth between preferring static and dynamic languages several times.

C++ and Java

I started using and loving the sophisticated statically typed object oriented languages: C++ and Java.
Why would anybody want to program in Basic?

Perl, Python and Ruby

At some point I had to make a small script for text processing. I realized that dynamically typed languages Perl, Python and Ruby are much simpler and faster to work with.
They borrow a lot of ideas form functional languages and saves you a lot of boiler plate. Programming became fun again.
I never wanted to go back.

F#, Scala and Haskell

Then came the raise of F#, Scala and Haskell.
I thought that you got the best of both worlds:
  • There are few visible types due to type inference
  • They look like dynamic language
  • Still you get strong safety from the invisible type system
  • They are fast
My stability concern for production application ruled out dynamic languages. The future belongs to F#, Scala and Haskell.

Living with Static Types

For the last 4 years I have been happy programming in Scala. It really improved my productivity.
I mainly deal with stable data types. Each data structure get immutable case class and they flow beautifully and it even works well in a concurrent system.

I am a little concerned about the amount of black magic going on at the type level in Scala and Haskell.

Web, Scripting and Data Exploration

Some fields continue to be dominated by dynamic languages:
  • Data exploration
  • Data science
  • Scripting
  • Web front end work in JavaScript, PHP and Ruby
I do data mining in Scala and can quickly add a new data source with unit tests to a stable functional reactive ingestion pipeline, but during a hackaton I had to explore a lot of different data sources and my normal startup time was too slow for the deadline.

Dynamic languages have an edge for small systems.

Problems Using Scala for NLP

Idiomatic Scala has been great for NLP.
I had to extract all the hidden and visible information on a html page and had to parse the DOM tree for everything: elements, attributes, code and json data.
The DOM tree is similar to an S-expression.
The best idiomatic Scala representation I could find was Play JSON. The DOM tree and Play JSON are not that similar and processing json in dynamic languages is more natural than in strongly typed languages.

Dynamic languages have an edge for some complex systems.

Lisp Revisited

I used Lisp in school. It was the cool AI language and my first functional language. I loved it, it blew my mind but I had a very shallow understanding.

Impressions from revisiting Lisp after using statically typed functional languages:
  • Lisp is small and elegant
  • Easy learning curve
  • Great at traversing dynamic data
  • Well suited for exploration
  • A lot of the principles of statically typed functional programming translate directly
  • I still think in Scala like types making my Clojure code better organized
  • Macros feel natural unlike in C++ and Scala
  • Lisp is really fluid combining in so many crazy ways
  • You lose a lot of safety

Going from Haskell to Clojure left me with the feeling I had when moving from C++ to Python. You get a lot of value for less effort.

Raise of the Gradual Type Systems

There has been slow movement towards gradual types. Here are a few place where they have popped up:

Ambrose Bonnaire-Sergeant on gradual typing in Clojure

The type systems in Typed Clojure and Typed Racket are pretty different than in Scala and Haskell. Generally weaker, but Typed Clojure and Typed Racket have union types that are only now investigated in Scala's experimental new type system Dotty.
These advances in gradual types make it possible to harden Lisp code to improve stability.

Data or Calculation

I was puzzled by the Lisp and Haskell slogans:
  • Everything is data
  • Everything is a calculation
It was a paradox. Which is a better foundation for computer science?
I could not easily dismiss either. For now I have accepted that we are stuck with both.

For a long time I suffered from the misunderstanding that F#, Scala and Haskell are like dynamic languages, with the addition of speed and safety. But they are fundamentally different.

4 comments:

Mark T. Kennedy said...

I'd like to read more about the 'Problems using Scala for HTML' part. of all the points made in your article, it is the one that left me confused.

Sami Badawi said...

Hi Mark,

I rewrote the paragraph: "Problems Using Scala for NLP".
Hope the new version is easier to understand than the old.

Thanks for the feedback,
-Sami

Unknown said...

Hi Sami,

I believe that the core of "everything is data" vs. "everything is code" is not necessarily a paradox, but a fact of the universality of turing machines.
https://en.wikipedia.org/wiki/Universal_Turing_machine
The fact that these machines exist makes it possible in the first place that we write "software", e.g. to encode a program as a piece of data that can then be loaded by a UTM to be executed.
We are "stuck with both", because both are two sides of the same coin.

Actually I remember the Lisp slogan as: "Code is data, data is code."

This now sounds simpler than it is, because when I think about problems I can think about it in the "data picture" or in the "computation picture", but somehow I still do not manage to deeply merge the two pictures in my head. It still feels like two worlds.

Not sure if this helps,
Christian

Sami Badawi said...

Hi Cristian,

Thanks for your comment.

You are right about universality of truing machine.
In category theory or topos theory I am sure that that code and data are isomorphic.

I would not rule out that 10 years from now we have languages where the distinction between static and dynamic have gone away.

I tend to put too many details in my posts, so I try to keep it simple.

Good to know about the Lisp slogan.

-Sami