## Introduction

Algebraic graphs are a wonderful structure to work with, and especially to learn optimization techniques!

In this article the goal is to optimize, at compile-time, composition of some paramorphisms and homomorphisms of graphs (don’t worry with these barbaric terms, all of this is discussed below).

If you never approached algebraic graphs, I suggest that you read the original paper by A. Mokhov, my tutorial or just the documentation on Hackage of Alga, the Haskell library that implements the idea.

## The problem

### Shortcut deforestation

This process is introduced in “A Short Cut to deforestation” (by A. Gill, J. Launchbury and S. Peyton Jones). Basically it is the idea that one can remove intermediate structures when working with “good functions” (functions that are preserving the structure of the object on which they are working on).

For example, the expression

builds an intermediate (and unneeded) list: `map` will build a new list that will be destructed right away by `foldr`. It can be optimized (and GHC does it) to

Note that this is possible only because `map` preserves the list structure.

On algebraic graphs, we have an equivalent to `foldr` named `foldg`:

Consider the following expression

where `f :: Graph a -> Graph b` is an operation that preserves the structure of the graph (mathematicians like to call this an homomorphism) and `x` is a graph.

We are pretty much in the same situation as for lists: indeed, the above expression currently goes through the structure two times (one for `f`, the other for `foldg`). The goal is to rewrite it to an equivalent expression that goes through the graph only once, using the fact that `f` is an homomorphism.

### Implementation for algebraic graphs

I already wrote about how I implemented this technique for `foldg` using GHC’s rewrite rules. I suggest that you to read it to understand better the following, even if I will try to quickly summarize it just below.

### Quick reminder

The basic idea is to view a graph homomorphism as a function transforming arguments passed to `foldg` (which are generally graph constructors).

First of all, we need to remember a cool result (proved in this blog): graphs homomorphisms are exactly functions that are made with `bind`, which is written for graphs as:

So we only have to optimize something like:

The question is: how to represent “functionally” the `bind` operation? Remember, if we can represent `bind` as a function operating on `foldg` arguments, we can easily fuse it with an actual `foldg`-function (imitating what GHC does for lists).

First, there is a type that encapsulate this notion:

Then we can use these rules:

It effectively transforms any occurrence of `bind` into a “function-like” foldg, that transform its arguments.

Then if we encounter an expression like `foldg e v o c . buildR`, we can optimize the composition using `"foldg/buildR"`!

If there was nothing to optimize, things are rewritten back to their normal form with the inlining of `buildR`.

### hasEdge

We can now optimize the composition of several functions of the Alga library.

Let me introduce a killjoy: the `hasEdge` function, that test if an edge is present in the graph. It is actually defined as:

The algorithm is pretty simple:

• There is of course no edge in an `Empty` graph, so you `Miss` it.
• If you `hit` a `Vertex` that correspond to the `Tail` of the edge, you say it, else you `Miss`.
• For an `Overlay` node:
• If you missed it in the left part, the edge can only be in the right one.
• If you hit a `Tail` in the left part, maybe there is the whole edge in the right part, but at least you hit the `Tail`.
• If you hit an `Edge`, you say it!
• For a `Connect` node,
• If you missed it in the left part, the edge can only be in the right one.
• If you hit a `Tail` in the left part:
• If the right part contains the other part of the edge, you found the whole `Edge`.
• Otherwise, you only have its `Tail`.
• If you hit an `Edge`, you say it!

#### Observations

`hasEdge` seems a bit like a foldg-function but not totally. The problem come with the call to `hasVertex` on an unconsumed part of the graph in the middle of the process.

Our actual rules will not optimize `hasEdge (x >>= f)`, since our `hasEdge` is not expressed with `foldg`.

`hasEdge` seems anyway not so extraordinary and it is easy to see that it can be indeed optimized when used with `fmap` to traverse the graph only once. For example `hasEdge s t (fmap f g)` is equivalent to:

Note that in this expression, the `fmap` operation was incorporated in `go (Vertex x)` and in the `hasVertex` call (which was also inlined).

Can the theory help us to optimize it automatically? And even better, for any function like `hasEdge`, that are looking almost like `foldg` but not?

## More theory

### Of bananas and barbed wire

The great article “Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire” of E. Meijer, M. Fokkinga and R. Paterson, describes some recursion operators; here, we will focus on two major: catamorphisms (written between bananas bracket in the article) and paramorphisms (written between barbed wire bracket in the article). If you haven’t read it, it is really worth it!

#### Catamorphisms

Catamorphism (which comes from Greek “κατα” meaning “downwards”, “down from”) is a term to describe functions that:

• In functional programming, generalize the `fold` function, that is, on a list, a function of type `cata :: (a -> b -> b) -> b -> [a] -> b`. `cata` take a combination function, a base case, and fold the structure accordingly.
• From a mathematical point of view, it (quoting the great Wikipedia) “denotes the unique homomorphism from an initial algebra into some other algebra”. Here, the important word is, I think, “homomorphism”: being a homomorphism, a catamoprhism will preserve the underlying algebraic structure.

You already know `cata` for lists; as a clever reader has already noticed from its type, `cata` for lists is nothing more than `foldr` in disguise:

Did you guess what is the function that express a catamorphism on graph? It is `foldg`!

#### Paramorphisms

Quoting again Wikipedia: “In formal methods of computer science, a paramorphism (from Greek παρά, meaning “close together”) is an extension of the concept of catamorphism […] to deal with a form which ‘eats its argument and keeps it too’”.

The idea is to extend the combination function: you can allow it to reason on the structure itself. When you are doing recursion, you sometimes want to inspect (or directly use) what is left of the structure (note that you cannot simply do that in a catamorphism (since this argument is not available to the combination function)).

For example, again on lists, you will have something like: `para :: (a -> ([a], b) -> b) -> b -> [a] -> b`. The combination function can use the not already consumed list.

But why can I need such a function ? `foldr` does already anything I want!

Indeed `foldr` is equivalent (in term of expressivity) to `para`. This is nicely explained in this stackoverflow answer. Basically, you can define `foldr` with `para` and you can define `para` from `foldr`. But if you do it with `para'` (again from the mentioned stackoverflow answer):

`para'` will be slower than the one we defined: it builds an exact copy of the list as it traverses it.

So the answer to the above question is efficiency: `para` allows you to make more efficient functions, and when you work with graphs (and when you simply work), you just want the more efficient implementation!

#### Paramorphisms for graphs

One can make paramorphisms for graphs by changing a bit `foldg` (and making more complex its signature):

(Thanks Andrey for the great name :) ).

### Back to `hasEdge`

You guessed it, `hasEdge` is a paramorphism!

Indeed we can write `hasEdge` using `paragraph`:

### The goal

Our current goal is to optimize

to something that traverses the original graph `x` only once. If we achieve this, it will solve our problem with `hasEdge` and for every function expressed with `paragraph`.

#### Is this even possible?

A legitimate question is: can we indeed do such a thing? Is not `paragraph` too general to allow fusion?

This is not so simple and I did not found an optimization that works for every paramorphism (and I don’t think it is possible), but only for a subclass of them.

I proved in Coq the following theorem (that is valid for every paramorphism):

(`∘` stands for functional composition in `Coq`).

With the right part of the equality, you will traverse the graph once using `paragraph` until you find a vertex, and then you will apply your “binded” function and traverse the result with `paragraph` again.

The only “little” problem is with `apply2L`. Indeed, you need to change the last arguments of your paramorphism to reflect the bind in the unconsumed part of the graph.

You can view this by a duplication of the `bind` operation: it gets incorporated in the `paragraph` expression but also for the two lasts arguments of the combination function. This leads to a simple problem: you don’t want to rewrite functions that use both the unconsumed graph and the recursive call. If you allow this, the rewriting will lead to a double computation of the “binded” function (one for the recursive call, and one for the delayed `bind` on the unconsumed arguments).

We need to ensure that we rewrite only functions that make the promise to use either the recursive call or the unconsumed graph. Note that `hasEdge` fulfill this promise, so all is well!

## Implementation

### Promises need to be kept

I don’t like “function that make promises”, so we need to encode the promise “I either use the recursive call or the unconsumed graph” in types.

#### The not working but elegant version

Ideally, this kind of function can be encoded with:

First of all, this indeed encode the promise, almost literally: your function either use the left recursive call or the left unconsumed part and then produce a function that either use the right recursive call or the unconsumed right graph.

But with this definition comes a problem: the second argument of the combination function is “hidden” in another function. For example, with such a definition, the only way to define `apply2L` is something like:

Your only choice (and this is logical) is to apply first the function (either on the recursive call of the unconsumed graph) and discuss its result. Sadly, this way will lead to a problem: one cannot fuse any operation (like `bind`) on the unconsumed right part of the graph. GHC is “blinded” by the function encoding and cannot inline what the returned function (`f'`) really is (since its depends at run-time on the result of `f`).

A solution to this problem is to write explicitly the combination function, giving the user only the control on some parameter.

#### A working solution

Here is a solution, not very classy, but working (if you find something better, don’t hesitate to tell me) :

You have three possibilities:

• `B`: it is a classic combination function (like the one we can find in `foldg`).
• `L`: you inspect the left recursive call using a predicate `b -> Bool`, decide what to do, either call the recursion on the right part or inspect the right unconsumed part.
• `R`: the same, but you inspect first the right recursive call.

Here is `switchCombi` that effectively transform this data into a combination function for `paragraph`

Note that providing a constructor for the case `Graph a -> Graph a -> b` is not very interesting, since it does not use any recursive calls.

Thus we can define `paragraphR`:

Then it is easy to define `hasEdge` using `paragraphR` :

To adapt this for `paragraphR` we need to change the previous encoding of `Foldg` with

That is a structure that represent exactly the type of `paragraphR` (like `Foldg` represented `foldg`).

Given a function into its `Paragraph` form, it is easy to get the graph it represents:

### Bind arguments

Some problems come with the choices we made, lets see directly the rules and some auxilliary functions (don’t run away!):

Far more scary than the previous ones, isn’t it?

Let me explain what are doing the rules:

• `"buildR/bindR"` is converting a `bind` into its `buildR` equivalent (it is, up to some names, exactly the previous theorem).
• `"foldg/buildR"` assure the compatibility with the previous work: you can always fuse a `foldg` on the left of a `bind`!
• `"paragraphR/buildR"` is the main rule and fuse a `paragraphR` on the left of a `bind`.

The complication comes from the fact that we fuse with `bind`. Because we allow the combination function of `paragraph` to reason on the graph itself, we will have to remember the `bind` operation that needs to be made on the “not already consumed argument” of our paramorphism (more explicitly, we are delaying the bind operation). This is done by `apply2LR` which apply `bind` to the two lasts arguments.

Even with this, the above rules seems a lot more complicated than necessary.

#### Warning: cycle

Why do you use buildBindR and not directly bind?

Because there is a little problem. The `"buildR/bindR"` rules is precisely rewriting `bind` occurrences, so using it directly in the right part of the equality will lead to a rewriting cycle.

BUT

We need this `bind` and we need it in a “re-writable form”. If you look at `hasEdge`, you see that it uses `hasVertex` on the unconsumed part of the graph, and this function can be fused with `bind`. So, ideally, standard shortcut deforestation will also happen here, directly on the argument of combination function for `Connect`.

The only possibility is to use `buildBindR`, which is only `bind` is its `buildR` form. We use the direct code of `bind` for the arguments, because we do not want to come into a rewriting cycle.

## Conclusion

### Is all of this working?

Yes! Using `inspection-testing` (a great tool that compare the code produced by GHC at compile-time), the following code compiles:

What happened?

The `fmap` was entirely fused with `hasEdge`. By entirely, I mean not only the test on vertices, but also the composition of `hasVertex` and `fmap` happened!

More precisely, the `fmap` operation is made only on the needed part of the graph. That is, if your graph is like `Overlay a b` where the desired edge is on the `a` part, the `b` part of the graph will not be evaluated (and particularly not the `fmap` operation on this part).

### What remains to be done

• Add list optimizations: a classic way to create an algebraic graph is to use a list (an adjacency list for example). For example `edges :: [(a,a)] -> Graph a` that make an algebraic graph from a list of edges. We already have some rewrite rules to optimize expressions like `foldg e v o c (edges xs)` (because we know how the graph is made, we can avoid building an intermediate graph). Is this possible with `paragraph`?

• Alga: all I described here is already implemented on my fork of Alga, but there is certainly some cleanup and verifications to make it available in the official Alga repository!

• Proofs: I only proved that we can optimize things. It will be nice to actually prove that the rewriting process is correct (that is: we rewrite things only to equivalent ones).