Some results about algebraic graphs homomorphisms
Prelude
In my previous post “Rewrite rules and a specific fold: use optimization techniques from GHC.Base”, I described a technique to optimize composition of a custom fold and some functions on algebraic graphs, implemented in Haskell in the alga
library.
The goal of this post is to try to specify exactly what are the classes of functions we can optimize. I will try to do this in a “mathematical” way (don’t run, this is actually fun), with all my results proved in Coq, a very cool proof assistant (it is a first time for me, please be indulgent with my code :) ).
All the work can be found at https://github.com/nobrakal/coq-alga/.
The version of this repositiory when I wrote this article can be found at https://github.com/nobrakal/coq-alga/tree/4586bda0847b059b263cc4a1c8685004c9461d6c.
Alga theory
I had to implement some of the already known algebraic graphs theory in Coq (already know because proved in agda
on a dedicated repository).
All of this work is in the src/Graph.v
file:
- The data structure, an inductive one called
Graph
. - Some standards functions (
foldg
,isEmpty
…). - Useful lemmas on them.
- Axioms on the graph equality (in a Coq class called
EqG
). - Indications for Coq to allow rewriting (using the
rewrite
tactic) when working with equal graphs. - Some proofs about the graph equality (especially, that
Empty
is the identity forOverlay
).
A note on equality
Here the =
symbol is ambiguous.
So we will distinguish:
a = b
for the structural equality (a.k.a. Leibniz equality) betweena
andb
.R a b
for graph equality relation betweena
andb
(note that the axioms of this relation are satisfied by others relations than graph equality, so things are not so simple).
Homomorphism
The work described here can be found in the src/Homomorphism.v
file.
Graph homomorphism
An homomorphism
in algebra theory is (quoting Wikipedia) “a structure-preserving map between two algebraic structures of the same type”.
For us, a correct homomorphism f :: Graph a -> Graph b
should satisfy:
f Empty = Empty
f (Overlay a b) = Overlay (f a) (f b)
f (Connect a b) = Connect (f a) (f b)
This is encoded in a Coq class called Homomorphism
:
Class Homomorphism {A B:Type} (f : Graph A -> Graph B) : Prop := {
Hom_Empty :> f Empty = Empty;
Hom_Overlay :> forall a b, f (Overlay a b) = Overlay (f a) (f b) ;
Hom_Connect :> forall a b, f (Connect a b) = Connect (f a) (f b)
}.
Equivalence with bind-functions
Let us recall briefly the concerned Haskell functions:
data Graph a = Empty
| Vertex a
| Overlay (Graph a) (Graph a)
| Connect (Graph a) (Graph a)
foldg :: b -> (a -> b) -> (b -> b -> b) -> (b -> b -> b) -> Graph a -> b
foldg e v o c = go
where
go Empty = e
go (Vertex x ) = v x
go (Overlay x y) = o (go x) (go y)
go (Connect x y) = c (go x) (go y)
(>>=) :: Graph a -> (a -> Graph b) -> Graph b
g >>= f = foldg Empty f Overlay Connect g
A cool result is that such homomorphisms are exactly bind-functions, that is to say any graph homomorphism can be expressed with the (>>=)
operator.
Proof
So any function expressed with (>>=)
is clearly a graph homomorphism (this is proved in the bind_is_hom
lemma).
A fun fact is that the reciprocal is also true, like shown in the hom_is_bind
lemma:
Lemma hom_is_bind A B (hom: Graph A -> Graph B):
Homomorphism hom -> hom = bind (hom ∘ Vertex).
That is, an homomorphism is equal to bind
with its definition for a vertex. A graph homomorphism is indeed entirely defined by what it is doing on a leaf of the graph expression.
Examples
fmap f
is a graph homomorphism, usingVertex . f
asbind
argument.removeVertex v
is a graph homomorphism, using\x -> if v == x then Vertex x else Empty
asbind
argument.
Theorem 1: Composition of a graph homomorphism and a foldg function
Any graph homomorphism can be fused with a foldg-function, like show in the foldg_bind
theorem:
Theorem foldg_bind A B C (e : C) (v: B -> C)
(o:C -> C -> C) (c:C -> C -> C) (f_v : A -> Graph B):
foldg e v o c ∘ bind f_v = foldg e (foldg e v o c ∘ f_v) o c.
Basically, because a graph homomorphism preserve the structure, it can easily be fused with something that destruct this structure: this is shortcut deforestation.
Theorem 2: Composition of graph homomorphisms
A cool corollary of the previous theorem is that the composition of two graph homomorphisms is again a graph homomorphism.
Proof
This is proved in the bind_compo
theorem:
Theorem bind_compo A B C (hom1 : Graph A -> Graph B) (hom2 : Graph B -> Graph C):
(Homomorphism hom1) /\ (Homomorphism hom2) ->
hom2 ∘ hom1 = bind (hom2 ∘ hom1 ∘ Vertex).
Problems
So graph homomorphisms are cool, but not always what we want:
in the examples of graph homomorphisms, removeVertex
is leaving empty leaves in the structure. If it was removing them (like the actual removeVertex
in the alga
library), it will change the graph structure, and thus will not be a graph homomorphism as we defined it.
Such functions are very useful, so we must lower our restrictions and only require preservation of the graph equality to include them.
Reduced graph homomorphism
Theorems of this section are in the src/ReducedHomo.v
file.
Definition
A function f :: Graph a -> Graph b
is called a reduced graph homomorphism if:
R (f Empty) Empty
R (f (Overlay a b)) (Overlay (f a) (f b))
R (f (Connect a b)) (Connect (f a) (f b))
So we require only that the function has the same properties than a graph homomorphism, but with graph equality instead of the structural one.
This idea is encapsulated in the Reduced_hom
Coq class:
Class Reduced_hom {A B : Type} (R: relation (Graph B)) (f : Graph A -> Graph B) : Prop :=
{
RHom_EqG :> EqG B R;
RHom_Empty :> R (f Empty) Empty ;
RHom_Overlay :> forall a b, R (f (Overlay a b)) (Overlay (f a) (f b)) ;
RHom_Connect :> forall a b, R (f (Connect a b)) (Connect (f a) (f b))
}.
Theorem 3: Reduced graph homomorphism are indeed reduced graph homomorphism
Graph homomorphism are reduced graph homomorphism.
The reverse is false.
Proof
- “graph homomorphism are reduced graph homomorphism.”
This is proved in the hom_is_reduced_hom
theorem.
The idea is only to use the fact that, for all a b :: Graph A
, if a = b
then R a b
.
- “The reverse is false”.
Using
wforget :: Graph a -> Graph b wforget _ = Empty
wforget
is clearly a reduced homomorphism (this is proved in the const_empty_is_reduced_hom
theorem).
But it is also clearly not a graph homorphism, this is proved by the absurd in the const_empty_is_not_hom
theorem.
The case of empty leaves: a class of reduced graph homomorphism
Theorems of this section are in the src/SmartHomo.v
file.
Prelude
-- Remove empty leaves around a graph operator (usually Overlay or Connect)
kSimpl :: (Graph a -> Graph a -> Graph a) -> Graph a -> Graph a -> Graph a
kSimpl c x y =
if isEmpty x
then if isEmpty y
then Empty
else y
else if isEmpty y
then x
else c x y
-- Remove empty leaves of a graph
dropEmpty :: Graph a -> Graph a
dropEmpty = foldg Empty Vertex (kSimpl Overlay) (kSimpl Connect)
Definition
A function f :: Graph a -> Graph b
is called a smart graph homomorphism if it can be described using f_w :: a -> Graph b
such that
f g = dropEmpty (g >>= f_w)
Obviously, such a function can go through the graph only once, using
f = foldg Empty f_w (kSimpl Overlay) (kSimpl Connect)
This is proved in the smart_hom_single
theorem.
Example
induce
is a smart graph homomorphism (maybe the definition was made for it…):
induce :: (p -> Bool) -> Graph a -> Graph a
induce p =
foldg
Empty
(\x -> if p x then Vertex x else Empty)
(kSimpl Overlay)
(kSimpl Connect)
This is proved in the induce_smart_hom
theorem.
Theorem 4: smart graph homomorphisms are reduced homomorphisms.
This seems clear because empty leaves does not play any role in graph equality, but it takes me anyway 44 lines to prove it.
It is proved in the smart_hom_is_reduced_hom
theorem:
Theorem smart_hom_is_reduced_hom A B (R: relation (Graph B))
(f : Graph A -> Graph B) :
EqG B R -> Smart_hom f -> Reduced_hom R f.
Theorem 5: Composition of smart graph homomorphisms
The composition of two smart graph homomorphisms is again a smart graph homomorphism.
Proof
This was hard to prove because not clear at first glance (well mostly, it was long, using 280 lines ^^). The idea is to see that the composition of two smart graph homomorphisms is something not so strange:
Theorem smart_hom_compo A B C (s1 : Graph A -> Graph B)
(s2 : Graph B -> Graph C): (Smart_hom s1) /\ (Smart_hom s2) ->
s2 ∘ s1 = foldg Empty (s2 ∘ s1 ∘ Vertex) (kSimpl Overlay) (kSimpl Connect).
(The proof is here).
Then see that this form allow us to prove that the composition is again a smart graph morphism:
Theorem smart_hom_compo_is_smart A B C (s1 : Graph A -> Graph B)
(s2 : Graph B -> Graph C):
(Smart_hom s1) /\ (Smart_hom s2) -> Smart_hom (s2 ∘ s1).
(The proof is here).
Conclusion
So all this work to prove that we can optimize anything like:
foldg e v o c . morph1 . morph2 . [...] . morphn
to a function that will only go through the graph once.
if morph1
, morph2
, …, morphn
are graph homomorphisms (theorem 1 and theorem 2).
We can also optimize any combination of smart graph homomorphisms (theorem 5).
What is left
There is certainly some things to do playing with these results:
- Find a cool (and efficient) implementation for the
alga
library. - The composition of smart graph homomorphisms is preserving graph structure, maybe there is an efficient one that is only preserving graph equality ?
- Determine precisely the class of reduced homomorphisms. That is to say, is there an equivalence with a well known class of functions like for graph homomorphisms ?
- What is the class of reduced homomorphisms that we can compose ?
- Can this be generalised to arbitrary algebraic data types that have some notion of an Empty constructor?
- And certainly many other things :)
Thanks
Thanks to Andrey Mokhov for his review of this article.