Application of Category Theory
to Computer Science

This is a list of miscellaneous applications of Category Theory to Computer Science, mostly based on examples given during the lecture [7] but also on some ideas I found elsewhere. The references that I used to explain or imagine these applications are given at the end of the document.

1. Concurrency ([7])

We have seen an overview of the category of transition systems TS in exercise A3. A transition system can be used to describe concurrency of parallel programs: the states of the set of processes (states) and the way it may evolve (transitions). Another example has been mentioned during the lecture: the synchronization trees ST. It is a subcategory of TS, where we restrict ourself to graphs of State/Transition that are trees. This is a very specific kind of concurrency, where the set of processes may move to a new state but never come back to an old one.

A Synchronization Tree 1 i 2 s 1->2 a 3 1->3 4 1->4 5 2->5 6 2->6 7 3->7 8 3->8 9 3->9 10 4->10 A synchronization tree

One of a very widespread model for concurrency is given by Petri Nets. They can be represented by an oriented graph, where we distinguish two kinds of nodes, square and circle shaped. There are small balls that can move in the circle nodes and split/merge with respect to some particular rules. The position of the balls corresponds to the states of the system and the restriction on their movements prevent conflicts between the concurrent processes.

A Petri Net 1 A A 1->A 2 B B 2->B 3 C C 3->C 4 D D 4->D 5 5->D 6 A->2 A->3 B->4 C->5 D->6 A petri net that describes parallel execution of B and C

Again, Petri Nets are objects of a category and we may find appropriate morphisms between Petri Nets as arrows. There are several other concurrency models with their own characteristics (scope of application, semantics...). Category Theory provides a uniform framework that is very convenient.

2. Functional programming ([6])

Suppose we have a pipeline of functions, each one using two operations: return that puts a value into a container and bind that extracts the value and sends it to the successor in the pipeline. In functionnal programming, there exists an abstract data structure used to transmit information between the functions. This structure is called a "monad" because the two operations roughly correspond to the unit and the counit of a monad in Category Theory. As an example, consider the category Set and the powerset functor ℘ defined by ( f ) ( X ) = f ( x ) x X . One can verify that this functor defines a monad on Set with unit and counit:

η S : S ( S ) x { x } and μ S : ( ( S ) ) ( S ) X x X x

From these functions we get the two operations mentioned above:

return: A → PA
        x ↦ ηA(x)

  bind: PA × (A → PB) → PB
        (X ,    f   ) ↦ μB({fx, x∈X})

Here, the return operation allows to pack an element x of "type" A in a set {x}. The bind operation unpacks a set of elements of type A, applies an operation f to these objects and transmits a new set of elements of "type" B. The choice of the function f and the generalization to other monad functors than the powerset provides a wide range of computation methods for functionnal programming.

3. Adjoint functors and Monads ([7])

Adjoint functors and Monads are two fundamental concepts of Category Theory that have many applications in Computer Science. We have already seen an application of monad in the previous section, so let us look at the related concept of adjunction. Consider the categories Set and Mon. We have an obvious "forgetful" functor G from Mon to Set.

Adjunction between Set and Mon 1 Set 2 Mon 1->2 F 2->1 G

In Computer Science, we often have a set of symbols Σ and we build the language of all the words over this alphabet Σ * . Doing so, we get a monoïd with the empty word ε as the identity and concatenation as internal law. This transformation F : Σ Σ * is actually a functor which is adjoint to G. Another example is given by the category TS and ST of the first section. The functor inclusion and the "unfolding" of transition systems form an adjunction!

4. Complete Partial Order ([7])

A partial order is said to be complete if every increasing sequence of elements d 0 d 1 d 2 . . . has a least upper bound sup i ω d i . We can endow the class of complete partial order with a structure of category as follows: the objects are the CPO's and the arrows are the functions that are increasing and continuous (in the sense that f ( sup i ω d i ) = sup i ω f ( d i ) ). CPO are very common in Computer Science and Category Theory gives a nice picture that permits to understand and solve many problems. For instance, one often considers a structure of CPO ( D , ) and wants to find a fixed point of a given function f from D to itself. If f is an arrow for the category previously defined and D additionnaly has a bottom element ⊥ then it's easy to show that fixf = sup i ω ( f i ( ) ) is such a fixed point.


5. Construction of a model of λ-calculus ([2], [7])

In λ-calculus, objects behave both as functions and arguments. Hence in order to construct a model D of λ-calculus, we need a way to "identify" elements of D with functions from D to itself. If D has several elements, then there is no one-to-one map from D onto D D (by Cantor's theorem). However, only a limited subset of the functions from D to itself is really interesting in the context of Computer Science. In 1969, Scott was able to produce such an "identification" by restricting to continuous functions in the category of complete lattices. Another approach is to choose the continuous function for the CPO category described above. In any case, we get a map θ : D D D . Then, for all x , y D , x ( y ) can be understood as θ ( x ) ( y ) .

6. The Presheaf Category ([7])

We saw that it is possible to define a category [ C op , Set ] with functor as objects and natural transformations as arrows. Let us first consider the case where Set is replaced by the category 2 = ( { 0 , 1 } , { 0 0 , 0 1 , 1 1 } ) . An object of [ C op , 2 ] is a functor that assigns to each element c C a value 0 or 1, so can be viewed as the characteristic function of a subset of C. Note that such a functor must "preserve" arrows in C op , so we have an additional control on the subsets of C. Moreover, we can replace the boolean values { 0 , 1 } by any other category so that we can refine the description of "appartenance" to a subset. Consider for example "fuzzy logic" where we would replace { 0 , 1 } by the interval [ 0 , 1] to get a probability of "appartenance".

Words of {a, b}* 1 ε 2 a 1->2 3 b 1->3 4 aa 2->4 5 ab 2->5 6 ba 3->6 7 bb 3->7 Saa Sa

To illustrate how this generalization can be used in computer science, let C = { a , b } * with arrows that indicate extension of words. Then we can assign to each word w the set S w of the ways to "generate" it. A word w ' that extends another one w is going to use the same ways to generate and probably additional ones, so we should have a function from S w ' to S w . But this exactly means that F : w S w is an object of [ C op , Set ] !

7. Categorical Logic ([5], [7])

Logic is an essential tool of Computer Science in various fields such that formal methods, automated theorem proving or artificial intelligence. The different applications may not have all the same requirements and consequently several kinds of logic have been invented. One benefit of categorical logic is to permit dealing with all these logical systems in a unified fashion.

In order to give an example where Category Theory is useful in logic, let's consider the definition of Kripke semantics. One can find this kind of representation in modal logics, intuitionistic logic or even in the method of Forcing in Set Theory. We have a category where the objects and the arrows are respectively "the worlds" and "the accessibility relation" in the terminology of Kripke semantics. To each world, we may assign the set of formulae φ that are true in this world. The accessibility relation implies certain kinds of relation between the sets assigned to a world. Hence Category Theory gives an interpretation similar to the one we saw in previous section: the "words" are replaced by the "worlds" and the "ways to generate a word" by the "formulae that hold in a world".

8. B-method ([1])

B-method is a formal method for software development generally used for safety-critical system (such that the Ligne 14 du métro de Paris, with a fully automated driving). The idea is to design an abstract machine with variables and operations so that it is possible to generate certified code (i.e. program whose correctness with respect to the specification is proved formally). Below is an example of a simple B machine that deals with booking of seats. Note that the INITIALISATION makes the INVARIANT true. Also, each operation has PREconditions that ensure that the INVARIANT remains true.

MACHINE
  booking(nb_seats)
VARIABLES
  available_seats
INVARIANT
  available_seats⊆[1;max_seats]
INITIALISATION
  available_seats := [1;max_seats]

OPERATIONS
book i = 
  PRE i∈available_seats THEN
    available_seats := available_seats \ {i}
  END;

cancel i = 
  PRE i∈([1;max_seats]\available_seats) THEN
    available_seats := available_seats ∪ {i}
  END;

The implementation of this machine does not seem straightforward. One of the main feature of the B-method is that it allows refinement of machine from the really abstract specification until something very close to actual programming languages, so that automated generators of code can be used. For instance, consider the refinement below. The available seats are now represented by a boolean function that can direcly set its values to true or false. This is more similar to tables of booleans (that exist in programming languages) than the abstract notion of set of seat numbers. Note that the "glue" invariant gives a relation between old and new variables and remains true during the execution of the machine.

REFINEMENT
  booking2
REFINES
  booking(nb_seats)
VARIABLES
  available_seats2
INVARIANT
  ∀i∈[1;max_seats] available_seats2(i) = (i ∈ available_seats)
INITIALISATION
  available_seats2 := [1;max_seats]×{TRUE}

OPERATIONS
book i = 
  PRE available_seats2(i) THEN
    available_seats2(i) := FALSE
  END;

cancel i = 
  PRE ¬available_seats2(i) THEN
    available_seats2(i) := TRUE
  END;

Where is Category Theory involved? First, the B-method uses sets to represent objects. This can be handle as the Set category but one can imagine an extension of the B-method where we use another kind of category. Next, the B-method requires a lot of logics. We have to ensure that the INVARIANTs are satisfied after INITIALISATION and preserved by each operations, ensure that the semantics of the machines are preserved by refinement... Of course in this simple example it is really trivial to check these conditions but for complex machines we use automated proofs. It is also necessary to define semantics for formulae and machines. As we have seen in section 7, Category Theory is really helpful when working with logic. Finally, B-method can be understood in a categoric framework, using the same techniques as we saw in section 1. The objects are the B-machines and we can define arrows as refinements (actually the B-method contains other kinds of relation such that those given by the keywords SEES or INCLUDES but we ignore them for the sake of simplicity).

9. XML families [[3], [4]]

(Note: In this section, we assume familiarity with the XML syntax and related formats. Also, we simplify the structure of XML documents, for instance by ignoring attributes or text nodes)

XML is a generic format of documents that can be represented as trees. One defines a family of XML documents by restricting the syntax allowed (and generally adding a semantic). For example, XHTML is a family of XML documents used for Web pages: it contains specific nodes such that <a/> to describes hyperlinks, <table/>, <tr/>, <td/> to describe tables etc. There exists a lot of XML formats used for various applications.

It is possible to transform an XML document to another one. For example, consider the following XML document that gives the price of different items A, B, C.

<?xml version="1.0" encoding="UTF-8"?>
<root>
<item>
  <name>A</name>
  <price>10€</price>
</item>
<item>
  <name>B</name>
  <price>5€</price>
</item>
<item>
  <name>C</name>
  <price>18€</price>
</item>
</root>

This XML document is useful as a storage format but if one wants to display the document in a browser, then a conversion into an XHTML document is needed (we do not give all the mandatory XHTML elements: <body/>...)

<?xml version="1.0" encoding="UTF-8"?>
<html>
 ...
<table>
  <tr><td>Name</td><td>Price</td></tr>
  <tr><td>A</td><td>10€</td></tr>
  <tr><td>B</td><td>5€</td></tr>
  <tr><td>C</td><td>18€</td></tr>
</table>
 ...
</html>

This transformation converts a family of XML documents (here the one that describes prices of items) into another family (here a subfamily of XHTML). Hence we can define a category with XML families as objects and XSLT transformations as arrows (XSLT is a family of XML document that represent transformations, so the sets of arrows in our category is actually itself an object of the category!).

We have not described precisely how an XML family is defined. There are several schema languages to do this, such that DTD, XML Schema and RELAX NG. XML family describable by RELAX NG is always transformed by XSLT into another family describable by RELAX NG. Hence The sets of XML families describable by RELAX NG is a subcategory (I have not checked whether this also holds for DTD and XML Schema).

In order to study these different schemas, Murata et al. introduce a generalization of context-free grammars for trees. For example, the initial family of items can be described by the grammar (see the paper for more details):

NonTerminalSymbols = {Root, Item, Name, Price, Pcdata}
TerminalSymbols = {root, item, name, price, pcdata}
StartSymbols = {Root}
Rules = { Root→root[Item*], Item→item[NamePrice], Name→name[Pcdata],
          Price→price[Pcdata], Pcdata→pcdata[ε] }

A grammar describing the transformed documents is (again, without details):

NonTerminalSymbols = {Html, ... Table, Tr, Td1, Td2, Name, Price, ...}
TerminalSymbols = {html, ... table, tr, td, ...}
StartSymbols = {Html}
Rules = { Html→html[...], ...→...[Table*] Table→table[Tr*], Tr→name[Td1 Td2],
          Td1→td[Name], Td2→td[Price]... }

Then we define an arrow between these two grammars, such that the function that assigns to each grammar the corresponding family is a functor (we could probably define arrows in the grammar category intrinsically, but again I've not thought much about that).

Two grammars may define the same family. Not all XML families can be described by a grammar but according to Murata et al., all RELAX NG XML families can. Hence, if we restrict ourselves to the subcategories of XML families describable by RELAX NG and if we use a quotient category for the grammars (with two grammars equivalent iff they define the same XML family) then the (quotiented) functor F above is a bijection. Finally F , F 1 form an adjunction between the quotient set of grammars and the RELAX NG subcategory.

10. References

  1. J.R. Abrial - The B-Book. Cambridge University. Press, 1996.
  2. H.P. Barendregt - The Lambda Calculus: Its Syntax and Semantics. North Holland, Amsterdam (1984). http://mathgate.info/cebrown/notes/barendregt.php#5
  3. M. Murata et al. - Taxonomy of XML schema languages using formal language theory.
  4. A. Møller - Advanced Web Technology. Lecture at the University of Aarhus (November-December 2009).
  5. J. van Oosten - Basic Category Theory. BRICS Lecture Series (January 1995)
  6. wikibooks.org - Haskell/Category theory
    http://en.wikibooks.org/wiki/Haskell/Category_theory
  7. G. Winskel - Category Theory for Computer Science. Lecture at the University of Aarhus (August-September 2009).
This page is W3C-compliant - Author: Frédéric WANG
Valid XHTML 1.1 Valid MathML 2.0 Valid SVG Valid CSS