*It is the sixth article from the cycle "The Category Theory for Programmers". The previous articles were already published on Habré:*

0. The category theory for programmers: preface

1. Category: composition essence

2. Types and functions

3. Categories, big and small

4. Kleysli's categories

5. Works and koproizvedeniye

In the previous article basic operations over types were considered: work and koproizvedeniye. Now we will show that the combination of these mechanisms allows to construct many of daily data structures. Such creation has essential applied relevance. For example, if we are able to check base data types for equality, and also we know how to reduce equality of work and a koproizvedeniye to equality a component, then operators of equality for composite types it is possible to display automatically. In Haskell for an extensive subset of composite types operators of equality and comparison, converting at line and back and many other operations are automatically brought.

Let's consider in more detail the place of work and a koproizvedeniye of types in programming.

## Work of types

Canonical implementation of work of types in programming languages is a couple. In Haskell of steam is a primitive designer of types, and in C ++ it is rather difficult template from standard library.

Strictly speaking, work of types is not commutative: it is impossible to substitute couple of type

`(Int, Bool)`

instead of `(Bool, Int)`

, though they also contain the same data. However work is commutative to within the isomorphism set by function `swap`

, which back to:```
swap :: (a, b) -> (b, a)
swap (x, y) = (y, x)
```

It is possible to consider such couples as different formats of storage of the same information as big endian and little endian.

Putting one couples in others, it is possible to combine some types in work. The same can be received more simply if to notice that the enclosed couples are equivalent to trains. This consequence of the fact that different orders of attachment of couples are isomorphic among themselves. There are two possible orders of a combination in work of three types

`a`

, `b`

and `c`

(in the set sequence). Namely,`((a, b), c)`

or`(a, (b, c))`

These types are different in the sense that the function expecting argument of the first type cannot transfer argument of the second, however values of types are in one-to-one correspondence. There is a function which sets this display in one party:

```
alpha :: ((a, b), c) -> (a, (b, c))
alpha ((x, y), z) = (x, (y, z))
```

And here the return to it:

```
alpha_inv :: (a, (b, c)) -> ((a, b), c)
alpha_inv (x, (y, z)) = ((x, y), z)
```

So, isomorphism of types takes place; these are just different methods of repacking of the same data.

Considering work of types as binary operation, we see that isomorphism is very similar to the associative law in monoides above:

`(a * b) * c = a * (b * c)`

The only difference is that in a monoida both works absolutely match, and for types — are equal to within isomorphism.

If to consider this distinction insignificant, then it is possible to prolong analogy to monoides further and to show that Singleton

`()`

is a neutral element concerning multiplication of types, just as 1 it is neutral concerning multiplication of numbers. Really, accession `()`

to a type element `a`

does not add any information. Type`(a, ())`

it is isomorphic

`a`

, where isomorphism is set by functions```
rho :: (a, ()) -> a
rho (x, ()) = x
```

```
rho_inv :: a -> (a, ())
rho_inv x = (x, ())
```

Our supervision can be formalized in the form of the statement that the category of sets

**Set**is

*monoidal*, i.e. category which is also monoidy concerning multiplication of objects (in this case — rather Cartesian product). Below strict definition will be given.

In Haskell more general method of a task of works, especially convenient as we will soon be convinced when they are combined with the sums of types is available. Let's use the named designers with several arguments. For example, alternative determination of couple looks so:

`data Pair a b = P a b`

Here

`Pair a b`

— this name of the designer of types parametrized by two other types `a`

and `b`

, and `P`

— name of the designer of data. We define specific type, transferring in the designer of types `Pair`

two types, also we create couple of this type, transferring suitably the typified values in the designer `P`

. For example, we will define a variable `stmt`

as to steam from `String`

and `Bool`

:```
stmt :: Pair String Bool
stmt = P "This statement is" False
```

The first line is a declaration of type. It consists of the designer of types

`Pair`

with `String`

and `Bool`

instead of `a`

and `b`

. The second line defines the variable value received by application of the designer of data `P`

to a specific line and logical value. Once again: designers of types are used for designing of types, designers of data — for designing of values.As namespaces of types and data in Haskell are not crossed, often same name is used for both designers. For example,

`data Pair a b = Pair a b`

If to look at the built-in type of couple with a Lenin squint, then she is recognized that she actually is a variation on the last declaration, only the designer

`Pair`

it is replaced on binary operator `(,)`

. It is possible to use `(,)`

as well as any other named designer, in the prefix notation:`stmt = (,) "This statement is" False`

Similarly

`(,,)`

designs the three etc.Instead of use of the generalized couples or trains, it is possible to enter a separate name for work of specific types. For example,

`data Stmt = Stmt String Bool`

represents work

`String`

and `Bool`

, but possesses own name and the designer. The benefit of such determination is that it is possible to get many types with the same contents, but different semantics and functionality which the type system will not allow to mix among themselves.Programming on trains and mnogoargumentny designers often leads to confusion and errors because it is necessary to trace all the time what component, for what is responsible. It would be better to have an opportunity to give to components proper names. Work of types with the named fields is called record in Haskell and

`struct`

in C.## Records

Let's review a simple example. Let's describe chemical elements the uniform structure consisting of two lines (the Latin name and the character) and the integral number corresponding to atomic mass. For this purpose it is possible to use a train

`(String, String, Int)`

and all the time to hold in the head what component, for what is responsible. A component from a train we will apply pattern matching to extraction. The following function checks whether the character of chemical element is a prefix of its Latin name (for example, **He**—

**Helium**prefix):

```
startsWithSymbol :: (String, String, Int) -> Bool
startsWithSymbol (name, symbol, _) = isPrefixOf symbol name
```

It is easy to be mistaken in such code, it is difficult to read and support it. It is much better to define record instead of a train:

```
data Element = Element { name :: String
, symbol :: String
, atomicNumber :: Int }
```

These representations are isomorphic what it is possible to be convinced by means of the following vzaimnoobratny conversions of:

```
tupleToElem :: (String, String, Int) -> Element
tupleToElem (n, s, a) = Element { name = n
, symbol = s
, atomicNumber = a }
```

```
elemToTuple :: Element -> (String, String, Int)
elemToTuple e = (name e, symbol e, atomicNumber e)
```

Let's notice that names of writing fields at the same time are also functions-aksessorami. For example,

`atomicNumber e`

returns a field `atomicNumber`

records `e`

. Thus function `atomicNumber`

has type:`atomicNumber :: Element -> Int`

With use of records of type

`Element`

function `startsWithSymbol`

becomes more readable:```
startsWithSymbol :: Element -> Bool
startsWithSymbol e = isPrefixOf (symbol e) (name e)
```

In Haskell it is possible to turn the trick turning function

`isPrefixOf`

in infix operator, having framed her with the return apostrophes. It does a code to more readable:`startsWithSymbol e = symbol e `isPrefixOf` name e`

We could lower brackets because a priority of infix operator of low priority of function call.

## Sum of types

How work in category of sets induces works of types, the koproizvedeniye generates the sums of types. Canonical implementation of the sum of types in Haskell is as follows:

`data Either a b = Left a | Right b`

As well as couples,

`Either`

- y are commutative (to within isomorphism), can be enclosed, and the order of attachment is not important (to within isomorphism). For example, the sum of three types looks so:`data OneOfThree a b c = Sinistral a | Medial b | Dextral c`

It turns out that

**Set**forms (symmetric) monoidal category concerning operation of a koproizvedeniye. The place of binary operation is taken by the dizjyunktny sum, and a neutral element is the initial object. In terms of types

`Either`

— monoidal operation, but not inhabited type `Void`

— its neutral element. Consider that `Either`

— this addition, and `Void`

— it is zero. Really, adding `Void`

to the sum of types does not change a set of values of type. For example,`Either a Void`

isomorphically

`a`

. Really, as type `Void`

it is not inhabited, there is no method to construct `Right`

- value. Means, the only inhabitants `Either a Void`

are `Left`

- values which are just a wrapper over value of type `a`

. Symbolically it can be written as `a + 0 = a`

.The sums of types very often meet in Haskell. In C ++ their analogs (associations or options) are used significantly less often for a number of reasons.

First, the simplest sums of types are transfers which are implemented in C ++ with the help

`enum`

. Equivalent`data Color = Red | Green | Blue`

in C ++ will be

`enum { Red, Green, Blue };`

Simpler sum of types

`data Bool = True | False`

in C ++ is primitive type

`bool`

.Further, the sums of types coding existence or lack of value are implemented in C ++ by means of different tricks with "impossible" values, such as blank lines, negative numbers, null pointers etc. In Haskell explicit, intended optionality of value registers with the help

`Maybe`

:`data Maybe a = Nothing | Just a`

Type

`Maybe`

is the sum of two types. Let's mentally turn it designers into separate types. The first will take a form:`data NothingType = Nothing`

This transfer with a unique value

`Nothing`

. In other words, it is Singleton equivalent to type `()`

. Second part`data JustType a = Just a`

represents a wrapper over type

`a`

. We could write `Maybe`

as`data Maybe a = Either () a`

More difficult sums of types are fabricated in C ++ by means of pointers. The pointer can be or zero, or point out value of a certain type. For example, in Haskell the list is defined as (recursive) sum of types:

`List a = Nil | Cons a (List a)`

In C ++ the same type registers so:

```
template<class A>
class List {
Node<A> * _head;
public:
List() : _head(nullptr) {} // Nil
List(A a, List<A> l) // Cons
: _head(new Node<A>(a, l))
{}
};
```

The null pointer codes the empty list here.

Let's notice that designers

`Nil`

and `Cons`

from Haskell turned in two overloaded designers of a class `List`

with similar arguments: without arguments in a case `Nil`

, value and the list for `Cons`

. Class `List`

does not need a tag which would distinguish components of the sum of types; instead he uses special value `nullptr`

for `_head`

, to provide `Nil`

.Important distinction between types in Haskell and in C ++ consists that in immutabelna data structure Haskell. If the object was created by means of a certain designer, then he will forever remember what designer and with what arguments was used. So class copy

`Maybe`

, created as `Just "energy"`

, will never turn back `Nothing`

. Similarly empty list forever will remain empty, and the three-element list will always store the same three elements.The Immutabelnost does designers reversible: the object can always be taken to the components used during its creation. Such deconstruction is executed by pattern matching as which this or that designer acts. Arguments of the designer are substituted with names of variables (or other samples).

At type

`List`

two designers, so deconstruction of any `List`

consists of two corresponding pattern matchings. The first sample matches the empty list `Nil`

, the second — with the list created with the help `Cons`

. For an example we will define simple function:```
maybeTail :: List a -> Maybe (List a)
maybeTail Nil = Nothing
maybeTail (Cons _ t) = Just t
```

First part of determination

`maybeTail`

the designer uses `Nil`

as the sample for comparison also returns `Nothing`

. The second part uses as an example the designer `Cons`

. The first argument of a sample is provided by a crossed out section as we are not interested in the value which is contained in it. Second argument `Cons`

contacts a variable `t`

(hereinafter we will speak about variables though, strictly speaking, they are invariable: once the variable connected with value never changes). Function value on this sample is equal `Just t`

. So, depending on a method of creation of value of type `List`

, it will match one of samples. If it was created with the help `Cons`

, function will receive both used at the same time argument (first of which it will be ignored).More difficult sums of types are implemented in C ++ by hierarchy of polymorphic classes. The family of classes with the general primogenitor can be treated as the sum of types in which as an implicit tag components serves the table of virtual functions. What serves as pattern matching in Haskell to, is implemented in C ++ by scheduling function call.

The reader will seldom meet in C ++ use

`union`

as the sum of types because of its excessive limitation. In `union`

it is impossible to place even `std::string`

, because this class has a designer of copying.## Algebra of types

Separately works and the sums of types allow to define a set of useful data structures, but true power results from their combination.

Let's sum up the above results. We considered two commutative monoidal structures which are the cornerstone of type system. It is the sum of types with a neutral element

`Void`

and work of types with a neutral element `()`

. It is convenient to imagine them as addition and multiplication. In that case `Void`

corresponds to zero, and `()`

— to unit.Let's look how far this analogy stretches. For example, whether truly that multiplication by zero gives zero? In other words, whether any work is isomorphic on

`Void`

to type `Void`

?Whether there are couples consisting from, say,

`Int`

and `Void`

? For creation of couple both values are necessary. Value of type `Int`

— it is not a problem, and here with `Void`

there is a hitch: this type is not inhabited (there is no value of this type). Thus, for any type `a`

type `(a, Void)`

also it is not inhabited and, therefore, it is equivalent `Void`

. In other words, `a*0 = 0`

.Addition and multiplication of numbers are connected by the distributive law:

`a * (b + c) = a * b + a * c`

Whether it is executed for the sum and work of types? Yes, to within isomorphism. The left part of identity corresponds to type

`(a, Either b c)`

and right —

`Either (a, b) (a, c)`

Let's show functions which will transform types there and back:

```
prodToSum :: (a, Either b c) -> Either (a, b) (a, c)
prodToSum (x, e) =
case e of
Left y -> Left (x, y)
Right z -> Right (x, z)
```

```
sumToProd :: Either (a, b) (a, c) -> (a, Either b c)
sumToProd e =
case e of
Left (x, y) -> (x, Left y)
Right (x, z) -> (x, Right z)
```

Construction

`case of`

it is used for pattern matching in function. The arrow separates a sample and the expression corresponding to it. For example, by a challenge `prodToSum`

with argument```
prod1 :: (Int, Either String Float)
prod1 = (2, Left "Hi!")
```

variable

`e`

in `case e of`

it will be equal `Left "Hi!"`

. It will match a sample `Left y`

, substituting `"Hi!"`

instead of `y`

. As variable `x`

earlier it was already connected with `2`

, result of construction `case of`

(and all function) will be, as expected `Left (2, "Hi!")`

.We will leave the proof that functions are higher than a vzaimnoobratna to the reader as exercise. They only repack the same data from one format in another.

Two monoid connected by the distributive law in mathematics are called

*a half ring*. It not a complete

*ring*as we not in forces to define subtraction of types. A set of statements, correct for the natural numbers forming a half ring, it is possible to transfer to types. There are several examples:

Numbers |
Types |

0 |
`Void` |

1 |
`()` |

a + b |
`Either a b = Left a | Right b` |

a * b |
`(a, b)` or `Pair a b = Pair a b` |

2 = 1 + 1 |
`data Bool = True | False` |

1 + a |
`data Maybe = Nothing | Just a` |

The type of the list is of special interest as it is defined as an equation solution. The defined type meets in both parties of equality:

`List a = Nil | Cons a (List a)`

Having made normal substitutions and having replaced

`List a`

on `x`

, we will receive`x = 1 + a * x`

This equation cannot be solved by traditional algebraic methods as types cannot be read or divided. Let's try to substitute recursively instead of

`x`

on the right there is an expression `(1 + a*x)`

and to remove the brackets on distributivity. Let's receive```
x = 1 + a*x
x = 1 + a*(1 + a*x) = 1 + a + a*a*x
x = 1 + a + a*a*(1 + a*x) = 1 + a + a*a + a*a*a*x
...
x = 1 + a + a*a + a*a*a + a*a*a*a...
```

Eventually we will come to the infinite sum of works (trains) which can be treated so: the list or is empty,

`1`

; or consists of one element, `a`

; or consists of couple, `a*a`

; or from the three, `a*a*a`

, etc. The determination received formally completely answers intuitive idea of the list as about a line where instead of letters — values of type `a`

.We will return to lists and other recursive structures further, after studying of functors and motionless points.

The solution of the equations with character variables is an algebra! Therefore such data types are also called algebraic (ATD).

Summing up the results, we will give one very important interpretation of algebra of types. Let's notice that work of types

`a`

and `b`

has to contain also value of type `a`

, *and*value of type

`b`

, what is attracted by density of population of both types. On the other hand, the sum of types contains or value of type `a`

, *or*value of type

`b`

, so it is enough that at least one of them was inhabited. Logic operations *of conjunction*and

*a disjunction*are formed by the half ring which is in the following compliance with algebra of types:

Logic |
Types |

false |
`Void` |

true |
`()` |

a || b |
`Either a b = Left a | Right b` |

&& b |
`(a, b)` |

This analogy can be deepened and is a Curry-Howard isomorphism basis between logic and the type theory. We will return to this question by consideration of functional types.

## Exercises

- Show that
`Maybe a`

and`Either () a`

are isomorphic.

- Let's consider the following sum of types in Haskell:

`data Shape = Circle Float | Rect Float Float`

To define on type`Shape`

function`area`

, we will use pattern matching:

`area :: Shape -> Float area (Circle r) = pi * r * r area (Rect d h) = d * h`

Implement`Shape`

on C ++ or Java as the interface also create two classes:`Circle`

and`Rect`

. Then write`area`

as virtual function.

- (Continuation) is simple to add new function
`circ`

, which calculates perimeter`Shape`

. In Haskell determination of type`Shape`

will remain invariable; the following code can be added to any place of the program:

`circ :: Shape -> Float circ (Circle r) = 2.0 * pi * r circ (Rect d h) = 2.0 * (d + h)`

Add`circ`

in your program on C ++ or Java. What parts of the original program had to be modified?

- Add (continuation) to type
`Shape`

new figure`Square`

also enter the corresponding updates to other code. What had to be changed in Haskell? And what in C ++ or Java? (Even if you do not know Haskell, changes have to be quite obvious.)

- Show that formal identity
`a + a = 2 * a`

it is executed for types (to within isomorphism). We remind,`2`

in language of types corresponds`Bool`

(see the table above).

## Thanks

The author is grateful Gershoma Bazermana for reviewing of a post and useful comments.

This article is a translation of the original post at habrahabr.ru/post/274103/

If you have any questions regarding the material covered in the article above, please, contact the original author of the post.

If you have any complaints about this article or you want this article to be deleted, please, drop an email here: sysmagazine.com@gmail.com.We believe that the knowledge, which is available at the most popular Russian IT blog habrahabr.ru, should be accessed by everyone, even though it is poorly translated.

Shared knowledge makes the world better.

Best wishes.