Monads in Category Theory are defined in a different manner than how Monads in Haskell are defined. But both definitions are equivalent. That is what I wrote at the end of my last blog post. In this blog post I will explain what those definitions are and why they are equivalent. Before reading this blog post, be sure to read my last two blog posts, titled “Haskell today” and “Monads in Haskell finally explained”.
In my last blog post “Monads in Haskell finally explained”, I described how Monads are defined in Haskell. A Monad in Haskell is essentially a type constructor armed with the definitions of “return” and “bind” for it. But the definition of a Monad in Category Theory looks different. In due course I will give you this definition and then I will explain why the two definitions are equivalent. You see, it is not at all obvious why they are equivalent.
OK, let us begin from the very beginning. To define Monads in Category Theory, you have to define Natural Transformations. To define Natural Transformations, you have to define Functors. To define Functors, you have to define Categories. So, let us begin from there.
A Category is a collection of objects and arrows. Objects can be anything really. An arrow is a relationship between two objects. An arrow points from a source object to a destination object. You can think of arrows as relationships or functions. You see, arrows do not have to be functions per se. They could be a relationship like “greater than or equal”. For the rest of this discussion though, arrows and functions mean the same thing. Arrows are called “morphisms” in Category Theory parlance.
If you think of arrows as functions, then the source object is the input to the function and the destination object is the output of the function.
What about functions with multiple arguments (inputs)? Well, we will only consider functions with one argument, but a function with multiple arguments can be thought of as a function that takes the first argument and returns a function that takes the rest of the arguments, and so on, until we reach a function that takes the last argument and returns the final output.
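In Haskell, this currying is the default behavior. Here is a minimal sketch (the function names are my own):

```haskell
-- A "three-argument" function is really a chain of one-argument functions.
-- Its type can also be read as: addThree :: Int -> (Int -> (Int -> Int))
addThree :: Int -> Int -> Int -> Int
addThree x y z = x + y + z

-- Supplying only the first argument returns a function of the remaining two:
addToTen :: Int -> Int -> Int
addToTen = addThree 10
```

So a multi-argument function never has to leave the world of one-argument functions, which is all our Category needs.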
Categories can be thought of as sets. Then again, they do not have to be sets. They are collections of things, objects. In our case, which is programming in Haskell, a Category is the Category with objects that are the types that Haskell provides. So, an object in this Category is Bool. Another is Int. Another is Char. And so on. And morphisms in this Category are functions from one type to another. Thus a morphism is Int->Bool. Another is Char->Char. Another is Char->Int. And so on. Thus our category does not have “tangible” objects like the integers (1, 2, 3, …) but it has “intangible” objects like types (Int, Char, Bool, …). Another Category can be one with objects that are lists, with one object corresponding to lists of Ints, another object corresponding to lists of Bools, etc. Another Category can be the Category with objects that are trees, with one object corresponding to trees of Ints, another object corresponding to trees of Bools, etc.
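To make this concrete, here is a minimal sketch of a few morphisms of this Category written as plain Haskell functions (the function names are my own):

```haskell
-- A morphism from Int to Bool:
isPositive :: Int -> Bool
isPositive n = n > 0

-- A morphism from Char to Char:
nextChar :: Char -> Char
nextChar = succ

-- A morphism from Char to Int:
charCode :: Char -> Int
charCode = fromEnum
```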
Two important things about Categories are that for each Category there has to exist an identity morphism for each object and that morphisms that are composable have to obey the associativity of composition, that is:
(f o g) o h = f o (g o h)
If these do not hold, we cannot call our collection a “Category”.
But why do we have this obsession with identities, composability and associativity? Well, they are the minimum we can ask for in order to study the mathematical and other entities that come up and still keep the study meaningful. What we gain is that we can use these entities to study the relations between them and to compose more complex and useful entities from them. Composability allows us to use less complex entities in order to compose more complex ones that do more “stuff”, so they are more useful. And the basic rule that comes up over and over again when we study composition is associativity. Not commutativity, mind you (commutativity is f o g = g o f). Cases where commutativity does not apply are numerous. But associativity, that is (f o g) o h = f o (g o h), almost always occurs in the mathematical entities that we study and find meaningful. Whereas in cases where associativity does not hold, the corresponding entities usually have no useful meaning for us. So, identities and associativity usually need to be there in order for entities to be useful, interesting and composable in a meaningful way.
All right, now I will present some examples to make sure you grasp the material. Suppose we have a Category, its objects are denoted with capital letters and its morphisms are denoted by small letters.
In this first example, we have a morphism f from K to L and a morphism g from L to M. We can have their composition g o f. The notation g o f means that we apply f first and then g. In other words, we begin from f and continue and end with g. This is denoted by the direction of the arrows, but please note that when we write g o f, it is the other way round. When we use the o operator, we write the first morphism at the end. In the case depicted in this example, we can have g o f but we cannot have f o g, because there are no arrows with the opposite direction here.
In this second example, we draw the identity morphisms as well. (Remember that every object in a Category has to have an identity morphism.) Try to understand why the following equations are true:
f o idA = f
idB o f = f
Answer: The first equation holds true because: We begin at object A. We first apply idA and we are back where we started. Then we apply f and we arrive at B. The second equation holds true because: We begin at object A. We first apply f and we arrive at B. We then apply idB and we remain at B.
Please remember that f o g means that we first apply g and then f, in other words, we start applying the g morphism and then continue applying the f morphism. And please remember that the existence of f o g does not imply the existence of g o f. Also remember that even if both f o g and g o f exist, they do not have to be equal (commutativity does not have to apply in a Category). At last remember that the following always has to hold:
(f o g) o h = f o g o h = f o (g o h)
(associativity always has to apply in a Category when the composition is defined).
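In Haskell, where morphism composition is the (.) operator and every identity morphism is id, these requirements can be sketched as follows (the example functions are my own):

```haskell
f :: Int -> Int
f = (+ 1)

g :: Int -> Int
g = (* 2)

h :: Int -> Int
h = subtract 3

-- Identity laws: f . id = f and id . f = f
-- Associativity: (f . g) . h = f . (g . h)
leftPath, rightPath :: Int -> Int
leftPath  = (f . g) . h
rightPath = f . (g . h)
```

Both paths compute the same function, which is exactly what the associativity equation states.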
I discussed Functors in my blog post titled “Haskell today”, but let me briefly discuss about them here, as well. A Functor is a morphism between Categories. Again, “morphism” means “mapping”. So, we need to have a source Category and a destination Category in order to talk about a Functor. A Functor maps objects in the source Category to objects in the destination Category. And it also maps morphisms (arrows) from the source Category to morphisms in the destination Category.
And, again, we have the same obsession with identities, composability and associativity. In order to call a morphism from one Category to another a “Functor”, it has to map objects to objects and morphisms to morphisms whilst preserving identities, composability and associativity. I am now going to explain what it means for a Functor to preserve identities and composability and associativity.
So, as I wrote, a Functor F from one Category to another will preserve identities. Let us see what this means. We have a Functor F, so we have a source Category and a destination Category for our Functor. Suppose we have an object A in the source Category. This object is mapped to an object F(A) in the destination Category. Our object A in the source Category has to have an identity morphism, idA. The Functor preserves the identity by mapping this morphism to the identity morphism of the object F(A) in the destination Category. For our Functor to preserve identities (identity morphisms) means that the following equation holds true:
F(idA) = idF(A)
This equation states that the Functor F maps the identity morphism idA of object A in the source Category to the identity morphism idF(A) of the corresponding mapped object F(A) in the destination Category.
Also, a Functor has to preserve composability. This means that for every two morphisms in the source Category that are composable, the corresponding morphisms mapped by the Functor in the destination Category have to be composable as well. Suppose that f and g are two morphisms in the source Category and their composite f o g is defined. Then our Functor will map them to the morphisms F(f) and F(g) correspondingly in the destination Category, in such a way that the following equation will hold true:
F(f o g) = F(f) o F(g)
This equation states that the Functor F maps the composite f o g of the morphisms f and g in the source Category to the composite of the corresponding maps F(f) o F(g) of each of the two morphisms in the destination Category.
Now let us see how associativity is preserved. All we have to prove is that the following equation holds true:
F(f o (g o h)) = F((f o g) o h)
This is straightforward, since both the source and destination Categories respect associativity:
F(f o (g o h)) =
= F(f) o F(g o h) =
= F(f) o (F(g) o F(h)) =
= F(f) o F(g) o F(h) =
= (F(f) o F(g)) o F(h) =
= F(f o g) o F(h) =
= F((f o g) o h)
All these features make a Functor a structure-preserving map between Categories.
Lastly, please note that Functors are composable (and associativity holds true for their composition). We will use this fact in due course.
I explained how all this Category Theory analysis about Functors corresponds to Functors in Haskell in my blog post titled “Haskell today”. Let me remind you of the basic points. A Functor in Haskell has as source Category the Category of types that Haskell provides. The Functor is essentially a type constructor armed with a function called fmap that we provide. A type constructor essentially creates new types based on existing types. The new types exist in the destination Category. Thus our Functor creates/constructs (since it is a type constructor) the objects (types) in the destination Category from the objects (types) in the source Category. The function fmap that we provide describes how the morphisms from the source Category map to the destination Category. This is why fmap has the following type:
fmap :: (a -> b) -> F a -> F b
In the preceding declaration, F is our Functor and a -> b is a morphism in the source Category. fmap describes how the morphism a -> b of the source Category is going to be mapped to the morphism F a -> F b of the destination Category. Thus fmap takes as arguments the morphism a -> b of the source Category and the source object F a of the morphism in the destination Category and has to provide the destination object of the morphism in the destination Category. In essence, a Functor is a type constructor armed with an fmap function which elevates the morphisms in the source Category to morphisms in the destination Category whose objects are created by the type constructor. Or, to put it more clearly, the type constructor part of the Functor maps objects from the source Category to the destination Category and the fmap function part of the Functor maps morphisms from the source Category to the destination Category.
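As a minimal sketch of all this, here is a hypothetical one-slot container (the name Box is my own invention) turned into a Functor by providing fmap:

```haskell
-- The type constructor Box maps objects (types): a becomes Box a.
newtype Box a = Box a deriving (Show, Eq)

-- fmap maps morphisms: a function a -> b becomes a function Box a -> Box b.
instance Functor Box where
  fmap f (Box x) = Box (f x)
```

For example, fmap (+ 1) takes the morphism (+ 1) of the source Category and gives us the corresponding morphism of the destination Category.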
We saw that a Functor is a morphism from one Category to another. A Natural Transformation is a morphism from one Functor to another. Let us see what features a morphism between Functors needs to have in order to be considered a Natural Transformation.
Suppose we have one source Category and one destination Category. And we also have two objects A and B in our source Category and a morphism f from object A to object B. And we also have two Functors F and G.
So far, we have that the objects A and B and the morphism f from A to B in the source Category are mapped to the destination Category as follows:
We see that the Functor F maps object A to object F(A), object B to object F(B) and the morphism f to the morphism F(f). We also see that the Functor G maps object A to object G(A), object B to object G(B) and the morphism f to the morphism G(f).
Now suppose that we have a mapping which assigns to each object X in the source Category an arrow F(X) -> G(X) in the destination Category. Let us call this mapping μ (Greek letter mu) and let us call this arrow μX. So this mapping μ is a mapping from an object in the source Category to an arrow in the destination Category. (The mapping μ does not map morphisms in the source Category.) So we have the following situation:
We see that the mapping μ maps object A to arrow F(A) -> G(A) and object B to arrow F(B) -> G(B).
If the above diagram commutes, then μ is a Natural Transformation between the Functors F and G. What does it mean for the above diagram to commute? It means that the path from F(A) via μA and then G(f) leads us to the same result in G(B) as the path from F(A) via F(f) and then μB. In equation form: G(f) o μA = μB o F(f).
(Note: I stated that “the mapping μ does not map morphisms in the source Category.” I made this statement to help you understand how we define Natural Transformations. But every Natural Transformation also defines a function that maps arrows from the source Category to arrows in the destination Category and thus allows us to have an arrows-only description of Natural Transformations. Just saying.)
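Here is a minimal Haskell sketch of a Natural Transformation between the list Functor [] and the Maybe Functor (the name safeHead is my own; the standard library offers the equivalent listToMaybe):

```haskell
-- For every type x, safeHead provides the component μx :: [x] -> Maybe x.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

-- Naturality (the commuting square): for every function f,
--   fmap f . safeHead  =  safeHead . fmap f
```

Both paths of the square agree: mapping a function over a list and then taking the head gives the same result as taking the head and then mapping the function over the Maybe.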
Natural Transformations are the reason Category Theory was invented. The inventors saw Natural Transformations occurring in many fields in Mathematics and created Category Theory to study them. We should expect to study Natural Transformations, that is, mappings (between Functors) that respect the commutativity of the definition diagram I provided earlier. Mappings like those provide a structure-preserving transformation of the mapping of one Functor to the mapping of another Functor, when both Functors have the same source and destination Categories. For our discussion, we can just think of Natural Transformations as functions or mappings or transformations.
So, what is it that Natural Transformations transform? Well, they transform the image that the source Functor F creates in the destination Category to the image that the destination Functor G creates in the destination Category. This transformation is denoted by the horizontal arrows in the definition diagram.
(Note: You might ask where this is going to end: We defined Functors as morphisms between Categories. We then defined Natural Transformations as morphisms between Functors. What is next, defining morphisms between Natural Transformations? And so on and so forth? And if so, where is this going to end? Well, for our discussion, we need only concern ourselves up until Natural Transformations. The rest is beyond the scope of this blog post.)
We have already seen Natural Transformations. In my blog post titled “Monads in Haskell finally explained” I talked about “return” and “bind”. Guess what: these are Natural Transformations. The “how and why” will become clear in the coming sections. But just to give you a spoiler of the things to come, a Monad is defined as a triple of a Functor and two Natural Transformations. These Natural Transformations are “unit” (or “return” in Haskell parlance) and “join”. But due to an analysis called “the Kleisli Category of a Monad”, a Monad can be expressed equivalently by the same Functor and the Natural Transformations “unit” and “bind”. The Kleisli Category of a Monad gives us the relationship between “join” in the original definition and “bind” in the Kleisli definition. And it is the Kleisli definition that is used in Haskell.
Right! The time has come to talk about Monads. In my last blog post “Monads in Haskell finally explained”, I described a Monad in Haskell as a type constructor armed with the definitions of “return” and “bind” for it. Let us see how Category Theory defines Monads.
In Category Theory, a Monad is a triple comprised of a Functor and two Natural Transformations. Before we talk about the Functor and the two Natural Transformations, we have to understand that since we are talking about a Functor, we need to have a source and a destination Category. And since we are talking about two Natural Transformations, we need to have a source Functor and a destination Functor for each Natural Transformation (and each of these Functors need to have its source and destination Category).
In my blog post “Haskell today”, I was desperately seeking the source and destination Categories in the definition of a Functor in Haskell. I gave a (hopefully sufficient) answer: the source Category contains the Haskell types and their morphisms (functions) and the destination Category contains the types we obtain from the type constructor of the Functor. The type constructor operates on the types in the source Category and produces the types we have in the destination Category. Well, there is another school of thought that says that in all functional languages all Functors are Endofunctors, meaning that their source and destination Category is the same and comprises the types. Either way, you can think of the source and destination Categories for Functors using my approach or the Endofunctor approach and still be correct. But when we talk about Monads, things are specific: the Functor in the Monad triple is definitely an Endofunctor. Thus, the Functor of a Monad has the same source and destination Category and this Category contains the types. We can symbolize the Monad as a triple (M, η, μ), where M is the Functor and eta (η) and mu (μ) are the Natural Transformations. The Functor (Endofunctor) M maps a type a to a type M a via its type constructor M and a morphism f to M f via its fmap function.
What about the two Natural Transformations eta and mu? Well things remain easy because they use “powers” of the Functor M as source and destination Functors. Specifically, the eta Natural Transformation has the unit Functor I as source Functor and the M Functor as destination Functor. The mu Natural Transformation has the Functor M2 as source Functor and the Functor M as destination Functor.
The unit Functor I is the Functor that takes an object or morphism and maps it to itself. And remember when I wrote that Functors can be composed? Well, they can also be composed with themselves, thus providing “powers” of them. So we have:
I o M = M o I = M
M2 = M o M
M3 = M o M o M = M2 o M = M o M2
and so on. Thus a Natural Transformation such as mu from M2 to M is well defined, since the “powers” of M are well defined.
Why do we need these two Natural Transformations? What is their meaning? While you will have to wait until the next section to find out, you can think of the eta (unit or “return” in Haskell parlance) Natural Transformation from I to M as performing an elevation of types to the M realm. You can also think of the mu (“join” in Haskell parlance) Natural Transformation from M2 to M as performing a de-elevation or flattening of the M realm. You will see why this is important in the next section.
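For the list Monad, these two Natural Transformations can be sketched directly (the names etaList and muList are my own; in Haskell they are return and join):

```haskell
-- eta elevates a plain value into the list realm: I to M.
etaList :: a -> [a]
etaList x = [x]

-- mu flattens one level of nesting: M2 to M.
muList :: [[a]] -> [a]
muList = concat
```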
Since a Monad is defined as a triple (M, η, μ) and all source and destination Categories and Functors stem from either the unit Functor or the Functor M (which is an Endofunctor) in the triple, we should be covered. We only have one source and destination Category, that of Haskell types and the morphisms (functions, arrows) between them.
But that is not all we need for our definition to be complete. In order for our triple to be a Monad, the following diagrams have to commute, which amounts to the following equations:

μ o Mμ = μ o μM

μ o Mη = μ o ηM = idM
You can think that the first diagram corresponds to the Monad’s adherence to the associative law. You can think of the second diagram as the Monad’s adherence to the identity/unit laws.
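As a sketch of what the two diagrams say, here is how the corresponding equations can be checked for the list Monad, where join is concat and return wraps a value in a singleton list (the value names are my own):

```haskell
nested :: [[[Int]]]
nested = [[[1], [2, 3]], [[4]]]

-- Associativity square: join . fmap join = join . join
assocLeft, assocRight :: [Int]
assocLeft  = (concat . fmap concat) nested
assocRight = (concat . concat) nested

-- Unit triangles: join . fmap return = join . return = id
xs :: [Int]
xs = [1, 2, 3]

unitLeft, unitRight :: [Int]
unitLeft  = (concat . fmap (\x -> [x])) xs
unitRight = (concat . (\x -> [x])) xs
```

Flattening a triply nested list from the inside out or from the outside in gives the same result, and wrapping then flattening is a no-op, on either side.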
Ok, this triple we call a Monad may have potential, but what is this all for, anyway? Why did we define it in such a way? What good will a Monad do to us? What are a Monad’s benefits to Category Theory and also to functional programming and Haskell? Those and more will be revealed after I present another definition for a Monad which stems from the Kleisli Category analysis. After that, we will have a more complete view of the Monad and its Natural Transformations. And we will be able to understand its usefulness as denoted in the Kleisli Category analysis and in its original definition. Then we will have come full circle, but hopefully we will also be a little more knowledgeable than our first time in.
The Kleisli Category of a Monad
The Kleisli Category of a Monad is an analysis that provides another definition/explanation for a Monad. It is this definition that Haskell uses. And this definition is equivalent to the original definition we saw in the last section.
In this section I am going to present the “Kleisli Category of a Monad” analysis and then I am going to provide the proof of its equivalence to the original definition of a Monad.
The Kleisli analysis is based on a clever thought: Suppose that we have a Monad (M, η, μ) in a Category X. Let us define a Category K (K stands for “Kleisli”). The Category K has the same objects as X. And an arrow A -> M B in Category X is an arrow A -> B in Category K. Thus Category K is a mathematical construct that helps us reason about the objects and morphisms of the Category X, without the pesky M type constructor getting in the way. If we have two morphisms in Category X as follows:
g :: A -> M B
f :: B -> M C
we cannot compose them in a straightforward manner, because g gives M B as its output, whereas f takes B (and not M B) as its input.
But in Category K, these two functions become:
g :: A -> B
f :: B -> C
and so their composition f o g is straightforward:
f o g :: A -> C
(f o g) A = f(g(A))
And now comes the important part: Let us see how this composition would look like in Category X. Again, we have the functions
g :: A -> M B
f :: B -> M C
in Category X and we are looking for their composition f os g in Category X. Please note that I used the symbol os instead of the regular symbol o for the composition. I just created the symbol os in order to denote this abnormal composition; the “s” in my mind stands for “special” in this case. (I could have used “a”, for “abnormal”, as the index of o, and it would have been more accurate.) Anyway, how are we going to define and calculate f os g? Is it possible? Yes, it is. The Kleisli analysis shows us how.
First of all, we understand that
f os g :: A -> M C
We just have to find out how to force the output M B of g to the input B of f.
The answer comes from the Kleisli analysis in the following chain of arrows in Category X:

A -g-> M B -(M f)-> M2 C -μ-> M C

The chain above contains the answer we were looking for. It shows us how to compose f and g in the Category X. Let me explain the meaning of it.
We start from object A in Category X. We apply the morphism g and we arrive at object M B in Category X. Indeed, this is the definition of morphism g. We are now at object M B. Unfortunately, we cannot apply the morphism f to object M B, because morphism f goes from B to M C. Morphism f does not start at object M B. It starts at object B. So, we cannot apply the morphism f to object M B, but we can apply the morphism M f to object M B.
Indeed, M is a Functor (an Endofunctor to be precise). So given an object A, Functor M maps it to M A. And given a morphism f :: B -> M C, Functor M maps it to M f :: M B -> M(M C), that is, M f :: M B -> M2 C.
Now, the mu Natural Transformation provides the mapping from M2 to M. So, by applying the mu Natural Transformation, we get from object M2 C to object M C.
So, have we finished? Indeed we have. But do you understand the implications? First of all, we have been able to compose the two functions f and g, although the Functor M was “elevating” the output types and, thus, was getting in the way. So, we have been able to compose functions of types X -> M Y and Y -> M Z. Thus we have been able to achieve the holy grail of what we were talking about in my blog post “Monads in Haskell finally explained”: we have been able to compose functions that are “impure”, have “side effects” and so on.
Did you see how we did it? We used the definition of a Monad for that. Not only does a Monad provide the M Functor/type constructor that does the elevation of our types to the M “realm”, it also provides us with the mu Natural Transformation that flattens M2 to M. And it is this flattening that helped us compose the two functions f and g.
Have you seen how we composed the two functions? I denoted their composite as f os g. Did you see that it corresponds to applying g, then M f and then μ? Well, it is straightforward to see this by looking at the line that we had from the Kleisli analysis. Since we denote the composition in the opposite order of the application of the functions, we have that
f os g = μ o (M f) o g
Of course, since M is a Functor, when we apply it to an object it gives us another object, but if we apply it to a morphism, it gives us another morphism. If we apply M to an object, we use the type constructor of M to get the destination object. If we apply M to a morphism (function) we use the fmap function of M to get the destination function. Thus the last equation can be rewritten as:
f os g = μ o fmap f o g
The previous equation can be rewritten as
f os g = join o fmap f o g
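This equation can be written as a generic Haskell function. Here is a sketch of the Kleisli composition os for any Monad (the name composeK is my own; the standard library provides this operation as <=< in Control.Monad):

```haskell
import Control.Monad (join)

-- f os g = join . fmap f . g, exactly as in the equation above.
composeK :: Monad m => (b -> m c) -> (a -> m b) -> (a -> m c)
composeK f g = join . fmap f . g
```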
I saved the best for last: Did you realize that the symbol os that has been staring at you all along is really the bind symbol >>= that I was talking about in my blog post “Monads in Haskell finally explained”? Indeed, f os g is another way to write g >>= f. So, not only have we defined >>=, we have also calculated it with respect to mu (join). And here lies the correspondence and equivalence between the definition of a Monad and the Kleisli analysis. In the definition of a Monad, we use mu (“join”). In the Kleisli analysis we use os (“bind”). But “join” and “bind” are equivalent and one derives from the other. “join” maps objects operated on by M2 to objects operated on by M, thus flattening the realm of M. At the same time, this flattening that “join” makes possible helps us obtain the composition of functions that have types of the form X -> M Y. But this is also what “bind” does as defined in Haskell:
(>>=) :: M b -> (b -> M c) -> M c
which takes the output M b of one function a -> M b and forces it (binds it) to the input of another function b -> M c, thus obtaining the output M c.
Thus “join” and “bind” are equivalent. In Category Theory we mostly use “join” and in Haskell we mostly use “bind”, but we can see that they are both “sides of the same coin”, so to speak.
Here is a helpful reminder:
η (eta), return, a -> M a
μ (mu), join, M (M a) -> M a
>>=, bind, M a -> (a -> M b) -> M b
g >>= f is equal to join . fmap f . g
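As a sketch of this last line, bind can be derived from join and fmap for any Monad (the name bind' is my own, to avoid clashing with the built-in >>=):

```haskell
import Control.Monad (join)

bind' :: Monad m => m a -> (a -> m b) -> m b
bind' m f = join (fmap f m)
```

For any Monad, bind' agrees with the built-in >>=, which is exactly the equivalence of “join” and “bind” discussed above.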
We have come far. Not only did we study Category Theory and talk about objects, morphisms, Categories, Functors, Natural Transformations and Monads, we also saw how their definitions in Category Theory relate and correspond to their definitions in Haskell. It is a shame that most material about Haskell does not provide the correspondence of the concepts and entities in Haskell with those in Category Theory.
For example, when you read about Haskell, all you get is something along these lines:
(a -> b) -> F a -> F b
F(a -> b) -> F a -> F b
M a -> (a -> M b) -> M b
If you learn the basics about Category Theory, you realize what these types mean. If you don't learn the basics, you have no clue. For example, I added Applicatives here, even though we have not discussed them at all. But just by knowing Category Theory, you can understand what they are about. Let us look at these three entities together.
Functors: The Functor is symbolized by F. It maps object a to object F a and it maps object b to object F b. We can consider a Functor to be a way to take a function a -> b and get a corresponding function F a -> F b. We can also consider a Functor to be a way to take a function a -> b and the object F a and get the corresponding object F b.
Applicatives: Just from this definition we see that the Functor F maps a morphism a -> b to the morphism F(a -> b). We can consider an Applicative to be a way to take the mapping of a function a -> b and get a corresponding function F a -> F b. We can also consider an Applicative to be a way to take the mapping of a function a -> b and the object F a and get the corresponding object F b.
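A minimal sketch of this in Haskell, using the standard Applicative operator <*>, which takes an elevated function F (a -> b) and an elevated value F a and gives an elevated value F b:

```haskell
-- The function itself lives inside the Functor:
apMaybe :: Maybe Int
apMaybe = Just (+ 1) <*> Just 41

-- For lists, every elevated function is applied to every elevated value:
apList :: [Int]
apList = [(+ 1), (* 2)] <*> [10, 20]
```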
Monads: We see that we have a Functor M that maps object a to object M a and object b to object M b. We can consider a Monad to take an elevated type M a and feed it (force it, give it) as input to a function that takes a simple type a and produces an elevated type M b. The output will of course be of type M b. We can also consider a Monad as the way to force the elevated output M a of a function to the simple input of a function that produces an elevated output. It is important to note that “elevated” means types that can contain whatever side effects we want them to. By providing bind, which is the definition of the higher order function
M a -> (a -> M b) -> M b
we provide the way for functions of type X -> M Y to produce output with side effects and at the same time obey the laws that Category Theory imposes, i.e. adherence to identities and composability. And since “bind” and “join” are equivalent, we can also consider a Monad to be a way of flattening the realm of the Functor M.
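As a sketch of this composition of “impure” functions, here are two partial functions whose failure side effect lives in Maybe, chained with bind (the function names are my own):

```haskell
-- Fails (returns Nothing) on division by zero:
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Fails on odd numbers:
halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

-- bind forces the elevated output of the first function
-- into the simple input of the second:
pipeline :: Int -> Maybe Int
pipeline n = safeDiv 100 n >>= halve
```

The side effect (possible failure) propagates automatically through the chain, without any explicit case analysis on Nothing.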
By knowing a little bit of Category Theory, we are able to give meaning to the entities that Haskell uses. Functors seem familiar, because we understand objects, morphisms and their mapping from one Category to another. Without any talk about Applicatives, we immediately understood that their difference from Functors is that instead of a morphism as their input, they have the Functor's mapping of a morphism as their input. Thus, both the inputs and outputs F(a -> b), F a and F b belong in the Functor's destination Category. By knowing about Monads, we understand that M is an Endofunctor, thus it has the same Category as source and destination, and Natural Transformations such as “return”, “join” and “bind” are available.
It is much better to be able to “visualize” and to think about entities from Category Theory, like objects (dots, types), morphisms (arrows, functions, mappings, transformations), Categories, Functors, Natural Transformations and Monads, when we deal with Haskell. It makes everything clearer. In Haskell, I always try to understand what is transformed into what, what gets mapped into what, what changes into what. And I always try to understand what the underlying objects, morphisms, Categories, etc. are. Only then do I feel that I have a clear understanding of the workings of a Haskell program.
Monads are strange beasts and it is difficult to cover them exhaustively. I tried to give you an idea of Category Theory and Monads in order for you to get a better understanding of Haskell. The more you study Category Theory, the more you will begin to appreciate Monads, as well as the theory itself and all the entities it defines. In this blog post, I just tried to show that Monads in Category Theory and Monads in Haskell are the same thing, although they do not appear to be at first “sight”. And yes, there are a lot of things to be said about Monads, both in Category Theory and in Haskell, that I have not covered here. Hopefully, this blog post, along with my previous two, will help you get a first grasp of Monads and get you on the road to studying them in a more advanced manner.
Just as I previously presented the two schools of thought about the source and destination Categories concerning Functors in functional languages, I would like to finish by presenting the two schools of thought about the name “Monad”. There is the notion that the word “Monad” comes from “Monoid”, which is another entity of Category Theory. Indeed, Monads can be defined and described from Monoids, which is beyond the scope of this blog post. Also, Monads can be defined and described from Adjunctions, which are yet another entity in Category Theory and also beyond the scope of this blog post. Learning how Monads are derived from Monoids, or how Monads are derived from Adjunctions, will help you understand Monads even better. But, as I wrote, these matters are beyond the scope of this blog post.
So, one school of thought says that the name “Monad” is derived from the name “Monoid”, since a Monoid leads to the definition of a Monad. I respectfully disagree. I believe that the name “Monad” does not derive from the name “Monoid”. I believe that the name “Monoid” means “single”, “something that is alone”, “something that is by itself”, “something that is only one by counting”. I also believe that the name “Monad” means “God”.
As crazy as it sounds, I think that Monads were named as such because they have the remarkable ability of producing whole Algebras by themselves (a topic also beyond the scope of this blog post). Thus they are like God. Just by themselves, they produce whole mathematical entities. So, I think that “Monad” means “something that is by itself, not just by counting, but because it needs nothing else, there is nothing else like it and it acts like God”. But feel free to doubt me.
Finally, I would like to give you a heads up. I just wrote that Monads can also be described via Monoids (and also Adjunctions). There is a widely used definition of Monads that was first given by Saunders Mac Lane. The definition is as follows: “A monad is a monoid in the category of endofunctors”. And it is widely quoted and used. As a matter of fact, Saunders Mac Lane’s exact phrase was: “All told, a monad in X is just a monoid in the category of endofunctors of X, with product x replaced by composition of endofunctors and unit set by the identity endofunctor”. It is not difficult to explain what all this means, but the explanation is beyond the scope of this article. Of course, Saunders Mac Lane wrote this phrase with the best of intentions. And when he wrote it, he previously explained why it holds true. The problem is that people use it by itself and without any explanation in order to answer the question: “What is a Monad?”.
Thus, when newcomers to Category Theory or functional programming ask “What is a Monad?”, all they get is the answer “A monad is a monoid in the category of endofunctors” and nothing else. Well, alone and by itself this answer means nothing and should not be accepted.
In all honesty, this has not happened to me, but I have heard that it has happened to other newcomers in these fields. I would like to expose this ugly phenomenon and state that such an answer is unacceptable. If someone answers that “A monad is a monoid in the category of endofunctors”, this is like mocking you; this is like telling you to “buzz off”. If someone wants to answer the question “What is a Monad?”, at a minimum she should say that “A Monad is an entity from Category Theory with great significance in Category Theory and in functional programming, where it allows the composition of impure functions, meaning functions that contain side effects”.