Tuples vs. records in Haskell

Title: Tuples vs. records in Haskell
Alternative title: How to do Object-Oriented Programming in Haskell

Many newcomers to Haskell learn about the language's support for tuples and immediately fall victim to them. This is because they use tuples not only where tuples are a good fit but also in place of records, another Haskell construct that is far more useful and powerful.

To cut a long story short, in most cases we should use records instead of tuples. Many people do not know about records, so they use tuples everywhere, a practice that produces awkward programs and makes life hard for the programmer.

In this blog post I will describe tuples and records. I will explain why records are way more powerful and should be the default, normal and usual choice of any Haskell programmer. I will also explain how to do Object-Oriented Programming in Haskell using records as objects (as structures, rather). And I will show a great Haskell technique that lets you create a list of objects (records) from a string, thus showing that Haskell can function as a dynamic programming language.

Tuples

OK, let us start with tuples. Tuples are different from lists. A list contains a variable number of things that are all of the same type. A list is denoted by []. A tuple contains a fixed number of things that can be of different types. A tuple is denoted by (). Let me explain.

The following is a valid Haskell list:

[2,1,4]

It is a list of three integers. The following is also a valid Haskell list:

["cat","dog","cow","chicken","pig","science"]

It is a list of six strings. The following is not a valid Haskell list:

[2,"hello"] -- Not a valid Haskell list

It is not a valid Haskell list, because a list contains things that are of the same type. Thus, we can have a list of integers or a list of strings, but we cannot have both integers and strings in the same list.

Now, a tuple contains things that can be of different types, but a specific tuple type always has the same number of elements, with the same types in the same positions. Again, let me explain.

Here is a valid Haskell tuple containing two elements:

("hi",1)

This tuple contains a string and an integer. Now, for another tuple to be considered of the same type, it too has to contain a string and an integer in the same order. Also, we can have a list of tuples. The following is a valid Haskell list:

[("mountain",100), ("lake", 200), ("river",300)]

because all tuples in the list are of the same type. But the following is not a valid Haskell list:

[("mountain",100), ("lake", 200), ("river",300), ("forest", "valley")] -- Not a valid Haskell list

It is not a valid Haskell list because the last tuple is not like the previous ones. Instead of containing one string and one integer, it contains two strings.

Just as we can have a list of tuples, a tuple can have a list as an element. The following is a valid Haskell list:

[("mountain",[100]), ("lake", [100,200]), ("river",[100,200,300]), ("forest",[])]

which means that all tuples inside the list are of the same type. The following is not a valid Haskell list:

[("mountain",[100]), ("lake", [100,200]), ("river",[100,200,300]), ("forest",["valley"])] -- Not a valid Haskell list

because the last tuple is not of the same type as the others in the list. The last tuple has a list of strings as its second element, whereas all other tuples have a list of integers as their second element.
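The rules above can be made explicit by writing the types down. The following is a minimal sketch; the names places and placesWithLists are mine, chosen only for illustration, and Int is an assumption for the type of the numbers.

```haskell
-- A list of tuples: every element must have the same tuple type.
places :: [(String, Int)]
places = [("mountain", 100), ("lake", 200), ("river", 300)]

-- A tuple may hold a list. Every tuple in the outer list still has the
-- same type; only the lengths of the inner lists differ.
placesWithLists :: [(String, [Int])]
placesWithLists =
  [ ("mountain", [100])
  , ("lake", [100, 200])
  , ("river", [100, 200, 300])
  , ("forest", [])
  ]

-- Adding ("forest", "valley") to places would be rejected by the compiler,
-- because its type is (String, String) and not (String, Int).
```

Loading this into GHCi and asking for :type places shows the tuple type that every element of the list has to share.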

So far we have seen only tuples of two elements. But a type of a tuple can be made to have from two to any number of elements, as follows:

("Dimitrios", "Kalemis", "Male", "1968-10-22", 1.75, "Anarchist", "Atheist")

This tuple has seven elements: a string that can contain a person’s first name, a string that can contain a person’s last name, a string that can contain a person’s gender, a string that can contain a person’s birth date, a float that can contain a person’s height (mine is in meters and without wearing high stiletto heels), a string that can contain a person’s political views, and a string that can contain a person’s religion.

As can be seen, tuples provide a way to group attributes about an object and thus can be used the way structures are used in the C programming language. But I would advise against it. Instead of using Haskell tuples as C structures, we should use Haskell records as C structures, as I am going to explain in the rest of this blog post.

So, what is it that makes tuples inadequate for use as C structures? Well, before I explain, I just want to say that I do not forbid you to use tuples, I am just trying to protect you. Tuples are not as versatile as records in Haskell. I am going to address the concept of records in the following section. For now, let me pose a question: How are you going to obtain, say, my date of birth, from the previous tuple?

Well, Haskell provides the functions fst and snd that obtain the first and second element respectively from a tuple that has two elements. These functions do not work on a tuple that has more than two elements. And even if they did, my date of birth is in the fourth position. So, no luck. The answer is that to obtain an element from a tuple that contains more than two elements, you either have to write some code (like: getFourthElement (_,_,_,a,_,_,_) = a) or use an existing library (like: Data.Tuple.Select) that addresses this need. And even if you follow one of these approaches, you will still have to address the tuple elements by position alone. You will have to constantly remind yourself that the birth date is the fourth element, in order to obtain it. But if you use records instead of tuples, you will be able to forgo all these nuisances.
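To make the hand-written approach concrete, here is a minimal sketch of such an accessor for the seven-element tuple shown earlier. The names PersonTuple, me and birthDateOf are hypothetical, introduced only for this illustration.

```haskell
-- Hypothetical type alias for the seven-element tuple shown earlier.
type PersonTuple = (String, String, String, String, Float, String, String)

me :: PersonTuple
me = ("Dimitrios", "Kalemis", "Male", "1968-10-22", 1.75, "Anarchist", "Atheist")

-- The pattern has to spell out all seven positions, and the element we
-- want is identified only by where it sits in the tuple.
birthDateOf :: PersonTuple -> String
birthDateOf (_, _, _, birthDate, _, _, _) = birthDate
```

Note that if an element is ever added to or removed from the tuple, every such pattern in the program has to be rewritten, which is exactly the maintenance burden that records avoid.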

Actually, there is another bad technique that people use in order to “stretch” the concept of tuples. Since you can have tuples inside tuples, people do something like the following:

("Dimitrios", ("Kalemis", ("Male", ("1968-10-22", (1.75, ("Anarchist", ("Atheist")))))))

To get to my birth date, you would have to do the following:

fst . snd . snd . snd $ ("Dimitrios", ("Kalemis", ("Male", ("1968-10-22", (1.75, ("Anarchist", ("Atheist")))))))

What is going on here? Well, all tuples here are tuples with two elements, but they are deeply nested. It is a bad technique because, still, the problem remains: You have to constantly remind yourself that the birth date is the fourth element, in order to obtain it.

So, let us see what records are in Haskell and why they are better suited to be used as C structures.

Records

All right! Records in Haskell! Let us jump right in! Here is a record in Haskell:

data MyRecord = MyRecord { firstName :: String
                         , lastName :: String
                         , gender :: String
                         , dateOfBirth :: String
                         , height :: Float
                         , politicalViews :: String
                         , religion :: String
                         } deriving (Read, Show)

You are allowed to write the previous definition on a single line, but I find that this presentation is easier to read and understand. As you can see, each one of the record’s elements has a corresponding identifier/name, so that we do not have to remember each element’s position in the record, and so that we can retrieve each element in a straightforward and easy manner. At the end we add the deriving (Read, Show) clause, in order to let Haskell provide default read and show handlers for our record.
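As a small sketch of how such a record is used, the following repeats the definition, builds one value and reads one field back. The names me and myBirthDate are mine and serve only as an illustration.

```haskell
data MyRecord = MyRecord { firstName :: String
                         , lastName :: String
                         , gender :: String
                         , dateOfBirth :: String
                         , height :: Float
                         , politicalViews :: String
                         , religion :: String
                         } deriving (Read, Show)

-- Build a value by naming each field; the fields may be given in any order.
me :: MyRecord
me = MyRecord { firstName = "Dimitrios"
              , lastName = "Kalemis"
              , gender = "Male"
              , dateOfBirth = "1968-10-22"
              , height = 1.75
              , politicalViews = "Anarchist"
              , religion = "Atheist"
              }

-- Each field name doubles as an accessor function, so no position
-- has to be remembered.
myBirthDate :: String
myBirthDate = dateOfBirth me
```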

Here is a program that demonstrates the use of records in Haskell:

module Main where

data MyObject = MyObject { myField1 :: String
                         , myField2 :: String
                         , myField3 :: Int
                         , myField4 :: Int
                         } deriving (Read, Show)

main :: IO ()
main  =
   do
      putStrLn "Program begins."

      let myInstance1 = MyObject {myField1 = "One", myField2 = "111", myField3 = 10, myField4 = 100}
      let myInstance2 = MyObject {myField1 = "Two", myField2 = "222", myField3 = 20, myField4 = 200}
      let myInstance3 = MyObject {myField1 = "Three", myField2 = "333", myField3 = 30, myField4 = 300}

      print (myField1 myInstance1)
      print (myField2 myInstance1)
      print (myField3 myInstance1)
      print (myField4 myInstance1)

      print (myField1 myInstance2)
      print (myField2 myInstance2)
      print (myField3 myInstance2)
      print (myField4 myInstance2)

      print (myField1 myInstance3)
      print (myField2 myInstance3)
      print (myField3 myInstance3)
      print (myField4 myInstance3)

      putStrLn "Program ends."

In the previous program, I name the record as MyObject, to signify that this is as close to objects as we can get in Haskell. Actually, a record is like a C structure. It brings together different attributes. But a record is not like a C++ class, because it can only hold attributes (data) about an imaginary object; it cannot hold functions that manipulate these attributes/data. Thus, we cannot really do Object-Oriented Programming in Haskell. All we can do is create records that hold together the data that makes up an object. In Haskell, we have to operate on that data using functions that are not part of the records/objects. Thus, the word “Object” for a Haskell record is a misnomer. Still, a Haskell record is as close as we can get to an object. To be precise, a Haskell record is like a C structure.
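To illustrate what operating on the data from outside the record can look like, here is a minimal sketch. The functions total and withField3 and the value example are hypothetical names of my own. The sketch also uses record update syntax, which returns a modified copy instead of changing the original value.

```haskell
data MyObject = MyObject { myField1 :: String
                         , myField2 :: String
                         , myField3 :: Int
                         , myField4 :: Int
                         } deriving (Read, Show)

-- What would be a method in C++ is an ordinary function that takes the
-- record as an argument.
total :: MyObject -> Int
total obj = myField3 obj + myField4 obj

-- Record update syntax: builds a new record that differs in one field.
withField3 :: Int -> MyObject -> MyObject
withField3 n obj = obj { myField3 = n }

example :: MyObject
example = MyObject { myField1 = "One", myField2 = "111", myField3 = 10, myField4 = 100 }
```

After evaluating withField3 20 example, the value example itself is unchanged; values in Haskell are immutable, so an "updated object" is always a new value.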

In the previous program, with the definition data MyObject = … we loosely create what Object-Oriented Programming calls a class. It is the mold out of which individual instances of our object will be made. Each one of our objects will have four attributes: the first two are strings and the last two are integers. In the main program, we create three instances (objects) and we give values to their attributes. Then we print each object’s attributes one by one.

As we can see, the name of each attribute is also a function that returns the value of the attribute given the instance of the object/record. But this also entails that, by default, we cannot have different types of records with the same attribute name in a module. This is a shortcoming of Haskell, although GHC offers the DuplicateRecordFields language extension to relax this restriction.
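Because an attribute name is an ordinary function, it can be passed to other functions such as map. The following is a minimal sketch that uses a trimmed-down, two-field version of MyObject (an assumption made to keep the example short); the names objects, names and sumOfField3 are mine.

```haskell
-- Trimmed-down version of MyObject, for illustration only.
data MyObject = MyObject { myField1 :: String
                         , myField3 :: Int
                         } deriving (Read, Show)

objects :: [MyObject]
objects = [ MyObject { myField1 = "One", myField3 = 10 }
          , MyObject { myField1 = "Two", myField3 = 20 }
          ]

-- myField1 has type MyObject -> String, so it can be mapped over a list.
names :: [String]
names = map myField1 objects

-- Accessors combine with other functions like any other function.
sumOfField3 :: Int
sumOfField3 = sum (map myField3 objects)
```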

What the previous program demonstrates is that we can get a specific attribute from a specific instance in a straightforward manner.

In the rest of this blog post, we will treat the handling of records in a more dynamic manner. First of all, we will create a list of records and add records to it.

module Main where

import Data.List

data MyObject = MyObject { myField1 :: String
                         , myField2 :: String
                         , myField3 :: Int
                         , myField4 :: Int
                         } deriving (Read, Show)

main :: IO ()
main  =
   do
      putStrLn "Program begins."

      let myInstance1 = MyObject {myField1 = "One", myField2 = "111", myField3 = 10, myField4 = 100}
      let myInstance2 = MyObject {myField1 = "Two", myField2 = "222", myField3 = 20, myField4 = 200}
      let myInstance3 = MyObject {myField1 = "Three", myField2 = "333", myField3 = 30, myField4 = 300}

      let myListOfMyObjects = [myInstance1, myInstance2, myInstance3]

      putStrLn "Initial list:"
      print myListOfMyObjects

      putStrLn "Length of initial list:"
      print (length myListOfMyObjects)
      putStrLn "Second attribute of the first object of the list:"
      print (myField2 (myListOfMyObjects !! 0))
      putStrLn "Fourth attribute of the second object of the list:"
      putStrLn (show (myField4 (myListOfMyObjects !! 1)))

      let myInstance4 = MyObject {myField1="Four", myField2 = "444", myField3 = 40, myField4 = 400}
      let myNewList = myInstance4 : myListOfMyObjects

      putStrLn "New list:"
      print myNewList

      let myNewList' = sortBy (\x y -> compare (myField2 x) (myField2 y)) myNewList

      putStrLn "New list, sorted by the second attribute:"
      print myNewList'

      putStrLn "Program ends."

As the previous program demonstrates, you can do a great deal of things with records. You can create a list of records and access any record in the list and any attribute of that record. You can add records to a list of records and you can sort the list by any attribute of the record you want.

In the previous program, I used import Data.List in order to use the sortBy function. I also used the show function to convert a numeric attribute of a record to a string before printing it.
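As an aside, the comparison lambda passed to sortBy can be written more compactly. The following sketch assumes the standard comparing function from Data.Ord and the sortOn function from Data.List; the names sample, byField2 and byField3 are hypothetical. I also derive Eq here, which the original program does not need, only so that results can be compared.

```haskell
import Data.List (sortBy, sortOn)
import Data.Ord (comparing)

data MyObject = MyObject { myField1 :: String
                         , myField2 :: String
                         , myField3 :: Int
                         , myField4 :: Int
                         } deriving (Eq, Read, Show)

sample :: [MyObject]
sample = [ MyObject "Two" "222" 20 200
         , MyObject "One" "111" 10 100
         , MyObject "Three" "333" 30 300
         ]

-- Same ordering as sortBy (\x y -> compare (myField2 x) (myField2 y)).
byField2 :: [MyObject] -> [MyObject]
byField2 = sortBy (comparing myField2)

-- sortOn takes the key function directly.
byField3 :: [MyObject] -> [MyObject]
byField3 = sortOn myField3
```

Which form to use is a matter of taste; the explicit lambda shows what is being compared, while comparing and sortOn keep the call short.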

Now let me present a nice looking program that explores lists of records (objects) in great detail. The program follows:

module Main where

import Data.List

data MyObject = MyObject { myField1 :: String
                         , myField2 :: String
                         , myField3 :: Int
                         , myField4 :: Int
                         } deriving (Read, Show)

main :: IO ()
main  =
   do
      putStrLn "Program begins."

      let initialList = createTheList
      putStrLn "=== Initial list printed by default ==="
      print initialList
      putStrLn "=== Initial list printed specifically ==="
      printTheList initialList

      putStrLn "=== Sorted by field1 ascending ==="
      let list1asc = sortBy (\x y -> compare (myField1 x) (myField1 y)) initialList
      printTheList list1asc
      putStrLn "=== Sorted by field1 descending ==="
      let list1desc = sortBy (\x y -> flip compare (myField1 x) (myField1 y)) initialList
      printTheList list1desc

      putStrLn "=== Sorted by field2 ascending ==="
      let list2asc = sortBy (\x y -> compare (myField2 x) (myField2 y)) initialList
      printTheList list2asc
      putStrLn "=== Sorted by field2 descending ==="
      let list2desc = sortBy (\x y -> flip compare (myField2 x) (myField2 y)) initialList
      printTheList list2desc

      putStrLn "=== Sorted by field3 ascending ==="
      let list3asc = sortBy (\x y -> compare (myField3 x) (myField3 y)) initialList
      printTheList list3asc
      putStrLn "=== Sorted by field3 descending ==="
      let list3desc = sortBy (\x y -> flip compare (myField3 x) (myField3 y)) initialList
      printTheList list3desc

      putStrLn "=== Sorted by field4 ascending ==="
      let list4asc = sortBy (\x y -> compare (myField4 x) (myField4 y)) initialList
      printTheList list4asc
      putStrLn "=== Sorted by field4 descending ==="
      let list4desc = sortBy (\x y -> flip compare (myField4 x) (myField4 y)) initialList
      printTheList list4desc

      putStrLn "==========="
      putStrLn "Program ends."

createTheList :: [MyObject]
createTheList  =
   [ MyObject {myField1 = "One",   myField2 = "111", myField3 = 20, myField4 = 300}
   , MyObject {myField1 = "Two",   myField2 = "222", myField3 = 10, myField4 = 200}
   , MyObject {myField1 = "Three", myField2 = "333", myField3 = 30, myField4 = 400}
   , MyObject {myField1 = "Four",  myField2 = "444", myField3 = 50, myField4 = 500}
   , MyObject {myField1 = "Five",  myField2 = "555", myField3 = 40, myField4 = 100}
   ]

printTheList       :: [MyObject] -> IO ()
printTheList []     =
   do
      return ()
printTheList (x:xs) =
   do
      let myString = (myField1 x) ++ " - " ++
                     (myField2 x) ++ " - " ++
                     (show (myField3 x)) ++ " - " ++
                     (show (myField4 x))
      putStrLn myString
      printTheList xs

In the previous program, we have a list of five objects that the function createTheList provides to the main program. We also create a function printTheList that prints a list of MyObjects in a nice format. Then we sort the list by each field, in both ascending and descending order. Although it is possible and encouraged to use a lambda like (\x y -> compare (myField1 y) (myField1 x)) in order to sort the list by myField1 descending, I use the lambda (\x y -> flip compare (myField1 x) (myField1 y)) which is equivalent. All that the function flip does is swap the order of the two arguments.
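The equivalence of the two lambdas can be checked with a small sketch. It uses a trimmed-down, two-field MyObject (an assumption made for brevity) and, as a third variant that the program above does not use, the Down wrapper from Data.Ord. The names sample, descManual, descFlip and descDown are mine.

```haskell
import Data.List (sortBy, sortOn)
import Data.Ord (Down (..))

-- Trimmed-down version of MyObject, for illustration only.
data MyObject = MyObject { myField1 :: String
                         , myField3 :: Int
                         } deriving (Eq, Read, Show)

sample :: [MyObject]
sample = [ MyObject "One" 20, MyObject "Two" 10, MyObject "Three" 30 ]

-- Descending order by swapping the arguments by hand.
descManual :: [MyObject] -> [MyObject]
descManual = sortBy (\x y -> compare (myField3 y) (myField3 x))

-- flip compare a b is the same as compare b a.
descFlip :: [MyObject] -> [MyObject]
descFlip = sortBy (\x y -> flip compare (myField3 x) (myField3 y))

-- Down reverses the ordering of the value it wraps.
descDown :: [MyObject] -> [MyObject]
descDown = sortOn (Down . myField3)
```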

Of course, you can provide any lambda you can think of to sortBy. For example, you can create a record type that has a list as one of its attributes and then create a list of those records. You can then sort the list of records by the length of the list that each record has.
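Here is a minimal sketch of that idea. The record type Place, its fields placeName and elevations, and the function byElevationCount are hypothetical names invented for this example.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)

-- Hypothetical record type with a list as one of its attributes.
data Place = Place { placeName :: String
                   , elevations :: [Int]
                   } deriving (Read, Show)

places :: [Place]
places = [ Place { placeName = "river", elevations = [100, 200, 300] }
         , Place { placeName = "forest", elevations = [] }
         , Place { placeName = "lake", elevations = [100, 200] }
         ]

-- Order the records by how many elements each record's list holds.
byElevationCount :: [Place] -> [Place]
byElevationCount = sortBy (comparing (length . elevations))
```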

Creating lists from strings

So far I have demonstrated that we can easily write programs in Haskell that have records and lists of records. To make these programs more dynamic, I am going to demonstrate a Haskell technique that allows us to create a list from a string. And this list can be an ordinary list, or a list of records. The following program is a proof of this concept:

module Main where

data MyObject = MyObject{myField1 :: Int, myField2 :: Int} deriving (Read, Show)

main :: IO ()
main  =
   do
      putStrLn "Program begins."

      let myList1 = toList
      print myList1

      let myList2 = toMyObjectList
      print myList2

      putStrLn "Program ends."

toList :: [Integer]
toList  = read "[1,2,3,4]"

toMyObjectList :: [MyObject]
toMyObjectList  = read "[MyObject{myField1=10,myField2=100},MyObject{myField1=20,myField2=200}]"

As we see in the previous program, we can create a list from a string. The list’s elements can be of a built-in type (here integers) or of a user-defined type (here MyObjects).
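One caveat is that read stops the program with an error when the string cannot be parsed. If the string comes from outside the program, a more forgiving variant is readMaybe from Text.Read, which returns Nothing on failure. The following is a minimal sketch; parseObjects, good and bad are hypothetical names, and Eq is derived only so that results can be compared.

```haskell
import Text.Read (readMaybe)

data MyObject = MyObject{myField1 :: Int, myField2 :: Int} deriving (Eq, Read, Show)

-- Returns Nothing instead of raising an error when the string does not parse.
parseObjects :: String -> Maybe [MyObject]
parseObjects = readMaybe

good :: Maybe [MyObject]
good = parseObjects "[MyObject{myField1=10,myField2=100}]"

-- The second field is missing, so this string does not parse.
bad :: Maybe [MyObject]
bad = parseObjects "[MyObject{myField1=10}]"
```

The caller can then pattern match on Just and Nothing and decide what to do with bad input.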

Now I will present again the nice looking program (that I presented at the end of the previous section) that explores lists of records (objects) in great detail. But this time I will change the function createTheList to create the list from a string. In that string, the backslash at the end of a line and the backslash at the start of the next line form a string gap, which lets the string literal continue across lines without adding any characters to it. The \" sequences are escaped string quotes.

module Main where

import Data.List

data MyObject = MyObject { myField1 :: String
                         , myField2 :: String
                         , myField3 :: Int
                         , myField4 :: Int
                         } deriving (Read, Show)

main :: IO ()
main  =
   do
      putStrLn "Program begins."

      let initialList = createTheList
      putStrLn "=== Initial list printed by default ==="
      print initialList
      putStrLn "=== Initial list printed specifically ==="
      printTheList initialList

      putStrLn "=== Sorted by field1 ascending ==="
      let list1asc = sortBy (\x y -> compare (myField1 x) (myField1 y)) initialList
      printTheList list1asc
      putStrLn "=== Sorted by field1 descending ==="
      let list1desc = sortBy (\x y -> flip compare (myField1 x) (myField1 y)) initialList
      printTheList list1desc

      putStrLn "=== Sorted by field2 ascending ==="
      let list2asc = sortBy (\x y -> compare (myField2 x) (myField2 y)) initialList
      printTheList list2asc
      putStrLn "=== Sorted by field2 descending ==="
      let list2desc = sortBy (\x y -> flip compare (myField2 x) (myField2 y)) initialList
      printTheList list2desc

      putStrLn "=== Sorted by field3 ascending ==="
      let list3asc = sortBy (\x y -> compare (myField3 x) (myField3 y)) initialList
      printTheList list3asc
      putStrLn "=== Sorted by field3 descending ==="
      let list3desc = sortBy (\x y -> flip compare (myField3 x) (myField3 y)) initialList
      printTheList list3desc

      putStrLn "=== Sorted by field4 ascending ==="
      let list4asc = sortBy (\x y -> compare (myField4 x) (myField4 y)) initialList
      printTheList list4asc
      putStrLn "=== Sorted by field4 descending ==="
      let list4desc = sortBy (\x y -> flip compare (myField4 x) (myField4 y)) initialList
      printTheList list4desc

      putStrLn "==========="
      putStrLn "Program ends."

createTheList :: [MyObject]
createTheList  = read "\
   \[ MyObject {myField1 = \"One\",   myField2 = \"111\", myField3 = 20, myField4 = 300}\
   \, MyObject {myField1 = \"Two\",   myField2 = \"222\", myField3 = 10, myField4 = 200}\
   \, MyObject {myField1 = \"Three\", myField2 = \"333\", myField3 = 30, myField4 = 400}\
   \, MyObject {myField1 = \"Four\",  myField2 = \"444\", myField3 = 50, myField4 = 500}\
   \, MyObject {myField1 = \"Five\",  myField2 = \"555\", myField3 = 40, myField4 = 100}\
   \]"

printTheList       :: [MyObject] -> IO ()
printTheList []     =
   do
      return ()
printTheList (x:xs) =
   do
      let myString = (myField1 x) ++ " - " ++
                     (myField2 x) ++ " - " ++
                     (show (myField3 x)) ++ " - " ++
                     (show (myField4 x))
      putStrLn myString
      printTheList xs
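To look at the string gap in isolation, here is a minimal sketch with a trimmed-down, two-field MyObject (an assumption made for brevity). It also shows that the text produced by show can be read back by the derived Read instance. The names gapped, fromGapped and roundTrip are mine.

```haskell
-- Trimmed-down version of MyObject, for illustration only.
data MyObject = MyObject { myField1 :: String, myField3 :: Int } deriving (Eq, Read, Show)

-- A string gap: backslash, whitespace (including the newline), backslash.
-- The gap itself contributes no characters to the string.
gapped :: String
gapped = "[ MyObject {myField1 = \"One\", myField3 = 20}\
         \, MyObject {myField1 = \"Two\", myField3 = 10} ]"

fromGapped :: [MyObject]
fromGapped = read gapped

-- show produces text that the derived Read instance accepts again.
roundTrip :: [MyObject]
roundTrip = read (show fromGapped)
```

This round trip is what makes the technique practical: a list of records can be written out with show, stored as text, and later turned back into records with read.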

About Dimitrios Kalemis

I am a systems engineer specializing in Microsoft products and technologies. I am also an author. Please visit my blog to see the blog posts I have written, the books I have written and the applications I have created. I definitely recommend my blog posts under the category "Management", all my books and all my applications. I believe that you will find them interesting and useful. I am in the process of writing more blog posts and books, so please visit my blog from time to time to see what I come up with next. I am also active on other sites; links to those you can find in the "About me" page of my blog.