Chapter 6 Collections of Data
Section 6.1 Data Structures
So far, we’ve only dealt with single values at a time. All our functions take in single value arguments and return single values. Sometimes, we need to be able to work with more than just single values. In theses times, we need a Data Structure.
A data structure is a combination of multiple values, the way those values are arranged, and a set of rules on how we can interact with those values. There are many different kinds of data structures used in computer science, each with their own costs and benefits. Understanding how they work allows us to pick the best one for a particular task.
Before we get into our first programming data structure, let’s look at a non-cs related example. Consider how two different people store their clean clothes. We’ll call these people Lisa and Bart.
- Lisa stores all her clothes in her dresser, and she has different drawers for socks, underwear, shirts and pants.
- Bart stores all his clothes in a pile on the floor.
Both Lisa and Bart store their clothes in a single place, but the ways the clothes are organized and how one interacts with those clothes are quite different. Lisa’s clothes are well organized, and when she wants to get a clean shirt, she can go directly to the shirt drawer and find one. But this organization comes at a cost. Whenever Lisa gets new clean clothes, she has to take the time to fold them and put them in the right place.
When Bart wants a clean shirt, he has to pull clothes off the pile until he finds one. This is certainly less effective that Lisa’s drawer method. But, when Bart gets new clean clothes, he can just empty the laundry basket on the floor and move on with his day.
Section 6.2 Linked Lists
Racket is a descendant of the Lisp programming language. Lisp is an acronym for LIST Processing. So it is fair to say that Lists are fairly important to the development of Racket. The lists we will work with are specifically called linked lists. A linked list collects values in nodes, where each node has two parts:
- A particular value.
- A link to the next node.
Section 6.3 Linked Lists in Racket
In Racket, there are many functions that work with lists. Let’s start with some basic list manipulation functions:
list
: Takes 0 or more arguments and returns a list containing those arguments as elements.(list)
returns the empty list:'()
(list 1 2 3)
returns the list:'(1 2 3)
(list 1 #t "happy" -9.4)
returns the list:'(1 #t "happy" -9.4)
first
: Takes a list as an argument and returns the value stored in the first node of that list.first
will result in an error if you try to use it on the empty list.(first (list 1 2 3))
returns:1
(first (list "hello" 2))
returns:"hello"
(first (list (list 1 2) 3))
returns:'(1 2)
rest
: Takes a list as an argument and returns node that the first element of the list is linked to. This will be the entire list after the first element.rest
will result in an error if you try to use it on the empty list.(rest (list 1 2 3))
returns:'(2 3)
(rest (list 2))
returns:'()
(rest (list (list 1 2) 3))
returns:'(3)
Subsection 6.3.1 build-list
build-list
is a function that, as you can probably guess, builds a list. But how it builds a list is where it gets interesting.$ (build-list 5 sqrt)
> '(0 1 1.4142135623730951 1.7320508075688772 2)
(build-list n func)
takes a positive integer, n
and a function func
, and creates a list of n
elements generated by performing func
on each integer starting a t 0 and going. to n-1. (build-list 5 sqrt)
(list (sqrt 0) (sqrt 1) (sqrt 2) (sqrt 4) (sqrt 5))
The second argument of This code will generate the list
build-list
must be a function. Sometimes (often) you’ll need to provide your own function instead of relying on one of Racket’s built in ones. Let’s say you wanted to write a list that contains the powers of 2. There’s no powersOfTwo function in Racket, so we’d have to build our own. (define powersOfTwo
(lambda (n)
(expt 2 n)))
(build-list 5 powersOfTwo)
(1 2 4 8 16)
, but it’s worth asking Did I need to write and define powersOfTwo? The only reason we needed that function was because Racket didn’t have the function we needed. So do we have to write helper functions every time we need a custom build-list
?Yes,
but...
We don’t need to define a new function every time.
You may vaguely recall that we we introduced writing functions,
lambda
was described as something that creates and returns an anonymous function. Because we can pass functions as values, we can then pass the return value of lambda
, which is a function, into define
which will assign a name to the anonymous function just created. At the time, we didn’t have a good use for anonymous functions, so we kept using define
every time we made a function with lambda
.But now we do have a good use for anonymous functions! Why spend time defining a function to use in These are all valid
build-list
if that’s the only time we’re going to need it? If build-list
takes a function as a parameter, then we can use lambda
to make that function. Since we don’t need to reuse it, it can be anonymous. Turns out, anonymous functions are incredibly useful for list processing! (build-list 5 (lambda (n) (expt 2 n)))
(build-list 5 (lambda (n) (* 3 n)))
(build-list 5 (lambda (n) n))
build-list
function calls that use lambda
in place of a named function. The three lines of code above would produce: '(1 2 4 8 16)
'(0 3 6 9 12)
'(0 1 2 3 4)
It is worth noting that even though
func
must be a function with one parameter, you don’t have to use that parameter. The function call below will make a list of 5 random integers between 0 and 10. In this case, the argument 5
is there to provide the size of the list, but the values 0 - 4 are not used. (build-list 5 (lambda (x) (random 10)))
Section 6.4 Manipulating Lists
list
or build-list
, but how do we work with them after we’ve made them? There are two major ways to approach list manipulation in Racket. One was involved using built in functions that work with an entire list at once, the other is to work with each list node individually. We’ll explore both options, but let’s start with the build in functions.
Subsection 6.4.1 filter
filter
is a function that we use to filter out (or more accurately, in) certain values from a list. (define g (list 58 36 73 46 21 42 28 46 34 64))
g
as a list. To do this, we’d use: (filter odd? g)
build-list
one of the arguments to filter
is a function. In the example above, we’re using Racket’s odd?
function. Specifically, the function argument to filter
must be a function that takes one argument and returns a boolean value. filter
will call the provided function on every element in the provided list. Any value that results in #true
will be put into the list returned by filter
.Also like It’s important to pay attention to the syntax here.
build-list
, anonymous functions are quite useful here. Instead of getting the odd numbers, let’s get all the values in g
greater than 50: (filter (lambda (n)
(> n 50))
g)
(lambda (n) (> n 50))
is the first argument to filter
, and g
is the second. It’s a common error to accidentally add an extra )
to the end of the lambda
function, which would actually close off filter
.
Subsection 6.4.2 foldl
So far, our list functions have had reasonably good names
list first rest build-list filter
all do a great job of describing what they do. Then we have foldl
. To be fair, the name does make sense, but in a more technical way than is useful to start. In other languages, the same operation is named reduce, which might be a little more helpful. The purpose is to take a list, perform a function on the elements in the list and get a value back. Often, the returned value is a single value (as opposed to a list) so we say that the list is reduced to a single value.So how does this thing work? Here are the parameters for
foldl
: (foldl func startValue g)
func
is a function that must take exactly two arguments.startValue
is a value used the first timefunc
is called.g
is the list to work on.
Each time
func
is called, the first argument will be the next value from the list, and the second argument will be the value returned from the previous time func
was called. startValue
takes the place of the previous return value at the beginning, since there will not have been a previous value. So foldl
starts by doing: (func (first g) startValue)
Let’s look at a concrete example: To start, we’ll have: That will return That will return Which gives us the answer,
(define g (list 58 36 73))
(foldl + 0 g)
(+ 58 0) ;(first g) is 58 0 is startValue
58
(0 being the ideal startValue
for this case). The next operation will be: (+ 36 58) ;(first(rest g)) is 36, 58 was the previous return value
94
, and next: (+ 73 94)
167
.Like Table 6.1.
filter
and build-list
we can also use lambda
to help with foldl
. Here is another more complex example: (define g (list 58 73 36 46))
(foldl (lambda (value previous)
(and previous
(> value 50)))
#t
g)
value |
func |
result |
---|---|---|
58 |
(and #t (> 58 50)) |
#t |
73 |
(and #t (> 73 50)) |
#t |
36 |
(and #t (> 36 50)) |
#f |
46 |
(and #f (> 46 50)) |
#f |
Subsubsection 6.4.2.1 What’s in a Name?
So what’s up with the name
foldl
? As it turns out, there are actually two parts to the name, fold and l, which in this instance stands for left. Another way of thinking about the function is that if folds the list down, element by element, from left to right. In the examples above, the direction of the folding doesn’t matter much, because order doesn’t matter +
and and
. When order is important, there is another option, foldr
which works exactly like foldl
, except it works right to left: (foldl - 0 (list 58 73 36 46))
; this would be run as:
(46 (- 36 (- 73 (- 58 0))))
; and return:
25
(foldr - 0 (list 58 73 36 46))
; this would be run as:
(- 58 (- 73 (- 36 (- 46 0))))
; and return:
-25
Subsection 6.4.3 map
The last list processing function we’ll look at is
map
. Like filter
and foldl
, map
goes through each element of a list, performing a provided function. Each individual return value is placed into a new list that is returned: (define g (list 58 36 73 46 21 42 28 46 34 64))
(map (lambda (value)
(if (> value 50)
value
50))
g)
;this would return:
'(58 50 73 50 50 50 50 50 50 64)
If you give In fact you can give as many lists as you want, as long as the following rules are followed:
map
a function that has one parameter and a single list, it will call that function on each element of the provided list. But if you give map
a function with two parameters and two lists, it will call the function using one element from each list at a time: (define g (list 1 2 3))
(define h (list 5 7 11))
(map (lambda (value0 value1)
(* value0 value1))
g
h)
;this would return:
'(5 14 33)
- The number of list arguments to
map
must equal the number of arguments to the provided function. - Each list must be the same length.
Here is another example worked out further: Table 6.2.
(define g (list 58 73 36 46))
(define h (list 32 97 42 14))
(map (lambda (value0 value1)
(if (> value0 value1)
value0
value1))
g
h)
value0 |
value1 |
func |
result |
---|---|---|---|
58 |
32 |
(if (> 58 32) 58 32) |
'(58) |
73 |
97 |
(if (> 73 92) 73 97) |
'(58 97) |
36 |
42 |
(if (> 36 42) 36 42) |
'(58 97 42) |
46 |
14 |
(if (> 46 14) 46 14) |
'(58 97 42 46) |