I’m going to try to explain, in the simplest way I can, how Clojure’s iteration function works. It’s hard to discern how it works from the documentation alone (at least the first few times you read it).
One of the most common use cases for iteration is making paginated API calls.
A paginated API call usually returns two important data points: the actual data, and a cursor which is used to fetch the next set of data points from the sequence.
Let’s define a mock api-call.
(defn api-call [cursor]
  {:body (take 10 (drop cursor (range 100)))
   :next-cursor (when-not (<= 100 cursor) (+ cursor 10))})
(api-call 0) ; {:body (0 1 2 3 4 5 6 7 8 9), :next-cursor 10}
(api-call 10) ; {:body (10 11 12 13 14 15 16 17 18 19), :next-cursor 20}
(api-call 100) ; {:body (), :next-cursor nil}
One way to consume all the elements of the paginated API is to use a loop-recur form.
(loop [cursor 0
       acc []]
  (let [result (api-call cursor)
        body (:body result)
        cursor* (:next-cursor result)
        acc* (conj acc body)]
    (if (some? cursor*)
      (recur cursor* acc*)
      acc)))
; [(0 1 2 3 4 5 6 7 8 9)
; (10 11 12 13 14 15 16 17 18 19)
; (20 21 22 23 24 25 26 27 28 29)
; (30 31 32 33 34 35 36 37 38 39)
; (40 41 42 43 44 45 46 47 48 49)
; (50 51 52 53 54 55 56 57 58 59)
; (60 61 62 63 64 65 66 67 68 69)
; (70 71 72 73 74 75 76 77 78 79)
; (80 81 82 83 84 85 86 87 88 89)
; (90 91 92 93 94 95 96 97 98 99)]
Let’s break down the loop-recur form. This will be useful when we try to express the same thing with iteration.
(loop [cursor 0                        ; step 1. initialize the cursor
       acc []]                         ; step 2. initialize a container/accumulator
  (let [result (api-call cursor)       ; step 3. call the api with a cursor
        body (:body result)            ; step 4. extract the result/data that you care about
        cursor* (:next-cursor result)  ; step 5. get the next cursor
        acc* (conj acc body)]          ; step 6. add the result/data to the accumulator
    (if (some? cursor*)                ; step 7. check a condition to know whether to recurse or return data.
      (recur cursor* acc*)
      acc)))
The way I like to think of it is: if the iteration function is to replace the loop-recur form, it must replicate at least some of the functionality of the loop-recur form.
Let’s look at the arglist of iteration.
[step & {:keys [somef vf kf initk]}]
Its API depends on five values. All of them except initk are functions (trust me on this for now).
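A small sketch may make the arglist less abstract. The toy step function below (count-up, a name made up for this illustration) is not part of the original example. As documented for clojure.core/iteration, the options default to some? for somef, identity for vf, identity for kf, and nil for initk, so only the options that differ from the defaults need to be passed.

```clojure
;; Hypothetical toy step: returns the next number, or nil once k reaches 5.
(defn count-up [k]
  (when (< k 5)
    (inc k)))

;; Only :initk is passed. :somef, :vf and :kf fall back to their defaults
;; (some?, identity and identity), so each return value of step is both
;; the emitted value and the key for the next call.
(def counted (into [] (iteration count-up :initk 0)))

counted ; [1 2 3 4 5]
```

The iteration stops when count-up returns nil, because the default somef (some?) rejects nil.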
The things iteration needs to do are:
- Call the api-call function with a new cursor, so we NEED to pass the api-call function to it. api-call will be the value of step.
- It also needs an initial value for the cursor so it can make the first api-call. The initial value of the cursor will be the value of initk. (step 1)
- After the first call of api-call with initk (step 3), it needs to know how to extract the data from the result. The vf function will do this. (step 4)
- After the first call of api-call with initk (step 3), it needs to know how to extract the cursor from the result. The kf function will do this. (step 5)
- Finally, iteration needs to know when to terminate (step 7). We do that by passing a somef function, which should return a truthy/falsy value.
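The steps above can be sketched as a plain function. This is a simplified illustration of the idea, not the actual clojure.core implementation: reduce-pages is a hypothetical name, and the sketch ignores details such as early termination with reduced.

```clojure
;; Mock API from earlier in the post, repeated so the sketch runs on its own.
(defn api-call [cursor]
  {:body (take 10 (drop cursor (range 100)))
   :next-cursor (when-not (<= 100 cursor) (+ cursor 10))})

;; Hypothetical helper: a simplified picture of what happens when the
;; result of iteration is reduced. rf and init play the role of the
;; accumulator, which iteration leaves to the caller.
(defn reduce-pages [rf init step {:keys [somef vf kf initk]}]
  (loop [acc init
         ret (step initk)]            ; the first call uses initk
    (if (somef ret)                   ; should this result be kept?
      (let [acc (rf acc (vf ret))]    ; extract the value and accumulate it
        (if-some [k (kf ret)]         ; extract the next cursor
          (recur acc (step k))        ; call step again with the new cursor
          acc))
      acc)))

(def pages
  (reduce-pages conj [] api-call
                {:initk 0
                 :vf :body
                 :kf :next-cursor
                 :somef #(some? (:next-cursor %))}))

(count pages) ; 10
(first pages) ; (0 1 2 3 4 5 6 7 8 9)
```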
Now you might ask: what about step 2 and step 6? Doesn’t iteration need to know where and how to accumulate the results?
No, iteration leaves that up to you. It returns a value that is both seqable and reducible, and it’s up to you how you realize the results.
This means you can use any function that consumes a seqable or reducible collection, such as into, reduce, transduce, or seq.
We will use into.
(into [] (iteration ...))
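To make “it’s up to you” concrete, here is a hedged sketch of a few other ways to consume the same iteration. It repeats the mock api-call so that it runs on its own, and it uses the option values explained in the rest of this post. The names pages, total-items, sum-of-items and first-page are made up for this illustration.

```clojure
;; Mock API from earlier in the post.
(defn api-call [cursor]
  {:body (take 10 (drop cursor (range 100)))
   :next-cursor (when-not (<= 100 cursor) (+ cursor 10))})

;; Hypothetical helper that builds a fresh iteration each time it is called.
(defn pages []
  (iteration api-call {:initk 0
                       :vf :body
                       :kf :next-cursor
                       :somef #(some? (:next-cursor %))}))

;; reduce: count the items across all pages.
(def total-items
  (reduce (fn [n page] (+ n (count page))) 0 (pages)))

;; transduce: flatten the pages with cat and sum the items.
(def sum-of-items
  (transduce cat + 0 (pages)))

;; seq functions: first calls seq on the iteration and returns the first page.
(def first-page
  (first (pages)))

total-items  ; 100
sum-of-items ; 4950
first-page   ; (0 1 2 3 4 5 6 7 8 9)
```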
What does the iteration function call look like?
To recap.
(loop [cursor 0                        ; 1. initk
       acc []]                         ; 2.
  (let [result (api-call cursor)       ; 3. step
        body (:body result)            ; 4. vf
        cursor* (:next-cursor result)  ; 5. kf
        acc* (conj acc body)]          ; 6.
    (if (some? cursor*)                ; 7. somef
      (recur cursor* acc*)
      acc)))
- step corresponds to step 3
- somef corresponds to step 7
- vf corresponds to step 4
- kf corresponds to step 5
- initk corresponds to step 1
The important thing to remember is that (step initk) produces a return value; let’s call it iteration-result.
vf, kf, and somef are each called on iteration-result (vf and kf only when somef returns a truthy value).
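One way to check this flow is to record every cursor that step receives. The sketch below wraps the mock api-call in a hypothetical logged-call function. The atom seen-cursors and the name logged-call are assumptions made for this illustration, not part of the original example.

```clojure
;; Mock API from earlier in the post.
(defn api-call [cursor]
  {:body (take 10 (drop cursor (range 100)))
   :next-cursor (when-not (<= 100 cursor) (+ cursor 10))})

;; Hypothetical log of every cursor passed to step.
(def seen-cursors (atom []))

;; Hypothetical wrapper: records the cursor, then delegates to api-call.
(defn logged-call [cursor]
  (swap! seen-cursors conj cursor)
  (api-call cursor))

(def pages
  (into [] (iteration logged-call {:initk 0
                                   :vf :body
                                   :kf :next-cursor
                                   :somef #(some? (:next-cursor %))})))

@seen-cursors ; [0 10 20 30 40 50 60 70 80 90 100]
(count pages) ; 10
```

step is called 11 times but only 10 pages are kept: the call with cursor 100 returns a nil :next-cursor, so somef rejects that result and the iteration stops.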
The arguments to iteration:
- step will be api-call
- somef will be #(some? (:next-cursor %)) (remember that somef will be passed the result and not the next cursor)
- vf will be :body
- kf will be :next-cursor
- initk will be 0
(remember: keywords are functions)
(into [] (iteration api-call {:initk 0
                              :vf :body
                              :kf :next-cursor
                              :somef #(some? (:next-cursor %))}))
;=>
; [(0 1 2 3 4 5 6 7 8 9)
; (10 11 12 13 14 15 16 17 18 19)
; (20 21 22 23 24 25 26 27 28 29)
; (30 31 32 33 34 35 36 37 38 39)
; (40 41 42 43 44 45 46 47 48 49)
; (50 51 52 53 54 55 56 57 58 59)
; (60 61 62 63 64 65 66 67 68 69)
; (70 71 72 73 74 75 76 77 78 79)
; (80 81 82 83 84 85 86 87 88 89)
; (90 91 92 93 94 95 96 97 98 99)]
The nice thing about iteration returning a reducible is that we can use the cat transducer to concatenate the pages into a set, vector, or list.
(into #{} cat (iteration api-call {:initk 0
                                   :vf :body
                                   :kf :next-cursor
                                   :somef #(some? (:next-cursor %))}))
; #{0
; 65
; 70
; 62
; .
; .
; .
; 8
; 49
; 84}
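Because the result is reducible, other transducers compose with cat as well. The sketch below keeps the first five even numbers; since take stops the reduction early, only the first page should be requested. logged-call and api-calls are hypothetical names added for this illustration.

```clojure
;; Mock API from earlier in the post.
(defn api-call [cursor]
  {:body (take 10 (drop cursor (range 100)))
   :next-cursor (when-not (<= 100 cursor) (+ cursor 10))})

;; Hypothetical counter of how many times the API was called.
(def api-calls (atom 0))

;; Hypothetical wrapper: counts the call, then delegates to api-call.
(defn logged-call [cursor]
  (swap! api-calls inc)
  (api-call cursor))

(def first-evens
  (into []
        (comp cat (filter even?) (take 5)) ; flatten, filter, stop after 5 items
        (iteration logged-call {:initk 0
                                :vf :body
                                :kf :next-cursor
                                :somef #(some? (:next-cursor %))})))

first-evens ; [0 2 4 6 8]
@api-calls  ; 1
```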
Why use iteration over loop-recur?
The loop-recur approach is easier to understand, at the cost of verbosity (not a bad thing). It’s also familiar to most people.
But it’s not lazy, and there’s no caching (unless you explicitly add it).
When you consume it as a seq, iteration gives you both: pages are fetched lazily, and the realized seq caches the pages it has already fetched.
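The laziness and caching can be observed with a call counter. In this sketch, logged-call and api-calls are hypothetical names added for this illustration. The counts in the comments assume the seq behaviour of clojure.core/iteration in Clojure 1.11, where calling seq makes the first step call right away and each later page is fetched only when it is reached.

```clojure
;; Mock API from earlier in the post.
(defn api-call [cursor]
  {:body (take 10 (drop cursor (range 100)))
   :next-cursor (when-not (<= 100 cursor) (+ cursor 10))})

;; Hypothetical counter of how many times the API was called.
(def api-calls (atom 0))

;; Hypothetical wrapper: counts the call, then delegates to api-call.
(defn logged-call [cursor]
  (swap! api-calls inc)
  (api-call cursor))

;; Calling seq fetches only the first page.
(def pages
  (seq (iteration logged-call {:initk 0
                               :vf :body
                               :kf :next-cursor
                               :somef #(some? (:next-cursor %))})))
(def calls-after-seq @api-calls)    ; 1

;; Asking for the second page triggers exactly one more call.
(def page-2 (second pages))
(def calls-after-second @api-calls) ; 2

;; Asking again is served from the realized seq, with no new call.
(def page-2-again (second pages))
(def calls-after-repeat @api-calls) ; 2
```

Note that the caching lives in the seq, not in the value returned by iteration: calling seq or reduce on that value again starts over from (step initk).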