Tail recursive List.map
Asked Answered
G

3

8

The typical List.map function in OCaml is pretty simple, it takes a function and a list, and applies the function to each items of the list recursively. I now need to convert List.map into a tail recursive function, how can this be done? What should the accumulator accumulate?

Guinness answered 9/12, 2014 at 18:47 Comment(3)
This sounds like homework (to me). So maybe you should start by suggesting how you think it could work.Bleeder
This is in fact not homework, but some old exam I'm studying from. But I do have some ideas, such as pattern match the list, if it's empty then return the accumulator (standard thing to do for tail recursion), if it's in the form hd::tl then pass tl to the next function and append (f hd) to the accumulatorGuinness
This sounds great to me. If you do this, how do you handle the base case (when your input is empty). What does the accumulator look like at that point?Bleeder
E
13

Arguably the simplest approach is to implement map in terms of an tail-recursive auxiliary function map_aux that traverses the list while accumulating an already mapped prefix:

let map f l =
  let rec map_aux acc = function
    | [] -> acc
    | x :: xs -> map_aux (acc @ [f x]) xs
  in
  map_aux [] l

However, as the list-catenation operator @ takes time linear in the length of its first argument, this yields a quadratic-time traversal. Moreover, list catenation is itself not tail-recursive.

Hence, we want to avoid the use of @. This solution does not use list catenation, but accumulates by prepending newly mapped arguments to the accumulator:

let map f l =
  let rec map_aux acc = function
    | [] -> List.rev acc
    | x :: xs -> map_aux (f x :: acc) xs
  in
  map_aux [] l

To restore the mapped elements in their right order, we then simply have to reverse the accumulator in the case for the empty list. Note that List.rev is tail-recursive.

This approach, finally, avoids both recursive list-catenation and list reversal by accumulating a so-called difference list:

let map f l =
  let rec map_aux acc = function
    | [] -> acc []
    | x :: xs -> map_aux (fun ys -> acc (f x :: ys)) xs
  in
  map_aux (fun ys -> ys) l

This idea is to have the accumulated prefix list be represented by a function acc that, when applied to an argument list ys, returns the prefix list prepended to ys. Hence, as an initial value of the accumulator we have fun ys -> ys, representing an empty prefix, and in the case for [] we simply apply acc to [] to obtain the mapped prefix.

Epicarp answered 9/12, 2014 at 22:11 Comment(4)
How is the memory usage for the difference list method? Passing around those big closure-of-closure-of-closures sounds like a bit more trouble than just a list …Aldine
@unhammer: From what I have seen, the difference-list approach is typically less efficient than the list-reversal approach. Indeed: due to excessive closure creation.Epicarp
Isn't that "different list" continuation passing style?Pelag
Two years later: yes, @JonHarrop, it is.Sultan
B
7

(I'll take your word that this isn't homework.)

The answer is one of the classic patterns in functional programming, the cons/reverse idiom. First you cons up your list in reverse order, which is easy to do in a tail recursive way. At the end, you reverse the list. Reversing is also a tail-recursive operation, so that doesn't pose a problem for stack safety.

The downside is extra allocation and somewhat more clumsy code.

let map f list =
  let rec loop acc = function
    | [] -> List.rev acc
    | x::xs -> loop (f x::acc) xs in
  loop [] list

A good exercise is to (re)implement a bunch of the 'standard' list functions (append, rev_append, fold_left, fold_right, filter, forall, etc), using this style to make them tail-recursive where necessary.

Barrie answered 10/12, 2014 at 6:27 Comment(1)
I find the question extremely interesting and the answers valuable and instructive. As such, it's all valuable in and of itself regardless of any external circumstances. Or, in other words, who cares whether it's homework or not?!Craggie
D
4

In the category of waving at you from the future, in OCaml 4.14 and later, there is tail_mod_cons which provides an elegant answer to this question, allowing a very natural implementation of map without the stack overflow.

# let[@tail_mod_cons] rec map f = 
    function
    | [] -> []
    | x::xs -> f x :: map f xs;;
val map : ('a -> 'b) -> 'a list -> 'b list = <fun>
# List.init 10_000_000 Fun.id 
  |> map ((+) 1);;
- : int list =
[1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; 16; 17; 18; 19; 20; 21; 22;
 23; 24; 25; 26; 27; 28; 29; 30; 31; 32; 33; 34; 35; 36; 37; 38; 39; 40; 41;
 42; 43; 44; 45; 46; 47; 48; 49; 50; 51; 52; 53; 54; 55; 56; 57; 58; 59; 60;
 61; 62; 63; 64; 65; 66; 67; 68; 69; 70; 71; 72; 73; 74; 75; 76; 77; 78; 79;
 80; 81; 82; 83; 84; 85; 86; 87; 88; 89; 90; 91; 92; 93; 94; 95; 96; 97; 98;
 99; 100; 101; 102; 103; 104; 105; 106; 107; 108; 109; 110; 111; 112; 113; 114;
 115; 116; 117; 118; 119; 120; 121; 122; 123; 124; 125; 126; 127; 128; 129;
 130; 131; 132; 133; 134; 135; 136; 137; 138; 139; 140; 141; 142; 143; 144;
 145; 146; 147; 148; 149; 150; 151; 152; 153; 154; 155; 156; 157; 158; 159;
 160; 161; 162; 163; 164; 165; 166; 167; 168; 169; 170; 171; 172; 173; 174;
 175; 176; 177; 178; 179; 180; 181; 182; 183; 184; 185; 186; 187; 188; 189;
 190; 191; 192; 193; 194; 195; 196; 197; 198; 199; 200; 201; 202; 203; 204;
 205; 206; 207; 208; 209; 210; 211; 212; 213; 214; 215; 216; 217; 218; 219;
 220; 221; 222; 223; 224; 225; 226; 227; 228; 229; 230; 231; 232; 233; 234;
 235; 236; 237; 238; 239; 240; 241; 242; 243; 244; 245; 246; 247; 248; 249;
 250; 251; 252; 253; 254; 255; 256; 257; 258; 259; 260; 261; 262; 263; 264;
 265; 266; 267; 268; 269; 270; 271; 272; 273; 274; 275; 276; 277; 278; 279;
 280; 281; 282; 283; 284; 285; 286; 287; 288; 289; 290; 291; 292; 293; 294;
 295; 296; 297; 298; 299; ...]
# List.init 10_000_000 Fun.id 
  |> List.map ((+) 1);;
Stack overflow during evaluation (looping recursion?).

In fact, tail_mod_cons is now how List.map is implemented.

let[@tail_mod_cons] rec map f = function
    [] -> []
  | [a1] ->
      let r1 = f a1 in
      [r1]
  | a1::a2::l ->
      let r1 = f a1 in
      let r2 = f a2 in
      r1::r2::map f l
Depict answered 19/11, 2022 at 21:3 Comment(3)
TRMC FTW! :) ____Garate
too bad it doesn't do + and the like, as suggested by the original 1974 paper. as a bridge between cons and +, consider pop_and_cons (a, x::xs) = (a+x)::xs used in a TRMC position.Garate
seems like this paper describes an approach that can deal with + etc. though I haven't read it yet.Garate

© 2022 - 2024 — McMap. All rights reserved.