Translate imperative control flow with break-s/continue-s to haskell
Asked Answered
C

9

7

Consider the following imperative code which finds the largest palindrome among products of 3-digit numbers (yes, it's the one of the first tasks from "Project of [outstanding mathematician of 18th century]" site):

curmax = 0
for i in range(999,100):
for j in range(999,100):
    if ((i*j) < curmax): break
    if (pal(i*j)):
        curmax = i*j
        break
print curmax

As I'm learning Haskell currently, my question is, how do you translate this (and basically any imperative construct that contains something more complex than just plain iteration, e.g. breaks, continues, temporary variables and all this) to Haskell?

My version is

maxpal i curmax
    | i < 100 = curmax
    | otherwise = maxpal (i-1) (innerloop 999)
    where 
        innerloop j
            | (j < 100) || (p < curmax) = curmax
            | pal p = p
            | otherwise = innerloop (j-1)
            where p = i*j
main = print $ maxpal 999 0

but this looks like we're still in imperative uglytown.

So what could you advise, what are the approaches of dealing with such cases FP-style?

Closegrained answered 22/12, 2010 at 18:47 Comment(3)
Well, if you try to translate an imperative control flow into Haskell, then you will be in imperative uglytown, pretty much by definition.Mccollough
That's the actual question: not how to blindly translate imperative code to Haskell syntax, but what approaches are used in functional programming to elegantly deal with complex flows.Closegrained
This is perhaps a little gnomic, but rather than translate from imperative to functional I usually go back to the spec instead (specs often don't care about control flow) and work from that. When you do have complex control flow, continuation-passing style (CPS) and lazy evaluation (which can separate production from consumption of intermediate values) are two valuable techniques to abstract the complexity.Reflexion
I
6

If we do away with all the optimization and just multiply all combinations of numbers between 100 and 999, filter out the non-palindromes and take the maximum of that, we can write the function very concisely as:

maximum $ filter pal [x*y | x <- [100..999], y <- [100..999]]

Of course this is basically the least efficient way to do it, but since the numbers are relatively small, this still finishes in under half a second on my machine.

However if we want something that is more along the lines of your python solution algorithmically, we can do it like this:

import Data.Maybe
import Data.List

maxpal i curmax
    | i < 100 = curmax
    | otherwise = maxpal (i-1) newmax
    where newmax = fromMaybe curmax (find pal bigger)
          bigger = takeWhile (> curmax) (map (*i) [999, 998 ..])

Here the outer loop is basically the same as in your solution, but we replaced the inner loop using list functions instead.

We're using map (*i) [999, 998, ...] to create the product i*j for every j counting down from 999. Using takeWhile we're saying that the list should stop once a value is not greater than curmax.

Then we're using find to see whether any item in that list is a palindrome. If it is, the first palindrome in the list is our new max. If it isn't we keep our old max. (find returns a Maybe and fromMaybe takes a default value and a Maybe and returns the value from the Maybe or the default value if there is no value in the Maybe)

Ireful answered 22/12, 2010 at 19:28 Comment(3)
Your second solution is what I was looking for. It highlights several iteration primitives like find and takeWhile, which is good for a newbie like me, and does exactly what my imperative code does.Closegrained
It's striking to me how everybody started advising a list comprehension (which -- I should have added that to the question -- I'm aware about). It clearly doesn't work well in this case: change "3-digit numbers" constraint to 4-digit, and you'll be waiting forever. No matter how smart comprehension is, you can't run away from O(n^2) in this specific case.Closegrained
@dorseg: The point though is that in this case the input size is fixed and for the given input size, the list comprehension works reasonably well. There's no reason to optimize if you know the input size is never going to grow. Also the problem with the first solution isn't that it's a list comprehension, but that it's quadratic. The map (*i) [999, 998 ..] could as well be written as a list comprehension ([x*i| x <- [999, 998 ..]]) without losing any performance.Ireful
N
8

Similar answer to Daniel's and sepp2k's:

Lazy functional programming lets you write programs in a much more modular way than you see in imperative control flow like the one in your question. For instance, form the list of factors 999...100, then all products, then filter to retain only the palindromes, and then compute the maximum. Thanks to laziness, these intermediate lists will come into being only as needed and will be incrementally recycled.

For more explanation and examples, see John Hughes's classic paper Why Functional Programming Matters.

maxpal :: Int
maxpal = maximum [i*j | i <- factors, j <- factors, pal (i*j) ]

factors :: [Int]
factors = [999,998..100]

pal :: Show a => a -> Bool
pal = palL . show

palL :: (Eq a) => [a] -> Bool
palL xs = xs == reverse xs
Naranjo answered 22/12, 2010 at 19:29 Comment(5)
Hm. dorserg's version first tests the product size, and then tries the palindrome test, which might be an important optimization missing from Daniel's, sepp2k's and my (all very similar) suggestions.Naranjo
Actually it's not missing from my second, longer solution.Ireful
@sepp2k: Sorry for not being more specific in my comparisons. The optimization is missing from the elegant & composable versions given so far, including yours. I wonder how elegantly & modularly one could write maxpal while having this optimization.Naranjo
I feel that list comprehension + filtering is a tool that in most cases where the cartesian product of two or more sources is produced, is not really "production code". I.e. too slow - sort of like a toy. This is dangerous because the syntax is so succinct. I'd like to see list comprehensions be extended to specify how the search should be performed. If the programmer can't specify how the search should be performed, then it's probably really slow.Award
@user239558: Do you have any evidence to support your feeling about list comprehension & filtering executing slowly? Are you aware that comprehensions eliminated (desugared) at compile-time? Have you read papers & seen statistics about the optimization of fusion frameworks? There's a lot of info and experience out there.Naranjo
I
6

If we do away with all the optimization and just multiply all combinations of numbers between 100 and 999, filter out the non-palindromes and take the maximum of that, we can write the function very concisely as:

maximum $ filter pal [x*y | x <- [100..999], y <- [100..999]]

Of course this is basically the least efficient way to do it, but since the numbers are relatively small, this still finishes in under half a second on my machine.

However if we want something that is more along the lines of your python solution algorithmically, we can do it like this:

import Data.Maybe
import Data.List

maxpal i curmax
    | i < 100 = curmax
    | otherwise = maxpal (i-1) newmax
    where newmax = fromMaybe curmax (find pal bigger)
          bigger = takeWhile (> curmax) (map (*i) [999, 998 ..])

Here the outer loop is basically the same as in your solution, but we replaced the inner loop using list functions instead.

We're using map (*i) [999, 998, ...] to create the product i*j for every j counting down from 999. Using takeWhile we're saying that the list should stop once a value is not greater than curmax.

Then we're using find to see whether any item in that list is a palindrome. If it is, the first palindrome in the list is our new max. If it isn't we keep our old max. (find returns a Maybe and fromMaybe takes a default value and a Maybe and returns the value from the Maybe or the default value if there is no value in the Maybe)

Ireful answered 22/12, 2010 at 19:28 Comment(3)
Your second solution is what I was looking for. It highlights several iteration primitives like find and takeWhile, which is good for a newbie like me, and does exactly what my imperative code does.Closegrained
It's striking to me how everybody started advising a list comprehension (which -- I should have added that to the question -- I'm aware about). It clearly doesn't work well in this case: change "3-digit numbers" constraint to 4-digit, and you'll be waiting forever. No matter how smart comprehension is, you can't run away from O(n^2) in this specific case.Closegrained
@dorseg: The point though is that in this case the input size is fixed and for the given input size, the list comprehension works reasonably well. There's no reason to optimize if you know the input size is never going to grow. Also the problem with the first solution isn't that it's a list comprehension, but that it's quadratic. The map (*i) [999, 998 ..] could as well be written as a list comprehension ([x*i| x <- [999, 998 ..]]) without losing any performance.Ireful
B
3

There's no one-size-fits-all answer here. But let's walk through this specific example:

First, consider the outer loop: We always do the full range, and we only care about the final maximum, so that's easy enough:

outerLoop = foldl innerLoop 0 [999,998..100]

In the inner loop, we have some value of i and a current maximum. Now we only care about the range where i*j is larger than the current maximum:

innerLoop curmax i = foldr checkMax curmax [999*i, 998*i .. curmax]

In the core logic we get a value for i*j which we know will always be greater then or equal to the current maximum, so all that's necessary is to check the next value to see if it's a palindrome: If so, we're done, because the sequence decreases. If not, defer the decision:

checkMax ij defer = if pal ij then ij else defer
Boyceboycey answered 22/12, 2010 at 19:47 Comment(2)
+1 I am a Haskell novice, but I am surprised this answer had no votes, apparently it mimics very well the python code intent and it's still clear and compact. One minor suggestion, I think [ii, (i-1)*i .. curmax] would be enough in the second line (so you don't check both ab and b*a).Fadeout
@tokland: That sounds correct, but as you noticed I was mimicking the conceptual (rather than syntactic) structure of the Python version, so I'll leave the code as is. Good observation, though!Boyceboycey
A
2

In my mind a range corresponds to a list. For example:

f = [999,998..100]

Now f is defined as the sequence of numbers from 999 to 100.

for loops correspond to different functional concepts, depending on what you're doing in each iteration. Sometimes a map is the appropriate analog, sometimes a fold, sometimes something else. Often times it's a combination of things. In this case, you're effectively combining two lists. One way to do that in Haskell is a list comprehension:

g = [(x * y) | x <- f , y <- f]

Here g represents a list of the product of each element of the previously defined sequence combined with itself. In other words, pretty much what you've got going on within your for loop.

From here, you'd probably want to filter the resulting sequence to contain only values that are palindromes, and then calculate the maximum value from that set.

Adena answered 22/12, 2010 at 19:19 Comment(0)
O
2

So, thinking functionally you should be looking at ways to break your problem not into loops and steps but into functions.

So, if we had a function maxWhere f xs which returned the largest x for which f x is true, we could write:

maxpal = maxWhere pal [x * y | x <- [999,998..100], y <- [999,998..100]]

A naive implementation of maxWhere is

maxWhere f xs = maximum $ filter f xs

but this is bad if f is more expensive than comparison as we'll be making more calls to f than in the original. We can use fold to combine the filter and the maximum into a single pass and get the same behaviour as the imperative code.

maxWhere f xs = foldl' r 0 xs
    where r a x
       | x > a     = if f x then x else a
       | otherwise = a

The use of zero as a magic small number here is horrible, but works in that case.

(I really want to spell that list of candidate numbers (*) <$> [999,998..100] <*> [999,998..100], but that may be introducing an unnecessary complication here.)

Ooze answered 22/12, 2010 at 22:58 Comment(0)
G
1

this kind of loop lends itself easily to a list comprehension, like this:

maximum [x*y | x <- [999..100], y <- [999..100],isPalindrome (x*y)]

Where we might write isPalindrome like this:

isPalindrome x = xs == reverse xs
  where xs = show x

This is really fast enough, although sort of unsmartypantsy, so first off we will notice that we are checking the numbers twice. Let's say a*b is the biggest palindrome, then we will check both the case where x == a, y==b, and x==b, y==a. So first we stop this by restricting the numbers we search through to only the cases where x >= y, like this:

maximum [x*y | x <- [999..100], y <- [x..100],isPalindrome (x*y)]

This cuts the numbers to test in half.

In your python solution you also bound y below by the biggest number we have found so far divided by the current x (x*y => curmax), also you never search beyond the first y found (breaking the inner loop if curmax is updated). We may cut the search further by not continuing if the first element we check (x squared) is less then our current answer, since all subsequent checks are smaller, but this is beyond what looks good in a list comprehension so we move our search into it's own function:

import Data.List(find)
import Data.Maybe(isNothing,fromJust)

search x curr 
   | x * x < curr                   = curr
   | isNothing maypal || pal < curr = search (x - 1) curr 
   | otherwise                      = search (x - 1) pal 
   where maypal = find isPalindrome [x * x, (x - 1) * x .. curr]
         pal    = fromJust maypal

It's worth noticing how our limitation, (x*x) < curr, really just means that from now on, [x*x,(x-1)*x..curr] is going to be empty. As you can see, all of the bounds that were enforced by breaks in your python code fits inside one iteration on x (using recursion) and a find on a list of x*y values. It might not look nicer, but it seems to me to state more explicitly the restrictions we make on x and y.

Running it we get:

*Main> search 999 0
906609

Turns out that stopping when x * x < curr is a really good idea since the square root of 906609 is 952...

Grig answered 22/12, 2010 at 18:47 Comment(0)
M
1

Gah. Beaten by sepp2k, but I'll answer your general question:

Temporary variables can also be expressed using the state monad, or ST monad if you have a lot of them. FP often wins in succintness and clarity, but in some cases it doesn't, e.g. when there are several local variables to juggle.

Laziness can emulate many breaks, but when dealing with IO, you generally have to use explicit recursion. However, the 'List' package (from Hackage) is rather clever at allowing you to write IO loops in a functional style.

Mccowan answered 22/12, 2010 at 19:40 Comment(0)
C
0

As noted by stephen tetley in his comment, in FP you can use continuation passing style to handle complex control flow (Cont monad plus its callCC which is somehow similar to break. ...or even goto - abuse of CPS can lead to rather incomprehensible code - see my example below):

import Control.Monad.Cont

pal n = sn == reverse sn
    where sn = show n

range = [99999,99998..10000]

mfoldM a r f = foldM f a r  

curmaxm = (`runCont` id) $ mfoldM 0 range $ \m i ->
            callCC $ \break ->
                mfoldM m range $ \m j -> do
                  let ij = i*j
                  if ij < m
                     then break m
                     else return $
                          if pal ij then ij else m

Two mfoldM's (just a standard foldM with its arguments rearranged) correspond to two loops in the original sample and break function-argument is used in "inner loop" to exit it once (i*j > current max) condition is violated (returning the current max as a result of that "inner loop"). Here we need to escape from just one "loop level", so callCC here is definitely overkill.

The same logic can also be implementing with find (+ laziness of Haskell):

import Data.List
import Data.Maybe
import Control.Monad

curmax = fromJust $ foldM it 0 range
    where 
      it m i = (find pal . takeWhile (>m) . map (*i) $ range) `mplus` return m

find pal here returns either the first palindrome number (which will also satisfy (>m) condition in takeWhile) or Nothing (zero of MonadPlus) and after mplus (or Alternatice.<|>) it effectively returns either a new maximum palindrome or the previous max (return m). Since find stops searching once the first satisfying element is found, this code behave exactly as its imperative curmax analog. Both version run for [99999..10000] range in 0.5 second.

Update: Just for fun: same approach but using StateT Integer (Cont Integer) () - Cont to escape from the "loop" and State to pass max palindrome around (plus an ability to use forM_ and when). Same efficiency:

import Control.Monad.Cont
import Control.Monad.State.Strict

solcs = runCont (execStateT comp 0) id
    where   
      comp = forM_ range $ \i -> callCC $ \break ->
                forM_ range $ \j -> do
                  let ij = i*j
                  m <- get
                  when (ij < m) (break ())
                  when (pal ij) (put ij)  
Cantilever answered 23/12, 2010 at 7:16 Comment(0)
S
0

I think you can do what you want using two mutually recursive functions.

Here's a much simpler example (taken from a tutorial on ATS):

implement main (argc, argv) = let
  fun loop1 (i: int): void =
    if i <= 9 then loop2 (i, i) else ()

  and loop2  (i: int, j: int): void =
    if j <= 9 then begin
      if i < j then begin
        print ", ";
        print "("; print i; print ", "; print j; print ")";
        loop2 (i, j+1)
      end
    end else begin
      print_newline ();
      loop1 (i+1)
    end
  in
    loop1 0
  end

The code written above is very much like what you'd write in C (taken from that same page):

int main (int argc, char *argv[]) { int i, j ;

for (i = 0; i <= 9; i += 1) {
  for (j = i; j <= 9; j += 1) {
    if (i < j) printf (", ") ; printf ("(%i, %i)", i, j) ;
  } /* for */
  printf ("\n") ;
} /* for */

return 0 ;

}

As you see, nested loops become mutually recursive functions; and mutable variables i and j become induction variables. loop1 corresponds to the outer loop, whereas loop2 to the inner loop.

Shepperd answered 26/1, 2011 at 14:46 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.