Find keys closest to an integer argument in Haskell tree
Asked Answered
L

1

1

There is a plenty of solutions how to find closest lower and upper keys in binary tree in imperative languages, but a lack of same questions for doing it in purely functional style like that of Haskell. I'm curious to learn how it's possible to walk around a binary serch tree ahead of meeting both closest keys. There is a function and some pattern matches I've so far:

data TreeMap v = Leaf | Node { pair::(Integer, v), l::TreeMap v, r::TreeMap v} deriving (Show, Read, Eq, Ord)

closestLess :: Integer -> TreeMap v -> (Integer, v)
closestLess i Leaf = error "Tree doesn't include any element"
closestLess i (Node pair tree_r tree_l)
        | i < fst pair = closestLess i tree_l
        | i == fst pair = closestLess i tree_r
        | otherwise = precise i pair tree_r 

I use this function to get a lower key, but closest to an Integer argument. In my opinion, there is nothing but necessity to implement some auxiliary function like "precise", although I'm confused what definition it needs exactly. My suggestion is to put that Integer value, node into "precise", right subtree also to find any key which is just closest one to the target. Therefore I would have some tips or assumptions how to make it for lower and upper keys as well.

Lancey answered 14/12, 2019 at 15:44 Comment(5)
I don't get what you would want your precise function to do. Also, avoid throwing errors in pure functions, instead find a better representation for your result (like using Maybe)Cadaver
Can you maybe post a working imperative version of closest lower keys so that we can understand what exactly you want your haskell version to do?Cadaver
Following @Bergi's request, I added type definition. Sorry for this mistakeLancey
@Bergi, I need solving that as in this tutorial for C++ programmers linkLancey
I think you can follow their algorithm quite literally - except maybe those weird in-out mutable min_diff and min_diff_key parameters, which - even worse - also have some no-value values like MAX_INTEGER and -1. Write a function Integer -> TreeMap v -> Maybe (Integer, v) -> Maybe (Integer, v) with parameters searchValue, tree, and closestSoFar.Cadaver
D
0

Here's how I'd do it:

data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)

closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Nothing where
  precise :: Maybe (Integer, v) -> TreeMap v -> Maybe (Integer, v)
  precise closestSoFar Leaf = closestSoFar
  precise closestSoFar (Node k v l r) = case i `compare` k of
    LT -> precise closestSoFar l
    EQ -> Just (k, v)
    GT -> precise (Just (k, v)) r

A few notes regarding the differences between this and your attempt:

  • You used record syntax for the Node constructor. It's bad form to use record syntax on sum types, since the functions would then be partial (for example, pair Leaf would be bottom). Since you didn't actually use these and they're not necessary, I removed them.
  • You wrapped the key and value in a tuple, with no apparent reason why. I separated them out and placed them directly in the type.
  • Your closestLess function had a return type of (Integer, v), even though it couldn't always return something of that type. I changed it to Maybe (Integer, v), so that it can return Nothing instead of having to use error. (Side note: your error message was technically wrong. If you called closestLess where the search value is smaller than all of the nodes, it would fail even though the tree did have elements.)
  • Your code is inconsistent as to which is the left and which is the right branch of the nodes. In my code, the left branch is always the one that's on the left in the data constructor.
  • You used i < fst pair and i == fst pair in separate guards. By case-matching on the output of compare instead, you only have to do the comparison once instead of twice.
  • You were on the right track with needing a precise function, but a lot of the logic that you had in closestLess actually needed to be in it.

Here's a quick test case, using the example at the site you linked:

Prelude> tree = Node 9 () (Node 4 () (Node 3 () Leaf Leaf) (Node 6 () (Node 5 () Leaf Leaf) (Node 7 () Leaf Leaf))) (Node 17 () Leaf (Node 22 () (Node 20 () Leaf Leaf) Leaf))
Prelude> closestLess 4 tree
Just (4,())
Prelude> closestLess 18 tree
Just (17,())
Prelude> closestLess 12 tree
Just (9,())
Prelude> closestLess 2 tree
Nothing

You can also make it lazier (yielding the outer Just as soon as a single candidate is found), at the expense of being a bit more complex:

import Data.Functor.Identity

data TreeMap v = Leaf | Node Integer v (TreeMap v) (TreeMap v) deriving (Show, Read, Eq, Ord)

closestLess :: Integer -> TreeMap v -> Maybe (Integer, v)
closestLess i = precise Nothing
  where
    precise :: Applicative t => t (Integer, v) -> TreeMap v -> t (Integer, v)
    precise closestSoFar Leaf = closestSoFar
    precise closestSoFar (Node k v l r) = case i `compare` k of
      LT -> precise closestSoFar l
      EQ -> pure (k, v)
      GT -> pure . runIdentity $ precise (Identity (k, v)) r

See my question about this for more details about it.

Daube answered 14/12, 2019 at 17:3 Comment(1)
It works perfectly, thank you. It's a very remarkable explanation of this case solution, indeed - as a newbie in functional programming, I appreciate your review and conclusions about all of that very much.Lancey

© 2022 - 2024 — McMap. All rights reserved.