How do I remember the root of a binary search tree in Haskell

Asked 6/1, 2022 at 6:36 Answered 7/1, 2022 at 9:15

variables haskell functional-programming binary-search-tree immutability

I am new to Functional programming.

The challenge I have is regarding the mental map of how a binary search tree works in Haskell.

In other programs (C,C++) we have something called root. We store it in a variable. We insert elements into it and do balancing etc..
The program takes a break does other things (may be process user inputs, create threads) and then figures out it needs to insert a new element in the already created tree. It knows the root (stored as a variable) and invokes the insert function with the root and the new value.

So far so good in other languages. But how do I mimic such a thing in Haskell, i.e.

I see functions implementing converting a list to a Binary Tree, inserting a value etc.. That's all good
I want this functionality to be part of a bigger program and so i need to know what the root is so that i can use it to insert it again. Is that possible? If so how?

Note: Is it not possible at all because data structures are immutable and so we cannot use the root at all to insert something. in such a case how is the above situation handled in Haskell?

Smallpox answered 6/1, 2022 at 6:36 Comment(7)

I think in general, we create transformer function which transforms the previous tree (root) into the new tree (root). Provide the new tree(root) where it is needed. – Shredding 6/1, 2022 at 6:51

Thanks. But how do i remember the old root. How exactly that happens? – Smallpox 6/1, 2022 at 6:57

You're stuck in a C/C++ mindset. Ask about what your program is actually supposed to do, instead of how you'd replicate the exact C/C++ way of doing it. – Memorable 6/1, 2022 at 7:5

My program is trying the following a) Convert a list to a Binary Search Tree b) do some I/O operation c) Ask for a user input to insert a new value in the created Binary Search Tree d) Insert it into the already created list. This is what the program intends to do. Not sure how to get this done in Haskell (or) is am i stuck in the old mindset. Any ideas/hints welcome. – Smallpox 6/1, 2022 at 7:18

Even in C++, you typically have a class that wraps all this. Your reference to an instance of the class never changes, only the root member of the instance gets changed. – Hightension 6/1, 2022 at 13:53

@Smallpox That's not what you're trying to do. That's your plan about how to do what you're trying to do. Consider: my alternative implementation that does not have any data at all, just does some I/O operation, asks the user for input, and throws that input on the floor, is observationally indistinguishable from your one that has a search tree. Tell us about your program's API/specification, not its internal implementation details. – Esparza 6/1, 2022 at 16:17

"how do i remember the old root. How exactly that happens?" you can use several variables, each holding on to its version of your tree that was in effect at different points in time. see more in the link in my answer (skip to the heading "how to think functional" there). – Boogie 6/1, 2022 at 16:52

It all happens in the same way, really, except that instead of mutating the existing tree variable we derive a new tree from it and remember that new tree instead of the old one.

For example, a sketch in C++ of the process you describe might look like:

int main(void) {
  Tree<string> root;
  while (true) {
    string next;
    cin >> next;
    if (next == "quit") exit(0);
    root.insert(next);
    doSomethingWith(root);
  }
}

A variable, a read action, and loop with a mutate step. In haskell, we do the same thing, but using recursion for looping and a recursion variable instead of mutating a local.

main = loop Empty
  where loop t = do
    next <- getLine
    when (next /= "quit") $ do
      let t' = insert next t
      doSomethingWith t'
      loop t'

If you need doSomethingWith to be able to "mutate" t as well as read it, you can lift your program into State:

main = loop Empty
  where loop t = do
    next <- getLine
    when (next /= "quit") $ do
      loop (execState doSomethingWith (insert next t))

Suffocate answered 6/1, 2022 at 7:36 Comment(0)

Writing an example with a BST would take too much time but I give you an analogous example using lists.

Let's invent a updateListN which updates the n-th element in a list.

updateListN :: Int -> a -> [a] -> [a]
updateListN i n l = take (i - 1) l ++ n : drop i l

Now for our program:

list = [1,2,3,4,5,6,7,8,9,10] -- The big data structure we might want to use multiple times

main = do
  -- only for shows
  print $ updateListN 3 30 list -- [1,2,30,4,5,6,7,8,9,10]
  print $ updateListN 8 80 list -- [1,2,3,4,5,6,7,80,9,10]
  -- now some illustrative complicated processing
  let list' = foldr (\i l -> updateListN i (i*10) l) list list
  -- list' = [10,20,30,40,50,60,70,80,90,100]
  -- Our crazily complicated illustrative algorithm still needs `list`
  print $ zipWith (-) list' list
  -- [9,18,27,36,45,54,63,72,81,90]

See how we "updated" list but it was still available? Most data structures in Haskell are persistent, so updates are non-destructive. As long as we have a reference of the old data around we can use it.

As for your comment:

My program is trying the following a) Convert a list to a Binary Search Tree b) do some I/O operation c) Ask for a user input to insert a new value in the created Binary Search Tree d) Insert it into the already created list. This is what the program intends to do. Not sure how to get this done in Haskell (or) is am i stuck in the old mindset. Any ideas/hints welcome.

We can sketch a program:

data BST
readInt :: IO Int; readInt = undefined
toBST :: [Int] -> BST; toBST = undefined
printBST :: BST -> IO (); printBST = undefined

loop :: [Int] -> IO ()
loop list = do
  int <- readInt
  let newList = int : list
  let bst = toBST newList
  printBST bst
  loop newList

main = loop []

Goniometer answered 6/1, 2022 at 7:48 Comment(0)

"do balancing" ... "It knows the root" nope. After re-balancing the root is new. The function balance_bst must return the new root.

Same in Haskell, but also with insert_bst. It too will return the new root, and you will use that new root from that point forward.

Even if the new root's value is the same, in Haskell it's a new root, since one of its children has changed.

See ''How to "think functional"'' here.

Boogie answered 6/1, 2022 at 8:3 Comment(0)

Even in C++ (or other imperative languages), it would usually be considered a poor idea to have a single global variable holding the root of the binary search tree.

Instead code that needs access to a tree should normally be parameterised on the particular tree it operates on. That's a fancy way of saying: it should be a function/method/procedure that takes the tree as an argument.

So if you're doing that, then it doesn't take much imagination to figure out how several different sections of code (or one section, on several occasions) could get access to different versions of an immutable tree. Instead of passing the same tree to each of these functions (with modifications in between), you just pass a different tree each time.

It's only a little more work to imagine what your code needs to do to "modify" an immutable tree. Obviously you won't produce a new version of the tree by directly mutating it, you'll instead produce a new value (probably by calling methods on the class implementing the tree for you, but if necessary by manually assembling new nodes yourself), and then you'll return it so your caller can pass it on - by returning it to its own caller, by giving it to another function, or even calling you again.

Putting that all together, you can have your whole program manipulate (successive versions of) this binary tree without ever having it stored in a global variable that is "the" tree. An early function (possibly even main) creates the first version of the tree, passes it to the first thing that uses it, gets back a new version of the tree and passes it to the next user, and so on. And each user of the tree can call other subfunctions as needed, with possibly many of new versions of the tree produced internally before it gets returned to the top level.

Note that I haven't actually described any special features of Haskell here. You can do all of this in just about any programming language, including C++. This is what people mean when they say that learning other types of programming makes them better programmers even in imperative languages they already knew. You can see that your habits of thought are drastically more limited than they need to be; you could not imagine how you could deal with a structure "changing" over the course of your program without having a single variable holding a structure that is mutated, when in fact that is just a small part of the tools that even C++ gives you for approaching the problem. If you can only imagine this one way of dealing with it then you'll never notice when other ways would be more helpful.

Haskell also has a variety of tools it can bring to this problem that are less common in imperative languages, such as (but not limited to):

Using the State monad to automate and hide much of the boilerplate of passing around successive versions of the tree.
Function arguments allow a function to be given an unknown "tree-consumer" function, to which it can give a tree, without any one place both having the tree and knowing which function it's passing it to.
Lazy evaluation sometimes negates the need to even have successive versions of the tree; if the modifications are expanding branches of the tree as you discover they are needed (like a move-tree for a game, say), then you could alternatively generate "the whole tree" up front even if it's infinite, and rely on lazy evaluation to limit how much work is done generating the tree to exactly the amount you need to look at.
Haskell does in fact have mutable variables, it just doesn't have functions that can access mutable variables without exposing in their type that they might have side effects. So if you really want to structure your program exactly as you would in C++ you can; it just won't really "feel like" you're writing Haskell, won't help you learn Haskell properly, and won't allow you to benefit from many of the useful features of Haskell's type system.

Frustum answered 7/1, 2022 at 9:15 Comment(0)

Recommended topics

Hot tags