Memoization in J
Asked Answered
G

1

6

Every time I use J's M. adverb, performance degrades considerably. Since I suspect Iverson and Hui are far smarter than I, I must be doing something wrong.

Consider the Collatz conjecture. There seem to be all sorts of opportunities for memoization here, but no matter where I place M., performance is terrible. For example:

hotpo =: -:`(>:@(3&*))@.(2&|) M.
collatz =: hotpo^:(>&1)^:a:"0
#@collatz 1+i.10000x

Without M., this runs in about 2 seconds on my machine. With M., well, I waited over ten minutes for it to complete and eventually gave up. I've also placed M. in other positions with similarly bad results, e.g.,

hotpo =: -:`(>:@(3&*))@.(2&|)
collatz =: hotpo^:(>&1)M.^:a:"0
#@collatz 1+i.10000x    

Can someone explain the proper usage of M.?

Gummy answered 30/7, 2015 at 12:47 Comment(2)
Have you looked at the Nuvoc explanation in the J wiki? It outlines some of the limitations that could slow things up, but then again they may not. jsoftware.com/jwiki/Vocabulary/mcapdotTonina
@Tonina Not surprised to see you here. :) I'll check it out.Gummy
P
8

The M. does nothing for you here.

Your code is constructing a chain, one link at a time:

    -:`(>:@(3&*))@.(2&|)^:(>&1)^:a:"0 M. 5 5
5 16 8 4 2 1
5 16 8 4 2 1

Here, it remembers that 5 leads to 16, 16 leads to 8, 8 leads to 4, etc... But what does that do for you? It replaces some simple arithmetic with a memory lookup, but the arithmetic is so trivial that it's likely faster than the lookup. (I'm surprised your example takes 10 whole minutes, but that's beside the point.)

For memoization to make sense, you need to be replace a more expensive computation.

For this particular problem, you might want a function that takes an integer and returns a 1 if and when the sequence arrives at 1. For example:

    -:`(>:@(3&*))@.(2&|)^:(>&1)^:_"0 M. 5 5
1 1

All I did was replace the ^:a: with ^:_, to discard the intermediate results. Even then, it doesn't make much difference, but you can use timespacex to see the effect:

   timespacex  '-:`(>:@(3&*))@.(2&|)^:(>&1)^:_"0     i.100000'
17.9748 1.78225e7
   timespacex  '-:`(>:@(3&*))@.(2&|)^:(>&1)^:_"0  M. i.100000'
17.9625 1.78263e7

Addendum: The placement of the M. relative to the "0 does seems to make a huge difference. I thought I might have made a mistake there, but a quick test showed that swapping them caused a huge performance loss in both time and space:

   timespacex  '-:`(>:@(3&*))@.(2&|)^:(>&1)^:_ M. "0  i.100000'
27.3633 2.41176e7

M. preserves the rank of the underlying verb, so the two are semantically equivalent, but I suspect with the "0 on the outside like this, the M. doesn't know that it's dealing with scalars. So I guess the lesson here is to make sure M. knows what it's dealing with. :)


BTW, if the Collatz conjecture turned out to be false, and you fed this code a counterexample, it would go into an infinite loop rather than produce an answer.

To actually detect a counterexample, you'd want to monitor the intermediate results until you found a cycle, and then return the lowest number in the cycle. To do this, you'd probably want to implement a custom adverb to replace ^:n.

Pledgee answered 1/8, 2015 at 2:12 Comment(5)
Lots to absorb here, but first a few notes: I didn't think that memoizing hotpo would do anything for me for the very reasons you state. I should not have included it in my question. Also, I'm interested in M., not Collatz (which I chose at random), so I do want the intermediate results. In fact, calculating them in a performant way is kind of the point of what I'm trying to do with M.. Many of the intermediate patterns repeat, so memoization can have a powerful effect. I wrote a similar algorithm in Haskell and got huge gains using memoization. I want to do the same in J.Gummy
For example, if you look at the Collatz reduction for 33, it contains the Collatz reduction for 34, although it takes a short while to get there. If you memoize this, you can get a pretty good time reduction (though a potentially nasty space increase).Gummy
Interestingly, the sequence 34 17 52 26 13 40 20 10 5 16 8 4 2 1 terminates just over 40% of the Collatz sequences for natural numbers under 10000 (and who knows how many over it), which probably accounts for some of the performance improvement with memoization in Haskell.Gummy
I'm marking this answered because it pointed in the right direction, though it wasn't exactly what I was looking for. The Collatz conjencture itself is somewhat irrelevant to the question. Rather, what I was looking for was a discussion of M. itself with the Collatz conjecture as a backdrop.Gummy
The reason it works in haskell but not J is that in haskell, you are dealing with linked lists, and you can memoize a reference to a sublist just by pointing at one cell. J doesn't really have a concept of references or pointers, though, unless you're working with locales or mapped files. You can use array offsets to build references, though, and make your own data structures. See building trees in j for an example.Pledgee

© 2022 - 2024 — McMap. All rights reserved.