Memory-efficient sparse array in Java
P

5

10

(There are some questions about time-efficient sparse arrays but I am looking for memory efficiency.)

I need the equivalent of a List<T> or Map<Integer,T> which

  1. Can grow on demand just by setting a key larger than any encountered before. (Can assume keys are nonnegative.)
  2. Is about as memory-efficient as an ArrayList<T> when most of the indices hold non-null values, i.e. when the actual data is not very sparse.
  3. When the indices are sparse, consumes space proportional to the number of non-null indices.
  4. Uses less memory than HashMap<Integer,T> (as this autoboxes the keys and probably does not take advantage of the scalar key type).
  5. Can get or set an element in amortized O(log N) time, where N is the number of entries: it need not be constant time; binary search would be acceptable.
  6. Implemented in a nonviral open-source pure Java library (preferably in Maven Central).

Does anyone know of such a utility class?

I would have expected Commons Collections to have one but it did not seem to.

I came across org.apache.commons.math.util.OpenIntToFieldHashMap which looks almost right except the value type is a FieldElement which seems gratuitous; I just want T extends Object. It looks like it would be easy to edit its source code to be more generic, though I would rather use a binary dependency if one is available.
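To make the idea of an int-keyed hash map with a generic value type concrete, here is a minimal sketch. It is not the Commons Math code: the class name OpenIntMapSketch and its methods are hypothetical, it uses plain linear probing rather than whatever probe sequence OpenIntToFieldHashMap uses, and it leaves out removal. It only illustrates how primitive int keys avoid the boxed Integer keys of HashMap<Integer,T>.

```java
/** Hypothetical sketch: open-addressing hash map from int keys to values of type T (no removal). */
final class OpenIntMapSketch<T> {
    private int[] keys = new int[16];          // capacity is always a power of two
    private Object[] values = new Object[16];
    private boolean[] used = new boolean[16];  // marks occupied slots, so key 0 and null values are allowed
    private int size;                          // number of stored entries

    /** Returns the value stored for key, or null if the key is absent. */
    @SuppressWarnings("unchecked")
    T get(int key) {
        int i = find(key);
        return used[i] ? (T) values[i] : null;
    }

    /** Stores value under key, growing the table so that at most half the slots are occupied. */
    void put(int key, T value) {
        if ((size + 1) * 2 > keys.length) {
            resize();
        }
        int i = find(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    int size() {
        return size;
    }

    /** Returns the slot holding key, or the first free slot on its probe path. */
    private int find(int key) {
        int mask = keys.length - 1;
        int i = mix(key) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask; // linear probing; a free slot always exists at load <= 1/2
        }
        return i;
    }

    /** Scrambles the key bits so that regularly spaced keys spread over the table. */
    private static int mix(int key) {
        int h = key * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    /** Doubles the capacity and reinserts every occupied slot. */
    private void resize() {
        int[] oldKeys = keys;
        Object[] oldValues = values;
        boolean[] oldUsed = used;
        int newCapacity = oldKeys.length * 2;
        keys = new int[newCapacity];
        values = new Object[newCapacity];
        used = new boolean[newCapacity];
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldUsed[j]) {
                int i = find(oldKeys[j]);
                used[i] = true;
                keys[i] = oldKeys[j];
                values[i] = oldValues[j];
            }
        }
    }
}
```

A table like this always keeps spare slots, so in the dense case it will use more memory than an ArrayList<T>; whether it beats other int-keyed maps would need to be measured.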

Popper answered 27/9, 2012 at 16:39 Comment(0)
J
6

I would try the Trove collections; there is TIntObjectMap, which should work for your purposes.

Jacinthe answered 27/9, 2012 at 16:49 Comment(1)
That looks good. I tried adapting OpenIntToFieldHashMap to a generic value type, which seems to have worked with about 10 minutes of work, but it performs only marginally better than TIntObjectMap.Popper
C
5

I would look at Android's SparseArray implementation for inspiration. You can view the source by downloading AOSP's source code here http://source.android.com/source/downloading.html
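For readers who do not want to download AOSP, the general technique (int keys kept in a sorted array and located by binary search, with values in a parallel array) can be sketched as follows. This is an illustration written for this answer, not the Android source: the class name SortedIntSparseArray and its methods are hypothetical, and it omits removal and the other refinements a production class would have.

```java
/** Hypothetical sketch: int-keyed sparse array backed by two parallel arrays sorted by key. */
final class SortedIntSparseArray<T> {
    private int[] keys = new int[8];
    private Object[] values = new Object[8];
    private int size; // number of stored entries; only keys[0..size) are meaningful

    /** Returns the value stored at key, or null if the key is absent. */
    @SuppressWarnings("unchecked")
    T get(int key) {
        int i = java.util.Arrays.binarySearch(keys, 0, size, key);
        return i >= 0 ? (T) values[i] : null;
    }

    /** Stores value at key: binary search for the slot, then an array shift if the key is new. */
    void put(int key, T value) {
        int i = java.util.Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {
            values[i] = value; // key already present: overwrite in place
            return;
        }
        int insertAt = -(i + 1); // binarySearch returns -(insertion point) - 1 for a missing key
        if (size == keys.length) {
            int newCapacity = size * 2;
            keys = java.util.Arrays.copyOf(keys, newCapacity);
            values = java.util.Arrays.copyOf(values, newCapacity);
        }
        // Shift the tail right by one to keep the keys sorted.
        System.arraycopy(keys, insertAt, keys, insertAt + 1, size - insertAt);
        System.arraycopy(values, insertAt, values, insertAt + 1, size - insertAt);
        keys[insertAt] = key;
        values[insertAt] = value;
        size++;
    }

    int size() {
        return size;
    }
}
```

In this sketch, lookups and overwrites take O(log N). Inserting a new key that is larger than all existing keys needs no shifting, but inserting a new key in the middle shifts the tail of both arrays, which is linear in the worst case. Space is roughly one int plus one reference per stored entry, plus unused capacity.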

Courtland answered 22/4, 2013 at 3:50 Comment(3)
code.google.com/p/android-source-browsing/source/browse/core/… does look appropriate (amortized run time is undocumented, but from inspection I am guessing it is logarithmic) and is under ASL 2.0, which is fine. Unfortunately it is not in Central that I know of, and I would want it decoupled from unrelated stuff like Android Bluetooth support, which is all in the same source root.Popper
Here's a self contained version that uses all the necessary code from android github.com/frostwire/frostwire-jlibtorrent/blob/…Kalikalian
You might be looking more precisely for SparseIntArray, which avoids the cost of boxing and unboxing the indexes: developer.android.com/reference/android/util/SparseIntArray And yes, the source code is available under an easy, friendly license if you want to extract it from the Google code base and adapt it.Bradytelic
P
1

I have saved my test case as jglick/inthashmap. The results:

HashMap size: 1017504
TIntObjectMap size: 853216
IntHashMap size: 846984
OpenIntObjectHashMap size: 760472
Popper answered 27/9, 2012 at 17:8 Comment(8)
Where do I find IntHashMap?Sweep
@Sweep probably apache commons (?)Impeditive
Sorry, IntHashMap was my adaptation of OpenIntToFieldHashMap from Commons Math. Since it was barely better than TIntObjectMap I dismissed this approach.Popper
@JesseGlick see java.dzone.com/articles/time-memory-tradeoff-example and gist.github.com/leventov/bc14ea790b4d3cfd238d#file-memory-txtTopminnow
@Topminnow interesting. Addresses a different set of questions than I was asking here but a good source to investigate potential implementations.Popper
Why different? It shows the average relative memory overuse of "int -> int" maps across libraries, which correlates well with "int -> obj" because specializations are homogeneous within all the libraries.Topminnow
Well, you are also measuring access speed which I was not considering relevant (so long as it is logarithmic); and my memory comparison is to a non-sparse ArrayList<T>, which would be half the size of the “theoretical minimum” in the more general case you were considering.Popper
In your answer, only different hash table implementations are compared. I referenced another comparison, which includes all the implementations you tested, and more.Topminnow
Y
1

I suggest using OpenIntObjectHashMap from the Colt library. Link

Yearn answered 13/5, 2014 at 19:30 Comment(1)
Thanks for the tip. It does indeed have moderately but significantly lower space consumption than the alternatives. I have included this in my revised test case.Popper
V
0

Late to this question, but there is IntMap in libgdx which uses cuckoo hashing. If anything it would be interesting to compare with the others.

Virtuoso answered 19/9, 2018 at 23:6 Comment(0)
