Using ets:foldl as a poor man's forEach on every record

Short version: is it safe to use ets:foldl to delete every ETS record as one is iterating through them?

Suppose an ETS table is accumulating information and now it's time to process it all. A record is read from the table, used in some way, then deleted. (Also, assume the table is private, so no concurrency issues.)

In another language, with a similar data structure, you might use a for...each loop, processing every record and then deleting it from the hash/dict/map/whatever. However, the ets module has no foreach function the way the lists module does.

But this might work:

1> ets:new(ex, [named_table]).
ex
2> ets:insert(ex, {alice, "high"}).
true
3> ets:insert(ex, {bob, "medium"}).
true
4> ets:insert(ex, {charlie, "low"}).
true
5> ets:foldl(fun({Name, Adjective}, DontCare) ->
      io:format("~p has a ~p opinion of you~n", [Name, Adjective]),
      ets:delete(ex, Name),
      DontCare
   end, notused, ex).
bob has a "medium" opinion of you
alice has a "high" opinion of you
charlie has a "low" opinion of you
notused
6> ets:info(ex).
[...
 {size,0},
 ...]
7> ets:lookup(ex, bob).
[]

Is this the preferred approach? Is it at least correct and bug-free?

I have a general concern about modifying a data structure while iterating over it. However, the ets:foldl documentation implies that ETS is comfortable with records being modified inside foldl. Since I am essentially wiping the table clean, I want to be sure.

I am using Erlang R14B with a set table, but I'd also like to know about any caveats with other Erlang versions or other table types. Thanks!

Isthmian answered 5/12, 2010 at 19:18 Comment(0)

Your approach is safe. The reason is that ets:foldl/3 internally uses ets:first/1, ets:next/2 and ets:safe_fixtable/2. These give the guarantee you want, namely that you can delete elements during the traversal and still visit every object. See the CONCURRENCY section of erl -man ets.
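To make that concrete, here is a minimal sketch of a foreach-and-delete built directly on those three calls. It is not the OTP source for ets:foldl/3; the module name ets_foreach_sketch and the function foreach_delete/2 are hypothetical names chosen for illustration.

```erlang
-module(ets_foreach_sketch).
-export([foreach_delete/2]).

%% Hypothetical helper (not part of OTP): calls Fun on every object in Tab
%% and deletes each key once its objects have been processed.
foreach_delete(Fun, Tab) ->
    %% Fixing the table lets a first/next traversal continue safely
    %% even though objects are deleted along the way.
    true = ets:safe_fixtable(Tab, true),
    try
        loop(Fun, Tab, ets:first(Tab))
    after
        %% Always release the fix, even if Fun crashes.
        true = ets:safe_fixtable(Tab, false)
    end.

loop(_Fun, _Tab, '$end_of_table') ->
    ok;
loop(Fun, Tab, Key) ->
    %% A set table holds at most one object per key; a bag may hold several.
    lists:foreach(Fun, ets:lookup(Tab, Key)),
    %% Fetch the next key, then delete the current one.
    Next = ets:next(Tab, Key),
    true = ets:delete(Tab, Key),
    loop(Fun, Tab, Next).
```

With the table from the question, this would be called as ets_foreach_sketch:foreach_delete(fun({Name, Adjective}) -> io:format("~p has a ~p opinion of you~n", [Name, Adjective]) end, ex).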

If all you need is to remove every element from the table, there is a simpler one-liner:

ets:match_delete(ex, '_').

However, that doesn't help if you also want to do something with each row, such as the io:format call; in that case your foldl approach is probably easier.

Subdiaconate answered 5/12, 2010 at 19:42 Comment(1)
Thank you. The Concurrency section of the man page is exactly what I had missed. It clearly says if you use safe_fixtable then every object is visited once. And yes, in my real code I am of course doing some complex processing on the data before essentially marking it "done" via ets:delete. Cheers! - Isthmian

For cases like this we will alternate between two tables or just create a new table every time we start processing. When we want to start a processing cycle we switch the writers to start using the alternate or new table, then we do our processing and clear or delete the old table.

We do this because otherwise we might miss concurrent updates to a tuple that arrive while we are processing. We use this technique with high-frequency concurrent counters.
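A rough sketch of the create-a-new-table variant, under stated assumptions: the module ets_rotate_sketch, the function rotate/4 and the SwitchWriters callback are all hypothetical, and how writers actually learn about the new table depends on your application.

```erlang
-module(ets_rotate_sketch).
-export([rotate/4]).

%% Hypothetical helper: SwitchWriters is an application-supplied fun that
%% redirects all writers to the new table and returns ok once that is done.
%% Fun and Acc0 are passed to ets:foldl/3 to process the old table.
rotate(OldTab, SwitchWriters, Fun, Acc0) ->
    %% 1. Create the table that will receive new writes.
    NewTab = ets:new(counters, [set, public]),
    %% 2. Point the writers at it before touching the old data.
    ok = SwitchWriters(NewTab),
    %% 3. Process the old table; no new writes should arrive here now.
    Result = ets:foldl(Fun, Acc0, OldTab),
    %% 4. Drop the old table entirely.
    true = ets:delete(OldTab),
    {NewTab, Result}.
```

Note that the process calling rotate/4 becomes the owner of the new table, so in practice this would probably live in a long-running process.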

Epicontinental answered 6/12, 2010 at 3:32 Comment(1)
That's cool, since it's quite similar to the code-reloading mechanism. My initial concern--using foldl as a foreach--is now resolved, and it's good to be reminded of how to properly maintain counters (which is what I am doing too). Thanks! - Isthmian
