Dev Corner

Tuesday, February 21, 2006

ETS is handy

The other day I wanted a counter shared among many clients connecting to a server, so that I could assign a unique id to each one. I quickly listed the ways I could do this in Erlang.

The options were:
  1. Create a gen_server process that has private state that contains the counter variable
  2. Use ETS
  3. Make the accepting gen_server keep track of and assign incremental ids.
  4. ...
There are probably more, but those were a few that I quickly thought of.

So I started thinking about the first option, creating a new gen_server. It seemed like a little overkill, and I would have to worry about what would happen if, by some odd chance, the gen_server died.
If it did, I would somehow have to recover the state of the counter. True, the gen_server would be so simple that the chance of this happening is very slim, but still, you never know.
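
For comparison, here is a minimal sketch of what the gen_server option might look like. The module name counter_server and the function next/0 are made up for illustration; they are not from any code I actually wrote for this.

```erlang
-module(counter_server).
-behaviour(gen_server).

-export([start_link/0, next/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Start the counter process, registered locally under the module name
%% (hypothetical name). The initial state is the first id to hand out.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, 1, []).

%% Ask the server for the next id. Calls are serialized by the process,
%% so two clients can never receive the same value.
next() ->
    gen_server:call(?MODULE, next).

%% The whole state is just the next id to hand out.
init(First) ->
    {ok, First}.

handle_call(next, _From, N) ->
    {reply, N, N + 1}.

handle_cast(_Msg, N) -> {noreply, N}.
handle_info(_Info, N) -> {noreply, N}.
terminate(_Reason, _N) -> ok.
code_change(_OldVsn, N, _Extra) -> {ok, N}.
```

The counter lives only in the process state here, so if the process dies and is restarted it begins again from 1. That is exactly the worry described above.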

I thought about keeping the counter in the socket acceptor gen_server I already had and letting it assign the ids. I held off on that one because I didn't want to pollute the duties of that gen_server with lots of ancillary things.

So then I started looking into ETS. ETS stands for Erlang Term Storage. From the manual:
"... provide the ability to store very large quantities of data in an Erlang runtime system, and to have constant access time to the data."
I had used ETS before for storing server performance tables. I noticed that there was a function named update_counter, which updates a counter field in an ETS table.
So I thought, cool, let me write some quick code to prototype this.

ETS stores tuples, and a record is just a tagged tuple underneath, so let's create a record for the data that will be stored:

-record(counter_entry, {id, nextid=1}).

I added id so that I would be able to create several counters. Each counter sequence is identified by its id, so we can have many types of counters.

Next I added a function to initialize the counter table:

init(CounterID) ->
    ets:new(t_mycounters, [set, {keypos, 2}, public, named_table]),
    ets:insert(t_mycounters, #counter_entry{id=CounterID, nextid=1}).

When you create a new ets table, you can provide some options for how it is accessed and indexed. I passed in {keypos, 2}. This tells ets which tuple element serves as the key of the table. A record is stored as a tuple whose first element is the record name, so element 2 is the id field.
I gave the table public access, meaning that any process can read and write it. With the default, protected, other processes can read the table but only the owning process can write to it.

The table is also a named_table. This way, I can refer to the table by passing in the atom t_mycounters. If I hadn't done this, I would have had to stash away the table identifier returned by ets:new and use that each time I wanted to operate on the table.
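
As a rough sketch of that alternative, here is what it might look like without named_table. The module name counters_tid and the helper current/2 are made up for this example. It also writes the key position as #counter_entry.id, which evaluates to 2 for this record.

```erlang
-module(counters_tid).
-export([init/1, current/2]).

-record(counter_entry, {id, nextid=1}).

%% Create an unnamed table and hand its identifier back to the caller,
%% who has to keep it around for every later operation.
init(CounterID) ->
    %% #counter_entry.id is the tuple index of the id field, i.e. 2.
    Tab = ets:new(t_mycounters, [set, {keypos, #counter_entry.id}, public]),
    true = ets:insert(Tab, #counter_entry{id=CounterID}),
    Tab.

%% Read the stored value without changing it (hypothetical helper).
current(Tab, CounterID) ->
    [#counter_entry{nextid=N}] = ets:lookup(Tab, CounterID),
    N.
```

Every caller now needs Tab passed in, which is the bookkeeping the named table saves me from.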

I then defined a function to get the next successive value for the counter:

getnext(CounterID) ->
    %% Use the same table that init/1 created.
    ets:update_counter(t_mycounters, CounterID, {3, 1}).

ets provides a useful function to atomically treat a table field as a counter. In this case I told ets to increment tuple position 3 by one, which is where the {3,1} comes in. Position 3 of the counter_entry record is nextid. update_counter returns the new value, so with nextid starting at 1 the first call to getnext returns 2.
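
To see what "atomically" buys here, this is a small sketch that has many processes bump one counter at the same time. The module name counter_demo, the table name demo_counters and the key demo_seq are all invented for the example, and it uses a plain {Key, Value} tuple rather than the record to keep it short.

```erlang
-module(counter_demo).
-export([run/1]).

%% Spawn N processes that each bump the same counter once, then
%% return the sorted list of values they received.
run(N) ->
    Tab = ets:new(demo_counters, [set, public]),
    %% Default keypos is 1, so the counter value sits at position 2.
    true = ets:insert(Tab, {demo_seq, 0}),
    Parent = self(),
    Pids = [spawn(fun() ->
                      Id = ets:update_counter(Tab, demo_seq, {2, 1}),
                      Parent ! {self(), Id}
                  end) || _ <- lists:seq(1, N)],
    %% Collect exactly one reply from each spawned process.
    Ids = [receive {Pid, Id} -> Id end || Pid <- Pids],
    ets:delete(Tab),
    lists:sort(Ids).
```

If the updates were not atomic, some processes could read the same value and come back with duplicate ids. With update_counter, each process should get a distinct value from 1 to N.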

Here is how to use it:

counters:init(chat_id_sequence).
NextSeqNo = counters:getnext(chat_id_sequence).

Kinda like the concept of PostgreSQL sequences!
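
One wrinkle with the code as written: init both creates the named table and inserts the counter, so calling it a second time for another counter id would fail because the table already exists. A possible variation, assuming an Erlang/OTP release that ships ets:update_counter/4 (the four-argument form that takes a default object), is to create the table once and let counters appear on first use. The module name counters_lazy and the table name t_lazycounters are made up for this sketch.

```erlang
-module(counters_lazy).
-export([init/0, getnext/1]).

-record(counter_entry, {id, nextid=1}).

%% Create the table once; individual counters are created on demand.
init() ->
    ets:new(t_lazycounters,
            [set, {keypos, #counter_entry.id}, public, named_table]),
    ok.

%% If CounterID has no row yet, the default record is inserted first
%% (ets fills in the key itself), then the increment is applied to it.
getnext(CounterID) ->
    ets:update_counter(t_lazycounters, CounterID,
                       {#counter_entry.nextid, 1},
                       #counter_entry{}).
```

As in the original, nextid starts at 1, so the first value handed out for any counter is 2.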

Well, that's it for now. If you were to take one thing away, it should be: ETS is handy and cool.

6 Comments:

  • Re: gen_server

    If it is used in the context of OTP (as was intended) it would have a supervisor. The supervisor would both keep the state (so a crash would not lose it) and restart the gen_server if it crashed.

    By Anonymous, at 5:55 PM

  • You have a point, but I wanted the supervisor to supervise. I didn't want to pollute it with id state saving duties.

    Also, with public named ets tables, it makes it easy to expose the counter values to remote nodes via ets.

    By Blogger Ernie Makris, at 6:20 PM  

  • It is not polluting - it is the standard supervisor pattern. It is the normal recovery mechanism.

    Also a gen-server can be exposed to any node you desire.

    By Anonymous, at 12:24 AM

  • Example for me... how would a supervisor save the internal state of its child then have the next instantiation recreate it?

    By Anonymous ryan rawson, at 4:33 AM  

  • I think that is the style of my blog. Show what you mean in code. I second the request for an example:)

    By Blogger Ernie Makris, at 10:18 AM  

  • i would like to suggest that an arguably better (cleaner/likely more efficient) method of solving this is to have your gen_server spawn a process that maintains counter state and serializes the requests. you can do this in the init() callback.
    you can link this process with the gen_server handler (but make sure the counter process is trapping exits!) and also register() it with a name that prevents a gen_server restart from spawning a duplicate instantiation.
    You can keep the process PID returned from whereis() in the gen_server #state{} so that you can ! to it directly without a whereis() lookup every iteration. You can also make the gen_server watch for exits from the counter proc with the handle_info() callback. have fun!

    By Anonymous, at 2:26 PM
