Elixir + Phoenix Channels memory consumption

I'm pretty new to Elixir and the Phoenix Framework, so maybe my question is a bit dumb.

I have an app with Elixir + Phoenix Framework as the backend and Angular 2 as the frontend, using Phoenix Channels for front-end/back-end communication. And I found a strange situation: if I send a large block of data from the backend to the frontend, the memory consumption of that particular channel process goes up to hundreds of MB. And each connection (each channel process) eats that amount of memory, even after the transmission ends.

Here is a code snippet from the backend channel module:

defmodule MyApp.PlaylistsUserChannel do
  use MyApp.Web, :channel

  import Ecto.Query

  alias MyApp.Repo
  alias MyApp.Playlist

  # skipped ... #

  # Content list request handler
  def handle_in("playlists:list", _payload, socket) do 
    opid = socket.assigns.opid + 1
    socket = assign(socket, :opid, opid)

    send(self(), :list)
    {:reply, :ok, socket}
  end

  # skipped ... #        

  def handle_info(:list, socket) do

    payload = %{opid: socket.assigns.opid}

    result =
    try do
      user = socket.assigns.current_user
      playlists = user
                  |> Playlist.get_by_user
                  |> order_by([desc: :updated_at])
                  |> Repo.all

      %{data: playlists}
    catch
      _ ->
        %{error: "No playlists"}
    end

    payload = payload |> Map.merge(result)

    push(socket, "playlists:list", payload)

    {:noreply, socket}
  end
end

I created a set of 60,000 records just to test the frontend's ability to deal with that amount of data, but got a side effect: I found that the particular channel process's memory consumption was 167 MB. So I opened a few new browser windows, and each new channel process's memory consumption grew to that amount after the "playlists:list" request.

Is this normal behaviour? I would expect high memory consumption during the database query and data offload, but it stays the same even after the request has finished.

UPDATE 1. With the big help of @Dogbert and @michalmuskala I found that the memory is freed after manual garbage collection.

I tried to dig a little with the recon_ex library and got the following output:

iex([email protected])19> :recon.proc_count(:memory, 3)
[{#PID<0.4410.6>, 212908688,
  [current_function: {:gen_server, :loop, 6},
   initial_call: {:proc_lib, :init_p, 5}]},
 {#PID<0.4405.6>, 123211576,
  [current_function: {:cowboy_websocket, :handler_loop, 4},
   initial_call: {:cowboy_protocol, :init, 4}]},
 {#PID<0.12.0>, 689512,
  [:code_server, {:current_function, {:code_server, :loop, 1}},
   {:initial_call, {:erlang, :apply, 2}}]}]

#PID<0.4410.6> is Elixir.Phoenix.Channel.Server and #PID<0.4405.6> is cowboy_protocol.
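(The initial_call shown above is the generic :proc_lib entry point; for OTP processes the real starting module is kept in the process dictionary under :"$initial_call", so a pid can be mapped to its module roughly like this, using the pid/3 IEx helper; output abridged and illustrative:)

iex> :erlang.process_info(pid(0, 4410, 6), :dictionary)
{:dictionary, ["$initial_call": {Phoenix.Channel.Server, :init, 1}, ...]}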

Next I went with:

iex([email protected])20> :recon.proc_count(:binary_memory, 3)
[{#PID<0.4410.6>, 31539642,
  [current_function: {:gen_server, :loop, 6},
   initial_call: {:proc_lib, :init_p, 5}]},
 {#PID<0.4405.6>, 19178914,
  [current_function: {:cowboy_websocket, :handler_loop, 4},
   initial_call: {:cowboy_protocol, :init, 4}]},
 {#PID<0.75.0>, 24180,
  [Mix.ProjectStack, {:current_function, {:gen_server, :loop, 6}},
   {:initial_call, {:proc_lib, :init_p, 5}}]}]

and:

iex([email protected])22> :recon.bin_leak(3)                  
[{#PID<0.4410.6>, -368766,
  [current_function: {:gen_server, :loop, 6},
   initial_call: {:proc_lib, :init_p, 5}]},
 {#PID<0.4405.6>, -210112,
  [current_function: {:cowboy_websocket, :handler_loop, 4},
   initial_call: {:cowboy_protocol, :init, 4}]},
 {#PID<0.775.0>, -133,
  [MyApp.Endpoint.CodeReloader,
   {:current_function, {:gen_server, :loop, 6}},
   {:initial_call, {:proc_lib, :init_p, 5}}]}]

And finally, the state of the problem processes after recon.bin_leak (that is, after garbage collection, of course; if I run :erlang.garbage_collect/1 with the pids of these processes, the result is the same):

 {#PID<0.4405.6>, 34608,
  [current_function: {:cowboy_websocket, :handler_loop, 4},
   initial_call: {:cowboy_protocol, :init, 4}]},
...
 {#PID<0.4410.6>, 5936,
  [current_function: {:gen_server, :loop, 6},
   initial_call: {:proc_lib, :init_p, 5}]},

If I do not run garbage collection manually, the memory is "never" freed (at least, I waited for 16 hours).
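(For reference, the manual collection above amounts to roughly this, with the pid numbers taken from the recon output and pid/3 being the IEx helper:)

for pid <- [pid(0, 4410, 6), pid(0, 4405, 6)] do
  # Force a full GC on the channel and transport processes; this is what
  # finally drops the refc binaries they still reference.
  :erlang.garbage_collect(pid)
end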

Just to recap: I get this memory consumption after sending a message from the backend to the frontend with 70,000 records fetched from Postgres. The model is pretty simple:

  schema "playlists" do
    field :title, :string
    field :description, :string    
    belongs_to :user, MyApp.User
    timestamps()
  end

Records are autogenerated and look like this:

description: null
id: "da9a8cae-57f6-11e6-a1ff-bf911db31539"
inserted_at: Mon Aug 01 2016 19:47:22 GMT+0500 (YEKT)
title: "Playlist at 2016-08-01 14:47:22"
updated_at: Mon Aug 01 2016 19:47:22 GMT+0500 (YEKT)

I would really appreciate any advice here. I believe I won't normally send such a big amount of data, but even smaller data sets could lead to huge memory consumption when there are many client connections. And since I haven't coded anything tricky, this situation probably hides some more general problem (but that's just an assumption, of course).

Hobnob answered 31/7, 2016 at 16:35 Comment(9)
Definitely not normal. Does the memory usage go down after you disconnect and the process dies? Also, try :observer.start to see which process is using the memory and for what. - Crissie
How much memory does the raw data you're sending as JSON take? The 60_000 records? The process memory will need to grow to at least that size (probably even more, because of garbage along the way) to handle it. - Cannikin
@Crissie I found this memory consumption with :observer. The amount is consumed by the Elixir.Phoenix.Channel.Server:init/1 process, which is why I'm asking about Phoenix channel memory consumption. And yes, after disconnect the process dies and the memory is freed. By the way, I found that cowboy_protocol:init/4 eats 100 MB as well for each connection. - Hobnob
@Cannikin but shouldn't it free the memory after the data offload? In Chrome developer tools I can see that the length of the message with that data is 11029018 (bytes, probably?). - Hobnob
@Hobnob there are some things you could do to analyse where the memory use comes from. This might be a case of binaries being held for too long. Does the memory go down if you run :erlang.garbage_collect()? You can find many tips on debugging memory in erlang-in-anger.com - Cannikin
@Cannikin no, the memory stays at the same level even after :erlang.garbage_collect(). I'll dig into the book, thanks a lot! - Hobnob
Pretty weird: the app asked for such a big request only once, after the first connect. I set up a button to request that dataset manually. After that I prepared another set with 70K records. And I can see in observer that Elixir.Phoenix.Channel.Server memory consumption goes down right after the second request and then back up to the same level (212 MB) during the answer, but cowboy_protocol memory consumption after the second request goes up to 378 MB (!!!) for every socket connection. I don't understand what's going on at all. - Hobnob
@Crissie could you please review my Update 1? - Hobnob
@Cannikin could you please review my Update 1? - Hobnob

This is a classic example of a binary memory leak. Let me explain what's happening:

You handle a really big amount of data in the process. This grows the process heap so that the process can handle all that data. After you're done handling it, most of the memory is freed, but the heap remains big and possibly holds a reference to the big binary that was created as the final step of handling the data. So now we have a large binary referenced by the process and a big heap with few elements in it. At this point the process enters a slow period, handling only small amounts of data or even no data at all. This means the next garbage collection will be very delayed (remember: the heap is big), and it may take a really long time until garbage collection actually runs and reclaims the memory.
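You can see this directly by asking the VM which refc binaries a process still holds on to; a sketch, with the pid taken from your recon output:

# {id, size, refcount} for every off-heap binary the process references;
# summing the sizes approximates the memory a full GC would reclaim.
{:binary, bins} = :erlang.process_info(pid(0, 4410, 6), :binary)
bins |> Enum.map(fn {_id, size, _refc} -> size end) |> Enum.sum()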

Why is the memory growing in two processes? The channel process grows because it queries the database for all that data and decodes it. Once the result is decoded into structs/maps, it is sent to the transport process (the cowboy handler). Sending messages between processes means copying, so all that data is copied over. This means the transport process has to grow to accommodate the data it's receiving. In the transport process the data is encoded into JSON. Both processes have to grow, and afterwards they just sit there with big heaps and nothing to do.
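A quick illustration of the copying semantics (a standalone iex sketch, not your app's code): terms sent between processes are deep-copied into the receiver's heap, while binaries over 64 bytes are reference-counted and only a pointer is copied:

# Spawn a receiver and print its memory footprint after it has
# received (and therefore copied in) the message.
measure = fn msg ->
  pid =
    spawn(fn ->
      receive do
        _ ->
          {:memory, bytes} = Process.info(self(), :memory)
          IO.puts("receiver memory: #{bytes} bytes")
      end
    end)

  send(pid, msg)
end

playlists = Enum.map(1..60_000, &%{id: &1, title: "Playlist #{&1}"})

measure.(playlists)                          # megabytes copied onto the heap
measure.(:erlang.term_to_binary(playlists))  # only a small reference is copied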

Now to the solutions. One way would be to explicitly run :erlang.garbage_collect/0 when you know you've just processed a lot of data and won't do so again for some time. Another would be to avoid growing the heap in the first place: you could process the data in a separate process (possibly a Task) and only concern yourself with the final encoded result. After the intermediary process is done processing the data, it will exit and free all of its memory. At that point you'll only be passing a refc binary between processes, without growing the heaps. Finally, there's always the usual approach for handling lots of data that isn't needed all at once: pagination.
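For illustration, the Task variant could look roughly like this in your handle_info/2 (a sketch, untested; Poison.encode!/1 stands in for whatever JSON encoder the project uses, the schema is assumed to derive its encoder protocol, and since the data field arrives pre-encoded the client has to JSON-decode it a second time):

def handle_info(:list, socket) do
  user = socket.assigns.current_user

  # Query and encode inside a short-lived process: its big heap is
  # reclaimed in full when the task exits, and the only thing copied
  # back is a single refc binary stored off-heap.
  json =
    Task.async(fn ->
      user
      |> Playlist.get_by_user()
      |> order_by(desc: :updated_at)
      |> Repo.all()
      |> Poison.encode!()
    end)
    |> Task.await(30_000)

  push(socket, "playlists:list", %{opid: socket.assigns.opid, data: json})

  {:noreply, socket}
end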

Cannikin answered 1/8, 2016 at 20:31 Comment(2)
Thanks a lot for such a detailed explanation! I was thinking about the ways you offered, but I see some caveats here (in my particular case): if I use :erlang.garbage_collect(), I can run it only after fetching data from the DB and pushing it to the socket, but before the cowboy_protocol process is finished, so the app is left with the cowboy_protocol process's big heap. I tried to implement the Task.async behaviour, but probably I'm doing something wrong, because memory consumption is even higher in that case. Here is a gist with the code. - Hobnob
To be honest, I think that the Phoenix Framework itself should take care of garbage collection, because I can't see a way to free memory after operations that start once my own code has finished. - Hobnob
