elua/elua

Large RAM usage while pre compiled lua code is in the romfs

Closed this issue · 15 comments

Hi everyone,

I am currently working on a system that runs a small POSIX-like operating system. We included the Lua variant from eLua because of the LTR patch.
I have quite a lot of Lua code organized as modules in this fashion:

local mod = {}
..
..
return mod

I put several modules, precompiled as bytecode, into the romfs. When I require them in Lua via:

x = require "mod"

the RAM consumption rises by 2-10 kB depending on the size of the module. I check this via collectgarbage('count') before and after the require, and I also run collectgarbage('collect') in between.
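
The measurement described above can be sketched roughly as follows. The helper name `require_cost` and the stand-in module `demo_mod` (registered through `package.preload` so the snippet runs without any files) are made up for illustration; on a target you would pass the name of a real module in the romfs instead.

```lua
-- Hypothetical helper: reports how many kB stay allocated after require(name).
local function require_cost(name)
  collectgarbage("collect")
  local before = collectgarbage("count")   -- heap size in kB
  local mod = require(name)
  collectgarbage("collect")                -- drop temporaries created while loading
  local after = collectgarbage("count")
  return after - before, mod
end

-- Stand-in module so the sketch is self-contained (assumption, not from the thread).
package.preload["demo_mod"] = function()
  local mod = {}
  mod.x = string.rep("x", 2048)
  return mod
end

local kb, mod = require_cost("demo_mod")
print(string.format("demo_mod retains about %.2f kB", kb))
```

Running a full collection both before and after the require keeps short-lived loader garbage out of the reported number, so what remains is what the module itself keeps alive.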

So maybe I am doing or understanding something wrong, but I thought that eLua is able to execute bytecode directly from the romfs. So I expected the RAM consumption to rise just a little, not by kilobytes.

Is there a way to test whether the Lua code is executed from flash?

Thanks in advance for any help,
Hans

Hi,

What version of eLua are you using?

Thanks,
Bogdan

It's 0.9

I'll do some tests, but for now, does your system happen to have an SD card attached? A simple way to test if the bytecode from flash works properly is to precompile your module and copy it to two different locations: ROMFS and SD. Then require() it explicitly from both locations and check if there's a difference.
One more thing though: when you do require(), Lua will execute the code in the module you're requiring. Since this happens at runtime, it can create quite a few data structures in RAM (depending on the actual content of your module). So, even if your bytecode is read directly from flash, Lua still needs RAM to represent this module internally. I'm afraid this can't be changed; if it can, it will probably work only in some (possibly limited) cases and with a lot of changes to the Lua code.
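
A rough sketch of that comparison, for anyone who wants to try it. The helper `chunk_cost`, the file name `mod.lc` and the mount points `/rom` and `/mmc` are assumptions for illustration; adjust them to wherever the two copies of the precompiled module actually live. It uses loadfile() rather than require() so that each copy is loaded from an explicit path.

```lua
-- Hypothetical helper: kB retained after loading and running the chunk at `path`.
-- Returns nil plus an error message if the file cannot be loaded.
local function chunk_cost(path)
  collectgarbage("collect")
  local before = collectgarbage("count")
  local chunk, err = loadfile(path)
  if not chunk then return nil, err end
  local mod = chunk()                      -- run the module body, keep the result alive
  collectgarbage("collect")
  local after = collectgarbage("count")
  return after - before
end

-- Assumed locations of the same precompiled module (not taken from the thread).
for _, path in ipairs({ "/rom/mod.lc", "/mmc/mod.lc" }) do
  local kb, err = chunk_cost(path)
  if kb then
    print(string.format("%s: %.2f kB", path, kb))
  else
    print(path .. ": could not load (" .. tostring(err) .. ")")
  end
end
```

If executing bytecode from flash works, the ROMFS copy should come out somewhat cheaper than the SD copy, but neither will be zero, for the reason given above.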

So it seems RAM consumption is quite a hot topic on the mailing list ;)

Anyway, it's clear to me that a VM cannot execute the bytecode directly and still has to parse it. But I don't know anything about the internals of the Lua VM. I have the Programming in Lua book for Lua 5.2, but it says almost nothing about the memory model of the VM. Can you suggest some literature about the details of the Lua VM?

I'll try to give an example of what I mean and what I have observed so far. If I take a Lua module like this:

Example 1

local mod = {}
mod.x = string.rep("x",2048)
function mod.size()
  print(#mod.x)
end
return mod

When I require this module precompiled from the romfs, it consumes ~3 kB of RAM. If I instead do it like this:

Example 2

local mod = {}
function mod.size()
  local x = string.rep("x",2048)
  print(#x)
end
return mod

It takes about 0.1 kB of RAM :) And this is basically what it is all about. As I understand it, in example 1 Lua copies the whole table entry mod.x into the active state and holds it there because it might be manipulated. In example 2 the variable is just created locally, used and then garbage collected.
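
The difference between the two examples can be reproduced on any stock Lua with a sketch like the one below. The helper `retained_kb` is made up for illustration, and size() returns the length instead of printing it so the snippet stays quiet; the second example uses "y" so that it does not share an interned string with the first.

```lua
-- Hypothetical helper: kB still allocated after build() returns and a full GC runs.
local function retained_kb(build)
  collectgarbage("collect")
  local before = collectgarbage("count")
  local mod = build()          -- keep the module alive, as package.loaded would
  collectgarbage("collect")
  local after = collectgarbage("count")
  return after - before, mod
end

-- Example 1: the string is stored in the module table, so it stays reachable.
local kb1, mod1 = retained_kb(function()
  local mod = {}
  mod.x = string.rep("x", 2048)
  function mod.size() return #mod.x end
  return mod
end)

-- Example 2: the string exists only while size() runs.
local kb2, mod2 = retained_kb(function()
  local mod = {}
  function mod.size()
    local y = string.rep("y", 2048)
    return #y
  end
  return mod
end)

print(string.format("example 1: %.2f kB, example 2: %.2f kB", kb1, kb2))
```

The exact numbers depend on the Lua version and build, but the first figure should exceed the second by roughly the size of the 2048-byte string.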

What I actually thought about the LTR patch is that it can recognize that the large array mod.x is in a read-only memory area, so it can't be manipulated and there is no need to copy it into RAM. And if I try to manipulate it, I get an error.
I hope this example makes it a little clearer what I'm looking for.

I think the confusion here has to do with what's in a precompiled Lua file. There's not just bytecode there, there's loads of other data that is read by Lua when the file is loaded; some of this data will require memory allocation. There's an excellent article on the subject available at http://luaforge.net/docman/83/98/ANoFrillsIntroToLua51VMInstructions.pdf (check section 4 in particular if you don't want to go through the whole document).

Thanks for the article, it looks very good.

But don't you think that the rotable behavior I describe would be useful? Especially in embedded systems, I think it would be very nice if you could use a lot of data in Lua (lookup tables, filter coefficients etc.) just by requiring a module from the romfs.
I thought this was the improvement rotables bring, or am I wrong?

The rotable thing was meant for C extensions. Originally, when you wrote C extensions, you'd create Lua closures for each C function you exposed to Lua, which took a lot of memory (this was fixed meanwhile in Lua 5.2 and above). This is combined with a new compile time Lua table type, but again, this new type only exists for C extensions. Making this work for Lua itself would be much harder and sometimes (many times?) impossible. But it's definitely a target for improvement.

OK, I see. Many thanks for your help.

I think I will post a few words about this on the Lua mailing list again tomorrow. Our idea with mmap() is driven more by the fact that Lua would then work the same on Linux, Windows or an embedded system.
So you could test on Linux or Windows and run on embedded, or vice versa, without the need for a special memory patch for embedded systems.
This is why we had the idea of suggesting it to the Lua developers directly.

mmap() is not going to help you much. eLua implements a very basic pseudo-mmap especially for the purpose of reading bytecode from flash, but there's not so much more you can do. Taking your initial example, Lua doesn't copy the whole module when you do require (as a sidenote, I don't think Lua ever copies whole modules). Instead, it just saves a reference to that module to package.loaded, which takes just a bit of memory. So your problem isn't actually there; your problem resides in the fact that in the first example you effectively export "x" to the users of your module by doing "mod.x = string.rep...", so Lua needs to keep that alive. No amount of loading code/data from flash will save you from this :)
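
The point about package.loaded can be checked with a small sketch. The module name `demo_ref` and the `package.preload` entry are stand-ins invented for this example so that it runs without any files.

```lua
-- Stand-in module registered via package.preload (an assumption for this sketch).
local runs = 0
package.preload["demo_ref"] = function()
  runs = runs + 1
  return { x = string.rep("x", 2048) }
end

local a = require("demo_ref")
local b = require("demo_ref")   -- served from package.loaded, body not run again

local same = (a == b)                              -- both names refer to one table
local cached = (package.loaded["demo_ref"] == a)   -- and so does package.loaded
print(same, cached, runs)

-- As long as package.loaded (or any other variable) refers to the table,
-- the 2 kB string in its x field cannot be collected. Dropping every
-- reference makes it garbage.
package.loaded["demo_ref"] = nil
a, b = nil, nil
collectgarbage("collect")
```

So require() stores one reference per module, and the memory cost comes from whatever that module table keeps reachable.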

OK, my example 1 is badly chosen. But if I had something like this:

Example 3

local mod = {}
  function mod.x(sel)
    local str = "some very long string which holds alot of info"
    local table = {"xxx", "xx", "x", ... } -- some large table
    if sel == 1 then
      return str
    else 
      return table
    end
  end
return mod

Would something like this consume a lot of RAM when I require the module?

Regardless of requiring this as source or as precompiled bytecode, these things need to happen:

  • space for "mod" needs to be allocated in RAM
  • space for "mod.x" (a closure) needs to be allocated in RAM
  • space for local variables (str, table) needs to be allocated in RAM

So having the module precompiled doesn't automatically prevent memory allocation. Some things (like tables) are always allocated at runtime.
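
A minimal sketch of this on a desktop Lua, where everything lives in RAM: the module is compiled in memory with string.dump(), which produces roughly the same kind of binary chunk that luac writes to a file, and then loaded back. The module source is a shortened stand-in for example 3, with a three-element table in place of the large one.

```lua
-- loadstring exists in Lua 5.1 (and eLua); newer versions use load for strings.
local load_chunk = loadstring or load

-- Illustrative module source; the short table stands in for "some large table".
local source = [[
local mod = {}
function mod.x(sel)
  local str = "some very long string which holds a lot of info"
  local tbl = { "xxx", "xx", "x" }
  if sel == 1 then return str else return tbl end
end
return mod
]]

-- Precompile in memory to get a binary chunk.
local bytecode = string.dump(assert(load_chunk(source)))

collectgarbage("collect")
local before = collectgarbage("count")
local mod = assert(load_chunk(bytecode))()   -- load the binary chunk and run it
collectgarbage("collect")
local after = collectgarbage("count")

print(string.format("retained after loading bytecode: %.2f kB", after - before))
print(type(mod), type(mod.x))
```

Even though no parsing of source text happens in the measured part, the module table, the closure and the function prototype behind it are still allocated. On eLua some of the prototype data may stay in flash, so the figure there can be smaller, but not zero.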

But the local variables are only allocated in RAM for the duration of the function call, and after that the GC will free the RAM again, won't it?
I did a quick test and it seems to work this way. Also, the precompiled file of example 3 is much larger than that of example 1.

Another question that came to my mind: you said your LTR patch is mostly for C extensions in Lua 5.1, but this is something that is fixed in Lua 5.2 and above.
Can you then use Lua 5.3 with RAM consumption similar to 5.1 with the LTR patch?

Sure, local variables will get out of the way, but "mod" and "mod.x" still need to be allocated. This is why you can't expect to simply mmap() a bytecode file and use it as is: RAM still needs to be allocated for a bytecode file. There's no way around it.
About LTR, the thing that was "fixed" in 5.2 (bad wording on my side, this wasn't actually a fix since nothing was actually broken in Lua 😄, it was just an improvement) is the addition of "light C functions", which are not closures, thus don't require memory allocation when exporting to Lua. Rotables didn't make it to Lua 5.2, and neither did a bunch of other memory optimizations in eLua (partial strings in flash, bytecode in flash and so on). Because of this, I'd be very surprised if you could use Lua 5.3 with a RAM consumption similar to eLua 5.1 (but this will of course depend a lot on your actual application).

All right, thank you. I now have a better understanding of RAM consumption in Lua.
It's clear to me that it's not possible to just mmap() a bytecode file and have no RAM usage. But I want to be as efficient as possible in my embedded application. The idea of the mmap() is that only the references from the Lua VM into the file use RAM. Everything else is done by fetching code/data and collecting it when it is no longer needed.
I think it's just a different approach from the one you took with your LTR patch. (as far as I can tell 😉 )

The nice thing about having mmap() included in Lua directly would be the same behavior across different platforms: Linux, Windows, embedded. It's a more generic approach to a common problem, and I always tend to prefer more general solutions 😄

Another advantage of mmap() in Lua would be that, if you have something like the /rfs and the server supports something like mmap(), it should be possible to keep read-only precompiled code in the rfs, so there is no need to flash every time.
You would still keep the lower RAM usage, as if the code were precompiled in the romfs.