Is it possible to call loadstring on a string of Lua bytecode that contains a reference to a C function?

We are using the Love2d Lua game engine, which exposes a graphics API to Lua. We are trying to serialize a giant hash table that contains all the save game data for the game world. This table includes some functions, and some of these functions call Love2d C functions.

In order to serialize the functions in the hash, we use string.dump, and load them back in with loadstring. This works well for pure Lua functions, but when we try to serialize and then load back in a function which calls a wrapped C function such as one in the Love2d api, loadstring returns nil.

Consider the following simple program that draws "hello, world" to the screen via Love2d's graphics engine:

function love.load()
    draw = function()
        love.graphics.print('hello, world', 10, 10)
    end
end
function love.draw()
    draw()
end

We would like to be able to do this:

function love.load()
    draw_before_serialize = function()
        love.graphics.print('hello, world', 10, 10)
    end

    out = io.open("serialized.lua", "wb")
    out:write('draw = load([[' .. string.dump(draw_before_serialize) .. ']])')
    out:close()

    require "serialized"
end
function love.draw()
    draw()
end

Doing this writes to a Lua file on disk that contains a mix of non-compiled Lua and Lua bytecode, which looks something like this:

draw = load([[^[LJ^A^@      
       @main.lua2^@^@^B^@^B^@^D^E^B^B4^@^@^@%^A^A^@>^@^B^AG^@^A^@^Qhello, world 
       print^A^A^A^B^@^@]])

This method works fine with Lua functions that do not call C modules. We think the call into a C module is the problem, because this example does work:

function love.load()
    draw_before_serialize = function()
        print('hello, world')
    end

    out = io.open("serialized.lua", "wb")
    out:write('draw = load([[' .. string.dump(draw_before_serialize) .. ']])')
    out:close()

    require "serialized"
end
function love.draw()
    draw()
end

Instead of calling the Love2d graphics method, it does a print to the console.

After more testing, we were confused to find that this example does work:

function love.load()
    draw_before_serialize = function()
        love.graphics.print('hello, world', 10, 10)
    end

    draw = load(string.dump(draw_before_serialize))
end
function love.draw()
    draw()
end

Here we don't actually write out the function to disk, and instead just dump it and then immediately load it back. We thought that perhaps the culprit was not writing out the data with the binary write mode flag set ("wb"), but since we are on Linux this flag has no effect.

Any ideas?

Purple answered 2/6, 2012 at 21:53 Comment(3)
"does a print to the console" what does it print? Also, are you sure that the global environment used by the above code is the same as the global environment used by require? [since you are depending on draw being defined in the global environment] - Zibeline
You should be advised that this: [[' .. string.dump(draw_before_serialize) .. ']] is not necessarily going to work. The dump you get could contain anything, including the characters ]], which would terminate the string early and break things. - Alienage
@NicolBolas I once saw an easy yet smart solution for that: it checked the string for ](=*)] and then framed the dump with one more = than the maximum number of ='s found by the match. - Toddle
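For what it's worth, here is a minimal sketch of the bracket-level trick described in the last comment. The helper name long_bracket_quote is made up for illustration, and the extra trailing "]" in the probe is my own addition so that a string ending in "]" or "]=" is covered. Note that this only prevents early termination: long-bracket strings also drop a leading newline and normalize carriage returns, so it is still not a safe container for arbitrary bytecode.

```lua
-- Hypothetical helper: wrap s in a long-bracket literal that s cannot close early.
local function long_bracket_quote(s)
    -- Probe s plus one "]" so that an s ending in "]" or "]=" is detected too.
    local probe = s .. "]"
    local level, pos = -1, 1
    while true do
        local _, stop, eqs = probe:find("%](=*)%]", pos)
        if not stop then break end
        if #eqs > level then level = #eqs end
        pos = stop -- this closing "]" may also begin the next candidate
    end
    -- Use one more "=" than the longest closing sequence found.
    local eq = string.rep("=", level + 1)
    return "[" .. eq .. "[" .. s .. "]" .. eq .. "]"
end

local loader = loadstring or load -- loadstring in Lua 5.1/LuaJIT, load in 5.2+
local tricky = "a]]b]=]c]"
local literal = long_bracket_quote(tricky)
local restored = loader("return " .. literal)()
print(literal, restored == tricky)
```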

I think the problem is in the formatting of the string. Nicol Bolas might be right about the [[]] quote marks surrounding your bytecode dump, but this points at a bigger problem: the bytecode really could be anything, yet you're treating it like a normal string that can be written to and read from a text file. Your last demo shows this: it works because you load the dumped string without ever writing it to a file.
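To make the "could be anything" point concrete, here is a small sketch (not taken from the question's code) of one way a [[ ]] literal alters binary data: Lua's lexer turns a carriage return inside a long-bracket string into a newline, whereas a decimal escape in a quoted string survives intact. In the escaped dump at the end of this answer, byte 013 appears just before "graphics", so that may well be the byte that gets altered in the question's version; that is an inference on my part, not something I verified.

```lua
local loader = loadstring or load -- loadstring in Lua 5.1/LuaJIT, load in 5.2+

local original = "A\rB" -- byte 13 (carriage return) in the middle

-- Embed the raw bytes in a long-bracket literal, as the question does.
local via_brackets = loader("return [[" .. original .. "]]")()

-- Embed the same data as decimal escapes in a quoted literal instead.
local via_escapes = loader('return "A\\013B"')()

print(original:byte(1, -1))      -- 65 13 66
print(via_brackets:byte(1, -1))  -- 65 10 66 (the carriage return became a newline)
print(via_escapes == original)   -- true
```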

This implementation of a serializer for tables that include functions does roughly what you want, I think, but I also think it's broken (I couldn't get it to work, anyway). Still, it's on the right track: you need to format the bytecode and then write it to the file.

I'm sure there's a better way to do it, but this works:

1.    binary = string.dump(some_function)
2.    formatted_binary = ""
3.    for i = 1, string.len(binary) do
4.        dec, _ = ("\\%3d"):format(binary:sub(i, i):byte()):gsub(' ', '0')
5.        formatted_binary = formatted_binary .. dec
6.    end

This loops through each byte of the bytecode and formats it as an escape sequence: each one becomes a string containing a code like "\097", which Lua reads back as "a" when it parses the string literal.

Line 4 of this sample is kind of dense so I'll break it down. First,

binary:sub(i, i)

pulls the i'th character out of the string. Then

binary:sub(i, i):byte()

gives back the numeric value (0 to 255) of the i'th byte. Then we format it with

("\\%3d"):format(binary:sub(i, i):byte())

which gives us a string like "\ 97", for example, if the character were "a". But this won't escape properly because we need "\097", so we do a gsub replacing " " with "0". The gsub returns the resulting string and the number of substitutions that were performed, so we just take the first return value and put it in "dec". ("%3d" pads with spaces; the format "%03d" pads with zeros directly, which would make the gsub unnecessary.)

Then, in order to execute the formatted binary string, we need Lua to unescape it and pass the result to "load". The [[ ]] long-bracket quotes in Lua don't process escapes the way "" does; in fact they don't process any escape sequences at all. So to make an executable Lua string that returns a function that does whatever "some_function" does, we do this:

executable_string = 'load("' .. formatted_binary .. '")'
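The escaping loop and the load("...") wrapper above can be condensed into one helper. This is only a sketch: escape_bytes and greet are names I made up, it uses "%03d" instead of the gsub on spaces, and it picks loadstring or load by availability so it also runs on a stock Lua interpreter rather than only under LÖVE.

```lua
local loader = loadstring or load                       -- compiles a chunk from a string
local loader_name = loadstring and "loadstring" or "load" -- name to emit in generated code

-- Hypothetical helper: turn every byte of s into a \ddd decimal escape.
local function escape_bytes(s)
    return (s:gsub(".", function(c)
        return ("\\%03d"):format(c:byte())
    end))
end

-- A pure Lua function standing in for "some_function".
local function greet()
    return "hello, world"
end

-- Build source text of the form: return load("\027\076...")
local chunk_source = "return " .. loader_name .. '("'
    .. escape_bytes(string.dump(greet)) .. '")'

-- Running that source yields the deserialized function.
local restored = loader(chunk_source)()
print(restored())
```

Writing chunk_source to a file (with "draw = " in place of "return ") gives the same kind of file as the full example that follows.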

Ok - so putting all that together, I think we can make your test-case work like so:

function love.load()
    draw_before_serialize = function()
        love.graphics.print('hello, world', 10, 10)
    end

    binary = string.dump(draw_before_serialize)
    formatted_binary = ""
    for i = 1, string.len(binary) do
        dec, _ = ("\\%3d"):format(binary:sub(i, i):byte()):gsub(' ', '0')
        formatted_binary = formatted_binary .. dec
    end

    out = io.open("serialized.lua", "wb")
    out:write('draw = load("' .. formatted_binary .. '")')
    out:close()

    require "serialized"
end
function love.draw()
    draw()
end

When I run this with Love I get an OpenGL screen with "hello world" printed in the corner. The resulting file "serialized.lua" contains the following:

draw = load("\027\076\074\001\000\009\064\109\097\105\110\046\108\117\097\084\000\000\004\000\004\000\008\009\002\002\052\000\000\000\055\000\001\000\055\000\002\000\037\001\003\000\039\002\010\000\039\003\010\000\062\000\004\001\071\000\001\000\017\104\101\108\108\111\044\032\119\111\114\108\100\010\112\114\105\110\116\013\103\114\097\112\104\105\099\115\009\108\111\118\101\001\001\001\001\001\001\001\002\000\000")
Standard answered 8/6, 2012 at 22:24 Comment(1)
Is it hard to do the opposite? I have a serialized string almost like the one you posted, and I want to convert it back to a human-readable language so I can make some minor changes to the script that this string generates. - Hax
