lune-org/lune

Lune crashes when writing a very deeply nested Instance to a file

Closed this issue · 6 comments

Lune Version: 0.6.7 (Win64)
Step 1: Create a very deeply nested Instance. Around 2,870 nests should be enough, but I'm using 32,768 nests as an example.

local j = {Instance.new("Part")}
for i=1,32768 do
	local part = Instance.new("Part")
	part.Parent = j[i]
	table.insert(j, part)
end

Step 2: Write that instance to a file or model.

roblox.writeModelFile("./myNestedPlace.rbxm", {j[1]})

Actual result: Lune crashes with a fatal unrecoverable error: thread 'blocking-1' has overflowed its stack
Expected result: Lune either writes the file or complains that it's too deeply nested.

Whats the usecase for creating such a deeply nested instance structure? Does this error in roblox studio and/or are you able to save a file like this there?

I am using Lune to make Roblox place model/place files slightly smaller (by roughly half a percent) and is testing the extreme cases.

I can save a file like this on Roblox Studio with i=4000, however on Lune it crashes every time. I tried using editbin to increase the stack to hopefully save it, however it did nothing on Lune but increased the maximum nest on Studio (where I can save it with i=37000 atleast). Loading this file created from Studio and saving it again using Lune also crashes.

I can save a file like this on Roblox Studio with i=4000, however on Lune it crashes every time

Just out of curiosity - why do you need or have such extreme levels of instance nesting?
I agree that crashing instead of throwing an error here is not great, but after looking into this a bit more there is no clear path for us to error instead of crash for instances nested this deeply. Increasing the stack size might be more doable but I would still like to see a strong use case before we do that.

why do you need or have such extreme levels of instance nesting?

I stumbled upon this bug by simply trying to minify place files, then going into the extremes, minifying a instance with a lot of depth. The use-cases for such extreme levels of instance nesting is very slim: mainly the only thing is saving parity with Studio on extreme cases (which crashes when even mounting the extreme stack into the workspace unless the stack size is raised by editbin, which leads to some fun...)

there is no clear path for us to error instead of crash for instances nested this deeply

Thankfully, there is a workaround as most bugs do. It's possible to work around this by the LuaU side by creating a function that gets the instance nesting depth before passing it to crashy saver and throwing an error instead of it crashing. The most extreme and enormous of place files usually have a depth of just above a dozen (DataModelPatch.rbxm).

This said function is provided as below in Lua:

-- Query the depth of a Instance. Doesn't consume call stacks.
local function getdepth(object: Instance): number
	local queue = {{object}}
	local depth = 1
	while #queue ~= 0 do
		local last = queue[#queue]
		if #last ~= 0 then
			local inst = last[#last]:GetChildren()
			if inst[1] then
				table.insert(queue, inst)
				depth = math.max(depth, #queue)
			end
			table.remove(last)
		else
			table.remove(queue)
		end
	end
	return depth
end

local game = roblox.readPlaceFile("place.rbxl") -- Load place file. A ridiclously nested place file, up to at least 37,000 in depth, has no problems loading in Lune, thankfully.
assert(getdepth(game) < 2000, "Resaving error: Instance too deep") -- Check depth of Instance, so it can not crash
roblox.writePlaceFile("place2.rbxl", game) -- Resave it a bit smaller, but doesn't crash when nested :D

It can be fixed from the runtime side by converting the code to Rust and doing a equalivent check before passing it to the savers.

I've done an investigation and I believe I've fixed the underlying issue in rbx-dom that causes the stack overflow here. For the curious, we currently recursively traverse a tree when inserting into the underlying Dom and that's causing a problem for sufficiently deep trees. The linked PR swaps us over to an implementation that does not traverse it recursively, so we shouldn't be vulnerable to this anymore.

Fixed in #62 , next release won't have this issue anymore so I'll close this