Blu3wolf/Treefarm-Lite

Treefarms suddenly stop working. Unsure what causes it...

Valerate opened this issue · 43 comments

Hi

My treefarms randomly stop working, both before saplings are planted and when there are fully mature trees. The only fix is to replace the farm, which sucks to do over and over.

Modlist:
{
  "mods": [
    { "name": "base", "enabled": "true" },
    { "name": "5dim_automatization", "enabled": "true" },
    { "name": "5dim_battlefield", "enabled": "true" },
    { "name": "5dim_core", "enabled": "true" },
    { "name": "5dim_decoration", "enabled": "false" },
    { "name": "5dim_energy", "enabled": "false" },
    { "name": "5dim_logistic", "enabled": "false" },
    { "name": "5dim_mining", "enabled": "true" },
    { "name": "5dim_module", "enabled": "true" },
    { "name": "5dim_ores", "enabled": "true" },
    { "name": "5dim_resources", "enabled": "true" },
    { "name": "5dim_trains", "enabled": "true" },
    { "name": "5dim_transport", "enabled": "true" },
    { "name": "5dim_vehicle", "enabled": "true" },
    { "name": "AdvancedEquipment", "enabled": "true" },
    { "name": "bobassembly", "enabled": "false" },
    { "name": "bobconfig", "enabled": "false" },
    { "name": "bobelectronics", "enabled": "false" },
    { "name": "bobenemies", "enabled": "false" },
    { "name": "boblibrary", "enabled": "true" },
    { "name": "boblogistics", "enabled": "false" },
    { "name": "bobmining", "enabled": "false" },
    { "name": "bobmodules", "enabled": "false" },
    { "name": "bobores", "enabled": "false" },
    { "name": "bobplates", "enabled": "false" },
    { "name": "bobpower", "enabled": "false" },
    { "name": "bobtech", "enabled": "false" },
    { "name": "bobwarfare", "enabled": "false" },
    { "name": "clock", "enabled": "true" },
    { "name": "CORE-DyTech-Core", "enabled": "true" },
    { "name": "Cursed-PI", "enabled": "true" },
    { "name": "EvoGUI", "enabled": "true" },
    { "name": "FARL", "enabled": "true" },
    { "name": "Force Fields", "enabled": "true" },
    { "name": "Foreman", "enabled": "true" },
    { "name": "InterfaceChest", "enabled": "true" },
    { "name": "KS_Power", "enabled": "true" },
    { "name": "Landfill", "enabled": "true" },
    { "name": "MAIN-DyTech-Dynamics", "enabled": "false" },
    { "name": "MAIN-DyTech-Machine", "enabled": "true" },
    { "name": "MAIN-DyTech-Power", "enabled": "true" },
    { "name": "MAIN-DyTech-War", "enabled": "true" },
    { "name": "MAIN-DyTech-World", "enabled": "true" },
    { "name": "RailTanker", "enabled": "true" },
    { "name": "research-queue", "enabled": "true" },
    { "name": "RoadWorks", "enabled": "true" },
    { "name": "rso-mod", "enabled": "true" },
    { "name": "scarynights", "enabled": "true" },
    { "name": "SmartLoader", "enabled": "true" },
    { "name": "SmartSplitters", "enabled": "true" },
    { "name": "TheFatController", "enabled": "true" },
    { "name": "Treefarm-AC", "enabled": "true" },
    { "name": "Treefarm-Lite", "enabled": "true" },
    { "name": "upgrade-planner", "enabled": "true" },
    { "name": "Warehousing", "enabled": "true" }
  ]
}

G'day Valerate! Can I ask what Factorio and Treefarm versions you are using?

I have the same problem, so I can confirm the bug is real. Currently I have more than 40 mk1 farms, and this bug is very annoying: every 30 minutes I have to go out and reconstruct 20% of them. It is very random. It actually makes me think of starting over again without your mod. I am using the 5dim mods and a few others, with Factorio v0.12.30 and Treefarm-Lite_0.3.6 + Treefarm-AC_0.2.3.
It doesn't cut down the mature trees, and sometimes it also stops seeding, but that is very rare.
I have also looked at your code and will modify it here and there to see if I can fix it instead of restarting the game again. If I succeed I will post the possible fix here too.

Hi, my version is 0.12.29 of Factorio, with AC 0.2.3 and Lite 0.3.5. And Bijo says it well: I can't reproduce it, and it seems random. It seems the script stops randomly for some random farms? I'm not good with Lua so I have no idea.

Today it also happened with a farm growing sulfur trees from DyTech, so it's not only happening with the wood ones. I still have no idea how it happens. I also found one farm which had cut down about 10% of the trees but then stopped. All the trees were mature.

Was this save upgraded from an older version of Treefarm?
The script would not be stopping exactly; it runs constantly, but only as one instance: the one script handles all trees and treefarms. More likely there is a bug causing treefarms to get skipped in the update queue, which would cause them to stop updating. I would need to know what causes them to get skipped (if that is in fact the problem). Hopefully I can take a look into this soon; I have not had much time for Treefarm lately (my hands have been full just trying to get my computer to start!).

No, I just started playing again and used the version which was newest at the time, 0.3.5. Here are some screenshots to show how bad it is:
https://www.dropbox.com/s/mjn30ogy71is6a7/Skjermbilde%202016-04-29%2003.21.19.png?dl=0
https://www.dropbox.com/s/q2xz3sm68jd9g7f/Skjermbilde%202016-04-29%2003.20.56.png?dl=0

I tried adding more (it used to be 8x8, and it's now 11x8), but almost all the new ones I added stopped working after the first harvest.

All the farms have mature trees, and the belts are empty. Some even stopped mid-harvest, so you can see some farms half-done.

Some just don't plant more after their harvest is done:
https://www.dropbox.com/s/ce5nqh8s6zpdztc/Skjermbilde%202016-04-29%2003.25.35.png?dl=0

Also with other trees such as DyTech's sulfur and rubber:
https://www.dropbox.com/s/8wgygyhbp95saiu/Skjermbilde%202016-04-29%2003.43.37.png?dl=0

Well, that gives me a suspicion as to what is causing it. I will have to think of a system to catch this and prevent it from happening, although I'm not sure why it's happening in the first place. Again, still unable to work on this myself.

For anyone else wanting to look into this: it's likely a bug in how fields are updated, which I would start investigating by looking at the tick event and going into the fieldMaintainer function. If something caused a field to not get updated on its update tick, it would never update again. That is probably what is happening here.

Not sure if the game can skip a tick for any reason, but it looks like if it does, and a farm is scheduled to update on that tick, then the farm is never updated again, even though it should still be in your to-update table. You could loop over that table every few seconds to look for orphans.
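
Rough, untested sketch of that sweep (I am guessing at the table names here; the idea is just to catch lists whose scheduled tick has already passed and requeue them):

if event.tick % 300 == 0 then  -- sweep every 5 seconds; interval is arbitrary
  local orphans = {}
  for scheduledTick, fields in pairs(global.tf.fieldsToMaintain) do
    if scheduledTick < event.tick then
      -- this list was due in the past and never ran, so it is orphaned
      for _, fieldTable in pairs(fields) do
        table.insert(orphans, fieldTable)
      end
      global.tf.fieldsToMaintain[scheduledTick] = nil  -- clearing an existing key mid-pairs is allowed
    end
  end
  for _, fieldTable in pairs(orphans) do
    local nextUpdate = event.tick + 60
    global.tf.fieldsToMaintain[nextUpdate] = global.tf.fieldsToMaintain[nextUpdate] or {}
    table.insert(global.tf.fieldsToMaintain[nextUpdate], fieldTable)
  end
end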

Or maybe rework the tick method to use a master list of farms and use a modulo operation to select which farms update. Example: if (tick + farmIndex) % 60 == 0 then doUpdate() end

You would iterate over each farm to do the tick math, but the performance cost should be negligible. I tick through my mod's entities using this strategy: https://github.com/Peppe-KSP/InterfaceChest/blob/master/control.lua#L145 and https://github.com/Peppe-KSP/InterfaceChest/blob/master/control.lua#L211

That should stagger when the fields update, and update a given field once per second.
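
In sketch form (global.tf.farms and doUpdate are made-up names, not from the mod):

for farmIndex, farm in pairs(global.tf.farms) do
  -- each farm hits its slot once per 60 ticks, staggered by its index
  if (event.tick + farmIndex) % 60 == 0 then
    doUpdate(farm)
  end
end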

It doesn't have anything to do with tick skipping; ticks aren't skipped. I have many mods installed, including finite water, plus more than 60 tree farms, and when I set my game speed to 10x I get only 400+ UPS instead of the fixed 600 UPS I would get if my CPU were able to do that much processing. If I set some insane scan settings in the finite water mod, my UPS and FPS can drop to less than 1/second, so I don't think it will skip any ticks. What I am trying to say is that if your CPU can't handle processing x ticks/second, it won't skip them (unlike some other games); instead it will lower your game speed by processing them as fast as it can.
I also set my speed to 10x for a while and observed how the farms acted. No new farms stopped working in this time besides those 7 which weren't working when I started my observation. I replaced those 7 farms by mining them and placing them again in the same spot. They started working after replacing them, but there were 7 others or so which stopped working after I did this. So the bug seems to appear either when you are placing a new farm or when you are removing one. Hope this helps in identifying the problem faster.

@Peppe-KSP Unfortunately TF doesn't use a list of farms in the present version; it uses a list of lists of farms, with each inner list corresponding to all farms due to update on a specific tick. Searching the whole tick list would, I think, be pretty performance heavy, especially if it was done on a regular basis.

@Bijeporc - anything that gets eyes on the code causing it helps, cheers!

What if instead of an index to a specific tick it was just an index 1 to 60?

On create you add the entity to the bucket given by the remainder of game tick % 60, so it will be in one of 60 buckets. That is effectively what your scheduling does now, but yours shifts everything that updates to an index of now + 60. The shift is only complicating the update process.

In your main tick event you don't have to loop over everything, since you have placed the entities into update buckets 1-60 on create. Use the same modulo operation on the game tick to get the list that should be updated this tick. You then loop over it the same way you do now, but you can drop all the rescheduling, and if something fails in the update loop it is not dependent on a reschedule to update in the future.
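
Roughly like this (global.tf.fieldBuckets is a made-up name; fieldMaintainer is whatever the existing per-field update is):

-- on creation, a field is assigned to one of 60 buckets, once:
local bucket = game.tick % 60 + 1
global.tf.fieldBuckets[bucket] = global.tf.fieldBuckets[bucket] or {}
table.insert(global.tf.fieldBuckets[bucket], fieldTable)

-- in the on_tick handler, only this tick's bucket gets walked:
for _, fieldTable in pairs(global.tf.fieldBuckets[event.tick % 60 + 1] or {}) do
  fieldMaintainer(fieldTable)
end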

One other issue I have seen with my mod, when I had a line similar to this one:
global.tf.treesToGrow[event.tick] = nil

I expected it to remove a row from my table. The Lua spec says that is the same as table.remove, but at least in the context of Factorio modding it did not seem to be the same. You might switch to table.remove(global.tf.treesToGrow, event.tick).

For me, nil seemed to still leave the nil row in the table and then mess up how I tracked entities to update. So my entity list would fall out of sync and weirdness would happen, like chest 5 only updating if chest 1 didn't get removed. Finally I started outputting the index and saw the weird nil behavior.

Lua doesn't have rows as such; every entry in a Lua table is nil unless specifically assigned otherwise. It's probably worth noting that the list of lists is a sparse table, usually with gaps.

The idea of making it modulo 60 is a good one, and it would fix the behavior seen currently, but it would do so imperfectly; specifically, trees could sometimes take up to twice as long to update.

I think I'll implement that change, but I will also hunt around for the root cause that is catching treefarms when placing or removing them.

Also, table.remove and setting a table value to nil are not quite identical: table.remove also changes the size of the table (what # reports), and moves all subsequent entries in the table 'down' an entry. Possibly that has something to do with what you found.
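
A quick demo of the difference, runnable in a standalone Lua interpreter:

local t = {"a", "b", "c"}
t[2] = nil            -- leaves a hole: t[1] == "a", t[3] == "c"
print(#t)             -- # is not well defined for tables with holes; may print 3 or 1

local u = {"a", "b", "c"}
table.remove(u, 2)    -- shifts "c" down: u[1] == "a", u[2] == "c"
print(#u)             -- 2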

Yeah, I was just tinkering with an offline copy and the mod-60 change is pretty lightweight. The biggest change is figuring out how you want to calculate growth now, and then also converting saves over.

My thought, instead of delta time, is that every time you hit the seed you check whether it can grow this cycle:
local maxGrowTime = global.tf.seedPrototypes[seedTypeName].randomGrowingTime + global.tf.seedPrototypes[seedTypeName].basicGrowingTime
if math.ceil((math.random() * maxGrowTime / oldPlant.efficiency) % (maxGrowTime / 1000)) == 0 then
  -- grow...
end

All the other delta times I think can just roll up into the one in growTree().

What's wrong with how growth is currently calculated?

In your main tick event you don't have to loop over everything, since you have placed the entities into update buckets 1-60 on create. Use the same modulo operation on the game tick to get the list that should be updated this tick. You then loop over it the same way you do now, but you can drop all the rescheduling, and if something fails in the update loop it is not dependent on a reschedule to update in the future.

Right, see, I don't loop over everything anyway. I loop over everything that updates this tick; that is the point of the list-of-lists data structure.

As far as converting saves over... I'm not overly committed to doing that. From version 1.0 until version 2.0, Treefarm will be backwards compatible for saves. Version 0.x.x, not so much.

If a tick cannot be skipped, then scheduling something by the game tick should work fine. But if you go down the path of doing field and tree updates off mod 60, and you want a tree to grow one stage randomly in the future, you would move it from scheduling by the future tick to a random chance every update cycle.

Whatever the rate was before, you would average it out to a random chance every 60 ticks. But maybe you were just moving the fields to a fixed update cycle and keeping the trees on the scheduler. The report was that fields stop responding/harvesting but trees still grow, so that piece is probably fine.

Well, I have seen the bug in action now, so that's something.

And now I can't reproduce it again... not sure what is going on here!

Okay, it's not triggered only by placing or removing fields. I just started a new game, placed 50 farms, and then started to fill their inventories with seeds. All of the farms started planting, but not all of them finished. Some of the farms planted their whole inventories, but others stopped planting before finishing, some after a few seeds and some after a lot.

Possible trigger conditions: filling a field's inventory with seeds? Fields planting the seeds? Trees growing? Trees being harvested?

In this particular test, no fields were ever removed, and none were added after starting to fill field inventories with seeds.

At the end of the test, 15 fields out of the 50 placed had stopped updating (a failed reschedule in fieldMaintainer(), perhaps?).

Your growTrees loop does not do anything with an invalid seed/tree.

When I flipped a save from 0.3.6 to my local mod-60 changes, the growTrees table was tracking 276 entities for ~46 real seeds/trees.

Like I said, I switched from = nil to table.remove to kill a table item. Doing a # count on the table after nil shows no change, but doing a count after a remove shows what I think is the right count for valid entities.

@nbadp
The logic looks OK to me: table.insert(global.tf.fieldsToMaintain[nextUpdate], fieldTable)
It is a nested table, so fieldsToMaintain[x] is a table containing a row fieldTable.
Each new field is added as a row, so the first would be accessed as:
global.tf.fieldsToMaintain[nextUpdate][1]
and the next field for that tick would be added as:
global.tf.fieldsToMaintain[nextUpdate][2]
etc.

@Peppe-KSP
Yes I figured that much and deleted that comment as horribly wrong.

When I flipped a save from 0.3.6 to my local mod-60 changes, the growTrees table was tracking 276 entities for ~46 real seeds/trees

When you say tracking, what do you mean? Were you counting the number of entries in global.tf.treesToGrow[event.tick]? Or doing #global.tf.treesToGrow[event.tick]? #global.tf.treesToGrow would be the minimum number of trees growing, I guess, seeing as each table in treesToGrow can have any number of trees in it, so long as it's a positive number.

Like I said, I switched from = nil to table.remove to kill a table item. Doing a # count on the table after nil shows no change, but doing a count after a remove shows what I think is the right count for valid entities.

That's fair, but I don't actually do a # count on the table contents. In the case of global.tf.treesToGrow[event.tick], I don't even bother removing each entry as I'm going through; it has a temporary life cycle. The table (an unordered list) is built as trees are cycled over when growing, and then on the appropriate tick they are themselves cycled over, building a new table (more likely many new tables, given the randomness), and then the old one is just set to nil.
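
In outline it looks something like this (simplified, and the field names inside each tree entry are made up):

local trees = global.tf.treesToGrow[event.tick]
if trees ~= nil then
  for _, treeData in pairs(trees) do
    if treeData.entity and treeData.entity.valid then
      -- grow the tree; survivors get inserted into a brand-new table
      -- keyed by whichever future tick they should next update on
    end
    -- invalid entries are simply not carried forward
  end
  global.tf.treesToGrow[event.tick] = nil  -- discard the whole inner table at once
end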

Your growTrees loop does not do anything with an invalid seed/tree.

Yes. It does nothing with them; then, after growTrees is called, the table is set to nil (in the event.tick section). In this manner, invalid seeds are not tracked.

I think I may have found it. Not tested yet, but lines 630 and 638 of control.lua both have return statements which are triggered by a tree getting harvested. Returning from the function at that point would cause any other treefarms that were due to be updated this tick to get skipped. Changing the statements to break statements should fix it.
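
To illustrate the failure mode with a stripped-down stand-in (these names are simplified placeholders, not the actual control.lua code):

for _, field in pairs(fieldsDueThisTick) do
  for _, tree in pairs(field.trees) do
    if treeIsMature(tree) then
      harvestTree(tree)
      -- 'return' here would abandon every remaining field in fieldsDueThisTick,
      -- and since those fields never get rescheduled, they stop updating forever
      break  -- 'break' only exits the inner loop; the outer loop carries on
    end
  end
end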

d0655eb has the changes made in it.

EDIT: it's also not ready for the wild - "LuaEntity API call when LuaEntity is not valid" :P Guess I should probably test things before referring to them XD

24384b3 should fix the minor (obvious) error with d0655eb, hopefully! No awkward Lua errors spotted anyway, so that's a plus.

This should fix treefarms never getting updated again after being due to update at the same time as another treefarm that harvested a tree on that tick. Whew!

I added a comment on the file and I will also add one here just to let you know.
For me it looks fixed; all farms started working properly again.
Thank you.

Glad to hear it works! I'm drafting a new release, 0.3.7, which implements this as well as a few other minor things (mostly spelling in the readme). Thanks to all!

That's fair, but I don't actually do a # count on the table contents. In the case of global.tf.treesToGrow[event.tick], I don't even bother removing each entry as I'm going through; it has a temporary life cycle. The table (an unordered list) is built as trees are cycled over when growing, and then on the appropriate tick they are themselves cycled over, building a new table (more likely many new tables, given the randomness), and then the old one is just set to nil.

Yeah, I guess I am just curious whether setting the processed tick to nil has some overhead cost, if the entry still remains in a raw count of the table using #global.tf.treesToGrow. I know I loop over my table differently, using an index value, so the table count was important to me.

I think you are right that you don't need to worry about the count of tree updates, since you would never call that tick index again; if a tree is invalid, don't worry about it, as its parent tick will be dead soon enough. Is there overhead to finding the parent tick by its game.tick index after a 20-hour game? Is Lua returning 1 row that it had to sort out of 4 million nils?

What do you mean, finding the parent tick by its game tick index?

Lua sorts nothing out of nils. Everything in Lua is either nil or something non-nil. Until you assign a variable a value, it is nil. After you set it back to nil, Lua garbage collection takes the old value out of memory.

I think I'm misunderstanding what you are asking.

I was thinking of your data structure like this: global.tf.treesToGrow[parentTick][list of child trees to update]

So game tick 1000 happens:
global.tf.treesToGrow[1000] -- loop over all the trees scheduled for this tick
global.tf.treesToGrow[1000] = nil
Then on tick 1001:
global.tf.treesToGrow[1001] -- loop over all the trees scheduled for this tick
global.tf.treesToGrow[1001] = nil

I was just curious if there is a performance hit to that, where after 20 hours you are now on:
global.tf.treesToGrow[4000000] -- loop over all the trees scheduled for this tick
global.tf.treesToGrow[4000000] = nil

Where just asking Lua to find the index takes more time than if those entries had been removed with table.remove?

What is the count of global.tf.treesToGrow? Is the Lua table 4 million entries long, or only as long as what is not nil?

If garbage collection clears it out, maybe I just had some other issue in my mod when using myTable[index] = nil vs table.remove(myTable, index), attributed the resolution to the wrong thing, and should trust that nil and .remove are functionally equivalent.

Nil and table.remove are not functionally equivalent. table.remove shrinks the array length that the table reports (what # returns), and it moves all consecutive entries in the table down one space.

Setting it to nil just tells Lua I don't want that specific information associated with that table entry any more. If there are no more references to that specific information, Lua garbage collection will eventually remove it.

For a sparse table like I am using (the outer table), setting entries to nil is fine. For the list of trees to grow this tick (the inner table), setting each one to nil as I went through would cause bugs; I would want to use table.remove, because that retains the list behavior. A list doesn't have gaps in its indexing, whereas a sparse table does. table.remove on the first tree in the list makes the second tree in the list the new first, the third the new second, etc. Setting the first one to nil leaves the second one second, the third one third, etc.

There should be no performance hit whatsoever between asking for a table stored at the first index vs the four millionth index, so long as the table doesn't have many entries (under a hundred, say). This is actually how you implement a double-ended queue simply in Lua: have a table that tracks push and pop positions. As each operation runs, it increments the front-of-queue number or the back-of-queue number. If you have the same number of push and pop operations, the spread between them is constant, but the actual index numbers constantly increase.
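
A minimal version of that queue, adapted from the pattern in Programming in Lua:

local function newQueue()
  return {first = 1, last = 0}
end

local function push(q, value)
  q.last = q.last + 1
  q[q.last] = value
end

local function pop(q)
  if q.first > q.last then return nil end  -- queue is empty
  local value = q[q.first]
  q[q.first] = nil  -- let the old slot be garbage collected
  q.first = q.first + 1
  return value
end

-- first and last climb forever, but a Lua table is a hash map under the hood,
-- so looking up q[4000000] costs the same as looking up q[1]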

Thanks for the info. Lua is an interesting beast.

After our talk yesterday I went over my own use of table.remove and discovered a bug when queuing up two table.removes in the same tick: the behavior was unexpected because, as you mentioned, the index shifts on .remove, and the index was my key. So I tried setting the rows to nil instead of removing them and shifting the index, which had some weirdness I don't understand. Setting an item to nil in the middle of a for loop seemed to end the loop early or mess up the index order, because it would skip processing some entities; if I expected 25 deconstruction messages I'd only get a random amount, like 10-15, even though nilling an entity requires hitting my message.

If I set the row to a custom string like "remove" instead, and create a new list of all values that aren't "remove", I get the functionality I expect: 25/25 deconstruction messages. Then I can rebuild the list from the items that are not tagged "remove".

Maybe I should abandon looping by index, but I wanted to fix it within my current looping structure before switching to key/value pair looping.

Okay, setting an item in a table to nil will cause weirdness if you iterate by index. An ipairs loop (or a numeric for loop up to #t) expects entries starting at i = 1 and continuing until some i that has no value (nil); the table ends there. No wait, it doesn't, you say, because you have more values after that one! ipairs doesn't know that. If you have a non-sequential table (i.e. with gaps in the index numbering), an index-based loop isn't going to work, because hitting a nil value conventionally means the end of the table. pairs will still visit every non-nil key, but in an unspecified order, so it won't rescue an index-based scheme either.

http://www.lua.org/pil/7.3.html

The bottom half of that page explains what I mean in the case of pairs.

Would recommend the whole book actually: http://www.lua.org/pil/contents.html#P1

Closing as it seems fixed, feel free to re-open if the issue persists.

After our talk yesterday I went over my own use of table.remove and discovered a bug when queuing up two table.removes in the same tick: the behavior was unexpected because, as you mentioned, the index shifts on .remove, and the index was my key. So I tried setting the rows to nil instead of removing them and shifting the index, which had some weirdness I don't understand. Setting an item to nil in the middle of a for loop seemed to end the loop early or mess up the index order, because it would skip processing some entities; if I expected 25 deconstruction messages I'd only get a random amount, like 10-15, even though nilling an entity requires hitting my message.

If I set the row to a custom string like "remove" instead, and create a new list of all values that aren't "remove", I get the functionality I expect: 25/25 deconstruction messages. Then I can rebuild the list from the items that are not tagged "remove".

I wouldn't mind seeing the script in question. You probably need not set each entry to anything, really. If table a is your list, then as you iterate over a you create table b, which is table a sans any items to remove. After iterating, set a = b so that your new table is the old table sans removed items. Or perhaps I am misinterpreting your post?
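
In sketch form (markedForRemoval is a stand-in for however you flag entries, such as your "remove" string):

local b = {}
for _, chest in pairs(a) do  -- a is the existing list
  if not chest.markedForRemoval then
    table.insert(b, chest)   -- keep everything not flagged for removal
  end
end
a = b  -- the rebuilt list is dense, so index-based loops behave again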

Here is the release using the "remove" workaround:
Peppe-KSP/InterfaceChest@add957c#diff-2f29ee9c98c28d035046fb40285456daR157

Setting that index to nil during the main loop, I think, did what you said in an earlier post and killed the processing of the rest of the list items, but setting it to "remove" let the iterator continue. I guess I could have kept my other list, "chests to delete", and used that to compare against when rebuilding the list.

In the end I moved to something similar to what we talked about for your trees. I run on a 6-tick cycle, so I made 6 buckets, and I add new chests with a unique ID so I can set a specific chest to nil when needed without affecting the processing.
Peppe-KSP/InterfaceChest@2ceb7bc#diff-2f29ee9c98c28d035046fb40285456daR168
