totemo/watson

Logblock - request for comments

Closed this issue · 5 comments

I'm one of the maintainers of Logblock and I plan to rewrite the output soon to offer a choice of output formats, specifically:

  • plain text designed for rcon
  • colored text designed for console
  • json-enhanced text designed for in-game players
  • CSV designed for text dumps

The idea is to let server admins choose what output method they prefer, but also allow it to be overridden on a per-session or per-command basis

The thought occurs that Watson could benefit from that, either by using the CSV output or by having an output format specifically for Watson

So my question is, what would be the ideal format for Logblock to send you data?

Ooh. Thanks for asking.

Although I haven't had much time to tackle it, at one point I was thinking about writing a plugin that would talk to any logging API and then use plugin channels to talk to Watson, so that Watson didn't have to parse everything out of chat. Because really, chat is mostly for people.

I was very fortunate to be a moderator on a server that happened to use LogBlock, and lucky that LogBlock provides timestamps and coordinates in a way that made Watson a viable proposition.

So, that said, I think it's instructive to look at the kind of issues that I deal with when I add support for a Logging plugin. They are, in no particular order:

  • Getting precise timestamps of edits. Plugins that offer vague timestamps like "just now" or "less than a minute ago" or relative timestamps like "5 minutes ago" are hard for Watson to deal with. Watson wants hours, minutes and seconds at the very least, so that it can draw edits in sequence. In survival mode that's usually good enough because most of the time players cannot place or destroy much more than one block a second. If they do, it's some kind of spam where you don't really care about the exact sequence of events. For xray, you might, but that's survival mode with a pick that mines at most one block a second. In my ideal world, if LogBlock stores those timestamps to a finer resolution, then having accuracy down to tenths of a second might be good. I don't think it needs to go beyond that.
  • One problem that sometimes comes up is LogBlock returning results outside the queried time range. The "/w pre" Watson command queries a specific time range of edits to get those that happened before some edit of interest. Unfortunately, it seems LogBlock is not quite so precise about what it returns, when it comes to querying with a specific range of timestamps. It might return results from 30 seconds or so after the requested time range, and users find that confusing.
  • Watson likes to know what world an edit happened in, so that it can store results in separate local (client process) data structures - one per world. The Minecraft client doesn't know about worlds; it only understands dimension types. On a server with multiple dimensions of overworld (say) type, Watson will probably munge all those results together even if the edits happened in separate worlds.
  • Watson needs to know the type and data value of blocks. By default, LogBlock doesn't distinguish between various cases when reporting results to chat - a stair is a stair, regardless of its orientation and a slab is a slab, whether in the upper or lower half of the block. I do provide a custom LogBlock materials.yml file that makes LogBlock report the types and orientations of blocks more specifically. For some other logging plugins I don't care what the plugin names the block: whether "upper birch slab" or just "birch slab" because the plugin also reports the numeric ID and data value of the block and I parse those instead. So in an ideal world, perhaps being able to tell LogBlock to format a block as " (:)" would make that problem go away and allow Watson to render all the stairs and whatnot fairly exactly.
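As a rough illustration of the first and third points, here is a minimal Python sketch of what a client can do once every edit carries a world name and a timestamp with seconds. The field names (`world`, `time`, `block`) and the timestamp layout are assumptions made for this example; they are not LogBlock's or Watson's actual format.

```python
from collections import defaultdict
from datetime import datetime

# Assumed timestamp layout; any format with at least second resolution would do.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def group_and_order(edits):
    """Bucket edits by world name, then sort each bucket chronologically."""
    by_world = defaultdict(list)
    for edit in edits:
        by_world[edit["world"]].append(edit)
    for world_edits in by_world.values():
        world_edits.sort(key=lambda e: datetime.strptime(e["time"], TIME_FORMAT))
    return dict(by_world)


edits = [
    {"world": "world", "time": "2014-06-01 12:00:05", "block": "stone"},
    {"world": "world_nether", "time": "2014-06-01 12:00:01", "block": "netherrack"},
    {"world": "world", "time": "2014-06-01 12:00:02", "block": "dirt"},
]
ordered = group_and_order(edits)
```

With only "5 minutes ago" style timestamps the sort key would be identical for many edits, and without a world name the two overworld-type worlds could not be kept apart.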

I've shied away from designing a specific format, partly because I'm very busy at the moment (sorry), partly because Watson does ok the way things are, and partly because I suspect it might be the wrong question to ask (the right one being: when am I going to get around to the plugin-channel thing?).

However, I know that there are some smart, capable people who use Watson, know the code base and have an opinion. So if you're one of those, I won't be offended if you want to respond here. I'm in a rush and may have forgotten to mention something.

+Edits to fix grammar, missing words, " (:)"

Nice feedback

Some is very achievable, some less so

  • LB actually uses the MySQL datetime type, so displaying sub-second accuracy isn't possible*. The output format is tweakable in the config, but if I were doing some sort of Watson output mode I'd probably fix that to ISO 8601 or unix epoch
  • Yeah, internally LB converts any time parameters to the form x minutes before the current time before sending it to the DB. For "since 1h" and similar this makes sense, because it avoids timezone issues (MySQL DateTime is not timezone-aware). For fully-specified times, it's annoying. This would be on the lower-priority small tweaks list, simply because the parameter parsing and query building code is fairly involved, with a great deal of special-casing
  • LB is actually only capable of outputting results for one world at a time* as they are stored in separate tables (technically you could make two worlds log to the same table but then LB has no way of distinguishing which world an action occurred in) - adding the world the query was asking about to the results header would be trivial
  • Would an output format where it's not human-readable at all be useful, or would you also want friendly output to display directly to the end-user? I was thinking something like datetime, name, uuid, coords, block replaced type, new block type, new block data. Or a hybrid mode, where for each action it displayed a "friendly" line and then a non-human-readable line which Watson could suppress?
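To make the non-human-readable option in the last point concrete, here is a small Python sketch that parses one line per action using the field list suggested above. The column order, the split of coords into three integer columns, the numeric block types and the ISO 8601 timestamp are all assumptions for illustration; nothing here is an existing LogBlock output mode.

```python
import csv
from dataclasses import dataclass
from datetime import datetime
from io import StringIO


@dataclass
class Edit:
    when: datetime
    player: str
    uuid: str
    x: int
    y: int
    z: int
    old_type: int
    new_type: int
    new_data: int


def parse_edits(text):
    """Parse lines of: datetime,name,uuid,x,y,z,old_type,new_type,new_data."""
    edits = []
    for row in csv.reader(StringIO(text)):
        if not row:
            continue  # skip blank lines
        when, player, uuid, x, y, z, old_type, new_type, new_data = row
        edits.append(Edit(datetime.fromisoformat(when), player, uuid,
                          int(x), int(y), int(z),
                          int(old_type), int(new_type), int(new_data)))
    return edits


# Hypothetical sample line; the player name and UUID are made up.
sample = "2014-06-01T12:34:56,Steve,00000000-0000-0000-0000-000000000001,10,64,-20,1,0,0\n"
parsed = parse_edits(sample)
```

In a hybrid mode, a client could run each chat line through a parser like this and suppress the ones that parse, leaving the friendly line visible.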

* At least, not without altering the database structure, and that's both not in scope and would result in, potentially, DB conversions taking an hour or more
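One way the out-of-range results mentioned in the second point could arise is sketched below. It assumes the conversion to "x minutes before the current time" truncates to whole minutes; that truncation is a guess made for this example, not a confirmed description of LogBlock's code.

```python
from datetime import datetime, timedelta


def effective_boundary(requested, now):
    """Boundary actually queried if the offset from now is cut to whole minutes."""
    whole_minutes = int((now - requested).total_seconds() // 60)
    return now - timedelta(minutes=whole_minutes)


now = datetime(2014, 6, 1, 12, 0, 40)
requested = datetime(2014, 6, 1, 11, 30, 10)  # fully-specified time from the user

# The exact offset is 30 min 30 s; truncating to 30 min moves the boundary later.
drift = effective_boundary(requested, now) - requested
```

Under that assumption the boundary can land up to just under a minute after the requested time, which would be consistent with results appearing 30 seconds or so outside the range.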

In all honesty, using a "generic" plugin channel for watson-like plugins would be kind of awesome. That would easily fix a lot of issues you have with chat "snipping" plugins.
It would also allow for a more programmer friendly environment because it is just a "packaged byte[]". So you could just format it how you want.
The best thing is that you wouldn't have to worry (well, almost) about the parsed data. You can pack up to 1MB of data per "packet" in those Plugin Channels.

We use YYYY-MM-DD on our server so we indeed have had to change those formats in code. I would much rather see it configurable, but I haven't really gotten to that just yet. It's quite a hefty change.

The next best thing would be a query param added to LogBlockQuestioner, for example "csv" or "precise", which would then, just like the binary format, skip the time parsing and if possible even just hand over IDs. (However, Bukkit has in the past steered away from IDs.)
You can still get ID -> Block in the client rather easily. You could also use the formatted names but I'm not sure how that would go with modded.

LB indeed only shows per-world statistics when you do a lookup. The client can't distinguish between the worlds so it would be rather hard to make the client be multi-world aware unless some extra hooks were added to LB (and the client).

I do have experience with "Plugin Channels" both on the server and client, and would gladly help, but I'm sure that there indeed are smarter people than me that know the codebase better than me.

I assume this discussion has now run its course. Thanks to everybody.