Handling unformatted code

Question

Closed this issue 9 years ago · 44 comments

Opening this up for discussion because I don't think it would be unreasonable for Cap to detect unformatted code in the chats and:
a) warn the user
b) migrate it to the bin and warn the user
c) migrate to the bin and mock the user
d) migrate to the bin and post a formatted version on behalf of the user

There should also be a throttle on it, because even regulars will often post, quickly edit, hit Ctrl+K, and send again. Maybe 10 seconds?

These are all just random ideas. What does everyone think?

awalgarg commented 9 years ago

SO Magic!

Answer 1 · 2015-05-26T13:24:21.000Z

If you find a reliable way to detect unformatted code, I'm in favor of trashing it and warning the user. We can do the mocking ourselves.

Answer 2 · 2015-05-26T13:26:47.000Z

I agree with @honnza. I don't have many ideas for how you'd reliably detect unformatted code, though

Answer 3 · 2015-05-26T13:30:36.000Z

Maybe a command that room owners could use to format other people's code automatically? Now that we have the clipboard accessible, we could add an extension that lets everyone copy the ID of any message by clicking a button on that message. The ROs could then call the command and paste the ID without too much hassle, and the bot would do the rest. That way there's no need for automatic recognition, which could fail.

Answer 4 · 2015-05-26T13:30:36.000Z

{
howto: Letter frequency, especially special chars vs. alnum

action: warn (actually notify and teach about Ctrl+K); later, if the message is still unformatted, bin it
}
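
A minimal sketch of the letter-frequency idea above, assuming that "special chars" means common code punctuation. The symbol set, the function names, and the 0.15 threshold are illustrative guesses, not values measured against the transcript.

```javascript
// Share of code-like punctuation among the non-whitespace characters.
// The symbol set below is an assumption, not a tuned list.
function symbolRatio(text) {
    const visible = text.replace(/\s/g, '');
    if (visible.length === 0) {
        return 0;
    }
    const symbols = visible.match(/[{}()\[\];=<>+\-*\/&|!$]/g) || [];
    return symbols.length / visible.length;
}

// Flags a message when the ratio reaches the threshold.
// 0.15 is an untuned placeholder default.
function looksLikeCodeByFrequency(text, threshold) {
    const limit = threshold === undefined ? 0.15 : threshold;
    return symbolRatio(text) >= limit;
}
```

A real threshold would have to come from running this over past messages, since prose that is heavy on punctuation or links could score high as well.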

Answer 5 · 2015-05-26T13:32:42.000Z

Come up with specific rules for what defines "unformatted code" and it's possible to convert the rules into a regex. After a short chat conversation, it's hinted that "Both { and } occurring in multiline messages means the message needs work." Please suggest more criteria.
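
The brace rule quoted above could be sketched as follows. The function name is hypothetical, and the check does not try to tell whether the message was already formatted as code.

```javascript
// True when a message spans several lines and contains both { and }.
function hasBracesAcrossLines(text) {
    const isMultiline = text.indexOf('\n') !== -1;
    return isMultiline &&
        text.indexOf('{') !== -1 &&
        text.indexOf('}') !== -1;
}
```

For example, `hasBracesAcrossLines('if (x) {\n  y();\n}')` is true, while a one-line message such as `'a { b }'` is not flagged.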

Answer 6 · 2015-05-26T13:33:15.000Z

I know that PHP_CodeSniffer detects code in comments, so it's definitely possible.

Answer 7 · 2015-05-26T13:33:54.000Z

Right now I don't think we need a formal solution, this is meant to be a discussion whether or not this feature would be useful. We can as a group discuss the criteria for message migration, _after_ we decide if it is useful or wanted.

Answer 8 · 2015-05-26T13:34:00.000Z

Can PHP_CodeSniffer be reverse engineered?

Answer 9 · 2015-05-26T13:34:25.000Z

I'd just eval and check for syntax error...
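
A sketch of the syntax-check idea, using the Function constructor instead of eval so that the text is only compiled, never run. The function name is hypothetical. Note that this only answers "does it parse as JavaScript": a single word such as "ok" parses fine, and broken code (the kind people ask for help with) does not, so it could not serve as the detector on its own.

```javascript
// Returns true when the text compiles as a JavaScript function body.
function parsesAsJavaScript(text) {
    try {
        // The Function constructor compiles the body without executing it.
        new Function(text);
        return true;
    } catch (err) {
        if (err instanceof SyntaxError) {
            return false;
        }
        // Anything else (e.g. a policy that forbids dynamic code) is rethrown.
        throw err;
    }
}
```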

Answer 10 · 2015-05-26T13:34:34.000Z

"definitely" for "useful"

Answer 11 · 2015-05-26T13:34:53.000Z

@ralt how would you eval css? :P

Answer 12 · 2015-05-26T13:35:32.000Z

@awalgarg this is the JS room, nobody who doesn't know what he's doing should post css

Answer 13 · 2015-05-26T13:36:20.000Z

I support this idea; I can't see any harm coming from helpful guidance that leads people to format their code. Feel free to request ownership of my code dump room if you'd like to help moderate it.

Answer 14 · 2015-05-26T13:36:45.000Z

Are we sure we want to leave JavaScript code with syntax errors unformatted? Also, what about code in other languages? -1 on eval

Answer 15 · 2015-05-26T13:37:10.000Z

mhmm, what about code by newbies having syntax errors which they need help in fixing? @towc

Answer 16 · 2015-05-26T13:37:34.000Z

@rlemon I am totally for it. Just one addition: have the bot remind the user how to format code

Answer 17 · 2015-05-26T13:38:16.000Z

better yet, redirect them to jsfiddle

Answer 18 · 2015-05-26T13:38:34.000Z

@awalgarg that's why I suggest having ROs do the validations. Not to automate the process, just to make it a lot easier for them

Answer 19 · 2015-05-26T13:39:23.000Z

In fact, all we need is a bot command to trash chat posts

Answer 20 · 2015-05-26T13:39:49.000Z

This is really a very broad topic. Can we break it down into smaller issues?

For a start, detect only JavaScript code, which should be binned and posted on JSFiddle instead?

Answer 21 · 2015-05-26T13:41:48.000Z

My suggestion is to start with a bin command. Auto-trigger can come next.

Answer 22 · 2015-05-26T13:45:28.000Z

@awalgarg this entire discussion is about whether it is a good idea, not the implementation of said idea. Everyone just took it a step further.

Answer 23 · 2015-05-26T13:46:07.000Z

It is a good idea. I doubt anyone disagrees with that.

Answer 24 · 2015-05-26T13:49:15.000Z

Yeah, as honnza said, we all agree it is a good idea since we have a known problem which needs a solution. Being programmers, we naturally started looking for an implementation ;)

Answer 25 · 2015-05-26T13:56:47.000Z

I'm mostly waiting for the sleepy heads to wake up and chime in. Zirak, otherBotRunners, etc.

Answer 26 · 2015-05-26T14:00:37.000Z

Detection sounds simple for 90% of cases.

Detect $(", function() <div or .controller(" and trash those to a "please post formatted code" room. I can write a simple ML that'd detect code more reliably or we can use an existing library but honestly it's super overkill.
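
The marker check described above might look like this. The four markers are the ones named in the comment; the names `CODE_MARKERS` and `containsCodeMarker` are made up for illustration.

```javascript
// Substrings that rarely show up outside of code.
const CODE_MARKERS = ['$("', 'function()', '<div', '.controller("'];

// True when the message contains at least one marker.
function containsCodeMarker(text) {
    return CODE_MARKERS.some(function (marker) {
        return text.indexOf(marker) !== -1;
    });
}
```

Plain substring matching keeps this cheap, at the cost of missing variants such as `function ()` with a space or `$('` with single quotes.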

Answer 27 · 2015-05-26T17:38:21.000Z

Sounds good. The thought crossed my mind a few times, but the genie always told me detecting 100% of cases was too difficult.

90% is good enough. I'll dig through the transcript and try to come up with something.

Answer 28 · 2015-05-26T19:07:51.000Z

After some fooling around, here's what I came up with:

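function isUnformattedCode (text) {
    var lines = text.split('\n');
    // Short messages are left alone.
    if (lines.length < 4) {
        return false;
    }

    // A "codey" line ends with a closing brace or starts with a closing tag.
    var codeyLine = /\}$|^<\//;
    return lines.some(function (line) {
        return codeyLine.test(line);
    });
}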

Searched the transcript for !!format, ran that against all messages in the time block, saw that it agreed with them and caught some more. Most importantly, miraculously it's yet to provide me a false positive, tested against today's and yesterday's chat history.

Methinks the algo should look something like this:

Ignore if user is an owner/mod
Bin and teach <2k users
Teach >=2k users

By "teach" I mean a message like "Please don't post unformatted code - use Ctrl+K before sending (hit up to edit messages). See the FAQ [faq link]".

If the user sent a long message (>10 lines), it'll also have a "or use a paste service like [links]".

Thoughts?
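
The proposed rules can be sketched as a small decision function. It assumes the message has already been judged to be unformatted code, and the shape of the `user` object (`isOwner`, `isModerator`, `reputation`) is hypothetical rather than the bot's actual data model.

```javascript
const REP_THRESHOLD = 2000;      // the "2k" cutoff from the proposal
const LONG_MESSAGE_LINES = 10;   // above this, also suggest a paste service

// Decides what to do with a message already flagged as unformatted code.
// `user` is a hypothetical object: { isOwner, isModerator, reputation }.
function decideAction(user, text) {
    if (user.isOwner || user.isModerator) {
        return { bin: false, teach: false, suggestPaste: false };
    }
    const lineCount = text.split('\n').length;
    return {
        bin: user.reputation < REP_THRESHOLD,        // bin only below the cutoff
        teach: true,                                 // everyone else gets the reminder
        suggestPaste: lineCount > LONG_MESSAGE_LINES
    };
}
```

Keeping the thresholds as named constants makes the cutoff easy to change.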

Answer 29 · 2015-05-26T19:52:05.000Z

Neat. The rules sound good to me. When will the maid be implementing this?

Answer 30 · 2015-05-26T21:18:18.000Z

I know I'm late but I'd just like to chime in with my opinion:

Don't do the bin/teach cutoff at 2K, that's ridiculous. It needs to be much much lower, I have ~1.3K rep and I'm a very knowledgeable person.

I had typed out why we shouldn't do this at all (seriously guys, binning unformatted code automatically why do we even need room owners these days just make Caprica automatically kick people too) but I'm going to let it go and suggest a sensible "smart user" rep level.

Answer 31 · 2015-05-26T21:48:41.000Z

@AmaanC I'll be home this weekend, will try and take a stab at it.

@Jhawins

Don't do the bin/teach cutoff at 2K

Not set in stone, we can take it back to 1k (which is also /welcome's lower threshold), but most regulars do have more than 2k.

why do we even need room owners these [if we have features like these]

"That's a room owner's job" isn't a reason to not implement this. Lacking this task, you won't find our room owners bored; we're room owners, not people who hunt down unformatted messages and lecture users on the basic etiquette of chat.

Binning unformatted messages and correcting people is one of the menial things you have to do to maintain a normal conversation. Why not automate it? It's a mechanical process, there's nearly no thought behind it, it's repetitive, and it's annoying. I don't do it as much as I used to because of these reasons.

just make Caprica automatically kick people too

I'd love to. Boy oh boy would I love to. Imagine not having to deal with help vampires. Imagine not having to deal with spammers or bigots. Wouldn't it be great? Wouldn't it be awesome if some automatic process took care of the mindless things, and left the more serious stuff to us?

Answer 32 · 2015-05-27T00:04:16.000Z

I appreciate you replying to everything but you know I have no defense lol.

The rep level still needs adjusting, though. We are a "rep != knowledge" community, so 2K rep is not a decent "teachable user" level.

Answer 33 · 2015-05-27T19:26:02.000Z

@awalgarg Sadly (AFAICT) that's serverside SO magic.

@Jhawins

The rep level still needs adjusting, though.

Sure, what do you think will be better? 1k as in /welcome?

Answer 34 · 2015-05-28T14:33:39.000Z

I say remove the rep limit fully and implement a throttle. I have X seconds to edit the message and format it before the bot bitches at me.

Answer 35 · 2015-05-28T21:33:34.000Z

That'll be in there anyway.

Answer 36 · 2015-05-29T15:18:21.000Z

Maybe too heavy or unsupported, but might be relevant: https://github.com/tj/node-language-classifier
It uses the deprecated classifier internally.
I didn't check how it handles unsupported languages.

Answer 37 · 2015-05-29T17:45:47.000Z

+1 for Zirak

Answer 38 · 2015-05-30T11:16:52.000Z

@gtomitsuka That seems to assume the input is a programming language, when we want to determine whether it is one. Obviously the dumb regexp above won't match a slew of languages, but it seems to get that 90%.

Answer 39 · 2015-05-30T16:10:53.000Z

Timeout is 10 seconds, messages will be binned to Trash Can, rep threshold is 2k (due to lack of better suggestion)
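
A rough sketch of how the 10-second grace period could work: wait, look at the message again, and act only if it is still unformatted. `fetchText`, `isUnformatted`, and `onStillUnformatted` are hypothetical callbacks supplied by the caller, not the bot's real API, and `schedule` is only there so a test can replace `setTimeout`.

```javascript
const GRACE_PERIOD_MS = 10 * 1000; // the 10-second timeout

// Re-checks a message after the grace period so quick edits are not punished.
// All callbacks are hypothetical; schedule defaults to setTimeout.
function checkAfterGracePeriod(messageId, fetchText, isUnformatted,
                               onStillUnformatted, schedule) {
    (schedule || setTimeout)(function () {
        // Assumed contract: fetchText returns the current text,
        // or null if the message was deleted in the meantime.
        const text = fetchText(messageId);
        if (text !== null && isUnformatted(text)) {
            onStillUnformatted(messageId, text);
        }
    }, GRACE_PERIOD_MS);
}
```

Reading the text again inside the timer, rather than reusing the original text, is what lets an edit made within the window count.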

Answer 40 · 2015-06-01T20:44:54.000Z

@Zirak, the limit should be three lines and not four. Sorry, I don't feel like this needs a new issue (considering how fresh the feature is). Re-open if you agree, otherwise I'll start a new issue.

Answer 41 · 2015-06-01T23:28:10.000Z

(otherwise, good work, you made us proud, etc)

Answer 42 · 2015-06-04T22:33:44.000Z

Let's give this a couple more days and revisit if it gives too many false negatives?

Answer 43 · 2015-07-28T18:01:16.000Z

@Zirak this is no longer working. Should we reopen this or start a new issue?

http://chat.stackoverflow.com/transcript/message/24729564#24729564

Bot sees every line as a new message.

http://i.stack.imgur.com/l9BIE.png

not sure if that is expected or not.