mibk/dupl

Allow disabling of check with in code comments

scags9876 opened this issue ยท 14 comments

Great tool by the way! It has enabled me to write better code by pointing places where things can be dry'd up.

It would be nice to disable output for certain places in code where I know there is duplication, but I've chosen to keep the duplication in place. Sometimes there are reasons for this.

I'm thinking something akin to a linting comment to disable duplication checking:

// dupl-disable
            Context("when specifying parameter set 1", func() {
                BeforeEach(func() {
                    request, _ = http.NewRequest("GET", "/?paramA=20&paramB=40", nil)
                })
                It("should reflect the desired parameters", func() {
                    server.GET("/", func(c *gin.Context) {
                        paramSet1, _ := getParamSet1(c)
                        Expect(paramSet1.A).To(Equal(20))
                        Expect(paramSet1.B).To(Equal(40))
                    })
                    server.ServeHTTP(recorder, request)
                })
            })
// dupl-enable

...

// dupl-disable
                Context("when specifying parameter set 2", func() {
                    BeforeEach(func() {
                        request, _ = http.NewRequest("GET", "/?paramX=5000&paramY=10000", nil)
                    })
                    It("will reflect the desired parameters", func() {
                        server.GET("/", func(c *gin.Context) {
                            paramSet2, _ := getParamSet2(c)
                            Expect(paramSet2.X).To(Equal(5000))
                            Expect(paramSet2.Y).To(Equal(10000))
                        })
                        server.ServeHTTP(recorder, request)
                    })
                })
// dupl-enable

Using comments like // dupl-disable for this is not a good idea.

mibk commented

Thank you!

But I agree with @shurcooL. It's not a good idea. What about a tool that would filter dupl's output, or its input. I understand the convenience of this, but I think we should come up with something else to solve this problem.

BTW cutting out the parts between these comments from a file could cause that it was impossible to build an AST of the rest of the file, and thus impossible to search for any duplications at all.

Any other ideas?

novln commented

A kind of naive idea, but why not create a configuration file at the root of a project, with a list of file to ignore ?

Something like .eslintrc for http://eslint.org/

mibk commented

I have thought about it too, but I think this solution is not sufficient as sometimes it is only part of a file not to be included for clone-searching. And it could be achieved now already: just specify a list of input files for example like this:

dupl $(find . -name '*.go' $(printf "! -name %s " $(cat .duplignore)))

The list of files here would be everything ending with .go not included in .duplignore. And this command can be put in a Makefile...

@mibk as it is mentioned in README dupl can report false positives on the output. It makes really problematic to use it combined with a CI for automatic builds, when it reports them - at the moment you can only disable checking for whole file which isn't too optimal.

mibk commented

Yes, I know. Originally it wasn't supposed to be used in CI systems. This is an open issue because I don't know how to solve it. Does anyone have an idea?

A possible solution is to not run dupl in CI systems.

dupl is a tool for finding code clones. Such a tool is best used in the hands of a programmer who can tell if a code clone should be acted on, or if everything's normal.

It is not a tool for reliably finding errors in code.

mibk commented

Yes, totally agree. The purpose of dupl is slightly different from other tools that are usually run in CI systems.

mibk commented

Leaving this open though as I'm not entirely against the idea provided a reasonable solution is found. I understand some people might feel the urge to use it in CI systems.

mibk commented

I've come up with an ugly hack (that I personally don't intend to use) for this issue. Instead of using comments to control dupl's behaviour, one can use dummy statements (e.g. _ = true) to make the clone syntactically different from other clones. For example like this:

Context("when specifying parameter set 1", func() {
    BeforeEach(func() {
        request, _ = http.NewRequest("GET", "/?paramA=20&paramB=40", nil)
    })
    _ = true
    It("should reflect the desired parameters", func() {
        server.GET("/", func(c *gin.Context) {
            paramSet1, _ := getParamSet1(c)
            Expect(paramSet1.A).To(Equal(20))
            Expect(paramSet1.B).To(Equal(40))
        })
        server.ServeHTTP(recorder, request)
    })
})

And then pasting these dummy statements on different places, or in different variations (e.g. _, _ = true, true). Like I said, it's ugly... Anyway, to anyone who would actually consider using it: dupl doesn't see the difference between _ = true and _ = false, so the statements really have to be syntactically different in order to make this ugly hack work.

When copy pasting code, one should think about creating a function. Other people and contributors who need to dive in also need to understand it. When code is copy pasted and a little changed it comes less clear what it is doing. In large code bases this can become a major problem.

One has to rethink where the code (or function) may live, and what purpose it is fullfilling.

I had exactly the same problem. One testing function copied between two internal packages. After thinking what problem I needed to solve it got its place in a central package common to both packages. And the duplication went away.

Thats completely true and agree on that, but the dupl tool doesn't behave that way. What I copied was a whole function including the same body between two packages and dupl was complaining. So I "copied a little between two packages", which was also a test function (not put in the actual released binary).

But thanks for pointing to that piece of talk.

I believe this can be disabled using this comment above the function/method if one is using golangci-lint tool.

// nolint: dupl
Ref: https://golangci-lint.run/usage/false-positives/