List of sub-optimal definitions in `Data.List.Base`
JacquesCarette opened this issue · 28 comments
While looking at PR #2355 from @mildsunrise I noticed a ridiculous number of uses of with
. These turn out to not be localized to Data.List.Properties
but come from uses of with
in Data.List.Base
. The vast majority of them can be simplified, with a really nice knock-on effect to simplify quite a few proofs.
But I'm guessing this is yet another case of breaking change, so has to wait until 3.0? I think we should outlaw uses of with
in function definitions, if favour of defining explicit 'helper' functions.
This is, I think, the standing criticism from @omelkonian and others (related to the various issue he has posted on the main issue tracker about inspect
and our attempts to deprecate it...), but only indirectly alluded to in #1937 / #2123 ... although PRs associated with those have tackled some of the worst offenders (esp. wrt turning with
into let
, esp. in recursive calls).
In response to @MatthewDaggitt 's cautious nudge, @mildsunrise suggests in #2355 that the refactoring of catMaybe
/mapMaybe
that I suggested based on Orestis' prelude may in fact have the same behaviour as the existing definitions, so we should investigate that for all the examples you have identified... and if the computational behaviour is the same, then I think that shouldn't count as breaking
...
@JacquesCarette Could you make a task list here on the issue of the 'worst'/most egregious examples? Then the analysis could be parallelised across us all... and we could document our arguments for the behaviour being preserved...?
UPDATED: on a PR-by-function basis
Yes if the computational behaviour is the same then it's backwards compatible.
As long as "forcing evaluation by with
" and "forcing evaluation through pattern-matching" are 'the same computational behaviour', then yes. They sure have very different usability behaviour!!
Updated with an actual list.
Looking at the list, definitely a case of too much with
on a Bool
instead of if_then_else_
as in the first bullet of #1937 ... notwithstanding that proofs of properties of such functions may themselves require with
to fix the types up properly in each case branch... (and I suppose at some point it might be relevant to document the list of "650--700" uses of with
in stdlib
I refer to there...)
See also @gallais 's comment there about if_then_else_
potentially being too eager to unfold...
initLast
is an early example of using views to write functions... for which the use of with
is often (but not always) a corollary side-effect... so I'd be tempted to go after the lower-hanging fruit first ;-)
Re: lower hanging fruit: clearly some uses of with
(eg in merge
) have already been 'deprecated' in favour of if_then_else_
, so I suspect the others might also be 'harmless'.
i would love to see this happen. how would we name the new helper functions? do we have existing modules where we're using helper functions that serve as precedent? also, aren't there compatibility concerns from adding many new exports to a Base
module?
Well, the way to replace with
by an explicit helper function is already described in the paper which introduces the notation... but as to polluting the namespace, such things may always (?) be encapsulated as part of a local where
block, so I think the knock-on effect on the API can be minimised?
The problem with local-only functions is that it's hard to prove things about them. Some of those helpers are independently useful, so those are 'easy', in a sense.
For many of the ones in Data.List.Base
, the 'helpers' actually already exist - most of them are just maybe
from Data.Maybe.Base
!!
initLast
is an early example of using views to write functions... for which the use of with is often (but not always) a corollary side-effect... so I'd be tempted to go after the lower-hanging fruit first ;-)
I'm starting to see that there are different kinds of views. At the very least:
- structurally obvious sub-computations (where you want to match on the results)
- computationally necessary sub-computations (on which you want to match)
- change of coordinates, then match
For 1, I'd argue that using views for defining a function (as opposed doing a proof) is an anti-pattern. For 2, like initLast
, I am undecided. For 3, I think with
makes quite a bit of sense.
Interesting analysis! (and some of the first since VfL to subject this monster to more 'scientific'/empirical scrutiny ;-))
'Anti-patterns in DTFP': a catchy, if niche, book title!?
Don't know if we could fill a book (yet), but certainly a non-trivial paper.
@JacquesCarette for how many of your shopping list do you have PRs ready-to-go? (I don't want to duplicate work... ;-))
Actually, while I did start working on removing some uses of with
that seemed like overkill, I specifically avoided Data.List.Base
so as to not clash with #2355 . But then I realized that it shouldn't make a difference: all those proofs should still go through as is (as that's what's happened with the others, where I'm likely post a PR soon). One could perhaps improve some of the proofs further (more of them could become equational), but that can wait.
So: go ahead.
What remain to do are:
unalignWith
(needs a helper function, and then some properties about it commuting with variousmap
s which had me tied up in knots)initLast
not sure you can replacewith
here, except cosmetically viacase_returning_of_
, but curious if you find an alternative...?unsnoc
ditto. but withcase_of_
So I'll stop there!
Re: initLast
and unsnoc
.
unsnoc
is for sure definable without with
:
unsnoc [] = nothing
unsnoc (x ∷ xs) = just (maybe′ (λ (ys , y) → x ∷ ys , y) ([] , x) (unsnoc xs))
but you can't then define initLast
in terms of it, because the InitLast
view also encodes an equality which isn't (immediately) observable about unsnoc
... so requires a separate proof (in Data.List.Properties
?).
@JacquesCarette How might you seek to resolve that?
I've tried multiple things and: I can't definite initLast
without using with
. The InitLast
view computes a lot, and that won't allow me to give any alternate definite (that I can see). I don't think the InitLast
view is wrong in any way -- I think this is just inescapable for the change of view from cons to snoc.
Ob: initLast
. Nowadays, we'd (more) typically name a function-symbol-quoted-as-a-constructor as <backtick><constructor-name>
(I think I learnt this undocumented (?) syntax pattern from @gallais ), so here the view 'should' perhaps be written as:
data InitLast {A : Set a} : List A → Set a where
`[] : InitLast []
_`∷ʳ_ : (xs : List A) (x : A) → InitLast (xs ∷ʳ x)
But not before the hypothetical-rewrite
milestone, I suspect... ;-)
Similarly, the constructor (prec: 5) and the operator (prec: 6) are declared with different precedences!?
I've tried multiple things and: I can't definite
initLast
without usingwith
. TheInitLast
view computes a lot, and that won't allow me to give any alternate definite (that I can see). I don't think theInitLast
view is wrong in any way -- I think this is just inescapable for the change of view from cons to snoc.
I suppose... you could write
initLast′ : (xs : List A) → xs ≡ [] ⊎ ∃₂ λ ys y → xs ≡ ys ∷ʳ y
initLast′ [] = inj₁ refl
initLast′ (x ∷ xs) = Sum.[ lem₁ , lem₂ ]′ (initLast′ xs)
where
lem₁ : xs ≡ [] → x ∷ xs ≡ [] ⊎ ∃₂ (λ ys y → x ∷ xs ≡ ys ∷ʳ y)
lem₁ eq = inj₂ ([] , x , cong (x ∷_) eq)
lem₂ : ∃₂ (λ ys y → xs ≡ ys ∷ʳ y) → x ∷ xs ≡ [] ⊎ ∃₂ (λ ys y → x ∷ xs ≡ ys ∷ʳ y)
lem₂ (ys , y , eq) = inj₂ (x ∷ ys , y , cong (x ∷_) eq)
unsnoc′ : List A → Maybe (List A × A)
unsnoc′ xs = Sum.[ (λ _ → nothing) , (λ (ys , y , _) → just (ys , y)) ]′ (initLast′ xs)
which is morally what already exists, but is with
-free, albeit at a non-trivial cost of intelligibility... as well as the (losing the) payoff of view-based programming: why prove something (an equation) that the typechecker can do for you for free?
The problem with local-only functions is that it's hard to prove things about them.
You perhaps mean something else but if the issue is that you cannot refer to them, let
me remind you of named where blocks:
f : ty
f = g
module F where
g : ty
h : ty
h = F.g
Putting module F
in front of the where
suddenly gives you the ability to refer to these
local functions.
The problem with local-only functions is that it's hard to prove things about them.
You perhaps mean something else but if the issue is that you cannot refer to them, let me remind you of named where blocks:
f : ty f = g module F where g : ty h : ty h = F.gPutting
module F
in front of thewhere
suddenly gives you the ability to refer to these local functions.
That is interesting, and something I hadn't known/appreciated before (though I think I'd seen #1166 discuss this feature without really getting what was going on) ... not sure about @JacquesCarette though!
Regarding "polluting the namespace with lots of module
names" (#1166 ), can such declarations also be marked private
? (Maybe that's moot if/while we retain a Base
/Properties
separation though!?)... Ah, OK documentation here
So to keep this thread alive until we can close it sooner rather than later: suggest
- remove
initLast
andunsnoc
from the list above - fix
unalignWith
Then close?
Agreed - did the first bit, no time right now to do the second.
@mechvel 's issue #2415 should make us reconsider this issue and its associated PR #2365 but playing with @Taneb 's version of the regression tests suggests that we should reconsider the explicit/implicit quantification in the properties of each of the Decidable
functions because eg the analogues of the filter-accept
lemmas for partition
yield unsolved metas regardless of whether the new or old definitions is used (UPDATED: see #2427)... while the derun
properties in their existing form use explicit quantification, even though their proofs do not require it... so the overall picture seems highly... unclear (to me, at least).
so the overall picture seems highly... unclear (to me, at least).
Ditto.
Actually better I think to leave this open, with the filter
etc. box left unticked... we should return to this in v2.2 once we've figured out #2427 ...