dry-rb/dry-logic

Add a `text?` predicate for validating that a string contains non-whitespace characters

backus opened this issue · 11 comments

I have multiple projects now which use dry-validation. I am finding myself frequently reaching for this custom predicate in each:

module MyProject
  module Predicates
    include Dry::Logic::Predicates

    # Based on https://en.wikipedia.org/wiki/Whitespace_character#Unicode
    # Matches official ASCII and Unicode whitespace as well as characters
    # which are effectively whitespace characters
    effective_whitespace = "\u200B\u200C\u200D\u2060\uFEFF"
    WHITESPACE_PATTERN = /\A[[:space:]#{effective_whitespace}]*\z/

    def self.contains_text?(string)
      WHITESPACE_PATTERN !~ string
    end
    predicate(:contains_text?, &public_method(:contains_text?))
  end
end

Usage is usually

Dry::Validation.Schema do
  configure { config.type_specs    = true }
  required(:foo, :str).filled(:str?, :contains_text?)
end

Basically I want to assert that the value is a non-empty string with at least one character that isn't whitespace.

This seems generic enough to be a built-in predicate

Should we also add a new macro like non_empty_string? so ie required(:foo, :str).non_empty_string

You know, I've always expected that this is the behaviour that filled would offer. Is there any reason would couldn't simply make filled do this?

filled? could strip a string input by default although that would be implicit and it would be against the general philosophy of predicates being simple. Fwiw this predicate is already quite complex given that it handles multiple types.

We definitely should support something more dedicated for handling string inputs and it really feels like a new dedicated predicate is a good idea. We could either have a macro or a simple value(:filled_str?).

I should mention that if you want to strip a string then you gotta use a custom type that would do that. We could even add that as a built in type in dry-types as I suspect this will be very common.

@solnic I'd like to incorporate this into a separate predicate. We have a number? Predicate that checks if a string is a number so why can't we have a predicate that checks of a string contains text.

WRT stripping white space it can get quite complicated or at least personal. Do you strip left and right or just right, do you strip multiple consecutive spaces etc.

For anyone wondering how to create a custom type that strips a string like @solnic described, here's one way:

stripped_string = Dry::Types['string'].constructor { |*args| String(*args).strip }
stripped_string['']       # => ''
stripped_string['  ']     # => ''
stripped_string['foo']    # => 'foo'
stripped_string['  foo '] # => 'foo'

Or you could do something like this:

Dry::Types.register(
  'stripped.string',
  Dry::Types['string'].constructor { |*args| String(*args).strip }
)

module Types
  include Dry::Types.module
end

class Person < Dry::Struct
  attribute :name, Types::Stripped::String
end

p = Person.new(name: '   George   ')
p.name # => 'George'

Note that this 'stripped string' type is coercible, so you can e.g. pass it a `Fixnum and it won't blow up (whereas Types::Strict::String would).

(Disclaimer: I have no idea what I'm doing; there may be a better way of accomplishing this.)

dnd commented

I would agree with @timriley. I think my original expectation was that filled? would strip automatically. If it doesn't, that's fine, but it should probably be stated explicitly in the docs to help avoid confusion. text? or non_empty_string? both seem like acceptable predicate names.

I think this predicate is a good idea, but like I mentioned in #25, I really think the word 'empty' should be avoided here (i.e. non_empty_string? would be a bad idea for a name.) There's already a predicate called empty? which returns true for "" and false for " " (which is consistent with the Ruby core method String#empty? - if this new predicate is called non_empty? or something similar then you'd get cases where the same string is both "empty" and "non-empty":

str = " "
empty?(str) # => false
non_empty?(str) => # false

... which doesn't make any sense.

(Of course, I'd argue that from a semantic point of view, a whitespace-only string is really "empty" and " ".empty? should return true... but that's an issue I'll have to take up with the Ruby core team, not the dry-rb team ;) )


EDIT: cross-posting from what I just wrote on the dry-logic issue: How about non_blank? (or not_blank?)? Semantically, I think that "blank" makes sense as a word meaning "not whitespace", and this would avoid confusion/conflict with the way dry-v and dry-logic are already using terms like 'empty'.

This is also consistent with how ActiveSupport uses the term 'blank' in String#blank?. (I share @solnic's distaste for ActiveSupport, but since 99% of people learning dry-rb will be coming here from Rails, I reckon it's worth trying not to confuse them too much.)

dnd commented

@georgemillo I was thinking the exact same thing about using some variant of blank? not_blank?, but didn't suggest it for the same reason of known distaste for ActiveSupport. I would definitely second the use of blank variants for this.

Whatever happened to this?

For me, it's common to have a dropdown in a web form that includes a blank and need to validate that the value is included in a set of particular values if the value is not blank.

So do I create a high-level rule (seems like overkill) or a custom predicate? Doing required(:my_field).maybe(:str?, included_in: %w(acceptable values)) doesn't work with web form parameters because the empty string is a string (so it won't get coerced to nil in this configuration).

(What I'm doing right now – which feels ugly – is required(:my_field).maybe(included_in?: ["", "acceptable", "values"]) … ideally the blank would be coerced to nil)

The confusing part is the docs say "Your schema will automatically coerce empty strings to nil provided that you allow a value to be nil" and then shows required(:age).maybe(:int?, gt?: 18). Therefore, it's surprising that when you change that example (if contrived) to required(:age).maybe(:str?, included_in?: ["18"]) that it fails to validate when given {age: ""}.

This won't be added. We can consider stripping strings in Params by default and then filled will work (or any length-checking predicate). Regardless of how we handle this exactly, it will not be implemented in a predicate in dry-logic.