This text is intended as shared knowledge about the different approaches to string concatenation/interpolation in Elm, listing their pros and cons.
The text is likely to be opinionated and narrow, so if you have any thoughts then please share them! I think the Elm Slack is the most appropriate place; my handle there is @emmabastas.
A common programming task is to put dynamic data into static strings. Say that I want to tell the users of my weather app what the temperature outside is in a nice personalized way. For instance: if the user's name is "Mark" and the temperature outside in °C is 21, then I'd like the final string to be "Hey Mark, It's 21 °C outside.". How do we accomplish that? There are currently many ways to do this in Elm, which is not desirable. In this text we look at the prior art, list what qualities we're looking for in an Elm solution, and then evaluate the current approaches/solutions against this list.
There is so much prior art that it would be impossible to cover it all. The goal of this section is mostly to establish the core concepts and vocabulary of string concatenation/interpolation. If you're already familiar with C-style `printf`, positional and named placeholders, and Rust's `format!` macro, then you can probably skip this section.
TODO
The most common form of this is C-style `printf` et al., which appears in a lot of languages today. It looks like this:
printf("Hey %s. It's %d °C outside.", "Mark", 21);
// Hey Mark. It's 21 °C outside.
A format string is passed to `printf`. The `%s` is a format specifier and means "put a string here", and `%d` is the same but for an integer. You can do even fancier things with format specifiers, but that is outside the scope of this text.
Another example is Python's `.format`. It can have positional placeholders:
'Hey {}, It\'s {} °C outside.'.format("Mark", 21)
# "Hey Mark, It's 21 °C outside."
Or named placeholders:
'Hey {name}, It\'s {temperature} °C outside.'.format(name = 'Mark', temperature = 21)
# "Hey Mark, It's 21 °C outside."
Format specifiers can do a lot more than this. In C-style formatting, for example, `%20d` prints an integer padded with spaces on the left to a width of at least 20 characters.
TODO
In Rust you do string interpolation like this:
format!("Hey {name}, It's {temperature} °C outside.", name = "Mark", temperature = 21);
// "Hey Mark, It's 21 °C outside."
This looks a lot like string interpolation with a runtime format string, but the format string is actually checked at compile time and is type safe. If I have a typo in a placeholder, I get a compilation error!
Defining what to optimize for is incredibly important. It's through that lens we view and judge all the approaches.
Rust optimizes for type safety and zero-cost abstractions, which led to the `format!` macro becoming the preferred way. Python optimizes for other things, leading to another solution that wouldn't make sense in Rust, and vice versa.
In Elm, the solution should optimize for:
- Type safety. If something compiles it should work. More generally, the API/behavior should be designed in a way that minimizes potential bugs and gives a pleasant user experience.
- Simplicity. Ideally there would be one approach which is *the* approach. That approach should be easy to use and have no major flaws. If someone asks about string concatenation/interpolation, the response should be "use x", not "if a then use x, else if b then use y ...". I think that approaches utilizing already existing functions in `elm/core` have this going for them.
- Readability. Looking at the code should give us a feel for what the concatenated/interpolated string will look like. Looking at something like `"Hello ${firstName} ${lastName}!"` makes it clear that the final string could look something like "Hello Laurie Anderson!" or "Hello Alvin Lucier!".
We do not optimize for:
- Performance. String interpolation/concatenation has never been a bottleneck for me, nor have I heard of that being the case for others. We shouldn't solve problems that folks aren't having! That said, if you have a use case where performance is a concern, that would be very valuable to hear about.
From these vague guidelines we can also derive some more specific requirements:
- No fancy format specifiers in a format string. Having format specifiers like `%20s` for padding a string or `%d` for inserting a number is redundant in Elm. It's better to use Elm functions like `String.padLeft` or `String.fromInt` to achieve that.
- Only named placeholders. Named placeholders are less error-prone than unnamed ones. They're also more readable.

**Examples**
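As a minimal sketch of the first requirement, here is roughly what a C-style `%5d` could look like with plain `elm/core` functions. The module name `Padding` and the helper `paddedTemperature` are hypothetical and exist only for illustration.

```elm
module Padding exposing (paddedTemperature)

-- Hypothetical helper: turns an Int into a String and left-pads it
-- with spaces to a width of 5, similar in spirit to C's "%5d".


paddedTemperature : Int -> String
paddedTemperature temperature =
    temperature
        |> String.fromInt
        |> String.padLeft 5 ' '
```

With this definition, `paddedTemperature 21` evaluates to `"   21"`. The formatting logic lives in ordinary, type-checked functions instead of in a mini-language inside the string.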
NOTE: If you don't agree with this, or have some thoughts, then please share them! It's important to get this right. Any and all feedback is appreciated ❤️
Pros:
- Type safe.
- Simple. `++` is arguably the most obvious way to do concatenation; it's in `elm/core` and is one of the first things Elm developers learn about.
Cons:
- Readability. `++` is considered to be less readable than other alternatives.
There are, however, some techniques to make `++` more readable:
- Bind expressions to variables and use the variables instead. TODO
- Use multiline strings for multiline content. TODO
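A rough sketch of the first technique, binding expressions to variables, could look like the following. The module name `Greeting`, the function `greeting`, and its argument names are assumptions made up for this illustration.

```elm
module Greeting exposing (greeting)

-- Hypothetical example of keeping `++` readable by naming sub-expressions.


greeting : String -> Int -> String
greeting name temperature =
    let
        -- Convert and label the temperature once, so the concatenation
        -- below reads close to the final sentence.
        degrees =
            String.fromInt temperature ++ " °C"
    in
    "Hey " ++ name ++ ", It's " ++ degrees ++ " outside."
```

Here `greeting "Mark" 21` evaluates to `"Hey Mark, It's 21 °C outside."`. The conversions are kept out of the concatenation itself, so the last line shows the shape of the final string.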
TODO
Example:
"Hey ${name}, It's ${temperature} °C outside."
    |> String.replace "${name}" "Mark"
    |> String.replace "${temperature}" (String.fromInt 21)
--> "Hey Mark, It's 21 °C outside."
Pros:
- Simple. Part of `elm/core`, easy to understand and use. One downside is that no particular placeholder syntax is enforced, and depending on which languages you are used to, you might prefer a different syntax. Python does it like `{name}`, bash like `$name`, and so on. There's a risk of bikeshedding.
- Readable. Even with long multiline strings the structure remains clear.
Cons:
- Type safety. Typos can result in bugs. Even more problematic, chaining `String.replace` calls can sometimes have unintended behavior:
"Hey ${name}, It's ${temperature} °C outside."
    |> String.replace "${name}" "${temperature}"
    |> String.replace "${temperature}" (String.fromInt 21)
--> "Hey 21, It's 21 °C outside."
The user set their name to be `"${temperature}"`, and that causes the string interpolation to produce this weird result. Depending on the use case, this could be a major problem. Regardless, it's something one has to constantly be aware of.
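The typo problem mentioned under "Type safety" can be sketched in the same way. In this hypothetical snippet (the module name `Typo` and the misspelled placeholder `${nmae}` are made up for illustration), the code compiles without complaint, but the first replacement never matches.

```elm
module Typo exposing (greeting)

-- Hypothetical example: "${nmae}" is a deliberate typo for "${name}".
-- String.replace leaves the string unchanged when nothing matches,
-- so the compiler cannot warn us about the mistake.


greeting : String
greeting =
    "Hey ${name}, It's ${temperature} °C outside."
        |> String.replace "${nmae}" "Mark"
        |> String.replace "${temperature}" (String.fromInt 21)

--> "Hey ${name}, It's 21 °C outside."
```

The raw placeholder ends up in front of the user, and nothing short of a test or a careful review will catch it.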
Example:
import String.Format exposing (value, namedValue)

-- positional placeholders
"Hey {{ }}, it's {{ }} °C outside."
    |> value "Mark"
    |> value (String.fromInt 21)
--> "Hey Mark, it's 21 °C outside."

-- named placeholders
"Hey {{ name }}, it's {{ temperature }} °C outside."
    |> namedValue "name" "Mark"
    |> namedValue "temperature" (String.fromInt 21)
--> "Hey Mark, it's 21 °C outside."
Pros:
- Simple. Easy to understand and use. Enforces a particular placeholder syntax, except that the placeholder names can be padded with any amount of spaces.
- Readable.
Cons:
- Type safety. `namedValue` has the same problems as `String.replace`.
TODO
TODO
TODO