Consider: Should Value::op<< STRING be wrapped in quotes?
devjgm opened this issue · 6 comments
This issue is being broken out from #1380.
Background
The behavior today is:
std::cout << Value(std::string("foo"));
This would output "foo"
(including the surrounding double quotes). The question is should that include the double quotes or not?
A relevant comment from @coryan #1380 (comment)
And from @devbww #1380 (comment)
Streaming T
vs Value(T)
In PR #1387, I added a unit test to compare the output of streaming an object of type T
vs Value(T)
.
google-cloud-cpp-spanner/google/cloud/spanner/value_test.cc
Lines 1085 to 1139 in f83a24e
From this we can see that for all non-aggregate (vector or tuple) types, the following lines produce the same output:
T t = ...;
std::cout << t << "\n";
std::cout << Value(t) << "\n";
Except for std::string
. When T
is std::string
, Value(t)
wraps the output in double quotes.
Question
Do we want std::cout << Value(std::string("foo"));
to output
"foo"
, or ...foo
Proposal from @devjgm
I'd prefer if Value(std::string)
did not wrap the output in double quotes normally. That is, I'd prefer if std::string
behaved exactly like all the other non-aggregate values and printed the same whether or not it was wrapped in a Value
. I think this would be the behavior expected by most users.
However, this could cause problems with delimiters when printing a std::string
that's part of an aggregate, such as Value(std::vector<std::string>{...})
. So only when a std::string
is printed as part of a vector or tuple, I think in that case it should be wrapped in quotes. In this model, the wrapping of the string in quotes is conceptually part of the formatting of the aggregate, not the string itself.
So to be clear, I propose the following behavior:
std::string s = "foo";
std::cout << s << "\n";
std::cout << Value(s) << "\n";
std::cout << Value(std::vector<std::string>{"foo", "bar", "baz"}) << "\n";
OUTPUT:
foo
foo
["foo", "bar", "baz"]
Thoughts? Other proposals?
Thoughts? Other proposals?
So, I believe I've made my thoughts/other-proposals clear, and I'm sure I'll be outvoted (which is fine), but I can reiterate if you like.
The existence of StreamMatchesValueStream
and the addition of new T
output operators, has painted us into a corner where your proposal may be the best answer to the question you ask. But I don't think that "StreamMatchesValueStream" need be an axiom.
Instead, Bytes
, Timestamp
, and Date
would not have output operators, just like vector
and tuple
do not. Rather, the provided accessors would be used to render each type as you see fit. Timestamp
and Date
streaming may well be something that should be locale specific anyway. And if Bytes
streaming exists at all, it should simply stream the bytes, just as string
streaming streams the chars.
If we need spanner::Value
streaming for "debugging and human consumption" (and I'm happy to say we do), then my take has been to stream a value just like you'd write the corresponding literal in Spanner.
This seems to be just as consistent a world as "StreamMatchesValueStream", but doesn't require addressing any additional questions about quoting (including making it context dependent) or escaping.
If we need
spanner::Value
streaming for "debugging and human consumption" (and I'm happy to say we do), then my take has been to stream a value just like you'd write the corresponding literal in Spanner.
I think this is the main issue that would push us one way or the other. Once this is decided, I think the other questions are pretty clear.
There are 9 "Spanner types" and they all have a corresponding C++ type:
google-cloud-cpp-spanner/google/cloud/spanner/value.h
Lines 57 to 67 in f83a24e
An instance of Value
is like a bridge between the Spanner and C++ worlds. So it's reasonable to wonder what we're formatting for when streaming a Value
. Do we format for the Spanner word, or the C++ world?
I now see that you're pushing for us to favor and format for the Spanner world using Spanner's literal syntax. And I have been pushing the other direction, for us to favor the C++ world, and print things as much as possible as a C++ programmer might expect the C++ type to be formatted.
Here's why I think we should ignore the Spanner formatting/syntax and instead stream Value
objects as much as possible like their C++ types.
- Our C++
Value
class only operates in terms of C++ types. You create it with C++ types. You.get<T>()
C++ types. And I think users will naturally expect it to print as much as possible, like the corresponding C++ type. - Clearly,
operator<<()
will be called in C++ code in a C++ context, so I again, I think producing output that looks like normal streamed C++ output is good. - Users should not use attempt to use
operator<<()
withValue
to construct SQL query strings to be used in Spanner. They should instead use bound params with their queries, which is more secure and doesn't require any formatting. I believe it's a non-goal to make it easy for users to paste together strings to build up an SQL statement. - As our documentation for
op<<
says, the output is intended for debugging and human consumption. Outputting Spanner's SQL literal syntax is likely not the most useful format (and extra verbose) for this purpose.
If you (@devbww) still favor streaming the value just like one would write the corresponding literal in Spanner, can you give the reasons why that would be better for our users? I'm open to being convinced :-)
@devbww I don't want to rush your reply, because I realize this is likely not the most important issue that you're working on. So please don't feel urgency or pressure on this.
That said, the current behavior of Value streaming is inconsistent, and our string quoting does not escape internal quotes. So I'm going to send a PR to implement the proposal I suggested above, because that PR will at least get us to some consistent behavior for the GA launch.
All the output stream operators are documented with a @warning
that the output format is for debugging/human consumption and it "may change without notice". So we are not painting ourselves into a corner -- you're still welcome to convince us that formatting using Spanner literal syntax is preferable, and we can change to that later.
If you (@devbww) still favor streaming the value just like one would write the corresponding literal in Spanner, can you give the reasons why that would be better for our users? I'm open to being convinced :-)
@devbww I don't want to rush your reply, ...
It seemed to me that the first request was conditional ("if"), so given that I don't believe I can give you convincing reasons, I have nothing to add.
OK. Thanks.