Array enumeration review
Opened this issue · 12 comments
Has anyone had a chance to look at my array enumeration work? Code and documentation are all there, still need to write test cases and TIP(s), but one of the TIP requirements is to get some community backing. Despite putting everything online, I've had zero feedback.
I've exceeded the requested scope by providing a C API for the entire [array] command and adding exact/glob/regexp filtering capabilities to almost every command. I feel this is the right thing to do. I don't want to focus on the minimum, do only half a job, and end up with a fragmented, short-sighted API. Plus I feel this expanded scope is more appropriate given the size of the bounty. What I've got seems consistent to me, and I'm happy with it, but I want to know if it works for you too.
I had to dust off my long-, LONG-forgotten GitHub account to be able to make this post...
This looks great!
Thanks. Design, code, test, and documentation are complete except for one outstanding bug report which I haven't had time to address. I totally forgot about having to terminate in-progress searches if an element is added to or removed from the array. I have been given a C language test case that causes a crash, and I have not yet successfully translated it to tcltest. The other big thing that needs to be done is to write a TIP. In addition to providing the C API, I have extended most of the Tcl commands to accept filters, including an option selecting what kind of matching to use. I am not clear on whether I am required to write two TIPs or if I can put everything in one TIP.
Okay, I now have a good idea what is going wrong with array searches. The root of the problem is my failure to notice the documented, intended behavior of terminating searches when any elements are added to or removed from the array. Thus I did not incorporate proper handling of same into my design.
When an element is added to or removed from an array, all pending searches on the array are terminated, and the search structures are all freed. This includes ArraySearch structures which escaped the control of tclVar.c because they were the return value of Tcl_ArraySearchStart(). Unsurprisingly, a crash results when the C API is asked to use a freed search structure to continue a terminated search. There is no way to notify the caller that a search has been completed and its structure freed, so it can't avoid it. Thanks to traces, it can't even be sure it's avoiding inadvertently causing the search to terminate.
What to do? I think the intended approach is to not let the ArraySearch pointer escape tclVar.c. Instead, do exactly what the Tcl API does: reference in-progress searches by tokens contained in Tcl_Objs. Should a search unexpectedly terminate, the ArraySearch lookup will fail and can be handled.
There are costs. One, this is slower because a hash table lookup must be performed every time a search function is called, though it won't be any slower than the Tcl equivalent. Two, every search function will be more complicated to use due to taking interpreter and array name arguments and returning error codes which must be checked. Together, these two issues make the iterative search functions less attractive than the total enumeration functions, much like the case in Tcl scripts.
Another approach is to leave things as they are and update the documentation. State that the iterative search functions are only to be used in cases where array elements are guaranteed to not be added or removed. This includes not only direct attempts to add or remove elements but also anything that can possibly enter the script engine, for example modifying array elements since that might trigger traces.
On the subject of traces, all [array] commands, including [array anymore] and [array donesearch], trigger array traces. Other than Tcl_ArraySearchStart(), my new C API search functions don't do this: Tcl_ArraySearchPeek(), Tcl_ArraySearchNext(), Tcl_ArraySearchDone(). This is because they do not call ArrayVar() since the array variable is referenced by the ArraySearch argument. While this inconsistency may seem bad, it does at least make the approach described by my previous paragraph possible.
Ultimately, I think I will need to change my Tcl_ArraySearch*() functions to work exactly like their Tcl counterparts, termination and traces and errors and all. But this will also make them less attractive.
A third approach is to admit that the concept of Tcl_ArraySearch*() is fundamentally hampered by the issues described above and to eliminate it outright due to being some combination of slow, inconvenient, unsafe, and inferior.
Thoughts? Preferences? Jokes? Criticisms?
It's a conundrum. In most cases, I think, it is fine to marshall all the elements of an array because there usually aren't that many. Also frequently one wants them sorted as well.
The whole array startsearch/donesearch stuff, I inflicted on the world. I think I talked Ouster into it against his better judgement. Hardly anyone has ever used it; the only advantage is it doesn't marshall the entire list of element names. I think "array foreach" is a much better approach.
Just spitballing here, but would it help if there was a Tcl_ArrayForEach and not the search/search-next/done stuff?
It's a little creepy that if some C code that was enumerating an array invoked some Tcl_Eval variant and the code eval'ed messed with the array then it would crash Tcl, but if it vastly complicates the work then I think it could be OK to stipulate this and say, you know, if you're planning to invoke Eval from your C code while looping over this then use the marshalling functions instead.
I have a solution. When a search is started with C (as opposed to Tcl), defer deallocation of the search structure until the next time a search function is called, at which point an error is returned. The caller needs to not use the pointer again after being told that the search "lookup" failed.
In discussion with Brad Lanam, it came to light that my array search functions (other than starting the search) do not trigger array traces. While I question the utility of these array traces, I do not want the functions to significantly differ from their Tcl analogues. In order to properly support traces, the functions have been redesigned to report errors. This expanded error reporting capability also gave me the room I needed to report that a search was aborted.
Both these changes are checked into Fossil, though I need to add test cases showing array (and other kinds of) traces are working correctly. I also need to update documentation. Now that there is some internal divergence between the Tcl and C calls, I'm also thinking of adding test commands to more thoroughly wring out the C side of things. (The divergence is immediate or deferred deallocation of the search structure.) Plus I'm planning functions to convert between ArraySearch pointers and search tokens so that a search can be initiated in C and iterated in Tcl or vice versa. Not practically useful, but good for testing. I'm very serious about tests.
As an apology for making the interface more complicated, I gave the search functions the power to report not only the array element names but also their values, should the valuePtrPtr not be NULL, so the user can skip the variable lookup in a common use case. Yeah, the extra argument makes the interface yet more complicated, but life is like that. For a fun time, let's see what happens when this feature gets used in combination with the read trace triggering search abort.
I agree that the Tcl search commands are more trouble than they're worth, but they do work. But I think their C analogues will actually see use. The root of the problem seems to be that you took a common C design approach and lifted it directly into Tcl without considering that the Tcl way is fundamentally different. But hey, drop that C-at-heart design back into C and it'll be right at home!
When you say Tcl_ArrayForEach(), I assume you're proposing a function that takes a function pointer argument which it will invoke for each iteration. This would let it terminate early, but the cost would be a more complicated interface in which the caller has to divide itself into multiple functions communicating via a pointer to a client data structure. Perhaps you haven't considered this, but it would also rule out parallel iteration over multiple arrays. All in all, I think the correct solution is to resemble the Tcl API which reports aborting as an error.
Yet it's possible to go too far in that direction (making the C API imitate the Tcl API). One approach I considered was not letting the ArraySearch pointer escape tclVar.c and instead forcing it to be internally looked up via a Tcl_Obj token (just like what is done in pure Tcl). There are performance drawbacks to this, and I assume you care about performance or else you wouldn't be asking for a C API in the first place. Instead take advantage of C's ability to bypass some of these lookups. (Read: take advantage of C's reputation for not guarding against unsafe code.)
I like the approach of deferring the deletion of the search structure until the next search call.
I agree that the explicit iteration is more C-like, and it is also easier to use from other languages (eg our Tcl-swift bridge code) since you don't have to drag a third language context through a ClientData structure if you're simply calling a search function over and over again.
Your last paragraph is a little concerning. C doesn't hold your hands but it is certainly possible to write safe code in C (eg, Tcl, Swift, and all other modern safe languages have a C runtime beneath them).
No problem, we understand time shortages.
Looking over the proposal again, it seems like it's missing a function something like this:
TclObj *TclArraySearchValue(search);
Returns the value associated with the last name returned from search (via *Peek or *Next). Otherwise you'd have to do an additional lookup after a Tcl_ArraySearchNext().
I can also see a possibility of a race condition here, so maybe:
int Tcl_ArraySearchNextValue(search, TclObj **namePtr, TclObj **valuePtr);
???
I also think that making this into a TIP will generate more community interest. I understand the catch-22 thing but what you're doing is really nice.
Anything going on?