about naming
The points I raise here could be just personal taste, and it might be quite cumbersome to change names, but I think it's better to discuss this sooner rather than later.
I found some names in the package a little confusing:

- `ca` as the core of `Software Alchemy`. I expected `sa` for this, since chunk averaging is seldom mentioned. As for averaging, isn't it possible that we sometimes need something different, like getting the top 10 values from all the data? That's a typical Hadoop example, but `Software Alchemy` can handle it as well. On the other hand, I find that the name `Software Alchemy` by itself doesn't tell users what it is, compared to `Divide and Combine`. Maybe you could also call it `scatter compute`.
- None of the function names separate words in any way, either with camel case or with underscores. I have to mentally parse `filesplitrand`, and the `r` inside it is especially easy to overlook. `calm` is difficult to read as `ca lm`.

I'd suggest using underscores and a common prefix, like `stringr` does. All functions would then look like `file_xx`, `dis_`, `sa_xx`, or even just `f_xx`, `d_xx`.
I just found that the vignettes already mention that you sometimes need more than averaging. This confirms my feeling that the `ca` name is not the best choice. I also find that `scatter` inherently carries a sense of random shuffling, so it's a good word for this case.
Agree that the names could be improved.
> I'll suggest to use underscore, and use common prefix like stringr. So all functions will be like file_xx, dis_, sa_xx, or even just f_xx, d_xx.
To be clear, does this mean `ca`, `cabase`, `calm`, `caglm`, `caprcomp` become `ca_`, `ca_base`, `ca_lm`, `ca_glm`, `ca_prcomp`, etc.?
Yes. I didn't add the 'ca' example because I don't think 'ca' is the best representation of software alchemy. "Software alchemy" is not easy to understand or relate to, either.
Changing from 'ca' to 'sa' is a good idea. We can do that easily without breaking users' old partools code by simple assignments, e.g. salm <- calm.
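A minimal sketch of that aliasing idea, in base R. The `calm` defined here is only a stand-in so the snippet runs on its own (it is not the real partools function), and `sa_lm` and `calm_deprecated` are hypothetical names used for illustration.

```r
# Stand-in for an existing function; NOT the real partools::calm.
calm <- function(x) mean(x)

# Plain assignment: both names refer to the same function,
# so code written against the old name keeps working unchanged.
salm <- calm

# Alternative (hypothetical): treat the new name as primary and keep the
# old one as a thin wrapper that warns before delegating.
sa_lm <- calm
calm_deprecated <- function(...) {
  .Deprecated("sa_lm")  # base R helper that issues a deprecation warning
  sa_lm(...)
}
```

The plain assignment never warns and costs nothing. The wrapper variant is one possible way to phase old names out gradually, if that were ever wanted.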
I agree that the lack of separators like '_' may be difficult for a non-native speaker of English at first, but I would be reluctant to break users' existing code.
Software alchemy is really for means, including proportions, and is not appropriate for something like fetching the top 10 values of a variable. However, one can use partools in other ways. Actually, I was just the other day thinking about writing a convenience function for that.
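To illustrate the distinction, here is a rough base R sketch that does not use partools at all. The data, the four equal-sized chunks, and the helper `top_k` are all assumptions made up for the example, not the convenience function mentioned above.

```r
# Toy data split into 4 equal-sized chunks (hypothetical setup).
set.seed(1)
x <- rnorm(1000)
chunks <- split(x, rep(1:4, each = 250))

# Chunk averaging: compute the statistic on each chunk, then average.
chunk_means <- sapply(chunks, mean)
ca_estimate <- mean(chunk_means)

# Top 10: averaging per-chunk results would not give the right answer.
# Instead, keep each chunk's 10 largest values, pool them, and take the
# 10 largest of the pooled set.
top_k <- function(v, k = 10) head(sort(v, decreasing = TRUE), k)
chunk_tops <- lapply(chunks, top_k)
top10 <- top_k(unlist(chunk_tops, use.names = FALSE))
```

With equal chunk sizes the averaged chunk means match the overall mean, while the top 10 needs a different combine step (pool, then select) rather than an average.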
As to Divide and Combine, see my 2016 JSS paper, which is referenced both in the man page and the vignette.