Documentation in plonky2 crate
Gorzorg opened this issue · 11 comments
The plonky2 crate is not documented properly.
Even though the whole thing is open source, anyone who wants to use it must figure out what most of the API does by code digging.
Notable examples include:
- many
plonky2::plonk::CircuitBuilder
methods are undocumented, - many
plonky2::gates::gate::Gate
methods are undocumented.
As far as I know, there is a lack of adequate tutorials (the PolymerLabs tutorial is super useful for the basics, but it is inadequate if one wants to use the crate in a project) or explanations about the code, which makes it hard for newcomers to understand what is going on in plonky2, and what functions should be used/preferred/avoided.
I can offer to help write the missing documentation, but given my lack of experience I doubt i could be of great help, without someone to ask what each thing does.
If you want me to help please let me know.
Thank you for reaching out! It's true that our main focus recently hasn't been too much around documentation of the plonky2 crate. I'll dedicate some time to it within the next 10 days to help clarify as much as possible the API for external parties.
Because documenting properly the entire codebase would probably be a longer term effort, if you could point me to which parts (apart from CircuitBuilder
and Gate
) documentation would be useful for external contributors, this would be great!
Thanks for the quick answer!
In this first answer i will focus on gates.
What are gates for?
Part of the problem is that it is not obvious to me what gates are for. All of the traits required for their implementation are public, so I guess you want to leave external users with the possibility to implement their own.
Many people will have repeated units of logic in their circuit, and a natural thing to think (maybe this is a misconception I have) is that one could use Gate
for library building with plonky2, i.e. to implement some circuit logic in a standardized way.
Another approach to library building would be to just implement new traits on CircuitBuilder
, whose functions implement some standardized logic by using already available functionality as building blocks.
What is not clear to me is when to follow one or the other approach, because no general guidelines are available.
I suggest a good place to put this kind of guideline could be on the documentation of the Gate
trait itself.
num_constraints
this one was a somewhat easy catch, but it was not immediately clear to me that, if you are implementing a gate X, and X uses another gate Y, you should not count Y's constraints towards the number of X's constraints.
While this could seem obvious in hindsight, no definition for num_constraints is provided, and this could lead to implementation errors/frustration.
eval_ functions
There are many variants, and getting an intuition of the differencts takes some time. After hours of code digging, I have some idea of what the _circuit suffix means, and about what the _unfiltered/_filtered suffixes mean, but I cannot say my intuition is precise knowledge. Moreover, I could not really understand if this is something the user should even worry about.
I have many more questions, but that is more curiosity than something practically useful.
Going through the many CircuitBuilder
methods, and realizing that "I would like nice docs on all of them" is a huge request, I think I have at the moment two main issues:
Target types
One has Target
, ExtensionTarget
, ExtensionAlgebraTarget
, and the CircuitBuilder
's API provides variants for most operations that involve each of those target types. It is not clear to a newcomer like me why/when one should use one over the other, as most methods describe the operation they perform, which stays the same over the variants (e.g. builder.mul_...
), and as all those methods have public visibility.
I don't really have a clear idea how to solve this issue. If the answer is very generalizable, maybe one could leave a note to the users in the documentations of the various target types, like "this is meant for internal use, or if you know what you are doing. In most cases, use this other type instead.", or like "you might want to use this target type when you have a situation like example code
".
No apparent logical grouping
Since most (all?) methods implemented for CircuitBuilder
are directly implemented, and with pub visibility, the user has little hints for what is used where.
Maybe a way to mitigate the confusion generated by this is to group method implementations behind traits that group different functionalities, but this is unfortunately a breaking change for downstream users of plonky2, so I don't know how to feel about it.
I have no clear idea on how to tackle this, but I think having all the available methods listed alphabetically is somewhat discouraging for new users.
Oh god I just realized I skipped over the "(apart from CircuitBuilder and Gate)" parenthesis in your message -.- sorry about that.
Oh well, now at least you know exactly what bothers me about that. I would say the biggest confusion I have so far, besides gates and circuit builder methods, is about plonky2::plonk::circuit_data::CircuitConfig
.
The user has the ability to configure every parameter in the struct, but it is not apparent what the implications on performance and security are. Some guidelines in the documentation would be handy.
I am not asking for anything super technical, and even something along the lines of "increasing this parameter allows for shorter proof but increases proving time" would be very useful, because it would help non-expert users in writing benchmarks to empirically understand what works better for them, and what should not be played too much with (i.e. parameters that can reduce security).
Thank you for your input, that's really valuable, and will help narrow down the immediate focus for stronger documentation.
Oh god I just realized I skipped over the "(apart from CircuitBuilder and Gate)" parenthesis in your message -.- sorry about that.
No worries, at least you clearly explained what wasn't clear, which can not be obvious for maintainers who have been heavily involved in the library for some time.
I would nonetheless point out, as a global advice for anyone aiming at building on top of plonky2 (or more generally any zk proving system implementation), that building zk-applications / tailored circuits is hard, tedious, error-prone, and should be left to cryptographers or users familiar enough with the cryptographic notions involved, as the consequences of a badly designed circuit can be disastrous.
Thank you for the word of warning. Even though I am not a cryptographer, I do have a mathematical background. I can more or less follow the arguments in the writeup linked in the repo's docs, which alone solved many of my initial doubts. That said, you are absolutely right in saying that anyone in my position can easily do gross mistakes.
However, assuming too many things about how the math maps to the actual code and filling in the gaps with imagination is not a good strategy. This is also a big part in my motivation to properly look around the repo before writing anything with it.
Also, I took too long to edit my previous message, sorry again. In case you didn't read it, I added a couple of sentences about CircuitConfig.
@Gorzorg I've added some documentation in the linked PR which I hope will clear some things out. It's not covering the entire list you gave, nor is as deep as you may have hoped for some of its components, but I intend on expanding it with follow-up PRs whenever I have some time.
I just finished reading the changes. Thank you! While you are right in saying one may wish for more on some components, this is already super useful. Thanks again :)
We'll try putting some more effort on documentation in the near future, feel free to specify any items that you think would deserve more scrutiny!
So, lately I have been busy making a circuit that employs cyclic recursion. While I eventually managed to make it work, I had to understand how to perform the bootstrapping procedure (I am not implying that now I know what I am doing!) one has to make for getting the right circuit data for cyclic recursion.
Even after making it work, I don't really understand why it works exactly, and I don't understand why that procedure cannot be automated. Maybe some documentation could avoid other users similar questions, and clarify my current doubts.