secretflow/yacl

Where does the linear code algorithm come from?

maths644311798 opened this issue · 4 comments

In linear_code.h, Yacl cites the paper "The minimum locality of linear codes" among other links. That paper proves many results about locality, but which algorithm does Yacl actually implement? Could this be stated clearly?

The file linear_code.h implements a d-local-linear-code for the Ferret Oblivious Transfer (OT) extension, as described in Appendix A.2 of https://eprint.iacr.org/2020/924.pdf. Moreover, the majority of the implementation is based on the code from https://github.com/emp-toolkit/emp-ot/blob/master/emp-ot/ferret/lpn_f2.h. :)

Let me summarize the implementation here. Over $F_2$, the generator matrix $G$ satisfies the definition of $d$-local linear codes. The generator matrix $G$ has several groups of $1s$; each group contains 64 or 128 $1s$, and the $1s$ lie along a diagonal (a "slash" pattern). For example,

```
1 . . . . . . .
. 1 . . . . . .
. . 1 . . . . .
1 . . 1 . . . .
. 1 . . 1 . . .
. . 1 . . 1 . .
```

(Illustrative only, with zeros shown as dots: two groups of three $1s$, each group lying along a diagonal. In the real matrix each group contains 64 or 128 ones.)

In the encode functions such as Encode(absl::Span<const uint64_t> in, absl::Span<uint64_t> out), the statement auto val = out[i + j]; is problematic: if absl::Span<uint64_t> out is not zero-initialized, its prior contents influence the result. I think the initial value of val should be 0.

> Each group contains 64 or 128 $1s$ and the $1s$ lie in a slash.

Actually, each group chooses the non-zero entries of the generator matrix by a random permutation (AES), as can be seen at https://github.com/secretflow/yacl/blob/main/yacl/crypto/primitives/code/linear_code.h#L243.
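For illustration only, the index sampling can be sketched like this, with std::mt19937_64 as a stand-in for the AES-based generator (`SampleNonzeroPositions` is my own name, not yacl's):

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Sketch: for each of the n output coordinates, draw d row indices in
// [0, k) from a seeded PRG; these are the positions of the ones in that
// column of G. yacl derives them from AES instead of std::mt19937_64.
std::vector<std::vector<size_t>> SampleNonzeroPositions(uint64_t seed, size_t n,
                                                        size_t k, size_t d) {
  std::mt19937_64 prg(seed);
  std::vector<std::vector<size_t>> pos(n, std::vector<size_t>(d));
  for (auto& column : pos) {
    for (auto& p : column) p = prg() % k;  // modulo bias ignored in a sketch
  }
  return pos;
}
```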

> If absl::Span<uint64_t> out is not set to zero, then its value influences the result. I think the initial value for val should be 0.

Encode(absl::Span<const uint64_t> in, absl::Span<uint64_t> out) performs the operation out = (in * G) ^ out. Consequently, we can compute the output vector for the "LPN assumption" as follows:

```cpp
// Implementation 1: avoid an extra memory allocation by
// accumulating directly into the noise buffer.
auto& output = noise;
// output = secret * G ^ noise
llc.Encode(secret, noise);
```
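The accumulate semantics can be mocked to make the point concrete (toy code, not yacl's; `MockEncode` is a hypothetical name, and the "generator" here simply XORs every input into every output):

```cpp
#include <cstdint>
#include <vector>

// Mock of Encode's accumulate behavior: out = (in * G) ^ out, using a
// toy G that XORs all inputs into each output coordinate.
void MockEncode(const std::vector<uint64_t>& in, std::vector<uint64_t>& out) {
  uint64_t acc = 0;
  for (auto v : in) acc ^= v;    // in * G for the toy all-ones G
  for (auto& o : out) o ^= acc;  // XOR into the existing contents of out
}
```

Calling `MockEncode(secret, noise)` leaves `noise` holding `secret * G ^ noise`, i.e. the LPN sample, with no zeroing required.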

If Encode(absl::Span<const uint64_t> in, absl::Span<uint64_t> out) were forced to zero out out first, we would have to compute the "LPN" output as shown below, which requires an extra memory allocation and initialization:

```cpp
// Implementation 2: extra allocation and zero-initialization.
std::vector<uint128_t> output(n, 0);
// output = secret * G
llc.Encode(secret, output);
// output = output ^ noise = secret * G ^ noise
// (requires <algorithm> and <functional>; the original had a typo
// "std::transfrom" and was missing the output iterator)
std::transform(noise.begin(), noise.end(), output.begin(), output.begin(),
               std::bit_xor<uint128_t>());
```

FYI, the LPN parameters in the Ferret OT extension (Table 2) require $n \approx 10{,}000{,}000$. As a result, implementation 2 needs an extra ~30 ms just for memory initialization.

I understand these two implementations now. The name Encode(...) is somewhat misleading, since it does not perform a purely linear encoding.