IBM/differential-privacy-library

Different lengths of probabilities and measure arrays in /mechanisms/exponential.py#L130

Closed this issue · 3 comments

In line 130,

probabilities *= np.array(measure) if measure else 1

the probabilities and measure arrays seem to have different lengths.

This is because, per line 132 of https://github.com/IBM/differential-privacy-library/blob/main/diffprivlib/tools/quantiles.py#L132, measure=list(interval_sizes), and per line 125, interval_sizes = np.diff(array). That makes len(measure)=len(array)-1.

On the other hand, per line 131 of https://github.com/IBM/differential-privacy-library/blob/main/diffprivlib/tools/quantiles.py#L131, utility=list(-np.abs(np.arange(0, k + 1) - quant * k)), which makes len(utility)=k+1, where k=len(array) per line 121 of https://github.com/IBM/differential-privacy-library/blob/main/diffprivlib/tools/quantiles.py#L121.
Further, since the probabilities array is constructed from the utility array on line 128,

probabilities = np.exp(scale * utility / 2)

that makes len(probabilities)=len(array)+1.

In that case, how can line 130,

probabilities *= np.array(measure) if measure else 1

be consistent if ultimately len(probabilities)=len(array)+1 while len(measure)=len(array)-1?

Is this a bug?

Further, in the case where rand is greater than every entry of self._probabilities on line 169, i.e., when idx = len(self._probabilities) on line 174 of the same file, line 134 of https://github.com/IBM/differential-privacy-library/blob/main/diffprivlib/tools/quantiles.py#L134 should raise an "array index out of bounds" error because of the array[idx+1] reference, right?
The error can only be avoided when max(len(probabilities)) = len(array)-2, which I suppose is not the case?

Hi Swastik,

In line 130,

probabilities *= np.array(measure) if measure else 1

the probabilities and measure arrays seem to have different lengths.
This is because, per line 132 of https://github.com/IBM/differential-privacy-library/blob/main/diffprivlib/tools/quantiles.py#L132, measure=list(interval_sizes), and per line 125, interval_sizes = np.diff(array). That makes len(measure)=len(array)-1.

In the quantile function, the user-provided bounds are appended to the array on line 122, increasing len(array) by 2:

array = np.append(array, list(bounds))

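This should explain the apparent inconsistencies in the lengths of the various variables that you mention. Since k is taken on line 121, before the append, len(array)=k+2 by the time np.diff runs on line 125, so len(measure)=len(utility)=k+1.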
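To make the length bookkeeping concrete, here is a minimal sketch that reproduces only the lines quoted in this thread. The input values are made up, the sort step is an assumption (it does not affect any length), and the final check assumes the mechanism returns an index in range(len(utility)).

```python
import numpy as np

# Made-up input for illustration; variable names follow the lines quoted above.
array = np.array([1.0, 2.0, 4.0, 7.0])
bounds = (0.0, 10.0)
quant = 0.5

k = array.size                            # k is taken BEFORE the bounds are appended
array = np.append(array, list(bounds))    # len(array) is now k + 2
array.sort()                              # assumption: sorting does not change any length

interval_sizes = np.diff(array)           # len(array) - 1 == k + 1 entries
utility = -np.abs(np.arange(0, k + 1) - quant * k)   # k + 1 entries

lengths = (len(array), len(interval_sizes), len(utility))
print(lengths)

# Assumption: the largest index the mechanism can return is len(utility) - 1 == k,
# so array[idx + 1] reads at most array[k + 1], the last element of the array.
max_idx = len(utility) - 1
upper = array[max_idx + 1]
```

With these inputs, measure and utility both have k + 1 = 5 entries, and array[idx + 1] stays within the 6-element array.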

I should add that there is robust error-checking on the input parameters of all mechanisms in diffprivlib. For example, the exponential mechanism checks that the utility and measure arrays are the same length, so the errors that you mention should not happen.
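As an illustration of the kind of check described, here is a minimal sketch. It is not diffprivlib's actual code: check_lengths is a hypothetical helper and the error message is made up.

```python
def check_lengths(utility, measure=None):
    """Hypothetical helper: reject a measure list whose length differs from utility."""
    if measure is not None and len(measure) != len(utility):
        raise ValueError(
            f"measure has {len(measure)} entries but utility has {len(utility)}"
        )
    return True


# Matching lengths pass; a mismatch is rejected before any probabilities are computed.
ok = check_lengths([0.0, -1.0, -2.0], [1.0, 2.0, 3.0])
try:
    check_lengths([0.0, -1.0, -2.0], [1.0])
    rejected = False
except ValueError:
    rejected = True
```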

Thanks for the clarification @naoise-h !