fixthestatusquo/proca

Better validation for Bulgarian EGN numbers

Mivr opened this issue · 22 comments

Mivr commented

In Bulgaria, we use EGN numbers.

They have a well-described pattern and a checksum: https://en.wikipedia.org/wiki/Unique_citizenship_number

At the moment the widget validates only the count of numbers and nothing more. This leads to more invalid signings.

Thus it will be good to have this validation implemented as described in Wikipedia (an example implementation NOT TESTED: https://samiwell.eu/php/validate-bulgarian-id-number-egn).

My question is if the project is open to a pull request adding the validation and is there a place in the code where it is meant for this validation to be added?

Mivr commented

Also, an NPM library is available for validation of these EGN numbers here: https://www.npmjs.com/package/bulgarian-control-numbers

tttp commented

Hi @Mivr,

just to be sure: you're talking about the ECI feature, right? Are you involved with one of the bulgarian partner and could you share the name of the organisation?

I like your suggestion very much, however, because of the security and regulation surrounding ECIs, we are likely to have to get the approach validated by the european commission or the bulgarian administration, in case we have a go ahead

A couple of issues:

  • is there an official page from bulgaria explaining the same thing as the wikipedia article?
  • the only two error messages we have are "please fill this field" (if empty) or "Invalid format. Format required: NNNNNNNNNN (N = digit)". We would need another text (in case the format is correct but checksum invalid)
  • and because a bulgarian citizen can fill in any language, it probably should be translated in all languages ;(

in summary, it's sadly more complicated than adding a libary and is likely to take a very long while to get everything aligned, even if we decide quickly to do it.

code, MIT licensed:

https://github.com/petarov/bulgarian-control-numbers/blob/master/src/egn.js#L56

Mivr commented

Hello @tttp,

Yes, I am talking about the ECI widget, sorry I missed that.

I am working for CAAI https://caai.bg/.

The official documentation can be found here: https://www.grao.bg/normact/Naredba-RD-02-20-9.docx it is only in Bulgarian as it is an official document. The section for the EGN is in chapter four "Глава четвърта" part one "Раздел 1".

I can handle also some of the translations if this is ok with you, I think we will need one string that is similar to: "EGN is invalid, please check the numbers". Do you require all the languages to be supported for this message in order to merge the change?

For the legal part, I will be happy if you are willing to do that as I am not sure I or CAAI can help. If I or CAAI can help, please inform me how.

Also, I took a look at the code and it is missing the day, month, and year validation, so I think it will be best to implement it from scratch, also adding a library for just this validation is a bit too much.

I can easily contribute the code if you think this can work in the end.

EDIT:
Ok, I took a look at the languages that you need, I can deliver the text translated into almost all the languages that you support. Some are a bit exotic and may take some time to find a translator.

Mivr commented

The European Commission is already using this validation here: https://eci.ec.europa.eu/025/public/#/screen/home
If you input incorrect EGN the message is "Invalid value".

tttp commented

Aaah, "invalid value" is nice and simple as generic, and already translated (the ECI gave us all the translations), good catch!

contributions are very much welcome!

happy to help you clone the repository and give you the CAAI config so you can test the end result, but it might be easier you provide me the function isValidEGN and we plug it in

Mivr commented

I do think it will be faster to just give you the method.

I copied the code and its tests below, as this is quite a critical piece of code I tested it quite some.

The code is in TS, I hope this is ok with you, if not feel free to transpile it. :)

uci-validator.ts

const UCI_LENGTH: number = 10;
const onlyNumbersRexEx: RegExp = /^\d+$/;

/**
 * Validation for Bulgarian UCI (EGN) number.
 *
 * More on the algorithm: https://en.wikipedia.org/wiki/Unique_citizenship_number
 *
 * This method validates the checksum and the date.
 * The date is validated to be in the past and to be a real date (33th of any month will not work for example).
 *
 * @param uci a string containing the UCI number
 */
export function isValidBulgarianUci(uci: string): boolean {
  if (uci.length !== UCI_LENGTH || !onlyNumbersRexEx.test(uci)) {
    return false;
  }

  const { year, month, day } = parseDate(uci);
  const dateValid: boolean = isDateValid(year, month, day);
  const dateInThePast: boolean = dateValid
    ? isDateInThePast(year, month, day)
    : false;
  const checksumValid = isChecksumValid(uci);

  return checksumValid && dateValid && dateInThePast;
}

function parseDate(uci: string): { year: number; month: number; day: number } {
  const day: number = Number(uci.substring(4, 6));
  let month: number = Number(uci.substring(2, 4));
  let year: number = Number(uci.substring(0, 2));

  if (month > 40) {
    year += 2000;
    month -= 40;
  } else if (month > 20) {
    year += 1800;
    month -= 20;
  } else {
    year += 1900;
  }

  return { year, month, day };
}

function isDateValid(year: number, month: number, day: number) {
  return month >= 0 && month < 12 && day > 0 && day <= daysInMonth(month, year);
}

function daysInMonth(month: number, year: number) {
  switch (month) {
    case 2:
      return (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0 ? 29 : 28;
    case 9:
    case 4:
    case 6:
    case 11:
      return 30;
    default:
      return 31;
  }
}

function isDateInThePast(year: number, month: number, day: number) {
  const birthDate = new Date(year, month, day);
  const birthDateNumeric = birthDate.valueOf();
  return !isNaN(birthDateNumeric) && Date.now() > birthDateNumeric;
}

const algorithmWeights: number[] = [2, 4, 8, 5, 10, 9, 7, 3, 6];

function isChecksumValid(uci: string) {
  const characters: string[] = uci.split("");
  let uciNumbersSum: number = 0;

  for (let index = 0; index < characters.length - 1; index++) {
    const uciNumber: number = Number(characters[index]);
    uciNumbersSum += uciNumber * algorithmWeights[index];
  }

  let sumModule = uciNumbersSum % 11;
  sumModule = sumModule < 10 ? sumModule : 0;

  return characters[9] === String(sumModule);
}

in folder tests uci-validator.test.ts:

import { isValidBulgarianUci } from "../uci-validator";

describe("Bulgarian UCI tests", function () {
  it("should detect correct UCI number from 1907", function () {
    expect(isValidBulgarianUci("0701270855")).toBe(true);
  });
  it("should detect correct UCI number from 2009", function () {
    expect(isValidBulgarianUci("0948220092")).toBe(true);
  });
  it("should detect correct UCI number from 1899", function () {
    expect(isValidBulgarianUci("9926177023")).toBe(true);
  });
  it("should detect UCI with wrong checksum", function () {
    expect(isValidBulgarianUci("0701270845")).toBe(false);
  });
  it("should detect UCI with invalid month, and correct checksum as incorrect", function () {
    expect(isValidBulgarianUci("0775270849")).toBe(false);
  });
  it("should detect February 29th on non leap year as incorrect", function () {
    expect(isValidBulgarianUci("0702290849")).toBe(false);
  });
  it("should detect February 29th on leap year as correct", function () {
    expect(isValidBulgarianUci("0802290844")).toBe(true);
  });
  it("should detect March 32th as incorrect", function () {
    expect(isValidBulgarianUci("0703320847")).toBe(false);
  });
  it("should detect less than 10 characters as incorrect", function () {
    expect(isValidBulgarianUci("07033")).toBe(false);
  });
  it("should detect presence of not numbers as incorrect", function () {
    expect(isValidBulgarianUci("0703320_47")).toBe(false);
  });
  it("should detect birthday from the future as incorrect", function () {
    expect(isValidBulgarianUci("3043290844")).toBe(false);
  });
  it.each([
    "0006241052",
    "0048195952",
    "0052214707",
    "0107130416",
    "0247236062",
    "0248126497",
    "0409197132",
    "0452284817",
    "0501059660",
    "0610043193",
    "0749024728",
    "0803206463",
    "0805051798",
    "0809142237",
    "0846093170",
    "0904039488",
    "0905156600",
    "0942164089",
    "0947285951",
    "1005094175",
    "1011099540",
    "1203054870",
    "1305044679",
    "1307033565",
    "1307109296",
    "1604071264",
    "1612089087",
    "1711048913",
    "1804309160",
    "1901132027",
    "1907164042",
    "1910104100",
    "1911024760",
    "2002019673",
    "2002270138",
    "2010240996",
    "2010250103",
    "2211249402",
    "2304290900",
    "2312301469",
    "2502274907",
    "2605126070",
    "2711200186",
    "3106070621",
    "3106157193",
    "3107189282",
    "3305164430",
    "3503207166",
    "3504128122",
    "3509280740",
    "4010255372",
    "4210093807",
    "4508035631",
    "4803060381",
    "4909107999",
    "5007221430",
    "5009016959",
    "5202106003",
    "5207255294",
    "5308281922",
    "5309037865",
    "5507225773",
    "5507230620",
    "5606039510",
    "5705315052",
    "5712285624",
    "5907041723",
    "6105313168",
    "6210187577",
    "6310068587",
    "6412217580",
    "6507031002",
    "6606030453",
    "6707252952",
    "6908165398",
    "7003061445",
    "7003145361",
    "7011256229",
    "7202161318",
    "7212094412",
    "7302069740",
    "7307066214",
    "7309083083",
    "7405193928",
    "7405222369",
    "7512254770",
    "7611187412",
    "7807249825",
    "7912248895",
    "8004279120",
    "8307304926",
    "8405168526",
    "8609092874",
    "8903028502",
    "8908020058",
    "8912221242",
    "9006256634",
    "9512167895",
    "9802064381",
  ])(
    "should detect 99 random correct UCI numbers as correct",
    function (uci: string) {
      expect(isValidBulgarianUci(uci)).toBe(true);
    }
  );
});

If there is anything else I can help with to get this validation merged, please let me know.

Mivr commented

@tttp Any update on this is the code enough, do you need anything else from me?

Mivr commented

@tttp I can also try to prepare a Pull request if you do not have the time, I just don't want to do that if you have already started working on this. Would you like a Pull Request with the changes?

tttp commented

sorry, missed the message, that would be awesome if you could PR it indeed. the licence of proca is agpl, so no problem to add this code, and thanks for your contribution!

tttp commented

@tttp Any update on this is the code enough, do you need anything else from me?

sorry, we had some on holidays and I got heavily sidetracked, back to you an BGN now.

tttp commented

Hi @Mivr , my turn to check on you ;)

do you need help testing and setting up? the configuration might be tricky, can you email support@fixthestatusquo.org so we can help you and give you the config needed to test properly?

Mivr commented

Hi @tttp similarly I had to deal with a short term sickness and I am still catching up on my work.

I will write you an email about the configuration, it is probable that someone else from my team will open the pull request as my time is pretty short lately.

Also from the EGN we can extract the birthday of the person and in BG the people that sign need to be above 18 years old. We also saw that such a validation is available for the German input style where it asks for the birthday so we would like to add also that check using the error message from the German validation, are you OK with that?

Should we add this to the server actually (it's in Elixir...)

tttp commented

@Mivr, I looped in @marcinkoziej.

As the existing validation is done on the server already, it might make sense to add it here too, but it means @marcinkoziej would need to do it (assuming no one else speaks elixir fluently)

this is the algo for the checksum (sum (digit * factor) % 11)

https://en.wikipedia.org/wiki/Unique_citizenship_number

  • @Mivr what is not working?
  • did we get staging credentials ot help you test ?
  • did you get rabbitmq credentials from us ?
tttp commented

ok, let's put back the context that was missing, sorry marcin:

the validation that we have now on the server was taken from the commission and is a "dumb" regex, checking that the format (number of digits) is correct

what @Mivr wisely requested was that we check not only if the format is valid, but if the content itself is correct, as id numbers in bulgaria have - for instance - a checksum.

we need to add it to proca, the easiest was to add the test on the client (Mivr has access to proca widget and I gave him the config used for their widget, but we got stuck on react-hook-form details and that's where we looped you in to see if it actually wouldn't make sense to implement it on the server

Clearer?

@Mivr is this correct:

function isDateValid(year: number, month: number, day: number) {
  return month >= 0 && month < 12 && day > 0 && day <= daysInMonth(month, year);
}

It seems month is 1..12 not 0..11

Mivr commented

@marcinkoziej Good catch, I missed this case

Mivr commented

I think it should be 1 to 12 you can test with the provided tests and maybe it's a good idea to add a test for month 0 and 12. The rest seems to be correct, only this check seems incorrect.

i.e. I think the code should be: month > 0 && month <=12

I tested with attached test case! Thanks.
Functionality is deployed to Fur Free Europe ECI now – please check!

Mivr commented

The validation works great.

However the error message says invalid format expected NNNNNNNNN where N is a digit. And this is a bit misleading as that part of the format is correct in the ECI website when there is an error like checksum they display just invalid format.

This hint is mainly so people don't put spaces, or hyphens, or dots in there.