Low Level Design System
Problem Statement Given 2 documents "Buyer.csv" & "Supplier.csv", build a solution that reconciles / matches the txns. The solution should -:
- categorize the txns into Exact / Partial / Only in Buyer / Only in Supplier categories
- handle matching of number / string / date
Definitions
Exact Match - 2 elements are an exact match if they are exactly the same Partial Match - A. Number1, Number2 match partially if Number2 is in range of Number1 +/- threshold B. String1, String2 match partially if String2 is similar to String1 C. Date1, Date2 match partially if Date2 is in range of Date1 +/- threshold D. If there are multiple partial matches, the ‘best partial match’ should show up
Notes 1. Working code is super important 2. Code quality is table stakes 3. System Design should be Extensible / Flexible 4. Can use open source libraries with proper justification
Assumptions Data Type assumptions GSTIN, BillNo is String GST Rate, Taxable Value, IGST, CGST, SGST, Total are double values For any cell with empty value (number type), 0 value is assumed Threshold assumptions Date threshold is taken in form of number of days For string similarity, threshold value is provided by user Two strings are similar if s1 can be converted to s2 by total of threshold operations where an operation is insert/delete/replace Matching assumptions Matching Priority : Exact > Partial > OnlyInBuyer=OnlyInSupplier Best Partial Match Criteria - The closest match (out of all partial matches)
Example
Doc1 Name RegNum Salary Ram 12A3 9000 Shyam 12A4 8500
Doc2 Name RegNum Salary Raam 12A3 9000 Shyam 12A4$ 8501
Output Name RegNum Salary Category Name RegNum Salary Ram 12A3 9000 Partial Raam 12A3 9000 Shyam 12A4 8500 Partial Shyam 12A4$ 8501