adriennekline/psmpy

1:N matching error

Closed this issue · 2 comments

Love the library! The 1:many matching is giving me some trouble, though. I get an out-of-bounds index error when calling psm.knn_matched_12n with any how_many value greater than 1. Any suggestions? I've tried samples of 1000 and 5000, with no difference in the outcome. Oddly enough, the following work just fine: psm.knn_matched_12n(matcher='propensity_logit', how_many=1) and knn_matched(...).

The full error follows:
`---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/var/folders/qv/q03xmb2n3p5_53d09ngv3mfr0000gn/T/ipykernel_24577/268362318.py in
3
4 # Many:1 match: up to how_many=2. RETA study was 1:4
----> 5 psm.knn_matched_12n(matcher='propensity_logit', how_many=2)

~/opt/anaconda3/lib/python3.9/site-packages/psmpy/psmpy.py in knn_matched_12n(self, matcher, how_many)
450 else:
451 pass
--> 452 elements_to_remove.append(row[0])
453 indices_in_loop.append(row[0])
454 # if there are NO elements in elements to remove:

IndexError: list index out of range`

Great to hear you are otherwise enjoying it! To better help you, could you provide me with the following generic info:

  1. How large is the dataset?
  2. How many times are you trying to actually match on data?
  3. What is the size of the smallest class you have?
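If it helps, here is a rough sketch of one way to pull those numbers together. The `treatment` list and the `cohort_summary` helper are made up for illustration and are not part of psmpy; swap in your own treatment column.

```python
from collections import Counter

# Hypothetical treatment labels (1 = treated, 0 = control).
# Replace this with the values from your own treatment column.
treatment = [1] * 100 + [0] * 900


def cohort_summary(labels):
    """Return the total row count, per-class counts, and smallest class size."""
    counts = Counter(labels)
    return {
        "n_rows": len(labels),
        "class_counts": dict(counts),
        "smallest_class": min(counts.values()),
    }


summary = cohort_summary(treatment)
print(summary)
```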

My second question is important because the 'n' specified here is not how many you are looking to match to; it's how many times you want matching performed, e.g. 1:2, so if your smaller cohort was 100 you'd have 100 matched to 200. So I want to check whether this 'out of bounds' error would actually be expected.
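To make the 1:2 arithmetic concrete, here is a small sketch. It assumes each row from the larger cohort is used at most once, which is an assumption for illustration and not a statement about how psmpy works internally. Both helper names are hypothetical.

```python
def matches_needed(n_minor, how_many):
    """Rows from the larger cohort consumed by a 1:how_many match.

    Assumes each larger-cohort row is matched at most once (an assumption,
    not confirmed psmpy behaviour).
    """
    return n_minor * how_many


def pool_is_sufficient(n_minor, n_major, how_many):
    """True if the larger cohort has enough rows for a full 1:how_many match."""
    return matches_needed(n_minor, how_many) <= n_major


# A smaller cohort of 100 matched 1:2 ends up matched to 200 rows.
needed = matches_needed(100, 2)
print(needed)

# With only 150 rows in the larger cohort, a full 1:2 match cannot be filled.
print(pool_is_sufficient(100, 150, 2))
```

If the second check comes back `False` for your data, running out of candidates partway through would not be surprising.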

I've just tested this function on a dataset with n=2 and it works just fine. What version are you using?
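One way to check the installed version from the same environment as your notebook, using only the standard library (Python 3.8+). The `installed_version` helper is a made-up name for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


psmpy_version = installed_version("psmpy")
print(psmpy_version)
```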