We have a database of claims, procedures, and diagnoses, along with a header table. The header table uniquely defines records by MBR_ID, CLM_ID, and CLM_HASH_KEY. The expected output is that every claim in the clm_details table will have multiple CLM_HASH_KEY values per CLM_ID.
We define the metadata such that claims, procedures, and diagnoses are children and linked to the header, based on the CLM_HASH_KEY:
So clearly something is wrong with the model such that the relationship is not being captured properly. I have done a few tests and am not certain the issue here. I’m fairly certain it is probably a way I am specifying the metadata. Any advice on how to diagnose this would be appreciated.
I believe that since CLM_HASH_KEY is being labeled as an id but not a foreign key, the column is just being generated as a random id. To confirm, after fitting your synthesizer, you can run
Yes, going through some tests it looks like the primary issue is that the newly synthesized header table doesn’t maintain the correct cardinality (e.g. 1 MBR_ID should have 2-5 CLM_IDs, which should have 5-6 CLM_HASH_KEYS). However it does appear that splitting the tables up and adding HASH_KEYS as a new foreign key does work (albeit with a bit of extra work).
Glad you were able to get it to work! For more reference, the reason I suggested that is because the model will not learn cardinality for columns that are labeled with an sdtype of id unless they are a primary or foreign keys. This is because by default, id columns are dropped before modeling.
Can I ask what extra work you needed to do?
Thanks!