Using type IDs with the snomedct model


I am trying to run a set of sentences through the SNOMED CT MedCAT model to get a list of SCTIDs, filtered by type ID.

I am following the example in the tutorial “Annotating documents with the full MedCAT pipeline”.

Instead of the model in the example (“”), I am using “mc_modelpack_snomed_int_16_mar_2022…zip”.

To filter by type ID, I am using a TUI for clinical finding, T-02000 Clinical finding (finding), taken from SNOMED-CT_Analysis/Exploring a SNOMED-CT Release.ipynb (tomolopolis/SNOMED-CT_Analysis on GitHub), instead of the ones used in the example (such as T047, T048). This doesn’t work: I get a KeyError for the TUI. I suspect the TUI is not recognised as a type_id the way the UMLS ones in the example are, but I am unsure how to proceed so that I can get SCTIDs based on a type ID. Any suggestions, or has anyone done something similar with the SNOMED CT model?

medcat 1.5.0
python 3.10.5




I’m not fully familiar with what the T-02000 refers to or whether/where it is stored.

But the type_id field of the SNOMED-CT CDB is written here:

The description is simply taken from the parentheses at the end of the name:
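To illustrate what “taken from the parentheses” means, here is a minimal sketch that splits a name such as “Clinical finding (finding)” into the term and the bracketed description. This is not the MedCAT code itself; the function name split_name and the regular expression are my own for illustration.

```python
import re

# Matches a final "(...)" group at the very end of a name.
_TAG_RE = re.compile(r"\(([^()]+)\)\s*$")


def split_name(name):
    """Return (term, tag); tag is None when the name has no trailing parentheses."""
    match = _TAG_RE.search(name)
    if match is None:
        return name.strip(), None
    return name[:match.start()].strip(), match.group(1).strip()


print(split_name("Clinical finding (finding)"))
print(split_name("Clinical finding"))
```

Names without a bracketed suffix return None for the description, so they can be skipped when collecting type names.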

I am not sure whether/where there would be a list of what the type IDs correspond to. But if you find a concept with the correct type-name in the parentheses then you should be able to use that one.
You may have to look into addl_info['cui2original_names'] to find the original names with the brackets.
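A rough sketch of that lookup, run on toy dictionaries rather than a real CDB. It assumes cui2original_names maps each CUI to a set of names, and that you also have a CUI to type ID mapping and a type ID to CUIs mapping; I believe these are cdb.cui2type_ids and cdb.addl_info['type_id2cuis'] in MedCAT 1.x, but please check your own model pack. The type IDs “TYPE_A” and “TYPE_B” and both function names are made up for illustration.

```python
from collections import defaultdict


def type_ids_by_tag(cui2original_names, cui2type_ids):
    """Map each bracketed description, e.g. 'finding', to the type IDs seen with it."""
    tag2type_ids = defaultdict(set)
    for cui, names in cui2original_names.items():
        for name in names:
            name = name.strip()
            if name.endswith(")") and "(" in name:
                tag = name[name.rfind("(") + 1:-1].strip()
                tag2type_ids[tag].update(cui2type_ids.get(cui, ()))
    return dict(tag2type_ids)


def cuis_for_type_ids(type_id2cuis, wanted):
    """Collect CUIs for the wanted type IDs; report unknown IDs instead of raising KeyError."""
    cuis, missing = set(), []
    for type_id in wanted:
        if type_id in type_id2cuis:
            cuis.update(type_id2cuis[type_id])
        else:
            missing.append(type_id)
    return cuis, missing


# Toy data standing in for a real CDB (type IDs are hypothetical).
toy_names = {
    "404684003": {"Clinical finding (finding)", "Clinical finding"},
    "22298006": {"Myocardial infarction (disorder)"},
}
toy_cui2type_ids = {"404684003": {"TYPE_A"}, "22298006": {"TYPE_B"}}
toy_type_id2cuis = {"TYPE_A": {"404684003"}, "TYPE_B": {"22298006"}}

tag2type_ids = type_ids_by_tag(toy_names, toy_cui2type_ids)
finding_cuis, unknown = cuis_for_type_ids(toy_type_id2cuis, ["TYPE_A", "T-02000"])
print(tag2type_ids.get("finding"), finding_cuis, unknown)
```

An ID like T-02000 that is not a key in the type ID mapping ends up in the second return value rather than raising the KeyError from the question. The resulting set of CUIs is what you would then pass to the CUI filter, as the tutorial does with the UMLS type IDs.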

Here is a subset of SNOMED TUIs and their possible names that I had saved from something I ran locally (I looked them up in addl_info['cui2original_names'], but didn’t check too thoroughly):


Thanks so much for this! The subset of type_ids that you shared at the end is exactly what I needed; I just didn’t know where to find them, so it’s good to know for the future. Really appreciate your help!
