Impact of filters on MedCAT annotations

komal21 · June 30, 2023, 4:57am

With MedCAT, how do I apply filters when creating particular projects, e.g. CUI-based filters, source-value-based filters and specific semantic-type filters?
What should I consider in order to improve the annotations?

mart.ratas · June 30, 2023, 10:33am

Hi, Komal!

It can depend on what you’re trying to accomplish exactly.
I’ll start by quoting the comment from CAT.train_supervised:

When filtering, the filters within the CAT model are used first, then the ones from MedCATtrainer (MCT) export filters, and finally the extra_cui_filter (if set). That is to say, the expectation is: extra_cui_filter ⊆ MCT filter ⊆ Model/config filter.

To elaborate a little bit:

Somewhat obviously, no concept can be trained if it’s not included in the CDB.
You can also set up filters in config.linking.filters. There’s some documentation here.

You can explicitly allow only a subset of CUIs
Or exclude a subset of CUIs

When performing supervised training, the MedCATtrainer export can define its own filters. These are applied on top of the filters in the config.
When performing supervised training, additional/extra CUI filters can be specified. These will be applied on top of the previous.
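
To make the layering concrete, here is a minimal sketch in plain Python that models each filter level as a set and narrows the previous one. It does not call MedCAT and is not how the library implements filtering; the function names (narrow, effective_cuis), the parameter names and the CUIs are all made up for illustration. It also assumes that an unset filter (None) means "no restriction at this level".

```python
from typing import Iterable, Optional, Set


def narrow(current: Set[str], allowed: Optional[Iterable[str]]) -> Set[str]:
    """Apply one filter level. None is taken to mean 'no filter at this level'."""
    if allowed is None:
        return set(current)
    return current & set(allowed)


def effective_cuis(
    cdb_cuis: Iterable[str],
    model_allow: Optional[Iterable[str]] = None,       # stand-in for the config allow-list
    model_exclude: Optional[Iterable[str]] = None,     # stand-in for the config exclude-list
    mct_filter: Optional[Iterable[str]] = None,        # stand-in for the trainer export filter
    extra_cui_filter: Optional[Iterable[str]] = None,  # stand-in for the extra filter
) -> Set[str]:
    """Return the CUIs that survive every level, applied in the order described above."""
    cuis = set(cdb_cuis)                  # nothing outside the CDB can be trained
    cuis = narrow(cuis, model_allow)      # model/config filter first
    if model_exclude is not None:
        cuis -= set(model_exclude)
    cuis = narrow(cuis, mct_filter)       # then the trainer export filter
    cuis = narrow(cuis, extra_cui_filter) # finally the extra filter
    return cuis


# Hypothetical CUIs, chosen so that each level is a subset of the previous one.
cdb = {"C001", "C002", "C003", "C004"}
trainable = effective_cuis(
    cdb,
    model_allow={"C001", "C002", "C003"},
    mct_filter={"C001", "C002"},
    extra_cui_filter={"C001"},
)
print(sorted(trainable))
```

In this model, a CUI listed at a later level but excluded at an earlier one never comes back, which is why the levels are expected to be nested subsets.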

With that said, if you’re only working with a small subset of CUIs, you might be better off filtering your CDB itself, i.e. with CDB.filter_by_cui.
The advantage of this approach is that you’d be working with a smaller model (in terms of file size on disk as well as memory footprint while using the model).
The disadvantage could be that you’d need to redo the training if/when you wanted to add new CUIs.

If you have any further questions, don’t hesitate to ask.
