How to improve recall and make MedCAT find the correct word combinations?

I have set up a MedCAT system locally with the prebuilt UMLS model (umls_sm_wstatus_2021_oct) and I am looking to find disorders.
I am wondering why the MedCAT system has issues correctly finding phrases like these:

  • premature ventricular contractions (here it finds only the word "contractions", whereas in another place in the same text it is able to find "occasional premature ventricular contractions")
  • known drug allergies (here it does not find anything)
  • acute distress (here it does not find anything)
  • frequent ectopic beats (here it finds only the words "ectopic beats")
  • mild epigastric and right upper quadrant tenderness (here it finds only the word "tenderness")

I have many more examples where I don't quite understand why MedCAT is having issues.

Do I need to tweak the MedCAT setup somehow? It seems that recall in particular is weak… How do I improve this?

Hi @bkakke

So the publicly available models are minimally trained on public data, MIMIC-III if I remember correctly.
Currently MedCAT only returns the "most similar" concept for a span of text.

I would first have a look at cat.cdb.config and see if the prebuilt model's configuration can be optimised for your use case.

If you have access to data then I would conduct some unsupervised training.
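
For reference, a minimal sketch of what an unsupervised training run could look like (the notes file name and the output pack name are placeholders for your own data and paths):

from medcat.cat import CAT

# load the prebuilt model pack (example path)
cat = CAT.load_model_pack('./umls_sm_wstatus_2021_oct/umls_sm_wstatus_2021_oct')

# one clinical note per line in 'my_notes.txt' (placeholder file)
with open('my_notes.txt') as f:
    notes = [line.strip() for line in f if line.strip()]

# unsupervised training updates the concept context vectors and training counts
cat.train(notes)

# save the updated model as a new model pack
cat.create_model_pack('umls_sm_wstatus_retrained')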

Lastly, have you checked out MedCATtrainer? This is a supervised training step for the models, where annotators can label a subsample of documents with UMLS concepts, from which you can train and produce performance metrics. If you haven't already done so, I would highly recommend using this tool.

Hi Anthony,
Yes, I believe it's MIMIC-III - but that's also a relatively big dataset, I think - isn't it?
As I remember, it's something like 2.1 million documents. Do you think a bigger dataset is required?

In cat.cdb.config there are quite a lot of options. Which ones do you recommend tweaking?

I do have some data: the ShARe/CLEF eHealth 2013 dataset, which I use for benchmarking. When you say "I would conduct some unsupervised training", do you mean using MedCATtrainer, or something else?

Yes, that's correct - MIMIC-III. 2.1 million documents is more than enough and should be representative of your dataset (which is based on MIMIC-II, I believe).

So when the model has been run across a large corpus of documents, to use one of your examples, "contractions" would have been encountered significantly more often than "premature ventricular contractions". There is a config setting that weights more frequently encountered concepts (see below). The model will then present the single "most similar" concept for a phrase.

To double-check this you can use cat.cdb.cui2count_train[<cui>] to see how many times the model has encountered the concept during training.
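
For example, something along these lines (assuming the model pack is already loaded as cat; the keys are as in a typical get_entities output):

# compare how often the linked concepts were seen during training
ents = cat.get_entities("occasional premature ventricular contractions")['entities']
for ent in ents.values():
    print(ent['source_value'], ent['cui'], cat.cdb.cui2count_train.get(ent['cui'], 0))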

Configs
The main configs I look at when exploring are:
cat.config.linking['prefer_primary_name']
cat.config.linking['prefer_frequent_concepts'] # make sure to reduce this value when trained on large datasets, as frequent short concepts will otherwise be encountered far more often
cat.config.general['spell_check_len_limit']
cat.config.ner['min_name_len']
cat.config.ner['check_upper_case_names']
cat.config.ner['upper_case_limit_len']

Lastly, what I forgot to mention: if you don't want to retrieve all concepts, try adding a whitelist filter to the model, so that only CUIs present in your list are retrieved:

cat.config.linking['filters'] = {'cuis':<set of cuis here>}
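
For instance, a rough sketch of building a disorder-only whitelist from the CDB's semantic type information (this assumes cui2type_ids is populated in your CDB; T047, "Disease or Syndrome", is just one example type ID):

# keep only concepts whose UMLS semantic type is T047 (Disease or Syndrome)
disorder_cuis = {cui for cui, type_ids in cat.cdb.cui2type_ids.items() if 'T047' in type_ids}
cat.config.linking['filters'] = {'cuis': disorder_cuis}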

For all configs, have a look here.
Finding the right configuration balance for all concepts is not so easy, though. But have an explore and let us know what works best for your use case.

Another option is further training via MedCATtrainer (supervised training) to increase the number of examples the model is exposed to, thus shifting the model's preference from one concept to another.

Thanks for those pointers… as an example, I am trying the following:

from medcat.cat import CAT
cat = CAT.load_model_pack('./umls_sm_wstatus_2021_oct/umls_sm_wstatus_2021_oct')
cat.config.linking['prefer_primary_name'] = 0.35
cat.config.linking['prefer_frequent_concepts'] = 0.05  # reduced, since frequent short concepts will otherwise be preferred
# cat.config.general['spell_check_len_limit']
# cat.config.ner['min_name_len']
# cat.config.ner['check_upper_case_names']
# cat.config.ner['upper_case_limit_len']
text = "21 year old with delayed pp hemorrhage"
print(cat.get_entities(text))

Here it finds only "Hemorrhage", but it is supposed to find "delayed pp hemorrhage". I have tried adjusting all the parameters you suggested. Are you able to make it find "delayed pp hemorrhage" in that sentence?

Sounds like you need to use MedCATtrainer to recognise that novel tri-gram (as well as other synonymous phrases).

The current setup of MedCAT struggles with major changes to the key phrase.

@jthteo, you are right. MedCATtrainer would help here.

Alternatively, if you know a list of potential phrase variations, you can add them straight into the model.
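
A rough sketch of what that could look like (the CUI below is a placeholder - use the real UMLS CUI of the concept you want the phrase linked to):

# register 'delayed pp hemorrhage' as a new name for an existing concept
# and give it an initial training example
cat.add_and_train_concept(cui='C0000000', name='delayed pp hemorrhage')  # placeholder CUI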

Aha, that's interesting.
So, it sounds like I need to use MedCATtrainer more.
Regarding MedCATtrainer, what is the normal workflow for it?
By that I mean, how much text should I annotate, and how do I figure out what text should be annotated?
Is the normal workflow basically that I sit and find all the cases where MedCAT fails and then correct them in MedCATtrainer?

Thank you for your help.

Have a look at the template workflows present here:

Generally, producing an annotated dataset creates a "gold standard" which you can use to train models on and benchmark new models against.

Re. your other questions:

By that I mean, how much text should I annotate, and how do I figure out what text should be annotated?

This is a hard question to answer, as it depends on the number of variations your concept may be represented as, and on their alternative meanings. E.g. the term "Seizure" would require a lot of training because it can be represented as "fit", "attack", "sz", "episode", etc., and these terms may be confused with alternative uses ("The patient is healthy and fit"), which can have a completely different meaning. Whereas "Epilepsy" has few representations, and they do not overlap with other concepts.

"How accurate is accurate enough?" will ultimately depend on your own use case.

Is the normal workflow basically that I sit and find all the cases where MedCAT fails and then correct them in MedCATtrainer?

Yes, that is generally correct. Training concepts which already perform well may not be the best use of one's time.
As part of the workflow, it is always good to first validate the performance of a model across datasets, as documents may be written differently across departments, hospitals etc., cover different varieties of diseases, and use different acronyms and expressions.

Aha, great. Thank you for this clarification :slight_smile:

Follow-up question:
If I already have some annotated documents saved in format X, can I use them to train MedCAT (probably by converting to whatever format MedCAT accepts)?
What I mean is, can I train MedCAT without using the MedCATtrainer UI if I have pre-annotated documents somehow?

Absolutely!

If you want, I can send across a template of an example MedCATtrainer export? You can then copy that format.

Ah yes - that would be great. :slight_smile:

Thank you

@anthony.shek Do you think you could post an example of the format in here and how to import and use it?

Yes, sure. When I have a moment today I'll post something here :slight_smile:

I’ll point out that an example MedCATTrainer export is available on the MedCAT repo (tests->resources):
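
For orientation, the export is a JSON file of projects, documents and annotations. Below is a rough sketch of building one from pre-annotated documents and training on it; the field names should be double-checked against the example export in the repo, the CUI is a placeholder, and depending on your MedCAT version the training method may be called train_supervised or train_supervised_from_json:

import json

mct_export = {
    "projects": [{
        "name": "my_project",
        "documents": [{
            "name": "doc_1",
            "text": "21 year old with delayed pp hemorrhage",
            "annotations": [{
                "cui": "C0000000",              # placeholder CUI
                "value": "delayed pp hemorrhage",
                "start": 17,                     # character offsets into the text
                "end": 38,
                "correct": True
            }]
        }]
    }]
}

with open('mct_export.json', 'w') as f:
    json.dump(mct_export, f)

# supervised training directly from the export-style file
cat.train_supervised(data_path='mct_export.json', nepochs=1)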

Ah great… and then I can just use it directly in the training here: MedCAT/meta_cat.py at master · CogStack/MedCAT · GitHub

correct?