MedCAT for Heart Disease Concept NER and model fine-tuning

I’m interested in running MedCAT to extract all Heart Disease concepts, as listed in the NHSDigital SNOMED CT Browser, from some clinical text.

How do I load MedCAT with these concepts and extract them from text?

How do I collect training data and fine-tune a model for this use case?

Without going into too much detail, the steps are pretty straightforward: first build a model, then train it (both the unsupervised and supervised training steps), then extract data.

This discourse group is pretty responsive so I tend to throw questions here and someone debugs my issues within the same day! Anyway:

Create your own model:

  1. Construct a model: vocab, CDB (concept database), config, etc.
  2. Find a corpus of documents similar to the ones you want to extract information from, and follow the unsupervised training steps.
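
The two steps above might look roughly like the sketch below. It assumes the MedCAT v1.x Python API (`CDB.load`, `Vocab.load`, `CAT.train`, `create_model_pack`); newer releases may organise training differently, so check the docs for your version. The `iter_documents` helper, the function name `build_and_pretrain`, and all paths are hypothetical placeholders, not part of MedCAT.

```python
from pathlib import Path
from typing import Iterator


def iter_documents(folder: str) -> Iterator[str]:
    """Yield the text of every non-empty .txt file in `folder` (hypothetical corpus layout)."""
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        if text:
            yield text


def build_and_pretrain(cdb_path: str, vocab_path: str, corpus_dir: str, out_dir: str) -> None:
    """Assemble a CAT model from an existing CDB and vocab, then train it on free text."""
    # Imported here so the helper above can be used without MedCAT installed.
    # These imports assume MedCAT v1.x.
    from medcat.cat import CAT
    from medcat.cdb import CDB
    from medcat.vocab import Vocab

    cdb = CDB.load(cdb_path)        # concept database, e.g. one built from SNOMED CT
    vocab = Vocab.load(vocab_path)  # vocabulary with word embeddings
    cat = CAT(cdb=cdb, config=cdb.config, vocab=vocab)

    # Unsupervised pass: no labels needed, just documents similar to your target text.
    cat.train(iter_documents(corpus_dir))

    # Save the CDB, vocab and config together as one model pack.
    cat.create_model_pack(out_dir)
```

You would call it as something like `build_and_pretrain("cdb.dat", "vocab.dat", "corpus/", "models/")`, with the file names swapped for your own.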

Once you have a pretrained model, it is time to fine-tune it.

Pre-trained Model:

  1. Fine-tune the model: use MedCATtrainer to create a labelled dataset for supervised training and fine-tuning. Use the same labelling step to create a training dataset for your own customisable meta-annotations.
  2. Run your model and annotate documents with the full MedCAT pipeline, including meta-annotations.
  3. Create fancy visualisations of the insights from big data.
  4. Show off your work to the MedCAT community through this discourse group :stuck_out_tongue:
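
On the original question of keeping only Heart Disease concepts: one way is to expand the Heart Disease concept (SNOMED CT 56265001) to all of its descendants and then filter the extracted entities against that set. Below is a minimal sketch in plain Python on toy data. The child codes are made-up placeholders rather than real SNOMED CT identifiers, and the result dictionary is hand-written in the shape I am assuming `cat.get_entities(text)` returns in MedCAT v1.x.

```python
from collections import deque
from typing import Dict, Iterable, List, Set

HEART_DISEASE = "56265001"  # SNOMED CT concept: Heart disease (disorder)


def descendants(root: str, parent_to_children: Dict[str, Iterable[str]]) -> Set[str]:
    """Return `root` plus every concept reachable below it (breadth-first walk)."""
    seen = {root}
    queue = deque([root])
    while queue:
        for child in parent_to_children.get(queue.popleft(), ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def keep_concepts(result: dict, wanted: Set[str]) -> List[dict]:
    """Keep only entities whose CUI is in `wanted`, ordered by position in the text."""
    hits = [ent for ent in result.get("entities", {}).values() if ent["cui"] in wanted]
    return sorted(hits, key=lambda ent: ent["start"])


# Toy parent -> children map; the child codes are placeholders, not real SNOMED CT IDs.
toy_hierarchy = {
    HEART_DISEASE: ["child-a", "child-b"],
    "child-a": ["grandchild-a1"],
}

# Hand-written stand-in for the output of cat.get_entities(text) (assumed shape).
toy_result = {
    "entities": {
        0: {"cui": "unrelated", "source_value": "cough", "start": 0, "end": 5},
        1: {"cui": "grandchild-a1", "source_value": "MI", "start": 20, "end": 22},
        2: {"cui": HEART_DISEASE, "source_value": "heart disease", "start": 30, "end": 43},
    }
}

wanted = descendants(HEART_DISEASE, toy_hierarchy)
heart_hits = keep_concepts(toy_result, wanted)
print([ent["source_value"] for ent in heart_hits])  # ['MI', 'heart disease']
```

With a real model, `toy_result` would be replaced by `cat.get_entities(text)`, and the hierarchy would come from your CDB. In the MedCAT v1.x tutorials the parent-to-child map is stored under `cat.cdb.addl_info['pt2ch']`, but treat that key as an assumption and check it against the CDB you actually built.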