Hi, for background I mainly use R for audit work, but I am using MedCAT to perform NER+L to SNOMED concepts on a list of free-text diagnosis strings, which form a subsection of our infectious diseases consult notes, in order to audit our service. I have started with the publicly available v2_Snomed2025_MIMIC_IV model and am now trying to tune it with MedCAT Trainer using a fully anonymised dataset.
I have encountered a number of problems setting everything up, some probably trivial and due to my lack of experience with Docker and Python. claude.ai has helped point me in the right direction for patching things up, but I cannot speak to whether these are ‘good’ patches as my Python knowledge is limited.
I will separate them into different topics for ease and thought I would start with a couple of the easiest things in this post:
- In `docker_compose.yml`, the image `cogstacksystems/medcat-trainer:v3.4.1` doesn’t appear to exist on Docker Hub; I presume it might be an internal build which hasn’t been pushed yet. Solved by using `latest` instead of `v3.4.1`, but I don’t know whether this might have knock-on effects for the other issues I have subsequently encountered.
- When marking annotations as irrelevant, submitting a document throws a 500 error because MedCAT v2 uses a typed `LinkingFilters` object, not a dict.
Error:

    TypeError: 'LinkingFilters' object does not support item assignment
    cat.config.components.linking.filters['cuis_exclude'] = set()

File: /home/api/api/utils.py, lines 410-412

Before:

    if 'cuis_exclude' not in cat.config.components.linking.filters:
        cat.config.components.linking.filters['cuis_exclude'] = set()
    cat.config.components.linking.filters.get('cuis_exclude').update([cui])

Patch:

    if cat.config.components.linking.filters.cuis_exclude is None:
        cat.config.components.linking.filters.cuis_exclude = set()
    cat.config.components.linking.filters.cuis_exclude.add(cui)
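
For anyone who wants to reproduce the failure mode outside MedCAT Trainer, here is a minimal sketch of the same pattern. Note that `FiltersStandIn` and `exclude_cui` are hypothetical names I made up for illustration; the dataclass is only a stand-in for a typed filters object and is not MedCAT's actual `LinkingFilters` class, so its fields and defaults are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class FiltersStandIn:
    """Hypothetical stand-in for a typed filters object (NOT MedCAT's class)."""
    cuis: Set[str] = field(default_factory=set)
    cuis_exclude: Optional[Set[str]] = field(default_factory=set)


def exclude_cui(filters, cui):
    """Add `cui` to the exclusion set using attribute access only."""
    # Create the set first if the attribute is missing or None.
    if getattr(filters, "cuis_exclude", None) is None:
        filters.cuis_exclude = set()
    filters.cuis_exclude.add(cui)


filters = FiltersStandIn()

# Dict-style assignment fails on a plain typed object, which is the same
# kind of TypeError reported in the 500 error above.
try:
    filters["cuis_exclude"] = set()
    dict_style_error = None
except TypeError as exc:
    dict_style_error = str(exc)

# Attribute-style access works, and adding the same CUI twice is harmless.
exclude_cui(filters, "12345")
exclude_cui(filters, "12345")

# The helper also copes with an attribute that starts out as None.
unset_filters = FiltersStandIn(cuis_exclude=None)
exclude_cui(unset_filters, "67890")

print(dict_style_error)
print(filters.cuis_exclude, unset_filters.cuis_exclude)
```

The point of the sketch is only that a typed object needs attribute access where a dict would take item assignment; whether `cuis_exclude` can actually be `None` in MedCAT v2 is something I have not verified.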
Happy to open a pull request if the above solution is felt to be acceptable.