MedCATtrainer "Error: string indices must be integers"

Hi, I successfully created a test project with the default model that came with MedCATtrainer. I have now created a larger project (3000 documents) with the SNOMED model (mc_modelpack_snomed_int_16_mar_2022_25be3857ba34bdd5).

When I try to open the first document, I get this error:

Full Error:

Traceback (most recent call last):
  File "/home/api/./api/views.py", line 292, in prepare_documents
    add_annotations(spacy_doc=spacy_doc,
  File "/home/api/./api/utils.py", line 68, in add_annotations
    update_concept_model(concept, project.concept_db, cat.cdb)
  File "/home/api/./api/utils.py", line 124, in update_concept_model
    icd_codes_mods = _get_or_create_linked_code(ICDCode, icd_codes, cdb_model)
  File "/home/api/./api/utils.py", line 142, in _get_or_create_linked_code
    if len(mod.objects.filter(code=code['code'])) == 0:
TypeError: string indices must be integers
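
My reading of the last frame, as a minimal sketch: `code['code']` only works if each entry is a dict, so the error suggests the entries coming out of the CDB are plain strings. That is an assumption on my part; I have not inspected the model's CDB. The names `extract_code` and `normalise_code` and the sample values are hypothetical stand-ins, not MedCATtrainer code.

```python
# Hypothetical stand-in for the lookup at utils.py line 142 (not the real code).
def extract_code(entry):
    return entry['code']


def normalise_code(entry):
    """Return the code string whether the entry is a dict or a plain string."""
    if isinstance(entry, dict):
        return entry['code']
    return str(entry)


dict_entry = {'code': 'I10', 'name': 'example description'}  # shape the lookup expects
str_entry = 'I10'                                            # shape I suspect it receives

# Indexing a str with a str key raises the TypeError seen in the traceback.
try:
    extract_code(str_entry)
    raised = False
except TypeError:
    raised = True

print(raised)
print(normalise_code(dict_entry), normalise_code(str_entry))
```

If this is what is happening, the stored codes and the format the trainer expects disagree, rather than the documents themselves being the problem.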

I used the cdb.dat and vocab.dat that came with the model. Perhaps that’s the problem? Do I need to re-create them?

In the Django server logs I see:

medcattrainer_1  | ERROR 2022-12-29 10:05:03,306 log.py l:222:Internal Server Error: /api/cache-model/3/
medcattrainer_1  | Traceback (most recent call last):
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/django/core/handlers/exception.py", line 34, in inner
medcattrainer_1  |     response = get_response(request)
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/django/core/handlers/base.py", line 115, in _get_response
medcattrainer_1  |     response = self.process_exception_by_middleware(e, request)
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/django/core/handlers/base.py", line 113, in _get_response
medcattrainer_1  |     response = wrapped_callback(request, *callback_args, **callback_kwargs)
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
medcattrainer_1  |     return view_func(*args, **kwargs)
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/django/views/generic/base.py", line 71, in view
medcattrainer_1  |     return self.dispatch(request, *args, **kwargs)
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 509, in dispatch
medcattrainer_1  |     response = self.handle_exception(exc)
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 469, in handle_exception
medcattrainer_1  |     self.raise_uncaught_exception(exc)
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
medcattrainer_1  |     raise exc
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 506, in dispatch
medcattrainer_1  |     response = handler(request, *args, **kwargs)
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/decorators.py", line 50, in handler
medcattrainer_1  |     return func(*args, **kwargs)
medcattrainer_1  |   File "/home/api/./api/views.py", line 658, in cache_model
medcattrainer_1  |     get_medcat(CDB_MAP=CDB_MAP, VOCAB_MAP=VOCAB_MAP,
medcattrainer_1  |   File "/home/api/./api/utils.py", line 335, in get_medcat
medcattrainer_1  |     cdb = CDB.load(cdb_path)
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/medcat/cdb.py", line 413, in load
medcattrainer_1  |     data = dill.load(f)
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/dill/_dill.py", line 313, in load
medcattrainer_1  |     return Unpickler(file, ignore=ignore, **kwds).load()
medcattrainer_1  |   File "/usr/local/lib/python3.9/site-packages/dill/_dill.py", line 525, in load
medcattrainer_1  |     obj = StockUnpickler.load(self)
medcattrainer_1  | EOFError: Ran out of input
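
As far as I understand, "Ran out of input" is what the unpickler raises when the file it reads is empty or cut short, so the uploaded cdb.dat may not have been stored completely. A minimal sketch of how I would check this, using the standard-library `pickle` instead of `dill` to keep it self-contained; `is_empty_or_missing` is a hypothetical helper and the temporary file only stands in for the real cdb.dat.

```python
import os
import pickle
import tempfile


def is_empty_or_missing(path):
    """Return True if the path is not a regular file or has zero bytes."""
    return not os.path.isfile(path) or os.path.getsize(path) == 0


# Create an empty stand-in file (hypothetical; the real file lives in the trainer's media folder).
with tempfile.NamedTemporaryFile(suffix='.dat', delete=False) as tmp:
    empty_path = tmp.name

empty = is_empty_or_missing(empty_path)

# Unpickling an empty file fails with EOFError, like the log above.
try:
    with open(empty_path, 'rb') as f:
        pickle.load(f)
    outcome = 'loaded'
except EOFError as exc:
    outcome = f'EOFError: {exc}'
finally:
    os.remove(empty_path)

print(empty, outcome)
```

Running the same size check against the cdb.dat inside the container would show whether the upload is the issue.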

I then tried changing a few things (a smaller dataset, re-creating the CDB with create_model_pack…), but it still crashes with the same message:

Full Error:

Traceback (most recent call last):
  File "/home/api/./api/views.py", line 292, in prepare_documents
    add_annotations(spacy_doc=spacy_doc,
  File "/home/api/./api/utils.py", line 68, in add_annotations
    update_concept_model(concept, project.concept_db, cat.cdb)
  File "/home/api/./api/utils.py", line 124, in update_concept_model
    icd_codes_mods = _get_or_create_linked_code(ICDCode, icd_codes, cdb_model)
  File "/home/api/./api/utils.py", line 142, in _get_or_create_linked_code
    if len(mod.objects.filter(code=code['code'])) == 0:
TypeError: string indices must be integers

What am I doing wrong?