Error Report: Training Meta Annotations with CogStack Scripts + MedCAT
Environment
- MedCAT version: 1.16.0
- Python version: 3.10
- Workflow: Training meta_model using CogStack scripts and project-exported annotations.
Steps to Reproduce
- Exported project annotations including meta annotations (e.g., Presence).
- Attempted to train the meta_model:
import json
import os

save_dir_path = "test_meta_" + meta_model  # where to save the meta_model and results
results = mc.train_from_json(mctrainer_export_path, save_dir_path=save_dir_path)

# Save results (the with-block closes the file once the report is written)
results_path = os.path.join(save_dir_path, "meta_" + meta_model + "_results.json")
with open(results_path, "w") as f:
    json.dump(results["report"], f)
- Encountered the following error during training.
Full Error Traceback
Exception Traceback (most recent call last)
Cell In[26], line 13
    results = mc.train_from_json(mctrainer_export_path, save_dir_path=save_dir_path)
File /.../site-packages/medcat/meta_cat.py:162, in MetaCAT.train_from_json(...)
    return self.train_raw(data_loaded, save_dir_path, data_oversampled=data_oversampled)
File /.../site-packages/medcat/meta_cat.py:249, in MetaCAT.train_raw(...)
    category_name = g_config.get_applicable_category_name(data)
    if category_name is None:
        raise Exception(
            "The category name does not exist in this json file. You’ve provided ‘{}’, "
            "while the possible options are: {}. Additionally, ensure the populate the "
            "‘alternative_category_names’ attribute to accommodate for variations."
            .format(category_name, " | ".join(list(data.keys())))
        )
Exception:
The category name does not exist in this json file. You’ve provided ‘None’, while the possible options are: .
Additionally, ensure you populate the ‘alternative_category_names’ attribute to accommodate for variations.
Observations
- In the MedCAT JSON export I can see:
"meta_anno_defs": [
    {"name": "Presence", "values": ["False", "Hypothetical", "True"]},
    {"name": "Subject/Experiencer", "values": ["Other", "Patient", "Relative"]},
    {"name": "Time", "values": ["Future", "Past", "Recent"]}
],
"relation_anno_defs": []
- In the modelpack, the folder Presence exists as expected.
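The meta_anno_defs block above only lists the task definitions, so it may be worth checking whether the individual annotations in the export actually carry meta annotation values. The helper below is a minimal sketch for that check. The function name count_meta_annotation_names is hypothetical, and the projects / documents / annotations / meta_anns layout is an assumption about the export that may not match this file exactly.

```python
from collections import Counter


def count_meta_annotation_names(export: dict) -> Counter:
    """Count how often each meta-annotation name appears on annotations.

    ASSUMPTION: the export uses a projects -> documents -> annotations
    layout, and each annotation may carry "meta_anns" either as a dict
    keyed by task name or as a list of dicts with a "name" field.
    """
    counts: Counter = Counter()
    for project in export.get("projects", []):
        for document in project.get("documents", []):
            for annotation in document.get("annotations", []):
                meta_anns = annotation.get("meta_anns") or {}
                if isinstance(meta_anns, dict):
                    names = list(meta_anns.keys())
                else:
                    names = [m.get("name") for m in meta_anns if isinstance(m, dict)]
                # Skip missing or empty names
                counts.update(n for n in names if n)
    return counts


# Hypothetical miniature export: one labelled annotation, one unlabelled.
sample_export = {
    "projects": [{
        "documents": [{
            "annotations": [
                {"cui": "C1", "meta_anns": {"Presence": {"name": "Presence", "value": "True"}}},
                {"cui": "C2", "meta_anns": {}},
            ],
        }],
    }],
}
print(count_meta_annotation_names(sample_export))
```

Running this on the real export (loaded with json.load from mctrainer_export_path) and getting an empty counter would be consistent with the empty options list in the error message, although it would not prove that this is the cause.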
Issue
Despite having Presence defined in both the JSON export and the modelpack, training fails with:
- category_name = None
- Possible options list is empty ([]).
This suggests that:
- The JSON structure may not match what MetaCAT.train_from_json expects, or
- The category name mapping (alternative_category_names) is not being resolved correctly.
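One detail from the traceback may help separate these two hypotheses. The exception is raised only when get_applicable_category_name(data) returns None, and the message formats that same local variable, so the 'None' in the message does not necessarily mean the configured category name is unset. The empty options list comes from data.keys(), which suggests no categories were found in the loaded data at all. The sketch below illustrates why alternative names cannot help in that situation. resolve_category_name is a hypothetical stand-in written for illustration and is not MedCAT's actual implementation.

```python
from typing import Iterable, List, Optional


def resolve_category_name(
    category_name: Optional[str],
    alternative_category_names: List[str],
    available_names: Iterable[str],
) -> Optional[str]:
    """Return the first configured name that is present in the data.

    HYPOTHETICAL stand-in for the library's lookup, NOT MedCAT's code:
    try the primary name first, then each alternative, in order.
    """
    available = set(available_names)
    for candidate in [category_name, *alternative_category_names]:
        if candidate is not None and candidate in available:
            return candidate
    return None


# With categories present, an alternative name can resolve the lookup.
print(resolve_category_name("Status", ["Presence"], ["Presence", "Time"]))

# With no categories in the data, no configuration can match.
print(resolve_category_name("Presence", ["Status"], []))
```

Under this reading, the first hypothesis (the loaded data exposing no categories) would need to be ruled out before looking at alternative_category_names, since any name mapping fails against an empty set of options.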