Error Report: Training Meta Annotations with CogStack Scripts + MedCAT

Samora · September 5, 2025, 10:18am

Environment

MedCAT version: 1.16.0
Python version: 3.10
Workflow: Training meta_model using CogStack scripts and project-exported annotations.

Steps to Reproduce

Exported project annotations including meta annotations (e.g., Presence).
Attempted to train the meta_model:

import json
import os

save_dir_path = "test_meta_" + meta_model  # where to save the meta_model and results
results = mc.train_from_json(mctrainer_export_path, save_dir_path=save_dir_path)

# Save results (the context manager closes the file handle after writing)
results_path = os.path.join(save_dir_path, "meta_" + meta_model + "_results.json")
with open(results_path, "w") as f:
    json.dump(results["report"], f)

Encountered the following error during training.

Full Error Traceback

Exception                                 Traceback (most recent call last)
Cell In[26], line 13
    results = mc.train_from_json(mctrainer_export_path, save_dir_path=save_dir_path)

File /.../site-packages/medcat/meta_cat.py:162, in MetaCAT.train_from_json(...)
    return self.train_raw(data_loaded, save_dir_path, data_oversampled=data_oversampled)

File /.../site-packages/medcat/meta_cat.py:249, in MetaCAT.train_raw(...)
    category_name = g_config.get_applicable_category_name(data)
    if category_name is None:
        raise Exception(
            "The category name does not exist in this json file. You’ve provided ‘{}’, "
            "while the possible options are: {}. Additionally, ensure the populate the "
            "‘alternative_category_names’ attribute to accommodate for variations."
            .format(category_name, " | ".join(list(data.keys())))
        )

Exception:
The category name does not exist in this json file. You’ve provided ‘None’, while the possible options are: .
Additionally, ensure you populate the ‘alternative_category_names’ attribute to accommodate for variations.

Observations

In the MedCAT JSON export I can see:

"meta_anno_defs": [
  {"name": "Presence", "values": ["False", "Hypothetical", "True"]},
  {"name": "Subject/Experiencer", "values": ["Other", "Patient", "Relative"]},
  {"name": "Time", "values": ["Future", "Past", "Recent"]}
],
"relation_anno_defs": []

In the modelpack, the folder Presence exists as expected.

Issue

Despite having Presence defined in both the JSON export and the modelpack, training fails with:

category_name = None
Possible options list is empty ([]).

This suggests that:

The JSON structure may not match what MetaCAT.train_from_json expects, or
The category name mapping (alternative_category_names) is not being resolved correctly.
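The two hypotheses above can be told apart with a toy lookup. This is a minimal sketch, assuming the resolution is a plain membership test on the configured category name followed by the alternative names; resolve_category_name is a hypothetical helper written for illustration, not MedCAT's implementation.

```python
from typing import Iterable, List, Optional


def resolve_category_name(
    configured_name: str,
    alternative_names: List[str],
    available_names: Iterable[str],
) -> Optional[str]:
    """Hypothetical stand-in for the lookup step seen in the traceback."""
    available = set(available_names)
    # Prefer the configured name when the data contains it.
    if configured_name in available:
        return configured_name
    # Otherwise fall back to the first alternative that the data contains.
    for name in alternative_names:
        if name in available:
            return name
    return None


# Names taken from the export's meta_anno_defs shown above.
defined = ["Presence", "Subject/Experiencer", "Time"]
print(resolve_category_name("Presence", [], defined))
# A variant spelling only resolves when it is listed as an alternative.
print(resolve_category_name("Subject", ["Subject/Experiencer"], defined))
# No categories found in the data: nothing can match, whatever the config says.
print(resolve_category_name("Presence", ["presence"], []))
```

Under this assumption, an empty options list cannot be fixed through alternative_category_names alone, which points towards the first hypothesis: the loader found no categories in the export at all.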

mart.ratas · September 15, 2025, 9:43am

Hi Samora,

This does indeed look like an issue with the trainer export format you’re using. Neither meta_anno_defs nor relation_anno_defs is a defined name in the trainer export that would be used by the library.

What the library expects is a trainer export in the following format:

{
    "projects": [
        {
            "name": "<Proj-name>",
            "id": "<proj-ID>",
            "cuis": "",  # filter for cuis if needed
            "tuis": "",  # filter for type ids if needed
            "documents": [
                {
                    "name": "<Doc-name>",
                    "id": "<doc-ID>",
                    "last_modified": "<last-modified-date>",
                    "text": "<The raw text>",
                    "annotations": [
                        {
                            "id": "<ann-ID>",
                            "cui": "<CUI>",
                            "start": -1,  # start index
                            "end": -1,    # end index
                            "value": "<Annotated Value>",
                            "validated": True,  # whether validated by annotator
                            "meta_anns": {
                                "<Category-name>": {
                                    "name": "<Category-name>",
                                    "value": "<category value>",
                                    "confidence": -1.0,  # the confidence rating
                                },  # and potentially more for other categories
                            }
                        },  # and probably more annotations
                    ]
                },  # and potentially more
            ]
        },  # and potentially more
    ]
}

With that said, if the listed available categories are empty, it’s also possible that the model doesn’t have any MetaCATs loaded. You can check that by printing cat._meta_cats.
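To check whether an export actually carries meta annotations in the format above, the nesting can be walked with the standard library alone. This is a rough sketch, assuming meta_anns is a dict keyed by category name as shown; collect_meta_categories and example_export are hypothetical names, and the example values are made up.

```python
import json
from collections import Counter


def collect_meta_categories(export: dict) -> Counter:
    """Count category names under projects -> documents -> annotations -> meta_anns."""
    counts = Counter()
    for project in export.get("projects", []):
        for document in project.get("documents", []):
            for annotation in document.get("annotations", []):
                # meta_anns is assumed to map category name -> {"name", "value", ...}
                for category_name in annotation.get("meta_anns") or {}:
                    counts[category_name] += 1
    return counts


# Tiny in-memory export following the format above; all values are invented.
example_export = {
    "projects": [
        {
            "name": "demo",
            "documents": [
                {
                    "text": "No fever today.",
                    "annotations": [
                        {
                            "cui": "C0000001",
                            "meta_anns": {
                                "Presence": {"name": "Presence", "value": "False"},
                                "Time": {"name": "Time", "value": "Recent"},
                            },
                        },
                        {
                            "cui": "C0000002",
                            "meta_anns": {
                                "Presence": {"name": "Presence", "value": "True"},
                            },
                        },
                    ],
                }
            ],
        }
    ]
}

print(collect_meta_categories(example_export))
```

For a real file, load it first with json.load(open-file-handle) on the export path and pass the resulting dict in. An empty result would be consistent with the empty options list in the error message.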

Samora · September 17, 2025, 9:10am

Hi Mart,

Thanks for the response.

Loading the model (supervised training works with the CogStack notebook) and inspecting cat._meta_cats gives:

[
  {"Category Name": "Presence", "Description": "No description", "Classes": {"False": 2, "Hypothetical": 1, "True": 0}, "Model": "lstm"},
  {"Category Name": "Time", "Description": "No description", "Classes": {"Future": 0, "Past": 2, "Recent": 1}, "Model": "lstm"},
  {"Category Name": "Subject", "Description": "No description", "Classes": {"Other": 0, "Patient": 1}, "Model": "lstm"}
]

The issue has arisen when using the meta annotation training script.

The trainer export was produced with trainer v2.22.1.

Shubam is looking into this with the export, so I’ll leave this for reference.

Thanks again.
