Private KCH Model Description

Dear @tomolopolis,

We are using the Private KCH model below, but its model card does not describe the source ontology, the training dataset, the training algorithm, or the rationale behind the CUI filters applied. We hope you can help.
KCH_model_card

We also used the public SNOMED MIMIC-III model, which has a bit more information. We’re trying to compare the two.
Public_snomed_model_card

Also, I understand that this public model does not have meta-annotations trained. May I check, then, what the following means?
"Status: ‘Detects is a concept affirmed or Negated/ Hypothetical’ "

We appreciate your help.

Thank you.

Hi @Hideaki, this KCH model was created before model cards were introduced. This version of the model, with the filter, has some issues.

Please update to the latest KCH model. It has all of this information on it :slight_smile:

The public model has only one meta-task: “Status”. The KCH one has three: Experiencer, Presence, and Time.


Thank you, @anthony.shek.

I hope you can help me understand the meta-annotation result from the public model for this entity. Where does the confidence value come from, and what is the purpose of ‘name’:‘Status’ (given that we already know the task is ‘Status’ from the outer dictionary key)?
public_meta_annotation

To calculate a meta-model’s confidence or probability for a particular class, you pass the logits for all classes through the softmax function and read off the value for that class. The resulting value represents the model’s estimated probability or confidence for that class.

The softmax function ensures that the output values are non-negative and sum to 1, making them interpretable as probabilities (1 is most confident and 0 is least). It is calculated here in the predict function.
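As a rough illustration of the idea (not the actual predict function), here is a minimal pure-Python sketch of softmax. The logits, the class names and the helper `predict_with_confidence` are made up for the example; the real model works on tensors.

```python
import math

def softmax(logits):
    """Convert raw class scores (logits) into probabilities that sum to 1."""
    # Subtracting the largest logit avoids overflow and does not change the result.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_confidence(logits, class_names):
    """Return the most probable class and its softmax probability (hypothetical helper)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Assumed, illustrative values: two classes for a "Status"-like task.
logits = [2.0, 0.0]
class_names = ["Affirmed", "Other"]
value, confidence = predict_with_confidence(logits, class_names)
print(value, round(confidence, 4))
```

The reported confidence is simply the softmax probability of the winning class, so it rises with the gap between the largest logit and the others.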

As for the duplication of the name ‘Status’, I don't know the rationale behind it. @tomolopolis, does this structure have something to do with MCTrainer?
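In practice the duplicated ‘name’ can be ignored when reading results. Below is a small sketch, assuming each entity carries its meta-annotations in a dictionary keyed by task name, with ‘value’, ‘confidence’ and ‘name’ fields as in the question. The `meta_anns` key, the `meta_summary` helper and all values are illustrative assumptions, not output from the real model.

```python
# Hypothetical entity; the field names and values are assumptions for illustration.
entity = {
    "pretty_name": "Example concept",
    "meta_anns": {
        "Status": {"value": "Affirmed", "confidence": 0.97, "name": "Status"},
    },
}

def meta_summary(entity):
    """Map each meta-task to (predicted value, confidence), using the outer key as task name."""
    return {
        task: (ann["value"], ann["confidence"])
        for task, ann in entity.get("meta_anns", {}).items()
    }

summary = meta_summary(entity)
print(summary)
```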


Thank you, @anthony.shek.

May I check whether you have further information on the KCH data used for training? That is, out of the 17M documents and 8.8B tokens, what proportion comes from ICU, neurology, primary care, psychiatry, etc.?

No idea; this was not recorded at the time of training. Is this something that you need?
I guess you can technically retrieve this from KCH CogStack.