Alternative Models for MetaCAT

jkgenser · November 3, 2023, 2:06am

I was reading through the MetaCAT code and noticed that there is a class BertForMetaAnnotation. However, as far as I can tell, it’s not used anywhere.

Would it follow a similar enough interface to the LSTM model that I could swap in a BERT model to do the context task? Or would it be a bit of work?

I’m looking at swapping in a DistilBERT base model for the context task to test whether it improves performance while still being efficient.

github.com

CogStack/MedCAT/blob/master/medcat/utils/meta_cat/models.py#L80


      
        else:
            x = x[row_indices, center_positions, :]

        # Push x through the fc network and add dropout
        x = self.d1(x)
        x = self.fc1(x)

        return x


class BertForMetaAnnotation(BertPreTrainedModel):

    _keys_to_ignore_on_load_unexpected: List[str] = [r"pooler"]  # type: ignore

    def __init__(self, config: BertConfig) -> None:
        super().__init__(config)
        self.num_labels = config.num_labels

        self.bert = BertModel(config, add_pooling_layer=False)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)

tomolopolis · November 21, 2023, 6:35pm

Yes, this implementation can be used through the configuration of the MetaCAT models when instantiating the CAT object.

Just looking through the MetaCAT constructor code, though, it looks like this option may have been lost, or at least it’s no longer an option on the MetaCAT config.

We’ll take a closer look and get back on here when this is fixed or made a little clearer.

jkgenser · November 22, 2023, 2:09pm

Thanks for the reply here! One thing I noticed when reading through is that BertForMetaAnnotation is set up as a token classification task.

However, if I’m not mistaken, the LSTM model is a sequence classification task. Would we want to use BERT for sequence classification instead?
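
For reference, the indexing in the snippet quoted in the first post (`x = x[row_indices, center_positions, :]`) keeps one hidden vector per example, the one at the entity's center token, so the layers after it see one vector per example rather than one per token. A minimal pure-Python sketch of that selection, assuming nested lists stand in for a (batch, sequence, hidden) tensor; `select_center` and the toy numbers are hypothetical and not MedCAT code:

```python
from typing import List

Vector = List[float]


def select_center(hidden: List[List[Vector]], center_positions: List[int]) -> List[Vector]:
    """Pick, for each example, the hidden vector at its center position.

    Pure-Python analogue of x[row_indices, center_positions, :].
    """
    return [seq[pos] for seq, pos in zip(hidden, center_positions)]


# Two examples, three tokens each, hidden size 2 (toy values).
hidden = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.0, 1.1], [1.2, 1.3], [1.4, 1.5]],
]
center_positions = [1, 2]  # index of the entity's center token per example

centers = select_center(hidden, center_positions)
# One vector per example is left, so a linear layer on top would emit
# one set of class scores per example.
```

Whether a BERT-based model should classify from the center token like this or from a pooled sequence representation is exactly the open question in this thread.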

jkgenser · November 25, 2023, 2:57pm

Thinking further, one could also use the pseudo attention strategy and frame it as text classification plus passing the center position.

OR

Label every token in the sequence chunks, either with simple flags or with a BIO labeling scheme. Then, for entities with multiple tokens, either take the average over their tokens or add a CRF on top.
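
A rough sketch of the labeling-and-averaging variant, assuming a single entity per chunk given as a half-open token span. The helper names, the "Negated" label, and the score values are all made up for illustration, and the CRF option is not covered:

```python
from typing import List, Tuple


def bio_tags(num_tokens: int, span: Tuple[int, int], label: str) -> List[str]:
    """Tag tokens in [start, end) as B-/I-<label> and everything else as O."""
    start, end = span
    tags = ["O"] * num_tokens
    for i in range(start, end):
        tags[i] = ("B-" if i == start else "I-") + label
    return tags


def average_scores(token_scores: List[List[float]], span: Tuple[int, int]) -> List[float]:
    """Average per-token class scores over the tokens of one entity."""
    start, end = span
    rows = token_scores[start:end]
    return [sum(col) / len(rows) for col in zip(*rows)]


tokens = ["no", "history", "of", "heart", "failure"]
span = (3, 5)  # hypothetical entity: "heart failure"

# Training targets under a BIO scheme.
tags = bio_tags(len(tokens), span, "Negated")

# Made-up per-token scores for two classes, as a token classifier might emit.
token_scores = [
    [0.5, 0.5],
    [0.5, 0.5],
    [0.5, 0.5],
    [0.25, 0.75],
    [0.5, 0.5],
]
entity_scores = average_scores(token_scores, span)
predicted_class = entity_scores.index(max(entity_scores))
```

Averaging keeps the model a plain token classifier and pushes the multi-token handling into post-processing, which is the simpler of the two options.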

Curious to know your thoughts. If someone can suggest which approach would be preferred for MedCAT, I’m down to take a stab at an implementation. If the BLSTM model is not performant enough for my use case, I’ll probably go ahead and work on one anyway, but I wouldn’t mind some guidance so that there’s a higher likelihood it gets merged and is helpful for others!
