Alternative Models for MetaCAT

I was reading through the MetaCAT code and noticed that there is a class BertForMetaAnnotation. However, as far as I can tell, it’s not used anywhere.

Would it follow a similar enough interface to the LSTM model that I could swap in a BERT model to do the context task? Or would it be a bit of work?

I’m looking at swapping in a DistilBERT base model for the context task to test whether it improves performance while still being efficient.

Yes, this implementation can be used through the configuration of the MetaCAT models within the CAT instantiation.

Just looking through the MetaCAT constructor code, it looks like this option may have been lost, or at least it’s no longer an option on the MetaCAT config.

We’ll take a closer look and report back here once this is fixed or made a little clearer.

Thanks for the reply here! One thing I noticed when reading through is that BertForMetaAnnotation is set up for token classification.

However, if I’m not mistaken, the LSTM model does sequence classification. Would we want a BERT model for sequence classification instead?
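To make the distinction concrete, here is a minimal framework-free sketch in plain Python. It does not use MedCAT or any transformer library. The helper names (`linear`, `token_classification_head`, `sequence_classification_head`) are hypothetical, and the hidden states and weights are made-up numbers standing in for encoder outputs. It only illustrates the difference in output shape: one logit vector per token versus one per sequence.

```python
from typing import List

Vector = List[float]


def linear(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Apply y = W x + b, with one row of `weights` per output class."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]


def token_classification_head(
    hidden: List[Vector], weights: List[Vector], bias: Vector
) -> List[Vector]:
    """Token classification: one logit vector per token (seq_len x num_classes)."""
    return [linear(h, weights, bias) for h in hidden]


def sequence_classification_head(
    hidden: List[Vector], weights: List[Vector], bias: Vector
) -> Vector:
    """Sequence classification: one logit vector for the whole sequence.

    Assumption: the sequence is summarised by its first position, which is
    what BERT-style models commonly do with the [CLS] token.
    """
    return linear(hidden[0], weights, bias)


# Toy encoder output: 3 tokens, hidden size 2 (made-up values).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Toy classifier: 2 classes (made-up values).
weights = [[1.0, -1.0], [-1.0, 1.0]]
bias = [0.0, 0.0]

token_logits = token_classification_head(hidden_states, weights, bias)
sequence_logits = sequence_classification_head(hidden_states, weights, bias)

print(len(token_logits), "logit vectors from the token head")
print(sequence_logits, "is the single logit vector from the sequence head")
```

With a token head, an extra step is still needed to turn the per-token outputs into one prediction per entity.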

Thinking further, one could also use the pseudo attention strategy and frame it as text classification plus passing the center position.

Alternatively, label every token in the sequence chunks, either with simple flags or with a BIO labeling scheme, and then either take the average for entities with multiple tokens or add a CRF on top.
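As a rough illustration of the two ideas above, here is a sketch in plain Python of how per-token scores could be reduced to one prediction per entity. Everything in it is an assumption for illustration: the helper names, the `B-ENT`/`I-ENT`/`O` tags, the two classes, and the logit values are all made up, and I am reading "passing center position" as simply picking the output at the entity's first token. The CRF variant is left out.

```python
from typing import List, Tuple

Vector = List[float]


def bio_tags(num_tokens: int, start: int, end: int) -> List[str]:
    """Tag tokens in [start, end) as one entity: B- first, I- after, O elsewhere."""
    tags = ["O"] * num_tokens
    for i in range(start, end):
        tags[i] = "B-ENT" if i == start else "I-ENT"
    return tags


def entity_span(tags: List[str]) -> Tuple[int, int]:
    """Return the [start, end) token span of the first entity in `tags`."""
    start = tags.index("B-ENT")
    end = start + 1
    while end < len(tags) and tags[end] == "I-ENT":
        end += 1
    return start, end


def mean_pool_logits(token_logits: List[Vector], start: int, end: int) -> Vector:
    """Average the per-token logits over a multi-token entity."""
    span = token_logits[start:end]
    num_classes = len(span[0])
    return [sum(tok[c] for tok in span) / len(span) for c in range(num_classes)]


def center_logits(token_logits: List[Vector], center_position: int) -> Vector:
    """Use only the logits at the entity's center position."""
    return token_logits[center_position]


def argmax(values: Vector) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


tokens = ["no", "evidence", "of", "heart", "failure", "today"]
classes = ["Affirmed", "Negated"]  # hypothetical meta-annotation labels
# Made-up per-token logits, one [Affirmed, Negated] pair per token.
token_logits = [
    [0.1, 0.2], [0.0, 0.3], [0.2, 0.1],
    [0.5, 1.5], [1.0, 2.0], [0.3, 0.1],
]

tags = bio_tags(len(tokens), start=3, end=5)  # "heart failure" is the entity
start, end = entity_span(tags)

averaged = mean_pool_logits(token_logits, start, end)
centered = center_logits(token_logits, center_position=start)

print(tags)
print("averaged:", classes[argmax(averaged)])
print("center:  ", classes[argmax(centered)])
```

Averaging uses every token of the entity, while the center-position variant keeps the interface closer to a model that receives one position per example.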

Curious to know your thoughts. If someone can suggest which approach is preferred for MedCAT, I’m down to take a stab at an implementation. If the BLSTM model is not performant enough for my use case, I’ll probably go ahead and work on one anyway, but I wouldn’t mind some guidance so that there’s a higher likelihood it gets merged and is helpful for others!