When training a MedCAT model with some Meta-Annotations, you get Precision/Recall/F1 scores in the output (generated by the `medcat.metacat.MetaCAT.eval()` method).
What is this evaluating: the performance of the MedCAT model at identifying concepts, or the performance of the meta models at choosing categories?
I’m trying to evaluate the performance of our meta models.
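For reference, here is a minimal sketch of the call in question (the paths are placeholders, and I'm assuming a trained meta-model directory plus a MedCATtrainer project export containing meta-annotations; the import path may differ slightly between MedCAT versions):

```python
from medcat.meta_cat import MetaCAT

# Placeholder paths: a trained meta-model directory (e.g. for a
# "Status" meta-task) and a MedCATtrainer export JSON that
# contains meta-annotations.
meta_cat = MetaCAT.load("models/meta_Status")
metrics = meta_cat.eval("data/medcattrainer_export.json")

# This returns Precision/Recall/F1 scores -- but for which task?
print(metrics)
```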