Evaluating the embeddings
Model evaluation is task-specific, and the task is determined by the dataset. For example, a protein-protein interaction dataset such as PPIYeastDataset is evaluated with the PPIEvaluator. A typical pipeline to train and evaluate a model, shown here with the smaller PPIYeastSlimDataset, would be:
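import mowl
# The JVM must be started before importing other mowl modules; adjust the memory as needed.
mowl.init_jvm("10g")
from mowl.datasets.builtin import PPIYeastSlimDataset
from mowl.models import SyntacticPlusW2VModel
from mowl.evaluation import PPIEvaluator
# The dataset determines which evaluator applies.
dataset = PPIYeastSlimDataset()
model = SyntacticPlusW2VModel(dataset, corpus_filepath="test")
# Attach the evaluator class that matches the dataset's task.
model.set_evaluator(PPIEvaluator)
model.set_w2v_model(min_count=1)
model.generate_corpus(save=True, with_annotations=True)
model.train()
# Evaluate on the testing ontology of the dataset.
model.evaluate(dataset.testing)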
Each evaluator class is characterized by two things:
The type of entities involved in the evaluation
The type of axiom to be evaluated
In the case of the PPIEvaluator, the entities involved in the evaluation are only those representing proteins. Other entities present in the ontologies, such as GO functions, are not considered.
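The restriction to a single type of entity can be pictured as a filter over the learned embeddings. The sketch below is not mOWL code: the embeddings dictionary, the IRI strings, and the helper `restrict_to` are all hypothetical and only illustrate the idea of dropping non-protein entities before evaluation.

```python
import numpy as np

# Hypothetical embeddings keyed by class IRI (illustrative names, not real dataset output).
embeddings = {
    "http://example.org/protein_A": np.array([0.1, 0.2]),
    "http://example.org/protein_B": np.array([0.3, 0.4]),
    "http://example.org/GO_function_X": np.array([0.5, 0.6]),
}

# Hypothetical set of classes that take part in the evaluation (proteins only).
evaluation_entities = {
    "http://example.org/protein_A",
    "http://example.org/protein_B",
}


def restrict_to(embeddings, allowed):
    """Return a new dict with only the embeddings whose key is in `allowed`."""
    return {iri: vec for iri, vec in embeddings.items() if iri in allowed}


protein_embeddings = restrict_to(embeddings, evaluation_entities)
print(sorted(protein_embeddings))
```

The GO function stays in the full embedding dictionary, but it is never offered as a candidate during evaluation.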
For the PPIEvaluator, the axioms to be evaluated have the form \(p_i \sqsubseteq \exists interacts\_with. p_j\), where \(p_i\) and \(p_j\) are proteins.
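One common way to evaluate axioms of this form is to treat them as a ranking problem: for each test pair \((p_i, p_j)\), every candidate protein is scored as the filler of the existential, and the rank of the true \(p_j\) is recorded. The following is a minimal sketch of that idea under stated assumptions. The dot-product score, the toy vectors, and the names `rank_of_true_tail` and `hits_at_k` are hypothetical, and nothing here describes how the PPIEvaluator is actually implemented.

```python
import numpy as np

# Toy protein embeddings (hypothetical values); the row index is the protein id.
proteins = np.array([
    [1.0, 0.0],   # p_0
    [0.9, 0.1],   # p_1
    [0.0, 1.0],   # p_2
    [-1.0, 0.0],  # p_3
])


def rank_of_true_tail(head, true_tail, emb):
    """Rank (1 = best) of `true_tail` among all other proteins scored against `head`."""
    scores = emb @ emb[head]      # assumed scoring function: dot product
    scores[head] = -np.inf        # do not propose the head as its own partner
    order = np.argsort(-scores)   # candidate ids sorted by descending score
    return int(np.where(order == true_tail)[0][0]) + 1


def hits_at_k(ranks, k):
    """Fraction of test axioms whose true tail is ranked within the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


# Hypothetical test axioms p_i SubClassOf (interacts_with some p_j), given as (i, j).
test_pairs = [(0, 1), (2, 1), (3, 0)]
ranks = [rank_of_true_tail(i, j, proteins) for i, j in test_pairs]

print("ranks:", ranks)
print("mean rank:", sum(ranks) / len(ranks))
print("hits@1:", hits_at_k(ranks, 1))
```

Because only protein rows are present in `proteins`, the candidate set already respects the entity restriction of the evaluator.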
Every dataset has the attribute evaluation_classes, which is a 2-tuple of mowl.datasets.OWLClasses objects.