EL Embeddings

This example corresponds to the paper EL Embeddings: Geometric Construction of Models for the Description Logic EL++.

The idea of the paper is to embed EL by modeling ontology classes as \(n\)-dimensional balls (\(n\)-balls) and ontology object properties as transformations of those \(n\)-balls. For each normal form, a distance function is defined that serves as the loss function in the optimization framework.

Let's first define the imports needed throughout the example:
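To illustrate how a distance can act as a loss, here is a minimal sketch for the subsumption case \(C \sqsubseteq D\), assuming each class is an \(n\)-ball given by a center and a radius. The function name `subsumption_loss` and the plain-Python representation are assumptions made for illustration; this is not the mOWL implementation, and it leaves out any regularization on the centers.

```python
import math

def subsumption_loss(c_center, c_radius, d_center, d_radius, margin=0.0):
    """Penalty for ball C not lying inside ball D (hypothetical helper).

    C is contained in D when dist(centers) + r_C <= r_D, so the hinge
    below is zero in that case. A positive margin adds slack.
    """
    center_distance = math.dist(c_center, d_center)
    return max(0.0, center_distance + c_radius - d_radius - margin)

# C sits inside D: no penalty.
inside = subsumption_loss([0.0, 0.0], 1.0, [0.5, 0.0], 2.0)
# C lies outside D: the penalty grows with the distance between centers.
outside = subsumption_loss([3.0, 0.0], 1.0, [0.0, 0.0], 2.0)
print(inside, outside)
```

Minimizing such a penalty over all subsumption axioms pushes the ball of each subclass inside the ball of its superclass.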

import mowl
mowl.init_jvm("10g")
import torch as th

The EL-Embeddings model maps ontology classes, object properties, and operators into a geometric model. The \(\mathcal{EL}\) description logic is expressed using the following General Concept Inclusions (GCIs):

\[\begin{split}\begin{align} C &\sqsubseteq D & (\text{GCI 0}) \\ C_1 \sqcap C_2 &\sqsubseteq D & (\text{GCI 1}) \\ C &\sqsubseteq \exists R. D & (\text{GCI 2})\\ \exists R. C &\sqsubseteq D & (\text{GCI 3})\\ C &\sqsubseteq \bot & (\text{GCI BOT 0}) \\ C_1 \sqcap C_2 &\sqsubseteq \bot & (\text{GCI BOT 1}) \\ \exists R. C &\sqsubseteq \bot & (\text{GCI BOT 3}) \end{align}\end{split}\]

where \(C, C_1, C_2, D\) are ontology classes and \(R\) is an ontology object property.
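Two of these normal forms can be sketched geometrically in the same spirit. The sketch below assumes that an object property acts on a ball by translating its center (one possible choice of transformation), and that disjointness means two balls do not overlap. The names `existential_loss` and `disjointness_loss` are hypothetical and are not part of mOWL.

```python
import math

def existential_loss(c_center, c_radius, relation, d_center, d_radius, margin=0.0):
    """GCI 2 sketch (C subsumed by 'exists R. D'): translate C by the
    relation vector, then require the translated ball to lie inside D."""
    translated = [c + r for c, r in zip(c_center, relation)]
    return max(0.0, math.dist(translated, d_center) + c_radius - d_radius - margin)

def disjointness_loss(c_center, c_radius, d_center, d_radius, margin=0.0):
    """GCI BOT 1 sketch (C1 and C2 have empty intersection): two balls
    are disjoint when the center distance is at least the sum of the radii."""
    return max(0.0, c_radius + d_radius - math.dist(c_center, d_center) + margin)

# The relation vector moves C onto D, so the axiom is satisfied.
satisfied = existential_loss([0.0, 0.0], 0.5, [2.0, 0.0], [2.0, 0.0], 1.0)
# With a zero relation vector the same balls violate the axiom.
violated = existential_loss([0.0, 0.0], 0.5, [0.0, 0.0], [2.0, 0.0], 1.0)
# Unit balls whose centers are one unit apart overlap.
overlap = disjointness_loss([0.0, 0.0], 1.0, [1.0, 0.0], 1.0)
print(satisfied, violated, overlap)
```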

EL-Embeddings (PyTorch) module.

EL-Embeddings defines a geometric model for every GCI in the EL language. The implementation of the EL-Embeddings module can be found at mowl.nn.el.elem.module.ELEmModule.

EL-Embeddings model

The module mowl.nn.el.elem.module.ELEmModule is used by mowl.models.elembeddings.model.ELEmbeddings. In this example, we test the model on a biological problem: predicting protein-protein interactions. Given two proteins \(p_1,p_2\), the phenomenon “\(p_1\) interacts with \(p_2\)” is encoded using GCI 2 as:

\[p_1 \sqsubseteq \exists interacts\_with. p_2\]

For that, we can use the model class mowl.models.elembeddings.examples.model_ppi.ELEmPPI, which uses the mowl.datasets.builtin.PPIYeastSlimDataset dataset.

Training the model

from mowl.datasets.builtin import PPIYeastSlimDataset
from mowl.models.elembeddings.examples.model_ppi import ELEmPPI

dataset = PPIYeastSlimDataset()

model = ELEmPPI(dataset,
                embed_dim=30,
                margin=0.1,
                reg_norm=1,
                learning_rate=0.001,
                epochs=20,
                batch_size=20000,
                model_filepath=None,
                device='cuda' if th.cuda.is_available() else 'cpu')

model.train()
  0%|          | 0/20 [00:00<?, ?it/s]
  5%|▌         | 1/20 [00:14<04:42, 14.84s/it]
 70%|███████   | 14/20 [00:14<00:04,  1.31it/s]
100%|██████████| 20/20 [00:14<00:00,  1.33it/s]

1

Evaluating the model

Now it is time to evaluate the embeddings. For this, we use the PPIEvaluator class.

from mowl.evaluation import PPIEvaluator

model.set_evaluator(PPIEvaluator)
model.evaluate(dataset.testing)

print(model.metrics)
{'mr': 2464.669435215947, 'mrr': 0.0024686094502139495, 'f_mr': 2464.669435215947, 'f_mrr': 0.0024686094502139495, 'auc': 0.5919726009910654, 'f_auc': 0.5919726009910654, 'hits@1': 0.00016611295681063124, 'hits@3': 0.0009966777408637873, 'hits@10': 0.0027408637873754154, 'hits@50': 0.016777408637873754, 'hits@100': 0.03106312292358804, 'f_hits@1': 0.00016611295681063124, 'f_hits@3': 0.0009966777408637873, 'f_hits@10': 0.0027408637873754154, 'f_hits@50': 0.016777408637873754, 'f_hits@100': 0.03106312292358804}
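The keys in this dictionary are standard rank-based metrics: mean rank (`mr`), mean reciprocal rank (`mrr`), and hits@k. The sketch below shows how such values are typically derived from a list of ranks, one rank per test pair. It is a generic illustration, not the code of `PPIEvaluator`; the helper `rank_metrics` and the sample ranks are made up, and AUC and the `f_`-prefixed variants are left out.

```python
def rank_metrics(ranks, ks=(1, 3, 10)):
    """Compute mean rank, mean reciprocal rank and hits@k (hypothetical helper).

    `ranks` holds the 1-based position of the true answer for each test
    pair; lower is better.
    """
    n = len(ranks)
    metrics = {
        "mr": sum(ranks) / n,
        "mrr": sum(1.0 / r for r in ranks) / n,
    }
    for k in ks:
        # Fraction of test pairs whose true answer is ranked in the top k.
        metrics[f"hits@{k}"] = sum(1 for r in ranks if r <= k) / n
    return metrics

# Made-up ranks for three test pairs.
example = rank_metrics([1, 2, 10])
print(example)
```

A lower `mr` and higher `mrr` and hits@k indicate a better ranking of the true interaction partners.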

