Ontology to text

In this section we show how mOWL can be used to generate a text corpus from an ontology. We provide two methods: one renders the logical axioms and the other renders the annotation axioms. Both generate Manchester Syntax representations of the axioms.

Rendering logical axioms

To generate a corpus out of the logical axioms, we can use the following snippet:

from mowl.corpus import extract_axiom_corpus
from mowl.datasets.builtin import FamilyDataset

dataset = FamilyDataset()
corpus = extract_axiom_corpus(dataset.ontology)

This returns a list of strings, each one a Manchester Syntax rendering of an axiom. To save the corpus to disk instead, use:

from mowl.corpus import extract_and_save_axiom_corpus
extract_and_save_axiom_corpus(dataset.ontology,
                              "/tmp/file_to_save_corpus",
                              mode="w")

Hint

The mode parameter controls how the file is written: mode="w" overwrites the file, while mode="a" appends to its existing contents.
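The two mode values behave like the modes of Python's built-in open(). The sketch below illustrates the difference with plain file writes; it does not call mOWL, and the helper names and sample sentences are hypothetical stand-ins rather than real mOWL output.

```python
import os
import tempfile


def write_lines(path, lines, mode):
    """Write one sentence per line using the given file mode ("w" or "a")."""
    with open(path, mode) as f:
        for line in lines:
            f.write(line + "\n")


def read_lines(path):
    """Return the lines of the file without trailing newlines."""
    with open(path) as f:
        return f.read().splitlines()


# Hypothetical stand-in sentences, not actual mOWL output.
first = ["Father SubClassOf Male"]
second = ["Mother SubClassOf Female"]

path = os.path.join(tempfile.mkdtemp(), "corpus.txt")

write_lines(path, first, "w")   # creates the file
write_lines(path, second, "a")  # appends to the existing contents
appended = read_lines(path)     # holds both sentences

write_lines(path, second, "w")  # overwrites the file
overwritten = read_lines(path)  # holds only the second sentence
```

Appending is useful when the axiom corpus and the annotation corpus should end up in the same file.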

Rendering annotations from ontology

Ontology annotations can be rendered in a similar way. To extract them, use the following example:

from mowl.datasets.builtin import PPIYeastSlimDataset
from mowl.corpus import extract_annotation_corpus

dataset = PPIYeastSlimDataset()
corpus = extract_annotation_corpus(dataset.ontology)

To save the annotation corpus to a file:

from mowl.corpus import extract_and_save_annotation_corpus
extract_and_save_annotation_corpus(dataset.ontology,
                                   "/tmp/file_to_save_corpus",
                                   mode="w")
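Since the extracted corpus is described above as a list of strings, it can be inspected with ordinary Python before any training. The following is a minimal sketch that splits each sentence on whitespace and counts tokens; the corpus list is a hypothetical stand-in, and real mOWL output may contain full IRIs instead of short names.

```python
from collections import Counter

# Hypothetical stand-in for the list returned by extract_axiom_corpus;
# the sentences are invented for illustration.
corpus = [
    "Father SubClassOf Male",
    "Father SubClassOf hasChild some Person",
    "Mother SubClassOf Female",
]

# Whitespace tokenization turns each sentence into a list of tokens,
# the input shape that Word2Vec-style trainers usually expect.
tokenized = [sentence.split() for sentence in corpus]

# Count how often each token occurs across the whole corpus.
token_counts = Counter(token for sentence in tokenized for token in sentence)
```

Token counts give a quick sense of which entities are rare, which matters when choosing a minimum-frequency threshold for the embedding model.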

Embedding ontologies

Note

This feature is available since version 0.2.0.

To train a Word2Vec model on a generated corpus, we can use the SyntacticPlusW2VModel class:

from mowl.models import SyntacticPlusW2VModel
model = SyntacticPlusW2VModel(dataset, corpus_filepath="test")
model.set_w2v_model(min_count=1)
model.generate_corpus(save=True, with_annotations=True)
model.train()

This prints:

Corpus saved in test

Attention

The set_w2v_model method accepts the same arguments as the gensim.models.word2vec.Word2Vec class.
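As an illustration, a set of Word2Vec keyword arguments could be collected in a dictionary and forwarded in one call. This is a sketch only: the parameter names follow gensim 4.x (gensim 3.x used size and iter where 4.x uses vector_size and epochs), the values are arbitrary, and the final call is left commented out because it needs mOWL and the model object from the snippet above.

```python
# Keyword arguments named as in gensim 4.x Word2Vec (assumption: gensim >= 4.0).
# The values are illustrative, not recommendations.
w2v_params = {
    "vector_size": 100,  # dimensionality of the embeddings
    "window": 5,         # maximum distance between target and context token
    "min_count": 1,      # keep tokens that occur at least once
    "sg": 1,             # 1 selects skip-gram, 0 selects CBOW
    "epochs": 10,        # number of passes over the corpus
    "workers": 4,        # number of training threads
}

# With mOWL installed and a model created as shown earlier,
# the arguments would be forwarded like this:
# model.set_w2v_model(**w2v_params)
```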