Datasets

mOWL takes its input as OWL ontologies. A mOWL dataset contains three ontologies: training, validation and testing.

Built-in datasets

mOWL ships several built-in datasets for bioinformatics tasks such as protein-protein interaction prediction and gene-disease association prediction. The full list is available in the Datasets API docs.

To access any of these datasets you can use:

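import mowl
mowl.init_jvm("10g")  # start the JVM before importing other mOWL modules; "10g" is an example heap size

from mowl.datasets.builtin import PPIYeastSlimDataset

ds = PPIYeastSlimDataset()
train_ontology = ds.ontology    # training ontology
valid_ontology = ds.validation  # validation ontology
test_ontology = ds.testing      # testing ontology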

Your own dataset

If you have your own training, validation and testing ontologies, you can easily turn them into a mOWL dataset as follows:

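# Assumes the JVM is already running (mowl.init_jvm must be called before this import).
from mowl.datasets.base import PathDataset

ds = PathDataset("training_ontology.owl",
                 validation_path="validation_ontology.owl",
                 testing_path="testing_ontology.owl")

# Each split is an OWL API ontology, so its axioms are retrieved with getAxioms().
training_axioms = ds.ontology.getAxioms()
validation_axioms = ds.validation.getAxioms()
testing_axioms = ds.testing.getAxioms()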

Note

Validation and testing ontologies are optional when using PathDataset. By default they are set to None.
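
Because validation and testing may be missing, code that reads all three splits should check for None before calling OWL API methods. The following is a minimal sketch, not part of mOWL: count_axioms is a hypothetical helper name, and it assumes that an omitted split is exposed as None (as the note above states) and that each present split is an OWL API ontology providing getAxiomCount().

```python
def count_axioms(dataset):
    """Return the axiom count of each split, or None for a split that is absent.

    Hypothetical helper (not part of mOWL). `dataset` is any object with
    `ontology`, `validation` and `testing` attributes, such as a PathDataset.
    """
    splits = {
        "training": dataset.ontology,
        "validation": dataset.validation,
        "testing": dataset.testing,
    }
    counts = {}
    for name, ontology in splits.items():
        if ontology is None:
            # Split was not provided when the dataset was created.
            counts[name] = None
        else:
            # getAxiomCount() is the OWL API method on an ontology object.
            counts[name] = int(ontology.getAxiomCount())
    return counts


# Usage sketch (requires mOWL, a running JVM and a local OWL file):
#   ds = PathDataset("training_ontology.owl")
#   print(count_axioms(ds))
```

Returning None for an absent split, rather than 0, keeps "no ontology provided" distinguishable from "an empty ontology".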