Datasets
mOWL takes OWL ontologies as input. A mOWL dataset contains three ontologies: training, validation and testing.
Built-in datasets
There are several built-in datasets for bioinformatics tasks such as protein-protein interaction prediction and gene-disease association prediction. The full list can be found in the Datasets API docs.
To access any of these datasets you can use:
from mowl.datasets.builtin import PPIYeastSlimDataset
ds = PPIYeastSlimDataset()
train_ontology = ds.ontology
valid_ontology = ds.validation
test_ontology = ds.testing
Your own dataset
If you have your own training, validation and testing ontologies, you can easily turn them into a mOWL dataset as follows:
from mowl.datasets.base import PathDataset
ds = PathDataset("training_ontology.owl",
                 validation_path="validation_ontology.owl",
                 testing_path="testing_ontology.owl")
training_axioms = ds.ontology.getAxioms()
validation_axioms = ds.validation.getAxioms()
testing_axioms = ds.testing.getAxioms()
Note
Validation and testing ontologies are optional when using PathDataset. By default they are set to None.
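Because the validation and testing ontologies may be None, calling getAxioms() on them directly would raise an AttributeError when a split was not provided. The following is a minimal sketch of a None-safe way to count axioms per split. The helper name axiom_counts is hypothetical and not part of mOWL. It only assumes that the dataset exposes the ontology, validation and testing attributes shown above, and that getAxioms() returns a Java set with a size() method, as in the OWL API.

```python
from typing import Any, Dict


def axiom_counts(dataset: Any) -> Dict[str, int]:
    """Count axioms per split, treating a missing (None) split as empty.

    Hypothetical helper, not part of mOWL. It relies only on the
    attributes `ontology`, `validation` and `testing` of the dataset.
    """
    splits = {
        "training": dataset.ontology,
        "validation": dataset.validation,
        "testing": dataset.testing,
    }
    counts = {}
    for name, ontology in splits.items():
        if ontology is None:
            # Split was not provided, so there is nothing to count.
            counts[name] = 0
        else:
            # Assumption: getAxioms() returns a Java Set, so size() applies.
            counts[name] = int(ontology.getAxioms().size())
    return counts
```

With a dataset created as PathDataset("training_ontology.owl"), axiom_counts(ds) would be expected to report zero for the validation and testing splits instead of failing.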