Datasets

Base dataset

class mowl.datasets.base.Dataset

Bases: object

This class provides training, validation, and testing datasets encoded as OWL ontologies.

Parameters
  • ontology (org.semanticweb.owlapi.model.OWLOntology) – Training dataset

  • validation (org.semanticweb.owlapi.model.OWLOntology) – Validation dataset

  • testing (org.semanticweb.owlapi.model.OWLOntology) – Testing dataset

class mowl.datasets.base.PathDataset(ontology_path: str, validation_path: str, testing_path: str)

Bases: Dataset

Loads the dataset from ontology documents.

Parameters
  • ontology_path (str) – Training dataset

  • validation_path (str) – Validation dataset

  • testing_path (str) – Testing dataset
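As a minimal sketch of how the three path arguments might be used: the helper names `missing_paths` and `load_path_dataset` are hypothetical and not part of mOWL. The mOWL import is deferred because the library needs a running JVM, and whether the JVM must be started explicitly beforehand depends on the installed version, so treat that as an assumption to verify.

```python
import os


def missing_paths(ontology_path, validation_path, testing_path):
    """Return the names of the arguments whose files do not exist."""
    candidates = {
        "ontology_path": ontology_path,
        "validation_path": validation_path,
        "testing_path": testing_path,
    }
    return [name for name, path in candidates.items() if not os.path.isfile(path)]


def load_path_dataset(ontology_path, validation_path, testing_path):
    """Check the three files, then build a PathDataset from them."""
    missing = missing_paths(ontology_path, validation_path, testing_path)
    if missing:
        raise FileNotFoundError("missing ontology documents: " + ", ".join(missing))
    # Imported lazily so the check above works without mOWL installed.
    # Assumption: the JVM has already been started if this mOWL version requires it.
    from mowl.datasets.base import PathDataset
    return PathDataset(ontology_path, validation_path, testing_path)
```

Checking the paths first gives a plain Python error before any ontology loading is attempted.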

class mowl.datasets.base.TarFileDataset(tarfile_path: str, *args, **kwargs)

Bases: PathDataset

Loads the dataset from a tar file.

Parameters
  • tarfile_path (str) – Location of the tar file

  • **kwargs – See below

Keyword Arguments
  • dataset_name (str) – Name of the dataset

class mowl.datasets.base.RemoteDataset(url: str, data_root='./')

Bases: TarFileDataset

Loads the dataset from a remote URL.

Parameters
  • url (str) – URL location of the dataset

  • data_root (str) – Root directory

PPI Yeast dataset

class mowl.datasets.ppi_yeast.PPIYeastDataset(url=None)

Bases: RemoteDataset

get_evaluation_classes()

Classes that are used in evaluation

class mowl.datasets.ppi_yeast.PPIYeastSlimDataset(*args, **kwargs)

Bases: PPIYeastDataset

Extending ontologies

mowl.datasets.build_ontology.insert_annotations(ontology_file, annotations, out_file=None, verbose=False)

Builds a dataset from an ontology file and a set of annotations to be inserted into that ontology. Annotation files must be in .tsv format, with no header. In each row, the first element is the annotated entity and the remaining elements are the annotation entities (which are entities in the ontology).

Parameters
  • ontology_file (str) – Ontology file in .owl format

  • annotations (list of (str, str, str) tuples) – Annotations to be included in the ontology. Each tuple is (annotation file path, relation name, annotations prefix). There can be more than one annotation file.

  • out_file (str) – Path for the new ontology.

  • verbose (bool) – If True, information is shown.
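The annotation file layout and the `annotations` argument described above can be sketched as follows. The helpers `write_annotation_tsv`, `build_annotation_specs`, and `run_insert_annotations` are hypothetical, as are the entity identifiers, the relation name `has_annotation`, and the prefix `http://example.org/`. Only the final call uses mOWL, and it is wrapped in a function so that the file preparation runs without mOWL or a JVM.

```python
import csv
import os
import tempfile


def write_annotation_tsv(path, rows):
    """Write a headerless .tsv file: the first column is the annotated
    entity, the remaining columns are the ontology entities annotating it."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        for annotated_entity, annotation_entities in rows:
            writer.writerow([annotated_entity, *annotation_entities])


def build_annotation_specs(tsv_path, relation_name, prefix):
    """Return a list of (file path, relation name, prefix) tuples,
    the shape documented for the `annotations` parameter."""
    return [(tsv_path, relation_name, prefix)]


def run_insert_annotations(ontology_file, annotation_specs, out_file):
    """Call mOWL with the documented signature (requires mOWL and a JVM)."""
    from mowl.datasets.build_ontology import insert_annotations
    insert_annotations(ontology_file, annotation_specs, out_file=out_file, verbose=True)


# Hypothetical data: two entities annotated with ontology class identifiers.
example_rows = [
    ("gene_1", ["GO_0000001", "GO_0000002"]),
    ("gene_2", ["GO_0000003"]),
]
tsv_path = os.path.join(tempfile.mkdtemp(), "annotations.tsv")
write_annotation_tsv(tsv_path, example_rows)
annotation_specs = build_annotation_specs(tsv_path, "has_annotation", "http://example.org/")
```

Rows may have different lengths, since each annotated entity can carry a different number of annotations. To add a second annotation file, append another tuple to the list.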

mowl.datasets.build_ontology.create_from_triples(triples_file, out_file, relation_name=None, bidirectional=False, head_prefix='', tail_prefix='')

Creates an ontology from a .tsv file of triples.

Parameters
  • triples_file (str) – Path to the file containing the triples. This must be a .tsv file in which each row has the form (head, relation, tail). Files with rows of the form (head, tail) are also supported; in that case relation_name must be specified.

  • relation_name (str) – Name for relation in case the .tsv input file has only two columns.

  • bidirectional (bool) – If True, the triples will be considered undirected.

  • out_file (str) – Path for the output ontology. If None and an existing ontology is input, the existing ontology will be overwritten.

  • head_prefix (str) – Prefix to be assigned to the head of each triple. Default is ''.

  • tail_prefix (str) – Prefix to be assigned to the tail of each triple. Default is ''.
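The rule that two-column files need an explicit relation_name can be checked before calling the function, as in the sketch below. The helpers `count_columns`, `build_call_arguments`, and `run_create_from_triples` are hypothetical, as are the file names and the relation name `interacts_with`. The mOWL call itself is isolated in a function because it requires mOWL and a JVM.

```python
import csv
import os
import tempfile


def count_columns(triples_file):
    """Return the set of column counts found in a tab-separated file."""
    with open(triples_file, newline="") as handle:
        return {len(row) for row in csv.reader(handle, delimiter="\t") if row}


def build_call_arguments(triples_file, out_file, relation_name=None):
    """Assemble keyword arguments for create_from_triples, enforcing that
    (head, tail) files come with an explicit relation_name."""
    widths = count_columns(triples_file)
    if widths == {2}:
        if relation_name is None:
            raise ValueError("two-column file: relation_name must be specified")
    elif widths != {3}:
        raise ValueError(f"expected 2 or 3 columns per row, found {sorted(widths)}")
    return {
        "triples_file": triples_file,
        "out_file": out_file,
        "relation_name": relation_name,
        "bidirectional": False,
    }


def run_create_from_triples(arguments):
    """Call mOWL with the documented signature (requires mOWL and a JVM)."""
    from mowl.datasets.build_ontology import create_from_triples
    create_from_triples(**arguments)


# Hypothetical two-column input: each row is (head, tail).
pairs_path = os.path.join(tempfile.mkdtemp(), "pairs.tsv")
with open(pairs_path, "w", newline="") as handle:
    csv.writer(handle, delimiter="\t").writerows(
        [("protein_a", "protein_b"), ("protein_b", "protein_c")]
    )
arguments = build_call_arguments(pairs_path, "pairs.owl", relation_name="interacts_with")
```

Validating the column count up front turns a malformed input file into an immediate Python error instead of a failure during ontology construction.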