Difference between revisions of "Graph Metrics"

From OntoMetrics
Jump to: navigation, search
 
(10 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
Graph or structural metrics calculate the structure of ontologies.
 
Graph or structural metrics calculate the structure of ontologies.
  
==Average Breadth==
+
==Cardinality==
 +
Cardinality is a property of graphs which expresses a graph related number of specific elements.
  
Breadth is a property related to the cardinality of levels (“generations”) in a graph, where the arcs considered here are only subClassOf arcs. This measure only applies to digraphs (directed graphs).
+
===Absolute root cardinality===
The average breadth states at which degree the ontology has a horizontal modelling of hierarchies.
+
Absolute root cardinality is a property of a directed graph which represents the number of root nodes of the graph.
  
As reference value, there is also shown the '''maximal breadth'''.
+
  <math>m = n_{ROO \subseteq g}</math>
  
==Average Depth==
+
<math>n_{ROO \subseteq g}</math> is the cardinality of the set <math>ROO</math> in the directed graph <math>g</math>.
  
Depth is a graph property related to the cardinality of paths in a graph, where the arcs considered here are only subClassOf arcs. This measure type only applies to digraphs (directed graphs).
+
 
 +
===Absolute leaf cardinality===
 +
Absolute root cardinality is a property of a directed graph which is related to leaf node sets and represents the number of leaf nodes of the graph.
 +
 
 +
  <math>m = n_{LEA \subseteq g}</math>
 +
 
 +
<math>n_{LEA \subseteq g}</math> is the cardinality of the set <math>LEA</math> in the directed graph <math>g</math>.
 +
 
 +
 
 +
===Absolute sibling cardinality===
 +
Absolute root cardinality is a property of a directed graph which is related to sibling node sets and represents the number of sibling nodes of the graph.
 +
 
 +
  <math>\sum_{j}^{SIB} N_{j \in SIB}</math>
 +
 
 +
<math>N_{j \in SIB}</math> is the cardinality of a sibling set <math>j</math> from <math>SIB</math> in the graph <math>g</math>.
 +
 
 +
 
 +
==Depth==
 +
Depth is a property of graphs which is related to cardinality of paths existing in the graph. The arcs which are considered are only subClassOf (isa) arcs but this only applies to directed graphs.
 +
 
 +
===Absolute depth===
 +
The value of the absolute depth is calculated as follows:
 +
 
 +
  <math>m = \sum_{j}^P N_{j \in P}</math>
 +
 
 +
<math>N_{j \in P}</math> is the cardinality of each path <math>j</math> from the set of paths <math>P</math> in a graph <math>g</math>.
 +
 
 +
 
 +
===Average depth===
 
The average depth states at which degree the ontology has a vertical modelling of hierarchies.  
 
The average depth states at which degree the ontology has a vertical modelling of hierarchies.  
  
As reference value, there is also shown the '''maximal depth'''.
+
  <math>m = \frac{1}{n_{P \subseteq g}} \sum_{j}^P N_{j \in P}</math>
  
 +
<math>N_{j \in P}</math> is the cardinality of each path <math>j</math> from the set of paths <math>P</math> in a graph <math>g</math> and <math>n_{P \subseteq g}</math> is the cardinality of <math>P</math>.
  
==Tangledness==
 
  
Tangledness is related to the multihierarchical nodes of a graph, where the arcs considered here are again only subClassOf arcs. This measure only applies to digraphs.
+
===Maximal depth===
The tangledness of a class tree is subject of multiple hierarchy knots of a graph. It means, that this knot has more than one incoming edge.
+
The height of a graph is its maximal depth and it is the result of the following:
 +
 
 +
  <math>m = N_{j \in P}
 +
  \forall i \exists j (N_{j \in P} \geq N_{i \in P})</math>
 +
 
 +
<math>N_{i \in P}</math> and <math>N_{j \in P}</math> is the cardinality of each path <math>j</math> from the set of paths <math>P</math> in a graph <math>g</math>.
 +
 
 +
 
 +
==Breadth==
 +
Here are again only subClassOf (isa) arcs considered to get a property called breadth which is related to the cardinality of levels, so-called generations, in a directed graph.
 +
 
 +
===Absolute breadth===
 +
The value of the absolute breadth is calculated as follows:
 +
 
 +
  <math>m = \sum_{j}^L N_{j \in L}</math>
 +
 
 +
<math>N_{j \in L}</math> is the cardinality of each level/generation <math>j</math> from the set of generations <math>L</math> in a directed graph <math>g</math>.
 +
 
 +
 
 +
===Average breadth===
 +
The average breadth states at which degree the ontology has a horizontal modelling of hierarchies.
 +
 
 +
  <math>m = \frac{1}{n_{L \subseteq g}} \sum_{j}^L N_{j \in L}</math>
 +
 
 +
<math>N_{j \in L}</math> is the cardinality of each generation <math>j</math> from the set of generations <math>L</math> in a directed graph <math>g</math> and <math>n_{L \subseteq g}</math> is the cardinality of <math>L</math>.
 +
 
 +
 
 +
===Maximal breadth===
 +
The maximal breadth of a directed graph is calculated the following way:
 +
 
 +
  <math>m = N_{j \in L}
 +
  \forall i \exists j (N_{j \in L} \geq N_{i \in L})</math>
 +
 
 +
<math>N_{i \in L}</math> and <math>N_{j \in L}</math> is the cardinality of each generation <math>j</math> from the set of generations <math>L</math> in a graph <math>g</math>.
 +
 
  
 
==Fan-outness==
 
==Fan-outness==
 +
Fan-outness is related to the “dispersion” of graph nodes. The arcs considered for this property are again subClassOf (isa) arcs. There are two different measures which can be distinguished: those which are related to leaf node set, like the following ratio of leaf fan-outness, and those which are related to sibling node sets ("internal dispersion"), like the following ratio of sibling fan-outness.
 +
 +
===Ratio of leaf fan-outness===
 +
The ratio of leaf fan-outness is the result of the absolute leaf cardinality divided by the cardinality of the graph.
 +
 +
  <math>m = \frac{n_{LEA} \subseteq g}{n_G}</math>
 +
 +
<math>n_{LEA} \subseteq g</math> is the cardinality of the set <math>LEA</math> in a directed graph <math>g</math> and <math>n_G</math> is the cardinality of <math>G</math>.
 +
 +
 +
===Ratio of sibling fan-outness===
 +
The ratio of sibling fan-outness is the result of the absolute sibling cardinality divided by the cardinality of the graph.
 +
 +
  <math>m = \frac{\sum_{j}^{SIB} N_{j \in SIB}}{n_G}</math>
 +
 +
<math>\sum_{j}^{SIB} N_{j \in SIB}</math> is the absolute sibling cardinality for the directed graph <math>g</math> and <math>n_G</math> is the cardinality of <math>G</math>.
 +
 +
 +
==Tangledness==
 +
 +
Tangledness is related to the multihierarchical nodes of a graph, where the arcs considered here are again only subClassOf (isa) arcs and the measure again only applies to directed graphs.
 +
The tangledness of a class tree is subject of multiple hierarchy knots of a graph. It means, that this knot has more than one incoming edge. This is the formula for this measurement:
 +
 +
  <math>m = \frac{n_G}{t_{\in G \wedge \exists a_1 ,a_2 (isa(m,a_1 )\wedge (isa(m,a_2)))}}</math>
 +
 +
<math>n_G</math> is the cardinality of <math>G</math> and <math>t_{\in G \wedge \exists a_1 ,a_2 (isa(m,a_1 )\wedge (isa(m,a_2)))}</math> is the cardinality of the set of nodes with more than one ingoing subClassOf (isa) arc in <math>g</math>.
 +
 +
 +
==Total number of paths==
 +
The total number of paths is the sum of distinct paths which exists in (set of) graphs where each path start at a root node and ends at a leaf node.
 +
 +
  <math>m = n_{DP \subseteq g}</math>
 +
 +
<math>n_{DP \subseteq g}</math> is the cardinality of the set <math>DP</math> in the directed graph <math>g</math>.
 +
 +
 +
==Average number of paths==
 +
The average number of paths is the quotient of the total number of distinct paths and the number of graphs.
 +
 +
  <math>m = \frac{n_{DP \subseteq g}}{j}</math>
 +
 +
<math>n_{DP \subseteq g}</math> is the cardinality of the set <math>DP</math> in the directed graphs <math>g</math> and the number of graphs <math>j</math>.
  
Fan-outness is related to the “dispersion” of graph nodes, where the arcs considered here are subClassOf arcs.
 
  
 
==Density==
 
==Density==
Line 30: Line 134:
 
To detect those areas, there are already several classifying techniques existing to be able to measure the absolute size and quantity.
 
To detect those areas, there are already several classifying techniques existing to be able to measure the absolute size and quantity.
  
==Modularity==
+
''This metric is not implemented in the ontometrics project yet.''
  
Modularity is related to the asserted modules of a graph, where the arcs considered here are either subClassOf or non-subClassOf arcs. In comparison to cohesion, the number of knots of connected components are put into proportion to the number of all graph elements. However, basically they describe equivalent metrics.
 
  
 
==Logical Adequacy==
 
==Logical Adequacy==
  
 
The logical adequacy of a graph is described by formal semantics where either directed or conceptual relations exist. Consistency ratio can be derived from it with 'nInc' of quantity cardinality from consistent classes of the graph 'g' and 'nG' of quantity cardinality from class knots of the graph 'g'.
 
The logical adequacy of a graph is described by formal semantics where either directed or conceptual relations exist. Consistency ratio can be derived from it with 'nInc' of quantity cardinality from consistent classes of the graph 'g' and 'nG' of quantity cardinality from class knots of the graph 'g'.
 +
 +
''This metric is not implemented in the ontometrics project yet.''
 +
 +
 +
==Modularity==
 +
 +
Modularity is related to the asserted modules of a graph, where the arcs considered here are either subClassOf or non-subClassOf arcs. In comparison to cohesion, the number of knots of connected components are put into proportion to the number of all graph elements. However, basically they describe equivalent metrics.
 +
 +
''This metric is not implemented in the ontometrics project yet.''
  
  
 
==Sources==
 
==Sources==
  
#''Aldo Gangemi, Carola Catenacci, Massimiliano Ciaramita, Jos Lehmann:<br /> Ontology evaluation and validation - An integrated formal model for the quality diagnostic task<br /> September 2005 <br />http://www.loa.istc.cnr.it/old/Files/OntoEval4OntoDev_Final.pdf''
+
#''Aldo Gangemi, Carola Catenacci, Massimiliano Ciaramita, Jos Lehmann:<br /> Ontology evaluation and validation - An integrated formal model for the quality diagnostic task<br /> September 2005 , pp 11-16. <br />http://www.loa.istc.cnr.it/old/Files/OntoEval4OntoDev_Final.pdf''
#''Marcel Ihlo:<br /> Entwicklung eines web-basierten Werkzeugs zur Berechnung von Ontologie-Metriken<br /> Bachelorarbeit 2014 bei der Universität Rostock, Fakultät für Informatik und Elektrotechnik''
+

Latest revision as of 23:42, 10 September 2016

Graph or structural metrics calculate the structure of ontologies.

Cardinality

Cardinality is a property of graphs which expresses a graph related number of specific elements.

Absolute root cardinality

Absolute root cardinality is a property of a directed graph which represents the number of root nodes of the graph.

 m = n_{ROO \subseteq g}

n_{ROO \subseteq g} is the cardinality of the set ROO in the directed graph g.


Absolute leaf cardinality

Absolute root cardinality is a property of a directed graph which is related to leaf node sets and represents the number of leaf nodes of the graph.

 m = n_{LEA \subseteq g}

n_{LEA \subseteq g} is the cardinality of the set LEA in the directed graph g.


Absolute sibling cardinality

Absolute root cardinality is a property of a directed graph which is related to sibling node sets and represents the number of sibling nodes of the graph.

 \sum_{j}^{SIB} N_{j \in SIB}

N_{j \in SIB} is the cardinality of a sibling set j from SIB in the graph g.


Depth

Depth is a property of graphs which is related to cardinality of paths existing in the graph. The arcs which are considered are only subClassOf (isa) arcs but this only applies to directed graphs.

Absolute depth

The value of the absolute depth is calculated as follows:

 m = \sum_{j}^P N_{j \in P}

N_{j \in P} is the cardinality of each path j from the set of paths P in a graph g.


Average depth

The average depth states at which degree the ontology has a vertical modelling of hierarchies.

 m = \frac{1}{n_{P \subseteq g}} \sum_{j}^P N_{j \in P}

N_{j \in P} is the cardinality of each path j from the set of paths P in a graph g and n_{P \subseteq g} is the cardinality of P.


Maximal depth

The height of a graph is its maximal depth and it is the result of the following:

 m = N_{j \in P}
  \forall i \exists j (N_{j \in P} \geq N_{i \in P})

N_{i \in P} and N_{j \in P} is the cardinality of each path j from the set of paths P in a graph g.


Breadth

Here are again only subClassOf (isa) arcs considered to get a property called breadth which is related to the cardinality of levels, so-called generations, in a directed graph.

Absolute breadth

The value of the absolute breadth is calculated as follows:

 m = \sum_{j}^L N_{j \in L}

N_{j \in L} is the cardinality of each level/generation j from the set of generations L in a directed graph g.


Average breadth

The average breadth states at which degree the ontology has a horizontal modelling of hierarchies.

 m = \frac{1}{n_{L \subseteq g}} \sum_{j}^L N_{j \in L}

N_{j \in L} is the cardinality of each generation j from the set of generations L in a directed graph g and n_{L \subseteq g} is the cardinality of L.


Maximal breadth

The maximal breadth of a directed graph is calculated the following way:

 m = N_{j \in L}
  \forall i \exists j (N_{j \in L} \geq N_{i \in L})

N_{i \in L} and N_{j \in L} is the cardinality of each generation j from the set of generations L in a graph g.


Fan-outness

Fan-outness is related to the “dispersion” of graph nodes. The arcs considered for this property are again subClassOf (isa) arcs. There are two different measures which can be distinguished: those which are related to leaf node set, like the following ratio of leaf fan-outness, and those which are related to sibling node sets ("internal dispersion"), like the following ratio of sibling fan-outness.

Ratio of leaf fan-outness

The ratio of leaf fan-outness is the result of the absolute leaf cardinality divided by the cardinality of the graph.

 m = \frac{n_{LEA} \subseteq g}{n_G}

n_{LEA} \subseteq g is the cardinality of the set LEA in a directed graph g and n_G is the cardinality of G.


Ratio of sibling fan-outness

The ratio of sibling fan-outness is the result of the absolute sibling cardinality divided by the cardinality of the graph.

 m = \frac{\sum_{j}^{SIB} N_{j \in SIB}}{n_G}

\sum_{j}^{SIB} N_{j \in SIB} is the absolute sibling cardinality for the directed graph g and n_G is the cardinality of G.


Tangledness

Tangledness is related to the multihierarchical nodes of a graph, where the arcs considered here are again only subClassOf (isa) arcs and the measure again only applies to directed graphs. The tangledness of a class tree is subject of multiple hierarchy knots of a graph. It means, that this knot has more than one incoming edge. This is the formula for this measurement:

 m = \frac{n_G}{t_{\in G \wedge \exists a_1 ,a_2 (isa(m,a_1 )\wedge (isa(m,a_2)))}}

n_G is the cardinality of G and t_{\in G \wedge \exists a_1 ,a_2 (isa(m,a_1 )\wedge (isa(m,a_2)))} is the cardinality of the set of nodes with more than one ingoing subClassOf (isa) arc in g.


Total number of paths

The total number of paths is the sum of distinct paths which exists in (set of) graphs where each path start at a root node and ends at a leaf node.

 m = n_{DP \subseteq g}

n_{DP \subseteq g} is the cardinality of the set DP in the directed graph g.


Average number of paths

The average number of paths is the quotient of the total number of distinct paths and the number of graphs.

 m = \frac{n_{DP \subseteq g}}{j}

n_{DP \subseteq g} is the cardinality of the set DP in the directed graphs g and the number of graphs j.


Density

Density can be defined as the presence of clusters of classes with many non-taxonomical relations holding among them. For example, so-called core ontology patterns (for thematic roles in events, contracts, diagnoses, etc.) usually constitute dense areas in an ontology. To detect those areas, there are already several classifying techniques existing to be able to measure the absolute size and quantity.

This metric is not implemented in the ontometrics project yet.


Logical Adequacy

The logical adequacy of a graph is described by formal semantics where either directed or conceptual relations exist. Consistency ratio can be derived from it with 'nInc' of quantity cardinality from consistent classes of the graph 'g' and 'nG' of quantity cardinality from class knots of the graph 'g'.

This metric is not implemented in the ontometrics project yet.


Modularity

Modularity is related to the asserted modules of a graph, where the arcs considered here are either subClassOf or non-subClassOf arcs. In comparison to cohesion, the number of knots of connected components are put into proportion to the number of all graph elements. However, basically they describe equivalent metrics.

This metric is not implemented in the ontometrics project yet.


Sources

  1. Aldo Gangemi, Carola Catenacci, Massimiliano Ciaramita, Jos Lehmann:
    Ontology evaluation and validation - An integrated formal model for the quality diagnostic task
    September 2005 , pp 11-16.
    http://www.loa.istc.cnr.it/old/Files/OntoEval4OntoDev_Final.pdf