Canberra stability indicator on top-k positions [Jurman08]
Compute the mean Canberra distance indicator on top-k sublists.
Input
- lists - [2D numpy array integer] position lists. Positions must be in [0, #elems-1].
- k - [integer] top-k sublists
- modules - [list] modules (list of group indices)
Output
- cd - [float] Canberra distance
>>> from numpy import *
>>> from mlpy import *
>>> lists = array([[2,4,1,3,0],  # first positions list
...                [3,4,1,2,0],  # second positions list
...                [2,4,3,0,1],  # third positions list
...                [0,1,4,2,3]]) # fourth positions list
>>> canberra(lists, 3)
1.0861983059292479
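As a rough illustration of the quantity behind this indicator, the sketch below computes an unnormalized mean pairwise Canberra distance with location parameter k+1, assuming the general formulation described in [Jurman08]. The function name mean_canberra_topk is hypothetical and not part of mlpy. Positions are shifted to 1-based ranks and no normalization is applied, so the result is not expected to match the value returned by canberra() above.

```python
from itertools import combinations


def mean_canberra_topk(lists, k):
    """Unnormalized mean pairwise Canberra distance on top-k positions.

    Hypothetical helper, not part of mlpy. `lists` holds 0-based position
    lists: lists[i][e] is the position of element e in list i.
    """
    cap = k + 1  # location parameter: ranks beyond k all count as k + 1
    # Shift to 1-based ranks and cap everything outside the top-k.
    capped = [[min(p + 1, cap) for p in row] for row in lists]
    pairs = list(combinations(capped, 2))
    total = 0.0
    for a, b in pairs:
        # Canberra distance between two capped rank vectors.
        total += sum(abs(x - y) / (x + y) for x, y in zip(a, b))
    return total / len(pairs)


lists = [[2, 4, 1, 3, 0],
         [3, 4, 1, 2, 0],
         [2, 4, 3, 0, 1],
         [0, 1, 4, 2, 3]]
print(mean_canberra_topk(lists, 3))
```

Capping ranks at k+1 means that elements outside the top-k of both lists contribute nothing to the distance, which is what restricts the comparison to the top-k sublists.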
Compute the mean Canberra distance indicator on generic lists.
Input
- lists - [2D numpy array integer] position lists. Positions must be in [-1, #elems-1], where -1 indicates features not present in the list.
- complete - [bool] complete
- normalize - [bool] normalize
Output
- cd - [float] Canberra distance
>>> from numpy import *
>>> from mlpy import *
>>> lists = array([[2,-1,1,-1,0], # first positions list
...                [3,4,1,2,0],   # second positions list
...                [2,-1,3,0,1],  # third positions list
...                [0,1,4,2,3]])  # fourth positions list
>>> canberraq(lists)
1.0628570368721744
Compute the average length of the partial lists (nm) and the corresponding normalizing factor (nf), given by 1 - a / b, where a is the exact value computed on the average length and b is the exact value computed on the whole set of features.
Input
- lists - [2D numpy array integer] position lists. Positions must be in [-1, #elems-1], where -1 indicates features not present in the list.
Output
- (nm, nf) - (float, float)
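The first of the two returned values can be illustrated with a short sketch. It assumes that the length of a partial list is the number of entries different from -1. The function name average_partial_length is hypothetical and not part of mlpy. The normalizing factor nf is not reproduced here, because the exact values a and b are not defined in this section.

```python
def average_partial_length(lists):
    """Mean number of present features per list (entries other than -1).

    Hypothetical helper, not part of mlpy.
    """
    lengths = [sum(1 for p in row if p != -1) for row in lists]
    return sum(lengths) / len(lengths)


lists = [[2, -1, 1, -1, 0],
         [3, 4, 1, 2, 0],
         [2, -1, 3, 0, 1],
         [0, 1, 4, 2, 3]]
print(average_partial_length(lists))  # list lengths are 3, 5, 4, 5
```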
Borda Count [Borda1781]
For each element, compute the number of extractions on top-k sublists and the mean position on the lists. Element ids are sorted by decreasing number of extractions; element ids with an equal number of extractions are sorted by increasing mean position.
Input
- lists - [2D numpy array integer] ranked feature-id lists. Feature-id must be in [0, #elems-1].
- k - [integer] top-k sublists
- modules - [list] modules (list of group indices)
Output
- borda - (feature-id, number of extractions, mean positions)
Example:
>>> from numpy import *
>>> from mlpy import *
>>> lists = array([[2,4,1,3,0],  # first ranked feature-id list
...                [3,4,1,2,0],  # second ranked feature-id list
...                [2,4,3,0,1],  # third ranked feature-id list
...                [0,1,4,2,3]]) # fourth ranked feature-id list
>>> borda(lists, 3)
(array([4, 1, 2, 3, 0]), array([4, 3, 2, 2, 1]), array([ 1.25 , 1.66666667, 0. , 1. , 0. ]))
- Element 4 is in the first position with 4 extractions and mean position 1.25.
- Element 1 is in the second position with 3 extractions and mean position 1.67.
- Element 2 is in the third position with 2 extractions and mean position 0.00.
- Element 3 is in the fourth position with 2 extractions and mean position 1.00.
- Element 0 is in the fifth position with 1 extraction and mean position 0.00.
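The counting and sorting rule described for this function can be sketched in plain Python. This is a minimal illustration under stated assumptions, not mlpy's implementation: the name borda_sketch is hypothetical, the mean position is taken over the top-k sublists in which an element appears, and elements never extracted in any top-k sublist are simply omitted.

```python
def borda_sketch(lists, k):
    """Borda-style ordering on top-k sublists.

    Hypothetical helper, not part of mlpy. `lists` holds ranked feature-id
    lists: lists[i][pos] is the id ranked at 0-based position pos in list i.
    Returns (feature-id, number of extractions, mean position) tuples.
    """
    positions = {}  # feature id -> positions at which it was extracted
    for row in lists:
        for pos, feature in enumerate(row[:k]):
            positions.setdefault(feature, []).append(pos)
    stats = [(f, len(p), sum(p) / len(p)) for f, p in positions.items()]
    # Decreasing number of extractions, ties broken by increasing mean position.
    stats.sort(key=lambda s: (-s[1], s[2]))
    return stats


lists = [[2, 4, 1, 3, 0],
         [3, 4, 1, 2, 0],
         [2, 4, 3, 0, 1],
         [0, 1, 4, 2, 3]]
for feature, count, mean_pos in borda_sketch(lists, 3):
    print(feature, count, round(mean_pos, 2))
```

On the example lists this ordering is 4, 1, 2, 3, 0, in agreement with the output shown above. Elements 2 and 3 both have two extractions, and element 2 comes first because its mean position (0.00) is lower than that of element 3 (1.00).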
[Jurman08] G Jurman, S Merler, A Barla, S Paoli, A Galea, and C Furlanello. Algebraic stability indicators for ranked lists in molecular profiling. Bioinformatics, 24(2):258-264, 2008.
[Borda1781] J C Borda. Mémoire sur les élections au scrutin. Histoire de l’Académie Royale des Sciences, 1781.