Statistics, Similarity Measure: To find the best match unit in the data set
Units in data set are identities that have three variables
Units may have different variables
Each variables in units are ranked by their values
The total number of variables are not limited in the data set. (In test data set, there are eight variables A, B, C, D, E, F, G and H)
For a given test sample, the solution should find the best match unit based on the weight and order of variables. For example, if the test data has C=49, E=32 and G=28, the unit #100 is the best match.
The result of this project:
1). To develop the algorithm to solve this projblem
2). A tool to verify the matching result (in one of the computer languages such as C#, java, c, etc.)
Examples of the data (sample data file is available on request):
UID,VName,Qval,QRank
100,C,49,1
100,E,32,2
100,G,29,3
101,C,48,1
101,E,32,2
101,G,29,3
102,C,48,1
102,E,30,2
102,G,29,3
103,C,47,1
103,G,29,2
103,E,28,3
104,C,46,1
104,G,30,2
104,H,26,3
105,C,46,1
105,G,30,2
105,H,25,3
106,C,45,1
106,G,30,2
106,A,25,3
107,C,41,1
107,F,38,2
107,E,37,3