I need a program that finds repeated matches of sequence y in sequence x. It should use the "Repeated Matches" algorithm from biological sequence analysis. The program must be written in Perl, kept simple, and well commented.
The user provides a scoring matrix (a file), a threshold value T (an int), a gap penalty d (also an int), and two sequences x and y (strings). The program takes the two strings and computes a matrix F using the scoring matrix and the following formulas, where n is the length of x and m is the length of y:
A) F(i, 0) = max of (1) F(i-1, 0)
                    (2) F(i-1, j) - T,  for j = 1, ..., m
B) F(i, j) = max of (1) F(i, 0)
                    (2) F(i-1, j-1) + s(xi, yj)
                    (3) F(i-1, j) - d
                    (4) F(i, j-1) - d
Here xi is the i-th character of x, yj is the j-th character of y, and the score s(xi, yj) for aligning them is looked up in the scoring matrix.
The method finds one or more non-overlapping copies of sections of one sequence in the other.
Start by initializing F(0, 0) = 0, then fill the matrix using the recurrence relations given above. Equation A handles unmatched regions and the ends of matches, and only allows a match to end when it has a score of at least T. Equation B handles the starts of matches and their extensions. The total score of all the matches is obtained by adding one extra cell to the matrix, F(n+1, 0), computed with equation A. This score has T subtracted for each match. If there were no matches with a score greater than T, it will be 0, obtained by repeated application of the first option in equation A.
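To make the description above concrete, here is a rough Perl sketch of the matrix fill as I understand it. It is only an illustration under stated assumptions, not the attached Java code and not the finished deliverable: the name fill_matrix, the hash-of-hashes layout for the scoring matrix, and the choice F(0, j) = 0 for the whole top row are my own assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fill the repeated-matches matrix F for sequences $x (length n) and $y (length m).
# $score is assumed to be a hash of hashes: $score->{$a}{$b} holds s(a, b).
# $T is the threshold and $d is the gap penalty.
# Returns a reference to F with rows 0..n+1; row n+1 holds only column 0.
sub fill_matrix {
    my ($x, $y, $score, $T, $d) = @_;
    my @x = split //, $x;
    my @y = split //, $y;
    my ($n, $m) = (scalar @x, scalar @y);
    my @F;

    # Assumption: the whole top row starts at 0, not just F(0, 0).
    $F[0][$_] = 0 for 0 .. $m;

    for my $i (1 .. $n + 1) {
        # Equation A: stay unmatched, or end a match from the previous row
        # and pay the threshold T.
        my $best = $F[$i - 1][0];
        for my $j (1 .. $m) {
            my $end = $F[$i - 1][$j] - $T;
            $best = $end if $end > $best;
        }
        $F[$i][0] = $best;

        last if $i > $n;    # the extra row n+1 needs only F(n+1, 0)

        # Equation B: start a match, extend it, or open a gap.
        for my $j (1 .. $m) {
            my @options = (
                $F[$i][0],
                $F[$i - 1][$j - 1] + $score->{ $x[$i - 1] }{ $y[$j - 1] },
                $F[$i - 1][$j] - $d,
                $F[$i][$j - 1] - $d,
            );
            my $max = $options[0];
            for my $o (@options) { $max = $o if $o > $max }
            $F[$i][$j] = $max;
        }
    }
    return \@F;
}

# Toy example: match = 2, mismatch = -2, T = 1, d = 1.
my %toy = (A => { A => 2, C => -2 }, C => { A => -2, C => 2 });
my $F = fill_matrix("ACA", "A", \%toy, 1, 1);
print "Total score F(n+1, 0): $F->[4][0]\n";    # prints 2
```

In the toy example, y = "A" matches x = "ACA" twice with a score of 2 each, and T = 1 is subtracted per match, giving a total of 2. Reading the scoring matrix from the file and tracing back the individual matches are not shown here.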
I can provide more information if needed. I have attached the Java implementation of this program; however, I need it done in Perl. I would like to receive feedback on a regular basis regarding the progress of the project. The deadline is October 6th, 5:00 PM EDT.
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Installation package that will install the software (in ready-to-run condition) on the platform(s) specified in this bid request.
3) Exclusive and complete copyrights to all work purchased. (No GPL, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site).
It needs to run on a Unix platform. The program needs to be in Perl (simple and easily readable) with good comments.