We need a simple Perl script.
Basically this is what needs to be done:
Each line in file 1 must be matched against the lines in file 2 and inserted into file 2 immediately after the closest match.
Format of the files:
File 1 "Newdata.txt" is a list of addresses. Each line in this file contains one to four segments of text, separated by commas. Some lines contain 3 segments and others contain all 4.
Here is an example:
NEWCASTLE/UNIV NEWCASTLE,FAC MED & HLTH SCI,HLTH SERV FLIGHT 302,RAAF BASE WILLIAMTOW
Segment 1: Format is LOCATION/INSTITUTION NAME. Maximum length of this segment is 51 characters
Segment 2: Maximum length is 30 characters
Segment 3: Maximum length is 20 characters
Segment 4: Maximum length is 20 characters
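The segment format above could be checked with a small helper before any matching is attempted. This is only a sketch: the names split_segments and check_segments are made up for illustration, and it assumes that segments never contain a comma themselves.

```perl
use strict;
use warnings;

# Maximum segment lengths as stated in the description above.
my @MAX_LEN = (51, 30, 20, 20);

# Split one address line into its comma-separated segments.
# The -1 limit keeps trailing empty segments instead of dropping them.
sub split_segments {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # strip a trailing newline (Unix or Windows)
    return split /,/, $line, -1;
}

# Return a list of problems found in a line (an empty list means it looks fine).
sub check_segments {
    my @seg = split_segments($_[0]);
    my @problems;
    push @problems, "expected 1 to 4 segments, got " . scalar(@seg)
        if @seg < 1 || @seg > 4;
    for my $i (0 .. $#seg) {
        last if $i > $#MAX_LEN;
        push @problems, "segment " . ($i + 1) . " exceeds $MAX_LEN[$i] characters"
            if length($seg[$i]) > $MAX_LEN[$i];
    }
    return @problems;
}
```

Running check_segments over every line of Newdata.txt first would flag lines that do not follow the stated format, so they can be reviewed by hand rather than silently mismatched.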
This file is not sorted in any order and may be sorted if necessary.
File 2 "Database.txt"
This is quite a large file, with more than 200,000 lines sorted by different address groups. Each address group is separated by a blank line, and the first line of each group ends with a colon and a letter code of one to three characters. All the other lines are in the same format as those in Newdata.txt.
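To tell the three kinds of line in Database.txt apart, something like the following could be used. It is a sketch under assumptions: classify_line is a made-up name, the letter code is assumed to consist of ASCII letters only, and optional spaces around the code are tolerated even though the description does not mention them.

```perl
use strict;
use warnings;

# Classify one line of Database.txt as 'blank' (group separator),
# 'header' (first line of a group) or 'address' (everything else).
sub classify_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # strip a trailing newline (Unix or Windows)
    return 'blank' if $line =~ /^\s*\z/;
    # Assumption: a header ends with a colon followed by 1 to 3 letters.
    return 'header' if $line =~ /:\s*[A-Za-z]{1,3}\s*\z/;
    return 'address';
}
```

Only 'address' lines are candidates for matching; blank and header lines are simply copied through unchanged.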
This is what needs to be done in detail:
For nearly all the lines in the Newdata.txt file, there is an almost identical line in Database.txt. However, the match in the database file may contain only 19 characters (instead of 20) for segments 3 and 4.
Therefore, the Perl script should match:
segment 1 exactly as it is in both files
segment 2 exactly as it is in both files
segment 3, if there is one, up to the first 19 characters in both files
segment 4, if there is one, up to the first 19 characters in both files
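One way to implement the four rules above is to reduce every line to a comparison key, so that two lines match when their keys are equal. A minimal sketch, assuming segments never contain commas; the name match_key is hypothetical.

```perl
use strict;
use warnings;

# Build the comparison key for one address line:
# segments 1 and 2 are kept as they are, segments 3 and 4 (when present)
# are cut to their first 19 characters.
sub match_key {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # strip a trailing newline (Unix or Windows)
    my @seg = split /,/, $line, -1;
    for my $i (2, 3) {
        $seg[$i] = substr($seg[$i], 0, 19) if defined $seg[$i];
    }
    return join ',', @seg;
}
```

Because the key keeps the number of segments, a line with three segments does not match a line with four, even when the first three are identical.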
Once a match is found, the line should be inserted immediately after the closest match in Database.txt and must be removed from Newdata.txt. Lines that are not matched are left in Newdata.txt.
Also, whenever a line is inserted into Database.txt, please mark it with *, preferably at the beginning of the line, so we can check the accuracy of the script.
Please make sure that the Perl script will not alter the existing data or the order of each line in Database.txt.
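Putting the requirements together, the core of the script might look like the sketch below. It works on arrays of lines already read into memory and with line endings removed. The names match_key and merge_lines are made up for illustration, and two choices are assumptions rather than part of the description: when several database lines share the same key, the new line goes after the first of them, and several new lines matching the same database line are inserted in their Newdata.txt order.

```perl
use strict;
use warnings;

# Comparison key: segments 1 and 2 as they are, segments 3 and 4 cut to 19 characters.
sub match_key {
    my ($line) = @_;
    my @seg = split /,/, $line, -1;
    for my $i (2, 3) {
        $seg[$i] = substr($seg[$i], 0, 19) if defined $seg[$i];
    }
    return join ',', @seg;
}

# Takes two array refs (database lines, new lines) and returns two array refs:
# the database lines with matched new lines inserted, and the unmatched new lines.
sub merge_lines {
    my ($db, $new) = @_;

    # Index database lines by key; blank separator lines are skipped.
    # Assumption: the first database line with a given key wins.
    my %db_index;
    for my $i (0 .. $#$db) {
        next if $db->[$i] =~ /^\s*\z/;
        my $key = match_key($db->[$i]);
        $db_index{$key} = $i unless exists $db_index{$key};
    }

    # Queue each new line behind its match, marked with '*', or keep it as unmatched.
    my (%pending, @unmatched);
    for my $line (@$new) {
        my $key = match_key($line);
        if (exists $db_index{$key}) {
            push @{ $pending{ $db_index{$key} } }, "*$line";
        } else {
            push @unmatched, $line;
        }
    }

    # Copy the database through in its original order, adding queued lines.
    my @out;
    for my $i (0 .. $#$db) {
        push @out, $db->[$i];
        push @out, @{ $pending{$i} } if $pending{$i};
    }
    return (\@out, \@unmatched);
}
```

Existing database lines are only ever copied, never edited or reordered. A full script would read both files into arrays, call merge_lines, and write the two results to new files, so the originals remain available for checking.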
I have included the two sample files. The first 11 lines in "Newdata.txt" each have a match in "Database.txt". For example, line 1 in Newdata.txt matches line 3 in Database.txt, except for the last character in segment 3.
Dear Sir, I am an experienced Perl developer and have already completed a number of Perl jobs successfully. I have gone through your project description and can provide the solution in one day.