Write a program for Windows. The program will perform multiple searches on [url removed, login to view], looking for certain items using criteria from an existing data file. It will extract data from the search results and save the relevant information to a data file.
**Additional information for “Amazon Search Program”**
**Background information:** [url removed, login to view] allows users to search for any particular book title. Usually an Amazon book search returns 12 titles at a time. Clicking the “next” button brings up subsequent pages with lists of the next 12 titles.
To give an example, at the time of this writing, if you go to [url removed, login to view] and search for “history of the usa”, this returns a list of 70,342 books, displayed 12 at a time per screen. Let’s say that the specific book the program is searching for is titled “**Money and Power: The History of Business**” and has an identification code of 0471216526 (note: the identification code is properly called an ISBN number, and we will refer to it in that manner from here on). That particular book is ranked number 30 on the list; in other words, it is the 30th ranked book out of a total of 70,342 books for the search “history of the usa” on amazon.com.
The rank at which a particular book appears on the list for any given search changes frequently. Different search terms will produce different results. Even the same search terms will produce different results if ordered differently.
**Description**: The program will initiate a search by reading the search parameters from a data file and sending the search query to amazon.com. The query sent must mimic the form it would have had if sent from a browser.
The program will be able to determine when the correct book appears because the web page’s source code for that page would contain the following (continuing the example mentioned above):
In this case, the ten-digit ISBN code 0471216526 seen above is what the program would have been scanning for within the search results, not the book’s title.
Using our previous example, the program would have sent Amazon a search for “history of the usa”, and would scan each resulting list of books for the ISBN 0471216526.
Continuing with this example, upon finding the book on its corresponding page, the program would record in a text file (specifically, a comma-delimited CSV file) the following: **Money and Power: The History of Business**,0471216526,history of the usa,30,70342 which signify the book’s name, ISBN product code, search criteria, search rank, and number of titles returned by the search (in the case above, the book ranked 30 out of 70,342 results). If need be, the programmer may add identifying code numbers at the beginning or end of each data line. The purpose of the code numbers, if used, would be to help the program read the data back in an orderly fashion.
The book’s rank, in this example 30, appears within the HTML on the line containing the book’s ISBN number, as noted previously: within the string ref=sr_1_30, the number 30 refers to the book’s current rank.
The number of titles returned in the search is listed earlier in the source code of the page and appears in the following manner:
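As an illustration of the rank extraction described above, here is a minimal Python sketch. It assumes the ISBN and the ref=sr_1_ string sit in the same link, with the rank number directly after the final underscore. The sample link and the function name `find_rank` are hypothetical, since the actual HTML line is not reproduced in this description.

```python
import re
from typing import Optional


def find_rank(page_html: str, isbn: str) -> Optional[int]:
    """Return the number after 'ref=sr_1_' that follows the ISBN, or None.

    Assumption: the ISBN and the ref=sr_1_<rank> string appear in the same
    URL, with no whitespace, quote, or angle bracket between them.
    """
    pattern = re.compile(re.escape(isbn) + r"[^\s\"<>]*?ref=sr_1_(\d+)")
    match = pattern.search(page_html)
    return int(match.group(1)) if match else None


# Hypothetical link, made up for this example only.
sample = '<a href="/Money-Power/dp/0471216526/ref=sr_1_30">Money and Power</a>'
print(find_rank(sample, "0471216526"))  # -> 30
print(find_rank(sample, "1234567890"))  # -> None
```

If the real page places the ISBN and the rank string differently, only the regular expression would need to change.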
<div class="img_header hdr"><div id="resultCount" class="resultCount">Showing 25 - 36 of 70,343 Results</div>
and the program must extract the total number of results for that particular search (70,343 in the line above). The program will use this number to determine how deep to search (at most, until the total number of results is reached).
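One possible way to pull the total out of the resultCount line shown above is a regular expression, sketched here in Python. The function name `parse_result_count` is an assumption made for illustration, and the pattern only covers the "Showing X - Y of Z Results" wording quoted in this description.

```python
import re
from typing import Optional, Tuple

# Matches text such as "Showing 25 - 36 of 70,343 Results".
RESULT_COUNT = re.compile(
    r"Showing\s+([\d,]+)\s*-\s*([\d,]+)\s+of\s+([\d,]+)\s+Results"
)


def parse_result_count(page_html: str) -> Optional[Tuple[int, int, int]]:
    """Return (first_shown, last_shown, total_results), or None if absent."""
    match = RESULT_COUNT.search(page_html)
    if match is None:
        return None
    # Strip thousands separators before converting to integers.
    first, last, total = (int(g.replace(",", "")) for g in match.groups())
    return first, last, total


line = ('<div class="img_header hdr"><div id="resultCount" '
        'class="resultCount">Showing 25 - 36 of 70,343 Results</div>')
print(parse_result_count(line))  # -> (25, 36, 70343)
```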
NOTE: It is possible that a particular search will not return the searched-for book anywhere in the results. Such a result would be recorded in the text data file by assigning the book a rank of 0 (zero).
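The data line described above, including the rank of 0 for a book that was not found, could be produced along the following lines. This is only a sketch: the function name `format_result_row` is hypothetical, and Python's standard csv module is used so that a title containing a comma is quoted rather than splitting into extra columns.

```python
import csv
import io
from typing import Optional


def format_result_row(title: str, isbn: str, query: str,
                      rank: Optional[int], total_results: int) -> str:
    """Build one CSV line: title, ISBN, search criteria, rank, total results.

    A rank of None (book not found) is recorded as 0.
    The ISBN is kept as text so a leading zero is preserved.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([title, isbn, query, rank or 0, total_results])
    return buffer.getvalue()


found = format_result_row("Money and Power: The History of Business",
                          "0471216526", "history of the usa", 30, 70342)
missing = format_result_row("Money and Power: The History of Business",
                            "0471216526", "history of the usa", None, 70342)
print(found, end="")    # ends with ...,history of the usa,30,70342
print(missing, end="")  # ends with ...,history of the usa,0,70342
```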
Additionally, when the desired title appears in the search list, besides recording its rank and other previously mentioned data in the text file, the program will select the title’s link and allow the resulting page to display in the web browser for the amount of time set in the program’s options panel (as explained below).
Once written to the text file and after the page’s link has been sent to the browser for the time specified in the program’s options (as explained below), the program will perform the same series of actions on the subsequent search parameters, repeating these actions and appending all results to the same file until all search requests have been performed. Typically, there will be several hundred searches which must be performed.
The final result of running the program will be the creation of a comma delimited text file in CSV format with a sequential listing of all searches and results.
The final resulting text file after running the program might have hundreds of titles and search criteria all recorded one after the other.
On subsequent executions, the program will sequentially read all data from the latest version of the previously generated text file, located in the path entered in the program’s options, and will write a new text file, time-stamped with the current date, containing the latest results. From the latest version of the text file, the program will extract the information and search criteria required to perform the searches again.
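A small sketch of the file handling described above follows. The naming scheme (a `results_` prefix followed by a date and time) is purely an assumption for illustration; the description only requires that each file be time-stamped and that the latest one can be identified. With this scheme, sorting the names alphabetically also sorts them chronologically.

```python
from datetime import datetime
from typing import Iterable, Optional

PREFIX = "results_"  # hypothetical file name prefix


def new_results_name(now: datetime) -> str:
    """Name for a new results file, e.g. results_20240102_030405.csv."""
    return f"{PREFIX}{now:%Y%m%d_%H%M%S}.csv"


def latest_results_name(file_names: Iterable[str]) -> Optional[str]:
    """Pick the most recent results file from a directory listing."""
    matching = sorted(
        name for name in file_names
        if name.startswith(PREFIX) and name.endswith(".csv")
    )
    return matching[-1] if matching else None


listing = ["results_20240101_120000.csv", "notes.txt",
           "results_20240102_030405.csv"]
print(latest_results_name(listing))  # -> results_20240102_030405.csv
print(new_results_name(datetime(2024, 1, 3, 9, 30, 0)))
```

In practice the listing would come from the data path set in the options, for example via `os.listdir`.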
To be more specific, the program will search for a given title from a list of titles supplied to the program in a specifically formatted text file. If the title does not appear in the first set of results (based on reading the HTML source code for the page and determining whether a specific ISBN appears in the results), the program will call for the next screen, continuing in this manner until the desired title (as determined by its ISBN number) finally appears. At that time, the program will append the title’s name, rank, and other data, as previously explained, to a text file; if the text file does not exist, the program will create one with a specific name based on the time and date.
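The page-by-page scan described above can be sketched as a simple loop. In this Python sketch, `fetch_page` is a hypothetical callable that returns the HTML source of a given results screen; how that HTML is actually retrieved is left out. The `delay_seconds` parameter stands in for the "time between requests" option.

```python
import time
from typing import Callable, Optional


def search_pages(fetch_page: Callable[[int], str], isbn: str,
                 max_pages: int, delay_seconds: float = 0.0) -> Optional[int]:
    """Return the number of the first screen whose HTML contains the ISBN.

    Returns None if the ISBN is not found within max_pages screens.
    """
    for page_number in range(1, max_pages + 1):
        if isbn in fetch_page(page_number):
            return page_number
        if page_number < max_pages:
            time.sleep(delay_seconds)  # pause before the next request
    return None


# Stand-in for real result screens, used only to exercise the loop.
fake_pages = {1: "<html>other books</html>",
              2: "<html>more books</html>",
              3: "<html>... 0471216526 ...</html>"}


def fake_fetch(page_number: int) -> str:
    return fake_pages.get(page_number, "")


print(search_pages(fake_fetch, "0471216526", max_pages=5))  # -> 3
print(search_pages(fake_fetch, "0471216526", max_pages=2))  # -> None
```

Passing the fetch step in as a function keeps the loop easy to test without contacting any website.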
The first line of the text file should list the IP address from which the searches are taking place.
**OPTIONS**: The program will have an options screen which will allow for the setting of the following variables:
1] Time between requests (i.e. time before generating a “next page” command), with tenth of a second precision.
2] Time the title screen is left displayed before moving to the next title.
3] Maximum number of times per 24 hour time period the program will run automatically at random intervals if left on.
4] Depth: how far should the program search for any given title (i.e. 1,000 titles, 500 titles, etc.) [note 1: the program can easily convert this number into screens searched by dividing by 12, as 12 books appear per search screen.] [note 2: as noted earlier in the description, the program will be recording the total number of results for each search as well as the specific rank for the title. The number of total results will also be used by the program as an automatic maximum depth of search. For example, if 316 books are returned for a given search and the default maximum search depth was set to 500, the program will adjust and stop this particular search after 316 results.]
5] Path location to data files.
6] Select searching only within books or all products on Amazon
Program will display the options screen on start up with default options preselected. The default options can be saved for future use.
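The depth calculation from option 4 can be illustrated with a short Python sketch. The function names are hypothetical. The description says to divide by 12; rounding up is an assumption added here so that a partially filled last screen is still searched.

```python
import math

RESULTS_PER_PAGE = 12  # 12 books appear per search screen


def effective_depth(depth_setting: int, total_results: int) -> int:
    """Depth actually searched: the smaller of the option and the total."""
    return min(depth_setting, total_results)


def pages_to_search(depth_setting: int, total_results: int) -> int:
    """Number of result screens needed to cover the effective depth."""
    depth = effective_depth(depth_setting, total_results)
    return math.ceil(depth / RESULTS_PER_PAGE)


print(effective_depth(500, 316))    # -> 316
print(pages_to_search(500, 316))    # -> 27
print(pages_to_search(500, 70342))  # -> 42
```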
It is possible to search on Amazon in a general manner (within all of Amazon’s products) or to specify to search only within books listed.
The command format for searching on Amazon within all of Amazon’s products is as follows:
<[url removed, login to view]+PARAMETERS+GO+HERE>
replacing the words “SEARCH+PARAMETERS+GO+HERE” with the actual title or search query being searched for, using the “+” sign to separate each word.
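To illustrate the “+” separated form, here is a minimal Python sketch that turns a search phrase into the text that would replace “SEARCH+PARAMETERS+GO+HERE”. The function name `encode_query` is hypothetical. Percent-encoding of characters such as “&” is an assumption beyond what is described here, included because browsers encode such characters in query strings.

```python
from urllib.parse import quote_plus


def encode_query(search_text: str) -> str:
    """Join the words of a search phrase with '+' signs.

    Runs of whitespace are collapsed first; characters that are not safe
    in a URL are percent-encoded by quote_plus.
    """
    return quote_plus(" ".join(search_text.split()))


print(encode_query("history of the usa"))   # -> history+of+the+usa
print(encode_query("  money   & power "))   # -> money+%26+power
```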
The command format for searching on Amazon only within its database of books is as follows:
[[url removed, login to view]+PARAMETERS+GO+HERE]
replacing the words “SEARCH+PARAMETERS+GO+HERE” with the actual search query being searched for, using the “+” sign to separate each word. As mentioned previously, the program will detect whether the 10-digit product code appears in the search results, and if not, will execute a “next page” command until the desired item comes up, unless doing so would exceed the maximum search attempts per title as specified by the user in the program’s opening options.
All interaction with Amazon must be submitted as standard HTML so as to mimic the actions and format of a common user’s activity with an Internet browser.