The aim is to develop a simple-to-use but powerful Mac text analysis program. The programming should be done with MacRuby or some other environment that hooks into Cocoa. Speed, memory management, and GUI are all important.
The software will allow the user to load large text files (20 million words, for example).
It will then produce word frequency lists.
It will allow searching: simple text searches with wildcard characters, regex searches, and tag searches.
The concordance results are displayed with the search word centred (KWIC).
Other display formats are also available.
A window showing a larger context is included above the results.
The results can be sorted according to various criteria.
The frequency of co-occurring words (collocates) is presented in a table.
Other information such as distribution of the search word is presented, as described below.
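
To make the core features above concrete, here is a minimal, language-neutral sketch in Python of a frequency list, a wildcard search, a KWIC concordance, and a collocate table. It is an illustration only, not the required implementation: the function names, the tokenizer, and the wildcard rules (`*` for any run of word characters, `?` for exactly one) are assumptions, and it keeps every token in memory, so it ignores the speed and memory requirements of a 20-million-word corpus.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of letters, digits, and apostrophes, lowercased.
WORD_CHARS = r"[A-Za-z0-9']"
TOKEN_RE = re.compile(WORD_CHARS + "+")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def frequency_list(tokens):
    """Return (word, count) pairs, most frequent first, ties alphabetical."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def wildcard_to_regex(pattern):
    """Compile a simple wildcard pattern: * = any run of word chars, ? = one."""
    parts = []
    for ch in pattern.lower():
        if ch == "*":
            parts.append(WORD_CHARS + "*")
        elif ch == "?":
            parts.append(WORD_CHARS)
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def kwic(tokens, matcher, width=4):
    """Return (left, keyword, right) triples for every token that matches."""
    lines = []
    for i, tok in enumerate(tokens):
        if matcher.fullmatch(tok):
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append((left, tok, right))
    return lines


def format_kwic(lines, column=30):
    """Render each hit as one text line with the keyword in a fixed column."""
    rendered = []
    for left, keyword, right in lines:
        left_text = " ".join(left)[-column:].rjust(column)
        right_text = " ".join(right)[:column]
        rendered.append(left_text + "  " + keyword + "  " + right_text)
    return rendered


def collocates(lines):
    """Count the words that co-occur with the keyword inside the KWIC window."""
    counts = Counter()
    for left, _keyword, right in lines:
        counts.update(left)
        counts.update(right)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    sample = "The cat sat on the mat. The cats sat on the cat's mat."
    hits = kwic(tokenize(sample), wildcard_to_regex("cat*"), width=2)
    for row in format_kwic(hits):
        print(row)
    print(collocates(hits))
```

A production version would more likely stream the corpus from disk and build an index of token positions rather than scan a token list for every search.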
## Deliverables
A Windows version of the program is in the attached file.
Here are its main elements:
FILE menu -- Load corpus files; Unload corpus; View Corpus File; Tag Settings; Language
CONCORDANCE menu -- Search (and Search Options); Advanced Search (includes Regex Search and Tag Search); Save/Print. Also checkboxes for Ignore Case, Append Search, and Sentence Mode
FREQUENCY menu -- Collocate Frequency; Frequency Options; Advanced Collocation; Copy and Save/Print commands
SORT menu -- Primary and secondary sorts based on the alphabetical order of the search term or of surrounding words
DISPLAY menu -- Highlight Collocates; Context Type; Word Wrap; Suppress (tags); Distribution; Delete and Copy commands
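
As an illustration of the primary and secondary sorts in the SORT menu, the following Python sketch orders concordance lines by the word at one position relative to the search term, then breaks ties on a second position. The `(left, keyword, right)` line layout, the function names, and the position numbering (0 for the search term, -1 for the first word to its left, +1 for the first word to its right) are assumptions made for this example, not part of the Windows program.

```python
def context_word(line, position):
    """Return the word at a position relative to the search term.

    position 0 is the search term, -1 the first word to its left,
    +1 the first word to its right. Returns "" if the context is too short.
    """
    left, keyword, right = line
    if position == 0:
        return keyword
    if position < 0:
        index = len(left) + position
        return left[index] if index >= 0 else ""
    return right[position - 1] if position <= len(right) else ""


def sort_concordance(lines, primary=1, secondary=-1):
    """Sort lines alphabetically by the primary position, then the secondary."""
    return sorted(
        lines,
        key=lambda line: (
            context_word(line, primary).lower(),
            context_word(line, secondary).lower(),
        ),
    )


if __name__ == "__main__":
    # Hypothetical concordance lines: (left context, search term, right context).
    lines = [
        (["the"], "cat", ["sat", "on"]),
        (["mat", "the"], "cats", ["sat", "on"]),
        (["on", "the"], "cat's", ["mat"]),
    ]
    for left, keyword, right in sort_concordance(lines, primary=1, secondary=0):
        print(" ".join(left), "[" + keyword + "]", " ".join(right))
```

Lines whose context is too short for the requested position sort first, because the empty string precedes every word.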