Closed

Perl program for bio-chemistry

For this assignment, you will create a POD-documented module. The creation of the module is Assignment 7; its documentation is Assignment 8. Your job is to write several functions and put them into a module in such a way that they can be imported by a program that wishes to use them. In addition, the documentation must be created using Perl's POD (Plain Old Documentation) markup. This is a draft version of the assignment. It will have more details in the next two days.
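As a rough sketch of what "can be imported" means in practice, the layout below uses Perl's core Exporter module. The package name DNATools is hypothetical (the assignment does not name the module), and the function bodies are deliberately left as stubs using the `...` statement, which needs Perl 5.12 or later.

```perl
use strict;
use warnings;

# In a real project this package would live in its own file, DNATools.pm
# (a hypothetical name), and that file would end with a true value: "1;".
{
    package DNATools;
    use Exporter 'import';    # borrow Exporter's import method

    # Functions a caller may request by name.
    our @EXPORT_OK = qw(gc_content poly_t cpg_islands
                        digram_frequencies atom_counts);

    # Stubs only: "..." compiles but dies with "Unimplemented" if called.
    sub gc_content         { ... }
    sub poly_t             { ... }
    sub cpg_islands        { ... }
    sub digram_frequencies { ... }
    sub atom_counts        { ... }
}

# With a separate module file the caller would write:
#     use DNATools qw(gc_content poly_t);
# Inside this single file the equivalent runtime call is:
DNATools->import(qw(gc_content poly_t));

# Only the requested names are now visible in the calling package.
print defined &gc_content  ? "gc_content imported\n"  : "gc_content missing\n";
print defined &cpg_islands ? "cpg_islands imported\n" : "cpg_islands not imported\n";
```

Listing the names in @EXPORT_OK rather than @EXPORT is one possible design choice: the caller has to ask for each function explicitly, so nothing is added to its namespace by surprise.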

Background

A DNA string or DNA strand is a finite sequence consisting of the four lowercase letters a, c, g, and t in any order. The four letters stand for the four nucleotides: adenine, cytosine, guanine, and thymine. These nucleotides are called bases. A poly-T sequence of length N is a sequence of N or more consecutive t nucleotides. The GC content of a DNA strand is the ratio of the total number of c and g nucleotides to the length of the strand. For example, the sequence 'atcgtttgga' is of length 10 and has a total of 4 c's and g's, so its GC content is 0.4. A CpG island is a c followed by a g in a DNA strand. (The p in between the C and G represents the fact that a phosphodiester bond connects them.)
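The GC content arithmetic in the example can be checked with a few lines of Perl. This is only an illustration of the definition; the variable names are made up for the sketch.

```perl
use strict;
use warnings;

my $dna = 'atcgtttgga';

# tr/// with an empty replacement list counts matches without changing $dna.
my $gc_count   = ($dna =~ tr/cg//);
my $gc_content = $gc_count / length($dna);

print "length: ", length($dna), "\n";    # prints "length: 10"
print "c + g:  $gc_count\n";             # prints "c + g:  4"
print "GC content: $gc_content\n";       # prints "GC content: 0.4"
```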

## Deliverables


Required Functions in the Module

The module should have the following functions. It is important that you give these functions the exact names given here. Failure to do so will be equivalent to not writing that function at all (since my program will search for the functions with those names). The functions should also take the parameters described here and should return exactly what is described. This means that the functions do not write anything to standard output! They simply return values.

1. A function named gc_content, which, given a DNA sequence S, returns the GC content of S.

2. A function named poly_t, which, given a DNA sequence S and a positive integer N, returns the number of poly-T sequences in S of length at least N.

3. A function named cpg_islands, which, given a DNA sequence S and two positions, j and k, returns the number of CpG islands in the sequence between positions j and k inclusive. Assume that the first nucleotide is at position 1.

4. A function named digram_frequencies, which, given a DNA sequence S, computes the frequencies of all 16 different pairwise combinations of nucleotides in the sequence. The pairwise combinations are (using uppercase for emphasis) AA, AC, AG, AT, CA, CC, CG, CT, GA, GC, GG, GT, TA, TC, TG, and TT. The function should return these frequencies in a hash whose keys are the two-letter combinations and whose values are the frequencies. Note that the frequencies are fractions between 0 and 1, and that together they should add up to 1.

5. An unrelated function, named atom_counts, which, given the pathname of a PDB file, returns a hash consisting of the number of atoms of each type found in the file. In other words, the keys in the hash are the atomic symbols like H, N, and C, and the values are the number of occurrences in the file of each atom.
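The following is one possible sketch of functions 2 to 5, not a reference solution. It rests on several assumptions the assignment text does not settle: poly_t counts maximal runs of t; cpg_islands counts only pairs lying wholly inside positions j to k; digram_frequencies counts overlapping adjacent pairs and uses lowercase keys; and atom_counts reads the element symbol from columns 77-78 of ATOM and HETATM records, as in the standard PDB format.

```perl
use strict;
use warnings;

# Assumed reading: count maximal runs of 't' that are at least $n long.
sub poly_t {
    my ($s, $n) = @_;
    my $count = 0;
    while ($s =~ /(t+)/g) {
        $count++ if length($1) >= $n;
    }
    return $count;
}

# Count "cg" pairs lying wholly within 1-based positions $j..$k inclusive.
# Assumes 1 <= $j <= $k and $j <= length($s).
sub cpg_islands {
    my ($s, $j, $k) = @_;
    my $window = substr($s, $j - 1, $k - $j + 1);
    my $count  = () = $window =~ /cg/g;    # number of matches
    return $count;
}

# Frequencies of the 16 overlapping adjacent pairs (lowercase keys assumed).
sub digram_frequencies {
    my ($s) = @_;
    my %freq;
    for my $first (qw(a c g t)) {
        $freq{"$first$_"} = 0 for qw(a c g t);
    }
    my $pairs = length($s) - 1;
    return %freq if $pairs < 1;            # no pairs: all frequencies stay 0
    $freq{ substr($s, $_, 2) }++ for 0 .. $pairs - 1;
    $_ /= $pairs for values %freq;         # turn counts into fractions
    return %freq;
}

# Tally element symbols from ATOM/HETATM records (columns 77-78 assumed).
sub atom_counts {
    my ($path) = @_;
    my %count;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        next unless $line =~ /^(?:ATOM  |HETATM)/;
        next if length($line) < 78;        # record too short to hold a symbol
        (my $element = substr($line, 76, 2)) =~ s/\s+//g;
        $count{$element}++ if length $element;
    }
    close $fh;
    return %count;
}

print poly_t('ttattttcttt', 3), "\n";      # prints 2
print cpg_islands('acgcgt', 1, 6), "\n";   # prints 2
my %freq = digram_frequencies('acgt');
printf "%.3f\n", $freq{cg};                # prints 0.333
```

Each function returns its result and prints nothing; the print statements at the end belong to the calling code only.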


Documentation

The module must contain POD markup language to produce something like man pages for the module. POD markup language is described in Chapter 9 of the textbook and will be covered in class. It is worth pointing out that the h2xs program will create modules with POD markup that must then be customized. The documentation will be assessed on its quality and thoroughness. It must be in proper English, and it must be thorough and clear enough that someone else in the class would be able to figure out how to use your module if they did not have to write the same module themselves.
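To give a feel for the markup, here is a short fragment of the kind of POD that could sit inside the module file. The module name DNATools and the wording of each section are placeholders, not part of the assignment; only gc_content is documented here.

```perl
=head1 NAME

DNATools - functions for analysing DNA strings and PDB files

=head1 SYNOPSIS

    use DNATools qw(gc_content);

    my $ratio = gc_content('atcgtttgga');    # 0.4

=head1 FUNCTIONS

=head2 gc_content($sequence)

Returns the ratio of the number of C<c> and C<g> nucleotides in
C<$sequence> to the length of C<$sequence>, as a number between 0 and 1.

=cut

1;
```

Perl skips everything between a POD command such as =head1 and the closing =cut, so the markup can be interleaved with the code it describes. The perldoc command renders the result, and podchecker reports syntax problems in the markup.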

Testing Your Module

You should write a program that thoroughly exercises your module on various inputs.
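One way to structure such a program is with the core Test::More module, sketched below for gc_content only. The inline gc_content is a stand-in so the sketch runs on its own, and returning 0 for an empty strand is an assumption; a real test script would load the functions from the module instead.

```perl
use strict;
use warnings;
use Test::More;

# Stand-in so the sketch runs on its own; a real test script would say
#     use DNATools qw(gc_content);
# where DNATools is a hypothetical module name.
sub gc_content {
    my ($s) = @_;
    return 0 unless length $s;            # assumed result for an empty strand
    return ($s =~ tr/cg//) / length $s;   # tr/// counts c and g
}

is(gc_content('atcgtttgga'), 0.4, 'example from the assignment text');
is(gc_content('cgcg'),       1,   'strand of only c and g');
is(gc_content('atat'),       0,   'strand with no c or g');
is(gc_content(''),           0,   'empty strand (assumed to give 0)');

done_testing();
```

The same pattern extends to the other functions: pick inputs whose answers can be worked out by hand, including boundary cases such as N larger than any run of t, or j equal to k.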

Skills: Engineering, MySQL, PHP, Project Management, Software Architecture, Software Testing


About the Employer:
( 13 reviews ) Sialkot, Pakistan

Project ID: #3015694