Online Lectures on Bioinformatics
Physical Mapping and Sequence Assembly
Exercises
GenFrag version 2.2.1 provides a set of programs developed to fragment
and mutate DNA sequences. Because data in sequencing projects is noisy
and ambiguous, the resulting artificial data sets make it possible to vary
sequencing project parameters systematically and independently
when testing sequence assembly algorithms.
- Obtain the human beta globin cluster from GenBank at NCBI (entry HUMHBB).
- Obtain GenFrag version 2.2.1 by sending an email containing the text 'genfrag' to bioserve@t10.lanl.gov, then compile it.
- Chop up the above data using enzyme. Use:
  - random seed = 1789
  - coverage = 3, 5, 7
- Mutate the fragments. Use:
  - random seed = 1789
  - the error file from genfrag, scaled by 1, 1.5, and 2
- Download cap.c (Contig Assembly Program) and sim.c, and compile the programs.
- Reconstruct the sequence from the fragments by means of cap.
- Align the contigs with sim.
- Report:
  - number of contigs
  - for each contig: length and quality
  - Do the contigs overlap?
  - Do the contigs cover the same part of the original sequence?
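
To make the parameters in the steps above concrete, here is a minimal Python sketch of what fragmenting at a given coverage, mutating at a scaled error rate, and checking contig overlap might look like. It is not GenFrag's, cap's, or sim's actual algorithm. The function names, the mean fragment length of 400, the base error rate of 0.01, the uniform error model, and the random stand-in sequence are all assumptions made for illustration. Only the seed 1789, the coverages 3, 5, 7, and the factors 1, 1.5, 2 come from the exercise.

```python
import random

BASES = "ACGT"


def fragment(sequence, coverage, mean_len=400, seed=1789):
    """Sample random fragments until their total length reaches
    coverage * len(sequence). mean_len is an assumed parameter."""
    rng = random.Random(seed)
    target = coverage * len(sequence)
    fragments, total = [], 0
    while total < target:
        length = int(rng.gauss(mean_len, 0.1 * mean_len))
        length = max(1, min(length, len(sequence)))
        start = rng.randint(0, len(sequence) - length)
        fragments.append((start, sequence[start:start + length]))
        total += length
    return fragments


def mutate(seq, error_rate, rng):
    """With probability error_rate per base, substitute it, insert a
    random base after it, or delete it (assumed uniform error model)."""
    out = []
    for base in seq:
        if rng.random() < error_rate:
            kind = rng.choice(("substitute", "insert", "delete"))
            if kind == "substitute":
                out.append(rng.choice([b for b in BASES if b != base]))
            elif kind == "insert":
                out.append(base)
                out.append(rng.choice(BASES))
            # "delete": the base is simply dropped
        else:
            out.append(base)
    return "".join(out)


def mutate_all(fragments, error_rate, seed=1789):
    """Mutate every (start, sequence) pair using one seeded generator."""
    rng = random.Random(seed)
    return [(start, mutate(seq, error_rate, rng)) for start, seq in fragments]


def overlapping_pairs(intervals):
    """Return index pairs of half-open (start, end) intervals on the
    original sequence that share at least one position."""
    pairs = []
    for i in range(len(intervals)):
        for j in range(i + 1, len(intervals)):
            a, b = intervals[i], intervals[j]
            if a[0] < b[1] and b[0] < a[1]:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    rng = random.Random(1789)
    # Random stand-in for the HUMHBB sequence (assumption, not real data).
    reference = "".join(rng.choice(BASES) for _ in range(5000))
    base_rate = 0.01  # assumed base error rate
    for coverage in (3, 5, 7):
        frags = fragment(reference, coverage)
        for factor in (1, 1.5, 2):
            noisy = mutate_all(frags, base_rate * factor)
            print(coverage, factor, len(noisy))
    # Hypothetical contig positions, as might be read off an alignment.
    print(overlapping_pairs([(0, 900), (850, 2000), (2500, 3100)]))
```

Fixing the seed makes every run reproducible, so the nine combinations of coverage and error factor can be compared on identical fragment sets. The interval check assumes each contig has already been mapped back to start and end positions on the original sequence.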
Comments are very welcome.
luz@molgen.mpg.de