CMB

Online Lectures on Bioinformatics



Physical Mapping and Sequence Assembly


Exercises


GenFrag version 2.2.1 is a set of programs developed to fragment and mutate DNA sequences. Because data in sequencing projects are noisy and ambiguous, the resulting artificial data sets make it possible to vary sequencing project parameters systematically and independently when testing sequence assembly algorithms.
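To make the idea of fragmenting and mutating concrete, here is a minimal Python sketch. It is not GenFrag's algorithm: the function names `fragment` and `mutate`, the normally distributed fragment lengths, the `mean_len` parameter, and the uniform per-base error model are all assumptions chosen for illustration. Only the notions of a random seed, a target coverage, and an error level come from the exercise.

```python
import random

def fragment(seq, coverage, mean_len=500, seed=1789):
    """Sample random substrings of seq until the sampled bases add up to
    at least coverage * len(seq). Returns (start, fragment) pairs.
    Hypothetical model: lengths ~ Normal(mean_len, 0.1 * mean_len)."""
    rng = random.Random(seed)
    target = coverage * len(seq)
    frags, total = [], 0
    while total < target:
        length = int(rng.gauss(mean_len, 0.1 * mean_len))
        length = min(len(seq), max(1, length))      # clamp to a valid length
        start = rng.randint(0, len(seq) - length)   # inclusive bounds
        frags.append((start, seq[start:start + length]))
        total += length
    return frags

def mutate(frag, error_rate, rng):
    """Apply a substitution, insertion, or deletion at each position with
    probability error_rate (an assumed, uniform error model)."""
    out = []
    for base in frag:
        if rng.random() < error_rate:
            kind = rng.choice(("sub", "ins", "del"))
            if kind == "sub":
                out.append(rng.choice([b for b in "ACGT" if b != base]))
            elif kind == "ins":
                out.append(base)
                out.append(rng.choice("ACGT"))
            # "del": the base is dropped
        else:
            out.append(base)
    return "".join(out)
```

Calling `fragment(seq, coverage=3)` and then `mutate(f, 0.01, random.Random(1789))` on each fragment would give one noisy data set; repeating this with other coverages and error rates mimics the parameter sweep asked for in the exercise.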



  • Obtain the human beta globin cluster from GenBank at NCBI (entry HUMHBB).
  • Obtain GenFrag version 2.2.1 by sending an email containing the text 'genfrag' to bioserve@t10.lanl.gov, then compile it.
  • Chop up the above sequence using enzyme.
    Use:
    • random seed = 1789
    • coverage = 3, 5, 7
  • Mutate the fragments.
    Use:
    • random seed = 1789
    • error file from GenFrag, scaled by factors of 1, 1.5, and 2
  • Download cap.c (Contig Assembly Program) and sim.c, and compile both programs.
  • Reconstruct the sequence with cap.
  • Align the contigs with sim.
  • Report:
    • the number of contigs
    • for each contig: length and quality
    • Do the contigs overlap?
    • Do the contigs cover the same part of the original sequence?
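The report items on contig count, length, overlap, and coverage can be computed mechanically once each contig has been placed on the original sequence. The following Python sketch assumes you have already extracted, for each contig, a half-open interval `(start, end)` on the original sequence (for example by reading the coordinates off your alignments). The function name `summarize` and the input format are hypothetical; the sketch does not parse cap or sim output, and it does not address contig quality.

```python
def summarize(contigs, ref_len):
    """contigs: list of (start, end) half-open intervals on the reference.
    Returns the contig count, lengths, pairwise overlaps, and covered fraction."""
    ordered = sorted(contigs)

    # Pairwise overlaps: after sorting, s2 >= s1, so two contigs overlap
    # exactly when the later one starts before the earlier one ends.
    overlaps = []
    for i, (s1, e1) in enumerate(ordered):
        for s2, e2 in ordered[i + 1:]:
            if s2 < e1:
                overlaps.append(((s1, e1), (s2, e2), min(e1, e2) - s2))

    # Merge intervals to count reference bases covered by at least one contig.
    covered = 0
    cur_s = cur_e = None
    for s, e in ordered:
        if cur_e is None or s > cur_e:
            if cur_e is not None:
                covered += cur_e - cur_s
            cur_s, cur_e = s, e
        else:
            cur_e = max(cur_e, e)
    if cur_e is not None:
        covered += cur_e - cur_s

    return {
        "count": len(ordered),
        "lengths": [e - s for s, e in ordered],
        "overlaps": overlaps,
        "covered_fraction": covered / ref_len,
    }
```

For example, `summarize([(0, 100), (80, 200), (300, 400)], 500)` reports three contigs, one overlap of 20 bases between the first two, and a covered fraction of 0.6.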

Comments are very welcome.
luz@molgen.mpg.de