To complete this assignment, you will need to familiarize yourself with the Arabidopsis section of Entrez Genomes at NCBI or with
TAIR (The Arabidopsis Information Resource). Other organisms with extensive genome sequencing projects (e.g. yeast, mouse) have similar pages,
which we will visit during other labs in this course.
REMEMBER: Databases are like
mazes. There is always a way to get through to the answer, sometimes more than
one way, but often you will get stuck and will need to go back the way you came.
This can be frustrating, but the key is to look at the information on a page and
learn how to ignore what is superfluous and focus on what is important. How do
you do that? Practice! And refuse to ask for help at every step, no matter how
You have found a gene in Arabidopsis thaliana that is tightly linked to the copper/zinc superoxide dismutase gene CZSOD2
(or CSD2). You suspect, based on recombination data, that your clone is "south" of CZSOD2 at about 60 cM.
1) CZSOD2 is located on which chromosome?
2) Where is CZSOD2 located (in cM or Mbp)? Is it left or right of the centromere?
3) List the BAC clones in the contig in order from the BAC containing CZSOD2 to the BAC containing the
SSLP marker nga361.
4) What other molecular markers are located in the region of the chromosome contained in the above contig?
(hint: be sure that CZSOD2 and nga361 are both on the screen; there should be 15)
5) What types of markers are mi54 and m283?
6) How many clones are in the contig from the BAC containing mi54 to the BAC
link to the HapMap page that was shown to you in lecture. Click on "Data" at
the top of the page, and then click on "Generic Genome Browser", to begin this
The gene that we will be examining is HBB or the human beta-globin
Be sure that you are using the most recent release of the HapMap data (under
1) On which chromosome is the HBB gene located? Where,
approximately, is this gene on the chromosome: near a telomere (an end of the
chromosome) or more toward the centromere (the middle)?
2) Several SNPs in the region of this gene have been mapped in the populations
under study in this project. Double-click one of the red and blue bar graph
icons for the leftmost SNP to
see a table of data labeled "Frequency Report". What do G/G and A/G and
(what is an SNP)? What is the reference allele? What
is "freq"? What is the difference between Genotype frequencies and Allele
frequencies (notice that the "total count" for allele frequencies is always
twice the "total count" for genotype frequencies)?
3) If there are two
alleles for a particular SNP and one has a frequency of 0.678 in a population,
what will the frequency of the second allele be?
4) Now go back and click on
the icon for SNP rs334. What do you suppose the "n/a" means for this
5) You are genotyped for four of these SNPs and the following genotypes
rs1609812 A/G; rs7946748 G/G; rs7480526 A/A; rs10768683 C/G
Using the frequency data in the four SNP tables,
calculate the probability that you are related to the Yoruba (YRI) and then
calculate the probability that you are related to the Han Chinese (CHB). To
which group are you more likely to be related, and what evidence supports that answer?
If you sequenced your HBB gene and the HBB gene for the YRI and
the CHB, which group's HBB gene sequence would you expect to be most
similar to your own and why?
6) Now, given what you have just learned about
the HapMap data, what will be the positive impact of this project on medicine
and human genetic disorders in the future?
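
The arithmetic behind questions 2, 3 and 5 can be sketched in a few lines of Python. This is only an illustration and not part of the HapMap site: the function names, the SNP labels (snp1, snp2), the population labels (pop_X, pop_Y) and every count and frequency below are made up, so you still need to read the real numbers from the Frequency Report tables. The sketch also assumes that the SNPs are independent of one another, which is a simplification, since SNPs that lie close together in one gene are often inherited together.

```python
def genotype_frequencies(genotype_counts):
    """Turn genotype counts into genotype frequencies (each count / individuals)."""
    individuals = sum(genotype_counts.values())
    return {g: n / individuals for g, n in genotype_counts.items()}


def allele_frequencies(genotype_counts):
    """Turn genotype counts into allele frequencies.

    Each individual carries two alleles, so the allele total is
    twice the genotype total.
    """
    allele_counts = {}
    for genotype, n in genotype_counts.items():
        for allele in genotype.split("/"):  # "A/G" -> ["A", "G"]
            allele_counts[allele] = allele_counts.get(allele, 0) + n
    total_alleles = sum(allele_counts.values())
    return {a: n / total_alleles for a, n in allele_counts.items()}


def multi_snp_probability(my_genotypes, population_genotype_freqs):
    """Probability of seeing all of my genotypes in one population.

    Assumption (hypothetical, for illustration): SNPs are independent,
    so the per-SNP genotype frequencies are simply multiplied.
    """
    probability = 1.0
    for snp, genotype in my_genotypes.items():
        probability *= population_genotype_freqs[snp].get(genotype, 0.0)
    return probability


# Made-up genotype counts for one SNP in one population (60 individuals).
counts = {"A/A": 10, "A/G": 20, "G/G": 30}
print(genotype_frequencies(counts))
print(allele_frequencies(counts))  # the two allele frequencies sum to 1

# Made-up genotype frequencies for two SNPs in two hypothetical populations.
me = {"snp1": "A/G", "snp2": "G/G"}
pop_X = {"snp1": {"A/G": 0.5}, "snp2": {"G/G": 0.2}}
pop_Y = {"snp1": {"A/G": 0.1}, "snp2": {"G/G": 0.9}}
print(multi_snp_probability(me, pop_X), multi_snp_probability(me, pop_Y))
```

Comparing the two products shows which population your set of genotypes is more typical of. With the real data, use the genotype frequencies for YRI and CHB from each of the four SNP tables in place of the made-up values.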