How many protein-coding genes are there in the Saccharomyces cerevisiae genome?
Mackiewicz P, Kowalczuk M, Mackiewicz D, Nowicka A, Dudkiewicz M, Laszkiewicz A, Dudek MR, Cebrat S
Yeast (2002)
We have compared the estimates of the total number of protein-coding genes in the Saccharomyces cerevisiae genome that have been obtained by many laboratories since the yeast genome sequence was published in 1996. We propose that there are 5300-5400 genes in the genome. This confirms that the first estimate of the number of intronless ORFs longer than 100 codons, which was based on the features of the set of genes with phenotypes known in 1997, was correct. That estimate assumed that the set of the first 2300 genes with known phenotypes was representative of the whole set of protein-coding genes in the genome. The same method, used in this paper to approximate the total number of protein-coding sequences among more than 40 000 ORFs longer than 20 codons, gives a result that is only slightly higher. This suggests that the databases still contain some non-coding ORFs, and that there are a few dozen small ORFs, not yet annotated, which probably code for proteins.
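
To make the two ORF sets mentioned above concrete (intronless ORFs longer than 100 codons, and ORFs longer than 20 codons), the following is a minimal sketch of how candidate ORFs above a length threshold could be enumerated in a DNA sequence. It is an illustration only, not the authors' procedure: the function names `reverse_complement` and `find_orfs` are hypothetical, and the ORF definition used here (first in-frame ATG up to the next in-frame stop codon, on both strands, ignoring introns) is an assumption. The sketch only lists candidates; it does not estimate which of them actually code for proteins.

```python
# Illustrative sketch: enumerate ATG-to-stop ORFs above a codon-length threshold.
# Assumed ORF definition (not taken from the paper): the first in-frame ATG
# after the previous stop codon, extended to the next in-frame stop codon.

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons):
    """List (strand, start, end) for ORFs longer than min_codons codons.

    The stop codon is not counted in the length. Coordinates are 0-based,
    end-exclusive (the end includes the stop codon), and for the "-" strand
    they refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the current candidate start codon
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons > min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


# Toy example: a 6-codon ORF (ATG plus five GCT codons) followed by a stop.
toy = "ATG" + "GCT" * 5 + "TAA"
print(find_orfs(toy, 5))  # [('+', 0, 21)]
print(find_orfs(toy, 6))  # []
```

Under these assumptions, calling `find_orfs(chromosome, 100)` and `find_orfs(chromosome, 20)` on a chromosome sequence would give the two kinds of candidate sets whose coding fraction the abstract discusses.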