Since the project deadline is approaching, there is no time left for further or deeper analysis of the assembled genome of the ETEC p7 bacteriophage. This post summarizes the additional analyses and approaches we recommend for the further characterization of this genome.
The genes that were found with Glimmer and characterized through homology searches can be characterized further, for example by using BLASTP to compare their protein sequences with the protein sequences of other phages. This would give a sense of how similar or novel the ETEC p7 proteins are compared with those of other phage species.
Further characterization of the genome is also needed. We made a preliminary prediction of the positions of the ORFs and genes, but Glimmer and ORFfinder made somewhat different predictions about the position of each gene, and the exact positions of every gene and ORF remain to be investigated. One way of doing this would be to find the promoters, Shine-Dalgarno sequences and transcriptional terminators. Locating these elements should make it possible to assess whether the predicted gene positions are correct or need to be adjusted.
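As a rough illustration of the Shine-Dalgarno check, the sketch below scores the region just upstream of a predicted start codon against the E. coli consensus AGGAGG. The function name, the spacing window (4 to 14 bases between motif and start codon) and the forward-strand-only handling are assumptions made for illustration; this is not the procedure we used.

```python
# Minimal sketch: score the region upstream of a predicted start codon for a
# Shine-Dalgarno-like hexamer. Forward strand only; all defaults are assumptions.

SD_CONSENSUS = "AGGAGG"  # E. coli Shine-Dalgarno consensus


def best_sd_site(genome, start, min_gap=4, max_gap=14):
    """Return (position, matches) for the best SD-like hexamer upstream of `start`.

    `start` is the 0-based index of the first base of the start codon.
    A hexamer is considered only if the number of bases between its 3' end
    and the start codon lies between `min_gap` and `max_gap`.
    Returns (None, 0) if no position matches the consensus at any base.
    """
    k = len(SD_CONSENSUS)
    best = (None, 0)
    lo = max(0, start - max_gap - k)   # leftmost allowed hexamer start
    hi = start - min_gap - k           # rightmost allowed hexamer start
    for i in range(lo, hi + 1):
        window = genome[i:i + k]
        matches = sum(a == b for a, b in zip(window, SD_CONSENSUS))
        if matches > best[1]:
            best = (i, matches)
    return best
```

Running this for the alternative start codons proposed by Glimmer and ORFfinder for the same gene, and preferring the start with the better-scoring upstream motif, would be one way to use it. A genome-wide version would also need to handle the reverse strand.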
We attempted to search for promoters in the genome, but three different programs gave three completely different results. Searching for promoters turned out to be difficult and potentially time-consuming. Viral promoters are often the same as the host promoters or very closely related to them (so that the host RNA polymerase binds to the viral promoters), but they can also be specific to the virus. A first approach is therefore to search the genome for the promoter sequences of the bacterial host, allowing mismatches if no exact match is found. If the promoters are specific to the virus they become much harder to find, but one approach is to look in the untranslated regions (UTRs) of the genome.
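The mismatch-tolerant search for host-like promoters could be sketched as follows. It assumes the host promoters resemble the E. coli sigma-70 consensus (a TTGACA box and a TATAAT box separated by 15 to 19 bases); the function names and the mismatch budget are hypothetical, and only the forward strand is scanned.

```python
# Minimal sketch: find sigma-70-like promoters while tolerating mismatches.
# Consensus boxes and spacer range follow the E. coli sigma-70 promoter;
# whether ETEC p7 uses such promoters is an assumption.


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, box35="TTGACA", box10="TATAAT",
                   spacers=range(15, 20), max_mismatches=2):
    """Return a list of (position of -35 box, spacer length, total mismatches)."""
    k = len(box35)
    hits = []
    for i in range(len(seq) - k + 1):
        m35 = hamming(seq[i:i + k], box35)
        if m35 > max_mismatches:
            continue
        for spacer in spacers:
            j = i + k + spacer              # start of the candidate -10 box
            if j + len(box10) > len(seq):
                break                       # -10 box would run off the end
            total = m35 + hamming(seq[j:j + len(box10)], box10)
            if total <= max_mismatches:
                hits.append((i, spacer, total))
    return hits
```

Raising `max_mismatches` quickly increases the number of hits, most of them spurious, which may be one reason different promoter-prediction programs disagree so strongly. Restricting the search to regions upstream of predicted genes would reduce the noise.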
We also attempted to find the transcriptional terminators, but the number of predicted promoters was not even close to matching the number of predicted terminators. Much more time and effort is therefore needed to elucidate the promoter, Shine-Dalgarno and terminator sequences of this genome.
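For terminators, a very crude starting point is to look for the signature of a rho-independent terminator: a short inverted repeat (the hairpin stem) followed by a run of T's. The sketch below is a simplification with hypothetical names and thresholds. It requires a perfect stem, ignores GC content and folding energy, and scans the forward strand only, so it is no substitute for a dedicated terminator-prediction tool.

```python
# Minimal sketch: flag candidate rho-independent terminators as a perfect
# inverted repeat followed by a T-rich tail. Thresholds are assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_terminator_candidates(seq, stem=6, loops=range(3, 9), tail=6, min_t=4):
    """Return a list of (start of left stem arm, loop length) candidates."""
    hits = []
    for i in range(len(seq)):
        left = seq[i:i + stem]
        for loop in loops:
            j = i + stem + loop                      # start of the right arm
            right = seq[j:j + stem]
            downstream = seq[j + stem:j + stem + tail]
            if len(downstream) < tail:
                break                                # ran off the end
            if right == reverse_complement(left) and downstream.count("T") >= min_t:
                hits.append((i, loop))
    return hits
```

Candidates that fall just downstream of a predicted stop codon would be the most interesting ones to inspect by hand.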
The larger intergenic gaps should be investigated further for ORFs that Glimmer might have missed, for example through homology searches with BLASTX or by searching databases of unfinished microbial genomes. There is a large gap from about position 16300 to about position 17600 that could potentially hold more ORFs.
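Listing the gaps worth a closer look is straightforward once the predicted gene coordinates are available. The sketch below assumes 1-based, inclusive (start, end) coordinates and a hypothetical minimum gap size of 300 bases. The function name is ours for illustration.

```python
# Minimal sketch: report intergenic gaps of at least `min_gap` bases, given
# predicted gene coordinates (1-based, inclusive). Threshold is an assumption.


def intergenic_gaps(genes, genome_length, min_gap=300):
    """Return a list of (gap_start, gap_end) regions not covered by any gene."""
    gaps = []
    prev_end = 0                                  # end of the furthest gene so far
    for start, end in sorted(genes):
        if start - prev_end - 1 >= min_gap:
            gaps.append((prev_end + 1, start - 1))
        prev_end = max(prev_end, end)             # tolerate overlapping genes
    if genome_length - prev_end >= min_gap:
        gaps.append((prev_end + 1, genome_length))
    return gaps
```

For example, with two made-up genes at (100, 16299) and (17601, 20000) on a made-up 20000-base genome, the function reports the single gap (16300, 17600). The sequence of each reported gap could then be extracted and submitted to BLASTX.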
As ETEC p7 has a genome consisting of double-stranded DNA, it belongs to the order Caudovirales, but we have not been able to obtain any definitive information about which family it belongs to. Our best guess at the moment is that it belongs to the family Podoviridae, since some of the apparently closest relatives of ETEC p7, such as SU10 and phiEco32, are Podoviridae. For the same reason we suspect that ETEC p7 has C3 morphology. However, bacteriophages evolve fast and can acquire DNA horizontally from both other phages and their hosts, so phage genomes are mosaics, and it is not possible to rely solely on close relationships suggested by homology searches. To get a definitive answer, the structural proteins of the virion need to be studied with different types of electron microscopy, so that visual assessments can be made. Furthermore, predicting the secondary structure of the scaffolding proteins can also give clues to the morphology of the bacteriophage, as described in the paper by Mirzaei et al. Secondary structure can be predicted from protein sequences with, for example, PSIPRED and JPred.
Lastly, a phylogenetic analysis needs to be conducted. This requires knowing which phage features scientists use to build phylogenetic trees of phages. From our very basic knowledge of this, the most important features to consider appear to be the scaffolding proteins and head proteins. A study is therefore needed in which these structural proteins of ETEC p7 are compared with the corresponding structural proteins of other bacteriophages.