Occurrences of MLL1 morphemes in human cDNAs retrieved from GenBank


By Minou Bina, Phillip J. Wyss

Purdue University, Department of Chemistry

Supplementary material for the publication entitled “Impact of the MLL1 Morphemes on Codon Utilization and Preservation in CpG Islands.” Bina M, Wyss P. Biopolymers, in press.

Version 1.0 - published on 24 Apr 2015 - doi:10.4231/R71834DW - archived on 25 Oct 2016

Licensed under CC0 1.0 Universal


The human genome may include overlapping codes to accommodate a wide range of selective pressures imposed by various cellular requirements.  In this context, we have examined whether occurrences of MLL1 morphemes in exons of human genes may impose constraints on utilization of synonymous codons.  These morphemes have been implicated in binding MLL1, a protein that plays central roles in numerous biological processes, including the development of the body plan during embryogenesis, the regulation of transcription, and the maintenance of the cellular memory of active transcription at the onset of mitosis.

This publication provides a link to download a text file listing 9-mers collected from human coding sequences (CDSs). The file is intended for investigating the impact of MLL1 morphemes on synonymous codon utilization in exons dispersed in protein-coding genes.  Our analyses were done in the context of a parameter designated CDS Rank.  This parameter provides probability thresholds to differentiate 9-mers that could be of importance to exons from those that appear often in human genomic DNA. CDS Rank = 1 identified 9-mers that occurred equally in CDSs and in human genomic DNA; CDS Rank < 1 identified 9-mers that appeared frequently in total genomic DNA; CDS Rank > 1 identified 9-mers that seemed to reflect the sequence context of exons in human genomic DNA.
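The three CDS Rank categories described above can be sketched in a few lines of Python. This is only an illustration of the thresholds as stated here; it does not compute CDS Rank itself, whose formula is given in the associated publication and not on this page. The function name `classify_cds_rank` and the tolerance used for the equality test are assumptions made for this example.

```python
def classify_cds_rank(rank: float, tol: float = 1e-9) -> str:
    """Map a CDS Rank value to one of the three categories in the text.

    `tol` is a hypothetical tolerance for treating a floating-point
    rank as equal to 1; it is not part of the published method.
    """
    if abs(rank - 1.0) <= tol:
        return "occurs equally in CDSs and in genomic DNA"
    if rank < 1.0:
        return "appears frequently in total genomic DNA"
    return "reflects the sequence context of exons"


if __name__ == "__main__":
    for value in (0.4, 1.0, 2.5):
        print(value, "->", classify_cds_rank(value))
```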

The text file offered for download includes several columns: the 9-mer sequence; its frequency (Gi) in total human genomic DNA (build hg19); its frequency (CDSi) in human coding sequences; and its conceptual translation.  The 9-mers are listed in descending order of CDS Rank. The file does not include 9-mers whose CDS Ranks were less than 2.00.  In the listing, a * denotes a termination codon.  Each 9-mer record also gives the nucleotide sequences of the morphemes that the 9-mer contains.
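As an illustration of the conceptual translation column, the following Python sketch translates a 9-mer into three amino acids using the standard genetic code, writing * for a termination codon. It assumes the 9-mer is read in frame from its first nucleotide, which this page does not state explicitly. The function name `translate_9mer` is hypothetical and is not taken from the dataset or the publication.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_9mer(seq: str) -> str:
    """Translate a 9-nucleotide DNA string into three one-letter residues.

    Assumption: the reading frame starts at the first nucleotide.
    A '*' in the result marks a termination codon.
    """
    seq = seq.upper()
    if len(seq) != 9 or set(seq) - set(BASES):
        raise ValueError("expected 9 nucleotides drawn from A, C, G, T")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, 9, 3))


if __name__ == "__main__":
    print(translate_9mer("ATGGCGTAA"))
```

The lookup table is built from the codon ordering rather than typed out codon by codon, which keeps the sketch short; any equivalent table of the standard code would serve.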


The Purdue University Research Repository (PURR) is a university core research facility provided by the Purdue University Libraries, the Office of the Executive Vice President for Research and Partnerships, and Information Technology at Purdue (ITaP).