Self-driven web technologist and cross-platform solutions enthusiast with a strong analytical mind. Interested in pattern recognition and in integration between web platforms and databases. PhD and Engineering degrees in chemistry / bioinformatics, with a strong publication record in peer-reviewed journals. Java, Python & PHP programmer. Keen on rock music and rock climbing.
Application and implementation of probabilistic profile-profile comparison methods for protein fold recognition
Fold recognition is a method of fold detection and protein tertiary structure prediction
applied to proteins that lack homologous sequences of known fold and structure
deposited in the Protein Data Bank. Fold recognition methods are based on the assumption
that there is a strictly limited number of different protein folds in nature, mostly as a
result of evolution and of the basic physical and chemical constraints of polypeptide chains.
Fold recognition methods are useful for protein structure prediction, evolutionary
analysis, metabolic pathways and enzymatic efficiency prediction, molecular docking
and drug design.
Currently there are about 1300 discovered and characterized protein folds in the SCOP and
CATH databases. Every newly discovered protein sequence has a significant chance of
being classified into one of those folds. Many different approaches have been proposed for
finding the correct fold for a new sequence, and it is often useful to include evolutionary
information for the query as well as for the target proteins. One way of including this
information is to compare the sequence profiles of the query and the target. Fold
recognition techniques of this kind are called profile-profile methods.
Profile-profile alignments can be calculated using a dot-product, a probabilistic model,
or stochastic or theoretical measures. This work presents applications and
implementations of probabilistic profile-profile comparison methods and the advantages of
using a probabilistic scoring function over comparable fold recognition techniques.
The purpose of this comparison is to show that probabilistic profile-profile methods may
outperform other fold recognition methods in the analysis of distantly related
proteins, and that they can be applied not only to fold recognition but also to slightly
different tasks such as gene identification, detection of domain boundaries and
modeling of complex proteins.
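To make the difference between a dot-product score and a probabilistic score concrete, here is a minimal Python sketch for a single pair of profile columns. It is an illustration only, not the scoring function from the thesis: the log-odds form used here (the logarithm of the background-weighted co-emission probability) is one common probabilistic choice and is assumed for the example. The three-letter alphabet, the frequencies and the function names are all made up.

```python
import math


def dot_product_score(p, q):
    """Plain dot product of two profile columns (residue -> probability)."""
    return sum(p[a] * q.get(a, 0.0) for a in p)


def log_odds_score(p, q, background):
    """Log of the co-emission probability, weighted by background frequencies.

    Assumed form for illustration: log(sum_a p[a] * q[a] / background[a]).
    """
    total = sum(p[a] * q.get(a, 0.0) / background[a] for a in p)
    return math.log(total) if total > 0 else float("-inf")


# Toy three-letter alphabet with illustrative, made-up frequencies.
background = {"A": 0.5, "L": 0.4, "W": 0.1}
col_common = {"A": 0.8, "L": 0.15, "W": 0.05}  # dominated by a common residue
col_rare = {"A": 0.05, "L": 0.15, "W": 0.8}    # dominated by a rare residue

# Score each column against an identical column.
dot_common = dot_product_score(col_common, col_common)
dot_rare = dot_product_score(col_rare, col_rare)
lo_common = log_odds_score(col_common, col_common, background)
lo_rare = log_odds_score(col_rare, col_rare, background)

print(f"dot product: common={dot_common:.3f} rare={dot_rare:.3f}")
print(f"log-odds:    common={lo_common:.3f} rare={lo_rare:.3f}")
```

In this toy case the dot product cannot tell the two matches apart, while the log-odds score rates agreement on the rare residue higher because such a match is less likely to occur by chance.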
The full text of my Ph.D. thesis can be downloaded here.
Structural genomics is a broad term that describes the process of determining the structural representation of the information in the human genome, and at present it is limited almost exclusively to proteins. Although in the common understanding genetic information means “genes and their encoded protein products”, thousands of human genes produce transcripts that are biologically important but do not necessarily produce proteins. Furthermore, even though the sequence of human DNA is now known, the meaning of most of it remains unknown. It is very likely that the number of genes has been greatly underestimated, mainly because current gene finders work well only for large, highly expressed, evolutionarily conserved protein-coding genes. Most of those genome elements encode RNA, of which transfer and ribosomal RNAs are the classical examples. But besides these well-known molecules there is a vast unknown world of tiny RNAs that might play a crucial …
Continued device scaling into the nanometer region has given rise to new effects that previously had negligible impact but now present greater challenges to designing successful mixed-signal silicon. Design efforts are further exacerbated by unprecedented computational resource requirements for accurate design simulation and verification. This paper presents a GPGPU-accelerated sparse linear solver for fast simulation of on-chip coupled problems using NVIDIA and ATI GPGPU accelerators on a multi-core computational cluster, and evaluates parallelization strategies from a computational perspective.