                       IUPred RELEASE NOTES
                       ====================

IUPred2A

by Balint Meszaros, Gabor Erdos and Zsuzsanna Dosztanyi


IUPred2A is supplied in source code form along with the required data files. The
program is written in python3 and requires no external libraries.

NOTE: IUPred is not compatible with python2.x versions!

TO RUN IUPred:

python3 iupred2a.py (options) (sequence file) (iupred2 type)

Or if iupred.py is executable

iupred2a.py (options) (sequence file) (iupred2 type)
  
  where sequence file is the location of a FASTA formatted sequence file
  
  iupred2 type is any of the option of

  	long
	short 
	glob
	
  for prediction of long disorder, short disorder ( e.g. missing residues in
  X-ray structures) or predicting globular domains.

  Options:
    -a: ANCHOR2 binding region prediction
    -d path: Location of the 'data' directory. Default is the location of iupred2a.py


INPUT FILE: sequence_file in fasta format. One sequence per file.

EXAMPLE RUN: 

python3 iupred2a P53_HUMAN.seq long


INTERPRETATION OF THE OUTPUT:

In the case of long and short types of disorder the output  gives the
likelihood of disorder for each residue, i.e. it is a value between 0 and 1,
and higher values indicate higher probability of disorder. Residues with values
above 0.5 can be regarded as disordered, and at this cutoff 5% of globular
proteins is expected to be predicted to disordered (false positives).
 
For the prediction type of globular domains it gives the number of globular
domains and list their start and end position in the sequence. This is followed
by the submitted sequence with residues of globular domains indicated by
uppercase letters.

SHORT SUMMARY OF THE METHOD

Intrinsically unstructured/disordered proteins have no single well-defined
tertiary structure in their native, functional state. Our server recognizes
such regions from the amino acid sequence based on the estimated pairwise
energy content. The underlying assumption is that globular proteins make a
large number of interresidue interactions, providing the stabilizing energy to
overcome the entropy loss during folding. In contrast, IUPs have special
sequences that do not have the capacity to form sufficient interresidue
interactions. Taking a set of globular proteins with known structure, we have
developed a simple formalism that allows the estimation of the pairwise
interaction energies of these proteins. It uses a quadratic expression in the
amino acid composition, which takes into account that the contribution of an
amino acid to order/disorder depends not only its own chemical type, but also
on its sequential environment, including its potential interaction partners.
Applying this calculation for IUP sequences, their estimated energies are
clearly shifted towards less favorable energies compared to globular proteins,
enabling the predicion of protein disorder on this ground. 
