Introduction to Features
"Cleavage site" Record

Where a protein sequence is cleaved in activation or preactivation processing, an appropriate "Product" feature should be used and not a "Cleavage site". Where a protein sequence has a cleavage site for proteolytic enzymes in a digestive process, no annotation should be used. The only appropriate use of a "Cleavage site" record is for physiologically significant, proteolytic inactivation. This feature is only used with a hyphenated pair (range) of adjacent residues. The format is
"Cleavage site: " res " - " res " (" activity ")" "#status" status
Some acceptable examples are:
Cleavage site: Arg-Ser (thrombin)
Cleavage site: Gly-Ile (collagenase)
Cleavage site: His-Ser (plasmin)
Cleavage site: Phe-Leu (chymosin)
Cleavage site: Phe-Met (rennin)
Cleavage site: Pro-Ile (autolytic)
A Comment is usually appropriate to explain the biological significance of this feature.
Where a sequence is cleaved by an enzyme that is thereby inhibited by the product (a "suicide inhibitor"), the cleavage site of the inhibitor should also be annotated as an inhibitory site.
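
The format and examples above can be checked mechanically. The following is a minimal sketch only: the function name parse_cleavage_site and the regular expression are illustrative assumptions, not part of any PIR tool, and the pattern assumes three-letter residue codes and an optional trailing "#status" clause as in the examples.

```python
import re

# Assumed layout, inferred from the format line and the examples:
#   Cleavage site: Res-Res (activity) [#status status]
CLEAVAGE_RE = re.compile(
    r"^Cleavage site:\s*"
    r"(?P<res1>[A-Z][a-z]{2})\s*-\s*(?P<res2>[A-Z][a-z]{2})"  # adjacent residue pair
    r"\s*\((?P<activity>[^)]+)\)"                              # cleaving activity
    r"(?:\s*#status\s+(?P<status>\S+))?\s*$"                   # optional status
)

def parse_cleavage_site(record):
    """Return the fields of a "Cleavage site" record, or None if malformed."""
    match = CLEAVAGE_RE.match(record)
    return match.groupdict() if match else None

print(parse_cleavage_site("Cleavage site: Arg-Ser (thrombin)"))
```

A record that names only a single residue, such as "Cleavage site: Arg (thrombin)", is rejected by this sketch, which mirrors the rule that the feature applies only to a pair of adjacent residues.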

"Cross-link" Record

The "Cross-link" record should be used when two or more residues form a covalent bond, other than a cysteine disulfide, through their side chains or through an amino- or carboxyl-terminal group.
The format for an intramolecular "Cross-link" record is:
"Cross-link:" cross-link name "(" res " -" res ")" [ " (" extent ") " ] "#status" status
This should be applied only to hyphenated pairs of residues. Some current examples are:
Cross-link: (2S,3S,6R)-3-methyl-lanthionine (Cys-Thr)
Cross-link: 5-imidazolinone (Ser-Gly)
Cross-link: cysteinylhistidine (Cys-His)
Cross-link: cysteinyltyrosine (Cys-Tyr)
Cross-link: isopeptide amino end (Cys-Asn)
Cross-link: isopeptide amino end (Gly-Asn)
Cross-link: lysinoalanine (Ser-Lys)
Cross-link: lysine-topaquinone (Lys-Tyr)
Cross-link: oxazole (Cys-Ser)
Cross-link: peptide (Asn-Ser)
Cross-link: sn-(2S,6R)-lanthionine (Ser-Cys)
Cross-link: thiazole (Gly-Cys)
Cross-link: thiolester (Cys-Gln)
Cross-link: tryptophan-tryptophyl quinone (Trp-Trp)
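
Because cross-link names may themselves contain parentheses and hyphens, splitting an intramolecular record takes some care. The sketch below is illustrative only: parse_intrachain_cross_link is a hypothetical helper, and it assumes that the residue pair is the first parenthesized group holding two three-letter codes joined by a hyphen.

```python
import re

# Assumed layout, inferred from the format line and the examples:
#   Cross-link: name (Res-Res) [(extent)] [#status status]
INTRACHAIN_RE = re.compile(
    r"^Cross-link:\s*(?P<name>.+?)\s*"                             # name, matched lazily
    r"\((?P<res1>[A-Z][a-z]{2})\s*-\s*(?P<res2>[A-Z][a-z]{2})\)"  # residue pair
    r"(?:\s*\((?P<extent>[^)]+)\))?"                               # optional extent
    r"(?:\s*#status\s+(?P<status>\S+))?\s*$"                       # optional status
)

def parse_intrachain_cross_link(record):
    """Return the fields of an intramolecular "Cross-link" record, or None."""
    match = INTRACHAIN_RE.match(record)
    return match.groupdict() if match else None

fields = parse_intrachain_cross_link(
    "Cross-link: (2S,3S,6R)-3-methyl-lanthionine (Cys-Thr)"
)
print(fields["name"], fields["res1"], fields["res2"])
```

The lazy name match stops only at a parenthesized residue pair, so stereodescriptors such as "(2S,3S,6R)" stay part of the name.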
The format for the intermolecular "Cross-link" record is:
"Cross-link:"name" (" res ") (interchain [to partner ]")" ["("extent")"] "#status" status

This should be applied only to individual residues. Some current examples are:

Cross-link: desmosine (Lys) (interchain)
Cross-link: isopeptide (Gln) (interchain to ... -Lys)
Cross-link: isopeptide (Lys) (interchain to ... -Gln)
Cross-link: isopeptide carboxyl end (Gly) (interchain to ...)
Cross-link: thiolester (Cys) (interchain to ...)
Cross-link: thiolester carboxyl end (Gly) (interchain to ... -Cys)
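
The intermolecular form differs from the intramolecular one in naming a single residue and an "(interchain ...)" clause. As a rough sketch, assuming the layout shown in the examples above (the helper name parse_interchain_cross_link is hypothetical), the partner text can be extracted as follows. An elided partner such as "..." is returned verbatim and not interpreted.

```python
import re

# Assumed layout, inferred from the format line and the examples:
#   Cross-link: name (Res) (interchain [to partner]) [(extent)] [#status status]
INTERCHAIN_XL_RE = re.compile(
    r"^Cross-link:\s*(?P<name>.+?)\s*"
    r"\((?P<res>[A-Z][a-z]{2})\)\s*"                       # single residue
    r"\(interchain(?:\s+to\s+(?P<partner>[^)]+))?\)"       # optional partner
    r"(?:\s*\((?P<extent>[^)]+)\))?"                       # optional extent
    r"(?:\s*#status\s+(?P<status>\S+))?\s*$"               # optional status
)

def parse_interchain_cross_link(record):
    """Return the fields of an intermolecular "Cross-link" record, or None."""
    match = INTERCHAIN_XL_RE.match(record)
    return match.groupdict() if match else None

print(parse_interchain_cross_link("Cross-link: desmosine (Lys) (interchain)"))
print(parse_interchain_cross_link(
    "Cross-link: isopeptide (Gln) (interchain to ... -Lys)"
))
```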

[GRAY] The use of numbered partners here raises the same difficulties as in the interchain "Disulfide bonds" records, and extreme caution is urged.

This record should be applied when only the side chains of two or more identified residues are directly involved in the cross-link. If an amino- or carboxyl-terminal group is involved, both are annotated as Cross-links and the terminal features carry "amino end" or "carboxyl end" in their name.

The "Cross-link: isopeptide" is used for a side chain linked to either an amino- or a carboxyl-terminal group. The "Cross-link: peptide" is used when an amino-terminal group and a carboxyl-terminal group from different chain segments are linked.

[GRAY] A "Cross-link: cyclopeptide" may be used when the amino- and carboxyl-terminal groups of the same chain segment are linked. The use of the cyclopeptide feature is extremely dangerous, and it may be limited to entries in PIR4 and NRL_3D.

If the cross-link is secondary to the chemical modification of one or both residues (in the sense of a modified site as defined above), the participating residue may also be marked as a modified site. For example:

112-163/Cross-link: tryptophan-tryptophyl quinone (Trp-Trp)
112/Modified site: tryptophyl quinone (Trp)
in which the chemically distinct nature of the two tryptophan residues is obvious. If non-proteinaceous compounds are involved, this latter case will generally apply. If the partner of a cross-linked residue is not identified, the residue may be annotated as a covalent binding site.

[BLACK] The old "Thiolester bond" record should not be used. Instead, use the form

Cross-link: thiolester

Thiol ethers should be denoted by appropriate compound names, like lanthionine or cysteinylhistidine.

"Disulfide Bonds" Record

The format for the intramolecular "Disulfide bonds:" record is
"Disulfide bonds:" ["(in"form")"]"#status" status
This record should be applied to hyphenated pairs (ranges) of residues, and pairs with the same experimental status should be grouped into lists.
Disulfide bonds:
Disulfide bonds: (in conotoxin GI)
Features with the "in" form are in the process of being converted to the "#link" form.

Alternative bonds may be indicated within the same record using this format:

"Disulfide bonds:" ["(or "hyphenated pairs")"] "#status" status
For example,
Disulfide bonds: (or 106-121)
Disulfide bonds: (or 20-42, 41-99)
The "or" form should be avoided whenever possible.
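
Where the "or" form does occur, the alternative pairs can be recovered from the record text. This is a minimal sketch under the assumption that the alternatives are a comma-separated list of numeric hyphenated pairs, as in the two examples above. The function name alternative_pairs is illustrative.

```python
import re

def alternative_pairs(record):
    """Return the alternative disulfide pairs named in an "(or ...)" clause."""
    match = re.search(r"\(or\s+([^)]*)\)", record)
    if match is None:
        return []
    pairs = []
    for item in match.group(1).split(","):
        first, second = item.strip().split("-")
        pairs.append((int(first), int(second)))
    return pairs

print(alternative_pairs("Disulfide bonds: (or 20-42, 41-99)"))
```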

[BLACK] Disulfide bonds that would have different annotations must be placed in separate records. For example, instead of

28-44,43-95,49-122,50-88/Disulfide bonds: #status predicted (except for 49-122)
two records should appear
28-44,43-95,50-88/Disulfide bonds: #status predicted
49-122/Disulfide bonds: #status experimental
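
The splitting rule above amounts to grouping bonds by their annotation before writing records. The sketch below illustrates that grouping for status values only. The function name disulfide_records and the input shape (a list of residue pairs with a status each) are assumptions made for the example.

```python
from collections import defaultdict

def disulfide_records(bonds):
    """Group (pair, status) items into one "Disulfide bonds" record per status.

    bonds: iterable of ((first, second), status) tuples.
    """
    by_status = defaultdict(list)
    for pair, status in bonds:
        by_status[status].append(pair)
    records = []
    for status, pairs in by_status.items():
        spec = ",".join(f"{first}-{second}" for first, second in sorted(pairs))
        records.append(f"{spec}/Disulfide bonds: #status {status}")
    return records

bonds = [
    ((28, 44), "predicted"),
    ((43, 95), "predicted"),
    ((49, 122), "experimental"),
    ((50, 88), "predicted"),
]
for record in disulfide_records(bonds):
    print(record)
```

Applied to the bonds of the example above, this yields the same two records: one listing the three predicted pairs and one for the experimental pair 49-122.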

The format for the interchain "Disulfide bonds:" record is

"Disulfide bonds:" interchain ["(to "partner")"] "#status" status
Generally, this record will be applied to individual residues. Without "(to ...)" the interchain bond is assumed to be to the same residue in a dimeric partner, for example:
56/Disulfide bonds: interchain
It may be applied to lists of residues when it is thought that all the residues participate in intermolecular bonds to partners of the same sequence but the pattern of bonding is not known. Such cases will usually have the status "predicted".
56,72,98/Disulfide bonds: interchain
Where the bond is between partners of the same sequence (homopolymeric), records should be applied to both residues individually.
136/Disulfide bonds: interchain (to 133)
133/Disulfide bonds: interchain (to 136)
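
Since homopolymeric bonds need a record on each residue, a simple consistency check can flag a missing counterpart. The following sketch is illustrative only: missing_reciprocals is a hypothetical helper, and it considers only records whose partner is a plain residue number, ignoring named partners.

```python
import re

# Matches records such as "136/Disulfide bonds: interchain (to 133)".
NUMBERED_PARTNER_RE = re.compile(
    r"^(\d+)/Disulfide bonds: interchain \(to (\d+)\)$"
)

def missing_reciprocals(records):
    """Return (residue, partner) links whose reciprocal record is absent."""
    links = set()
    for record in records:
        match = NUMBERED_PARTNER_RE.match(record)
        if match:
            links.add((int(match.group(1)), int(match.group(2))))
    # For every a -> b link, a b -> a link should also be present.
    return sorted((b, a) for a, b in links if (b, a) not in links)

print(missing_reciprocals([
    "136/Disulfide bonds: interchain (to 133)",
    "133/Disulfide bonds: interchain (to 136)",
]))
```

An empty result means every numbered interchain bond has its counterpart. A lone "136/... (to 133)" record would be reported as the missing link (133, 136).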
Examples of intermolecular bonds to partners with different sequences (heteropolymeric):
Disulfide bonds: interchain (to heavy chain)
Disulfide bonds: interchain (to beta chain)
Disulfide bonds: interchain (to chain B1)
Disulfide bonds: interchain (to alpha-180)
Disulfide bonds: interchain (to gamma-34 or gamma-35)
The special case of intermolecular bonds to different partners with the same sequence may be distinguished:
Disulfide bonds: interchain (to mu chain in another subunit)

The partner should be indicated as clearly as necessary. (In NRL_3D the partner's code is used.) This is not yet permitted in PIR sections. [GRAY] The problems of checking and maintaining correct codes and numbers in references to other entries cannot be dealt with in the current database form. Any comments on alternative mechanisms for conveying this information in a manner which will allow easy checking and maintenance would be appreciated. The same problem applies to the interchain "Cross-link" records.

In some cases, alternative monomeric, dimeric or multimeric forms are known to exist. Each form should have an appropriate record with an "(in " modifier. For example,

Disulfide bonds: (in monomeric form)
Disulfide bonds: interchain (in polymeric form)
These may be replaced by appropriate "#link" records.

[GRAY] Disulfide bonds can also form with "free" cysteine or with the small peptide glutathione. These are now treated as covalent "Binding site" features. The "Disulfide bonds" feature should be regarded as a special case of a Cross-link and should only be used between encoded polypeptide sequences. The format for a transient or active-site "Disulfide bonds" record is

"Disulfide bonds: redox-active ("status")"
This record should be applied to hyphenated pairs (ranges) of residues.

