The initial lines of a PDB entry contain information on the protein, its source, the folks who sent the entry to the PDB, some other useful references and some basic crystallographic data.
HEADER PROTEINASE INHIBITOR (TRYPSIN) 27-SEP-82 4PTI 4PTI 3
COMPND TRYPSIN INHIBITOR 4PTI 4
SOURCE BOVINE (BOS $TAURUS) PANCREAS 4PTIE 1
AUTHOR R.HUBER,D.KUKLA,A.RUEHLMANN,O.EPP,H.FORMANEK,J.DEISENHOFER, 4PTI 6
AUTHOR 2 W.STEIGEMANN 4PTI 7
REVDAT 6 16-APR-87 4PTIE 1 SOURCE REMARK 4PTIE 2
REVDAT 5 31-MAY-84 4PTID 1 REMARK 4PTID 1
REVDAT 4 23-FEB-84 4PTIC 1 JRNL 4PTIC 1
REVDAT 3 31-JAN-84 4PTIB 1 REMARK 4PTIB 1
REVDAT 2 30-SEP-83 4PTIA 1 REVDAT 4PTIA 1
REVDAT 1 18-JAN-83 4PTI 0 4PTIA 2
SPRSDE 18-JAN-83 4PTI 3PTI 4PTIA 3
JRNL AUTH M.MARQUART,J.WALTER,J.DEISENHOFER,W.BODE,R.HUBER 4PTI 8
JRNL TITL THE GEOMETRY OF THE REACTIVE SITE AND OF THE 4PTI 9
JRNL TITL 2 PEPTIDE GROUPS IN TRYPSIN, TRYPSINOGEN AND ITS 4PTI 10
JRNL TITL 3 COMPLEXES WITH INHIBITORS 4PTI 11
JRNL REF ACTA CRYSTALLOGR.,SECT.B V. 39 480 1983 4PTIC 2
JRNL REFN ASTM ASBSDK DK ISSN 0108-7681 622 4PTIC 3
REMARK 1 REFERENCE 1 4PTIE 3
REMARK 1 AUTH A.WLODAWER,J.DEISENHOFER,R.HUBER 4PTIE 4
REMARK 1 TITL COMPARISON OF TWO HIGHLY REFINED STRUCTURES OF 4PTIE 5
REMARK 1 TITL 2 BOVINE PANCREATIC TRYPSIN INHIBITOR 4PTIE 6
REMARK 1 REF J.MOL.BIOL. V. 193 145 1987 4PTIE 7
REMARK 1 REFN ASTM JMOBAK UK ISSN 0022-2836 070 4PTIE 8
REMARK 2 4PTI 56
REMARK 2 RESOLUTION. 1.5 ANGSTROMS. 4PTI 57
REMARK 3 4PTI 58
REMARK 3 REFINEMENT. J. DEISENHOFER*S VERSION OF THE JACK AND 4PTI 59
REMARK 3 LEVITT REFINEMENT PROCEDURE COMBINING CRYSTALLOGRAPHIC AND 4PTI 60
REMARK 3 ENERGY REFINEMENT. (A.JACK,M.LEVITT, ACTA CRYSTALLOGR., 4PTI 61
REMARK 3 A34, 931-935, 1978). THE R-VALUE FOR REFLECTIONS WITHIN 4PTI 62
REMARK 3 THE SHELL 1.5 TO 7.0 ANGSTROMS AND WITH 4PTI 63
REMARK 3 2*(ABS(FO)-ABS(FC))/(ABS(FO)+ABS(FC)) LESS THAN 1.2 IS 4PTI 64
REMARK 3 0.162. 4PTI 65
REMARK 4 4PTI 66
REMARK 4 COORDINATES FOR 60 WATER MOLECULES ARE GIVEN FOLLOWING THE 4PTI 67
REMARK 4 MAIN BODY OF THE PROTEIN. THE NOMENCLATURE OF THE WATER 4PTI 68
REMARK 4 MOLECULES IS THAT OF THE DEPOSITORS. 4PTI 69
REMARK 5 4PTIA 4
Following the introductory material, some specific information regarding the protein and its crystalline form is provided, including the protein sequence, secondary structure, disulfide bonds (where present) and the like. The dimensions of the unit cell, the space group and related information are also given.
SEQRES 1 58 ARG PRO ASP PHE CYS LEU GLU PRO PRO TYR THR GLY PRO 4PTI 70
SEQRES 2 58 CYS LYS ALA ARG ILE ILE ARG TYR PHE TYR ASN ALA LYS 4PTI 71
SEQRES 3 58 ALA GLY LEU CYS GLN THR PHE VAL TYR GLY GLY CYS ARG 4PTI 72
SEQRES 4 58 ALA LYS ARG ASN ASN PHE LYS SER ALA GLU ASP CYS MET 4PTI 73
SEQRES 5 58 ARG THR CYS GLY GLY ALA 4PTI 74
FORMUL 2 HOH *60(H2 O1) 4PTI 75
HELIX 1 H1 PRO 2 GLU 7 1
HELIX 2 H2 SER 47 GLY 56 1 4PTI 76
SHEET 1 S1 2 ALA 16 ALA 25 0 4PTI 77
SHEET 2 S1 2 GLY 28 GLY 36 -1 4PTI 78
SSBOND 1 CYS 5 CYS 55 4PTI 79
SSBOND 2 CYS 14 CYS 38 4PTI 80
SSBOND 3 CYS 30 CYS 51 4PTI 81
CRYST1 43.100 22.900 48.600 90.00 90.00 90.00 P 21 21 21 4 4PTI 82
ORIGX1 1.000000 0.000000 0.000000 0.00000 4PTI 83
ORIGX2 0.000000 1.000000 0.000000 0.00000 4PTI 84
ORIGX3 0.000000 0.000000 1.000000 0.00000 4PTI 85
SCALE1 .023202 0.000000 0.000000 0.00000 4PTI 86
SCALE2 0.000000 .043668 0.000000 0.00000 4PTI 87
SCALE3 0.000000 0.000000 .020576 0.00000 4PTI 88
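The CRYST1 and SCALE records are tied together by simple arithmetic, which a short Python sketch can make explicit. It assumes the usual convention that the SCALE matrix converts orthogonal coordinates in Angstroms into fractional cell coordinates, so for a cell with all angles at 90 degrees in the standard orientation its diagonal is simply 1/a, 1/b, 1/c. The function name `cell_volume` and the variable names are illustrative and not part of any PDB software.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a general unit cell (lengths in Angstroms, angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Cell edges and angles copied from the CRYST1 record of 4PTI.
a, b, c = 43.100, 22.900, 48.600
alpha = beta = gamma = 90.0

volume = cell_volume(a, b, c, alpha, beta, gamma)

# For an orthogonal cell the SCALE diagonal is the reciprocal of each edge.
scale_diagonal = [round(1.0 / edge, 6) for edge in (a, b, c)]

print(f"cell volume: {volume:.1f} cubic Angstroms")
print("SCALE diagonal:", scale_diagonal)
```

The reciprocals come out as .023202, .043668 and .020576, matching the SCALE1, SCALE2 and SCALE3 records in the entry.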
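Then come the atom coordinates, whose listing takes up most of the average PDB file. Each record begins with "ATOM" and is followed by the atom serial number, the atom name, the residue name, the residue number, the x, y and z coordinates in Angstroms, the occupancy and the temperature factor: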
ATOM 1 N ARG 1 26.465 27.452 -2.490 1.00 25.18 4PTI 89
ATOM 2 CA ARG 1 25.497 26.862 -1.573 1.00 17.63 4PTI 90
ATOM 3 C ARG 1 26.193 26.179 -.437 1.00 17.26 4PTI 91
ATOM 4 O ARG 1 27.270 25.549 -.624 1.00 21.07 4PTI 92
ATOM 5 CB ARG 1 24.583 25.804 -2.239 1.00 23.27 4PTI 93
ATOM 6 CG ARG 1 25.091 24.375 -2.409 1.00 13.42 4PTI 94
ATOM 7 CD ARG 1 24.019 23.428 -2.996 1.00 17.32 4PTI 95
ATOM 8 NE ARG 1 23.591 24.028 -4.287 1.00 17.90 4PTI 96
ATOM 9 CZ ARG 1 24.299 23.972 -5.389 1.00 19.71 4PTI 97
ATOM 10 NH1 ARG 1 25.432 23.261 -5.440 1.00 24.10 4PTI 98
ATOM 11 NH2 ARG 1 23.721 24.373 -6.467 1.00 14.01 4PTI 99
ATOM 12 N PRO 2 25.667 26.396 .708 1.00 10.92 4PTI 100
ATOM 13 CA PRO 2 26.222 25.760 1.891 1.00 9.21 4PTI 101
ATOM 14 C PRO 2 26.207 24.242 1.830 1.00 12.15 4PTI 102
ATOM 15 O PRO 2 25.400 23.576 1.139 1.00 14.46 4PTI 103
ATOM 16 CB PRO 2 25.260 26.207 3.033 1.00 13.09 4PTI 104
ATOM 17 CG PRO 2 24.512 27.428 2.493 1.00 11.42 4PTI 105
ATOM 18 CD PRO 2 24.606 27.382 .978 1.00 11.88 4PTI 106
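In an actual PDB file the ATOM fields sit in fixed character columns, so they are more safely read by position than by splitting on whitespace. The sketch below is one minimal way to do that in Python; the column ranges are those of the standard fixed-column PDB layout, while the function name `parse_atom_record` and the dictionary keys are made up for illustration. Because the spacing of a printed listing may not survive copying, the sample record (atom 2 of 4PTI) is rebuilt field by field.

```python
def parse_atom_record(line):
    """Read the main fields of an ATOM or HETATM record by column position."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "record": line[0:6].strip(),       # columns 1-6
        "serial": int(line[6:11]),         # columns 7-11
        "name": line[12:16].strip(),       # columns 13-16
        "res_name": line[17:20].strip(),   # columns 18-20
        "res_seq": int(line[22:26]),       # columns 23-26
        "x": float(line[30:38]),           # columns 31-38
        "y": float(line[38:46]),           # columns 39-46
        "z": float(line[46:54]),           # columns 47-54
        "occupancy": float(line[54:60]),   # columns 55-60
        "b_factor": float(line[60:66]),    # columns 61-66
    }


# Atom 2 of 4PTI, assembled piece by piece so the columns are unambiguous.
SAMPLE = (
    "ATOM  "      # record name, columns 1-6
    "    2"       # serial number, columns 7-11
    " "           # column 12, unused
    " CA "        # atom name, columns 13-16
    " "           # alternate location, column 17 (blank here)
    "ARG"         # residue name, columns 18-20
    "  "          # column 21 unused, column 22 chain ID (blank here)
    "   1"        # residue number, columns 23-26
    "    "        # columns 27-30 (insertion code and padding)
    "  25.497"    # x, columns 31-38
    "  26.862"    # y, columns 39-46
    "  -1.573"    # z, columns 47-54
    "  1.00"      # occupancy, columns 55-60
    " 17.63"      # temperature factor, columns 61-66
)

atom = parse_atom_record(SAMPLE)
print(atom["name"], atom["res_name"], atom["res_seq"], atom["x"], atom["y"], atom["z"])
```

Reading by column keeps working when two numeric fields run together with no space between them, which whitespace splitting cannot handle.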
This goes on for a while, until the end of the peptide chain, which is marked by the "TER" line. If there are any other molecules that cocrystallized with the protein (such as solvent molecules or ligands) they are listed as "heteroatoms" near the end of the file.
TER 455 ALA 58 4PTI 543
HETATM 456 O HOH 101 14.483 32.405 -3.949 1.00 16.73 4PTI 544
HETATM 457 O HOH 102 5.350 14.061 18.456 1.00 25.35 4PTI 545
HETATM 458 O HOH 103 18.785 30.833 -6.010 1.00 30.52 4PTI 546
HETATM 459 O HOH 104 25.258 31.756 -3.598 1.00 34.15 4PTI 547
HETATM 460 O HOH 105 23.626 30.718 1.059 1.00 35.13 4PTI 548
HETATM 461 O HOH 106 16.662 21.017 18.977 1.00 29.19 4PTI 549
HETATM 462 O HOH 107 16.177 23.996 10.526 1.00 27.91 4PTI 550
HETATM 463 O HOH 108 18.137 26.764 7.490 1.00 31.88 4PTI 551
HETATM 464 O HOH 109 20.608 29.238 6.548 1.00 36.56 4PTI 552
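The distinction between the peptide chain and the heteroatoms can be checked mechanically by tallying record names. The following Python sketch does so under the assumption that the record name occupies the first six columns and the residue name columns 18 to 20, as in the standard fixed-column layout. The function name `summarize_records` is hypothetical, and `EXCERPT` holds only a handful of records from this entry with their spacing restored; the many records in between are omitted.

```python
from collections import Counter


def summarize_records(lines):
    """Count records by name and tally heteroatom groups by residue name."""
    record_counts = Counter()
    hetero_groups = Counter()
    for line in lines:
        record = line[:6].strip()            # record name, columns 1-6
        record_counts[record] += 1
        if record == "HETATM":
            hetero_groups[line[17:20].strip()] += 1   # residue name, columns 18-20
    return record_counts, hetero_groups


# A few records from 4PTI only; this is not the complete file.
EXCERPT = [
    "ATOM      1  N   ARG     1      26.465  27.452  -2.490  1.00 25.18",
    "ATOM      2  CA  ARG     1      25.497  26.862  -1.573  1.00 17.63",
    "TER     455      ALA    58",
    "HETATM  456  O   HOH   101      14.483  32.405  -3.949  1.00 16.73",
    "HETATM  457  O   HOH   102       5.350  14.061  18.456  1.00 25.35",
    "HETATM  458  O   HOH   103      18.785  30.833  -6.010  1.00 30.52",
]

record_counts, hetero_groups = summarize_records(EXCERPT)
print(dict(record_counts))
print(dict(hetero_groups))
```

Run over the complete entry rather than this excerpt, the same tally should report 60 HOH groups if the file agrees with its own REMARK 4.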