The Protein Structure Series: Post – VII

By | February 9, 2020

The final post in this series has been published as a screencast on YouTube.

The final code used in the video is:

from Bio.PDB import PDBParser
import numpy as np
import warnings

# Parse the structure and select chain A of the first model
p = PDBParser()
structure = p.get_structure('A', '1hv4.pdb')
model = structure[0]
chain = model['A']

# Access all atom coordinates
x,y,z,AtomCount = 0,0,0,0
Rcen, Rname = [],[]
for residue in chain:
    res_x,res_y,res_z,resAtomCount = 0,0,0,0
    for atom in residue:
        # Collect Stuff for Structure centre
        x += atom.coord[0]
        y += atom.coord[1]
        z += atom.coord[2]
        AtomCount += 1
        res_x += atom.coord[0]
        res_y += atom.coord[1]
        res_z += atom.coord[2]
        resAtomCount += 1
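    # Residue centre: mean position of the atoms in this residue
    rX_cen = res_x/resAtomCount
    rY_cen = res_y/resAtomCount
    rZ_cen = res_z/resAtomCount
    Rcen.append([rX_cen, rY_cen, rZ_cen])
    Rname.append(residue.get_resname())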
x_cen = x/AtomCount
y_cen = y/AtomCount
z_cen = z/AtomCount

dist = []
for i in Rcen:
    dx = i[0] - x_cen
    dy = i[1] - y_cen
    dz = i[2] - z_cen
    d2 = dx**2 + dy**2 + dz**2
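    d = np.sqrt(d2)
    dist.append(d)
# Print each residue name with its distance from the structure centre
for i in range(len(dist)):
    print(Rname[i], dist[i])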
Rname = np.asarray(Rname)
dist = np.asarray(dist)
# Get unique AA
AA = np.unique(Rname)
for i in AA:
    print(i,dist[np.where(i == Rname)[0]])
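
The same calculation can also be written with NumPy array operations instead of running sums. The sketch below is not the code from the video: it uses made-up coordinates rather than 1hv4.pdb, the function names `residue_distances` and `group_by_name` are hypothetical, and it assumes every residue has at least one atom.

```python
import numpy as np

def residue_distances(coords, residue_ids, n_residues):
    """Distance of each residue centre from the centre of all atoms.

    coords      : (n_atoms, 3) atom coordinates
    residue_ids : (n_atoms,) integer index (0..n_residues-1) of each atom's residue
    """
    coords = np.asarray(coords, dtype=float)
    residue_ids = np.asarray(residue_ids)

    # Structure centre: mean position of all atoms
    structure_centre = coords.mean(axis=0)

    # Residue centres: per-residue coordinate sums divided by atom counts
    counts = np.bincount(residue_ids, minlength=n_residues)
    sums = np.zeros((n_residues, 3))
    np.add.at(sums, residue_ids, coords)
    residue_centres = sums / counts[:, None]

    # Euclidean distance of each residue centre from the structure centre
    return np.linalg.norm(residue_centres - structure_centre, axis=1)

def group_by_name(names, dist):
    """Collect the distances belonging to each unique residue name."""
    names = np.asarray(names)
    return {str(aa): dist[names == aa] for aa in np.unique(names)}

# Toy data (an assumption for illustration): three residues, two atoms each
coords = [[1, 0, 0], [3, 0, 0],      # residue 0
          [-1, 0, 0], [-3, 0, 0],    # residue 1
          [0, 3, 0], [0, -3, 0]]     # residue 2
residue_ids = [0, 0, 1, 1, 2, 2]
names = ['GLY', 'ALA', 'GLY']

dist = residue_distances(coords, residue_ids, len(names))
grouped = group_by_name(names, dist)
for aa, d in grouped.items():
    print(aa, d)
```

With a real structure, the coordinate array and residue indices would be built from the atoms in the chain, as in the loop above.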


Please keep in mind that the research question was phrased the way it was (see the video) so that it would be easy to understand and easy to write code for. At the end of the video I discuss other, better ways in which the question could have been answered. Again, the video is about phrasing a research question about protein structures and writing a simple program to answer it. It is only an example and should not be used for any other purpose.
