The Protein Structure Series: Post – VII

By | February 9, 2020

The final post in this series has been published as a screencast on YouTube.

The final code used in the video is:

from Bio.PDB import PDBParser
import numpy as np
import warnings
p = PDBParser()
structure = p.get_structure('A', '1hv4.pdb')
model = structure[0]
chain = model['A']

# Access all atom coordinates
x,y,z,AtomCount = 0,0,0,0
Rcen, Rname = [],[]
for residue in chain:
    res_x,res_y,res_z,resAtomCount = 0,0,0,0
    for atom in residue:
        # Collect Stuff for Structure centre
        x += atom.coord[0]
        y += atom.coord[1]
        z += atom.coord[2]
        AtomCount += 1
        res_x += atom.coord[0]
        res_y += atom.coord[1]
        res_z += atom.coord[2]
        resAtomCount += 1
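    rX_cen = res_x/resAtomCount
    rY_cen = res_y/resAtomCount
    rZ_cen = res_z/resAtomCount
    # Store this residue's centre and name for the distance step
    Rcen.append([rX_cen, rY_cen, rZ_cen])
    Rname.append(residue.get_resname())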
x_cen = x/AtomCount
y_cen = y/AtomCount
z_cen = z/AtomCount

dist = []
for i in Rcen:
    # Euclidean distance from the residue centre to the structure centre
    dx = i[0] - x_cen
    dy = i[1] - y_cen
    dz = i[2] - z_cen
    d2 = dx**2 + dy**2 + dz**2
    d = np.sqrt(d2)
    dist.append(d)
Rname = np.asarray(Rname)
dist = np.asarray(dist)
# Get unique AA
AA = np.unique(Rname)
for i in AA:
    print(i,dist[np.where(i == Rname)[0]])
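
The script prints the full array of distances for each residue type. One possible way to condense that output is to report a single mean distance per residue type. The sketch below is not from the video: the function name `mean_distance_by_residue` is hypothetical, and the residue names and coordinates are made up for illustration rather than read from 1hv4.pdb. It assumes the residue centres have already been collected as a list of `[x, y, z]` values.

```python
import numpy as np

def mean_distance_by_residue(names, centres, structure_centre):
    """Return {residue name: mean distance of its centres from structure_centre}."""
    names = np.asarray(names)
    centres = np.asarray(centres, dtype=float)
    origin = np.asarray(structure_centre, dtype=float)
    # Distance of every residue centre from the structure centre in one step
    dist = np.linalg.norm(centres - origin, axis=1)
    # Average the distances of all residues that share a name
    return {str(aa): float(dist[names == aa].mean()) for aa in np.unique(names)}

# Made-up residue centres (NOT taken from 1hv4.pdb), for illustration only
names = ["ALA", "LEU", "ALA", "LYS"]
centres = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [3.0, 0.0, 0.0], [0.0, 0.0, 5.0]]
summary = mean_distance_by_residue(names, centres, structure_centre=[0.0, 0.0, 0.0])
for aa, mean_d in summary.items():
    print(aa, round(mean_d, 2))
```

With real data, `names` and `centres` would correspond to the `Rname` and `Rcen` lists, and `structure_centre` to `[x_cen, y_cen, z_cen]`. A mean is only one possible summary; it hides the spread within each residue type.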


Please keep in mind that the research question was phrased this way to make it easy to understand and to write code that answers it. At the end of the video I discuss better, more realistic ways to answer this question than the method used here. Again, this video is about phrasing a research question around protein structures and writing a simple program to answer it, so it should not be used for any purpose other than as an example.
