The final post in this series has been published as a screencast on YouTube.
The final code used in the video is:
from Bio.PDB import *
import numpy as np
import warnings
warnings.filterwarnings('ignore')

p = PDBParser()
structure = p.get_structure('A', '1hv4.pdb')
model = structure[0]
chain = model['A']

# Access all atom coordinates
x,y,z,AtomCount = 0,0,0,0
Rcen, Rname = [],[]
for residue in chain:
    res_x,res_y,res_z,resAtomCount = 0,0,0,0
    for atom in residue:
        # Collect Stuff for Structure centre
        x += atom.coord[0]
        y += atom.coord[1]
        z += atom.coord[2]
        AtomCount += 1
        #
        res_x += atom.coord[0]
        res_y += atom.coord[1]
        res_z += atom.coord[2]
        resAtomCount += 1
    rX_cen = res_x/resAtomCount
    rY_cen = res_y/resAtomCount
    rZ_cen = res_z/resAtomCount
    Rcen.append([rX_cen,rY_cen,rZ_cen])
    Rname.append(residue.get_resname())

x_cen = x/AtomCount
y_cen = y/AtomCount
z_cen = z/AtomCount
#print(x_cen,y_cen,z_cen)

dist = []
for i in Rcen:
    dx = i[0] - x_cen
    dy = i[1] - y_cen
    dz = i[2] - z_cen
    d2 = dx**2 + dy**2 + dz**2
    d = np.sqrt(d2)
    dist.append(d)

for i in range(len(dist)):
    print(Rname[i],dist[i])

Rname = np.asarray(Rname)
dist = np.asarray(dist)

# Get unique AA
AA = np.unique(Rname)
for i in AA:
    print(i,dist[np.where(i == Rname)[0]])
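The same calculation (centre of all atoms, centre of each residue, distance between the two, grouped by residue name) can also be written with NumPy array operations instead of running sums. The following is only a minimal sketch and not the code from the video: the function names `residue_distances` and `group_by_name` are made up for illustration, and the toy coordinates stand in for the `atom.coord` values and `residue.get_resname()` names that would be collected while looping over `chain`.

```python
import numpy as np


def residue_distances(atom_coords, atom_residue_index):
    """Distance of each residue centre from the centre of all atoms.

    atom_coords        : (n_atoms, 3) array of x, y, z coordinates
    atom_residue_index : (n_atoms,) integers, 0..n_residues-1, giving the
                         residue each atom belongs to (hypothetical layout)
    """
    atom_coords = np.asarray(atom_coords, dtype=float)
    atom_residue_index = np.asarray(atom_residue_index)
    n_residues = atom_residue_index.max() + 1

    # Centre of the whole structure: unweighted mean over all atoms
    structure_centre = atom_coords.mean(axis=0)

    # Per-residue coordinate sums and atom counts, then the mean
    sums = np.zeros((n_residues, 3))
    np.add.at(sums, atom_residue_index, atom_coords)
    counts = np.bincount(atom_residue_index, minlength=n_residues)
    residue_centres = sums / counts[:, None]

    # Euclidean distance of each residue centre from the structure centre
    return np.linalg.norm(residue_centres - structure_centre, axis=1)


def group_by_name(names, distances):
    """Collect the distances belonging to each unique residue name."""
    names = np.asarray(names)
    return {name: distances[names == name] for name in np.unique(names)}


# Toy data: three residues with two atoms each (made-up coordinates)
coords = [[0, 0, 0], [2, 0, 0],
          [0, 4, 0], [2, 4, 0],
          [1, 2, 3], [1, 2, -3]]
residue_index = [0, 0, 1, 1, 2, 2]
names = ["ALA", "GLY", "ALA"]

dist = residue_distances(coords, residue_index)
grouped = group_by_name(names, dist)
for name, values in grouped.items():
    print(name, values)
```

As in the original script, the centres here are plain averages of atom positions, not mass-weighted centres.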
Disclaimer:
Please keep in mind that the research question was phrased this way (see the video) to make it easy to understand and easy to write code for. At the end of the video I discuss other, better ways in which the question could have been answered. Again, this video is about phrasing a research question about protein structures and writing a simple program to answer it. It is only an example and should not be used for any other purpose.