Naïve Networking 2

The following note explains how to deploy a high-performance VPN using WireGuard on VPP with DPDK. This lets lab members connect to computation servers from anywhere on the campus network with minimal overhead.

(more…)

Naïve Networking

The following note outlines how to set up a VPN so that lab members can connect to the computation servers from anywhere on the campus network.

1) Requirements

  • Goal: Allow remote VPN clients to access lab servers on a different subnet.
  • VPN software: SoftEther VPN Server (L2-bridge mode) + SoftEther VPN Client.
  • Network device: Layer-3 switch that supports SVIs and inter-VLAN routing.
(more…)

Unwrap big molecules & self-assemblies in PBC

Periodic boundary conditions (PBC) are ubiquitous in molecular simulations. However, when visualizing results, one often prefers to see whole molecules rather than fragments cut by the simulation box. If the molecule percolates through the box, minimizing artificial cuts becomes essential. Below are two methods for reconstructing intact structures under PBC:

(more…)

Vector calculus notes (1)

Problem: given a series molecules $i=1,\dots,N$, connected with junction atoms $\mathbf{x}_i^k$, for $k=1,2,…m$ if there are $m_i$ atoms from molecule $i$ that are bonded with other molecules. The connection bonds are defined as $\mathbf{b}_i := \lbrace\mathbf{x}_j^{k_1}-\mathbf{x}_i^{k_2}|\forall k_1, k_2\, \mathrm{and}\, \forall j\in \mathrm{neighbor\, of}\, i\rbrace$, assume that the center-of-mass of the molecules are fixed, but are rotatable, minimize $\sum_i |\mathbf{b}_i|^2$ with a series of rotation matrices $R_i$:

$$ R_i = \min_{R_i}(L:= \sum_i \sum_{j \in \mathrm{neighbor\, of}\, i} \sum_{k_1,k_2} |\mathrm{pbc}(R_j \mathbf{x}_{j}^{k_1} – R_i \mathbf{x}_{i}^{k_2})|^2) $$

with constraint

$$ c=\sum_i |R_i^TR_i – I|^2=0 $$

the constraint ensures $R^T R =I$ for each rotation matrix thus $R_i$ are rotation matrices. The periodic boundary condition is introduced for most simulation systems.

Minimization procedure includes calculation of jacobian of loss and constraint functions:

$$ \frac{\partial L}{\partial R_i} = -2 \sum_{j \in \mathrm{neighbor\, of}\, i} \sum_{k_1, k_2} \mathrm{pbc}(R_{j} \mathbf{x}_{j}^{k_1} – R_i \mathbf{x}_i^{k_2})^T \mathbf{x}_i^{k_2} $$

and

$$\frac{\partial c}{\partial R_i} = 4 (R_i R_i^T – I) R_i$$

Calculating SQ with FFT

The structure factor, commonly denoted as $S(\mathbf{q})$, is an essential characterization quantity in molecular simulations. Its calculation is straightforward based on its definition. Since periodic boundary conditions are typically used in these simulations, we have

$$S(\mathbf{q}) := \mathcal{FT}\lbrace\langle\rho(\mathbf{r})\rho(\mathbf{0})\rangle\rbrace(\mathbf{q})$$

which actually represents a circular convolution. This means that using the Fast Fourier Transform (FFT) can dramatically accelerate the computation compared to performing the calculation directly.

The FFT-based procedure proceeds as follows:

  1. Compute the density field $\rho(\mathbf{r})$, which is a sum of Dirac delta functions approximated by a histogram.
  2. Use the FFT to calculate the Fourier transform $\hat{\rho}(\mathbf{q})$.
  3. Obtain the structure factor by taking the modulus squared of the transform:
    $$ S(\mathbf{q}) = |\hat{\rho}(\mathbf{q})|^2. $$
  4. Average over the modulus to yield
    $$ S(q) = \frac{\int S(\mathbf{q})\delta(|\mathbf{q}|-q)\,\mathrm{d}\mathbf{q}}{\int\delta(|\mathbf{q}|-q)\,\mathrm{d}\mathbf{q}}. $$

In terms of efficiency, if the simulation box is divided into $N$ bins (for example, in 3D, $N = N_x N_y N_z$), the FFT has a computational complexity of $O(N\log(N))$, and the averaging step is $O(N)$. By contrast, a direct method would involve a loop over all particles and $N$ bins in $\mathbf{q}$ space—and typically, the number of particles is much larger than $\log(N)$. Moreover, because most simulation data are real-valued, employing an FFT designed for real inputs (rFFT) can further improve performance.

Below is an example code for a 3D case, where the indices i, j, and k correspond to $N_x$, $N_y$, and $N_z$, respectively.

fftn(a) == np.concatenate([rfftn(a), conj(rfftn(a))[-np.arange(i),-np.arange(j),np.arange(k-k//2-1,0,-1)]], axis=-1)

Accuracy Considerations: Effect of Bin Size

The histogram bin size should be smaller than half the minimum pairwise distance in the system, in accordance with the Nyquist-Shannon sampling theorem. For instance, in Lennard-Jones systems, a bin size of approximately $0.4\sigma$ is advisable. However, if you are only interested in small $q$ values—as is often the case in self-assembly systems—full pairwise resolution is not necessary; a sampling interval that is just smaller than half of the domain of interest may suffice.

Randomness Within Bins

Particles are assumed to be randomly distributed within each bin, especially when employing large bins. The Fourier transform of the particle positions is given by

$$\sum_i \exp(-\mathrm{i}\mathbf{q}\cdot\mathbf{r}_i)$$

This sum can be decomposed as

$$\sum_\mathrm{bin} \exp(-\mathrm{i}\mathbf{q}\cdot\mathbf{r}_\mathrm{bin})\left(\sum_i \exp(-\mathrm{i}\mathbf{q}\cdot\delta\mathbf{r}_i)\right)$$

where each particle’s position is written as

$$\mathbf{r}_i = \mathbf{r}_\mathrm{bin} + \delta\mathbf{r}_i$$

Assuming the displacements $\delta\mathbf{r}_i$ are randomly distributed within the bin, the inner sum can be approximated by

$$n_\mathrm{bin}\langle\exp(-\mathrm{i}\mathbf{q}\cdot\delta\mathbf{r})\rangle$$

with the average taken over the distribution of $\delta\mathbf{r}$. For a one-dimensional example, if $\delta r$ is uniformly distributed in the interval $(-L/2, L/2)$ (where $L$ is the bin size), the average is given by $\mathrm{sinc}(qL/2)$. Multiplying by this sinc correction during step 4 yields a properly corrected $S(q)$.

This strategy ensures that both efficiency and accuracy are maintained when computing the structure factor in molecular simulations.

Back2Top ^