## Saturday, January 18, 2020

### Not SQ again...

Structure factor (SQ) is a characterization quantity which is frequently calculated in molecular simulations. It is easily to code the calculation program according to the definition. In most molecular simulations, periodic boundary condition is adopted therefore
$$S(\mathbf{q}):=\mathcal{FT}\lbrace\langle\rho(\mathbf{r})\rho(\mathbf{0})\rangle\rbrace(\mathbf{q})$$
is actually a circular convolution, where FFT can give a dramatic boost in contrast to calculate directly. The steps using FFT are:
1. Calculate $\rho(\mathbf{r})$, which is a summation of Dirac-Delta function that can be estimated as a hisotgram;
2. Calculate $\hat{\rho}(\mathbf{q})$ by FFT;
3. $S(\mathbf{q})=|\hat{\rho}(\mathbf{q})|^2$;
4. Calculate mean over modulus: $S(q)=\int S(\mathbf{q})\delta(|\mathbf{q}|-q)\mathrm{d}\mathbf{q}/\int \delta(|\mathbf{q}|-q)\mathrm{d}\mathbf{q}$
Efficiency
If the simulation box is divided into $N$ (in 3D systems, $N=N_xN_yN_z$ for example) bins, the FFT gives $O(N\log(N))$ complexity and step 4 is $O(N)$. Generally, in comparison with the direct method, one needs at least loop over number of particles and $N$ bins for $\mathbf{q}$, the number of particles is obviously, way larger than $\log(N)$. For data of most simulations are real numbers, rFFT designed for real inputs could further boost the program. Here is code for a 3D example, i, j, k represent $N_x$, $N_y$, $N_z$ respectively.
fftn(a) == np.concatenate([rfftn(a), conj(rfftn(a))[-np.arange(i),-np.arange(j),np.arange(k-k//2-1,0,-1)]], axis=-1)

Accuracy
1. Binsize effect.
2. The binsize in histogram should be smaller than half of the minimun of the pair distances in the system according to Nyquist-Shannon sampling theorem. For example, in Lennard-Jones systems, $0.4\sigma$ is a good choice. However, if values of only some small $q$ are concerned, i.e., in some self-assembly systems, there is no need to calculate in the "pair accuracy", the sampling distance smaller than half of the interested domain size is fine.
3. Randomness in the bin
4. Positions in the bin are randomly distributed, especially for large binsize mentioned above. The summation $\sum_i \exp(-\mathbb{I}\mathbf{q}\cdot\mathbf{r}_i)$ can be decomposed into $$\sum_\mathrm{bin} \exp(-\mathbb{I}\mathbf{q}\cdot\mathbf{r}_\mathrm{bin})\left(\sum_i \exp( -\mathbb{I}\mathbf{q}\cdot\delta\mathbf{r}_i)\right)$$ The idea is simple, if some particles are inside a bin, the position can be decomposed as $\mathbf{r}=\mathbf{r}_\mathrm{bin}+\delta\mathbf{r}$, and for particles in the bin, $\delta\mathbf{r}$ is assumed uniformly distributed in the bin. The latter summation in the decomposition is thus represented as $n_\mathrm{bin}\overline{\exp(-\mathbb{I}\mathbf{q}\cdot\delta\mathbf{r})}$, the average is approximated according to the distribution of $\delta\mathbf{r}$. If we consider $\delta r$ (1D case, for example) uniformly distributed in $(-L/2, L/2)$ with $L$ be the binsize, the average is $\mathrm{sinc}(qL/2)$. Multiply the sinc function during step 4, the $S(q)$ will be corrected.

## Saturday, January 11, 2020

### Free energy calculation: umbrella integration

Formulae of umbrella sampling method can be found on wikipedia. In umbrella sampling, a reaction coordinate $\xi$ is pre-defined from the atomic coordinates, a bias potential is added to the atoms of interest to keep $\xi$ of the system at a specific window $\xi_w$. The bias form is usually a harmonic potential:
$$u^b_w(\xi)=0.5k_w(\xi-\xi_w)^2$$
Therefore, the energy of biased system $A^b_w = A^{ub}_w + u^b_w(\xi)$. The superscript $ub$ is short for "un-biased". In the simulations, we can sample the reaction coordinate in each window and evaluate their distribution $P^b(\xi)$, since the free energy $A=-k_BT\ln(P)$, we have:
$$A_w^{ub}(\xi) = -k_BT\ln(P^b_w(\xi))-u^b_w(\xi)-F_w$$
with $F_w$ is a reference free energy of each window and remains an unknown constant. One method to derive $F_w$ is WHAM, in year 2005, Kästner et al. (The Journal of Chemical Physics 123, no. 14 (October 8, 2005): 144104.) have proposed a new method whose idea is to take derivative of $A^u_w$ with respect to $\xi$ to eliminate $F_w$, In this method, $P(\xi)$ is assumed to be analytic, e.g., a Gaussian distribution. The general idea is to expand $A(\xi)$ into Taylor series (at $\langle \xi \rangle_w$):
$$A(\xi)=a_1 \xi + a_2 \xi^2 + a_3 \xi^3 +\ldots$$
if $a_i=0$ with $i\ge 3$, the Gaussian form is restored. For systems with higher order terms, the distributions, for example, generally have a non-zero skewness. The crucial step is to determine $\lbrace a_i\rbrace$. In practice, the probability with form of $\exp(-\sum_i a_i \xi^i)$ is difficult to determine, in year 2012, Kästner et al. (The Journal of Chemical Physics 136, no. 23 (June 21, 2012): 234102) studied cases with order $4$ and small $a_3, a_4$. I have tried another more "numerical" method to calculate $\lbrace a_i\rbrace$ when datasets are poorer: fit $\exp(-\sum_i a_i \xi ^i)$ with a Gaussian KDE to find $\lbrace a_i \rbrace$. For some "poor" datasets, higher-order terms of expansion of free energy are required. The normalization factor of the distribution is estimated as $n=\int_{\xi_\mathrm{min}}^{\xi_\mathrm{max}} P(\xi)\mathrm{d}\xi$, the fitting range should be carefully chosen to ensure the convergence of $n$.

## Wednesday, December 11, 2019

### Di-block copolymer analysis: GPC

Gel Permeation Chromatography is a well known method of measuring molecular weight distribution of polymers. For di-block copolymers, a proper calibration curve can be constructed by using Mark-Houwink parameters of homopolymers (see here). Usually, di-block copolymers are obtained by synthesis one block first then "grow" the other block from the end of preceding block. The molecular weight distributions of final di-block copolymer and preceding block are accessible by GPC, the molecular weight distribution of the 2nd block, however, is hard to obtain.

Here I present a simple method to calculate the molecular weight distribution of 2nd block by GPC data of preceding block and final di-block copolymer. Assuming that we already have the molecular weight distribution of the preceding block, $P_1(n)$, it is generally that the length of the 2nd block "grow" from the depends on the preceding chain length $n$, therefore, the pdf of the 2 blocks should be a joint pdf $Q(n, m)=P_1(n)P_2(m, n)$, and the pdf of the final di-block copolymer is straightforward:
$$P(x)=\int Q(n, x-n) \mathrm{d}n$$
However, it is reasonable to assume that the $P_2(m, n)$ can be expressed as some $f(m)g(n)$ form by dropping some higher order correlations between $m,n$; hence, the $g(n)$ can be considered as a "fixing parameter" on the distribution of $P_1(n)$, the calculation of $P(x)$ is therefore a convolution. The simplest case is $g(n)=1$, which means the $P_1(n)$ and $P_2(m)$ are independent, the growth of 2nd block on the end of the 1st block is not effected by the length of the preceding block; or $g(n)\sim n^{-1}$ by considering the diffusion of the preceding block: small preceding chains tend to "grow" more 2nd block.

The evaluation of pdf of the 2nd block follows 3 steps:

1. Determine the range of molecular weight: $(x_{min}-n_{max}, x_{max}-n_{min})$, where $x$ is the chain length of the di-block copolymer and $n$ is the length of the preceding block;
2. Interpolate the distributions obtained from GPC into equally spaced molecular weight, negative part is simply 0;
3. Deconvolute $P(x)$ with $P_1(n)$.

## Monday, October 28, 2019

### A note on Cython Parallelism

Instead of using thread-local ndarrays, a very convenient method is to create an array shaped in (num_of_threads, <whatever the structure>) to make the changes thread-safe. For example:
cdef np.ndarray[np.double_t, ndim=2] ret
and in the prange loop:
...
Afterwards, ret.sum(axis=0) is returned. The loop is set to use num_threads threads: with nogil, parallel(num_threads=num_threads):. By default, I set the prange with schedule='dynamic' option, on my laptop (ArchLinux), with openmp=201511 and gcc=9.2.0, I found interesting outputs, the summation of ret is exactly num_threads times the desired results, i.e. ret.sum(axis=0)[0] should be 200, but if 4 threads are used in parallel computation, then the output becomes 800. However, this does not happen on my server (CentOS 7.6) with gcc=4.2.0 and openmp=201107. All others are the same (Python, Cython, NumPy...) including .c files generated by Cython, for I use Anaconda. This wired problem is solved after setting schedule='static' on my laptop. I think this is some new feature of openmp, I shall read the docs to figure this out when I have some time.

## Friday, October 18, 2019

### Notes on scipy.optimize.minimize

Problem: evenly distributed points on ellipsoid surfaces.
Solution: numerically minimizing energy of charges constrained on the ellipsoid surface.
Assuming the ellipsoid satisfies equation:
$$\left(\frac{x}{a}\right)^2+\left(\frac{y}{b}\right)^2+\left(\frac{z}{c}\right)^2=1$$
the ratios of 3 axes of the ellipsoid surface are $(a,b,c)^T$. The minimization process is:
1. let
$$\mathbf{x}^\prime:=\frac{1}{\sqrt{\left(\frac{x}{a}\right)^2+\left(\frac{y}{b}\right)^2+\left(\frac{z}{c}\right)^2}}(x, y, z)^T$$
then $\mathbf{x}^\prime$ satisfy the ellipsoid equation;
2. minimize the energy function
$$u=\sum_i \sum_{j>i} \frac{1}{\sqrt{(x^\prime_i-x^\prime_j)^2+(y^\prime_i-y^\prime_j)^2+(z^\prime_i-z^\prime_j)^2}}$$
3. the gradient vector is, e.g. the $x$ component of ith particle:
$$\partial u/\partial x_i=\sum_{j\ne i} - ((x^{\prime 2}_i/a^2-1)(x^\prime_j-x^\prime_i) + (x^{\prime}_i y^{\prime}_i(y^{\prime}_j-y^{\prime}_i)+x^{\prime}_i z^{\prime}_i (z^{\prime}_j-z^{\prime}_i))/a^2)/d^3$$ with $d=\sqrt{(x^{\prime}_i -x^{\prime}_j)^2+(y^{\prime}_i -y^{\prime}_j)^2+(z^{\prime}_i -z^{\prime}_j)^2}$
***NOTE THE GRADIENT IS CALCULATED WITH RESPECT TO $\mathbf{x}$, $\nabla_\mathbf{x}U$, NOT $\mathbf{x}^\prime$, $\nabla_{\mathbf{x}^\prime}U$***
The scipy.optimize.minimize function takes jac=True parameter if the function provides gradient vector, this would boost up the program dramatically.