Showing posts from October, 2019

A note on Cython Parallelism

Instead of using thread-local ndarrays, a very convenient method is to create an array shaped in (num_of_threads, <whatever the structure>) to make the changes thread-safe. For example: cdef np.ndarray[np.double_t, ndim=2] ret num_threads = multiprocessing.cpu_count() ret = np.zeros((num_threads, nbins), dtype=np.float) and in the prange loop: thread_num = openmp.omp_get_thread_num() ret[thread_num, l]+=1 ... Afterwards, ret.sum(axis=0) is returned. The loop is set to use num_threads threads: with nogil, parallel(num_threads=num_threads): . By default, I set the prange with schedule='dynamic' option, on my laptop (ArchLinux), with openmp=201511 and gcc=9.2.0 , I found interesting outputs, the summation of ret is exactly num_threads times the desired results, i.e. ret.sum(axis=0)[0] should be 200, but if 4 threads are used in parallel computation, then the output becomes 800. However, this does not happen on my server (CentOS 7.6) with gcc=4.2.0 and openmp=201

Notes on scipy.optimize.minimize

Problem: evenly distributed points on ellipsoid surfaces. Solution: numerically minimizing energy of charges constrained on the ellipsoid surface. Assuming the ellipsoid satisfies equation: $$\left(\frac{x}{a}\right)^2+\left(\frac{y}{b}\right)^2+\left(\frac{z}{c}\right)^2=1$$ the ratios of 3 axes of the ellipsoid surface are $(a,b,c)^T$. The minimization process is: let $$\mathbf{x}^\prime:=\frac{1}{\sqrt{\left(\frac{x}{a}\right)^2+\left(\frac{y}{b}\right)^2+\left(\frac{z}{c}\right)^2}}(x, y, z)^T$$ then $\mathbf{x}^\prime$ satisfy the ellipsoid equation; minimize the energy function $$u=\sum_i \sum_{j>i} \frac{1}{\sqrt{(x^\prime_i-x^\prime_j)^2+(y^\prime_i-y^\prime_j)^2+(z^\prime_i-z^\prime_j)^2}}$$ the gradient vector is, e.g. the $x$ component of ith particle: $$\partial u/\partial x_i=\sum_{j\ne i} - ((x^{\prime 2}_i/a^2-1)(x^\prime_j-x^\prime_i) + (x^{\prime}_i y^{\prime}_i(y^{\prime}_j-y^{\prime}_i)+x^{\prime}_i z^{\prime}_i (z^{\prime}_j-z^{\prime}_i))/a^2)/d^3$$ w

Dealing with aggregates: I don't like tedious work

I used to study nanoparticles (NPs) self-assembly in polymer matrices before 2017 by coarse-grained simulation method. Analyzing morphologies of self-assemblies is a tedious work: one needs to repeatedly watch frames from simulation trajectories, finding appropriate view points, counting number of aggregates, measure size of aggregates, etc. Obtaining integrate aggregates under periodic boundary conditions is also a challenging work. In addition, one usually needs to run plenty of simulations under different conditions to find a desired morphology; selecting desired morphologies automatically is even more challenging: most of the morphologies are ill-defined, it is hard to use some common  characterization methods, e.g., $g(r)$ or $S(q)$ of NPs give little differences amongst percolated morphologies. Based on such demands, I designed a co-pilot which can: Automatically clustering the clusters; Remove periodic boundary conditions and make the center-of-mass at $(0,0,0)^T$; Adjust

Eigenvalues of circulant matrices

A circulant matrix is defined as $$C=\begin{bmatrix}c_{0}&c_{n-1}&\dots &c_{2}&c_{1}\\c_{1}&c_{0}&c_{n-1}&&c_{2}\\\vdots &c_{1}&c_{0}&\ddots &\vdots \\c_{n-2}&&\ddots &\ddots &c_{n-1}\\c_{n-1}&c_{n-2}&\dots &c_{1}&c_{0}\\\end{bmatrix}$$ where $C_{j, k}=c_{j-k \mod n}$, the $k$-th eigenvalue $\lambda_k$ and eigenvector $x_k$ satisfy $C\cdot x_k=\lambda_k x_k$, or, equivalently, $n$ equations as: $$\sum_{j=0}^{m-1}c_{m-j}x_j+\sum_{j=m}^{n-1}c_{n-j+m}x_j=\lambda_k x_m\quad m=0,1,\dots,n-1$$ with $c_n=c_0$, $x_m$ is the $m$-th compoent of an eigenvector $x_k$. Changing the dummy summing ($j\to m-j$ and $j\to n-j+m$) variables yields $$\sum_{j=1}^{m}c_j x_{m-j} +\sum_{j=m+1}^{n}c_j x_{n+m-j}=\lambda_k x_m$$ with $m=0,1,2,\dots,n-1$. One can "guess" a solution that $x_j=\omega^j$, therefore the equation above turns into $$\begin{align}&\sum_{j=1}^{m}c_j \omega^{m-j} +\sum_{j=m+1}^{n}c_j \omeg