Short-vector SIMD extensions are commonly included in modern processors. This paper presents a multi-kernel algorithm that lets molecular dynamics (MD) simulations exploit the SIMD extensions of CPUs. The algorithm decomposes particle clusters into groups, each of which is processed by a specially organized, manually vectorized kernel. Branch divergence across SIMD lanes is thereby avoided. Experiments show that the algorithm achieves nearly ideal SIMD speedups on different platforms.
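The grouping idea can be illustrated with a minimal sketch in C. It is not the paper's implementation: the abstract does not say how clusters are grouped or what each kernel computes, so the grouping criterion (how many lanes of a cluster pair fall inside the cutoff), the lane-by-lane pairing, the toy 1/r^2 energy term, and all names (`Cluster`, `classify`, `kernel_full`, `kernel_masked`, `multi_kernel_energy`) are assumptions made for illustration. Fixed-width lane loops stand in for the hand-written SIMD intrinsics the paper describes, and particle positions are assumed to be distinct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define WIDTH 4 /* assumed SIMD width: 4 floats per 128-bit register */

/* Hypothetical cluster: WIDTH particles in structure-of-arrays layout. */
typedef struct { float x[WIDTH], y[WIDTH], z[WIDTH]; } Cluster;

enum { SKIP = 0, FULL = 1, MIXED = 2 };

/* Squared distance between lane l of cluster a and lane l of cluster b. */
static float dist2(const Cluster *a, const Cluster *b, int l) {
    float dx = a->x[l] - b->x[l];
    float dy = a->y[l] - b->y[l];
    float dz = a->z[l] - b->z[l];
    return dx * dx + dy * dy + dz * dz;
}

/* Assign a cluster pair to a group by counting lanes inside the cutoff. */
static int classify(const Cluster *a, const Cluster *b, float rc2) {
    int inside = 0;
    for (int l = 0; l < WIDTH; ++l) inside += dist2(a, b, l) < rc2;
    if (inside == 0) return SKIP;
    return inside == WIDTH ? FULL : MIXED;
}

/* Kernel for FULL pairs: every lane interacts, so no cutoff test is needed. */
static float kernel_full(const Cluster *a, const Cluster *b) {
    float e = 0.0f;
    for (int l = 0; l < WIDTH; ++l) e += 1.0f / dist2(a, b, l); /* toy term */
    return e;
}

/* Kernel for MIXED pairs: a 0/1 mask replaces the per-lane branch. */
static float kernel_masked(const Cluster *a, const Cluster *b, float rc2) {
    float e = 0.0f;
    for (int l = 0; l < WIDTH; ++l) {
        float r2 = dist2(a, b, l);
        float mask = (float)(r2 < rc2); /* 1.0f inside the cutoff, else 0.0f */
        e += mask / r2;
    }
    return e;
}

/* Group the n cluster pairs first, then run one specialized kernel per group. */
float multi_kernel_energy(const Cluster *a, const Cluster *b, size_t n, float rc2) {
    size_t *full = (size_t *)malloc((n ? n : 1) * sizeof *full);
    size_t *mixed = (size_t *)malloc((n ? n : 1) * sizeof *mixed);
    assert(full && mixed);

    size_t nf = 0, nm = 0;
    for (size_t i = 0; i < n; ++i) {
        int g = classify(&a[i], &b[i], rc2);
        if (g == FULL) full[nf++] = i;
        else if (g == MIXED) mixed[nm++] = i;
        /* SKIP pairs are dropped and never reach a kernel. */
    }

    float e = 0.0f;
    for (size_t k = 0; k < nf; ++k)
        e += kernel_full(&a[full[k]], &b[full[k]]);
    for (size_t k = 0; k < nm; ++k)
        e += kernel_masked(&a[mixed[k]], &b[mixed[k]], rc2);

    free(full);
    free(mixed);
    return e;
}
```

In this sketch the only data-dependent branch is taken once per cluster pair during grouping. Inside each kernel all lanes execute the same instruction sequence, which is the property that lets a SIMD unit run without divergence.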