Introduction
The Numerics module includes support for:
- 1D, 2D and 3D FFTs that leverage SIMD vector instructions.
- Limited memory BFGS optimization
- b-Splines
- Erf and Erfc
- Tensor recursions
L-BFGS Optimization
The limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm is appropriate for large-scale multidimensional unconstrained optimization problems. It is derived from Robert Dodier's Java translation of orignal FORTRAN code by Jorge Nocedal.
Fast Fourier Transformation
The FFT classes implement:
- 1D Real to Complex
- 1D Complex to Complex
- 3D Real to Complex
- 3D Complex to Complex
- 3D Real to Complex Convolutions
- 3D Complex to Complex Convolutions
The 1D methods compute the FFT of real or complex, double precision data of arbitrary length n using a mixed radix method that has special methods to handle factors of [2, 3, 4, 5, 6, 7] and a general method for larger prime factors.
The 3D methods are serial or SMP parallel. The convolutions are designed for the reciprocal space portion of Particle Mesh Ewald summation.
Selected timing results comparing FFTW and FFX on MacOS with an Apple M2 Max CPU (FFTW 3.3.10) are shown below for 3D double precision C2C round-trip FFTs.
| Size | Library | Threads | Operation | Time (sec) |
|---|---|---|---|---|
| 64×64×64 | FFTW | 1 | Forward + Reverse FFT | 0.005549 |
| FFTW | 8 | Forward + Reverse FFT | 0.001469 | |
| FFX | 8 | Forward + Reverse FFT | 0.001022 | |
| FFX | 8 | Convolution | 0.000867 | |
| 128×128×128 | FFTW | 1 | Forward + Reverse FFT | 0.056120 |
| FFTW | 8 | Forward + Reverse FFT | 0.010322 | |
| FFX | 8 | Forward + Reverse FFT | 0.009983 | |
| FFX | 8 | Convolution | 0.008273 | |
| 256×256×256 | FFTW | 1 | Forward + Reverse FFT | 0.633118 |
| FFTW | 8 | Forward + Reverse FFT | 0.155233 | |
| FFX | 8 | Forward + Reverse FFT | 0.106308 | |
| FFX | 8 | Convolution | 0.083182 |
Note that for FFX, the convolution operation is faster than the sum of individual Forward + Reverse FFTs due to eliminating two 3D matrix transpose operations.

