Performance¶
Recommendation¶
The recommended generator for single use is PCG64DXSM
although SFC64
and Xoshiro256
are both excellent alternatives. Romu
is a newer generator that
is also very fast.
For very large scale
applications – requiring 1,000+ streams –
PCG64DXSM
combined with a SeedSequence
and spawn
,
SFC64
initialized using distinct Weyl increments (k
), or one of
the cryptography-based generators AESCounter
(if you have hardware acceleration),
EFIIX64
, SPECK128
,
Philox
, or HC128
if you do not)
initialized with distinct keys are all excellent choices.
Unless you need backward compatibility, there are no good reasons to use any
of the Mersenne Twister PRNGS: MT19937
, MT64
,
SFMT
, and DSFMT
.
Timings¶
The timings below are the time in ns to produce 1 random value from a
specific distribution. SFC64
is the fastest,
followed closely by Xoshiro256
,
PCG64DXSM
, JSF
,
and EFIIX64
. The original
NumPy MT19937
generator is slower since
it requires 2 32-bit values to equal the output of the faster generators.
Bit Gen |
Uint32 |
Uint64 |
Uniform |
Expon |
Normal |
Gamma |
---|---|---|---|---|---|---|
Romu(variant=”trio”) |
2.3 |
2.1 |
2.0 |
5.2 |
9.3 |
18.2 |
SFC64 |
2.3 |
2.5 |
2.4 |
5.1 |
9.6 |
18.7 |
Romu |
2.5 |
2.5 |
2.4 |
5.0 |
9.5 |
18.8 |
Xoshiro256 |
2.5 |
2.5 |
2.5 |
5.2 |
9.8 |
19.2 |
PCG64DXSM |
2.3 |
2.9 |
3.0 |
5.3 |
11.0 |
20.7 |
JSF |
2.4 |
3.0 |
3.0 |
5.8 |
10.2 |
20.0 |
EFIIX64 |
2.5 |
3.0 |
3.0 |
5.4 |
10.5 |
20.4 |
PCG64 |
2.3 |
3.1 |
3.1 |
5.9 |
11.3 |
21.4 |
Xoshiro512 |
2.7 |
3.5 |
3.3 |
5.8 |
10.3 |
20.3 |
SFMT |
2.9 |
3.3 |
3.1 |
6.3 |
11.0 |
20.9 |
LXM |
2.6 |
3.5 |
3.5 |
6.3 |
11.3 |
21.9 |
PCG64(variant=”dxsm-128”) |
2.8 |
3.4 |
3.5 |
6.1 |
12.5 |
23.1 |
DSFMT |
3.0 |
4.2 |
2.7 |
7.0 |
12.2 |
21.7 |
MT64 |
2.8 |
4.0 |
4.2 |
6.9 |
12.8 |
23.7 |
JSF32 |
3.0 |
4.3 |
4.3 |
6.9 |
11.2 |
22.9 |
Philox(n=2, w=64) |
3.2 |
4.7 |
5.1 |
8.1 |
14.7 |
27.4 |
Philox |
4.0 |
5.9 |
6.1 |
8.8 |
13.6 |
27.0 |
AESCounter |
4.4 |
6.0 |
5.7 |
9.1 |
14.4 |
27.4 |
MT19937 |
3.8 |
6.3 |
7.2 |
9.2 |
14.8 |
28.8 |
ThreeFry(n=2, w=64) |
4.1 |
6.5 |
6.9 |
9.5 |
15.7 |
30.6 |
NumPy |
3.0 |
4.6 |
5.8 |
14.4 |
20.1 |
39.8 |
HC128 |
4.1 |
7.2 |
7.2 |
10.5 |
16.6 |
31.5 |
Philox(n=4, w=32) |
4.2 |
7.6 |
8.7 |
10.8 |
16.5 |
32.6 |
SPECK128 |
5.4 |
8.1 |
9.7 |
11.4 |
17.0 |
33.4 |
ThreeFry |
5.9 |
9.1 |
9.3 |
12.0 |
16.7 |
34.9 |
ChaCha(rounds=8) |
6.7 |
10.4 |
10.3 |
13.2 |
18.1 |
36.4 |
ChaCha |
9.7 |
16.6 |
16.4 |
19.6 |
24.4 |
49.0 |
ThreeFry(n=4, w=32) |
9.0 |
16.8 |
17.5 |
20.7 |
24.3 |
54.1 |
RDRAND |
129.5 |
129.9 |
129.6 |
136.9 |
139.6 |
287.4 |
The next table presents the performance relative to NumPy’s RandomState
in
percentage. The overall performance is computed using a geometric mean.
Bit Gen |
Uint32 |
Uint64 |
Uniform |
Expon |
Normal |
Gamma |
Overall |
---|---|---|---|---|---|---|---|
Romu(variant=”trio”) |
128 |
219 |
282 |
278 |
215 |
219 |
217 |
SFC64 |
129 |
189 |
244 |
283 |
210 |
213 |
205 |
Romu |
118 |
185 |
242 |
285 |
212 |
212 |
202 |
Xoshiro256 |
119 |
185 |
229 |
279 |
205 |
208 |
198 |
PCG64DXSM |
128 |
163 |
190 |
271 |
182 |
192 |
183 |
JSF |
126 |
155 |
193 |
250 |
197 |
199 |
182 |
EFIIX64 |
120 |
154 |
193 |
264 |
190 |
195 |
181 |
PCG64 |
128 |
149 |
184 |
246 |
178 |
186 |
175 |
Xoshiro512 |
111 |
131 |
176 |
249 |
195 |
196 |
170 |
SFMT |
104 |
143 |
185 |
228 |
183 |
190 |
167 |
LXM |
114 |
135 |
164 |
227 |
177 |
182 |
162 |
PCG64(variant=”dxsm-128”) |
105 |
138 |
165 |
234 |
160 |
172 |
158 |
DSFMT |
100 |
110 |
213 |
206 |
165 |
184 |
156 |
MT64 |
104 |
117 |
136 |
210 |
156 |
168 |
145 |
JSF32 |
99 |
108 |
134 |
208 |
178 |
174 |
145 |
Philox(n=2, w=64) |
91 |
98 |
114 |
177 |
136 |
145 |
124 |
Philox |
75 |
79 |
95 |
163 |
147 |
147 |
112 |
AESCounter |
67 |
77 |
102 |
158 |
139 |
145 |
109 |
MT19937 |
78 |
74 |
80 |
156 |
135 |
138 |
105 |
ThreeFry(n=2, w=64) |
72 |
71 |
84 |
152 |
128 |
130 |
101 |
HC128 |
72 |
64 |
80 |
136 |
121 |
126 |
96 |
Philox(n=4, w=32) |
71 |
61 |
66 |
134 |
122 |
122 |
91 |
SPECK128 |
55 |
57 |
59 |
127 |
118 |
119 |
83 |
ThreeFry |
51 |
51 |
62 |
120 |
120 |
114 |
80 |
ChaCha(rounds=8) |
45 |
45 |
56 |
109 |
111 |
109 |
73 |
ChaCha |
30 |
28 |
35 |
73 |
82 |
81 |
49 |
ThreeFry(n=4, w=32) |
33 |
28 |
33 |
69 |
82 |
73 |
48 |
RDRAND |
2 |
4 |
4 |
10 |
14 |
14 |
7 |
Note
All timings were taken using Linux on an Intel Cascade Lake (Family 6, Model 85, Stepping 7) running at 3.1GHz.