The four core distributions all allow existing arrays to be filled using the
out
keyword argument. Existing arrays need to be contiguous and
well-behaved (writable and aligned). Under normal circumstances, arrays
created using the common constructors such as numpy.empty
will satisfy
these requirements.
This example makes use of Python 3 futures
to fill an array using multiple
threads. Threads are long-lived so that repeated calls do not require any
additional overheads from thread creation. The underlying PRNG is xorshift2014
which is fast, has a long period and supports using jump
to advance the
state. The random numbers generated are reproducible in the sense that the
same seed will produce the same outputs.
import randomstate
import multiprocessing
import concurrent.futures
import numpy as np
class MultithreadedRNG(object):
def __init__(self, n, seed=None, threads=None):
rs = randomstate.prng.xorshift1024.RandomState(seed)
if threads is None:
threads = multiprocessing.cpu_count()
self.threads = threads
self._random_states = []
for _ in range(0, threads-1):
_rs = randomstate.prng.xorshift1024.RandomState()
_rs.set_state(rs.get_state())
self._random_states.append(_rs)
rs.jump()
self._random_states.append(rs)
self.n = n
self.executor = concurrent.futures.ThreadPoolExecutor(threads)
self.values = np.empty(n)
self.step = np.ceil(n / threads).astype(np.int)
def fill(self):
def _fill(random_state, out, first, last):
random_state.standard_normal(out=out[first:last])
futures = {}
for i in range(self.threads):
args = (_fill, self._random_states[i], self.values, i * self.step, (i + 1) * self.step)
futures[self.executor.submit(*args)] = i
concurrent.futures.wait(futures)
def __del__(self):
self.executor.shutdown(False)
The multithreaded random number generator can be used to fill an array.
The values
attributes shows the zero-value before the fill and the
random value after.
In [1]: mrng = MultithreadedRNG(10000000, seed=0)
In [2]: print(mrng.values[-1])
0.0
In [3]: mrng.fill()
In [4]: print(mrng.values[-1])
0.27192377619885055
The time required to produce using multiple threads can be compared to the time required to generate using a single thread.
In [5]: print(mrng.threads)
2
In [6]: %timeit mrng.fill()