Parallelization with Python
This page is an introduction to parallelization, specifically for use with tudatpy. Many tudatpy applications can be parallelized. For parallelization in combination with PyGMO specifically, further reading is available under Parallelization with PyGMO.
General parallelization with Python
In Python, data processing can be parallelized in various ways. One option is to use GPUs, but that is not discussed here. For CPU-based parallelization in Python, there are generally two approaches: multi-processing and multi-threading. Multi-processing starts multiple processes that run on separate CPU cores, each with independent memory management. Multi-threading uses multiple threads within a single parent process with shared memory, and child tasks can be run on separate threads. On many systems there are two hardware threads per core, and each system has its own number of cores with its own specifications. The achievable degree of parallelism is therefore determined by the machine you run on.
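As a quick way to inspect how much parallel capacity a machine offers, the standard library exposes the logical core count (a minimal sketch; note that `os.cpu_count()` typically reports hardware threads, not physical CPUs):

```python
import multiprocessing as mp
import os

# os.cpu_count() reports logical cores (hardware threads); on systems with
# two threads per physical core, this is roughly twice the number of cores.
print("Logical cores:", os.cpu_count())
print("multiprocessing sees:", mp.cpu_count())
```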
It should be noted that it does not always make sense to parallelize your simulations. Initializing parallel tasks takes time, so there is a break-even point beyond which parallelization becomes worthwhile, as shown in Multi-threading with Batch Fitness Evaluation. To enable parallel behavior in Python, the multiprocessing module is used. More modern alternatives exist as well, but they are not as widespread or as thoroughly documented. Ray, for instance, is one of these packages; it is arguably more seamless, but it is also rather new and focused on AI applications.
All parallel processing should be placed under if __name__ == "__main__":. This guard ensures that the code is only run when the file is executed directly (and not, for example, when it is imported). It prevents an infinite loop when creating new child processes or starting calculations on other threads: if the guard is omitted, each child process imports the Python script and runs it again, thereby spawning more child processes, and so on indefinitely. Next, mp.get_context("spawn") returns a context object that has the same attributes as the multiprocessing module. The "spawn" argument refers to the method used to create a new Python process: "spawn" starts a fresh Python interpreter process and is the default on macOS and Windows; "fork" copies the current Python process using os.fork() and is the default on Linux; "forkserver" creates a server process, from which new processes are then requested and created via "fork". This setting can generally be left at its default value.
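A minimal sketch of requesting a start method explicitly (toy function, not tudat code):

```python
import multiprocessing as mp

def work(x):
    return x * x

if __name__ == "__main__":
    # "spawn" starts a fresh interpreter per worker (default on macOS/Windows);
    # "fork" (Linux default) or "forkserver" could be requested the same way.
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(work, [1, 2, 3]))  # [1, 4, 9]
```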
A Pool object is temporarily created, which is simply a collection of available worker processes that can be allocated to computational tasks. The number of cores you would like to assign to the Pool is given as an argument. Subsequently, the map() or starmap() method applies a function to an iterable rather than to a single argument: map() passes a single argument to the function, while starmap() passes multiple arguments. The input is the collection of all argument sets, in the form of a list of tuples, which constitutes the iterable mentioned previously. The output is formatted analogously, as a list whose elements are the return values of the individual calls rather than the input arguments.
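As a minimal illustration of the difference between the two methods (toy functions, not tudat code):

```python
import multiprocessing as mp

def square(x):      # one argument per call -> use map()
    return x * x

def add(a, b):      # several arguments per call -> use starmap()
    return a + b

if __name__ == "__main__":
    with mp.get_context("spawn").Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))            # [1, 4, 9]
        print(pool.starmap(add, [(1, 10), (2, 20)]))  # [11, 22]
```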
import multiprocessing as mp
import numpy as np
from tudatpy.kernel.numerical_simulation import environment_setup, propagation_setup
from tudatpy.kernel.interface import spice
def run_simulation(arg_1, arg_2):
    # Do some tudat things...
    return 1, arg_1 + arg_2

# Main script
if __name__ == "__main__":
    # Number of simulations to run
    N = 500
    arg_1_list = np.random.normal(-100, 50, size=N)
    arg_2_list = np.random.normal(1e6, 2e5, size=N)
    # Combine list of inputs
    inputs = []
    for i in range(N):
        inputs.append((arg_1_list[i], arg_2_list[i]))
    # Run simulations in parallel, using half the available cores
    n_cores = mp.cpu_count() // 2
    with mp.get_context("spawn").Pool(n_cores) as pool:
        outputs = pool.starmap(run_simulation, inputs)
Note
The memory is freed only after all outputs have been collected. When a large number of simulations is run, it may be wise to split the list of inputs into smaller batches to avoid exhausting memory.
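One way to follow this advice is to submit the inputs in smaller batches and collect (or write to disk) each batch of results before the next is started. A minimal sketch, with run_simulation standing in for an actual tudatpy run and batch_size an illustrative value:

```python
import multiprocessing as mp

def run_simulation(arg_1, arg_2):
    # Stand-in for an actual tudatpy simulation
    return arg_1 + arg_2

if __name__ == "__main__":
    inputs = [(i, 2 * i) for i in range(1000)]
    batch_size = 100  # illustrative; tune to the memory available
    outputs = []
    with mp.get_context("spawn").Pool(4) as pool:
        for start in range(0, len(inputs), batch_size):
            batch = inputs[start:start + batch_size]
            # Each batch of results could be written to disk here instead
            # of accumulated, freeing memory between batches.
            outputs.extend(pool.starmap(run_simulation, batch))
    print(len(outputs))  # 1000
```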
See also
Other ways to specify the context or to create a Pool object exist as well; more can be found on the multiprocessing documentation page.
Batch Fitness Evaluation for Monte-Carlo analysis
In this section, the basic structure is presented for a simple, parallel Monte-Carlo analysis of any problem. An astrodynamics example is used for obvious reasons: the Kepler satellite orbit. Using this setup, any parameter can be varied and the Monte-Carlo simulations run in parallel.
BFE Monte Carlo code structure
In the snippet below, the implementation can be seen. It is straightforward and looks remarkably similar to General parallelization with Python. The run_simulation() function shown earlier is here called run_dynamics(). The same concepts apply, but rather than returning two numbers without further calculation, the function takes the semi-major axis and eccentricity of the initial state as inputs, which have a strong influence on the final results of the orbit.
# Load bfe modules
import multiprocessing as mp
# Load standard modules
import numpy as np
from matplotlib import pyplot as plt
# Load tudatpy modules
from tudatpy.kernel.interface import spice
from tudatpy.kernel import numerical_simulation
from tudatpy.kernel.numerical_simulation import environment_setup, propagation_setup
from tudatpy.kernel.astro import element_conversion
from tudatpy.kernel import constants
from tudatpy.util import result2array
if __name__ == "__main__":
    plot = False
    simulation_type = 'normal'  # bfe or normal
    # Monte Carlo parameters
    bounds = [[7000e3, 8000e3], [0.1, 0.6]]  # Semi-major axis and eccentricity are tested here
    N = 2000
    n_cores = 4
    # Setup inputs for MC with BFE
    arg_dict = {}
    for it, bound in enumerate(bounds):
        arg_dict[it] = np.random.uniform(bound[0], bound[1], size=N)
    inputs = []
    for k in range(len(arg_dict[0])):
        inputs.append(tuple(arg_dict[p][k] for p in range(2)))
    # Run parallel MC analysis
    with mp.get_context("spawn").Pool(n_cores) as pool:
        outputs = pool.starmap(run_dynamics, inputs)
The basic BFE structure can be seen above. Below, the run_dynamics() function is shown, which is almost identical to the code from the Kepler satellite orbit example, with the small adjustment that the initial state is defined from the input arguments of the function rather than hard-coded.
# Load bfe modules
import multiprocessing as mp
# Load standard modules
import numpy as np
from matplotlib import pyplot as plt
# Load tudatpy modules
from tudatpy.kernel.interface import spice
from tudatpy.kernel import numerical_simulation
from tudatpy.kernel.numerical_simulation import environment_setup, propagation_setup
from tudatpy.kernel.astro import element_conversion
from tudatpy.kernel import constants
from tudatpy.util import result2array
def run_dynamics(arg_1, arg_2, spice=spice.load_standard_kernels()):
    """
    Create the initial conditions, termination settings, and propagation settings,
    run create_dynamics_simulator(), and return the state history as an array.
    """
    # Note: the spice default argument is evaluated when the module is imported,
    # so the standard kernels are loaded once in every (spawned) worker process.
    # Set simulation start and end epochs
    simulation_start_epoch = 0.0
    simulation_end_epoch = constants.JULIAN_DAY
    # Create default body settings for "Earth"
    bodies_to_create = ["Earth"]
    # Create default body settings for bodies_to_create, with "Earth"/"J2000" as the global frame origin and orientation
    global_frame_origin = "Earth"
    global_frame_orientation = "J2000"
    body_settings = environment_setup.get_default_body_settings(
        bodies_to_create, global_frame_origin, global_frame_orientation)
    # Create system of bodies (in this case only Earth)
    bodies = environment_setup.create_system_of_bodies(body_settings)
    bodies.create_empty_body("Delfi-C3")
    bodies_to_propagate = ["Delfi-C3"]
    central_bodies = ["Earth"]
    # Define accelerations acting on Delfi-C3
    acceleration_settings_delfi_c3 = dict(
        Earth=[propagation_setup.acceleration.point_mass_gravity()]
    )
    acceleration_settings = {"Delfi-C3": acceleration_settings_delfi_c3}
    # Create acceleration models
    acceleration_models = propagation_setup.create_acceleration_models(
        bodies, acceleration_settings, bodies_to_propagate, central_bodies
    )
    earth_gravitational_parameter = bodies.get("Earth").gravitational_parameter
    initial_state = element_conversion.keplerian_to_cartesian_elementwise(
        gravitational_parameter=earth_gravitational_parameter,
        semi_major_axis=arg_1,
        eccentricity=arg_2,
        inclination=np.deg2rad(85.3),
        argument_of_periapsis=np.deg2rad(235.7),
        longitude_of_ascending_node=np.deg2rad(23.4),
        true_anomaly=np.deg2rad(139.87),
    )
    # Create termination settings
    termination_settings = propagation_setup.propagator.time_termination(simulation_end_epoch)
    # Create numerical integrator settings
    fixed_step_size = 10.0
    integrator_settings = propagation_setup.integrator.runge_kutta_4(fixed_step_size)
    # Create propagation settings
    propagator_settings = propagation_setup.propagator.translational(
        central_bodies,
        acceleration_models,
        bodies_to_propagate,
        initial_state,
        simulation_start_epoch,
        integrator_settings,
        termination_settings
    )
    # Create simulation object and propagate the dynamics
    dynamics_simulator = numerical_simulation.create_dynamics_simulator(
        bodies, propagator_settings
    )
    # Extract the resulting state history and convert it to an ndarray
    states = dynamics_simulator.state_history
    return result2array(states)
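Each element of outputs is then the array returned by run_dynamics() for one sampled (semi-major axis, eccentricity) pair, with rows of the form [t, x, y, z, vx, vy, vz]. As a sketch of how such a list of arrays might be post-processed (synthetic data standing in for real propagation results), the final position norm of every run could be extracted like this:

```python
import numpy as np

# Synthetic stand-in for pool.starmap(run_dynamics, inputs):
# three runs, each an (n_epochs x 7) array of [t, x, y, z, vx, vy, vz]
outputs = [np.hstack([np.arange(5).reshape(-1, 1), np.ones((5, 6))])
           for _ in range(3)]

# Norm of the final position vector [m] for each Monte-Carlo run
final_radii = [np.linalg.norm(states[-1, 1:4]) for states in outputs]
print(final_radii)
```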
BFE Monte Carlo results
Regarding the performance of the BFE, a few results are shown in the table below. Once again, a substantial improvement in wall-clock time is observed when conducting Monte-Carlo analyses using tudatpy.
Note
These simulations were run on macOS Ventura 13.1 with a 3.1 GHz Quad-Core Intel Core i7 processor only. Four cores (CPUs) were used during the BFE.
| Number of experiments | Batch Fitness Evaluation | CPU time [s] | CPU usage [-] | Clock time [s] |
|---|---|---|---|---|
| 500 | no | 107.94 | 99% | 110.51 |
| 500 | yes | 118.07 | 381% | 32.07 |
| 2000 | no | 443.83 | 99% | 457.35 |
| 2000 | yes | 475.32 | 385% | 127.11 |
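From the clock times in the table, the parallel speedup and per-core efficiency can be estimated (four cores were used):

```python
# Clock times [s] from the N = 2000 rows of the table above
serial_clock = 457.35
parallel_clock = 127.11
n_cores = 4

speedup = serial_clock / parallel_clock  # ~3.6x on 4 cores
efficiency = speedup / n_cores           # ~90% per-core efficiency
print(f"Speedup: {speedup:.2f}x, efficiency: {efficiency:.0%}")
```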
Note
Other applications are possible and may be documented in the future. If you happen to implement any yourself, feel free to contact the developers or open a pull request.