Parallelization with Python

This page introduces parallelization in Python, with a focus on its use with tudatpy. Tudatpy has many applications, and many of them can be parallelized. For parallelization in combination with PyGMO specifically, further reading is available under Parallelization with PyGMO.

General parallelization with Python

In Python, data processing can be parallelized in various ways. One option is to use GPUs, but that is not discussed here. For CPU-based parallelization in Python, there are generally two approaches: multiprocessing and multithreading. Multiprocessing starts multiple processes, which run on independent CPU cores with independent memory management. Multithreading uses multiple threads within a single parent process, with shared memory. On systems with simultaneous multithreading (hyper-threading), there are typically two hardware threads per core, and each computer system has its own number of cores with its own specifications. The degree of parallelism is therefore determined by the system you run on.
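The degree of parallelism a machine offers can be queried at runtime; a minimal sketch (the variable name is illustrative):

```python
import multiprocessing as mp

# Number of logical CPUs the operating system reports; with hyper-threading
# this is typically twice the number of physical cores
n_logical = mp.cpu_count()
print(f"Logical CPUs available: {n_logical}")
```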

It should be noted that it does not always make sense to parallelize your simulations. Initializing parallel tasks takes time, so there is a break-even point beyond which parallelization is worthwhile, shown in Multi-threading with Batch Fitness Evaluation. To enable parallel behavior in Python, the multiprocessing module is used here. More modern alternatives exist as well, but they are not as widespread or as thoroughly documented. Ray, for instance, is one of these packages; it is arguably more seamless, but it is also rather new and focused on AI applications.

All parallel processing should be placed under if __name__ == "__main__":. This line ensures that the code is only run if the file is executed directly (and not, for example, imported). It prevents an infinite loop when creating new child processes or starting calculations on other threads: if the line is omitted, each child process imports the Python script, runs the same script again, and thereby spawns more child processes. Next, mp.get_context("spawn") returns a context object that has the same attributes as the multiprocessing module. The "spawn" argument selects the method used to create new Python processes. "spawn" starts a fresh Python interpreter process and is the default on macOS and Windows. "fork" copies the current Python process using os.fork() and is the default on Linux. "forkserver" creates a server process; whenever a new process is needed, it is requested from the server, which uses "fork" to create it. The start method can generally be left at its platform default.
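The start methods supported on the current platform, as well as the context object itself, can be inspected without creating any processes; a minimal sketch:

```python
import multiprocessing as mp

# Start methods supported on this platform ("fork" is unavailable on Windows)
print(mp.get_all_start_methods())

# A context object exposes the same attributes as the multiprocessing module,
# but with the start method fixed to the one requested
ctx = mp.get_context("spawn")  # "spawn" is available on every platform
print(ctx.get_start_method())  # → spawn
```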

A Pool object is created temporarily: it is a collection of worker processes that can be allocated to computational tasks. The number of cores to assign to the Pool is given as an argument. Subsequently, the map() or starmap() method allows a function to be applied to an iterable, rather than to a single argument: map() passes a single argument to the function, while starmap() passes multiple arguments. The input iterable is a list of tuples, where each tuple is one set of input arguments. The outputs are formatted analogously: a list in which each element holds the return values of one function call.
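The difference between map() and starmap() is easiest to see with their sequential counterparts, the built-in map() and itertools.starmap(), which use the same call semantics:

```python
from itertools import starmap

def add(a, b):
    return a + b

inputs = [(1, 2), (3, 4), (5, 6)]

# map() would pass each tuple as a single argument; starmap() unpacks each
# tuple into separate arguments, exactly like pool.starmap(add, inputs)
sums = list(starmap(add, inputs))
print(sums)  # → [3, 7, 11]
```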

Required

import multiprocessing as mp
import numpy as np

from tudatpy.kernel.numerical_simulation import environment_setup, propagation_setup
from tudatpy.kernel.interface import spice
def run_simulation(arg_1, arg_2):
    # Do some tudat things...
    return 1, arg_1 + arg_2

# Main script
if __name__ == "__main__":
    # Number of simulations to run
    N = 500
    arg_1_list = np.random.normal(-100, 50, size=N)
    arg_2_list = np.random.normal(1e6, 2e5, size=N)

    # Combine list of inputs
    inputs = []
    for i in range(N):
        inputs.append((arg_1_list[i], arg_2_list[i]))

    # Run simulations in parallel, using half the available cores
    n_cores = mp.cpu_count()//2
    with mp.get_context("spawn").Pool(n_cores) as pool:
        outputs = pool.starmap(run_simulation, inputs)

Note

The memory is freed only after all the outputs have been collected. If a large number of simulations is run, it may be wise to split the list of inputs into smaller batches to avoid overflowing the memory.
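One possible way to implement such batching is sketched below. The helper names are illustrative, and sequential evaluation is used as a stand-in so the sketch is self-contained; inside the if __name__ == "__main__": guard, the evaluate argument would instead wrap pool.starmap():

```python
from itertools import starmap

def run_simulation(arg_1, arg_2):
    # Stand-in for an expensive tudat simulation
    return 1, arg_1 + arg_2

def run_in_batches(inputs, batch_size, evaluate):
    """Evaluate the inputs batch by batch, keeping only a reduced result per batch."""
    results = []
    for start in range(0, len(inputs), batch_size):
        batch = inputs[start:start + batch_size]
        outputs = evaluate(batch)
        # Reduce each batch immediately (here: keep only the second return value),
        # so the full list of raw outputs never resides in memory at once
        results.extend(out[1] for out in outputs)
    return results

# Sequential evaluation; in the parallel case this would be
# evaluate=lambda batch: pool.starmap(run_simulation, batch)
results = run_in_batches([(1, 2), (3, 4), (5, 6)], batch_size=2,
                         evaluate=lambda batch: list(starmap(run_simulation, batch)))
print(results)  # → [3, 7, 11]
```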

See also

Other ways to specify the context or create a Pool object are also possible, more can be read on the multiprocessing documentation page.

Batch Fitness Evaluation for Monte-Carlo analysis

In this section, the basic structure is presented that allows for a simple, parallel Monte-Carlo analysis of any problem. An astrodynamics example is used for obvious reasons: the Kepler satellite orbit. With this setup, any parameter can be varied and the Monte-Carlo simulations run in parallel.

BFE Monte Carlo code structure

In the snippet below, the implementation can be seen. It is straightforward and looks very similar to General parallelization with Python. The run_simulation() function from that section is replaced here by run_dynamics(). The same concepts apply, but rather than two numbers being returned without further calculation, the inputs are now the semi-major axis and eccentricity of the initial state, which have a profound influence on the resulting orbit.

Required

# Load bfe modules
import multiprocessing as mp

# Load standard modules
import numpy as np
from matplotlib import pyplot as plt

# Load tudatpy modules
from tudatpy.kernel.interface import spice
from tudatpy.kernel import numerical_simulation
from tudatpy.kernel.numerical_simulation import environment_setup, propagation_setup
from tudatpy.kernel.astro import element_conversion
from tudatpy.kernel import constants
from tudatpy.util import result2array
if __name__ == "__main__":

    plot = False
    simulation_type = 'normal'  # 'bfe' or 'normal'

    # Monte-Carlo parameters
    bounds = [[7000e3, 8000e3], [0.1, 0.6]]  # bounds on semi-major axis [m] and eccentricity [-]
    N = 2000
    n_cores = 4

    # Setup inputs for MC with BFE
    arg_dict = {}
    for it, bound in enumerate(bounds):
        arg_dict[it] = np.random.uniform(bound[0], bound[1], size=N)

    inputs = []
    for k in range(len(arg_dict[0])):
        inputs.append(tuple(arg_dict[p][k] for p in range(2)))

    # Run parallel MC analysis (run_dynamics() is defined in the next snippet)
    with mp.get_context("spawn").Pool(n_cores) as pool:
        outputs = pool.starmap(run_dynamics, inputs)

The basic BFE structure can be seen above. Below, the run_dynamics() function is shown, which is almost identical to the code from the Kepler satellite orbit, with the small adjustment that the initial state is defined by the input arguments to the function rather than manually.

Required

# Load bfe modules
import multiprocessing as mp

# Load standard modules
import numpy as np
from matplotlib import pyplot as plt

# Load tudatpy modules
from tudatpy.kernel.interface import spice
from tudatpy.kernel import numerical_simulation
from tudatpy.kernel.numerical_simulation import environment_setup, propagation_setup
from tudatpy.kernel.astro import element_conversion
from tudatpy.kernel import constants
from tudatpy.util import result2array
def run_dynamics(arg_1, arg_2, _kernels_loaded=spice.load_standard_kernels()):
    """
    Function that creates the initial conditions, termination settings, and propagation
    settings, runs the create_dynamics_simulator() function, and returns the state
    history as an array.

    The default argument is a deliberate trick: spice.load_standard_kernels() is
    evaluated once, when this module is imported, so each spawned worker process
    loads the SPICE kernels exactly once.
    """
    # Set simulation start and end epochs
    simulation_start_epoch = 0.0
    simulation_end_epoch = constants.JULIAN_DAY

    # Create default body settings for "Earth"
    bodies_to_create = ["Earth"]

    # Create default body settings for bodies_to_create, with "Earth"/"J2000" as the global frame origin and orientation
    global_frame_origin = "Earth"
    global_frame_orientation = "J2000"
    body_settings = environment_setup.get_default_body_settings(
        bodies_to_create, global_frame_origin, global_frame_orientation)

    # Create system of bodies (in this case only Earth)
    bodies = environment_setup.create_system_of_bodies(body_settings)
    bodies.create_empty_body("Delfi-C3")
    bodies_to_propagate = ["Delfi-C3"]
    central_bodies = ["Earth"]

    # Define accelerations acting on Delfi-C3
    acceleration_settings_delfi_c3 = dict(
        Earth=[propagation_setup.acceleration.point_mass_gravity()]
    )

    acceleration_settings = {"Delfi-C3": acceleration_settings_delfi_c3}

    # Create acceleration models
    acceleration_models = propagation_setup.create_acceleration_models(
        bodies, acceleration_settings, bodies_to_propagate, central_bodies
    )

    earth_gravitational_parameter = bodies.get("Earth").gravitational_parameter
    initial_state = element_conversion.keplerian_to_cartesian_elementwise(
        gravitational_parameter=earth_gravitational_parameter,
        semi_major_axis=arg_1,
        eccentricity=arg_2,
        inclination=np.deg2rad(85.3),
        argument_of_periapsis=np.deg2rad(235.7),
        longitude_of_ascending_node=np.deg2rad(23.4),
        true_anomaly=np.deg2rad(139.87),
    )

    # Create termination settings
    termination_settings = propagation_setup.propagator.time_termination(simulation_end_epoch)

    # Create numerical integrator settings
    fixed_step_size = 10.0
    integrator_settings = propagation_setup.integrator.runge_kutta_4(fixed_step_size)

    # Create propagation settings
    propagator_settings = propagation_setup.propagator.translational(
        central_bodies,
        acceleration_models,
        bodies_to_propagate,
        initial_state,
        simulation_start_epoch,
        integrator_settings,
        termination_settings
    )

    # Create simulation object and propagate the dynamics
    dynamics_simulator = numerical_simulation.create_dynamics_simulator(
        bodies, propagator_settings
    )

    # Extract the resulting state history and convert it to an ndarray
    states = dynamics_simulator.state_history
    return result2array(states)

BFE Monte Carlo results

Regarding the performance of the BFE, a few results are shown in the table below. Once again, a substantial reduction in wall-clock time is observed when conducting Monte-Carlo analyses with tudatpy in parallel, at the cost of a slightly higher total CPU time.

Note

These simulations were run on macOS Ventura 13.1 with a 3.1 GHz quad-core Intel Core i7 processor only. Four cores are used during the BFE.

| Number of experiments | Batch Fitness Evaluation | CPU time [s] | CPU usage [-] | Clock time [s] |
|---|---|---|---|---|
| 500 | no | 107.94 | 99% | 110.51 |
| 500 | yes | 118.07 | 381% | 32.07 |
| 2000 | no | 443.83 | 99% | 457.35 |
| 2000 | yes | 475.32 | 385% | 127.11 |
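The speedup implied by the wall-clock times above can be verified directly; with four worker processes the ideal factor is 4, and the measured factors fall somewhat short of it because of process start-up and serialization overhead:

```python
# Wall-clock times [s] from the table above (no BFE vs. BFE, four cores)
speedup_500 = 110.51 / 32.07
speedup_2000 = 457.35 / 127.11
print(f"500 runs:  {speedup_500:.2f}x")   # ≈ 3.45x
print(f"2000 runs: {speedup_2000:.2f}x")  # ≈ 3.60x
```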

Note

Other applications are possible and may be documented in the future. If you happen to implement any yourself, feel free to contact the developers or open a pull request.