Multi-Model Deployment Patterns

Patterns for deploying multiple TDSE models inside one host application.

Use this section when one model is no longer enough and you need to choose a deployment pattern: many independent models, one reused model for repeated sweeps, parallel copies for throughput, or separate models running at different rates.

Related Chapters

For one-handle ownership and shutdown rules, see Concurrency and Shutdown. For threading and memory scaling details, see Threading and Scaling. For variable time-stepping in multi-rate scenarios, see Variable Time-Step Integration.

Use these patterns when a deployment runs different TDSE models for different subsystems, or multiple copies of the same model for parameter sweeps.

Choose A Deployment Pattern

Situation | Recommended Pattern | Why
independent subsystems that can run side by side | N parallel independent models | simplest ownership story and highest operational clarity
same pack, repeated sweeps, memory is tight | batch sweep with one reused model | lowest memory footprint
same pack, repeated sweeps, throughput matters most | batch sweep with parallel copies | easiest way to scale wall-clock throughput
subsystems need different dt values | multi-rate coupling | keeps each model at the rate it actually needs
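
The table can also be read as a small decision function. The sketch below is illustrative only: the enum, the function name, and the order in which the conditions are checked are assumptions and not part of the TDSE API.

```c
/* Hypothetical names; these mirror the four rows of the table above. */
typedef enum {
    PATTERN_PARALLEL_INDEPENDENT,
    PATTERN_SWEEP_REUSED_MODEL,
    PATTERN_SWEEP_PARALLEL_COPIES,
    PATTERN_MULTI_RATE
} deploy_pattern_t;

/* Assumed priority: a dt mismatch is checked first, then the sweep case,
 * and independent parallel models are the fallback. */
deploy_pattern_t choose_pattern(int needs_different_dt,
                                int same_pack_sweep,
                                int memory_tight)
{
    if (needs_different_dt) {
        return PATTERN_MULTI_RATE;
    }
    if (same_pack_sweep) {
        return memory_tight ? PATTERN_SWEEP_REUSED_MODEL
                            : PATTERN_SWEEP_PARALLEL_COPIES;
    }
    return PATTERN_PARALLEL_INDEPENDENT;
}
```

A real deployment may combine patterns, for example multi-rate coupling inside each of several independent subsystems, so treat the result as a starting point.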

N Parallel Independent Models

This is the default production pattern. Each model has its own handle, its own state, and a clear owner.

#define N_MODELS 8
tdse_model_t* models[N_MODELS];

for (int i = 0; i < N_MODELS; i++) {
    tdse_model_create_diagnostics_t diag = tdse_model_create_diagnostics_init();
    tdse_model_create(packs[i], pack_sizes[i], &diag, &models[i]);
}

/* Each worker thread owns one model */
#pragma omp parallel for
for (int i = 0; i < N_MODELS; i++) {
    for (uint64_t n = 0; n < nsteps; ++n) {
        tdse_step_begin(models[i], t[n], dt);
        tdse_step_op(models[i], &op);
        tdse_step_hr(models[i], hr);
        tdse_step_commit(models[i], primary[i]);
    }
}

Rules:

  • Apply the one-handle rule from Concurrency and Shutdown: one worker owns one live handle at a time.
  • Each handle has its own history buffer and state machine.
  • Models may use different packs, different dt, or different backends.

Per-Model Resource Budget

Each model handle consumes its own resources. When planning a deployment:

Resource | Per Model | Shared
History ring buffer | yes (nh * nq * sizeof(double)) | no
Operator workspace | yes (nq * np * sizeof(double)) | no
GPU stream/context | yes (when using CUDA backend) | GPU device memory is shared
CPU thread pool | configurable via tdse_local_threads_set | OS thread pool
Backend selection | per-model via tdse_backend_set | Backend registry
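
The two size formulas in the table can be turned into a rough host-memory estimate. This is a minimal sketch: the helper names are hypothetical, nh, nq, and np are the dimensions used in the table, and any allocation the table does not list is folded into a caller-supplied overhead term.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the TDSE API.
 * Sums the two per-model buffers listed in the table. */
size_t model_host_bytes(size_t nh, size_t nq, size_t np)
{
    size_t history   = nh * nq * sizeof(double);  /* history ring buffer */
    size_t workspace = nq * np * sizeof(double);  /* operator workspace  */
    return history + workspace;
}

/* Total for n_models identical models plus a shared overhead
 * that the caller estimates separately. */
size_t deployment_host_bytes(size_t n_models, size_t nh, size_t nq, size_t np,
                             size_t shared_overhead_bytes)
{
    return n_models * model_host_bytes(nh, nq, np) + shared_overhead_bytes;
}
```

For example, with nh = 64, nq = 1000, np = 16 and 8-byte doubles, one model needs 512,000 + 128,000 = 640,000 bytes for these two buffers, and eight such models need 5,120,000 bytes before overhead.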

Use per-model resource controls only after ownership is already stable:

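/* Assign CPU threads per model */
int threads_per_model = physical_cores / N_MODELS;
if (threads_per_model < 1) {
    /* Integer division yields 0 when there are more models than cores */
    threads_per_model = 1;
}
for (int i = 0; i < N_MODELS; i++) {
    tdse_local_threads_set(models[i], threads_per_model);
}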
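
The even split above leaves cores idle whenever physical_cores is not a multiple of N_MODELS. One possible refinement, sketched below with a hypothetical helper name, gives the leftover cores to the first few models and keeps every model at one thread or more.

```c
/* Hypothetical helper, not part of the TDSE API.
 * Writes one thread count per model into out[0..n_models-1]. */
void split_threads(int physical_cores, int n_models, int *out)
{
    int base  = physical_cores / n_models;  /* threads every model gets      */
    int extra = physical_cores % n_models;  /* leftover cores to hand out    */

    for (int i = 0; i < n_models; i++) {
        int t = base + (i < extra ? 1 : 0);
        out[i] = t > 0 ? t : 1;             /* oversubscribe rather than 0   */
    }
}
```

Each out[i] would then be passed to tdse_local_threads_set(models[i], out[i]). When there are more models than cores, every model gets one thread and the host is oversubscribed, so measure before relying on that configuration.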

GPU Sharing Across Models

Multiple models can share the same GPU. Each gets its own CUDA stream, but device memory is shared.

Guidelines:

  • Estimate total GPU memory as sum(per_model_gpu_footprint) + overhead (~50-100 MB).
  • Monitor with nvidia-smi during initial deployment testing.
  • If GPU allocation fails for any model, tdse_model_create returns TDSE_ERR_OUT_OF_MEMORY.
  • Prefer the async pipeline mode for concurrent GPU models:
for (int i = 0; i < N_MODELS; i++) {
    tdse_cuda_backend_config_t cuda_cfg;
    tdse_cuda_backend_get_config(models[i], &cuda_cfg);
    cuda_cfg.pipeline_mode = TDSE_CUDA_PIPELINE_ASYNC;
    tdse_cuda_backend_set_config(models[i], &cuda_cfg);
}
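
The memory guideline above is simple arithmetic and can be checked before any model is created. The sketch below assumes you already have a per-model GPU footprint estimate in bytes; the helper name is hypothetical and the function does not query the device.

```c
#include <stddef.h>

#define MIB ((size_t)1024 * 1024)

/* Hypothetical helper, not part of the TDSE API.
 * Returns 1 if sum(per_model_bytes) + overhead_bytes fits in device_bytes,
 * otherwise 0. */
int gpu_budget_fits(const size_t *per_model_bytes, size_t n_models,
                    size_t overhead_bytes, size_t device_bytes)
{
    size_t total = overhead_bytes;
    for (size_t i = 0; i < n_models; i++) {
        total += per_model_bytes[i];
    }
    return total <= device_bytes;
}
```

Using the upper end of the overhead range (100 MB) gives the more conservative answer. A passing check does not replace monitoring with nvidia-smi, since other processes may also hold device memory.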

Batch Sweep Pattern

Use this pattern when the mathematical model and pack structure stay the same but inputs, operating points, or sweep values change:

tdse_model_create_diagnostics_t diag = tdse_model_create_diagnostics_init();
tdse_model_t* base_model;
tdse_model_create(pack, pack_size, &diag, &base_model);

/* Option A: Sequential reuse with reset between sweeps */
for (int sweep = 0; sweep < N_SWEEPS; sweep++) {
    for (uint64_t n = 0; n < nsteps; ++n) {
        tdse_step_begin(base_model, t[n], dt);
        /* ... solve with sweep-specific primary vectors ... */
        tdse_step_commit(base_model, primary_sweep[sweep]);
    }
    tdse_model_reset(base_model);  /* clear committed history for next sweep */
}

/* Option B: Parallel sweep with one model per sweep value */
tdse_model_t* sweep_models[N_SWEEPS];
for (int s = 0; s < N_SWEEPS; s++) {
    tdse_model_create(pack, pack_size, &diag, &sweep_models[s]);
}

Read the tradeoff plainly:

  • Option A is memory-efficient and simpler to operate.
  • Option B is throughput-efficient and easier to spread across workers or devices.
  • If repeated sweeps are frequent but not latency-sensitive, start with Option A.
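
The choice between the two options often comes down to how many model copies fit in memory. The sketch below is an assumption-laden illustration: the helper name is hypothetical, the per-model footprint is an estimate you supply, and the in-between case (more than one copy but fewer than N_SWEEPS) is a hybrid that this section does not describe.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the TDSE API.
 * Returns how many model copies fit in budget_bytes, capped at n_sweeps:
 *   0         -> not even one model fits
 *   1         -> Option A (one reused model)
 *   n_sweeps  -> Option B (one model per sweep value) */
size_t sweep_copies(size_t n_sweeps, size_t per_model_bytes, size_t budget_bytes)
{
    if (n_sweeps == 0 || per_model_bytes == 0) {
        return n_sweeps;  /* nothing to size; avoid dividing by zero */
    }
    size_t fit = budget_bytes / per_model_bytes;
    return fit < n_sweeps ? fit : n_sweeps;
}
```

A result between 1 and n_sweeps suggests running the sweep in batches: create that many copies, assign each a slice of the sweep values, and reset each copy between its sweeps as in Option A.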

Multi-Rate Coupling

When different subsystems require different time resolutions, use separate models with different model_dt values and synchronize at coupling boundaries:

tdse_model_t* fast;  /* model_dt = 1 ns */
tdse_model_t* slow;  /* model_dt = 10 ns */
double t_fast = 0.0;
double t_slow = 0.0;

for (uint64_t step = 0; step < TOTAL_STEPS; step++) {
    tdse_step_begin(fast, t_fast, 1e-9);
    /* ... step fast model ... */
    tdse_step_commit(fast, primary_fast);

    if (step % 10 == 0) {
        /* Extract coupling variables from fast model */
        /* ... */
        tdse_step_begin(slow, t_slow, 10e-9);
        /* ... step slow model with coupled inputs ... */
        tdse_step_commit(slow, primary_slow);
        t_slow += 10e-9;  /* advance only after a slow step is committed */
    }

    t_fast += 1e-9;
}
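
Accumulating 1e-9 over many iterations lets rounding error build up in the time variables. A possible alternative, sketched here with hypothetical helper names, derives both times from the integer step index so the fast and slow clocks cannot drift apart. The ratio parameter is the slow dt divided by the fast dt (10 in the loop above).

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the TDSE API. */

/* Nonzero when the slow model should step on this fast step. */
int slow_step_due(uint64_t fast_step, uint64_t ratio)
{
    return fast_step % ratio == 0;
}

/* Start time of fast step number fast_step. */
double fast_time(uint64_t fast_step, double fast_dt)
{
    return (double)fast_step * fast_dt;
}

/* Start time of the slow step whose window contains fast_step. */
double slow_time(uint64_t fast_step, uint64_t ratio, double fast_dt)
{
    return (double)(fast_step / ratio) * (double)ratio * fast_dt;
}
```

With ratio = 10, the slow model is due on fast steps 0, 10, 20, and so on, and slow_time returns the same value for every fast step inside one slow window.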

See Variable Time-Step Integration for more details on multi-rate patterns.

Deployment Checklist

  • Estimate total memory: N * per-model footprint + shared overhead
  • Assign thread resources: divide local_threads across models
  • Select backend per model: CPU for small models, GPU for large ones
  • Verify GPU memory budget if using CUDA backend
  • Choose sweep strategy: sequential with reset vs. parallel copies
  • Test scaling: run with 1, 2, 4, 8 models and measure throughput per model
  • Monitor guard metrics on each model independently
  • Keep shutdown ownership clear: destroy each model on its owning thread or wrapper path
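
For the scaling test in the checklist, a small amount of arithmetic turns raw timings into comparable numbers. The sketch below uses hypothetical helper names and assumes each run executes the same number of steps per model and that you measure wall-clock time yourself.

```c
/* Hypothetical helpers, not part of the TDSE API. */

/* Steps per second achieved by each model in a run where all models
 * executed steps_per_model steps concurrently in wall_seconds. */
double per_model_throughput(double steps_per_model, double wall_seconds)
{
    return steps_per_model / wall_seconds;
}

/* Ratio of per-model throughput with N models to the single-model
 * baseline; 1.0 means adding models cost nothing per model. */
double scaling_efficiency(double throughput_n, double throughput_1)
{
    return throughput_n / throughput_1;
}
```

Record the efficiency for 1, 2, 4, and 8 models. A sharp drop between two counts usually points at a shared resource such as memory bandwidth, the thread pool, or GPU memory.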