Concurrency, synchronization, cancellation, and orderly runtime shutdown guidance.

Use this section when TDSE Runtime is live in more than one thread, worker, or shutdown path. It explains the one-handle rule, which calls can overlap safely, and how to reason about contention and teardown races.

Related chapters: for the base lifecycle semantics, see Runtime Lifecycle. For threading and memory scaling with many models, see Threading and Scaling.

Most production Runtime incidents in this area are not numerical defects. They are ownership defects: two threads think they own the same handle, shutdown starts while a step call is still live, or host wrappers treat release as a policy API instead of a finalizer.

Read this chapter together with Lifecycle and Ownership: lifecycle explains what local ownership means; this chapter explains how that ownership behaves under contention and teardown.

The One-Handle Rule

The core runtime concurrency rule is simple:

a live tdse_model_t* handle must not be entered by more than one same-handle runtime API call at a time

The guarded same-handle surface covers both the step APIs and the teardown APIs:

tdse_step_begin(...)
tdse_step_op(...)
tdse_step_hr(...)
tdse_step_ir(...)
tdse_step_commit(...)
tdse_step_dr(...)
tdse_model_close(...)
tdse_model_destroy(...)
tdse_model_release(...)

Figure: Supported versus unsupported host ownership

The runtime behavior is rejection, not automatic serialization. That is deliberate: silent serialization would hide ownership bugs and make timing behavior much harder to diagnose.
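The difference between rejection and serialization can be illustrated with a small host-side model. This is a minimal sketch and not the runtime's implementation: `Status`, `EntryGuard`, and `guarded_call` are hypothetical names, and the enum values only stand in for TDSE_OK and TDSE_ERR_CONCURRENT_API_USE. It shows a guard that fails fast on overlap instead of blocking the second caller.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Stand-ins for TDSE_OK and TDSE_ERR_CONCURRENT_API_USE (hypothetical enum).
enum class Status { Ok, ConcurrentApiUse };

// Hypothetical model of a per-handle entry guard that rejects overlap.
class EntryGuard {
 public:
  // Returns true if the caller entered, false if another call is in flight.
  bool try_enter() noexcept {
    bool expected = false;
    return busy_.compare_exchange_strong(expected, true,
                                         std::memory_order_acquire);
  }

  void leave() noexcept {
    assert(busy_.load() && "leave() without a matching try_enter()");
    busy_.store(false, std::memory_order_release);
  }

 private:
  std::atomic<bool> busy_{false};
};

// Runs fn only if the guard is free. It never waits for the current owner.
template <typename Fn>
Status guarded_call(EntryGuard& guard, Fn&& fn) {
  if (!guard.try_enter()) {
    return Status::ConcurrentApiUse;  // rejected, not queued
  }
  // Leave the guard on every exit path, including exceptions thrown by fn.
  struct Leaver {
    EntryGuard& guard;
    ~Leaver() { guard.leave(); }
  } leaver{guard};
  std::forward<Fn>(fn)();
  return Status::Ok;
}
```

A second `guarded_call` issued while the first is still inside returns the rejection status immediately, which is what makes an ownership bug visible in host logs instead of turning it into unexplained latency.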

Which APIs Are Guarded Versus Snapshot-Style

Not every API participates in the same-handle execution guard. The most important distinction is:

execution APIs and teardown APIs are guard-sensitive
metadata and diagnostics snapshot APIs are live-handle reads

Snapshot-style query APIs are:

tdse_model_info(...)
tdse_model_state_info(...)
tdse_model_last_error_info(...)

These queries do not enter the same-handle execution guard. On a live handle they are intended to remain readable even if step traffic is happening on another thread. They still stop being valid once close or destroy has already started on that handle, in which case they return TDSE_ERR_INVALID_STATE.

Operational consequence:

do not use snapshot-query success as evidence that a handle is safe to step from another thread
do use snapshot queries to capture diagnostics before teardown begins or after a step failure

Runtime Guarantees Under Conflict

When runtime detects same-handle overlap on the guarded surface:

conflicting entrants are rejected rather than queued
step conflicts return TDSE_ERR_CONCURRENT_API_USE
lifecycle conflicts may return TDSE_ERR_CONCURRENT_API_USE, TDSE_ERR_TIMEOUT, or TDSE_ERR_INVALID_STATE depending on which API raced and whether ownership already moved
runtime avoids deadlock as part of the supported behavior

Important nuance:

forced overlap does not mean every entrant fails
one caller may legitimately acquire the handle first and succeed
conflicting entrants are rejected safely and observably

That distinction matters in stress tests. Under intentional race injection, "one winner plus one or more rejected entrants" is the expected shape.
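The expected stress-test shape can be sketched with a fake handle. Everything here is an assumption made for illustration: `FakeHandle`, `fake_guarded_call`, and `run_forced_overlap` are hypothetical, and the guard is a simplified model rather than the runtime's own. The winner is held inside the guard until every contender has tried, so the overlap is forced rather than left to scheduling.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Stand-ins for TDSE_OK and TDSE_ERR_CONCURRENT_API_USE (hypothetical enum).
enum class Status { Ok, ConcurrentApiUse };

// Hypothetical stand-in for one guarded runtime handle.
struct FakeHandle {
  std::atomic<bool> entered{false};
};

// Models a guarded call: overlap is rejected, never queued.
template <typename Body>
Status fake_guarded_call(FakeHandle& handle, Body&& body) {
  bool expected = false;
  if (!handle.entered.compare_exchange_strong(expected, true)) {
    return Status::ConcurrentApiUse;
  }
  body();  // work performed while the handle is entered
  handle.entered.store(false);
  return Status::Ok;
}

struct RaceShape {
  int winners = 0;
  int rejected = 0;
};

// Forces overlap: the winner stays inside until all contenders have tried.
RaceShape run_forced_overlap(int contenders) {
  assert(contenders >= 0);
  FakeHandle handle;
  std::atomic<int> rejected{0};

  const Status winner = fake_guarded_call(handle, [&] {
    std::vector<std::thread> threads;
    for (int i = 0; i < contenders; ++i) {
      threads.emplace_back([&] {
        if (fake_guarded_call(handle, [] {}) == Status::ConcurrentApiUse) {
          ++rejected;
        }
      });
    }
    for (std::thread& t : threads) t.join();
  });

  RaceShape shape;
  shape.winners = (winner == Status::Ok) ? 1 : 0;
  shape.rejected = rejected.load();
  return shape;
}
```

A test built this way should assert on the shape (one winner, the rest rejected) rather than on every call failing.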

Safe Parallelism Model

Supported host parallelism looks like this:

one simulation worker owns one runtime handle
different handles may run concurrently
optional internal execution strategy may parallelize work inside one step call
internal parallel execution does not make a single handle concurrently callable

Recommended host rule:

assign both execution ownership and shutdown ownership to the same wrapper or thread controller

If those responsibilities are split across components, the ownership handoff protocol must be explicit rather than implied.

Teardown State Model

The shutdown APIs are easiest to reason about as an ownership state machine.

Figure: Teardown-oriented handle states

Two rules keep this understandable:

once another thread acquires teardown ownership, the handle is no longer locally usable to you
TDSE_ERR_TIMEOUT from destroy is the one status that means the handle is still live after return

Close, Destroy, And Release Under Contention

`tdse_model_close(...)`

close is the immediate-answer lifecycle API.

Use it when:

you want a synchronous non-waiting lifecycle result
you want to detect overlap instead of waiting through it

Under contention:

if another guarded same-handle API is still in flight, close returns TDSE_ERR_CONCURRENT_API_USE
if another thread already started close or destroy, close returns TDSE_ERR_INVALID_STATE

Operational reading:

close is not a "fast destroy"
it is the right API when overlap itself is the information you need

`tdse_model_destroy(...)`

destroy is the recommended business-logic shutdown API because it makes wait policy explicit.

Use it when:

your host owns lifecycle policy
you need bounded wait behavior
you want structured wait telemetry

Destroy outcomes:

TDSE_OK: teardown completed and storage is gone
TDSE_ERR_TIMEOUT: destroy could not acquire teardown ownership within the budget; the handle remains valid
TDSE_ERR_INVALID_STATE: another thread already owns close or destroy; local ownership is gone
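The three outcomes above can be turned into a host-side reading that decides what happens to the local pointer. This is a sketch under stated assumptions: `DestroyStatus`, `DestroyReading`, and `read_destroy_status` are hypothetical names, and the enum only stands in for the status codes listed above. The real status type comes from the runtime header.

```cpp
#include <cassert>

// Stand-ins for TDSE_OK, TDSE_ERR_TIMEOUT, TDSE_ERR_INVALID_STATE.
enum class DestroyStatus { Ok, Timeout, InvalidState };

// What the calling thread should conclude after destroy returns.
struct DestroyReading {
  bool locally_usable;          // true only when destroy timed out
  bool clear_local_reference;   // true when local ownership is gone
  bool repair_ownership_model;  // true when another thread owned teardown
};

inline DestroyReading read_destroy_status(DestroyStatus status) {
  switch (status) {
    case DestroyStatus::Ok:
      // Teardown completed and storage is gone.
      return {false, true, false};
    case DestroyStatus::Timeout:
      // Handle remains valid; host policy must decide the next step.
      return {true, false, false};
    case DestroyStatus::InvalidState:
      // Another thread already owns close or destroy.
      return {false, true, true};
  }
  assert(false && "unhandled destroy status");
  return {false, true, false};
}
```

Keeping this mapping in one function avoids the common mistake of treating a timeout like a completed teardown in one call site and like a live handle in another.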

Figure: Destroy race interpretation

`tdse_model_release(...)`

release is for terminal cleanup, not shutdown policy.

Use it when:

a destructor must not become a policy engine
a finally or unwind path needs best-effort terminal cleanup

Under contention:

once release owns terminal cleanup, it waits for any in-flight same-handle API to leave the guard
if another thread already started close, destroy, or release, release returns TDSE_ERR_INVALID_STATE

Operational reading:

release is acceptable in finalizers because it is cleanup-oriented
release is a poor choice for ordinary host shutdown because it does not carry a bounded-wait policy

Race Matrix

Use this matrix when a shutdown report is unclear about which thread acted first.

Situation	`close` sees	`destroy` sees	`release` sees	What the caller should assume
step call still in flight on same handle	`TDSE_ERR_CONCURRENT_API_USE`	`TDSE_OK` or `TDSE_ERR_TIMEOUT` depending on wait budget	waits until it can clean up	handle ownership is still local only if destroy timed out
another thread already started close/destroy	`TDSE_ERR_INVALID_STATE`	`TDSE_ERR_INVALID_STATE`	`TDSE_ERR_INVALID_STATE`	local ownership is gone
no same-handle activity, caller owns handle	`TDSE_OK`	`TDSE_OK`	`TDSE_OK`	storage is gone after return
destroy timed out while waiting	n/a	`TDSE_ERR_TIMEOUT`	n/a	handle is still live and policy must decide next step

The support-facing rule is:

only TDSE_ERR_TIMEOUT from destroy preserves local ownership after return

Query Behavior During Shutdown

Support incidents often ask whether a host can still query state while shutdown is underway. Use the strict answer:

before close or destroy starts, snapshot queries are allowed on a live handle
after close or destroy has started, tdse_model_info(...), tdse_model_state_info(...), and tdse_model_last_error_info(...) may return TDSE_ERR_INVALID_STATE
after successful close, destroy, or release, no further handle use is valid

That means the host should capture evidence before starting teardown whenever possible.

Recommended diagnostic order on a failing live handle:

tdse_model_info(...)
tdse_model_state_info(...)
tdse_model_last_error_info(...)
chosen shutdown API and timeout policy
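The ordering above can be enforced in host code by running the snapshot captures first and starting teardown last. The sketch below avoids assuming any runtime signature: `CaptureStep` and `capture_then_teardown` are hypothetical, and each callback would wrap whatever query or shutdown call the host actually uses.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// One named capture step. capture returns true if the snapshot was read.
struct CaptureStep {
  std::string name;
  std::function<bool()> capture;
};

// Runs every capture step in order, then starts teardown last.
// Returns the names of the steps that succeeded, for the host log.
std::vector<std::string> capture_then_teardown(
    const std::vector<CaptureStep>& steps,
    const std::function<void()>& start_teardown) {
  assert(start_teardown && "a teardown callback is required");
  std::vector<std::string> captured;
  for (const CaptureStep& step : steps) {
    // A failed snapshot must not block the remaining steps or teardown.
    if (step.capture && step.capture()) {
      captured.push_back(step.name);
    }
  }
  start_teardown();
  return captured;
}
```

Returning the list of captured names lets the host record which evidence exists before the handle becomes unusable.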

Bounded Destroy Policy

tdse_model_destroy_options_t.wait_timeout_ms is part of the public behavior, not a tuning footnote.

Interpret the wait budget as:

negative value: intentional infinite wait
small bounded value: supervisory shutdown that prefers a fast answer
medium bounded value: ordinary business-logic cleanup where some overlap is tolerated
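One way to keep the budget explicit is to choose it from a named role instead of scattering literals through the host. The roles and millisecond values below are illustrative assumptions, not recommended defaults, and the signed 32-bit type is a guess at how wait_timeout_ms might be represented.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side roles for a shutdown request.
enum class ShutdownRole { Supervisory, BusinessLogic, FinalDrain };

// Illustrative budgets in milliseconds; real values are a host policy choice.
inline std::int32_t wait_timeout_ms_for(ShutdownRole role) {
  switch (role) {
    case ShutdownRole::Supervisory:
      return 50;    // small bounded value: prefers a fast answer
    case ShutdownRole::BusinessLogic:
      return 2000;  // medium bounded value: tolerates some overlap
    case ShutdownRole::FinalDrain:
      return -1;    // negative value: intentional infinite wait
  }
  assert(false && "unhandled shutdown role");
  return 50;
}
```

Logging the role next to the returned budget makes it clear during review where the timeout policy was chosen.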

Worked Host Patterns

Pattern A. Worker-Owned Handle With Clean Shutdown

This is the preferred product integration pattern:

worker thread creates the handle
worker thread performs the step loop
worker thread or its owner wrapper initiates destroy
no other thread touches the handle after shutdown starts

Why it works:

execution ownership and teardown ownership stay aligned
there is no ambiguity about who records diagnostics or clears references

Pattern B. Supervisor Requests Stop, Worker Performs Destroy

This is often better than having the supervisor destroy directly:

supervisor sets a stop request in host code
worker exits its loop at a safe boundary
worker performs tdse_model_destroy(...)
supervisor observes the result through host telemetry

Why it works:

it avoids same-handle races between a live step call and a remote destroy
it keeps timeout policy near the code that already owns the handle
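Pattern B can be sketched with a stop flag and a worker thread that owns the whole handle lifetime. The model is deliberately fake: `FakeModel`, `fake_step`, `fake_destroy`, `WorkerReport`, and `Worker` are hypothetical stand-ins, and a real worker would call the runtime step and destroy APIs at the marked points.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Stand-ins for the status codes named in this chapter (hypothetical enum).
enum class Status { Ok, Timeout, InvalidState };

// Hypothetical stand-in for a runtime handle and its calls.
struct FakeModel {
  int steps = 0;
  bool destroyed = false;
};
inline void fake_step(FakeModel& model) { ++model.steps; }
inline Status fake_destroy(FakeModel& model) {
  assert(!model.destroyed && "destroy must run exactly once");
  model.destroyed = true;
  return Status::Ok;
}

// What the supervisor observes through host telemetry.
struct WorkerReport {
  int steps = 0;
  Status destroy_status = Status::InvalidState;
};

class Worker {
 public:
  Worker() : thread_([this] { run(); }) {}
  ~Worker() {
    request_stop();
    if (thread_.joinable()) thread_.join();
  }

  // Supervisor side: only sets a flag, never touches the handle.
  void request_stop() noexcept {
    stop_.store(true, std::memory_order_release);
  }

  // Blocks until the worker has finished teardown, then returns its report.
  WorkerReport wait() {
    if (thread_.joinable()) thread_.join();
    return report_;
  }

 private:
  void run() {
    FakeModel model;  // the handle is created on the worker thread
    do {
      fake_step(model);  // a real worker would run its step sequence here
    } while (!stop_.load(std::memory_order_acquire));
    report_.steps = model.steps;
    report_.destroy_status = fake_destroy(model);  // worker owns teardown
  }

  std::atomic<bool> stop_{false};
  WorkerReport report_;
  std::thread thread_;  // last member: starts after the fields above exist
};
```

The supervisor only calls `request_stop()` and `wait()`, so no remote thread ever enters the handle while a step call is live.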

Pattern C. Destructor-Only Final Cleanup

Use this only when ordinary business shutdown has already failed or is unavailable:

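// Assumes the TDSE Runtime header that declares tdse_model_t and
// tdse_model_release is already included.
class ModelGuard {
 public:
  // Takes ownership of an already-created handle.
  explicit ModelGuard(tdse_model_t* handle) noexcept : handle_(handle) {}
  // Non-copyable: two guards must never release the same handle.
  ModelGuard(const ModelGuard&) = delete;
  ModelGuard& operator=(const ModelGuard&) = delete;
  ~ModelGuard() noexcept {
    if (handle_ != nullptr) {
      // Terminal cleanup only; the status is ignored on purpose because a
      // destructor has no policy decision left to make.
      (void)tdse_model_release(handle_);
      handle_ = nullptr;
    }
  }
 private:
  tdse_model_t* handle_ = nullptr;
};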

Why it is acceptable:

destructors need a terminal cleanup target
they should not be responsible for choosing timeout policy

Troubleshooting Shutdown Symptoms

Symptom: Destroy Times Out Repeatedly

Likely meaning:

a same-handle step call is still live when destroy starts
the host has no clear quiesce-before-destroy protocol

Collect:

destroy wait_timeout_ms
destroy wait_ms
failing thread identities from host logs
whether the worker loop had actually stopped before destroy

First corrective actions:

move destroy to the owning worker or wrapper
add an explicit stop-and-join phase before destroy
keep bounded destroy, but treat repeated timeout as an ownership bug
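Treating repeated timeout as an ownership bug is easier when the host counts consecutive timeouts instead of retrying silently. The tracker below is a hypothetical helper, and the threshold is an assumption that each host would choose for itself.

```cpp
#include <cassert>

// Stand-ins for the status codes named in this chapter (hypothetical enum).
enum class Status { Ok, Timeout, InvalidState };

// Counts consecutive destroy timeouts for one handle.
class DestroyTimeoutTracker {
 public:
  explicit DestroyTimeoutTracker(int threshold) : threshold_(threshold) {
    assert(threshold > 0 && "threshold must be positive");
  }

  // Records one destroy result. Returns true once the number of
  // consecutive timeouts reaches the threshold, which the host should
  // report as a suspected ownership bug rather than a slow shutdown.
  bool record(Status status) noexcept {
    consecutive_ = (status == Status::Timeout) ? consecutive_ + 1 : 0;
    return consecutive_ >= threshold_;
  }

 private:
  int threshold_;
  int consecutive_ = 0;
};
```

When `record` returns true, the useful next step is usually to check whether the worker loop had actually stopped, not to raise the wait budget.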

Symptom: `close` Returns `TDSE_ERR_CONCURRENT_API_USE`

Likely meaning:

another same-handle API is still active

Correct interpretation:

runtime is working as designed
the host attempted an immediate-answer close during active execution

Corrective action:

use destroy if bounded waiting is desired
keep close only when fast overlap detection is the intent

Symptom: `destroy` Or `release` Returns `TDSE_ERR_INVALID_STATE`

Likely meaning:

another thread already owns teardown

Corrective action:

clear local references
stop issuing further same-handle calls
repair the host ownership model instead of retrying locally

Recommended Evidence for Concurrency Issues

When a concurrency issue is reported, collect the smallest bundle that explains ownership:

handle identity in host logs
failing API name
thread or worker identity on both sides of the race
tdse_model_state_info(...) captured before teardown when available
tdse_model_last_error_info(...) captured before teardown when available
chosen shutdown API: close, destroy, or release
destroy wait budget and returned wait_ms
whether the host had already requested worker stop
whether the issue reproduced with one handle per thread

This bundle is usually more valuable than a large raw trace with no ownership annotations.

Review Checklist

During integration review, ask:

Which component owns each live handle?
Which component is allowed to start close or destroy?
Can a supervisor request stop without directly entering the handle?
Where is the timeout policy for destroy chosen and logged?
What happens to local references after TDSE_ERR_INVALID_STATE?
Which path captures diagnostics before teardown starts?

Anti-Patterns

Avoid these patterns even if they look harmless in local tests:

sharing one handle across workers and assuming Runtime will serialize it
using release as the default business-logic shutdown API
treating close as a faster version of destroy
retrying local teardown after TDSE_ERR_INVALID_STATE
treating TDSE_ERR_TIMEOUT as if the handle were already gone
starting teardown before the host has a stop or quiesce protocol
