Telemetry lifecycle, event reference, exporters, deployment, and runtime observability.

Applies to: TDSE SDK 1.0.0-rc1 with TDSE_ENABLE_TELEMETRY=ON.

Use this chapter when you need runtime observability: enable telemetry, attach it to models, export events, and decide when continuous event collection is a better fit than a short measurement pass.

For RC evaluation, decide three things before you wire telemetry into a host:

whether the delivered evaluation package actually includes telemetry support
whether you need continuous observability or a short profiler-driven sizing pass
whether the exported data is engineering evidence for debugging, or part of a formal qualification record you manage in your own host program

Profiler coverage is intentionally split into the next chapter, Profiler. Use this chapter for always-on or long-running observability. Use the Profiler chapter when the question is measurement methodology, backend comparison, or runtime-plan generation.

Telemetry Versus Profiler

Need	Start Here	Why
capture a steady operational trace	this chapter	telemetry emits ongoing runtime events
benchmark one model or compare backend choices	Profiler	profiler is the measurement tool
generate a runtime plan to apply in code	Profiler	runtime plans are profiler outputs
keep debugging evidence from long runs	this chapter	telemetry is built for persistent observability

What Telemetry Provides

TDSE telemetry adds observability to runtime step execution without changing numerical behavior. When enabled, the runtime automatically emits step-level events from tdse_step_begin, tdse_step_op, tdse_step_hr, tdse_step_ir, and tdse_step_commit.

Use telemetry for long-running or production-like runs where you want a steady record of step activity. Use the Profiler when you want a focused performance investigation, backend comparison, or runtime-plan session.

Typical uses:

measure per-step latency and throughput in production simulations
detect performance regressions across SDK upgrades
feed step-level metrics into existing monitoring infrastructure
record resource usage alongside simulation progress

Telemetry is compiled out by default, so a build without TDSE_ENABLE_TELEMETRY=ON pays no runtime cost for it.

Package Availability

Telemetry is package-variant-specific in the RC line. Before wiring the API, confirm that your delivered evaluation package or delivery notes explicitly say telemetry support is enabled (TDSE_ENABLE_TELEMETRY=ON).

If telemetry is not enabled in the delivered package, treat that as a package variant choice, not as a runtime misconfiguration in your integration.

Two-Step Lifecycle

Telemetry uses a process-global service plus per-model attachment: initialize the service once per process, then attach each model you want to observe.

Step 1: Initialize the Service (once per process)

#include <tdse/tdse_telemetry.h>

tdse_telemetry_service_config_t service_cfg;
tdse_telemetry_service_config_init(&service_cfg);
service_cfg.json_output_path = "tdse_telemetry.json";
service_cfg.worker_sleep_ms = 10;
tdse_telemetry_service_init(&service_cfg);

tdse_telemetry_service_config_t fields:

Field	Meaning	Recommended Default
`json_output_path`	file path for JSON lines export	`"tdse_telemetry.json"`
`worker_sleep_ms`	background worker poll interval in milliseconds	`10`

Call tdse_telemetry_service_init() once at process startup, before creating any models. Call tdse_telemetry_service_is_initialized() to check status.

Step 2: Attach Each Model

tdse_telemetry_model_config_t model_cfg;
tdse_telemetry_model_config_init(&model_cfg);
model_cfg.sampling_interval_steps = 1;
model_cfg.ring_capacity = 1024;
tdse_model_telemetry_attach(model, &model_cfg);

tdse_telemetry_model_config_t fields:

Field	Meaning	Recommended Default
`sampling_interval_steps`	record every N-th step	`1` (every step)
`ring_capacity`	number of events retained in the ring buffer	`1024`

Set sampling_interval_steps to a larger value (e.g., 10 or 100) to reduce overhead in long-running simulations where per-step detail is not required.
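As a rough sizing aid, the sketch below estimates how many steps are sampled for a given sampling_interval_steps and how many simulation steps a full ring of ring_capacity events spans. It assumes that sampling keeps one step out of every N starting with the first, and that each sampled step stores a fixed number of events. Neither detail is confirmed by this chapter, and the helper names are hypothetical host-side code, not TDSE API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side helpers; not part of the TDSE API. */

/* Number of sampled steps in a run of nsteps, assuming the runtime keeps
 * steps 0, N, 2N, ... (assumption: sampling starts with the first step). */
static size_t sampled_steps(size_t nsteps, size_t interval)
{
    assert(interval > 0);
    return (nsteps + interval - 1) / interval;   /* ceil(nsteps / interval) */
}

/* Number of simulation steps covered by a full ring, assuming each sampled
 * step stores events_per_step events (assumption: varies with the level). */
static size_t steps_covered_by_ring(size_t ring_capacity, size_t interval,
                                    size_t events_per_step)
{
    assert(interval > 0 && events_per_step > 0);
    return (ring_capacity / events_per_step) * interval;
}
```

Under these assumptions, a ring of 1024 events with an interval of 100 and two events per sampled step would span 51200 simulation steps before older events are overwritten.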

Telemetry is designed for observability, not for zero-overhead measurement. When you are trying to prove peak throughput, strict WCET, or backend selection, use the Profiler first and add telemetry only if you also need a persistent operational trace.

Detach and Shutdown

/* per model */
tdse_model_telemetry_detach(model);

/* per process */
tdse_telemetry_service_shutdown();

Detach each model before destroying it. Shut down the service only after all models are detached.
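One way to enforce this ordering in a host is to keep a small attachment counter next to the SDK calls. The sketch below is hypothetical host-side bookkeeping, not TDSE API. It assumes the host calls host_note_attach() right after a successful tdse_model_telemetry_attach() and host_note_detach() right after tdse_model_telemetry_detach().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side bookkeeping; not part of the TDSE API. */
typedef struct {
    bool   service_up;   /* set by the host after tdse_telemetry_service_init() */
    size_t attached;     /* number of models currently attached */
} host_telemetry_state_t;

/* Record an attach; refused if the service has not been initialized. */
static bool host_note_attach(host_telemetry_state_t *s)
{
    assert(s != NULL);
    if (!s->service_up) return false;
    s->attached++;
    return true;
}

/* Record a detach; refused if no model is attached. */
static bool host_note_detach(host_telemetry_state_t *s)
{
    assert(s != NULL);
    if (s->attached == 0) return false;
    s->attached--;
    return true;
}

/* True only when calling tdse_telemetry_service_shutdown() respects the
 * "shutdown after all models are detached" rule. */
static bool host_may_shutdown(const host_telemetry_state_t *s)
{
    assert(s != NULL);
    return s->service_up && s->attached == 0;
}
```

A host that manages several models from different threads would need to protect this state with its own lock. The sketch leaves that out.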

Event Reference

Telemetry Levels

Set per-model via tdse_model_telemetry_set_level():

Level	Behavior
minimal	lifecycle events only (create, destroy)
standard	adds step-level timing
verbose	adds per-query breakdown (op, hr, ir, commit, dr)

Event Kinds

Events are typed via tdse_telemetry_event_kind_t. Key kinds emitted automatically by the runtime:

Kind	When Emitted	Payload
step begin	`tdse_step_begin()`	`t`, `dt`
step commit	`tdse_step_commit()`	`committed_steps`
model lifecycle	create / destroy	handle metadata

Extended Statistics

Query accumulated statistics at any time:

tdse_telemetry_model_stats_t stats;
tdse_model_telemetry_get_stats(model, &stats);

tdse_telemetry_model_extended_stats_t ext;
tdse_model_telemetry_get_extended_stats(model, &ext);

Custom Tags

Attach key-value tags for correlation in multi-model deployments:

tdse_model_telemetry_set_custom_tags(model, "circuit=my_netlist,run=42");
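When tag values come from runtime data, it can help to build the tag string with a small helper. The sketch below assumes the format is comma-separated key=value pairs, as inferred from the example above. Escaping rules are not documented in this chapter, so the hypothetical helper simply rejects keys and values that contain a comma or an equals sign.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper; not part of the TDSE API.
 * Appends "key=value" to the NUL-terminated string in buf, separated from
 * earlier pairs by a comma. Returns 0 on success, -1 if the pair is invalid
 * or does not fit; on failure buf is left unchanged. */
static int host_tags_append(char *buf, size_t cap,
                            const char *key, const char *value)
{
    assert(buf != NULL && key != NULL && value != NULL);

    /* Escaping is undocumented, so refuse separator characters outright. */
    if (key[0] == '\0' || strpbrk(key, ",=") || strpbrk(value, ",="))
        return -1;

    size_t used = strlen(buf);
    if (used >= cap)
        return -1;

    int n = snprintf(buf + used, cap - used, "%s%s=%s",
                     used ? "," : "", key, value);
    if (n < 0 || (size_t)n >= cap - used) {
        buf[used] = '\0';          /* roll back a truncated append */
        return -1;
    }
    return 0;
}
```

The resulting buffer can then be passed to tdse_model_telemetry_set_custom_tags() in place of a string literal.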

Instance IDs

Each attached model receives a unique instance ID for log correlation:

uint64_t id = tdse_model_telemetry_get_instance_id(model);

Exporters

JSON Lines (built-in)

The JSON Lines exporter is always active once the service is initialized. Events are written to json_output_path, one self-contained JSON object per line.
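Because every line is a complete object, the file can be processed one line at a time without a streaming JSON parser. The event field names are not listed in this chapter, so the sketch below only counts events (non-empty lines). The helper name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper; not part of the TDSE API.
 * Counts non-blank lines in a JSON Lines file, i.e. one count per event
 * object. Returns -1 if the file cannot be opened. */
static long host_count_jsonl_events(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    long count = 0;
    int c;
    int line_has_content = 0;
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n') {
            count += line_has_content;
            line_has_content = 0;
        } else if (c != '\r' && c != ' ' && c != '\t') {
            line_has_content = 1;
        }
    }
    count += line_has_content;     /* last line without a trailing newline */
    fclose(f);
    return count;
}
```

If the file is read while the service is still writing, the last line may be incomplete, so flushing or shutting down the service first gives a more reliable count.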

OpenTelemetry

Export to an OTLP-compatible endpoint:

tdse_telemetry_export_opentelemetry("http://127.0.0.1:4318");

Security note: the OpenTelemetry exporter only accepts loopback or private targets by default (e.g., http://127.0.0.1:4318). This is intentional to prevent accidental data exposure.
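A host can pre-check a configured endpoint before handing it to an exporter, so that a rejected target shows up as a configuration error rather than as missing data. The sketch below is a hypothetical approximation of the rule: it accepts only http:// or https:// URLs whose host is an IPv4 literal in the loopback range or an RFC 1918 private range. The SDK's actual check may differ, for example for hostnames or IPv6.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical pre-check; not part of the TDSE API and not guaranteed to
 * match the exporter's own rule. */
static bool host_endpoint_is_local(const char *url)
{
    assert(url != NULL);

    const char *host;
    if (strncmp(url, "http://", 7) == 0)       host = url + 7;
    else if (strncmp(url, "https://", 8) == 0) host = url + 8;
    else return false;

    unsigned a, b, c, d;
    int consumed = 0;
    if (sscanf(host, "%3u.%3u.%3u.%3u%n", &a, &b, &c, &d, &consumed) != 4)
        return false;              /* not an IPv4 literal (e.g. a hostname) */
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return false;

    /* The address must be followed by a port, a path, or the end of the URL. */
    char next = host[consumed];
    if (next != '\0' && next != ':' && next != '/')
        return false;

    return a == 127                              /* 127.0.0.0/8 loopback */
        || a == 10                               /* 10.0.0.0/8 */
        || (a == 172 && b >= 16 && b <= 31)      /* 172.16.0.0/12 */
        || (a == 192 && b == 168);               /* 192.168.0.0/16 */
}
```

This deliberately rejects names such as localhost, because resolving a hostname is outside the scope of a string check.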

HTTPS export on non-Windows builds requires OpenSSL to be found at configure time.

Prometheus

Export to a Prometheus Pushgateway:

tdse_telemetry_export_prometheus("http://127.0.0.1:9091");

The same loopback/private restriction applies as for the OpenTelemetry exporter.

What Telemetry Does Not Prove

Telemetry can help you preserve evidence, correlate incidents, and compare runs under the same host policy. It does not by itself prove:

release qualification on a new host platform
WCET or target-machine timing acceptance for RT/HIL
correctness of a runtime-plan or backend recommendation
support for optional accelerators that were not included in the delivered package

Health Status

Check service health at any time:

tdse_telemetry_health_status_t health;
tdse_telemetry_get_health_status(&health);

System Metrics

Record or query system-level metrics:

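/* Query: fill a caller-owned struct with the current values. */
tdse_telemetry_system_metrics_t sys;
tdse_telemetry_get_system_metrics(&sys);

/* Record: add system, memory, and GPU samples to the telemetry output. */
tdse_telemetry_record_system_metrics();
tdse_telemetry_record_memory_usage();
tdse_telemetry_record_gpu_usage();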

Performance Alerts

Record custom performance alerts:

tdse_telemetry_record_performance_alert(model, TDSE_TELEMETRY_ALERT_SEVERITY_WARNING,
    "step_latency_exceeded", 150.0);

Alert severities:

Severity	Use When
info	informational note
warning	performance degraded but tolerable
error	performance issue requiring attention
critical	simulation may be invalid
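The table above leaves the choice of severity to the host. A minimal sketch of one possible policy follows. It compares a measured step latency against a host-defined budget, and the thresholds, the result_suspect flag, and all names are illustrative assumptions, not TDSE defaults. The local enum stands in for the SDK's severity constants, of which only TDSE_TELEMETRY_ALERT_SEVERITY_WARNING appears in this chapter.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the four severities in the table; real code would map
 * these to the SDK's own severity constants. */
typedef enum {
    HOST_SEV_INFO,
    HOST_SEV_WARNING,
    HOST_SEV_ERROR,
    HOST_SEV_CRITICAL
} host_severity_t;

/* Hypothetical host policy; latency and budget use the same unit. */
static host_severity_t host_classify_latency(double latency, double budget,
                                             bool result_suspect)
{
    assert(budget > 0.0);
    if (result_suspect)         return HOST_SEV_CRITICAL; /* simulation may be invalid */
    if (latency > 2.0 * budget) return HOST_SEV_ERROR;    /* requires attention */
    if (latency > budget)       return HOST_SEV_WARNING;  /* degraded but tolerable */
    return HOST_SEV_INFO;                                 /* informational note */
}
```

With a budget of 100.0, a latency of 150.0 classifies as a warning under this policy.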

Backend Switch Events

The runtime records backend switch events automatically. You can also record a switch explicitly:

tdse_telemetry_record_backend_switch(model, old_backend_id, new_backend_id);

Flush

Force-flush pending events:

tdse_telemetry_flush_events();

Complete Example

#include <tdse/tdse.h>
#include <tdse/tdse_telemetry.h>

void run_with_telemetry(tdse_model_t* model, size_t nsteps) {
    /* Service is assumed to be initialized before this function. */

    tdse_telemetry_model_config_t model_cfg;
    tdse_telemetry_model_config_init(&model_cfg);
    model_cfg.sampling_interval_steps = 1;
    model_cfg.ring_capacity = 2048;
    tdse_model_telemetry_attach(model, &model_cfg);

    tdse_model_telemetry_set_custom_tags(model, "scenario=baseline");
    tdse_model_telemetry_set_level(model, TDSE_TELEMETRY_LEVEL_VERBOSE);

    /* Normal step loop */
    for (size_t n = 0; n < nsteps; ++n) {
        tdse_step_begin(model, n * 0.001, 0.001);
        /* op, hr, ir, and solve calls are elided here; the host's solve
           step produces `primary`, which is then committed. */
        tdse_step_commit(model, primary);
    }

    tdse_model_telemetry_detach(model);
}
