Telemetry
Telemetry lifecycle, event reference, exporters, deployment, and runtime observability.
Applies to: TDSE SDK 1.0.0-rc1 with TDSE_ENABLE_TELEMETRY=ON.
Use this chapter when you need runtime observability: enable telemetry, attach it to models, export events, and decide when continuous event collection is a better fit than a short measurement pass.
For RC evaluation, decide three things before you wire telemetry into a host:
- whether the delivered evaluation package actually includes telemetry support
- whether you need continuous observability or a short profiler-driven sizing pass
- whether the exported data is engineering evidence for debugging, or part of a formal qualification record you manage in your own host program
Profiler coverage is intentionally split into the next chapter, Profiler. Use this chapter for always-on or long-running observability. Use the Profiler chapter when the question is measurement methodology, backend comparison, or runtime-plan generation.
Telemetry Versus Profiler
| Need | Start Here | Why |
|---|---|---|
| capture a steady operational trace | this chapter | telemetry emits ongoing runtime events |
| benchmark one model or compare backend choices | Profiler | profiler is the measurement tool |
| generate a runtime plan to apply in code | Profiler | runtime plans are profiler outputs |
| keep debugging evidence from long runs | this chapter | telemetry is built for persistent observability |
What Telemetry Provides
TDSE telemetry adds observability to runtime step execution without changing
numerical behavior. When enabled, the runtime automatically emits step-level
events from tdse_step_begin, tdse_step_op, tdse_step_hr, tdse_step_ir,
and tdse_step_commit.
Use telemetry for long-running or production-like runs where you want a steady record of step activity. Use the Profiler when you want a focused performance investigation, backend comparison, or runtime-plan session.
Typical uses:
- measure per-step latency and throughput in production simulations
- detect performance regressions across SDK upgrades
- feed step-level metrics into existing monitoring infrastructure
- record resource usage alongside simulation progress
Telemetry is compiled out by default and has zero runtime cost when disabled.
Package Availability
Telemetry is package-variant-specific in the RC line. Before wiring the API,
confirm that your delivered evaluation package or delivery notes explicitly say
telemetry support is enabled (TDSE_ENABLE_TELEMETRY=ON).
If telemetry is not enabled in the delivered package, treat that as a package variant choice, not as a runtime misconfiguration in your integration.
Two-Step Lifecycle
Telemetry follows a process-global service + per-model attachment pattern.
Step 1: Initialize the Service (once per process)
#include <tdse/tdse_telemetry.h>
tdse_telemetry_service_config_t service_cfg;
tdse_telemetry_service_config_init(&service_cfg);
service_cfg.json_output_path = "tdse_telemetry.json";
service_cfg.worker_sleep_ms = 10;
tdse_telemetry_service_init(&service_cfg);
tdse_telemetry_service_config_t fields:
| Field | Meaning | Recommended Default |
|---|---|---|
| json_output_path | file path for JSON Lines export | "tdse_telemetry.json" |
| worker_sleep_ms | background worker poll interval in milliseconds | 10 |
Call tdse_telemetry_service_init() once at process startup, before creating
any models. Call tdse_telemetry_service_is_initialized() to check status.
Step 2: Attach Each Model
tdse_telemetry_model_config_t model_cfg;
tdse_telemetry_model_config_init(&model_cfg);
model_cfg.sampling_interval_steps = 1;
model_cfg.ring_capacity = 1024;
tdse_model_telemetry_attach(model, &model_cfg);
tdse_telemetry_model_config_t fields:
| Field | Meaning | Recommended Default |
|---|---|---|
| sampling_interval_steps | record every N-th step | 1 (every step) |
| ring_capacity | number of events retained in the ring buffer | 1024 |
Set sampling_interval_steps to a larger value (e.g., 10 or 100) to reduce
overhead in long-running simulations where per-step detail is not required.
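To reason about how sampling_interval_steps and ring_capacity trade off, the following is a minimal host-side model of a sampled ring buffer. It is a sketch, not the SDK's implementation: the type sample_ring_t, its functions, and the overwrite-oldest policy are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a sampled ring buffer; not SDK code. */
typedef struct {
    uint64_t *slots;    /* caller-provided storage for recorded step indices */
    size_t    capacity; /* plays the role of ring_capacity */
    size_t    interval; /* plays the role of sampling_interval_steps */
    size_t    count;    /* valid entries, never more than capacity */
    size_t    head;     /* next slot to write */
} sample_ring_t;

static void sample_ring_init(sample_ring_t *r, uint64_t *slots,
                             size_t capacity, size_t interval)
{
    assert(r && slots && capacity > 0 && interval > 0);
    r->slots = slots;
    r->capacity = capacity;
    r->interval = interval;
    r->count = 0;
    r->head = 0;
}

/* Record the step if it falls on the sampling interval. Once the ring is
 * full, the oldest entry is overwritten (assumed policy). Returns 1 if the
 * step was recorded, 0 if it was skipped. */
static int sample_ring_on_step(sample_ring_t *r, uint64_t step)
{
    if (step % r->interval != 0)
        return 0;
    r->slots[r->head] = step;
    r->head = (r->head + 1) % r->capacity;
    if (r->count < r->capacity)
        r->count++;
    return 1;
}

/* Oldest step index still held in the ring. Requires at least one entry. */
static uint64_t sample_ring_oldest(const sample_ring_t *r)
{
    assert(r->count > 0);
    size_t idx = (r->head + r->capacity - r->count) % r->capacity;
    return r->slots[idx];
}

/* Number of simulation steps spanned by a full ring. */
static size_t sample_ring_step_window(size_t capacity, size_t interval)
{
    return capacity * interval;
}
```

Under these assumptions, a ring of 1024 entries sampled every 10 steps spans 10240 simulation steps of history, compared with 1024 steps when every step is recorded.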
Telemetry is designed for observability, not for zero-overhead measurement. When you are trying to prove peak throughput, strict WCET, or backend selection, use the Profiler first and add telemetry only if you also need a persistent operational trace.
Detach and Shutdown
/* per model */
tdse_model_telemetry_detach(model);
/* per process */
tdse_telemetry_service_shutdown();
Detach before destroying the model. Shutdown after all models are detached.
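One way to enforce this ordering in a host program is a small bookkeeping wrapper that refuses out-of-order calls. The sketch below is hypothetical: telemetry_guard_t and the guard_* functions are illustrative names, and the SDK calls appear only as comments at the points where a real host would invoke them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side bookkeeping for the telemetry lifecycle. */
typedef struct {
    bool   service_up;      /* true between service init and shutdown */
    size_t attached_models; /* models currently attached */
} telemetry_guard_t;

static void guard_init(telemetry_guard_t *g)
{
    assert(g);
    g->service_up = false;
    g->attached_models = 0;
}

/* Start the service once per process; a second start is rejected. */
static bool guard_service_start(telemetry_guard_t *g)
{
    if (g->service_up)
        return false;
    /* tdse_telemetry_service_init(&service_cfg); */
    g->service_up = true;
    return true;
}

/* Attach is only allowed while the service is running. */
static bool guard_attach(telemetry_guard_t *g)
{
    if (!g->service_up)
        return false;
    /* tdse_model_telemetry_attach(model, &model_cfg); */
    g->attached_models++;
    return true;
}

/* Detach must happen before the model is destroyed. */
static bool guard_detach(telemetry_guard_t *g)
{
    if (g->attached_models == 0)
        return false;
    /* tdse_model_telemetry_detach(model); */
    g->attached_models--;
    return true;
}

/* Shutdown is refused while any model is still attached. */
static bool guard_service_stop(telemetry_guard_t *g)
{
    if (!g->service_up || g->attached_models != 0)
        return false;
    /* tdse_telemetry_service_shutdown(); */
    g->service_up = false;
    return true;
}
```

A single counter is enough here because the rule only depends on whether any model is still attached, not on which one.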
Event Reference
Telemetry Levels
Set per-model via tdse_model_telemetry_set_level():
| Level | Behavior |
|---|---|
| minimal | lifecycle events only (create, destroy) |
| standard | adds step-level timing |
| verbose | adds per-query breakdown (op, hr, ir, commit, dr) |
Event Kinds
Events are typed via tdse_telemetry_event_kind_t. Key kinds emitted
automatically by the runtime:
| Kind | When Emitted | Payload |
|---|---|---|
| step begin | tdse_step_begin() | t, dt |
| step commit | tdse_step_commit() | committed_steps |
| model lifecycle | create / destroy | handle metadata |
Extended Statistics
Query accumulated statistics at any time:
tdse_telemetry_model_stats_t stats;
tdse_model_telemetry_get_stats(model, &stats);
tdse_telemetry_model_extended_stats_t ext;
tdse_model_telemetry_get_extended_stats(model, &ext);
Custom Tags
Attach key-value tags for correlation in multi-model deployments:
tdse_model_telemetry_set_custom_tags(model, "circuit=my_netlist,run=42");
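When tags come from runtime values, the string can be assembled before it is passed to tdse_model_telemetry_set_custom_tags(). The helper below is a sketch: the comma-separated key=value layout is inferred from the single example above rather than from a documented grammar, and tag_pair_t and build_tag_string are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical key/value pair for building a tag string. */
typedef struct {
    const char *key;
    const char *value;
} tag_pair_t;

/* Join pairs as "k1=v1,k2=v2" into out.
 * Returns 0 on success, or -1 if the result does not fit, a key is empty,
 * or a key or value contains '=' or ',' (which would make the assumed
 * format ambiguous). */
static int build_tag_string(const tag_pair_t *pairs, size_t n,
                            char *out, size_t out_size)
{
    assert(out && out_size > 0);
    size_t used = 0;
    out[0] = '\0';
    for (size_t i = 0; i < n; ++i) {
        const char *k = pairs[i].key;
        const char *v = pairs[i].value;
        if (k[0] == '\0' || strpbrk(k, "=,") || strpbrk(v, "=,"))
            return -1;
        int w = snprintf(out + used, out_size - used, "%s%s=%s",
                         i ? "," : "", k, v);
        if (w < 0 || (size_t)w >= out_size - used)
            return -1; /* output was truncated */
        used += (size_t)w;
    }
    return 0;
}
```

Rejecting separators inside keys and values keeps the output unambiguous without assuming any escaping rule.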
Instance IDs
Each attached model receives a unique instance ID for log correlation:
uint64_t id = tdse_model_telemetry_get_instance_id(model);
Exporters
JSON Lines (built-in)
The JSON Lines exporter is always active once the service is initialized. Events are written to json_output_path, one self-contained JSON object per line.
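Because each line is a complete object, a quick sanity check on an archived file is to count its records without parsing any JSON. The sketch below assumes nothing about the event schema; count_jsonl_records is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Count non-blank lines in a JSON Lines stream.
 * A final line without a trailing newline is counted too; if the file is
 * still being written, that last record may be incomplete. */
static long count_jsonl_records(FILE *fp)
{
    assert(fp);
    long records = 0;
    int line_has_content = 0;
    int c;
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n') {
            records += line_has_content;
            line_has_content = 0;
        } else if (c != ' ' && c != '\t' && c != '\r') {
            line_has_content = 1;
        }
    }
    return records + line_has_content;
}
```

A host would typically open the file named by json_output_path with fopen() after the service has shut down and pass the stream to this function.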
OpenTelemetry
Export to an OTLP-compatible endpoint:
tdse_telemetry_export_opentelemetry("http://127.0.0.1:4318");
Security note: the OpenTelemetry exporter only accepts loopback or private
targets by default (e.g., http://127.0.0.1:4318). This is intentional to
prevent accidental data exposure.
HTTPS export on non-Windows builds requires OpenSSL to be found at configure time.
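A host can pre-check a configured exporter target before handing it to the SDK. The sketch below is an assumption-laden illustration: it treats "private" as the RFC 1918 IPv4 ranges plus 127.0.0.0/8, handles only dotted-quad IPv4 literals, and takes the host part alone (no scheme or port). The SDK's actual acceptance rule is not documented here and may differ; ipv4_is_loopback_or_private is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Returns true if host is a dotted-quad IPv4 literal in 127.0.0.0/8,
 * 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16. Hostnames, IPv6 literals,
 * and strings with trailing characters (such as ":4318") return false. */
static bool ipv4_is_loopback_or_private(const char *host)
{
    assert(host);
    unsigned a, b, c, d;
    char trailing;
    /* Exactly four fields must match; a fifth match means trailing junk. */
    if (sscanf(host, "%u.%u.%u.%u%c", &a, &b, &c, &d, &trailing) != 4)
        return false;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return false;
    if (a == 127)
        return true;                      /* loopback */
    if (a == 10)
        return true;                      /* 10.0.0.0/8 */
    if (a == 172 && b >= 16 && b <= 31)
        return true;                      /* 172.16.0.0/12 */
    if (a == 192 && b == 168)
        return true;                      /* 192.168.0.0/16 */
    return false;
}
```

For the endpoint in the example above, the host part "127.0.0.1" passes this check, while a public address such as "8.8.8.8" does not.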
Prometheus
Export to a Prometheus push gateway:
tdse_telemetry_export_prometheus("http://127.0.0.1:9091");
The same loopback/private restriction applies as for the OpenTelemetry exporter.
What Telemetry Does Not Prove
Telemetry can help you preserve evidence, correlate incidents, and compare runs under the same host policy. It does not by itself prove:
- release qualification on a new host platform
- WCET or target-machine timing acceptance for RT/HIL
- correctness of a runtime-plan or backend recommendation
- support for optional accelerators that were not included in the delivered package
Health Status
Check service health at any time:
tdse_telemetry_health_status_t health;
tdse_telemetry_get_health_status(&health);
System Metrics
Record or query system-level metrics:
tdse_telemetry_system_metrics_t sys;
tdse_telemetry_get_system_metrics(&sys);
tdse_telemetry_record_system_metrics();
tdse_telemetry_record_memory_usage();
tdse_telemetry_record_gpu_usage();
Performance Alerts
Record custom performance alerts:
tdse_telemetry_record_performance_alert(model, TDSE_TELEMETRY_ALERT_SEVERITY_WARNING,
"step_latency_exceeded", 150.0);
Alert severities:
| Severity | Use When |
|---|---|
| info | informational note |
| warning | performance degraded but tolerable |
| error | performance issue requiring attention |
| critical | simulation may be invalid |
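How a measured value maps to a severity is a host policy decision. The sketch below shows one possible shape for such a policy, with thresholds supplied by the caller. Everything in it is hypothetical: the enum, the struct, and the function are not SDK types, and the SDK's own severity constants should be used when calling tdse_telemetry_record_performance_alert().

```c
#include <assert.h>

/* Hypothetical host-side severity levels mirroring the table above. */
typedef enum {
    HOST_ALERT_NONE,
    HOST_ALERT_INFO,
    HOST_ALERT_WARNING,
    HOST_ALERT_ERROR,
    HOST_ALERT_CRITICAL
} host_alert_t;

/* Caller-chosen thresholds in the same unit as the measured latency.
 * They must be in ascending order. */
typedef struct {
    double info;
    double warning;
    double error;
    double critical;
} latency_policy_t;

/* Return the highest severity whose threshold the latency has reached. */
static host_alert_t classify_step_latency(const latency_policy_t *p,
                                          double latency)
{
    assert(p);
    assert(p->info <= p->warning && p->warning <= p->error &&
           p->error <= p->critical);
    if (latency >= p->critical)
        return HOST_ALERT_CRITICAL;
    if (latency >= p->error)
        return HOST_ALERT_ERROR;
    if (latency >= p->warning)
        return HOST_ALERT_WARNING;
    if (latency >= p->info)
        return HOST_ALERT_INFO;
    return HOST_ALERT_NONE;
}
```

Keeping the thresholds in a struct lets the same classifier serve models with different timing budgets.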
Backend Switch Events
The runtime automatically records backend switch events. You can also record a backend switch explicitly from host code:
tdse_telemetry_record_backend_switch(model, old_backend_id, new_backend_id);
Flush
Force-flush pending events:
tdse_telemetry_flush_events();
Complete Example
#include <tdse/tdse.h>
#include <tdse/tdse_telemetry.h>

void run_with_telemetry(tdse_model_t* model, size_t nsteps) {
    /* Service is assumed to be initialized before this function. */
    tdse_telemetry_model_config_t model_cfg;
    tdse_telemetry_model_config_init(&model_cfg);
    model_cfg.sampling_interval_steps = 1;
    model_cfg.ring_capacity = 2048;
    tdse_model_telemetry_attach(model, &model_cfg);
    tdse_model_telemetry_set_custom_tags(model, "scenario=baseline");
    tdse_model_telemetry_set_level(model, TDSE_TELEMETRY_LEVEL_VERBOSE);

    /* Normal step loop */
    for (size_t n = 0; n < nsteps; ++n) {
        tdse_step_begin(model, n * 0.001, 0.001);
        /* op, hr, ir, solve, commit */
        /* `primary` is the commit argument produced by the host's step code;
           it is not declared in this excerpt. */
        tdse_step_commit(model, primary);
    }

    /* Detach before the caller destroys the model. */
    tdse_model_telemetry_detach(model);
}
Production Deployment Checklist
- Confirm the delivered RC package enables telemetry (TDSE_ENABLE_TELEMETRY=ON)
- Call tdse_telemetry_service_init() once at process startup
- Verify tdse_telemetry_service_is_initialized() returns true before attaching
- Attach each model before stepping
- Set sampling_interval_steps intentionally (1 for development, higher for production)
- Configure exporter endpoints (JSON Lines is always active; OpenTelemetry and Prometheus are optional)
- Ensure exporter targets are loopback/private or explicitly authorized
- On Linux and other non-Windows builds, confirm OpenSSL was found at configure time if HTTPS export is needed
- Detach before destroy, shutdown after all models are detached
- Archive telemetry JSON alongside simulation results for post-hoc analysis
