OTAs and Rollouts

Once a release is published (build succeeded, artifact signed and uploaded), the fleet learns about it through MQTT. Each device writes the update to its idle OTA slot, reboots into it, validates the boot, and stays on the new slot. If validation fails, the bootloader rolls back automatically. There's no "apply" API for firmware to call.

What happens on the device

broker → SCADABLE_EVT_OTA_AVAILABLE (firmware sees it via event callback)
       │
       ▼
library downloads artifact from cdn.scadable.com
       │
       ▼
library verifies signature against the namespace's release key
       │
       ▼
library writes to idle slot (ota_0 if running ota_1, vice versa)
       │
       ▼
library marks the new slot as "boot pending"
       │
       ▼
device reboots
       │
       ▼
bootloader boots the pending slot
       │
       ▼
new firmware calls scadable_init → scadable_connect succeeds → marks "validated"
       │
       ▼
boot is sticky; old slot is now the rollback target

If scadable_connect doesn't succeed within the validation window (default 5 minutes), the bootloader marks the new slot as bad on next reset and the device reverts to the old slot. The dashboard records the failure.
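The slot choice and the validation window above can be modeled in a few lines. This is a minimal sketch, not the library's implementation: the type names, idle_slot, and evaluate_pending_slot are hypothetical, and only the 5-minute default comes from this page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the library's real types. */
typedef enum { SLOT_OTA_0 = 0, SLOT_OTA_1 = 1 } ota_slot_t;

typedef enum {
    SLOT_STATE_BOOT_PENDING, /* booted, not yet validated */
    SLOT_STATE_VALIDATED,    /* connect succeeded inside the window */
    SLOT_STATE_BAD           /* window expired without a connect */
} slot_state_t;

/* Default validation window from the text: 5 minutes. */
#define VALIDATION_WINDOW_MS (5u * 60u * 1000u)

/* The update always targets the slot that is not currently running. */
ota_slot_t idle_slot(ota_slot_t running) {
    return running == SLOT_OTA_0 ? SLOT_OTA_1 : SLOT_OTA_0;
}

/* Classify a pending slot from the connect result and time since boot. */
slot_state_t evaluate_pending_slot(bool connected, uint32_t ms_since_boot) {
    if (ms_since_boot > VALIDATION_WINDOW_MS) {
        return SLOT_STATE_BAD;          /* too late, even if connected now */
    }
    if (connected) {
        return SLOT_STATE_VALIDATED;
    }
    return SLOT_STATE_BOOT_PENDING;     /* still inside the window */
}
```

For example, idle_slot(SLOT_OTA_1) yields SLOT_OTA_0, and a slot that has not connected after 300001 ms is classified as bad.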

Validation: what counts as "the new firmware works"

Validation = the new firmware completes a scadable_init + scadable_connect handshake. That exercises WiFi, DNS, TLS, the cert in NVS, and mTLS auth, which is enough to call the slot trusted. Once the slot is validated, later panics in your code don't trigger rollback; the library will reconnect and try again.

For a stricter gate (only validate if your self-test passes), pass SCADABLE_OPT_MANUAL_VALIDATE to scadable_init and call scadable_mark_validated() from your code.
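One way to structure the stricter gate is to run a list of self-tests and mark the slot validated only if every one passes. The sketch below is library-independent: validate_if_healthy, self_test_fn, and the stand-in checks are hypothetical. In firmware you would presumably pass scadable_mark_validated as the callback, assuming it takes no arguments as written above.

```c
#include <stdbool.h>
#include <stddef.h>

/* A self-test returns true when the subsystem it covers is healthy. */
typedef bool (*self_test_fn)(void);

/* Runs every check in order. Calls mark_validated only if all pass;
 * otherwise the slot stays unvalidated so the bootloader can roll back. */
bool validate_if_healthy(const self_test_fn *checks, size_t n,
                         void (*mark_validated)(void)) {
    for (size_t i = 0; i < n; i++) {
        if (!checks[i]()) {
            return false;
        }
    }
    mark_validated();
    return true;
}

/* Stand-ins so the sketch compiles without the library or hardware. */
int validated_calls = 0;
void fake_mark_validated(void) { validated_calls++; }
bool check_pass(void) { return true; }
bool check_fail(void) { return false; }
```

Stopping at the first failing check keeps the gate conservative: a partial pass never validates the slot.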

Rollback semantics

  • Automatic on validation failure. Bootloader handles it.
  • No manual rollback API today. To roll back, cut a new tag from the older source. A rollback API is on the roadmap.
  • Crash loops. Three consecutive boot failures within 60s are treated as a validation failure.
  • Can't brick. The provisioner stays in the factory partition forever; if both OTA slots are corrupt, the chip boots the provisioner.
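The rules above amount to a small decision: detect a crash loop, then pick the best bootable image. The following is a sketch of that logic only, not the bootloader's code. All names are hypothetical, and it assumes "within 60s" means the three most recent failures span at most 60 seconds on a clock that survives resets.

```c
#include <stdbool.h>
#include <stdint.h>

#define CRASH_LOOP_LIMIT    3   /* consecutive boot failures */
#define CRASH_LOOP_WINDOW_S 60  /* seconds */

typedef enum { BOOT_NEW_SLOT, BOOT_OLD_SLOT, BOOT_FACTORY } boot_target_t;

typedef struct {
    bool new_slot_ok; /* image intact and not marked bad */
    bool old_slot_ok; /* previous image still intact */
} slot_health_t;

/* True when the most recent CRASH_LOOP_LIMIT failures fall inside the
 * window. failure_times_s must be in ascending order. */
bool is_crash_loop(const uint32_t *failure_times_s, int count) {
    if (count < CRASH_LOOP_LIMIT) {
        return false;
    }
    uint32_t first = failure_times_s[count - CRASH_LOOP_LIMIT];
    uint32_t last = failure_times_s[count - 1];
    return (last - first) <= CRASH_LOOP_WINDOW_S;
}

/* Prefer the new slot, fall back to the old one, then to the provisioner. */
boot_target_t choose_boot_target(slot_health_t health, bool crash_loop) {
    if (health.new_slot_ok && !crash_loop) {
        return BOOT_NEW_SLOT;
    }
    if (health.old_slot_ok) {
        return BOOT_OLD_SLOT;
    }
    return BOOT_FACTORY;
}
```

The last branch is the "can't brick" case: with both OTA slots unusable, the only remaining target is the factory image.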

Staged rollouts

Today: every device in the namespace receives SCADABLE_EVT_OTA_AVAILABLE immediately on release. No built-in canary yet.

You can stage manually with two namespaces (staging / production) pointing at the same repo. Push a tag → both build the same artifact → staging devices update first because they're subscribed to that namespace. Promote by re-tagging the production-linked branch.

Native percent-based staged rollouts are on the roadmap.

Pacing the download

Devices on slow links can defer the download to a quiet window. Pass SCADABLE_OPT_OTA_WINDOW to scadable_init with a cron schedule:

scadable_config_t cfg = {
    .ota_window = "0 2-5 * * *",   // standard cron: matches at 02:00, 03:00, 04:00, and 05:00 local
};
scadable_init(&cfg);

Notification is received immediately; download is postponed until the cron matches.
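To see when a schedule matches, here is a minimal sketch of standard cron matching for the minute and hour fields. It is not the library's parser: cron_field_matches and ota_window_open are hypothetical, only "*", "N", and "A-B" are supported, and the day and month fields are ignored. Whether the library treats the expression as a continuous window is not stated here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Matches one cron field against a value.
 * Supports "*", a single number, and an inclusive range "A-B". */
bool cron_field_matches(const char *field, int value) {
    int lo, hi;
    if (field[0] == '*' && field[1] == '\0') {
        return true;
    }
    if (sscanf(field, "%d-%d", &lo, &hi) == 2) {
        return value >= lo && value <= hi;
    }
    if (sscanf(field, "%d", &lo) == 1) {
        return value == lo;
    }
    return false; /* unsupported syntax */
}

/* Checks only the first two fields (minute, hour) of a cron expression. */
bool ota_window_open(const char *expr, int hour, int minute) {
    char minute_field[16];
    char hour_field[16];
    if (sscanf(expr, "%15s %15s", minute_field, hour_field) != 2) {
        return false;
    }
    return cron_field_matches(minute_field, minute)
        && cron_field_matches(hour_field, hour);
}
```

Under these semantics "0 2-5 * * *" matches at 02:00 and 05:00 but not at 03:30, while "* 2-4 * * *" matches every minute from 02:00 through 04:59.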

When OTAs don't reach a device

  • Offline. Queued at the broker (durable subscription); fires on reconnect.
  • Already on latest. Library no-ops if version matches.
  • Cert expired. Can't reach the broker. See Device cert lifecycle.
  • OTA window in the future. Postponed.
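The cases above can be read as an ordered set of checks. This sketch is one plausible ordering, not the library's logic: ota_decide and the enum are hypothetical, and it assumes versions are compared as exact strings.

```c
#include <stdbool.h>
#include <string.h>

typedef enum {
    OTA_PROCEED,            /* download and apply now */
    OTA_WAIT_FOR_RECONNECT, /* offline or cert expired: broker unreachable */
    OTA_SKIP_SAME_VERSION,  /* already on the offered version */
    OTA_DEFER_TO_WINDOW     /* notified, but the OTA window is not open */
} ota_decision_t;

/* Decide what a device does with an offered release. */
ota_decision_t ota_decide(bool broker_reachable,
                          const char *running_version,
                          const char *offered_version,
                          bool window_open) {
    if (!broker_reachable) {
        return OTA_WAIT_FOR_RECONNECT;
    }
    if (strcmp(running_version, offered_version) == 0) {
        return OTA_SKIP_SAME_VERSION;
    }
    if (!window_open) {
        return OTA_DEFER_TO_WINDOW;
    }
    return OTA_PROCEED;
}
```

An offline device and a device with an expired cert look the same from this function's point of view; they differ in that the offline device recovers on reconnect without intervention.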

Where to look when an OTA misbehaves

Dashboard → Releases → fleet status. Per-device state: pending, downloading, applying, validated, rolled-back, failed. Click a failed device for its OTA error log.

For serial-level debug, see the OTA section of Troubleshooting.