OTAs and Rollouts

Once a release is published (your push completed a build), the dashboard shows it in the Releases view. From there you dispatch it to devices — there's no auto-deploy. OTA in v0.1.0 is operator-driven.

What happens on the device

backend publishes scadable/<cn>/ota/command to the target device(s)
       │
       ▼
library receives the command (subscribed at MQTT connect)
       │  payload: { "version": "0.1.1", "url": "https://app.scadable.com/api/firmware/<sha>.bin" }
       │
       ▼
library calls esp_https_ota with the URL
       │
       ▼
library validates the firmware server's TLS cert (CA from NVS)
       │
       ▼
library writes the downloaded image to the idle OTA slot
       │  (ota_1 if running ota_0, vice versa)
       │
       ▼
library publishes progress to scadable/<cn>/ota/status
       │  payload: { "version": "0.1.1", "state": "progress", "details": "42% (256000/608000)" }
       │
       ▼
on completion, library marks the new slot pending-verify and reboots
       │
       ▼
bootloader boots the new slot
       │
       ▼
new firmware completes its first scadable_init + MQTT connect cycle
       │  → library calls esp_ota_mark_app_valid_cancel_rollback
       │
       ▼
boot is sticky; old slot is now the rollback target
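
The progress payload in the flow above can be reproduced with a few lines of C. This is a minimal sketch, not the library's actual code: the function name `ota_format_progress` is hypothetical, and it assumes the version string contains no characters that need JSON escaping.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats an ota/status progress payload in the shape
 * shown in the flow above. Returns the number of characters written, or -1
 * if the inputs are invalid or the buffer is too small. */
int ota_format_progress(char *buf, size_t len, const char *version,
                        uint32_t written, uint32_t total)
{
    if (buf == NULL || version == NULL || total == 0 || written > total) {
        return -1;
    }
    /* Integer percent, rounded down; 64-bit math avoids overflow in written * 100. */
    unsigned pct = (unsigned)(((uint64_t)written * 100u) / total);
    int n = snprintf(buf, len,
                     "{ \"version\": \"%s\", \"state\": \"progress\", "
                     "\"details\": \"%u%% (%lu/%lu)\" }",
                     version, pct,
                     (unsigned long)written, (unsigned long)total);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Calling it with version "0.1.1", 256000 bytes written, and 608000 bytes total yields the details string "42% (256000/608000)" from the example payload.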

If the new firmware fails to boot cleanly (panic, watchdog, brownout) before it is marked valid, the bootloader marks the slot bad on the next reset and reverts to the old slot. ESP-IDF's standard app_update machinery handles this; no custom code is needed on your side.

What counts as "validated"

Currently: the new firmware reaches the point where the library connects to MQTT and starts heartbeating. That exercises Wi-Fi, DNS, TLS, the cert in NVS, and broker auth, which is enough to call the slot trusted. Panics in your code after that point don't trigger rollback; the device reboots into the same slot and the library reconnects.

If your firmware doesn't reach that connect call within ESP-IDF's verification window (default 5 minutes), the bootloader reverts to the previous slot.
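
The validation rule described above can be written as a small decision function. This is a sketch of the rule as stated here, not the library's implementation: the names `ota_verdict_t`, `ota_verify_verdict`, and `OTA_VERIFY_WINDOW_S` are hypothetical, and the mapping to ESP-IDF calls in the comments is an assumption about how such a verdict might be acted on.

```c
#include <stdbool.h>
#include <stdint.h>

#define OTA_VERIFY_WINDOW_S 300u  /* 5 minutes, per the text above */

typedef enum {
    OTA_VERDICT_WAIT,     /* still inside the window; keep trying to connect */
    OTA_VERDICT_CONFIRM,  /* MQTT connected; the slot can be marked valid */
    OTA_VERDICT_ROLLBACK  /* window expired without a successful connect */
} ota_verdict_t;

/* Decide what to do with a pending-verify slot.
 * mqtt_connected: true once the first MQTT connect has completed.
 * uptime_s: seconds since this boot started.
 * A successful connect wins even if it arrives late. */
ota_verdict_t ota_verify_verdict(bool mqtt_connected, uint32_t uptime_s)
{
    if (mqtt_connected) {
        /* On device this is where esp_ota_mark_app_valid_cancel_rollback()
         * would be called. */
        return OTA_VERDICT_CONFIRM;
    }
    if (uptime_s >= OTA_VERIFY_WINDOW_S) {
        /* ESP-IDF provides esp_ota_mark_app_invalid_rollback_and_reboot()
         * for an explicit revert; whether the library uses it is an assumption. */
        return OTA_VERDICT_ROLLBACK;
    }
    return OTA_VERDICT_WAIT;
}
```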

Rollback semantics

  • Automatic on boot failure. Bootloader handles it.
  • No manual rollback API today. To deliberately roll back, push the older code as a new commit and re-deploy.
  • Crash loops. Three consecutive boot failures within 60 seconds are treated as a validation failure.
  • Can't permanently brick. As long as the previous OTA slot is intact, the bootloader has a recovery target.
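
To make the crash-loop rule concrete, here is a minimal sliding-window counter in C. It is an illustration under stated assumptions, not the library's code: the type and function names are hypothetical, and it assumes the failure timestamps survive a reset (for example in RTC memory or NVS) and come from a clock that keeps counting across reboots.

```c
#include <stdbool.h>
#include <stdint.h>

#define CRASH_LOOP_LIMIT    3u   /* consecutive failures, per the list above */
#define CRASH_LOOP_WINDOW_S 60u  /* window length in seconds */

typedef struct {
    uint32_t ts[CRASH_LOOP_LIMIT]; /* most recent failure times, oldest first */
    uint32_t count;                /* number of valid entries in ts */
} crash_loop_t;

/* Record one failed boot at time now_s. Returns true when the last
 * CRASH_LOOP_LIMIT failures all fall within CRASH_LOOP_WINDOW_S seconds. */
bool crash_loop_record_failure(crash_loop_t *cl, uint32_t now_s)
{
    if (cl->count == CRASH_LOOP_LIMIT) {
        /* Buffer is full: drop the oldest timestamp to make room. */
        for (uint32_t i = 1; i < CRASH_LOOP_LIMIT; i++) {
            cl->ts[i - 1] = cl->ts[i];
        }
        cl->count--;
    }
    cl->ts[cl->count++] = now_s;
    return cl->count == CRASH_LOOP_LIMIT &&
           (now_s - cl->ts[0]) <= CRASH_LOOP_WINDOW_S;
}

/* A clean boot breaks the run of consecutive failures. */
void crash_loop_record_success(crash_loop_t *cl)
{
    cl->count = 0;
}
```

With failures at 0 s, 50 s, and 70 s the counter does not trip, because the three span 70 seconds. A fourth failure at 100 s trips it, since the last three (50, 70, 100) span only 50 seconds.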

Dispatching from the dashboard

In the Releases view, click on a successful build → Deploy. Choose one device, a set, or the whole namespace. The backend publishes the OTA command to each target's scadable/<cn>/ota/command topic. Online devices start downloading immediately; offline devices receive the command when they reconnect.

Each device shows one of five states: pending, downloading, progress, success, or failed. Click a failed device to see the error it published to its ota/status topic.
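
If you consume these states in your own tooling, a small parser keeps the handling in one place. The sketch below is illustrative: the enum and function names are hypothetical, it assumes the state strings match the names above exactly, and treating success and failed as the only terminal states is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    OTA_STATE_PENDING,
    OTA_STATE_DOWNLOADING,
    OTA_STATE_PROGRESS,
    OTA_STATE_SUCCESS,
    OTA_STATE_FAILED,
    OTA_STATE_UNKNOWN   /* anything not in the list above */
} ota_state_t;

/* Map a state string to the enum; the names array is in enum order. */
ota_state_t ota_state_parse(const char *s)
{
    static const char *const names[] = {
        "pending", "downloading", "progress", "success", "failed"
    };
    if (s == NULL) {
        return OTA_STATE_UNKNOWN;
    }
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (strcmp(s, names[i]) == 0) {
            return (ota_state_t)i;
        }
    }
    return OTA_STATE_UNKNOWN;
}

/* Assumption: a deploy stops changing once it reports success or failed. */
bool ota_state_is_terminal(ota_state_t st)
{
    return st == OTA_STATE_SUCCESS || st == OTA_STATE_FAILED;
}
```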

What's not in OTA yet (v0.1.0)

  • Artifact signing. v0.1.0 ships unsigned binaries; broker auth via cert is the trust anchor. Signed artifacts with on-device verification land in v0.2.0+.
  • Percent-based staged rollouts. Today it's "deploy to these devices." Native canary / percent rollouts are on the roadmap.
  • OTA windows. Today the download starts on receipt. Scheduled downloads (e.g. only between 2–5 AM) are on the roadmap.
  • Automatic deploys on push. v0.1.0 builds automatically, dispatches manually.

When OTAs don't reach a device

  • Offline. The OTA command sits at the broker until the device reconnects. The library re-subscribes on every connect and picks up pending commands.
  • Wrong topic. The library subscribes to scadable/<cn>/ota/command exactly. Anything published elsewhere is invisible.
  • Cert / TLS broken. Device can't reach the broker → no OTA command received. See Troubleshooting.
  • Partition table mismatch. The device receives the command, but esp_https_ota fails with ESP_FAIL because the partition layout lacks ota_0 / ota_1 / otadata. Fix the partition table in your repo and push.
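
For the wrong-topic case, it can help to build the topic string in one place and compare against it exactly when debugging. A minimal sketch follows; the helper names are hypothetical, and `cn` stands for whatever value fills the `<cn>` placeholder for your device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build "scadable/<cn>/ota/command" into buf. Returns the length written,
 * or -1 if cn is empty or the buffer is too small. */
int ota_command_topic(char *buf, size_t len, const char *cn)
{
    if (buf == NULL || cn == NULL || cn[0] == '\0') {
        return -1;
    }
    int n = snprintf(buf, len, "scadable/%s/ota/command", cn);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Exact match only: a trailing slash, a different suffix, or different
 * casing does not count as the command topic. */
bool ota_is_command_topic(const char *topic, const char *cn)
{
    char expected[128];
    if (topic == NULL || ota_command_topic(expected, sizeof expected, cn) < 0) {
        return false;
    }
    return strcmp(topic, expected) == 0;
}
```

The exact comparison mirrors the rule in the list above: anything that differs from the subscribed topic string, even by one character, never reaches the device.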

Where to look when an OTA misbehaves

Dashboard → Releases → click the affected deploy → per-device status. Each device shows its last published ota/status payload (state + details). For deeper diagnosis, attach idf.py monitor to the device and read the log lines tagged scd.ota.