Why offline‑first?
- Modern users expect apps to work on flaky networks, in airplane mode, and in low‑signal areas.
- Offline‑first isn’t just caching; it’s a deliberate data model, write semantics, and conflict strategy.
- Getting it right reduces support load, increases retention, and simplifies UX: users never see “can’t save”.
This guide provides a comprehensive blueprint for offline‑first Flutter apps, with patterns for storage, sync, conflicts, background jobs, and testing.
Contents
- Principles and guarantees
- Data model and identifiers
- Local storage and change log
- Write semantics: queueing, retries, backoff
- Sync strategies (pull/push/merge)
- Conflict resolution patterns
- Background syncing and lifecycle
- UX patterns for trust and clarity
- Observability, metrics, and alerting
- Testing and failure injection
- Reference implementation outline
Principles and guarantees
- Availability first: operations succeed locally, even when offline.
- Convergence: replicas converge to a consistent state once updates stop.
- Idempotence: re‑applying the same operation yields the same state; critical for retries.
- Causal ordering (where needed): maintain relative order within an entity/stream.
- Bounded staleness: define acceptable windows for how stale UI can be.
Guarantee matrix (what you can promise)
- Local durability: once the user sees “Saved”, the mutation exists in the local log (fsync’ed).
- Eventual server persistence: queued writes are retried with exponential backoff until success or manual intervention.
- Deterministic conflict policy: given concurrent edits, the outcome is predictable, documented, and test‑covered.
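To make the idempotence guarantee above concrete, here is a minimal sketch of de-duplication keyed by a client change id (the class and method names are illustrative, not a specific framework):

```dart
/// Minimal sketch of idempotent apply: the receiver remembers the last
/// applied clientChangeId per entity, so a retried mutation is
/// acknowledged without being applied twice.
class IdempotentApplier {
  final _lastApplied = <String, String>{}; // entityId -> last clientChangeId

  /// Returns true if the mutation was applied, false if it was a duplicate.
  bool apply(String entityId, String changeId, void Function() mutate) {
    if (_lastApplied[entityId] == changeId) return false; // duplicate retry
    mutate();
    _lastApplied[entityId] = changeId;
    return true;
  }
}
```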
Data model and identifiers
- Use client‑generated, globally unique IDs (UUID v4 or KSUID) for new entities.
- Prefer an immutable event stream (append‑only change log) over in‑place mutation for local writes; apply reducers to compute current state.
- Store server ETags/versions per entity to support optimistic concurrency.
Entity shape (example)
```json
{
  "id": "ksuid_29JYBqz9G3S6po1hy...",
  "type": "Note",
  "title": "Grocery list",
  "body": "Milk, eggs, butter",
  "updatedAt": "2025-03-14T09:12:00Z",
  "version": "W/\"etag-server-v15\"",
  "local": {
    "pending": false,
    "dirtyFields": ["body"],
    "lastSyncAt": "2025-03-14T09:15:00Z"
  }
}
```
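A direct Dart mapping of this shape might look like the sketch below (field names mirror the JSON; the classes themselves are illustrative):

```dart
/// Illustrative Dart mapping of the entity shape above.
class LocalMeta {
  final bool pending;             // true while unsynced writes exist
  final List<String> dirtyFields; // fields changed since the last sync
  final DateTime? lastSyncAt;
  const LocalMeta({required this.pending, this.dirtyFields = const [], this.lastSyncAt});
}

class Note {
  final String id;        // client-generated (UUID v4 / KSUID)
  final String title;
  final String body;
  final DateTime updatedAt;
  final String version;   // server ETag for optimistic concurrency
  final LocalMeta local;  // client-only metadata, never sent to the server
  const Note({
    required this.id,
    required this.title,
    required this.body,
    required this.updatedAt,
    required this.version,
    required this.local,
  });
}
```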
Local storage and change log
- Storage choices:
  - drift (SQLite) for relational/stateful queries and constraints
  - isar/hive for pure KV or document workloads
- Tables:
  - entities (current materialized state)
  - changes (append‑only operations: create/update/delete, payload, timestamp, dependencies)
  - sync_state (cursors for pull, last_success, backoff state)
- Keep indexes on entity id, updatedAt, and dirty flags for efficient queries; a drift sketch of the two core tables follows.
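A sketch of the two core tables in drift (column choices are illustrative; code-generation boilerplate omitted):

```dart
import 'package:drift/drift.dart';

// Materialized current state, one row per entity.
class Entities extends Table {
  TextColumn get id => text()();                 // client-generated id
  TextColumn get type => text()();
  TextColumn get payload => text()();            // JSON-encoded state
  DateTimeColumn get updatedAt => dateTime()();
  TextColumn get version => text().nullable()(); // server ETag, null until first sync
  BoolColumn get pending => boolean().withDefault(const Constant(false))();

  @override
  Set<Column> get primaryKey => {id};
}

// Append-only change log; rows are removed (or tombstoned) once acked.
class Changes extends Table {
  TextColumn get id => text()();    // clientChangeId
  TextColumn get entityId => text()();
  TextColumn get op => text()();    // create / update / delete
  TextColumn get patch => text()(); // JSON payload
  DateTimeColumn get at => dateTime()();

  @override
  Set<Column> get primaryKey => {id};
}
// Add indexes on entityId, updatedAt, and pending to back the queries above.
```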
Write semantics: queueing, retries, backoff
- Local flow:
  - Validate input in the domain layer.
  - Begin a transaction:
    - Append the change to `changes`.
    - Apply the change to the `entities` materialized view.
  - Mark the entity `local.pending = true` and capture `dirtyFields`.
  - Enqueue a background job to push.
- Retries:
  - Exponential backoff with jitter (e.g., 1s, 2s, 4s, 8s, cap at 5m) on transient errors.
  - Immediate retry when connectivity is regained.
- Idempotence:
  - Include a clientChangeId in the payload; the server stores the last applied id per entity to de‑duplicate. A backoff helper is sketched below.
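A backoff helper matching the schedule above, using full jitter (a sketch; the function name and the clamp are my choices):

```dart
import 'dart:math';

/// Exponential backoff with full jitter: 1s, 2s, 4s, 8s, ... capped at 5m.
/// Full jitter picks uniformly in [0, capped] so clients de-synchronize.
Duration backoffDelay(int attempt, {Random? rng}) {
  const base = Duration(seconds: 1);
  const cap = Duration(minutes: 5);
  final exp = base * pow(2, min(attempt, 10)).toInt(); // clamp to avoid overflow
  final capped = exp > cap ? cap : exp;
  final ms = (rng ?? Random()).nextInt(capped.inMilliseconds + 1);
  return Duration(milliseconds: ms);
}
```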
Sync strategies
- Pull (read):
  - Incremental with a server cursor (updatedSince watermark or opaque cursor); a pull loop is sketched after this list.
  - On first run: full sync per collection, then incremental.
  - The merge strategy avoids clobbering local unsynced changes; consider three‑way merges on fields.
- Push (write):
  - Order by dependency (create before update; parent before child when needed).
  - Batch small operations to reduce overhead, but keep batches small enough for rollback handling.
  - On 409 Conflict / 412 Precondition Failed, trigger conflict resolution.
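An incremental pull loop might look like this sketch (Api.pullSince, PullPage, and SyncStateStore are assumptions layered on the interfaces outlined at the end of this guide):

```dart
/// Sketch: cursor-based incremental pull that never clobbers pending edits.
Future<void> pull(Api api, EntityStore entities, SyncStateStore syncState) async {
  var cursor = await syncState.cursor(); // null on first run => full sync
  PullPage page;
  do {
    page = await api.pullSince(cursor);
    for (final remote in page.entities) {
      final local = await entities.find(remote.id);
      if (local != null && local.pending) {
        continue; // defer to the three-way merge path instead of clobbering
      }
      await entities.upsert(remote);
    }
    cursor = page.nextCursor;
    await syncState.saveCursor(cursor); // checkpoint: a crash resumes here
  } while (page.hasMore);
}
```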
Conflict resolution patterns
- Last‑Writer‑Wins (LWW): compare the server `updatedAt` vs. the client `updatedAt`.
  - Pros: simple, deterministic. Cons: may clobber legitimate changes.
- Field‑level merge: merge only independent fields; conflicting fields use LWW or custom policies.
- Operational transforms (OT)/CRDT: for collaborative text or counters; higher complexity, powerful guarantees.
- Manual resolution: queue conflict items for user attention (notifications, “Resolve” screens).
- Recommendation: start with field‑level merge for independent fields, an LWW fallback for conflicting fields, and manual resolution surfaced for critical entities.
Example: optimistic concurrency with ETag
- The client sends `If-Match: W/"etag-server-v15"`.
- On mismatch (412), fetch the server version and compute a three‑way diff:
  - Base: last known server version
  - Local: user‑edited version
  - Remote: current server version
- Merge (sketched below):
  - If the changed fields are disjoint → merge automatically.
  - Else, if resolvable by policy → apply.
  - Else, create a "conflict item" for manual resolution.
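The merge decision above reduces to a small function over field maps (a sketch; a true conflict is signaled with null here):

```dart
/// Three-way field merge: base = last known server version, local = user
/// edits, remote = current server version. Returns the merged fields, or
/// null when a field truly conflicts and needs policy/manual resolution.
Map<String, dynamic>? threeWayMerge(
  Map<String, dynamic> base,
  Map<String, dynamic> local,
  Map<String, dynamic> remote,
) {
  final merged = Map<String, dynamic>.from(remote);
  for (final key in {...base.keys, ...local.keys, ...remote.keys}) {
    final localChanged = local[key] != base[key];
    final remoteChanged = remote[key] != base[key];
    if (localChanged && remoteChanged && local[key] != remote[key]) {
      return null; // both sides changed the same field differently
    }
    if (localChanged) merged[key] = local[key]; // keep the local edit
  }
  return merged;
}
```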
Background syncing and lifecycle
- Architecture:
  - Foreground sync manager (Dart) for the active app.
  - Background fetch (platform): Android WorkManager; iOS BGAppRefreshTask/BGProcessingTask, with OS‑imposed limits.
- Triggers:
  - App start (deferred by a few seconds to avoid blocking UX)
  - Connectivity changes (network regained)
  - Significant elapsed time (e.g., every 15–30 min)
  - Manual pull‑to‑refresh
- Guardrails (a scheduling sketch follows this list):
  - Respect battery saver, metered‑network preferences, and OS quotas.
  - Rate‑limit to avoid server overload; the server should also enforce quotas.
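On the scheduling side, a sketch using the community workmanager plugin (task names are placeholders; iOS execution is best‑effort and subject to OS quotas):

```dart
import 'package:workmanager/workmanager.dart';

@pragma('vm:entry-point')
void callbackDispatcher() {
  Workmanager().executeTask((task, inputData) async {
    // Rehydrate dependencies and run one bounded sync cycle here.
    return true; // report success so the OS keeps scheduling the task
  });
}

Future<void> scheduleBackgroundSync() async {
  await Workmanager().initialize(callbackDispatcher);
  await Workmanager().registerPeriodicTask(
    'offline-sync', // unique name (placeholder)
    'syncTask',     // task identifier (placeholder)
    frequency: const Duration(minutes: 15), // Android's minimum interval
    constraints: Constraints(
      networkType: NetworkType.connected, // only sync when online
      requiresBatteryNotLow: true,        // guardrail: respect battery saver
    ),
  );
}
```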
UX patterns for trust and clarity
- Optimistic UI: reflect local changes immediately with subtle “Syncing…” status.
- Sync banners/toasts: “You’re offline. Changes will sync when connection resumes.”
- Conflict chips: indicate conflicts on list rows; tap → resolution screen.
- Disable destructive actions that cannot complete safely offline (e.g., permanent delete without server ack).
- Sync status in settings: last sync time, queued‑op count, and an error list (with resend); a pending‑count banner is sketched below.
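As a concrete example of the status surface, a banner driven by a pending‑changes count (a sketch; the stream would be derived from the change log described earlier):

```dart
import 'package:flutter/material.dart';

/// Sketch: show how many local changes are still waiting to sync.
class SyncBanner extends StatelessWidget {
  final Stream<int> pendingCount; // e.g., mapped from the pending-changes query
  SyncBanner({super.key, required this.pendingCount});

  @override
  Widget build(BuildContext context) {
    return StreamBuilder<int>(
      stream: pendingCount,
      initialData: 0,
      builder: (context, snapshot) {
        final n = snapshot.data ?? 0;
        if (n == 0) return const SizedBox.shrink(); // fully synced: stay quiet
        return MaterialBanner(
          content: Text('$n change${n == 1 ? '' : 's'} waiting to sync'),
          actions: const [Icon(Icons.cloud_upload_outlined)],
        );
      },
    );
  }
}
```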
Observability, metrics, and alerting
- Client metrics (a per‑cycle record is sketched after this list):
  - Queue length over time
  - Sync‑cycle duration and success rate
  - Conflict rate per collection/field
  - Retry histogram and backoff distribution
- Logging:
  - Redact PII; include change ids, entity ids, and error codes.
  - Sample to avoid volume spikes.
- Alert thresholds (analytics backend):
  - Conflict rate > X% → alert the platform squad
  - Retries > N attempts → circuit‑break or escalate in the UI
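A plain record per sync cycle keeps the client side honest (a sketch; the event schema is illustrative, and the analytics sink is whatever your backend expects):

```dart
/// Sketch: one record per sync cycle, mirroring the metrics listed above.
class SyncCycleMetrics {
  final Duration duration;
  final bool success;
  final int pushed;
  final int pulled;
  final int conflicts;
  final int retries;
  const SyncCycleMetrics({
    required this.duration,
    required this.success,
    required this.pushed,
    required this.pulled,
    required this.conflicts,
    required this.retries,
  });

  /// No PII: counts, durations, and outcome only.
  Map<String, Object> toEvent() => {
        'sync_ms': duration.inMilliseconds,
        'success': success,
        'pushed': pushed,
        'pulled': pulled,
        'conflicts': conflicts,
        'retries': retries,
      };
}
```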
Testing and failure injection
- Unit tests:
  - Reducers (applyChange), merge policy, clock‑controlled timestamping.
- Integration tests (widget + persistence):
  - End‑to‑end mutation flows with forced offline/online transitions.
- Failure injection (example after this list):
  - Simulate 409/412/5xx, timeouts, and partial batch failures.
  - Use dependency injection to swap the HTTP client and persistence.
- Property‑based tests:
  - Generate random interleavings of changes/pulls to verify convergence and invariants.
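A failure‑injection test then reads naturally against the interfaces in the outline below (FakeApi, FakeClock, the in‑memory stores, and pendingCount are hypothetical test doubles):

```dart
import 'package:test/test.dart';

void main() {
  test('transient failure keeps the change queued for retry', () async {
    final changes = InMemoryChangeStore();
    final api = FakeApi(failuresBeforeSuccess: 2); // throws TransientError twice
    final sync = SyncManager(changes, InMemoryEntityStore(), api, FakeClock());

    await changes.append(
      UpdateFields('c1', 'note-1', DateTime.utc(2025, 3, 14), {'body': 'Milk'}),
    );

    await sync.push(); // attempt 1 fails: change must stay pending
    expect(await changes.pendingCount(), 1);

    await sync.push(); // attempt 2 fails again
    await sync.push(); // attempt 3 succeeds: queue drains
    expect(await changes.pendingCount(), 0);
  });
}
```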
Reference implementation outline (pseudo)
```dart
// Domain change model: one append-only log entry per mutation.
sealed class Change {
  final String id;       // clientChangeId, used for server-side de-duplication
  final String entityId;
  final DateTime at;
  const Change(this.id, this.entityId, this.at);
}

class UpdateFields extends Change {
  final Map<String, dynamic> patch; // field -> new value
  const UpdateFields(super.id, super.entityId, super.at, this.patch);
}

// Persistence interface over the local change log.
abstract class ChangeStore {
  Future<void> append(Change c);
  Stream<List<Change>> pending(); // unpushed changes, oldest first
  Future<void> markApplied(String changeId, {required String serverVersion});
}

// Sync manager: pushes pending changes and records returned server versions.
class SyncManager {
  final ChangeStore changes;
  final EntityStore entities;
  final Api api;
  final Clock clock; // injected so tests can control timestamps

  SyncManager(this.changes, this.entities, this.api, this.clock);

  Future<void> push() async {
    final batch = await changes.pending().first; // small batch
    for (final c in batch) {
      try {
        final res = await api.apply(c);
        await changes.markApplied(c.id, serverVersion: res.etag);
        await entities.updateVersion(c.entityId, res.etag);
      } on PreconditionFailed catch (e) {
        await _resolveConflict(c, e.serverEntity); // three-way merge path
      } on TransientError {
        // Leave the change in the queue; retry later with backoff.
        return;
      }
    }
  }
}
```
When not to go offline‑first
- Real‑time transactional integrity (e.g., stock trades) with strong consistency might not fit offline semantics.
- If the product does not require offline use and the infrastructure budget is minimal, simpler caching will suffice.
Conclusion
Offline‑first is an architectural choice that must permeate data modeling, UI, and operational practices. With a change log, idempotent writes, measured retries, and explicit conflict policies, you can deliver a robust experience that users trust on any network.