UX Latency Patterns

UX latency patterns are user interface design strategies for managing perceived wait times during agent execution. These patterns focus on creating responsive user experiences even when background processes, API calls, or agent computations take significant time to complete.

In agentic systems, where AI agents may spend seconds or minutes processing requests, executing tools, or generating responses, effective latency management becomes critical to maintaining user engagement and system usability.

Why It Matters

Perceived Speed vs Actual Speed

Human perception of time is subjective and influenced by feedback. A 3-second operation with clear progress indicators can feel faster than a 1-second operation with no feedback. UX latency patterns exploit this psychological phenomenon by:

  • Providing immediate visual feedback to confirm user actions
  • Breaking long waits into perceivable stages with status updates
  • Creating the illusion of speed through optimistic UI updates
  • Maintaining context so users understand what's happening
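One common way to provide immediate feedback without flashing a spinner on fast operations is to delay the indicator by a short grace period: the action is acknowledged instantly, but the spinner only appears if the work outlasts the grace window. A minimal sketch, where `showSpinner` and `hideSpinner` are assumed UI hooks, not a real API:

```typescript
// A minimal sketch: show a loading indicator only if the operation
// outlasts a short grace period, so fast operations never flash a
// spinner. `showSpinner` and `hideSpinner` are assumed UI hooks.
async function withDelayedSpinner<T>(
  operation: Promise<T>,
  showSpinner: () => void,
  hideSpinner: () => void,
  graceMs = 150
): Promise<T> {
  // Schedule the spinner; it is cancelled if the operation finishes first
  const timer = setTimeout(showSpinner, graceMs);
  try {
    return await operation;
  } finally {
    clearTimeout(timer);
    hideSpinner();
  }
}
```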

User Patience and Abandonment Rates

Research shows that user tolerance for wait times varies dramatically based on context and feedback:

  • 0-100ms: Feels instant. Users perceive no delay.
  • 100-300ms: Slightly perceptible delay, but the interface still feels responsive.
  • 300ms-1s: Noticeable delay. Users need subtle feedback.
  • 1-3s: Moderate delay. Clear progress indicators required.
  • 3-10s: Long delay. Detailed status updates essential.
  • 10s+: Very long delay. Consider background processing with notifications.
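These tiers can be captured in a small helper that picks a feedback strategy from an operation's expected duration. The tier names below are illustrative, not a standard API:

```typescript
// Illustrative mapping from an operation's expected duration (ms) to
// the feedback tier described above; names are ours, not a library's.
type FeedbackTier = 'none' | 'subtle' | 'progress' | 'detailed-status' | 'background';

function feedbackTierFor(expectedMs: number): FeedbackTier {
  if (expectedMs < 300) return 'none';              // feels instant or near-instant
  if (expectedMs < 1000) return 'subtle';           // e.g. a button state change
  if (expectedMs < 3000) return 'progress';         // clear progress indicator
  if (expectedMs < 10000) return 'detailed-status'; // step-by-step status updates
  return 'background';                              // background job plus notification
}
```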

Without proper latency management, users may:

  • Abandon tasks before completion (dropout rates increase 7% per second of delay)
  • Submit duplicate requests, overwhelming the system
  • Lose context about what they were doing
  • Develop negative perceptions of system reliability
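Duplicate submissions in particular can be guarded against by reusing the pending promise for an identical in-flight request, so a double-click never reaches the server twice. A minimal sketch, not any specific library's API:

```typescript
// In-flight request deduplication: a second call with the same key
// reuses the pending promise instead of issuing a duplicate request.
const inFlight = new Map<string, Promise<unknown>>();

function dedupe<T>(key: string, start: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>;

  // Start the request and drop it from the map once it settles
  const request = start().finally(() => inFlight.delete(key));
  inFlight.set(key, request);
  return request;
}
```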

In agentic systems where tasks routinely exceed 3 seconds, effective UX latency patterns are not optional—they're essential for usability.

Concrete Examples

Optimistic UI Updates

Immediately reflect user actions in the interface before server confirmation:

async function sendMessage(content: string) {
  // Immediately add message to UI with "sending" state
  const optimisticMessage = {
    id: generateTempId(),
    content,
    status: 'sending',
    timestamp: Date.now()
  };

  addMessageToUI(optimisticMessage);

  try {
    // Send to server in background
    const confirmed = await api.sendMessage(content);
    updateMessageStatus(optimisticMessage.id, 'sent', confirmed.id);
  } catch (error) {
    updateMessageStatus(optimisticMessage.id, 'failed');
  }
}

Use cases: Chat interfaces, form submissions, agent commands where success is highly likely.

Streaming Responses

Display content as it's generated rather than waiting for completion:

async function streamAgentResponse(prompt: string) {
  const stream = await agent.generateStream(prompt);
  let accumulated = '';

  for await (const chunk of stream) {
    accumulated += chunk.text;
    updateResponseDisplay(accumulated);
    // User sees text appearing in real-time
  }
}

Use cases: LLM responses, agent reasoning traces, long-form content generation.

Skeleton Screens

Show content placeholders that match the expected layout:

function AgentResponseSkeleton() {
  return (
    <div className="space-y-4">
      <div className="h-4 bg-gray-200 rounded w-3/4 animate-pulse" />
      <div className="h-4 bg-gray-200 rounded w-full animate-pulse" />
      <div className="h-4 bg-gray-200 rounded w-5/6 animate-pulse" />
      <div className="h-32 bg-gray-200 rounded animate-pulse" />
    </div>
  );
}

Use cases: Loading agent workspaces, tool execution results, structured data displays.

Progress Indicators

Communicate execution status and remaining time:

function AgentExecutionProgress({ task }: Props) {
  return (
    <div className="space-y-2">
      <div className="flex justify-between text-sm">
        <span>{task.currentStep}</span>
        <span>{task.completedSteps}/{task.totalSteps}</span>
      </div>

      <ProgressBar value={(task.completedSteps / task.totalSteps) * 100} />

      {task.estimatedTimeRemaining && (
        <p className="text-xs text-gray-500">
          About {task.estimatedTimeRemaining}s remaining
        </p>
      )}
    </div>
  );
}

Use cases: Multi-step agent workflows, batch operations, file processing.
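The `estimatedTimeRemaining` value displayed above can be derived naively from the average duration of the steps completed so far. A sketch, assuming roughly uniform step costs:

```typescript
// Naive remaining-time estimate: extrapolate from the average duration
// of completed steps. Assumes steps take roughly similar time.
function estimateRemainingMs(
  completedSteps: number,
  totalSteps: number,
  elapsedMs: number
): number | null {
  if (completedSteps === 0) return null; // no data to extrapolate from yet
  const avgStepMs = elapsedMs / completedSteps;
  return Math.round(avgStepMs * (totalSteps - completedSteps));
}
```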

Background Processing with Notifications

Move long-running tasks off the main thread and notify on completion:

async function executeLongRunningAgent(task: AgentTask) {
  // Start background job
  const jobId = await backgroundQueue.enqueue(task);

  // Show non-blocking notification
  showToast({
    type: 'info',
    message: 'Agent task started. You\'ll be notified when complete.',
    action: { label: 'View Status', link: `/jobs/${jobId}` }
  });

  // User can continue working
  // Notification appears when job completes
  backgroundQueue.onComplete(jobId, (result) => {
    showToast({
      type: 'success',
      message: 'Agent task completed',
      action: { label: 'View Results', link: `/results/${result.id}` }
    });
  });
}

Use cases: Data migrations, batch agent operations, report generation.

Common Pitfalls

Blocking the Entire UI

Problem: Disabling all interaction during background operations.

Why it's bad: Users can't check other information, cancel operations, or multitask.

Solution: Use localized loading states. Only disable the specific component affected by the operation:

// Bad: Blocks everything
{isLoading && <FullPageSpinner />}

// Good: Localized loading
<Button disabled={isExecuting}>
  {isExecuting ? 'Executing...' : 'Run Agent'}
</Button>

No Feedback for Long Tasks

Problem: Showing a generic spinner for 30+ second operations.

Why it's bad: Users don't know if the system is frozen, what's happening, or how long to wait.

Solution: Provide granular status updates:

// Bad
setLoading(true);
await longOperation();
setLoading(false);

// Good
setStatus('Initializing agent (1/5)...');
await initialize();

setStatus('Loading tools (2/5)...');
await loadTools();

setStatus('Executing task (3/5)...');
const result = await execute();

setStatus('Processing results (4/5)...');
// ... etc

Misleading Progress Bars

Problem: Progress bars that don't reflect actual progress or get stuck at 99%.

Why it's bad: Erodes user trust and creates anxiety.

Solution: Use indeterminate loaders when progress is unknown, or calculate genuine progress:

// Use indeterminate when progress is unknown
{!canCalculateProgress && <IndeterminateSpinner />}

// Only show percentage when it's accurate
{canCalculateProgress && (
  <ProgressBar
    value={completedItems / totalItems * 100}
    label={`${completedItems}/${totalItems} items processed`}
  />
)}

Ignoring Network Conditions

Problem: Using patterns optimized for fast connections on slow networks.

Why it's bad: Creates jarring experiences when latency varies.

Solution: Adapt patterns based on connection quality:

const connection = navigator.connection;
const isSlowConnection = connection?.effectiveType === '2g' ||
                         connection?.effectiveType === 'slow-2g';

if (isSlowConnection) {
  // Use aggressive caching and simplified UI
  enableOfflineMode();
} else {
  // Use rich real-time features
  enableRealtimeUpdates();
}

No Cancellation Options

Problem: Forcing users to wait for long operations they no longer need.

Why it's bad: Wastes resources and frustrates users who clicked by mistake.

Solution: Always provide cancellation for operations > 3 seconds:

const abortController = new AbortController();

async function executeCancellableAgent() {
  try {
    const result = await agent.execute(task, {
      signal: abortController.signal
    });
    return result;
  } catch (error) {
    if (error.name === 'AbortError') {
      showToast('Agent execution cancelled');
    }
  }
}

// UI shows cancel button
<Button onClick={() => abortController.abort()}>
  Cancel
</Button>

Implementation Patterns

Streaming Pattern for Agent Responses

Stream data from the server and update UI incrementally:

// Backend: Stream server-sent events
export async function POST(request: Request) {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      // Read the task from the request body
      const { task } = await request.json();
      const agent = new Agent();

      for await (const event of agent.executeStream(task)) {
        const data = `data: ${JSON.stringify(event)}\n\n`;
        controller.enqueue(encoder.encode(data));
      }

      controller.close();
    }
  });

  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive'
    }
  });
}

// Frontend: Consume stream
async function subscribeToAgentStream(taskId: string) {
  const response = await fetch(`/api/agent/stream/${taskId}`);
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = ''; // an event may be split across network chunks

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const events = buffer.split('\n\n');
    buffer = events.pop() ?? ''; // keep any incomplete trailing event

    for (const event of events) {
      if (event.startsWith('data: ')) {
        const data = JSON.parse(event.slice(6));
        handleAgentEvent(data);
      }
    }
  }
}

Chunked Responses with Pagination

Break large datasets into manageable chunks:

function useInfiniteAgentResults(query: string) {
  const [results, setResults] = useState<Result[]>([]);
  const [cursor, setCursor] = useState<string | null>(null);
  const [isLoading, setIsLoading] = useState(false);

  const loadMore = async () => {
    setIsLoading(true);

    const response = await fetch('/api/agent/results', {
      method: 'POST',
      body: JSON.stringify({ query, cursor, limit: 20 })
    });

    const data = await response.json();
    setResults(prev => [...prev, ...data.results]);
    setCursor(data.nextCursor);
    setIsLoading(false);
  };

  return { results, loadMore, hasMore: cursor !== null, isLoading };
}

// UI with infinite scroll
function ResultsList() {
  const { results, loadMore, hasMore, isLoading } = useInfiniteAgentResults(query);

  return (
    <InfiniteScroll
      loadMore={loadMore}
      hasMore={hasMore}
      loader={<SkeletonResults key="loader" />}
    >
      {results.map(result => <ResultCard key={result.id} {...result} />)}
    </InfiniteScroll>
  );
}

Status Polling for Long Operations

Poll for updates when server-sent events aren't available:

function usePollingStatus(jobId: string, interval = 2000) {
  const [status, setStatus] = useState<JobStatus | null>(null);

  useEffect(() => {
    const poll = async () => {
      const response = await fetch(`/api/jobs/${jobId}/status`);
      const data = await response.json();
      setStatus(data);

      // Stop polling when complete
      if (data.status === 'completed' || data.status === 'failed') {
        clearInterval(pollInterval);
      }
    };

    poll(); // Initial poll
    const pollInterval = setInterval(poll, interval);

    return () => clearInterval(pollInterval);
  }, [jobId, interval]);

  return status;
}

// Adaptive polling: slow down over time
function useAdaptivePolling(jobId: string) {
  const [pollInterval, setPollInterval] = useState(1000);

  useEffect(() => {
    // Back off every 30s: multiply the interval by 1.5, capped at 10s
    const backoff = setInterval(() => {
      setPollInterval(prev => Math.min(prev * 1.5, 10000));
    }, 30000);

    return () => clearInterval(backoff);
  }, []);

  return usePollingStatus(jobId, pollInterval);
}

Background Processing with Web Workers

Offload heavy computation to prevent UI blocking:

// agent-worker.ts
self.addEventListener('message', async (event) => {
  const { type, payload } = event.data;

  if (type === 'EXECUTE_AGENT') {
    try {
      // Post progress updates
      self.postMessage({ type: 'PROGRESS', step: 'initializing' });

      const agent = new Agent();
      await agent.initialize();

      self.postMessage({ type: 'PROGRESS', step: 'executing' });

      const result = await agent.execute(payload.task);

      self.postMessage({ type: 'COMPLETE', result });
    } catch (error) {
      self.postMessage({ type: 'ERROR', error: error.message });
    }
  }
});

// Main thread
const agentWorker = new Worker('/agent-worker.js');

function executeAgentInBackground(task: AgentTask) {
  agentWorker.postMessage({ type: 'EXECUTE_AGENT', payload: { task } });

  agentWorker.addEventListener('message', (event) => {
    const { type, ...data } = event.data;

    switch (type) {
      case 'PROGRESS':
        updateProgressUI(data.step);
        break;
      case 'COMPLETE':
        handleResult(data.result);
        break;
      case 'ERROR':
        handleError(data.error);
        break;
    }
  });
}

Debouncing and Throttling User Input

Reduce unnecessary agent invocations for real-time features:

import { debounce, throttle } from 'lodash';

// Debounce: Wait for user to stop typing
const debouncedSearch = debounce(async (query: string) => {
  const results = await agent.search(query);
  setResults(results);
}, 300);

// Throttle: Execute at most once per interval
const throttledUpdate = throttle(async (data: FormData) => {
  await agent.updateContext(data);
}, 1000);

// Example usage
<input
  type="text"
  onChange={(e) => debouncedSearch(e.target.value)}
  placeholder="Search with agent..."
/>

Key Metrics

Perceived Latency

Definition: How long users feel an operation takes, influenced by feedback quality.

Measurement:

// Track user perception through engagement
const metrics = {
  actualDuration: 3500, // ms
  feedbackDelay: 150, // time until first feedback
  interactionsDuringWait: 5, // user remained engaged
  abandonmentRate: 0.02 // 2% abandoned
};

Targets:

  • Feedback delay: < 100ms for all operations
  • Perceived speed rating: > 4.0/5.0 in user surveys
  • Abandonment rate: < 5% for operations < 10s

Time to First Byte (TTFB)

Definition: Time from request initiation to first byte of response received.

Measurement:

const startTime = performance.now();

fetch('/api/agent/execute')
  .then(response => {
    const ttfb = performance.now() - startTime;
    analytics.track('agent.ttfb', { duration: ttfb });
  });

Targets:

  • TTFB for agent endpoints: < 200ms
  • TTFB for streaming responses: < 100ms
  • 95th percentile TTFB: < 500ms

Time to First Token (TTFT)

Definition: For streaming responses, time until the first content token is received.

Measurement:

const startTime = performance.now();
let ttft: number | null = null;

for await (const chunk of stream) {
  if (ttft === null) {
    ttft = performance.now() - startTime;
    analytics.track('agent.ttft', { duration: ttft });
  }
  processChunk(chunk);
}

Targets:

  • TTFT for LLM responses: < 500ms
  • TTFT for agent reasoning: < 1000ms
  • TTFT 99th percentile: < 2000ms

Time to Interactive (TTI)

Definition: Time until the UI is fully responsive and users can interact.

Measurement:

// Using PerformanceObserver to capture first input delay
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    // 'first-input' entries are PerformanceEventTiming objects;
    // processingStart - startTime is the input delay
    const timing = entry as PerformanceEventTiming;
    analytics.track('agent.tti', {
      duration: timing.processingStart - timing.startTime
    });
  }
});

observer.observe({ entryTypes: ['first-input'] });

Targets:

  • TTI for agent UI: < 3000ms
  • TTI after agent response: < 500ms
  • Main thread blocking: < 50ms per frame

Progress Update Frequency

Definition: How often users receive status updates during long operations.

Measurement:

const progressEvents: number[] = [];
const startTime = Date.now();

agentStream.on('progress', () => {
  progressEvents.push(Date.now() - startTime);
});

agentStream.on('complete', () => {
  const avgInterval = progressEvents.reduce((sum, time, i) => {
    return i > 0 ? sum + (time - progressEvents[i-1]) : sum;
  }, 0) / (progressEvents.length - 1);

  analytics.track('agent.progress_frequency', { avgInterval });
});

Targets:

  • Update frequency for operations > 3s: at least every 1-2s
  • Updates should show meaningful state changes
  • No "stuck" progress for > 5s without update

Last updated: 2025-10-23