UX Latency Patterns
UX latency patterns are interface design strategies for managing perceived wait times during agent execution. They focus on keeping the user experience responsive even when background processes, API calls, or agent computations take significant time to complete.
In agentic systems, where AI agents may spend seconds or minutes processing requests, executing tools, or generating responses, effective latency management becomes critical to maintaining user engagement and system usability.
Why It Matters
Perceived Speed vs Actual Speed
Human perception of time is subjective and influenced by feedback. A 3-second operation with clear progress indicators can feel faster than a 1-second operation with no feedback. UX latency patterns exploit this psychological phenomenon by:
- Providing immediate visual feedback to confirm user actions
- Breaking long waits into perceivable stages with status updates
- Creating the illusion of speed through optimistic UI updates
- Maintaining context so users understand what's happening
User Patience and Abandonment Rates
Research shows that user tolerance for wait times varies dramatically based on context and feedback:
- 0-100ms: Feels instant. Users perceive no delay.
- 100-300ms: Slight perceptible delay, but still feels responsive.
- 300ms-1s: Noticeable delay. Users need subtle feedback.
- 1-3s: Moderate delay. Clear progress indicators required.
- 3-10s: Long delay. Detailed status updates essential.
- 10s+: Very long delay. Consider background processing with notifications.
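These thresholds can be encoded directly as a lookup from expected wait time to feedback strategy. A minimal sketch (the tier names and the function name are assumptions; the 100ms-1s bands from the list above are merged into a single "subtle" tier):

```typescript
type FeedbackTier =
  | 'none'        // 0-100ms: feels instant, no indicator needed
  | 'subtle'      // 100ms-1s: subtle feedback (spinner, button state)
  | 'progress'    // 1-3s: clear progress indicator
  | 'detailed'    // 3-10s: detailed status updates
  | 'background'; // 10s+: background processing with notifications

// Pick a feedback strategy from an expected wait time in milliseconds
function feedbackForDelay(expectedMs: number): FeedbackTier {
  if (expectedMs < 100) return 'none';
  if (expectedMs < 1000) return 'subtle';
  if (expectedMs < 3000) return 'progress';
  if (expectedMs < 10000) return 'detailed';
  return 'background';
}
```

A helper like this keeps the thresholds in one place, so the UI layer can ask for a tier rather than re-deriving the bands per component.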
Without proper latency management, users may:
- Abandon tasks before completion (web performance studies have repeatedly linked each added second of delay to measurably higher drop-off)
- Submit duplicate requests, overwhelming the system
- Lose context about what they were doing
- Develop negative perceptions of system reliability
In agentic systems where tasks routinely exceed 3 seconds, effective UX latency patterns are not optional—they're essential for usability.
Concrete Examples
Optimistic UI Updates
Immediately reflect user actions in the interface before server confirmation:
async function sendMessage(content: string) {
  // Immediately add message to UI with "sending" state
  const optimisticMessage = {
    id: generateTempId(),
    content,
    status: 'sending',
    timestamp: Date.now()
  };
  addMessageToUI(optimisticMessage);

  try {
    // Send to server in background
    const confirmed = await api.sendMessage(content);
    updateMessageStatus(optimisticMessage.id, 'sent', confirmed.id);
  } catch (error) {
    updateMessageStatus(optimisticMessage.id, 'failed');
  }
}
Use cases: Chat interfaces, form submissions, agent commands where success is highly likely.
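The optimistic state transitions themselves can be modeled as pure functions, which makes the pattern easy to unit-test and to extend with rollback. A sketch (the types and function names here are assumptions, not part of the example above):

```typescript
type MessageStatus = 'sending' | 'sent' | 'failed';

interface Message {
  id: string;
  content: string;
  status: MessageStatus;
}

// Append an optimistic entry immediately, before server confirmation
function applyOptimistic(messages: Message[], msg: Message): Message[] {
  return [...messages, { ...msg, status: 'sending' }];
}

// Confirm or fail the entry once the server responds
function resolveOptimistic(
  messages: Message[],
  tempId: string,
  status: 'sent' | 'failed'
): Message[] {
  return messages.map(m => (m.id === tempId ? { ...m, status } : m));
}

// Roll back entirely, e.g. when the user dismisses a failed send
function rollbackOptimistic(messages: Message[], tempId: string): Message[] {
  return messages.filter(m => m.id !== tempId);
}
```

Keeping these transitions pure separates the optimistic-update logic from the rendering layer, so the same functions work with any state container.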
Streaming Responses
Display content as it's generated rather than waiting for completion:
async function streamAgentResponse(prompt: string) {
  const stream = await agent.generateStream(prompt);
  let accumulated = '';

  for await (const chunk of stream) {
    accumulated += chunk.text;
    updateResponseDisplay(accumulated);
    // User sees text appearing in real-time
  }
}
Use cases: LLM responses, agent reasoning traces, long-form content generation.
Skeleton Screens
Show content placeholders that match the expected layout:
function AgentResponseSkeleton() {
  return (
    <div className="space-y-4">
      <div className="h-4 bg-gray-200 rounded w-3/4 animate-pulse" />
      <div className="h-4 bg-gray-200 rounded w-full animate-pulse" />
      <div className="h-4 bg-gray-200 rounded w-5/6 animate-pulse" />
      <div className="h-32 bg-gray-200 rounded animate-pulse" />
    </div>
  );
}
Use cases: Loading agent workspaces, tool execution results, structured data displays.
Progress Indicators
Communicate execution status and remaining time:
function AgentExecutionProgress({ task }: Props) {
  return (
    <div className="space-y-2">
      <div className="flex justify-between text-sm">
        <span>{task.currentStep}</span>
        <span>{task.completedSteps}/{task.totalSteps}</span>
      </div>
      <ProgressBar value={(task.completedSteps / task.totalSteps) * 100} />
      {task.estimatedTimeRemaining && (
        <p className="text-xs text-gray-500">
          About {task.estimatedTimeRemaining}s remaining
        </p>
      )}
    </div>
  );
}
Use cases: Multi-step agent workflows, batch operations, file processing.
Background Processing with Notifications
Move long-running tasks off the main thread and notify on completion:
async function executeLongRunningAgent(task: AgentTask) {
  // Start background job
  const jobId = await backgroundQueue.enqueue(task);

  // Show non-blocking notification
  showToast({
    type: 'info',
    message: "Agent task started. You'll be notified when complete.",
    action: { label: 'View Status', link: `/jobs/${jobId}` }
  });

  // User can continue working
  // Notification appears when job completes
  backgroundQueue.onComplete(jobId, (result) => {
    showToast({
      type: 'success',
      message: 'Agent task completed',
      action: { label: 'View Results', link: `/results/${result.id}` }
    });
  });
}
Use cases: Data migrations, batch agent operations, report generation.
Common Pitfalls
Blocking the Entire UI
Problem: Disabling all interaction during background operations.
Why it's bad: Users can't check other information, cancel operations, or multitask.
Solution: Use localized loading states. Only disable the specific component affected by the operation:
// Bad: Blocks everything
{isLoading && <FullPageSpinner />}

// Good: Localized loading
<Button disabled={isExecuting}>
  {isExecuting ? 'Executing...' : 'Run Agent'}
</Button>
No Feedback for Long Tasks
Problem: Showing a generic spinner for 30+ second operations.
Why it's bad: Users don't know if the system is frozen, what's happening, or how long to wait.
Solution: Provide granular status updates:
// Bad
setLoading(true);
await longOperation();
setLoading(false);
// Good
setStatus('Initializing agent...');
await initialize();
setStatus('Loading tools (1/5)...');
await loadTools();
setStatus('Executing task (2/5)...');
const result = await execute();
setStatus('Processing results (3/5)...');
// ... etc
Misleading Progress Bars
Problem: Progress bars that don't reflect actual progress or get stuck at 99%.
Why it's bad: Erodes user trust and creates anxiety.
Solution: Use indeterminate loaders when progress is unknown, or calculate genuine progress:
// Use indeterminate when progress is unknown
{!canCalculateProgress && <IndeterminateSpinner />}

// Only show percentage when it's accurate
{canCalculateProgress && (
  <ProgressBar
    value={(completedItems / totalItems) * 100}
    label={`${completedItems}/${totalItems} items processed`}
  />
)}
Ignoring Network Conditions
Problem: Using patterns optimized for fast connections on slow networks.
Why it's bad: Creates jarring experiences when latency varies.
Solution: Adapt patterns based on connection quality:
// Network Information API (not available in all browsers; not in the
// standard TypeScript DOM types, hence the cast)
const connection = (navigator as any).connection;
const isSlowConnection = connection?.effectiveType === '2g' ||
  connection?.effectiveType === 'slow-2g';

if (isSlowConnection) {
  // Use aggressive caching and simplified UI
  enableOfflineMode();
} else {
  // Use rich real-time features
  enableRealtimeUpdates();
}
No Cancellation Options
Problem: Forcing users to wait for long operations they no longer need.
Why it's bad: Wastes resources and frustrates users who clicked by mistake.
Solution: Always provide cancellation for operations > 3 seconds:
const abortController = new AbortController();

async function executeCancellableAgent() {
  try {
    const result = await agent.execute(task, {
      signal: abortController.signal
    });
    return result;
  } catch (error) {
    if (error.name === 'AbortError') {
      showToast('Agent execution cancelled');
    }
  }
}

// UI shows cancel button
<Button onClick={() => abortController.abort()}>
  Cancel
</Button>
Implementation Patterns
Streaming Pattern for Agent Responses
Stream data from the server and update UI incrementally:
// Backend: Stream server-sent events
export async function POST(request: Request) {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      const agent = new Agent();
      for await (const event of agent.executeStream(task)) {
        const data = `data: ${JSON.stringify(event)}\n\n`;
        controller.enqueue(encoder.encode(data));
      }
      controller.close();
    }
  });

  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive'
    }
  });
}

// Frontend: Consume stream
async function subscribeToAgentStream(taskId: string) {
  const response = await fetch(`/api/agent/stream/${taskId}`);
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Buffer partial data: an SSE event may be split across reads,
    // and { stream: true } keeps multi-byte characters intact
    buffer += decoder.decode(value, { stream: true });
    const events = buffer.split('\n\n');
    buffer = events.pop() ?? ''; // keep the incomplete tail for the next read

    for (const event of events) {
      if (event.startsWith('data: ')) {
        const data = JSON.parse(event.slice(6));
        handleAgentEvent(data);
      }
    }
  }
}
Chunked Responses with Pagination
Break large datasets into manageable chunks:
function useInfiniteAgentResults(query: string) {
  const [results, setResults] = useState<Result[]>([]);
  const [cursor, setCursor] = useState<string | null>(null);
  const [isLoading, setIsLoading] = useState(false);

  const loadMore = async () => {
    setIsLoading(true);
    const response = await fetch('/api/agent/results', {
      method: 'POST',
      body: JSON.stringify({ query, cursor, limit: 20 })
    });
    const data = await response.json();
    setResults(prev => [...prev, ...data.results]);
    setCursor(data.nextCursor);
    setIsLoading(false);
  };

  return { results, loadMore, hasMore: cursor !== null, isLoading };
}

// UI with infinite scroll
function ResultsList() {
  const { results, loadMore, hasMore, isLoading } = useInfiniteAgentResults(query);
  return (
    <InfiniteScroll
      loadMore={loadMore}
      hasMore={hasMore}
      loader={<SkeletonResults key="loader" />}
    >
      {results.map(result => <ResultCard key={result.id} {...result} />)}
    </InfiniteScroll>
  );
}
Status Polling for Long Operations
Poll for updates when server-sent events aren't available:
function usePollingStatus(jobId: string, interval = 2000) {
  const [status, setStatus] = useState<JobStatus | null>(null);

  useEffect(() => {
    const poll = async () => {
      const response = await fetch(`/api/jobs/${jobId}/status`);
      const data = await response.json();
      setStatus(data);

      // Stop polling when complete
      if (data.status === 'completed' || data.status === 'failed') {
        clearInterval(pollInterval);
      }
    };

    poll(); // Initial poll
    const pollInterval = setInterval(poll, interval);
    return () => clearInterval(pollInterval);
  }, [jobId, interval]);

  return status;
}

// Adaptive polling: slow down over time
function useAdaptivePolling(jobId: string) {
  // Named setPollMs so the setter doesn't shadow the global setInterval
  const [pollMs, setPollMs] = useState(1000);

  useEffect(() => {
    // Start fast, slow down over time
    const timer = setTimeout(() => {
      setPollMs(prev => Math.min(prev * 1.5, 10000)); // Cap at 10s
    }, 30000); // After 30s
    return () => clearTimeout(timer);
  }, []);

  return usePollingStatus(jobId, pollMs);
}
Background Processing with Web Workers
Offload heavy computation to prevent UI blocking:
// agent-worker.ts
self.addEventListener('message', async (event) => {
  const { type, payload } = event.data;

  if (type === 'EXECUTE_AGENT') {
    try {
      // Post progress updates
      self.postMessage({ type: 'PROGRESS', step: 'initializing' });
      const agent = new Agent();
      await agent.initialize();

      self.postMessage({ type: 'PROGRESS', step: 'executing' });
      const result = await agent.execute(payload.task);

      self.postMessage({ type: 'COMPLETE', result });
    } catch (error) {
      self.postMessage({ type: 'ERROR', error: error.message });
    }
  }
});

// Main thread
const agentWorker = new Worker('/agent-worker.js');

function executeAgentInBackground(task: AgentTask) {
  agentWorker.postMessage({ type: 'EXECUTE_AGENT', payload: { task } });

  agentWorker.addEventListener('message', (event) => {
    const { type, ...data } = event.data;
    switch (type) {
      case 'PROGRESS':
        updateProgressUI(data.step);
        break;
      case 'COMPLETE':
        handleResult(data.result);
        break;
      case 'ERROR':
        handleError(data.error);
        break;
    }
  });
}
Debouncing and Throttling User Input
Reduce unnecessary agent invocations for real-time features:
import { debounce, throttle } from 'lodash';

// Debounce: Wait for user to stop typing
const debouncedSearch = debounce(async (query: string) => {
  const results = await agent.search(query);
  setResults(results);
}, 300);

// Throttle: Execute at most once per interval
const throttledUpdate = throttle(async (data: FormData) => {
  await agent.updateContext(data);
}, 1000);

// Example usage
<input
  type="text"
  onChange={(e) => debouncedSearch(e.target.value)}
  placeholder="Search with agent..."
/>
Key Metrics
Perceived Latency
Definition: How long users feel an operation takes, influenced by feedback quality.
Measurement:
// Track user perception through engagement
const metrics = {
  actualDuration: 3500,       // ms
  feedbackDelay: 150,         // time until first feedback (ms)
  interactionsDuringWait: 5,  // user remained engaged
  abandonmentRate: 0.02       // 2% abandoned
};
Targets:
- Feedback delay: < 100ms for all operations
- Perceived speed rating: > 4.0/5.0 in user surveys
- Abandonment rate: < 5% for operations < 10s
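The metrics object above shows a static example; in practice those numbers can be derived from a timestamped event log recorded during the wait. A sketch under assumed event names (none of these identifiers come from the example above):

```typescript
interface UxEvent {
  name: 'request' | 'first_feedback' | 'interaction' | 'complete' | 'abandon';
  t: number; // ms since session start
}

// Derive perceived-latency metrics from a recorded event log
function perceivedLatencyMetrics(events: UxEvent[]) {
  const at = (name: UxEvent['name']) => events.find(e => e.name === name)?.t;
  const start = at('request') ?? 0;
  return {
    actualDuration: (at('complete') ?? at('abandon') ?? start) - start,
    feedbackDelay: (at('first_feedback') ?? start) - start,
    interactionsDuringWait: events.filter(e => e.name === 'interaction').length,
    abandoned: events.some(e => e.name === 'abandon')
  };
}
```

Aggregating the `feedbackDelay` and `abandoned` fields across sessions gives the abandonment-rate and feedback-delay figures the targets above refer to.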
Time to First Byte (TTFB)
Definition: Time from request initiation to first byte of response received.
Measurement:
const startTime = performance.now();
fetch('/api/agent/execute')
  .then(response => {
    // fetch resolves when headers arrive, approximating TTFB
    const ttfb = performance.now() - startTime;
    analytics.track('agent.ttfb', { duration: ttfb });
  });
Targets:
- TTFB for agent endpoints: < 200ms
- TTFB for streaming responses: < 100ms
- 95th percentile TTFB: < 500ms
Time to First Token (TTFT)
Definition: For streaming responses, time until the first content token is received.
Measurement:
const startTime = performance.now();
let ttft: number | null = null;

for await (const chunk of stream) {
  if (ttft === null) {
    ttft = performance.now() - startTime;
    analytics.track('agent.ttft', { duration: ttft });
  }
  processChunk(chunk);
}
Targets:
- TTFT for LLM responses: < 500ms
- TTFT for agent reasoning: < 1000ms
- TTFT 99th percentile: < 2000ms
Time to Interactive (TTI)
Definition: Time until the UI is fully responsive and users can interact.
Measurement:
// Using PerformanceObserver: 'first-input' entries report input delay
// as processingStart - startTime
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    if (entry.entryType === 'first-input') {
      analytics.track('agent.tti', {
        duration: entry.processingStart - entry.startTime
      });
    }
  }
});
observer.observe({ entryTypes: ['first-input'] });
Targets:
- TTI for agent UI: < 3000ms
- TTI after agent response: < 500ms
- Main thread blocking: < 50ms per frame
Progress Update Frequency
Definition: How often users receive status updates during long operations.
Measurement:
const progressEvents: number[] = [];
const startTime = Date.now();

agentStream.on('progress', () => {
  progressEvents.push(Date.now() - startTime);
});

agentStream.on('complete', () => {
  if (progressEvents.length < 2) return; // need at least two events for an interval
  const avgInterval = progressEvents.reduce((sum, time, i) => {
    return i > 0 ? sum + (time - progressEvents[i - 1]) : sum;
  }, 0) / (progressEvents.length - 1);
  analytics.track('agent.progress_frequency', { avgInterval });
});
Targets:
- Update frequency for operations > 3s: at least every 1-2s
- Updates should show meaningful state changes
- No "stuck" progress for > 5s without update
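The "no stuck progress for > 5s" target can be checked against the same recorded timestamps with a small watchdog function. A sketch (the function name and default threshold are assumptions):

```typescript
// Flags any gap between consecutive progress updates exceeding maxGapMs
function findStuckGaps(progressTimes: number[], maxGapMs = 5000): number[] {
  const gaps: number[] = [];
  for (let i = 1; i < progressTimes.length; i++) {
    const gap = progressTimes[i] - progressTimes[i - 1];
    if (gap > maxGapMs) gaps.push(gap);
  }
  return gaps;
}
```

Running this over the `progressEvents` array from the measurement snippet above turns the target into an automated check rather than a manual review.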
Related Concepts
- Latency SLO - Service level objectives for response times
- Handoff patterns - Transitioning control between agents and users
- Activation TTFV - Time to first value for agent activation
- Progress indicators - Visual feedback patterns for ongoing operations
Last updated: 2025-10-23