Performance Overhaul: Uploads, Memory, and Cloud Sync
A deep dive into the reliability and performance work we shipped in late March, from fixing memory leaks to rebuilding the upload engine.
We spent the last two weeks of March focused almost entirely on reliability and performance. No new features, just making the existing ones work better. Here’s what we found and fixed.
Memory Leaks
The web app had several memory leaks that caused it to slow down during long sessions. The worst was a Victory.js chart component that was churning SVG DOM nodes at 30MB per second. We also found that Redux DevTools was enabled unconditionally in production, where its memory use grew without bound.
The full list of memory fixes:
- Bounded Redux slices. Events, comments, AI chat messages, and user data now have eviction caps. Events are capped at 200 per profile and 50 profiles. AI chat is capped at 500 messages per chat and 50 chats per namespace.
- Replaced 31 unbounded lodash.memoize calls with an LRU-bounded memoize utility across the codebase.
- Capped the GTM dataLayer at 500 entries to prevent unbounded growth during sessions.
- Fixed stale setState calls after unmount in ImageContents, FileThumbnail, and useFileUrl with cancellation flags and AbortControllers.
- Fixed comment overlay timeouts leaking on unmount, and CloudPickerModal polling intervals not being cleared.
- Moved PDFViewer StyleSheet.create to module level to avoid per-render allocation.
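To illustrate the LRU-bounded memoize idea from the list above, here is a minimal sketch. The name `memoizeLru`, the default cap of 100 entries, and the single-argument signature are assumptions made for illustration, not the utility that shipped.

```typescript
// Hypothetical helper: an LRU-bounded alternative to an unbounded memoize.
// The name, signature, and default cap are illustrative assumptions.
function memoizeLru<A, R>(
  fn: (arg: A) => R,
  maxEntries: number = 100,
  keyOf: (arg: A) => unknown = (arg) => arg,
): ((arg: A) => R) & { cache: Map<unknown, R> } {
  if (maxEntries < 1) {
    throw new RangeError("maxEntries must be at least 1");
  }
  // A Map iterates in insertion order, so its first key is the least recently used.
  const cache = new Map<unknown, R>();

  const memoized = (arg: A): R => {
    const key = keyOf(arg);
    if (cache.has(key)) {
      const hit = cache.get(key) as R;
      // Re-insert the entry so it becomes the most recently used.
      cache.delete(key);
      cache.set(key, hit);
      return hit;
    }
    const value = fn(arg);
    cache.set(key, value);
    if (cache.size > maxEntries) {
      // Over the cap: evict the least recently used entry.
      const oldest = cache.keys().next().value;
      cache.delete(oldest);
    }
    return value;
  };

  return Object.assign(memoized, { cache });
}
```

The cache never holds more than `maxEntries` results, so a long session that keeps producing new keys trades some recomputation for a fixed memory ceiling.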
Upload Engine
Large file uploads over 1GB were silently failing. We traced this to several issues:
- The upload engine used a requestAnimationFrame loop that ran at 60fps even when nothing was happening. We replaced it with a 500ms interval that only runs when uploads are active.
- Chunk upload timeouts were fixed at 60 seconds regardless of chunk size. Now they scale dynamically based on the chunk being uploaded.
- Files beyond the max concurrent upload count never started because the queue drain timer was missing.
- Stale thumbnails persisted after file replacement uploads.
- IndexedDB InvalidStateError and localStorage QuotaExceededError crashes during uploads on mobile and Android WebViews are now handled gracefully.
We also added chunk upload streaming end-to-end, which fixed a 1.5GB memory exhaustion issue where the platform was buffering entire chunks in memory before writing them.
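As a rough sketch of how a chunk timeout can scale with chunk size instead of staying at a flat 60 seconds: the formula below (a fixed allowance plus transfer time at a minimum acceptable throughput, clamped to a ceiling) and every constant in it are assumptions for illustration, not the values used in the upload engine.

```typescript
// Hypothetical policy; all three values are illustrative assumptions.
interface TimeoutPolicy {
  baseMs: number;            // fixed allowance for connection setup and server processing
  minBytesPerSecond: number; // slowest throughput we are still willing to wait for
  maxMs: number;             // hard ceiling so a stalled upload eventually fails
}

const DEFAULT_POLICY: TimeoutPolicy = {
  baseMs: 15000,
  minBytesPerSecond: 256 * 1024,
  maxMs: 600000,
};

// Returns the timeout in milliseconds for a chunk of the given size.
function chunkTimeoutMs(
  chunkBytes: number,
  policy: TimeoutPolicy = DEFAULT_POLICY,
): number {
  if (!Number.isFinite(chunkBytes) || chunkBytes < 0) {
    throw new RangeError("chunkBytes must be a non-negative finite number");
  }
  // Time the chunk would take at the minimum acceptable throughput.
  const transferMs = Math.ceil((chunkBytes / policy.minBytesPerSecond) * 1000);
  return Math.min(policy.baseMs + transferMs, policy.maxMs);
}
```

Under these assumed numbers a 5 MiB chunk gets 35 seconds, while a very large chunk is clamped to the 10 minute ceiling. The returned value could then drive whatever cancellation mechanism the uploader already uses.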
WebSocket Stability
The realtime system had a reconnection storm problem. When the server rate-limited a WebSocket connection, the client would immediately try to reconnect, get rate-limited again, and repeat at high frequency.
We split WebSocket and polling backoff into separate strategies, added auth error detection (so expired tokens don’t trigger reconnection spam), and added a managed three-tier backoff system: rate limit backoff, auth failure backoff, and generic reconnection backoff.
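A three-tier backoff along these lines might look like the sketch below. The tier names mirror the three categories above, but the delays, the jitter scheme, and the close codes 4401 and 4429 are hypothetical (the 4000 to 4999 range is reserved for application-defined WebSocket close codes, so a server could choose any values there).

```typescript
type Tier = "rateLimit" | "authFailure" | "generic";

interface TierConfig {
  baseMs: number; // delay ceiling for the first retry in this tier
  maxMs: number;  // delay ceiling after repeated failures
}

// Illustrative values only; not the production configuration.
const TIERS: Record<Tier, TierConfig> = {
  rateLimit: { baseMs: 5000, maxMs: 300000 },
  authFailure: { baseMs: 30000, maxMs: 600000 },
  generic: { baseMs: 500, maxMs: 30000 },
};

// Hypothetical mapping from application-defined close codes to a tier.
function classifyClose(code: number): Tier {
  if (code === 4429) return "rateLimit";
  if (code === 4401) return "authFailure";
  return "generic";
}

class TieredBackoff {
  private attempts: Record<Tier, number> = { rateLimit: 0, authFailure: 0, generic: 0 };
  private readonly random: () => number;

  // The random source is injectable so the delays can be made deterministic in tests.
  constructor(random: () => number = Math.random) {
    this.random = random;
  }

  // Exponential backoff per tier with "equal jitter": half fixed, half random.
  nextDelayMs(tier: Tier): number {
    const { baseMs, maxMs } = TIERS[tier];
    const attempt = this.attempts[tier];
    this.attempts[tier] = attempt + 1;
    const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
    return Math.round(ceiling / 2 + this.random() * (ceiling / 2));
  }

  // Call after a connection succeeds so the next failure starts from the base delay.
  reset(): void {
    this.attempts = { rateLimit: 0, authFailure: 0, generic: 0 };
  }
}
```

Keeping a separate attempt counter per tier means a burst of generic network drops does not inflate the wait after a later rate limit, and the jitter keeps many clients from retrying at the same instant.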
We also eliminated all realtime traffic from hidden browser tabs. When you switch away from the tab, the WebSocket disconnects cleanly and reconnects when you return.
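One way to tie a connection to tab visibility is sketched below. The `RealtimeConnection` interface and the `bindConnectionToVisibility` helper are hypothetical names; the only browser behavior relied on is the Page Visibility API (`visibilityState` and the `visibilitychange` event).

```typescript
// Minimal slice of the browser document that this sketch depends on.
interface VisibilitySource {
  readonly visibilityState: string;
  addEventListener(type: "visibilitychange", listener: () => void): void;
  removeEventListener(type: "visibilitychange", listener: () => void): void;
}

// Hypothetical connection wrapper; connect and disconnect are assumed to be idempotent.
interface RealtimeConnection {
  connect(): void;
  disconnect(): void;
}

// Disconnects while the tab is hidden and reconnects when it becomes visible again.
// Returns a cleanup function that removes the listener.
function bindConnectionToVisibility(
  doc: VisibilitySource,
  conn: RealtimeConnection,
): () => void {
  const sync = (): void => {
    if (doc.visibilityState === "hidden") {
      conn.disconnect();
    } else {
      conn.connect();
    }
  };
  doc.addEventListener("visibilitychange", sync);
  sync(); // apply the current state immediately
  return () => doc.removeEventListener("visibilitychange", sync);
}
```

In a browser the global `document` provides these members, so it can be passed as `doc`. Accepting a narrow interface instead of the global keeps the helper testable outside a browser.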
Cloud Import Webhooks
Google Drive, Dropbox, Box, and OneDrive imports now support near-real-time sync via webhooks. Previously, imported files only updated on a polling interval. Now when a file changes in your connected cloud provider, a webhook fires and the sync happens within seconds.
The webhook infrastructure includes subscription lifecycle management with auto-renewal, and the cloud import credentials have been moved from environment variables to Vault for better security.
Security Hardening
We audited the entire codebase for sensitive data in logs and found plaintext passwords, API keys, JWT payloads, and bearer tokens being logged in multiple places. All of them have been removed.
We also upgraded several packages for CVE patches: symfony/http-foundation for an authorization bypass, firebase/php-jwt for weak encryption, and google/protobuf for a high-severity DoS vulnerability. The Stripe PHP SDK was migrated from v16 to v20.
Multiplayer Performance
The real-time collaboration system was generating 600 re-renders per second from unstable context references. We:
- Replaced JSON.stringify equality checks with shallow comparison in multiplayer heartbeat
- Throttled resize and viewport sync handlers at 150ms
- Debounced presence hooks at 50ms
- Added an O(1) clientId cache for the cursor overlay fast path
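Two of the techniques above, shallow comparison and throttling, can be sketched in a few lines. Both helpers are generic illustrations with assumed names and signatures rather than the code in the multiplayer module, and the throttle shown is a leading-edge variant that drops calls inside the window.

```typescript
// Compares own enumerable keys one level deep, without serializing the objects.
function shallowEqual(
  a: Record<string, unknown>,
  b: Record<string, unknown>,
): boolean {
  if (a === b) return true;
  const aKeys = Object.keys(a);
  const bKeys = Object.keys(b);
  if (aKeys.length !== bKeys.length) return false;
  return aKeys.every(
    (key) =>
      Object.prototype.hasOwnProperty.call(b, key) && Object.is(a[key], b[key]),
  );
}

// Leading-edge throttle: runs fn at most once per intervalMs and ignores calls in between.
// The clock is injectable so the behavior can be checked without real timers.
function throttle<Args extends unknown[]>(
  fn: (...args: Args) => void,
  intervalMs: number,
  now: () => number = Date.now,
): (...args: Args) => void {
  let lastRun = -Infinity;
  return (...args: Args): void => {
    const t = now();
    if (t - lastRun >= intervalMs) {
      lastRun = t;
      fn(...args);
    }
  };
}
```

Shallow comparison treats two freshly allocated nested objects as different even when their contents match, so it only helps when nested values keep stable references. A throttle used for resize or viewport sync would likely also need a trailing call so the final position is not lost; this sketch omits that.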
What’s Next
This kind of work isn’t glamorous but it’s the foundation everything else is built on. The app is measurably faster and more stable. Long sessions no longer degrade. Uploads work reliably at any file size. WebSocket connections don’t flap. We’ll keep doing this work alongside feature development.