Add virtual scrolling and performance optimizations for large buckets (10k+ objects) #39

@aldy505

Currently, Atrium struggles with folders containing thousands of objects. Loading and rendering 10k+ items causes performance issues: slow initial load, sluggish scrolling, high memory usage, and unresponsive UI during sort/filter operations.

This issue addresses large-bucket UX by implementing virtual scrolling, adaptive pagination, skeleton loading states, search debouncing, folder size warnings, and sort performance optimizations.

Target Performance Goals

  • 10k items: Smooth scrolling, <2s initial load, <100ms interaction response
  • 50k items: Usable scrolling, <5s initial load, warning indicators
  • 100k+ items: Functional with degraded features (warnings, limited sort/filter)

Requirements

1. Virtual Scrolling (Critical)

  • Replace current list rendering in ObjectTable.tsx with @tanstack/react-virtual
  • Only render visible rows + small buffer (e.g., 10 rows above/below viewport)
  • Maintain scroll position when navigating back to folder
  • Support variable row heights if needed (folders vs files with previews)
  • Preserve keyboard navigation (arrow keys, tab focus)
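The "visible rows + small buffer" idea can be sketched as plain arithmetic. This is only an illustration of the windowing math for a fixed row height, not how @tanstack/react-virtual is implemented internally; the function name `visibleRange` and its parameters are hypothetical.

```typescript
// Inclusive range of row indices that should be in the DOM.
interface VisibleRange {
  start: number;
  end: number;
}

// Assumes every row has the same height (rowHeight, in px).
// overscan is the number of extra rows kept above and below the viewport.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  count: number,
  overscan: number,
): VisibleRange | null {
  if (count === 0) return null; // empty folder: nothing to render

  // First and last rows that intersect the viewport.
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1;

  // Widen by the overscan buffer, clamped to the valid index range.
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(count - 1, last + overscan),
  };
}
```

With a 600px viewport, 48px rows, and an overscan of 10, this keeps roughly 23 to 33 rows mounted regardless of how many objects the folder holds.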

2. Adaptive Page Sizes

  • Start with small page size (100 items) for fast initial render
  • Increase page size as user scrolls (250 → 500 → 1000)
  • Prefetch the next page once the user has scrolled through 80% of the currently loaded rows
  • Show "Loading more..." indicator before user hits bottom
  • Update App.tsx pagination logic to support variable page sizes
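One possible shape for the prefetch trigger, assuming the 80% threshold is measured against the rows loaded so far. The name `shouldPrefetch` and the `hasMore` flag are hypothetical; `hasMore` would typically mirror the last response's IsTruncated value.

```typescript
// Decide whether to start fetching the next page.
// lastVisibleIndex: index of the last row currently rendered in the viewport.
// loadedCount: number of rows loaded so far.
// hasMore: whether the listing still has pages left.
function shouldPrefetch(
  lastVisibleIndex: number,
  loadedCount: number,
  hasMore: boolean,
  threshold = 0.8,
): boolean {
  if (!hasMore || loadedCount === 0) return false;
  // Fraction of the loaded rows the user has already scrolled through.
  const progress = (lastVisibleIndex + 1) / loadedCount;
  return progress >= threshold;
}
```

Calling this from the scroll handler (or whenever the set of virtual rows changes) lets the "Loading more..." indicator appear before the user reaches the last loaded row.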

3. Skeleton Loading States

  • Replace generic spinners with skeleton rows in ObjectTable.tsx
  • Skeleton should match table layout (columns for name, size, date, etc.)
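  • Show estimated progress: "Loaded 2,342 of ~50,000 objects" (sum S3's KeyCount across pages for the loaded count and use IsTruncated to detect remaining pages; S3 listings do not return a total, so the estimate must come from elsewhere, such as the background folder size job)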
  • Progressive enhancement: render folders first, then files (folders typically load faster)
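A minimal sketch of the progress label, assuming the loaded count is accumulated from each page's KeyCount and the estimated total comes from some other source such as a background job. The `ListPage` shape and both function names are hypothetical.

```typescript
// Subset of a listing response that matters for progress tracking.
interface ListPage {
  keyCount: number; // keys returned in this page
  isTruncated: boolean; // true when more pages remain
}

// Sum the loaded keys and report whether the listing is complete.
function summarize(pages: ListPage[]): { loaded: number; complete: boolean } {
  const loaded = pages.reduce((sum, page) => sum + page.keyCount, 0);
  const lastPage = pages[pages.length - 1];
  return { loaded, complete: lastPage !== undefined && !lastPage.isTruncated };
}

// Build the label; estimatedTotal is optional because it may be unavailable.
function formatProgress(loaded: number, estimatedTotal?: number): string {
  const fmt = (n: number) => n.toLocaleString('en-US');
  if (estimatedTotal === undefined) return `Loaded ${fmt(loaded)} objects`;
  // Never show an estimate smaller than what is already loaded.
  const total = Math.max(estimatedTotal, loaded);
  return `Loaded ${fmt(loaded)} of ~${fmt(total)} objects`;
}
```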

4. Search Debouncing

  • Add 300ms debounce to search input in App.tsx
  • Show loading indicator during debounce period
  • For folders >10k items, display warning: "Search limited to loaded items. Use folder navigation or S3 prefix filtering for better results."
  • Clear search when navigating to different folder
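The search behaviour above could be sketched as a pure function that filters only the loaded items and flags when the warning applies. The `S3Object` shape, the `searchLoaded` name, and the `limited` flag are hypothetical.

```typescript
interface S3Object {
  key: string;
}

interface SearchResult {
  matches: S3Object[];
  limited: boolean; // true when the "Search limited to loaded items" warning applies
}

const LARGE_FOLDER_THRESHOLD = 10000;

// Case-insensitive substring match over the items loaded so far.
// folderCount is the known or estimated number of objects in the folder.
function searchLoaded(
  loaded: S3Object[],
  term: string,
  folderCount: number,
): SearchResult {
  const needle = term.trim().toLowerCase();
  const matches =
    needle === ''
      ? loaded
      : loaded.filter((obj) => obj.key.toLowerCase().includes(needle));
  return { matches, limited: folderCount > LARGE_FOLDER_THRESHOLD };
}
```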

5. Folder Size Warnings

  • Before entering a folder with 10k+ objects, show warning badge/indicator
  • Display estimated object count if available from backend
  • Options: "Load first 1,000 only" vs "Load all (may be slow)"
  • Integrate with existing folder size calculation from background job (if enabled)
  • Add folder health indicators in UI based on object count thresholds
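As a sketch, the health indicator could map object counts to levels using the 10k / 50k / 100k tiers from the performance goals. The level names and the `folderHealth` function are hypothetical, and the exact cut-offs are an assumption.

```typescript
type FolderHealth = 'ok' | 'large' | 'very-large' | 'degraded';

// Thresholds mirror the target performance tiers (10k, 50k, 100k+).
function folderHealth(objectCount: number): FolderHealth {
  if (objectCount >= 100000) return 'degraded'; // limited sort/filter
  if (objectCount >= 50000) return 'very-large'; // usable, with warnings
  if (objectCount >= 10000) return 'large'; // show warning badge
  return 'ok';
}
```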

6. Sort Performance Optimization

  • Small folders (<1k): Client-side sort, all options available
  • Medium folders (1k-10k): Client-side sort with "Sorting..." indicator
  • Large folders (>10k): Disable client-side sort or show performance warning
  • Default to S3's native lexicographic ordering (no extra sort for name ascending)
  • Add "Sort by name only (S3 native)" option that uses StartAfter for pagination
  • Document that S3 doesn't support server-side sort by size/date
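The tiers above could be expressed as a small decision function. This is one possible reading of the requirements, with hypothetical names (`SortKey`, `SortMode`, `chooseSortMode`); treating exactly 10k items as "medium" is an assumption.

```typescript
type SortKey = 'name' | 'size' | 'date';
type SortMode =
  | 'native' // rely on S3's lexicographic order, no client work
  | 'client' // sort in the browser immediately
  | 'client-with-indicator' // sort in the browser, show "Sorting..."
  | 'warn'; // disable or warn before sorting

function chooseSortMode(
  objectCount: number,
  key: SortKey,
  ascending: boolean,
): SortMode {
  // Name ascending matches the order S3 already returns keys in.
  if (key === 'name' && ascending) return 'native';
  if (objectCount < 1000) return 'client';
  if (objectCount <= 10000) return 'client-with-indicator';
  return 'warn';
}
```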

Acceptance Criteria

Virtual Scrolling:

  • @tanstack/react-virtual integrated into ObjectTable.tsx
  • Only visible rows rendered (check DOM with 50k items - should see ~20-50 elements max)
  • Smooth 60fps scrolling through 50k items
  • Scroll position preserved when navigating back
  • Keyboard navigation works (arrow keys, tab)
  • Screen reader announces current position ("Item 145 of 50,000")

Adaptive Pagination:

  • Initial load fetches 100 items
  • Subsequent pages grow (250 → 500 → 1000)
  • Prefetch triggers once 80% of the loaded rows have been scrolled past
  • "Loading more..." indicator appears before user hits bottom
  • Works seamlessly with virtual scrolling

Skeleton States:

  • Skeleton rows match table layout exactly
  • Progress indicator shows "Loaded X of ~Y objects"
  • Folders render before files (if mixing in same view)
  • No layout shift when real data replaces skeleton

Search Debouncing:

  • Search debounced to 300ms
  • Loading indicator during debounce
  • Warning shown for large folders (>10k)
  • Search clears when changing folders
  • Debounce cancels if user navigates away

Folder Warnings:

  • Badge/icon shows for folders with >10k objects
  • Warning dialog before entering large folder
  • "Load first 1,000" option limits initial fetch
  • "Load all" option available with performance disclaimer
  • Integrates with background folder size calculation (if available)

Sort Performance:

  • Small folders (<1k): All sort options work instantly
  • Medium folders (1k-10k): Sort with loading indicator
  • Large folders (>10k): Warning or disabled sort for size/date
  • "Name (A-Z)" uses S3 native ordering (no client-side sort)
  • Documentation explains S3 sort limitations

Technical Implementation

Dependencies:

{
  "@tanstack/react-virtual": "^3.0.0"
}

Files to Modify:

  • src/components/ObjectTable.tsx - Add virtual scrolling
  • src/app/App.tsx - Update pagination logic, add adaptive page sizes
  • src/components/FilePreview.tsx - Ensure it works with the virtualized list
  • src/types.ts - Add types for pagination state, folder metadata
  • Update relevant test files

Virtual Scrolling Example (ObjectTable.tsx):

import { useRef } from 'react';
import { useVirtualizer } from '@tanstack/react-virtual';

function ObjectTable({ objects, onObjectClick }) {
  const parentRef = useRef<HTMLDivElement>(null);
  
  const virtualizer = useVirtualizer({
    count: objects.length,
    getScrollElement: () => parentRef.current,
    estimateSize: () => 48, // Estimated row height in px
    overscan: 10, // Render 10 extra rows above/below viewport
  });
  
  return (
    <div ref={parentRef} style={{ height: '600px', overflow: 'auto' }}>
      <div
        style={{
          height: `${virtualizer.getTotalSize()}px`,
          width: '100%',
          position: 'relative',
        }}
      >
        {virtualizer.getVirtualItems().map((virtualRow) => {
          const object = objects[virtualRow.index];
          return (
            <div
              key={virtualRow.key}
              style={{
                position: 'absolute',
                top: 0,
                left: 0,
                width: '100%',
                height: `${virtualRow.size}px`,
                transform: `translateY(${virtualRow.start}px)`,
              }}
            >
              <ObjectRow object={object} onClick={onObjectClick} />
            </div>
          );
        })}
      </div>
    </div>
  );
}
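For the "maintain scroll position when navigating back" requirement, one option is a small in-memory store keyed by folder prefix. This is a sketch only; `ScrollPositionStore` is a hypothetical name, and how the restored offset is applied to the scroll container is left to the component.

```typescript
// Remembers the last scroll offset (in px) for each folder prefix.
class ScrollPositionStore {
  private offsets = new Map<string, number>();

  // Call when leaving a folder, e.g. with the container's scrollTop.
  save(prefix: string, offset: number): void {
    this.offsets.set(prefix, Math.max(0, offset));
  }

  // Call when entering a folder; unknown folders start at the top.
  restore(prefix: string): number {
    return this.offsets.get(prefix) ?? 0;
  }

  // Drop a stale entry, e.g. after the folder contents change.
  clear(prefix: string): void {
    this.offsets.delete(prefix);
  }
}
```

Keeping the store outside the component (module scope or context) means the offset survives the table unmounting during navigation.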

Adaptive Pagination Logic (App.tsx):

const [pageSize, setPageSize] = useState(100);
const [loadedCount, setLoadedCount] = useState(0);

const calculateNextPageSize = (currentLoaded: number) => {
  if (currentLoaded < 100) return 100;
  if (currentLoaded < 500) return 250;
  if (currentLoaded < 2000) return 500;
  return 1000;
};

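const loadNextPage = async () => {
  const nextSize = calculateNextPageSize(loadedCount);
  setPageSize(nextSize);
  // Fetch with nextSize directly: the pageSize state is not updated until the next render.
  // fetchPage is a hypothetical helper that resolves to the newly loaded objects.
  const page = await fetchPage(nextSize);
  setLoadedCount((count) => count + page.length);
};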

Search Debouncing (App.tsx):

// Note: use-debounce is an extra package that is not in the Dependencies list above.
import { useDebouncedCallback } from 'use-debounce';

const debouncedSearch = useDebouncedCallback(
  (searchTerm: string) => {
    // Perform search
    filterObjects(searchTerm);
  },
  300 // 300ms delay
);

const handleSearchChange = (e: React.ChangeEvent<HTMLInputElement>) => {
  setSearchTerm(e.target.value);
  debouncedSearch(e.target.value);
};

Testing Requirements

Performance Benchmarks:

  • Generate test bucket with 10k objects - measure load time <2s
  • Generate test bucket with 50k objects - measure load time <5s
  • Generate test bucket with 100k objects - measure usability
  • Monitor browser memory usage (should stay under 500MB for 50k items)
  • Test scroll performance (60fps on modern hardware)
  • Test on lower-end devices (older laptops, tablets)

Edge Cases:

  • Empty folders render correctly
  • Single-item folders don't show pagination
  • Extremely long file names don't break layout
  • Rapid folder navigation doesn't leak memory
  • Search with 0 results shows appropriate message
  • Interrupted loads are handled gracefully (user navigates away mid-load)

Browser Compatibility:

  • Chrome/Edge (latest)
  • Firefox (latest)
  • Safari (latest)
  • Mobile browsers (iOS Safari, Chrome Android)

Documentation Updates

README.md additions:

  • Document large bucket performance characteristics
  • Explain adaptive pagination behavior
  • List known limitations (e.g., no server-side sort by size/date)
  • Provide tips for managing very large buckets (use folder organization, lifecycle rules)

current-state.md updates:

  • Update "5k+ buckets" validation notes with new capabilities
  • Document performance targets and test results
  • List any degraded features for 100k+ item folders

Future Enhancements (Out of Scope for This Issue)

  • Memory management with windowed data (only keep 5-10k items in memory)
  • Stale-while-revalidate cache pattern for folder listings
  • Backend S3 prefix filtering for search
  • Bulk selection optimization ("Select visible" vs "Select all")
  • Progressive loading of folder metadata (size, object count)
