We can do better though — we'll target rendering a million scrollable thumbnails with the strategy detailed in this post. If we can do that, Photon Transfer won't have any trouble using the same strategy to render 20k thumbnails.
Instead, to get the scrolling performance we need, we'll use the GPU directly with Metal APIs / CAMetalLayer, use a custom NSScrollView subclass to handle translation / magnification, and employ a few other tricks to efficiently shuttle thumbnail data from disk to the GPU.
GridLayer
GridLayer is tasked with determining which thumbnails are currently visible and arranging for the GPU to render them with Metal. For efficiency, GridLayer doesn't tell the GPU to render each visible thumbnail individually; rather, it tells the GPU to render groups of thumbnails, where each group consists of a predefined number of thumbnails (2048 at the time of writing). That's because a group of thumbnails can be efficiently passed to the GPU as a so-called texture array.
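To make the grouping concrete, here is a minimal C++ sketch of how the visible groups could be found. It assumes a layout the post doesn't spell out: thumbnails fill a fixed-column grid row by row, rows have a uniform height, and every consecutive run of 2048 indices forms one group. The names GridLayout, GroupRange and visibleGroups are hypothetical and only serve the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Group size quoted in the post; everything else here is an assumed layout.
constexpr std::size_t kGroupSize = 2048;

struct GridLayout {
    std::size_t thumbnailCount; // total thumbnails in the grid
    std::size_t columns;        // thumbnails per row
    double cellHeight;          // row height in document points
};

struct GroupRange {
    std::size_t first; // first visible group
    std::size_t last;  // one past the last visible group
};

// Returns the half-open range of groups that intersect the vertical span
// [visibleMinY, visibleMaxY) of the document. An empty range is {0, 0}.
GroupRange visibleGroups(const GridLayout& grid, double visibleMinY, double visibleMaxY) {
    assert(grid.columns > 0 && grid.cellHeight > 0 && visibleMaxY >= visibleMinY);
    if (grid.thumbnailCount == 0) return {0, 0};

    const std::size_t rowCount = (grid.thumbnailCount + grid.columns - 1) / grid.columns;

    // Clamp the visible span to the rows that actually exist.
    const double minY = std::max(visibleMinY, 0.0);
    const double maxY = std::min(visibleMaxY, rowCount * grid.cellHeight);
    if (maxY <= minY) return {0, 0};

    // Visible rows, as a half-open range.
    const std::size_t firstRow = static_cast<std::size_t>(minY / grid.cellHeight);
    const std::size_t endRow =
        std::min(rowCount, static_cast<std::size_t>(std::ceil(maxY / grid.cellHeight)));

    // Visible thumbnail indices, then the groups that contain them.
    const std::size_t firstIndex = firstRow * grid.columns;
    const std::size_t endIndex = std::min(endRow * grid.columns, grid.thumbnailCount);
    return {firstIndex / kGroupSize, (endIndex + kGroupSize - 1) / kGroupSize};
}
```

With a million thumbnails this yields 489 groups in total, so even a fully zoomed-out view needs only a few hundred draw submissions rather than a million.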
GridLayer is also responsible for applying the correct translation and scaling transformation to the thumbnails to match the current scroll offset and magnification level of the NSScrollView. This information is provided by AnchoredScrollView, discussed below.
GridLayer uses an LRU cache to hold on to the most recent thumbnail textures, in order to speed up redrawing. As the user scrolls, new thumbnails come into view and are therefore loaded and placed into the cache, thereby evicting the least-recently-used thumbnail textures.
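For readers unfamiliar with the data structure, the following is a generic LRU cache sketch in C++, not the cache GridLayer actually uses. It pairs a linked list (ordered from most to least recently used) with a hash map for lookup; the class name LRUCache and its get/put interface are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>

template <typename Key, typename Value>
class LRUCache {
public:
    explicit LRUCache(std::size_t capacity) : capacity_(capacity) { assert(capacity > 0); }

    // Returns the cached value and marks it most recently used,
    // or nullptr on a cache miss.
    Value* get(const Key& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return nullptr;
        entries_.splice(entries_.begin(), entries_, it->second); // move to front
        return &it->second->second;
    }

    // Inserts or updates a value, evicting the least recently used
    // entry when the cache is full.
    void put(const Key& key, Value value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = std::move(value);
            entries_.splice(entries_.begin(), entries_, it->second);
            return;
        }
        if (entries_.size() == capacity_) {
            index_.erase(entries_.back().first); // drop the oldest entry
            entries_.pop_back();
        }
        entries_.emplace_front(key, std::move(value));
        index_[key] = entries_.begin();
    }

    std::size_t size() const { return entries_.size(); }

private:
    using Entry = std::pair<Key, Value>;
    std::size_t capacity_;
    std::list<Entry> entries_; // front = most recently used
    std::unordered_map<Key, typename std::list<Entry>::iterator> index_;
};
```

In a renderer the key would plausibly be a group index and the value a texture handle, so a redraw that touches the same groups as the previous frame never reloads anything.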
AnchoredScrollView
AnchoredScrollView is an NSScrollView subclass that tweaks NSScrollView's standard behavior: instead of letting NSScrollView handle translating/scaling its document view, AnchoredScrollView keeps the document view anchored to the visible rect of the NSScrollView and transmits the translation/scaling information to the document view, so that the document view can handle translation/scaling itself. Specifically, that means AnchoredScrollView supplies the transformation matrix to GridLayer, and GridLayer supplies the transformation matrix to the GPU.
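As a rough illustration of what such a matrix could look like, the sketch below builds a column-major 4x4 matrix that maps document coordinates to Metal's normalized device coordinates (x and y in [-1, 1], y pointing up). It assumes a flipped document whose y axis grows downward, and the ScrollState struct and function names are hypothetical; the post doesn't show the real matrix.

```cpp
#include <array>
#include <cassert>

// Hypothetical snapshot of what the scroll view reports.
struct ScrollState {
    float offsetX;       // document x of the visible rect's left edge
    float offsetY;       // document y of the visible rect's top edge
    float magnification; // 1.0 means unscaled
};

using Matrix4 = std::array<float, 16>; // column-major, like simd_float4x4

// Maps document points to clip space: the visible rect's top-left corner
// lands on (-1, 1) and its bottom-right corner on (1, -1).
Matrix4 documentToClip(const ScrollState& s, float viewportWidth, float viewportHeight) {
    assert(viewportWidth > 0 && viewportHeight > 0 && s.magnification > 0);
    const float sx = 2.0f * s.magnification / viewportWidth;
    const float sy = -2.0f * s.magnification / viewportHeight; // flip y
    Matrix4 m{};                      // all zeros
    m[0] = sx;                        // x scale
    m[5] = sy;                        // y scale
    m[10] = 1.0f;                     // z passes through
    m[15] = 1.0f;                     // w passes through
    m[12] = -1.0f - s.offsetX * sx;   // x translation
    m[13] = 1.0f - s.offsetY * sy;    // y translation
    return m;
}

struct Point {
    float x;
    float y;
};

// CPU-side check of what a vertex shader would compute for (x, y, 0, 1).
Point transformPoint(const Matrix4& m, Point p) {
    return {m[0] * p.x + m[4] * p.y + m[12], m[1] * p.x + m[5] * p.y + m[13]};
}
```

Because the scroll offset and magnification are folded into a single matrix, scrolling or zooming only changes one small uniform per frame; the thumbnail geometry itself never has to be rebuilt.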
This strategy might raise the question: if we're undoing the translation/magnification behavior of NSScrollView, why use NSScrollView at all?
We still want to use NSScrollView to retain its customary behaviors that users expect on macOS:
The combined 10^6 thumbnails are therefore 57 GB uncompressed, versus 14 GB compressed.
GridLayer uses the -replaceRegion: Metal API to perform this decompression, and does so lazily when the thumbnails come into view (assuming the needed textures aren't already in the LRU cache).
We mmap the giant texture blob into our address space and let the virtual memory subsystem handle shuttling the data between RAM and disk.
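For reference, mapping a file read-only with POSIX mmap can be sketched as below. The MappedFile wrapper and its record accessor are hypothetical, and the fixed-size record layout is an assumption; the post doesn't describe how the blob is organized.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <stdexcept>
#include <string>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Read-only memory mapping of a whole file. Pages are faulted in from disk
// on first access and can be dropped again by the kernel under memory pressure.
class MappedFile {
public:
    explicit MappedFile(const std::string& path) {
        const int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("open failed: " + path);

        struct stat st {};
        if (::fstat(fd, &st) != 0 || st.st_size <= 0) {
            ::close(fd);
            throw std::runtime_error("fstat failed or file is empty: " + path);
        }
        size_ = static_cast<std::size_t>(st.st_size);

        void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd); // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
        data_ = static_cast<const unsigned char*>(p);
    }

    ~MappedFile() { ::munmap(const_cast<unsigned char*>(data_), size_); }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    const unsigned char* data() const { return data_; }
    std::size_t size() const { return size_; }

    // Assumed layout: fixed-size records stored back to back.
    const unsigned char* record(std::size_t index, std::size_t recordSize) const {
        assert(recordSize > 0 && index < size_ / recordSize);
        return data_ + index * recordSize;
    }

private:
    const unsigned char* data_ = nullptr;
    std::size_t size_ = 0;
};
```

The appeal of this approach is that no explicit read calls or buffer management are needed: taking a pointer into the mapping is enough, and only the pages that are actually touched ever occupy RAM.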
Instead of decompressing and loading our thumbnail textures on the main thread, we could load them on a background thread and display a placeholder until each texture is loaded. This would prevent the stutters that occur when loading data on the main thread.
This could be further improved by displaying placeholder content that matches the color tone of the thumbnail that's about to appear. (This placeholder color would be stored in the giant texture blob alongside the compressed image data.)
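One way this idea could be structured is sketched below, using std::async as a stand-in for whatever threading mechanism a real implementation would choose. The BackgroundLoader class, the Pixels type and the tryGet/waitFor names are all hypothetical. The render thread calls tryGet, which never blocks: a null result means "draw the placeholder this frame".

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <future>
#include <unordered_map>
#include <utility>
#include <vector>

using Pixels = std::vector<std::uint8_t>; // stand-in for a decoded texture

class BackgroundLoader {
public:
    using LoadFn = std::function<Pixels(std::size_t)>;

    explicit BackgroundLoader(LoadFn load) : load_(std::move(load)) {}

    // Call from the render thread only. The first request for an id starts a
    // background load and returns nullptr; later calls return the pixels
    // once the load has finished, and nullptr until then.
    const Pixels* tryGet(std::size_t id) {
        auto it = loads_.find(id);
        if (it == loads_.end()) {
            loads_.emplace(id, std::async(std::launch::async, load_, id).share());
            return nullptr; // caller draws a placeholder
        }
        if (it->second.wait_for(std::chrono::seconds(0)) != std::future_status::ready) {
            return nullptr; // still loading
        }
        return &it->second.get(); // reference stays valid while the entry exists
    }

    // Blocks until the given load finishes (useful for tests, not for rendering).
    void waitFor(std::size_t id) {
        auto it = loads_.find(id);
        if (it != loads_.end()) it->second.wait();
    }

private:
    LoadFn load_;
    std::unordered_map<std::size_t, std::shared_future<Pixels>> loads_;
};
```

A tone-matched placeholder would fit naturally into this shape: the color could be read synchronously (it is tiny) while the full texture goes through the asynchronous path.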
To maximize the chance of a cache hit when requesting thumbnails on the main thread, a background thread could load the thumbnails for regions that are adjacent to the visible region of the scroll view.
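Choosing what to prefetch can be as simple as walking outward from the visible range. The sketch below assumes the thumbnails are addressed in groups, as GridLayer does, and returns the neighboring group indices nearest first; the function name and the margin parameter are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Given the half-open range of visible groups [firstVisible, endVisible),
// returns up to `margin` groups on each side, ordered nearest first so the
// background thread loads the most likely candidates before the others.
std::vector<std::size_t> prefetchCandidates(std::size_t firstVisible,
                                            std::size_t endVisible,
                                            std::size_t groupCount,
                                            std::size_t margin) {
    std::vector<std::size_t> candidates;
    for (std::size_t distance = 1; distance <= margin; ++distance) {
        const std::size_t after = endVisible + distance - 1;
        if (after < groupCount) candidates.push_back(after);
        if (firstVisible >= distance) candidates.push_back(firstVisible - distance);
    }
    return candidates;
}
```

A refinement would be to bias the order toward the current scroll direction, since the groups ahead of the user are more likely to be needed than the ones just left behind.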