Where the 2 ms went: profiling MetaWave on M4 Max
An annotated trace through Instruments. We walk through the wavelet pass, the entropy stage, and the one stupid memory copy we kept missing for three releases. With pictures.
Posts about Metal compute, wavelet math, what Apple Silicon actually does well, and what we got wrong on the way. Written by the people who write the code.
7,856 FPS on FullHD decode. The headline number is fine. The interesting part is what didn't change between M3 Max and M4 Max, and why the bandwidth jump matters less than you'd think.
A short rant about JPEG 2000 in PACS. The codec isn't slow. The implementation choices around it are. We went and watched a clinical workflow for a day. Here's what we learned.
Tile memory is the difference between a 2× speed-up and an 8× speed-up. We share the kernel layout we use for the 5/3 wavelet, and the three layouts we tried first that didn't work.
The latency budget for spatial video is brutal. We explain how we fit JPEG 2000 decode inside a frame on visionOS, and why we still think it's the better long-term codec for archival.
No PCIe bus. No memcpy between CPU and GPU. We measured the real-world impact for a streaming codec workload: it's not the bandwidth that wins, it's the lack of round-trips.
Written from the inside of two submissions. What the reviewers actually want. What you don't need to overthink. The bits where we got it wrong the first time.
We tried the ANE for RGB↔YCbCr because everyone said we should. Mixed results. Here's the honest breakdown of when it helps and when it just costs you a context switch.
A short field report from a digital pathology team. They replaced a small server rack with a single Mac Studio. We kept asking "is this really fine?" The answer kept being yes.
A founder note on the decision to abandon platform portability. The trade-offs are real. The benchmark wins were worth it. Some customers walked away. We can live with that.