Interesting... haven't done that for a while. Have you tried in-memory with ZFS delayed writes?
Honestly, I haven't set up ZFS in, I wanna say, about 5 years. Best thing to ever come out of Solaris. But this machine is my "dev machine," so it gets scrambled all the time; probably worth trying.
ZFS shows some interesting performance trends on high-memory-bandwidth, high-core-count mobos... especially when you can delay writebacks, if the bottleneck is the SSD/NVMe.
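If you want to try it, the Linux OpenZFS knobs I'm thinking of are zfs_txg_timeout and zfs_dirty_data_max (those are real module parameters; the values below are just illustrative, not recommendations). Quick Python sketch to peek at them:

```python
#!/usr/bin/env python3
"""Peek at (and optionally bump) OpenZFS writeback-delay tunables on Linux.
Values below are illustrative starting points, not recommendations."""
from pathlib import Path

PARAMS = Path("/sys/module/zfs/parameters")

# zfs_txg_timeout: seconds ZFS batches dirty data before forcing a txg sync
# (default 5). zfs_dirty_data_max: cap on dirty data held in RAM, in bytes.
# Raising both lets more writes coalesce in memory before hitting the SSD.
TUNABLES = {
    "zfs_txg_timeout": "30",
    "zfs_dirty_data_max": str(4 << 30),  # 4 GiB; assumes RAM to spare
}

for name, value in TUNABLES.items():
    node = PARAMS / name
    print(f"{name}: current={node.read_text().strip()}, proposed={value}")
    # node.write_text(value)  # uncomment to apply (root; resets on reboot)
```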
So for a single mobo, a must! For cluster backends it depends a lot more... it turns into a DDoS game (and hinges on whether the fabric is IP-based or RDMA).
With ASIC cards and SAS gear you can overcome some of these problems, but in-memory ZFS is getting seriously interesting, especially compared against PCIe speeds.
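If you want to poke at the memory side without dedicating hardware, a throwaway pool on tmpfs is enough to see it. Rough sketch (OpenZFS on Linux assumed; the pool and file names are made up for the example):

```python
#!/usr/bin/env python3
"""Spin up a disposable ZFS pool on a tmpfs-backed file vdev, to watch the
filesystem behave when the "disk" is RAM. Assumes OpenZFS is installed and
/dev/shm is tmpfs."""
import subprocess

VDEV = "/dev/shm/zfs-ram.img"  # hypothetical backing file
POOL = "ramtest"               # hypothetical pool name

# File vdevs must be at least 64 MiB; 2 GiB leaves room to benchmark.
subprocess.run(["truncate", "-s", "2G", VDEV], check=True)
subprocess.run(["zpool", "create", POOL, VDEV], check=True)
subprocess.run(["zpool", "status", POOL], check=True)
# When done: zpool destroy ramtest && rm /dev/shm/zfs-ram.img
```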
That's def worth looking into. Even with every tweak I could throw at it, I ran into a physical bottleneck: the SSD can only eat data so fast. It's pegged at 100% util and "still going to take for fucking ever".
Depending on the SSD, you can also tune the queue depth and block size to get the best out of it.
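On Linux most of those knobs live under sysfs. A little sketch to dump the ones worth checking before a bulk write (the device name is a placeholder, point it at your actual SSD):

```python
#!/usr/bin/env python3
"""Dump the Linux block-queue knobs worth checking before a big bulk write."""
from pathlib import Path

DEV = "nvme0n1"  # hypothetical device name
queue = Path(f"/sys/block/{DEV}/queue")

for knob in ("scheduler", "nr_requests", "max_sectors_kb",
             "optimal_io_size", "logical_block_size", "rotational"):
    node = queue / knob
    if node.exists():
        print(f"{knob:20} = {node.read_text().strip()}")

# Rule of thumb: "none" scheduler for NVMe, and size your writes to a
# multiple of optimal_io_size when it reports nonzero.
```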
NVMe drives are better because of their RAM cache and the controllers' extra "atomic" features, but enterprise SAS drives implement many more atomic commands, which, when paired with SAS controllers, drops latency at high IOPS quite a lot. So it depends on what you have...
The new BIG NVMes are a different beast, I'm still exploring... they're almost a computer in their own right! PCIe over those is going to be a whole new game...