PL4 GPU benchmarking?

Task Manager in Windows is not accurate for GPU usage; it shows near 0% when running DeepPRIME.
Afterburner, which is more accurate, shows around 50% usage for the RTX 3080, but power draw is around 80% of the limit.
I also need at least 2 simultaneously processed images to reach max Mpix/sec.
I’ll do some more tests to see if I can reach higher speed by changing some parameters.

That makes sense. I’ve just fired up a test run with Afterburner monitoring, and that’s showing GPU spikes up to around 65% in places. GPU usage definitely looks like it’s in short bursts, so I’m guessing there’s still room for optimization somewhere, whether in terms of hardware or software.

I was planning on upgrading from the 6700K when Zen 3 ships, and going up from 16 to 32GB RAM as well, so I guess it’ll be interesting to see if a faster CPU with more cores can help keep the GPU loaded, or whether I’m purely GPU bound by the 1080Ti at this point.

I added times for my Surface Book 2 notebook w/ GTX1050, both for running on AC and on battery. Not surprisingly, times are better on AC (D850 images 196 sec, Egypte image 64 seconds) than when on battery (D850 images 252 sec, Egypte image 89 seconds). The integrated Intel 620 GPU processor is marked with an asterisk by DxO (i.e., don’t use); sure enough, it creates errors when trying to run.

This was mentioned in the German DSLR-Forum. So some of the forum names are from there.

Thanks; interesting. Google Translate is quite useful.

A Mac check:

I have an iMac Pro (Base) with a Radeon Pro Vega 56 (8GB) and an Intel Xeon W CPU (8 core, 3.2GHz). Processing a very noisy high-ISO Sony A7RM4 image (61.7 MB) took 20 seconds, about 3 MB/sec. I am used to running batches with Prime so I just walk away and let them run. But the GPU performance for Deep Prime is substantially better than what I used to see.

One note: in my experience, Deep Prime is valuable for high ISO noisy A7RM4 images only - no improvement at all in good light at ISO 100. But, I also have lots of old images shot with Canon Digital Rebel cameras, and these really benefit from the Deep Prime processing.

David

Hi
It can also help in recovering shadows

I added my results to the spreadsheet. Surprised to see that no other Mac user has shared results so far.

I’ve added my results to the spreadsheet.

To use the GPU I had to expressly select it in the Preferences, as I’d already established that the Auto setting for Preferences / Performance / Deep Prime Acceleration resulted in the GPU not being used.

During the EA Beta testing for PhotoLab 4 I was advised that forcing the use of this GPU might cause errors. So far, I’ve exported tens of images with DeepPrime NR (using the EA and commercial releases) without issue.

Is that 61.7 megaBYTES and 3megaBYTES/sec, or are those megaPIXELS?

Added results from my 2013 Mac Pro with 3.3GHz 8-core Xeon, 1TB SSD, 64GB RAM and dual FirePro D500 3GB GPUs. Too bad PhotoLab 4 can’t use both of the GPUs at once. With these images as well as my own, CPU and GPU times have been identical or very nearly so. My MP/s calculation is based on actual pixel dimensions, not camera spec. The D850 wedding images are 33MP, not the camera’s 45MP max - must have been cropped.

Thinking my 8-core CPU might manage better average times if I fed it more images (to utilize all available cores and threads), I downloaded the next 11 images from PhotographyBlog (those are all 45MP) and exported all 16 images at once. Again, CPU and GPU results were identical, but this time MP/s was 0.69, down from 1.5 for just the 5 wedding images. Looks like the larger 45MP images reduce MP/s performance by more than half. Ouch.
Interestingly, when running GPU, the load was handed off from one to the other partway through. Each ran consistently at 86% while under load. When running CPU, the CPU load was very spiky, remaining low most of the time and peaking up to 1100% for 5 seconds about every 30-40 seconds, all while the GPU continued to crank along at 86% with the load again switched from one GPU to the other partway through. Huh.

My conclusion is that larger batches aren’t handled any more efficiently by the 8-core CPU, and the larger 45MP images of birds substantially reduced the MP/s performance. I’m also left to wonder why the GPU is working just as hard when PL4 is set to CPU-only as when it’s set to GPU-only. The CPU seems to add virtually nothing.

Hi there,
It looks strange to me that export times are absolutely identical between CPU and GPU…
Just to be sure, after you have changed your settings for DeepPRIME, did you reboot PhotoLab? Otherwise, any change will not be taken into account…

Steven.

Feel free to open a feature request about this. You’re not the first one with such a hardware configuration, so this may get enough votes for DxO to consider it.

Um, no, I didn’t reboot. Doh!

…then it’s time to run another test I guess :wink:

It’s megabytes. The A7RM4 has a 61-megapixel sensor, and the output ARW file is a compressed RAW file; the uncompressed file is larger.

I checked my (2019) MacBook Pro 16 (Radeon Pro 550M) and it does it in 32 seconds. But I use the iMac Pro for most of my post-processing.

David

OK. Disregard everything I wrote above. I’ve retested, relaunching PL4 every time I switched DP prefs, and I’ve updated my spreadsheet entry. The upshot: Whether processing the five 33MP wedding shots, the mix of those with the next 11 45MP shots from the PhotographyBlog gallery, or just the one 51MP Egypte image, I got roughly 0.66MP/s running DeepPRIME on one FirePro D500 GPU. DP with GPU took 1.60x-2.37x longer than PRIME.

Hi. I’ve added my 2010 Mac Pro figures. CPU times are unsurprisingly slow but GPU times are 14.5 times faster. Using a Radeon RX580 8GB.

I’ve just added my results from my 2018 MacBook Pro with and without a Vega 64 eGPU.
I also added my Windows machine with a Vega 64 GPU and a GTX 970 (just swapped out for the Vega 64) and an (ancient) i7-6700K.

I added the DeepPRIME times from my Windows 10 computer: Intel i7-4790K, 32GB RAM, Nvidia GeForce GTX 1050Ti.

The D850 images batch export took 137 seconds (1.22 MPx/s), the Egypte image export took 47 seconds (1.09 MPx/s). No surprises here.
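
For anyone who wants to sanity-check their own numbers before adding them to the spreadsheet, here is a minimal sketch of the MPx/s arithmetic as it appears to be used in this thread: total megapixels exported divided by export time in seconds. The function name is made up for illustration, and the image sizes (five wedding images at roughly 33 MP each, the Egypte image at roughly 51 MP) are the nominal figures quoted in earlier posts, not exact pixel counts.

```python
def throughput_mpx_per_s(image_megapixels, export_seconds):
    """Return batch throughput in megapixels per second.

    image_megapixels: size of each exported image, in megapixels
    export_seconds:   wall-clock time for the whole export, in seconds
    """
    if export_seconds <= 0:
        raise ValueError("export_seconds must be positive")
    return sum(image_megapixels) / export_seconds


# Nominal sizes taken from earlier posts in this thread (assumed, not exact).
egypte = throughput_mpx_per_s([51], 47)
d850_batch = throughput_mpx_per_s([33] * 5, 137)

print(f"Egypte:     {egypte:.2f} MPx/s")
print(f"D850 batch: {d850_batch:.2f} MPx/s")
```

With those nominal sizes the Egypte figure comes out at 1.09 MPx/s, matching the post above, and the D850 batch at about 1.20 MPx/s, slightly under the 1.22 reported. My guess is that the gap comes from the actual pixel dimensions being a little above 33 MP, so use the real width × height of your files if you want the numbers to line up exactly.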