PL4 GPU benchmarking?

dma · October 21, 2020, 8:26pm

I just upgraded from PL3 to PL4. I shoot a lot of high ISO images so DeepPRIME performance is important to me.

So is there a particular GPU benchmark that we should be looking at to predict performance with the new DeepPRIME noise reduction?

Is DeepPRIME performance expected to be driven by the OpenCL, OpenGL, CUDA, etc. performance of a given card? Which library applies?

Which characteristics of a GPU card matter most: memory amount, memory speed, GPU speed, PCI bus bandwidth, etc.?

I currently have an Nvidia 1060 6GB. What would be a cost-effective step up?

StevenL · October 21, 2020, 8:34pm

Hi there, and welcome to our forum!

Here you go:

PS: with your GPU you should expect about 1 second per 2 Mpx, meaning that exporting a 20 Mpx image shouldn’t take longer than 10-12 seconds. Of course, with a more powerful graphics card, you’ll be able to export your images even faster. Have a look at the two links posted above.

Cheers,
Steven.

WitheringtonM · October 21, 2020, 10:36pm

My camera shoots 36 MP images, and with DeepPRIME applied it takes around 13 seconds to process a high ISO image with preferences set to Auto. I have an Nvidia 1660 Ti and a latest-generation i7 running Windows 10. How many MP are you processing?

uncoy · October 22, 2020, 6:23am

With a Radeon VII on Mojave, I get about twelve to fifteen seconds for D850 ISO 16000 images (45 MP). That’s 1 second per 2 to 4 MP. The lower count is for images which have been somewhat cropped, reducing the MP count. So about 3 MP per second. Very impressive compared to the more than one minute per image, sometimes close to two, which PRIME under PhotoLab 3 managed.
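For anyone who wants to compare numbers the same way, here is a minimal sketch of the arithmetic, using only the figures quoted above (a 45 MP file in 12 to 15 seconds). The function name is just an illustration, not anything from PhotoLab.

```python
def throughput_mp_per_s(megapixels: float, seconds: float) -> float:
    """Return processing speed in megapixels per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return megapixels / seconds


# Figures quoted in the post: an uncropped 45 MP D850 file in 12 to 15 seconds.
fast = throughput_mp_per_s(45, 12)
slow = throughput_mp_per_s(45, 15)
print(f"{slow:.2f} to {fast:.2f} MP/s")
```

Cropped files contain fewer megapixels, so the same export time works out to a lower MP/s figure.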

dma · October 22, 2020, 6:31am

I downloaded the 4 test images located here:

Those images total 183 MB.

I loaded the 4 into PL4 with DXO Standard settings (default) with DeepPRIME noise reduction turned on.

I exported them to disk, reading images from SSD and writing to SSD. I have a Ryzen 2700x CPU and an Nvidia 1060/6GB GPU.

The export took 62 seconds in total. This included a startup delay of 5 seconds or so while the pipeline was being loaded. So I am averaging about 3 Mpx per second, which seems similar to the others who have posted here. I can see with the GPU-Z application that my GPU is definitely getting used, occasionally maxing out at 100% utilization.

As a second test, I downloaded this image:
https://www.dxo.com/media.dxo.com/photolab/deepprime/raw/Egypte---copyright-Corinne_Vachon.raf
It is 111 MB in size. DXO Standard + DeepPRIME took 35 seconds to process that one image, also yielding about 3 Mpx per second.

Will a faster GPU make a difference or am I limited elsewhere in my setup?

StevenL · October 22, 2020, 9:42am

A faster GPU will definitely speed up the rendering process!

kokofresha · October 22, 2020, 10:40am

@StevenL,

Nvidia GeForce RTX 3080 or Nvidia Quadro RTX 4000?
Which card will be better now and for the next few years according to the developers?
Which video card resource is most important for PhotoLab? CUDA cores, memory or something else?

StevenL · October 22, 2020, 10:59am

Hi there,
I’ll transfer your question to our dev teams and I’ll let you know.

Thanks!
Steven.

Savay · October 22, 2020, 12:03pm

I also use a 2700X but with a Sapphire Vega64 Nitro+.

With the same 4 RAWs and preset (DXO Standard + DeepPRIME) I get around 33-35 sec. (From one M.2 SSD to another M.2.)

The 111 MB GFX50 RAW by itself takes around 16-18 sec.

Since the Vega is a very different µArch the comparison is a bit off, but it seems nonetheless that the raw FP32 power of the cards translates quite well into an almost equal performance gain.

It would be interesting, though, to know whether the use of FP16 could speed this up on certain GPUs, and also whether Tensor Cores get utilized.

StevenL · October 22, 2020, 12:10pm

@kokofresha
OK, here you go. Have a look at this chart to get a direct comparison: https://www.videocardbenchmark.net/compare/GeForce-RTX-3080-vs-Quadro-RTX-4000/4282vs4053

A GPU with more computational power will translate into better DeepPRIME performance.
You can find more info in our FAQ here:

Cheers,
Steven.

dma · October 22, 2020, 1:50pm

Interesting GPU performance blog here:
https://blog.inten.to/hardware-for-deep-learning-part-3-gpu-8906c1644664

Note the FP32 chart about halfway into the blog. It lists my lowly NVIDIA 1060 and many other GPUs against their FP32 performance. If DeepPRIME does in fact scale linearly with FP32 performance, then this chart can help users determine the relative performance of their cards vs possible hardware updates.

It would be helpful to have additional verification from DxO on how closely DeepPRIME truly tracks FP32 performance.

Another interesting site for GPU benchmark comparisons is here:
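As a rough sketch of what "linear with FP32" would mean in practice: the estimate below assumes export time is inversely proportional to FP32 throughput, which is exactly the unverified assumption above. The TFLOPS values are placeholders for illustration, not real card specs; substitute numbers from the chart.

```python
def estimate_seconds(measured_seconds: float,
                     current_tflops: float,
                     candidate_tflops: float) -> float:
    """Scale a measured export time by the ratio of FP32 throughput.

    Assumes time is inversely proportional to FP32 TFLOPS, which has
    not been confirmed for DeepPRIME.
    """
    if current_tflops <= 0 or candidate_tflops <= 0:
        raise ValueError("TFLOPS values must be positive")
    return measured_seconds * current_tflops / candidate_tflops


# 35 s is my measured time for the 111 MB test image.
# 4.0 and 8.0 TFLOPS are placeholder values, not real specs.
estimate = estimate_seconds(35.0, 4.0, 8.0)
print(f"Estimated export time: {estimate:.1f} s")
```

Any fixed overhead (pipeline startup, disk I/O, CPU-side corrections) would not shrink with a faster GPU, so real gains are likely to be smaller than this estimate.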

IanS · October 22, 2020, 4:39pm

Here is a comparison of GPUs for use with Topaz Sharpen AI in stabilise mode for a Nikon D850 image (DPR Forum). The timings are rough because not everybody used the same image file, but I would expect something similar with DXO. Could a similar exercise be performed on this forum?

Topaz Sharpen GPU figs

KameraD · October 22, 2020, 4:46pm

Can confirm this, although without a stopwatch and only by feel. Export is definitely faster than with PL 3.3.

uncoy · October 22, 2020, 5:05pm

Though everyone else is talking about Nvidia here, recent versions of Apple OS X only run on AMD cards. Radeon VII rocks on PhotoLab 4 with about 15 seconds per image with many corrections and DeepPRIME (D850, whether cropped to about 30 MP or almost all 45 MP). Finally some hardware acceleration.

Not sure how Radeon cards do on Windows.

mwsilvers · October 22, 2020, 5:25pm

Of course it’s not surprising that Nvidia dominates the conversation since I believe their market share is somewhere around 80% and most people here are probably using one. But it would be great to get more feedback on the performance of various Radeon models on both Macs and PCs.

Mark

jch2103 · October 22, 2020, 5:32pm

Some kind of community-generated benchmark for various GPUs would be useful. But it should be based on a standard set of images available to everyone. I’m not sure if there’s an appropriate set of software to capture this information (including CPU, memory, etc.), either. I’ve tried hand timing results from PL4 but my precision isn’t very good.

IanS · October 22, 2020, 7:04pm

Simply agree which files to download from a web site like

A large file like a Nikon D850 will extend the process time and improve accuracy.

If you use, say, 5 files and simply batch process them with only the default DXO correction and DeepPRIME enabled, then DXO gives a readout of how long it took to produce the batch.

People can then report the time and CPU/GPU used.
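A minimal sketch of how reported times could be tabulated once collected. The record layout is hypothetical, and the two sample rows reuse single-image figures already posted in this thread (36 MP in 13 s, 45 MP in 15 s), which were not measured on a common file set, so they are only there to show the format.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """One user-submitted timing (hypothetical record layout)."""
    user: str
    gpu: str
    megapixels: float  # total megapixels processed
    seconds: float     # time reported by PhotoLab

    @property
    def mp_per_s(self) -> float:
        return self.megapixels / self.seconds


def rank(reports):
    """Sort reports from highest to lowest throughput."""
    return sorted(reports, key=lambda r: r.mp_per_s, reverse=True)


# Sample rows taken from earlier posts in this thread.
reports = [
    Report("WitheringtonM", "Nvidia 1660 Ti", 36, 13),
    Report("uncoy", "Radeon VII", 45, 15),
]

for r in rank(reports):
    print(f"{r.gpu:<16} {r.mp_per_s:.2f} MP/s  ({r.user})")
```

With an agreed file set, the megapixel total would be the same for everyone and only the seconds column would differ.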

IanS · October 22, 2020, 7:12pm

DXO will report the time like this:
DXO V4 DeepPrime times

Savay · October 22, 2020, 7:54pm

Well, I’ve already given some numbers for a Radeon RX Vega64 under Windows 10 a little bit earlier.
24 images of 32 MPix with my standard preset + DeepPRIME are finished in around 180 seconds.
So it runs somewhere between 3-5 MPx/sec depending on the files, batch length and corrections applied.

The CPU does still seem to play a role though, since I’ve seen numbers from a 32C Zen2 Threadripper with a GTX1660 crunching a single 50 MPix file in 8-10 seconds.

dma · October 22, 2020, 7:54pm

I happen to have a RX580/8GB card here left over from my bitcoin mining days. I repeated the test on the Egypte 111MB image.

https://www.dxo.com/media.dxo.com/photolab/deepprime/raw/Egypte---copyright-Corinne_Vachon.raf
DXO Standard+DeepPrime

Windows 10
AMD Ryzen 2700x CPU, RX580/8GB took 29 secs to output.
AMD Ryzen 2700x CPU, Nvidia 1060/6GB took 35 secs to output.
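
To put the two timings side by side, a quick sketch of the comparison (the helper name is just for illustration):

```python
def compare(baseline_seconds: float, other_seconds: float):
    """Return (speedup factor, fraction of time saved) versus the baseline."""
    speedup = baseline_seconds / other_seconds
    time_saved = 1 - other_seconds / baseline_seconds
    return speedup, time_saved


# Nvidia 1060/6GB (35 s) as baseline, RX580/8GB (29 s) as the other card.
speedup, time_saved = compare(35, 29)
print(f"{speedup:.2f}x faster, {time_saved:.0%} less time")
```

This is a single run on a single image, so the difference should be read as indicative only.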