PL4 GPU benchmarking?

The results you mentioned are not conflicting!
One run was performed on the GPU, the other on the NPU (aka the AI accelerator, aka the 16-core Apple Neural Engine or ANE, with its ~11 TOPS, which equals ~11 TFLOPS for fp16). Hence also the difference in the coloured cells in the “GFLOP” rows.

All M1 chips (and also the A14) have the same 16-core ANE. The Max and the Pro even have the same CPU.
Only the memory controller differs slightly, so it’s no wonder those two perform basically the same while being a tad faster than a regular M1.

In fact, the presence of the same NPU in every A14/M1 derivative suggests that even an iPhone 12 or an iPad Mini could theoretically perform in almost the same ballpark (lower power limit, fewer CPU cores and weaker memory subsystem aside).

This also suggests that DeepPRIME itself could in theory run reasonably well on some ARM SoCs used in Android devices, Chromebooks or Windows on ARM, since those have pretty strong dedicated AI/ML accelerators too (especially the Snapdragons and Exynos).
Some of the newer ones are almost twice as fast as Apple’s (24-26 TOPS).

I added my results for my new PC, which has an i5-12600K (just using the built-in UHD 770 graphics), with Windows 10 and 11.

Windows 10 was not using the performance cores, only the E-cores, and as a result took a painstaking 337s to process the ‘Egypte CPU only’ test, whereas Windows 11 used all the cores and did it in 72s.

So on the new 12th-gen Intel chips it is worthwhile upgrading to Windows 11 if you are using only the CPU to process images.
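To put the two timings quoted above into perspective, here is a minimal Python sketch of the arithmetic. The variable names are only illustrative, and the numbers are the single-run figures from this post, not averaged measurements.

```python
# Timings quoted in the post above (seconds, 'Egypte CPU only' test).
win10_seconds = 337  # Windows 10: only the E-cores were used
win11_seconds = 72   # Windows 11: all cores were used

# How many times faster the Windows 11 run was.
speedup = win10_seconds / win11_seconds

# Share of the original processing time that was saved.
time_saved_pct = (1 - win11_seconds / win10_seconds) * 100

print(f"Windows 11 was about {speedup:.1f}x faster "
      f"({time_saved_pct:.0f}% less time)")
```

This only compares the two runs reported here; other images or other 12th-gen chips may behave differently.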

Is this still the latest thread for this benchmark? I just added my results with DeepPRIME for the Egypte 111MB image using the PhotoLab 6 trial under Windows 10.

GPU (Radeon RX 470 with 4 GB) - 41 seconds
CPU (Intel i7-4770K with 16 GB) - 208 seconds

I just added my results on a 4080/16. I’m surprised to see how close it is to the 4090 in these benchmarks but not sure how my other hardware may be impacting the scores. Would love to see more 4080 and 4090 along with the new AMD 7900 GPUs.

I have added four rows to the spreadsheet with results for my new Apple MacBook Pro 14 (10C CPU, 16C GPU, 16C ANE). The combinations are DeepPRIME and DeepPRIME XD with the acceleration setting at either Apple Neural Engine or GPU Apple M1 Pro. It is obviously worth changing the setting from the default (GPU) to ANE.

Hello,
I updated the graph I made (see here) showing processing speed with DeepPRIME as a function of GPU FP32 score.
I have added the new results for PhotoLab 4, 5, 6, and 6 with DeepPRIME XD.

[Graph: DeepPRIME processing speed as a function of GPU FP32 score]

Please don’t forget to complete the Google spreadsheet with your own results.

Regards,

There’s definitely a good-enough point where PhotoLab works just fine for live editing and export speeds reach 10 to 15 seconds per image. One doesn’t have to buy the most expensive graphics card. Almost any card with 8GB VRAM will put a user in the top bracket, and many of the 4GB cards will do fine too. Sorry, I forgot that some people want to use the very slow Prime XD function to generate artificial textures on large batches of images. I’m sticking to Prime.

Looking more closely at the Google spreadsheet, it looks like the arbitrary cutoff point for good Prime processing and no slowdown in real-time previews would be an NVIDIA GTX 980/1060 or an AMD 480X/580X. Those are all old cards and available for €200 or less. Trying to use DxO PhotoLab 5 or 6 without a card in this category, or close to it, is somewhat masochistic.

I recently upgraded my PC (built a new one):

  • WIN 11 PRO
  • PL6, v6.2.0
  • Asus PRIME Z790-P WIFI
  • i7-13700KF, 16 cores
  • 32 GB DDR5
  • Asus NVIDIA GeForce RTX 3060, 4095 MB GDDR6

Tested with test image “Egypte” to jpg, no resizing, DxO Standard preset.

  • High Quality 5 seconds
  • Prime 18 seconds
  • Deep Prime 5 seconds
  • Deep Prime XD 14 seconds

Generally, this is a good result compared with my earlier experiences. I have been able to export several images at the same time while still editing the next ones without problems. PL can now be used economically in my workflow. I hope this computer will cope with software improvements a long way into the future.

I am curious why the legacy PRIME seems to be slower than the newer modes. I gladly avoid choosing it any more and feel no need to use it.

Unlike DeepPRIME and DeepPRIME XD, PRIME makes no use of your graphics card’s GPU. The only reason that PRIME still exists is for those people whose GPU is not supported for DeepPRIME and DeepPRIME XD processing. In that case, PRIME would be a much faster option, especially on an older and slower computer.

Mark

It must be something like that.
For me it is good to know that Deep Prime is as fast as High Quality and can be used for any image unless XD is wanted.
I am really happy with this now!

Good Morning,

I want to add my results to the table, but I can’t download any of the images.
Does anyone have an archive of all the images?

Conny

I also couldn’t download the Nikon images in Chrome or Edge, so I just opened them all in new tabs and changed the extension from .jpg to .nef in the address bar, and they downloaded.

I’m sure some are interested (as was I), so I bit the bullet and got an Intel Arc A750, as in some Topaz benchmarks it’s really good.
All I can say is don’t bother with it yet for PL, or better yet, stay away.

Tested on the newest PL version and compared to an AMD 5600 XT in the same system, the Arc A750 was very slow where it did work, at least 50% slower.
I didn’t even let it complete the DeepPrime or DeepPrime XD batches, as it just paused after the first file and nothing was happening.

What is worse, single-file exports, or the first files from a batch that did complete, sometimes came out as mostly black JPGs.
All in all, either DxO needs to add support or Intel needs to fix their drivers, but as of right now, with the latest “stable” Intel drivers, the card doesn’t work for PhotoLab DeepPrime or DeepPrime XD.

@MA57k sorry to hear about your Arc A750 woes. Elsewhere the black JPGs have been attributed to old NVIDIA drivers, where users have NVIDIA cards (obviously, sorry).

I am currently in the process of upgrading to a Ryzen system, but with an NVIDIA RTX 3060 card. The processor I chose was a 5600G, and I was surprised that it could process images in ‘DP’ and ‘DP XD’, albeit slowly.

PS. This is repeating the run with 3 simultaneous copies, and it shows an issue I have seen throughout my performance testing, with the Nikon small batch in particular.

The timings should be similar if not better, but if you watch the thumbnails while the export is taking place with this batch, then between the third and fourth image and/or the fourth and fifth image DxPL seems to become confused and “stalls”!?

Currently this is DxPL(Win) 6.1.0 running (I will upgrade this particular copy to PL6.3.1 later today), but 25 seconds becomes 45 seconds, even though the build on the machine is lightweight and nothing is running in the background!? It is as if, with 3 or 4 copies, when it can no longer assign multiple executions to the “worker” stacks, it loses focus!

With this batch it starts with one image, and when that is complete it starts 2 simultaneously; then it would move to three, then four, etc., but by the time it gets to 3 images there are not enough images left to run 3 simultaneous exports.
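The ramp-up described above can be sketched as a toy model in Python. This is only an illustration of the behaviour as observed in this post, not DxO's actual scheduler: the function name `ramp_up_waves` is hypothetical, the batch size of 5 is an assumption, and real exports start as soon as a worker is free rather than in strict waves.

```python
def ramp_up_waves(n_images: int, max_parallel: int) -> list:
    """Return how many images are exported in each successive wave.

    Toy model (an assumption, not DxO's real scheduler): wave k tries to
    run k exports at once, capped by the 'simultaneous images' setting
    and by the number of images still left in the batch.
    """
    waves = []
    remaining = n_images
    width = 1  # the first wave exports a single image
    while remaining > 0:
        batch = min(width, max_parallel, remaining)
        waves.append(batch)
        remaining -= batch
        width += 1  # try one more parallel export in the next wave
    return waves


# Assumed small batch of 5 images with 3 simultaneous exports allowed:
# the third wave only has 2 images left, so it never reaches 3 in parallel.
print(ramp_up_waves(5, 3))

# A longer batch does reach the full width before running out of images.
print(ramp_up_waves(10, 3))
```

Under these assumptions a short batch spends most of its time below the configured parallelism, which would fit the observation that more simultaneous copies do not help the small batch.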

With longer batches if it suffers this problem it will be at the end of the batch so I need to watch to see if that is happening, other nerdy activities I could undertake are …

New Intel drivers were released today and the issue is the same: Arc is still not working with DeepPrime XD. It just outputs a black frame, and within PL the DeepPrime XD preview also goes black.

I’d like to contribute my results…

Mac Studio on macOS 13.4 (Ventura), 64GB of RAM. This system has 10 CPU cores and 32 GPU cores.
For what it’s worth, I’m on a 32" display running at 2560x1440 to minimize the scaling issue on macOS.
(All with “DxO Standard” selected and then DeepPRIME XD applied; no changes or modifications to any setting or slider.) I’m processing 2 images at a time. I’m using PL v6.7 build 52. This build allows me to select the ANE, which I’ve done for these results. I note that there is no easy way to record this in the spreadsheet.

Egypt: 25s
Nikon: 73s
Canon 90D: 263s
Canon R6: 167s

Using just DeepPRIME (no XD) I get:

Egypt: 9s
Nikon: 21s
Canon 90D: 67s
Canon R6: 57s

I knew that Egypt on CPU would be slow, and it is, but it looks like the Mac/Mx algorithm is single-threaded for XD, as it was using <3% of the system while running.

I recently upgraded my computer from an i7-3770K (16GB)/GTX 950 (2GB) to an i7-13700 (32GB)/RTX 4070 (12GB). Old computer times from the spreadsheet used PL4.0; new computer times use PL6.8. Huge increases in processing speed!

DEEP Prime XD - Old computer - D850 images: 196 sec., Egypte: 67 sec.
DEEP Prime XD - New computer - D850 images: 20 sec., Egypte: 9 sec.
(New computer times using High Quality: 15 sec./5 sec.)
(New computer times using Prime: 43 sec./19 sec.)
(New computer times using DEEP Prime: 11 sec./4 sec.)

PL 6 Prime XD clearly takes full advantage of modern GPU hardware. Using Prime is counter-productive and there’s little speed benefit in processing time with HQ.
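As a rough illustration of the gain, the old and new DEEP Prime XD rows quoted above can be turned into speedup factors. This is a minimal Python sketch with illustrative variable names. Note that, per the post, the old times come from PL4.0 and the new ones from PL6.8, so the comparison mixes hardware and software changes.

```python
# DEEP Prime XD timings quoted in the post above (seconds).
old_computer = {"D850 images": 196, "Egypte": 67}  # i7-3770K / GTX 950, PL4.0
new_computer = {"D850 images": 20, "Egypte": 9}    # i7-13700 / RTX 4070, PL6.8

# Speedup factor per test: old time divided by new time.
speedups = {
    test: old_computer[test] / new_computer[test]
    for test in old_computer
}

for test, factor in speedups.items():
    print(f"{test}: about {factor:.1f}x faster on the new computer")
```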

Very interesting!

Hi, I’ve just sent a request to access the file.
AMD Ryzen 7600, Nvidia 4070, G.Skill Flare X5, DDR5-6000, CL36, AMD Expo - 32 GB Dual-Kit
Win 11, PL6.7.
Egypt: PrimeXD: 7s, CPU: 241s; Prime: 3s, CPU: 63s

Looking at the various results with the 3090, 4090 and 7900 XTX, the NVIDIA cards make good use of their AI tensor units.

Hi All,

I thought I would test out my new MacBook Pro with M3 Max (14", 14 core CPU/30GPU, 36GB RAM, 1TB SSD)

PhotoLab 7.1.0 - DeepPRIME XD

D850 Images: 52s
Egypt: 15s
Egypt (CPU): 222s
R6: 106s
90D: 179s
