The results you mentioned are not conflicting!
One run was performed on the GPU, the other on the NPU (aka the AI accelerator, aka the 16-core Apple Neural Engine or ANE, with its ~11 TOPS, which equals ~11 TFLOPS for fp16). Hence also the difference in the coloured cells in the “GFLOP” rows.
All M1 variants (and also the A14) have the same 16-core ANE. The Max and the Pro even have the same CPU.
Only for the memory controller there is a slight difference…so it’s no wonder those are basically performing the same while being a tad faster than a regular M1.
In fact, the presence of the same NPU in each A14/M1 derivative suggests that even an iPhone 12 or an iPad Mini could theoretically perform in almost the same ballpark (lower power limit, fewer CPU cores and a weaker memory subsystem aside…).
In fact this also suggests that DeepPrime itself could also in theory run reasonably well on some ARM SoCs used for Android Devices or Chromebooks or Windows on ARM, since those have pretty strong dedicated AI/ML Accelerators too. (Especially the Snapdragons and Exynos)
Some of the newer ones are about twice as fast as Apple’s (24-26 TOPS).
I added my results for my new PC, which has an i5-12600K (just using the built-in UHD 770 graphics), with Windows 10 and 11.
Windows 10 was not using the performance cores, only the E-cores, and as a result took a painstaking 337 s to process the ‘Egypte CPU only’ test, whereas Windows 11 used all the cores and did it in 72 s.
So on the new 12th-gen Intel chips it is worthwhile upgrading to Windows 11 if you are using only the CPU to process images.
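To put the two timings above in proportion, here is a minimal sketch of the arithmetic. The function name `speedup` is just an illustrative helper, and the only inputs are the 337 s and 72 s figures reported in this post.

```python
def speedup(slow_seconds: float, fast_seconds: float) -> float:
    """Return how many times faster the second run is than the first."""
    if slow_seconds <= 0 or fast_seconds <= 0:
        raise ValueError("timings must be positive")
    return slow_seconds / fast_seconds


# Timings reported above for the 'Egypte CPU only' test on the i5-12600K.
win10_seconds = 337  # Windows 10, E-cores only
win11_seconds = 72   # Windows 11, all cores

ratio = speedup(win10_seconds, win11_seconds)
saved = 1 - win11_seconds / win10_seconds
print(f"Windows 11 is about {ratio:.1f}x faster ({saved:.0%} less time)")
```

That works out to roughly 4.7 times faster on this one CPU-only test; other images or settings may of course behave differently.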
Is this still the latest thread for this benchmark? I just added my results with DeepPRIME for the Egypte 111MB image using PhotoLab 6 trialware under Windows 10.
GPU (Radeon RX 470 with 4 GB) - 41 seconds
CPU (Intel i7-4770K with 16 GB) - 208 seconds
I just added my results on a 4080/16. I’m surprised to see how close it is to the 4090 in these benchmarks but not sure how my other hardware may be impacting the scores. Would love to see more 4080 and 4090 along with the new AMD 7900 GPUs.
I have added four rows to the spreadsheet with results for my new Apple MacBook Pro 14 (10C CPU, 16C GPU, 16C ANE). The combinations are DeepPRIME and DeepPRIME XD, each with the acceleration setting at either Apple Neural Engine or GPU Apple M1 Pro. It is obviously worth changing the setting from the default (GPU) to use the ANE.
Hello,
I updated the graph I made (see here) showing processing speed with DeepPRIME as a function of GPU FP32 score.
I have added the new results for PhotoLab 4, 5, 6, and 6 with DeepPRIME XD.
Please don’t forget to complete the Google spreadsheet with your own results.
There’s definitely a good-enough point where PhotoLab works just fine for live editing and export speeds reach 10 to 15 sec per image. One doesn’t have to buy the most expensive graphics card. Almost any card with 8GB VRAM will put a user in the top bracket. Many of the 4GB cards will do fine too. Sorry, I forgot that some people want to use the very slow Prime XD function to generate artificial textures on large batches of images. I’m sticking to Prime.
Looking more closely at the Google spreadsheet it looks like the arbitrary cutoff point for good Prime processing and no slow down in real time previews would be an nVidia GTX 980/1060 or an AMD 480X/580X. Those are all old cards and available for €200 or less. Trying to use DxO PhotoLab 5 or 6 without a card in this category or close is somewhat masochistic.
Tested with test image “Egypte” to jpg, no resizing, DxO Standard preset.
High Quality 5 seconds
Prime 18 seconds
Deep Prime 5 seconds
Deep Prime XD 14 seconds
Generally, this is a good result considering earlier experiences. I have been able to export several images at the same time while still editing the next ones without problems. PL can now be used economically in my workflow. I hope this computer will be good for software improvements a long way into the future.
I am curious why the legacy PRIME seems to be slower than the newer versions. I gladly avoid choosing it any more and feel no need to use it.
Unlike DeepPRIME and DeepPRIME XD, PRIME makes no use of your graphic card’s GPU. The only reason that PRIME still exists is for those people whose GPU is not supported for DeepPRIME and DeepPRIME XD processing. In that case, PRIME would be a much faster option, especially on an older and slower computer.
It must be something like that.
For me it is good to know that Deep Prime is as fast as the High Quality and could be used for any image unless the XD is wanted.
I am really happy with this now!
I also couldn’t download the Nikon images in Chrome or Edge, so I just opened them all in new tabs, changed the extension from .jpg to .nef in the address bar, and they downloaded.
I’m sure some are interested (as was I), so I bit the bullet and got an Intel Arc A750, as in some Topaz benchmarks it’s really good.
All I can say is don’t bother with it yet for PL, or better yet, stay away.
Tested on the newest PL version and compared with an AMD 5600 XT in the same system: where the Arc A750 did work, it was very slow, at least 50% slower.
I didn’t even let it complete DeepPrime or DeepPrime XD batches, as it just paused after the first file and nothing was happening.
What is worse, single-file exports, or the first files from a batch that did complete, sometimes came out as mostly black JPGs.
All in all, either DxO needs to add support or Intel needs to fix their drivers, but as of right now, with the latest “stable” Intel drivers, the card doesn’t work for PhotoLab DeepPrime or DeepPrime XD.
@MA57k Sorry to hear about your Arc A750 woes. Elsewhere the black JPGs have been attributed to old NVidia drivers, where users have NVidia cards (obviously, sorry).
I am currently in the process of upgrading to a Ryzen system, but with an NVidia RTX 3060 card. The processor I chose was a 5600G, and I was surprised that it could process images in ‘DP’ and ‘DP XD’, albeit slowly.
PS. This is a repeat of the run with 3 simultaneous copies, and it shows an issue I have seen throughout my performance testing, with the Nikon small batch in particular.
The timings should be similar if not better but if you watch the thumbnails while the export is taking place with this batch then either between the third and fourth image and/or the fourth and fifth image DxPL seems to become confused and “stalls”!?
Currently this is DxPL(Win) 6.1.0 running, I will upgrade this particular copy to PL6.3.1 later today, but 25 seconds becomes 45 seconds (the build on the machine is lightweight and nothing is running in the background)!? It is as if with 3 or 4 copies when it can no longer assign multiple executions to the “worker” stacks it loses focus!
With this batch it starts with one image and when complete it starts 2 simultaneously, then it would move to three then four etc. but by the time it gets to 3 images there are not enough images left to move to 3 simultaneous exports.
With longer batches if it suffers this problem it will be at the end of the batch so I need to watch to see if that is happening, other nerdy activities I could undertake are …
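The ramp-up described above (one export, then two at once, then three, and so on) can be sketched as a toy model. This is only an illustration of the behaviour as observed in this post, not DxO's actual scheduler; the function name `ramp_up_waves` and the batch sizes of 5 and 12 images are assumptions made for the example.

```python
def ramp_up_waves(n_images: int, max_parallel: int) -> list[int]:
    """Split a batch into waves of 1, 2, 3, ... simultaneous exports.

    Hypothetical model of the observed ramp-up, capped at max_parallel.
    """
    waves = []
    remaining = n_images
    width = 1  # the first wave exports a single image
    while remaining > 0:
        size = min(width, remaining)  # the last wave may be short
        waves.append(size)
        remaining -= size
        width = min(width + 1, max_parallel)
    return waves


# Assumed small batch of 5 images with 3 simultaneous copies allowed.
print(ramp_up_waves(5, 3))   # [1, 2, 2]

# Assumed longer batch: the short wave only appears at the very end.
print(ramp_up_waves(12, 4))  # [1, 2, 3, 4, 2]
```

Under these assumptions a 5-image batch never reaches 3 simultaneous exports, which would fit the stall showing up around the fourth and fifth image, while a longer batch only runs short at the end.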
New Intel drivers got released today and it’s the same issue: the Arc is still not working with DeepPrimeXD, it just outputs a black frame, and within PL the DeepPrimeXD preview also goes black.
Mac Studio on OS 13.4 (Ventura), 64GB of RAM. This system has 10 CPU cores, 32 Graphic cores.
For what it’s worth, I’m on a 32" display running at 2560x1440 to minimize the scaling issue on Mac OS.
(all with “DXO Standard” selected and then DeepPrimeXD applied. No changes or modifications to any setting or slider). I’m processing 2 images at a time. I’m using PL v 6.7 build 52. This build allows me to select the ANE, which I’ve done for these results. I note that there is no way to easily notate this in the spreadsheet.
I knew that Egypt on CPU would be slow, and it is — but it looks like the Mac/Mx algorithm is single threaded for XD, as it was using <3% of the system while running.
I recently upgraded my computer from an i7-3770K (16GB)/GTX950 (2GB) to an i7-13700 (32GB)/RTX 4070 (12GB). Old computer times from the spreadsheet used PL4.0; new computer times use PL6.8. Huge increases in processing speed!
DEEP Prime XD - Old computer - D850 images: 196 sec., Egypte: 67 sec.
DEEP Prime XD - New computer - D850 images: 20 sec., Egypte: 9 sec.
(New computer times using High Quality: 15 sec./5 sec.)
(New computer times using Prime: 43 sec./19 sec.)
(New computer times using DEEP Prime: 11 sec./4 sec.)
PL 6 Prime XD clearly takes full advantage of modern GPU hardware. Using Prime is counter-productive and there’s little speed benefit in processing time with HQ.
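For anyone who wants to compare the modes side by side, here is a small sketch that ranks the new-computer timings and works out the old-to-new speedup for XD. It uses only the numbers quoted in this post; the variable names are illustrative.

```python
# (D850 batch seconds, Egypte seconds) on the new computer, as reported above.
new_pc = {
    "High Quality": (15, 5),
    "PRIME": (43, 19),
    "DeepPRIME": (11, 4),
    "DeepPRIME XD": (20, 9),
}
# DeepPRIME XD on the old computer (D850 batch, Egypte).
old_pc_xd = (196, 67)

# Rank the modes by D850 batch time, fastest first.
fastest_first = sorted(new_pc, key=lambda mode: new_pc[mode][0])
for mode in fastest_first:
    d850, egypte = new_pc[mode]
    print(f"{mode:<13} D850 {d850:>3} s   Egypte {egypte:>2} s")

# Old -> new speedup for DeepPRIME XD.
xd_d850, xd_egypte = new_pc["DeepPRIME XD"]
d850_speedup = old_pc_xd[0] / xd_d850
egypte_speedup = old_pc_xd[1] / xd_egypte
print(f"XD speedup: D850 {d850_speedup:.1f}x, Egypte {egypte_speedup:.1f}x")
```

On these figures DeepPRIME comes out ahead of High Quality, PRIME is the slowest mode, and XD is roughly 7 to 10 times faster than on the old machine. Note that the old times were measured with PL4.0 and the new ones with PL6.8, so the comparison is not purely about hardware.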