PL5 - DeepPrime and performance gain on Windows?

abgestumpft · October 23, 2021, 1:03pm

The new features page (https://www.dxo.com/dxo-photolab/new-features/) says:

Enjoy optimized processing times that are up to four times faster on Mac computers equipped with Apple Silicon and up to 1.5 times faster on the latest Windows machines.

What does latest mean here?
On my PC (Win10, AMD 3700X, GTX 1060 6GB, 64GB RAM), exporting images with DeepPrime is about 5% slower with PL5 than with PL4.3.3. Someone on the dpreview forum also wrote that on his machine (Windows 10 PC with an Intel i9-9900X 3.5GHz processor, NVIDIA GeForce GTX 1080 graphics card, and 64GB of RAM) the export with PL5 is a little slower (3:11 min vs 3:25 min).

Lucas · October 23, 2021, 1:41pm

Hello,

First of all, you should ensure that your graphics drivers are up to date. Some optimizations only work with drivers released since last June.

As for which GPUs benefit, we do not have every GPU on hand, so we cannot give an exhaustive list. However, we can guarantee a speed-up over DxO PhotoLab 4 for NVIDIA GPUs starting from the GeForce RTX 20 Series (RTX 2060 and above), as well as for AMD RX 5500 and above.

Lucas

abgestumpft · October 23, 2021, 2:20pm

Hi Lucas,
thanks for this information!

MouseAT · October 23, 2021, 6:48pm

I decided to benchmark DeepPRIME on the PL5 demo vs. PL4, as a 50% performance increase might be worth a discounted update, despite PL5 bringing nothing else of value to the table for me. Unfortunately, my system (AMD 5600X, 32GB RAM, and a 1080Ti) was actually slightly but consistently slower on the same hardware with the same images, and the same settings.

300 20MP RAW images from a Sony RX100 VI would process in around 21 minutes on PL4, but around 23 on PL5. That’s disappointing. Suffice to say, I won’t be rushing to upgrade, and I recommend that anyone who’s looking for a potential performance boost try before they buy. I’m sure some systems will see a significant performance uplift, but it’s definitely not universal.
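To put those batch totals on a per-image footing, here is a small Python sketch. The function name `per_image_stats` is made up for illustration, and the 21 and 23 minute figures are the rough totals quoted above, so the results are only as precise as those.

```python
def per_image_stats(total_minutes, image_count):
    """Return (seconds per image, images per minute) for a batch export."""
    seconds_per_image = total_minutes * 60 / image_count
    images_per_minute = image_count / total_minutes
    return seconds_per_image, images_per_minute


# Rough totals quoted in the post: 300 images, ~21 min on PL4, ~23 min on PL5.
pl4 = per_image_stats(21, 300)
pl5 = per_image_stats(23, 300)

print(f"PL4: {pl4[0]:.1f} s/image, {pl4[1]:.1f} images/min")
print(f"PL5: {pl5[0]:.1f} s/image, {pl5[1]:.1f} images/min")
```

On these numbers the gap is about 0.4 seconds per image, which only becomes noticeable on large batches.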

uncoy · October 23, 2021, 10:07pm

I’m surprised that DxO would not optimise for the 970/980/980Ti, 1070/1080/1080Ti series. They were enormously popular and many are still in use as except for game play or DaVinci Resolve there’s no real reason to upgrade.

MouseAT · October 24, 2021, 9:42am

I reckon the old PL4 code path was probably about as optimal as it could be on the older GPUs, and that the newer code path probably takes advantage of features that are only available on the newer hardware wherever possible, at the expense of a minor efficiency loss on the older cards. It’s probably worthwhile in the long run, as who wouldn’t want a 50% performance bump via pure software optimization on the same hardware? Unfortunately, it’s a step back for those of us who are still on the older generations of GPU.

The timing really couldn’t be worse, given the state of the GPU market. I would actually like an upgrade for my main rig, but that’s just not feasible at the moment when availability is pretty much non-existent, and prices are through the roof. Out of interest, I decided to see how much it would cost to get a modern card with roughly the same level of performance as the 1080Ti, and the answer is basically “what I paid for the 1080Ti on launch day, four and a half years ago”.

The sad part is, my laptop actually contains a 2060, so that would probably see a performance uplift to be roughly on par with my desktop, if DxO’s estimated numbers are in the ballpark. That could be useful enough when I’m away from home to warrant an upgrade, if the price had been right, but I don’t crunch enough images on my laptop to warrant paying to make my desktop machine slower.

uncoy · October 24, 2021, 10:02am

It’s ironic that apart from export performance you don’t find the features in PhotoLab 5 compelling enough to upgrade. I’m extremely interested in U-points turning into proper luminance and chroma masks and would also like Fuji X-Trans support. The half-baked DAM/library/metadata functions don’t interest me at all but at least they can be turned off (so they are not a negative).

Basically I’m clamouring to upgrade and locked out, despite my main computer running a recent OS (OS X 10.14). I don’t worry about the performance increases as PhotoLab 4 is plenty fast and wouldn’t care if PhotoLab 5 didn’t get most of the performance improvements as long as I got the other features. You’d love to upgrade for performance but are also locked out despite a very powerful graphics card on board.

For a company which wants a wider market share and more update revenue, DxO isn’t going about it very well.

MouseAT · October 24, 2021, 12:08pm

Yeah, it’s clear that different parts of the community are pulling in different directions.

I’m looking for camera/lens support, and whilst my Sonys are all well covered, I’m still annoyed by the complete neglect of modern smartphones. Whilst I don’t use them often, my iPhone is the one camera that I always have with me. Support for high-end Apple or Android devices could have been a killer feature here, even if it was only a small selection of devices. Batch processing performance is a big deal, as I’ll often come back from a weekend away with 1000+ images, and the faster I can run them through DeepPRIME, the better. PL4 is fast, and whilst I don’t need faster performance, I’ll take it if I can get it for a reasonable cost.

I use an external DAM, so tend to use PhotoLab as a RAW processor and for applying simple adjustments. I don’t tend to use it for complex edits. I tend to get the basics right in camera, bulk export JPEGs with DeepPRIME on and default corrections, and use PhotoLab to crop, level, and make basic adjustments to the likes of levels and white balance. For the most part, I’m happy with the results, which feel like an enhanced version of the out of camera JPEG, with much better noise handling.

If I really need to do more with an image, I’ll export it as a TIFF, and pull it into Photoshop Elements or Affinity for post-processing. That’s fairly rare for me, as I’m not a pro, and my focus is more on working through large volumes of photos quickly, and trying to salvage the odd important moment that didn’t come out so well, rather than trying to get the absolute best out of a small number of keepers.

daveywilson · October 25, 2021, 4:38am

@uncoy I don’t think it’s a question of whether they want to optimize for those older GPUs or not. Those older GPUs are way past their prime and simply do not have the computational power to keep up.

daveywilson · October 25, 2021, 4:40am

@Lucas Noticeable speed improvement, but sadly the program still seems to only use my RTX 3090 at about 1/2 capacity, and I’m running the latest Nvidia Studio Driver.

MouseAT · October 25, 2021, 9:48am

I hear you on the 1080Ti lacking the hardware for further enhancement, it’s just a shame to have a performance hit, at a time when better GPUs aren’t widely available. The former is fine, the latter not so much. It’s good to see things moving forward, though, in the sense that I’ll eventually get my hands on a faster GPU, even if it’s not for a year or two, and then be able to get a really major performance boost.

As for the GPU utilization, that may in part be down to your CPU. When I moved from an Intel 6700K to an AMD 5600X, I saw around a 50% boost in DeepPRIME performance with the 1080Ti, and saw some increase in GPU utilization, although it’s not exactly running flat out all of the time. I think that really fast GPUs can sometimes end up being held back by the CPU, as I’m fairly sure not all parts of the export process are GPU accelerated.
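The point about a fast GPU being held back by the non-accelerated parts of the export can be sketched with Amdahl's law. The fractions and the 2x GPU speed-up below are hypothetical values chosen for illustration, not measurements from PhotoLab, and `overall_speedup` is an illustrative name.

```python
def overall_speedup(gpu_fraction, gpu_speedup):
    """Amdahl's law: whole-job speed-up when only gpu_fraction of the
    original runtime gets faster by a factor of gpu_speedup."""
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be between 0 and 1")
    if gpu_speedup <= 0:
        raise ValueError("gpu_speedup must be positive")
    return 1.0 / ((1.0 - gpu_fraction) + gpu_fraction / gpu_speedup)


# Hypothetical: the GPU part becomes 2x faster, while the share of export
# time that actually runs on the GPU varies.
for fraction in (0.5, 0.7, 0.9):
    print(f"GPU share {fraction:.0%}: overall {overall_speedup(fraction, 2.0):.2f}x")
```

The smaller the share of the export that runs on the GPU, the less a faster card helps, which would be consistent with a CPU upgrade raising both throughput and GPU utilization.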

IanS · October 25, 2021, 10:27am

I tested 7 Sony A7rIV images and the process time per image went from 17.3s to 10.7s, so an improvement of around 40% which isn’t too bad.
Nvidia 2070 Studio Drivers
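For anyone comparing a result like this with the "up to 1.5 times faster" claim, note that a percentage reduction in time and a speed-up factor are different measures. A quick Python check using the two timings above (the function names are just illustrative):

```python
def time_reduction_percent(old_seconds, new_seconds):
    """Percentage of the old processing time that was saved."""
    return (old_seconds - new_seconds) / old_seconds * 100


def speedup_factor(old_seconds, new_seconds):
    """How many times faster the new run is than the old one."""
    return old_seconds / new_seconds


old, new = 17.3, 10.7  # seconds per image, PL4 vs PL5, as reported above
print(f"Time saved: {time_reduction_percent(old, new):.1f}%")
print(f"Speed-up:   {speedup_factor(old, new):.2f}x")
```

So a time saving of roughly 38% corresponds to a speed-up of roughly 1.6x.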

uncoy · October 25, 2021, 10:28am

I’d be surprised if an nVidia 1080Ti is unable to compete. In fact, an nVidia 1080Ti scores 61029 on OpenCL in Geekbench, ahead of a Radeon Pro Vega 56, the Radeon RX 5700 XT and an nVidia RTX 2060.

What’s the next excuse for the lack of support?

IanS · October 25, 2021, 10:32am

I didn’t read it as a lack of DxO support, rather as a question of whether the GPU routines DxO uses are provided in the Nvidia driver for the 1080Ti.

uncoy · October 25, 2021, 10:36am

Different question. What DaveyWilson wrote was:

Those older GPUs are way past their prime and simply do not have the computational power to keep up.

Sadly, I’m starting to feel this is a forum of hardware junkies and not photographers.

abgestumpft · October 25, 2021, 11:17am

The Nvidia GTX 1000 series is based on the Pascal microarchitecture. The RTX 2000 series (Turing) and RTX 3000 series (Ampere) use different microarchitectures. Starting with Turing, Nvidia introduced Tensor Cores (used to accelerate AI, deep learning, …) on its RTX cards. If, for example, DxO makes use of them to speed up DeepPrime processing, there is no chance of getting that on the GTX series, since the hardware for it is missing there…

daveywilson · October 25, 2021, 2:36pm

@MouseAT I’m running an AMD Ryzen 9 3900XT, which is no slouch, and had no issues with my Radeon Pro W5700, which the NVIDIA RTX 3090 Founders Edition replaced.

uncoy · October 25, 2021, 2:45pm

That’s a far cry from

Those older GPUs are way past their prime and simply do not have the computational power to keep up.

If DxO is basing everything they do on the very latest libraries, using their software is going to quickly become very expensive.

DeepPrime from PhotoLab 4 runs plenty fast on good graphics cards in my opinion, so the improved speed is largely irrelevant. It’s a pity that DeepPrime is a bit slower in PhotoLab 5 on the older cards, but the difference is no big deal; it just makes updating for speed improvements in DeepPrime pointless. I’m sold on the update for the improvement to local adjustments with variable chroma/luma and Control Lines (instead of existing Control Circles).

Unfortunately due to DxO’s draconian OS -1 support policy on Mac, I can’t buy or use PhotoLab 5. Mojave 10.14 is apparently not enough even though CaptureOne manages to support High Sierra 10.13.

BHAYT · November 1, 2021, 5:24pm

MouseAT, I am very late to this “party”, but while I welcomed the improvements to Prime processing in PL4 with the introduction of DeepPrime, I “resented” having to pay for the privilege of the speed improvements, particularly at the beginning of 2021, just as graphics cards had become scarce, prices had risen and even bottom-end cards had become expensive!

I “resented” it because I believe, rightly or wrongly, that while developing RAW photos with “big” skies the various PL4 (and PL3 before it) settings were creating noise in the clouds which Prime (and then DeepPrime) managed to remove, i.e. if I wanted the effects I had to buy a card to make render (export) times more palatable!

So I managed to purchase a 4GB 1050TI for my beta test machine (£160) and a 2GB 1050 (£110 second-hand) for my main machine. I just ran a test on the main machine and PL4 took 27 seconds and PL5 took 24 seconds to render the same photo, essentially maintaining the existing speed.

However, the current marketplace appears to have none of the low end cards at all and mostly starts at £430 for an RX6600 rising to £1700 for an RTX 3080 and £2,400 for an RTX 3090; here’s hoping the Bitcoin miners have a “rock fall/roof collapse”, to bring the price back where it should be and help save the planet from the energy being consumed !!!

If DxO are using such high end cards to make the numbers look good that is one thing but if they are completely ignoring the fact that many of the users are not using such “exotic” hardware then that is another!!

Why is so little GPU being used from my “Weedy” Graphics card?

One problem that I do have is the very small amount of graphics processor that PL actually consumes while exporting. The reason for this might have been discussed elsewhere on the forum but I would be interested to know the reason and whether the PL4 testing of graphics cards is going to be rerun for PL5 with a comparison of results between PL5 and PL4.

Make the Noise Option an Export parameter & enable “Contact Sheet” output:-

Possibly because of my physical (eyes) or mental make-up (photographic retention), I cannot successfully compare images unless the transition time between them is essentially nil, i.e. the time it takes to move from one image to the next in PL is too long for me to effectively judge which of my alternative option choices is better, worse or indifferent!!

This time difference is too great even between images and virtual copies. Hence, the only alternative is to export the photos to enable the comparison in a browser/editor that is not attempting to render at the same time as presenting; products like FastStone Viewer and FastRaw Viewer can also provide multiple image comparisons, if required. Currently that means changing the Noise reduction from DeepPrime to HQ to speed up the export process and then changing it back when I want to export the final JPG!

If the noise reduction process to be applied could be defined in the export process then life would be so much easier. I could create an export profile with HQ selected, in addition to the one with either DP selected or where it is left blank and defaults to the settings for the photo. I could then export for review (my contact sheet, if you like) before my production exports, once I have decided which option selection is the “best”.

An alternative would be a smart snapshot function in PL5 itself but that is probably going too far?

Lucas · November 2, 2021, 11:37am

AI has progressed a lot in the past years, and hardware has followed. Unfortunately this means that GPUs from before these improvements are not as well suited to AI. We optimized for the latest GPUs because we saw that there were opportunities thanks to specific hardware improvements. To improve speed on older GPUs, we would need to change the neural network that is behind DeepPRIME, with a high probability of reducing image quality at the same time.

I might be a bit biased as I like Apple products, but at the moment one of the cheapest options looks to be the Apple M1 Mac Mini, thanks to its Neural Engine. The base price is $700 and, as you can see in the “DxO DEEPPrime Processing Times” Google Sheet, it performs quite well (we’re still lacking PL5 updated times for high-end GPUs though).

Assuming that your GPU is chosen by “Auto” mode in Advanced preferences for DeepPRIME acceleration or that you explicitly chose your GPU there, you need to be careful with the tool you use to check GPU usage. For instance, Task Manager displays by default activity for 3D work, while what is used by PL for DeepPRIME is what’s shown under the “Compute” entries.
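As a cross-check outside Task Manager, NVIDIA users could sample utilization with the `nvidia-smi` command-line tool that ships with the driver. The Python sketch below is only an illustration under a few assumptions: it applies to NVIDIA cards only, it assumes `nvidia-smi` is on the PATH, and the helper names are made up. As far as the nvidia-smi documentation describes it, `utilization.gpu` is the percentage of the sample period during which at least one kernel was running, so it is not split into 3D and Compute the way Task Manager is.

```python
import shutil
import subprocess


def parse_utilization(output):
    """Parse nvidia-smi CSV output: one integer percentage per GPU per line.
    Lines that are not plain integers (e.g. "[N/A]") are skipped."""
    values = []
    for line in output.splitlines():
        text = line.strip()
        if text.isdigit():
            values.append(int(text))
    return values


def read_gpu_utilization():
    """Return a list of per-GPU utilization percentages, or None if
    nvidia-smi is missing or fails to run."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=utilization.gpu",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    return parse_utilization(result.stdout)


# Run this while an export is in progress to see the current reading.
print(read_gpu_utilization())
```

Running it a few times during an export should give a rough picture of how busy the GPU is.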