PL5 keywords, one step closer, many steps still to go

I tried adding hierarchical keywords to an image I had already keyworded with LrC.

The original had something along these lines (totally invented for this example).
Subject: animal, mammal, cat, lion, Africa, Kenya, Nairobi
Hierarchical subject: animal|mammal|cat|lion, Africa|Kenya|Nairobi

I then added something like animal > mammal > ungulate > giraffe.

The result was the following additional keywords in the file (the existing ones were untouched).
Subject: ungulate, giraffe
Hierarchical subject: animal|mammal|ungulate, animal|mammal|ungulate|giraffe

So those in the (non-hierarchical) Subject field behaved as I would expect with the additional keywords added and deduplicated. But the Hierarchical subject got an extra animal|mammal|ungulate that was superfluous as it is inherent in the longer chain of animal|mammal|ungulate|giraffe.

My actual example was much worse than that because I added what was effectively A > B > C > D > E and that yielded all of:
A|B
A|B|C
A|B|C|D
A|B|C|D|E

In addition to these problems, because the keyword list is built by reading files, any files ever seen without the proper structure of keywords in them will pollute the keyword list.

Also, the keyword list allows me to untick C while leaving D and E ticked. It is not obvious how that should behave and, in fact, even though C remains unticked, it still appears in the file after the metadata is written again.

I hate to refer back to LrC again, but it is the gold standard for keywords. It clearly shows in the keyword list which keywords are inherited and which are explicitly chosen.

In short, there are still too many rough edges to make PL keywords any more than a basic function. LrC lives on.

Actually, the MWG guidelines specify that each hierarchical keyword mentioned should be fully specified, thus the inclusion of animal|mammal|ungulate to fully specify ungulate in its hierarchical context and animal|mammal|ungulate|giraffe to fully specify giraffe in its hierarchical context.

To my mind, unselecting keywords from the middle of a hierarchy is something that should not be allowed and confuses other software.

I have some test hierarchies:

  • Colour|Orange
  • Entreprise|Télécommunication|Orange
  • Fruit|Orange|Satsuma

If I add all those hierarchies fully to an image, I get


[Screenshot: Capture d’écran 2021-10-21 à 10.28.09]


 where the keyword Orange appears to be duplicated but, in fact, each token points to one of the three hierarchical contexts that include Orange.

If I run Exiftool on the XMP sidecar for the affected file, I get


Subject : Colour, Entreprise, Fruit, Orange, Satsuma, Télécommunication
Hierarchical Subject : Colour, Colour|Orange, Entreprise, Entreprise|Télécommunication, Entreprise|Télécommunication|Orange, Fruit, Fruit|Orange, Fruit|Orange|Satsuma


 which is correct.
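The expansion behind that ExifTool output can be sketched in a few lines. This is only an illustration of the rule described in this thread (every prefix of a ticked hierarchy goes into the hierarchical field, every individual word goes into Subject). The function name `expand_hierarchies` is made up for the example and is not taken from PL5 or any other app.

```python
def expand_hierarchies(paths, sep="|"):
    """Return (subject, hierarchical_subject) for fully ticked hierarchies.

    Assumption for this sketch: every level of every hierarchy is ticked.
    """
    hierarchical = set()
    subject = set()
    for path in paths:
        parts = path.split(sep)
        # Every prefix of the path is a fully specified keyword in its own right.
        for depth in range(1, len(parts) + 1):
            hierarchical.add(sep.join(parts[:depth]))
        # The flat field gets each individual word once.
        subject.update(parts)
    return sorted(subject), sorted(hierarchical)


subject, hierarchical = expand_hierarchies([
    "Colour|Orange",
    "Entreprise|Télécommunication|Orange",
    "Fruit|Orange|Satsuma",
])
print("Subject:", ", ".join(subject))
print("Hierarchical Subject:", ", ".join(hierarchical))
```

Note that Orange ends up once in Subject but three times, in three different contexts, in the hierarchical field.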

Maybe it’s just me but I find the triplication in the keywords field confusing. It also is different behaviour from other tools I have tested. Adobe Bridge “normalises” the field to show


[Screenshot: Capture d’écran 2021-10-21 à 10.38.51]


 and my app shows


[Screenshot: Capture d’écran 2021-10-21 à 10.48.44]

If I untick the two “middle” keywords, PL5 shows me


[Screenshot: Capture d’écran 2021-10-21 à 10.57.43]

ExifTool now gives me


Subject  : Colour, Entreprise, Fruit, Orange, Satsuma, Télécommunication
Hierarchical Subject  : Colour, Colour|Orange, Entreprise, Entreprise|Télécommunication|Orange, Fruit, Fruit|Orange|Satsuma


 which is strictly correct but, even though I have unticked Télécommunication, it still gets written to the Subject tag, even though it is not a leaf keyword; which, in turn, confuses searches in software external to PL5.

If I use Adobe Bridge and untick Télécommunication


[Screenshot: Capture d’écran 2021-10-21 à 11.09.28]

ExifTool then gives


Subject : Colour, Entreprise, Fruit, Satsuma, Orange
Hierarchical Subject : Colour, Colour|Orange, Entreprise|Télécommunication|Orange, Entreprise, Fruit, Fruit|Orange|Satsuma


 correctly omitting Télécommunication from the Subject tag.
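One way to read the Bridge result is that Subject holds only the last word of each hierarchical entry. The sketch below applies that reading to the Bridge output quoted above. It is an interpretation of the observed output, not documented Bridge behaviour, and `subject_from_hierarchical` is a hypothetical helper name.

```python
def subject_from_hierarchical(entries, sep="|"):
    """Collect the final component of every hierarchical entry, deduplicated."""
    return sorted({entry.split(sep)[-1] for entry in entries})


# Hierarchical Subject as written by Bridge after unticking Télécommunication.
bridge_entries = [
    "Colour",
    "Colour|Orange",
    "Entreprise|Télécommunication|Orange",
    "Entreprise",
    "Fruit",
    "Fruit|Orange|Satsuma",
]
bridge_subject = subject_from_hierarchical(bridge_entries)
print(bridge_subject)
```

Under this rule Télécommunication never reaches Subject, because no remaining entry ends with it. The result is sorted alphabetically here, so the order differs from the ExifTool listing.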

However, it must be said that Bridge also has enormous problems if I want to untick Orange from any part of the hierarchical list as it insists on also removing it from all three hierarchies.

The truth is that it really shouldn’t be permitted to remove a “middle” keyword from a hierarchy as it is integral to the definition of any of its child nodes.
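A check for this could look something like the following sketch. It lists the ancestors that a ticked keyword depends on but that are not ticked themselves. The name `missing_ancestors` and the idea of representing ticked items as `|`-separated paths are assumptions made for the example.

```python
def missing_ancestors(ticked, sep="|"):
    """Return ancestor paths required by a ticked path but not ticked themselves."""
    ticked = set(ticked)
    missing = set()
    for path in ticked:
        parts = path.split(sep)
        # Walk every proper prefix of the path (its ancestors).
        for depth in range(1, len(parts)):
            ancestor = sep.join(parts[:depth])
            if ancestor not in ticked:
                missing.add(ancestor)
    return sorted(missing)


selection = ["Entreprise", "Entreprise|Télécommunication|Orange"]
gaps = missing_ancestors(selection)
print(gaps)
```

An app could refuse the untick, or re-tick the ancestor, whenever this list is not empty.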

Maybe someone from DxO can comment on this?

I use Photo Supreme 5 and there is only limited compatibility. Most of the keywords appeared in PL5, but nothing at all that was added in PL5 is picked up in Photo Supreme 5. There are some odd omissions in how PL5 was populated: in PS5 I have a number of organisations, e.g. U3A, and these are not listed in PL5. I added one from an event processed today and it's shown as a keyword but not shown in the list. It wasn't transferred to PS5 when I added the processed images there, and PL5 didn't pick up the changes made in PS5. Pretty useless for me so far.

I have asked Photo Supreme but, as it works with other programs, I don’t hold out much hope of the problem being with them.

Can you show us screenshots of what you see for the same image in both Photo Supreme and PL5?

Thanks, I think I have found what the problem is: I need to turn on the option to synchronise metadata between XMP and the database. I can’t see why, but I have done this and have been trying adding new terms in PL5, and these now come up in PS5. Terms added in PS5 are appearing in PL5, and even the missing terms are now there. As I said, I don’t understand why it needs this, as all I want is the XMP sidecars, not the database. I am concerned that if the database gets messed up the sidecars may be affected, and DxO promised years ago that they were going to introduce some form of database maintenance but never have.

As a programmer I find this unsettling. The ungulate keyword is fully specified in a hierarchy by means of animal|mammal|ungulate|giraffe. The Subject field has the job of noting all of the points on the path in their own right, so the hierarchical subject need not over-specify things in this case. I get that working groups are there to establish standards, but the LrC team clearly does not agree with this either.

Regarding Orange appearing three times, it could be argued that is OK if it is shown in each of the three contexts. In LrC, when there is such a clash, it shows e.g. Orange < Colour in the keyword box and not just Orange even though Orange is what’s written to the Subject field. In other words it tells you what you need to know to understand exactly what keywords you have applied and no more.

I get the people who say “don’t just make it another Lightroom” but in this one area I really think it has to be, or it should be wound back to basic, non-hierarchical keywords.

I didn’t even mention above the other missing features that I rely on in LrC. Namely synonyms and non-exporting keywords.

Every time I see keywords being transferred between programs I get an itch.
Why?
It costs an enormous amount of time to tag an archive, which is the basis of good search capability. Once you have set things up a certain way, say in Adobe Bridge with a tree of keywords and an IPTC preset, that is the master. All other applications need to be slaves. If they don’t follow the hierarchical structure of the master (in my case Bridge), and either overwrite keywords unasked (not funny) or display them in a strange way that causes a hierarchical change when you edit something, then I uncheck synchronise. Make it read-only.

IPTC gets written to the XMP while keywords are stored in the .dop file by DxO, right?

Anyway, I think I will keep sync unchecked and use indexing, so I can use the search and viewing functions while controlling the XMP with Bridge for now.
When I start to see consistently good results in searching and viewing EXIF, IPTC and keywords, then I will check sync. Not to change the ones already written, but just to fill in empty fields.
Better safe than sorry. :crazy_face:

Yes, Peter, it’s good practice to manage metadata in one app and set all others to read-only.

Switching the “metadata master” looks like something that one does not really want to do. If we wanted to switch from Bridge to PhotoLab, we’d have to write metadata into all files in Bridge and then index the lot in PhotoLab, because PL does NOT have a keyword-list-import feature.

Totally agree!

You wrote: “Every time i am seeing keywording transfering between programs i get an itch.”

I would not stop there! Every time I hear of people having problems with two-way metadata communication, I ask myself why people insist on going there. It's not good practice to build integrations where no system owns the master metadata. It is asking for problems and a lot of extra cost when building integrations between different systems. It's a good rule to keep it simple.

I have used PhotoMechanic for a year now, together with PhotoLab 4, without any metadata confusion. It has worked perfectly fine because PM Plus 6 has owned the metadata exclusively. This has worked even though the IPTC metadata was ghosted in PL4, which couldn't display it. As I have already described in another DxO Forum thread, PL4 nevertheless handled the metadata correctly, reading it from the RAW master's XMP sidecar and exporting it to JPEG, DNG and TIFF derivatives.

Now that PhotoLab 5 has much improved support for IPTC, it has suddenly become very tempting to update metadata in both Lightroom and PL5. If I were a Lightroom user and also used PL5 for some reason, despite it being “cake on top of a cake”, I would continue to maintain my metadata in Lightroom. I would turn on sync in PhotoLab so that it updates its database automatically, but never edit metadata in PL5. The reason not to do it the other way around is that PL5 is at version 1.0 when it comes to real IPTC support, and Lightroom, from what I know, has to import metadata changes from other applications manually, which in my eyes is a drawback. Let PL5 read the metadata instead. From what I have seen, that will almost work. (I still have some problems getting PL5 to display the four or five IPTC Image elements, but luckily that doesn't seem to affect the export of IPTC data to JPEG, DNG or TIFF.)

I have experience of working with two-way communication in enterprise DAM; it was a major headache and it multiplied the development cost of the system integration. Don't go there!

Isn't a discussion like this just a good example of an unnecessary cost in wasted time?

Not knowing much about the subject, I had enabled PL5 to work fully with Photo Supreme. After reading what’s here, I am turning it off.
I had been worried about turning it on because it is tied to the database, and those always have the potential for going wrong (and the promised maintenance tool never appeared). So thanks for the warnings. For me it would have been better if DxO had not developed an integrated DAM, or had kept it as a separate module, as it looks like a lot of the development time has gone into the DAM, as many feared DxO would be forced to do if it went down this path.


I’m not even sure what my system is.

I enter keywords in LrC and write them to the files, so if LrC goes away my photos are the master of the data (as much as they can be).

Then I edit the photos in PL and export, but this does not export what LrC would export. So I then also export small thumbnail-sized copies of the same images from LrC, which writes the correct keywords. An automation on my computer spots the thumbnails and copies LrC’s version of keywords into the JPEGs exported by PL.
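For readers wondering what such an automation might look like, here is a rough sketch using ExifTool, which is already used elsewhere in this thread. It is not the poster's actual script. The folder layout, the matching of files by name and the helper names `pair_by_stem`, `build_copy_command` and `copy_keywords` are all assumptions. The ExifTool options used (`-TagsFromFile`, `-overwrite_original`) and the tag names are standard ones.

```python
import subprocess
from pathlib import Path


def pair_by_stem(thumb_dir, export_dir):
    """Match each LrC thumbnail with the PL export that has the same base name."""
    exports = {p.stem: p for p in Path(export_dir).glob("*.jpg")}
    return [
        (thumb, exports[thumb.stem])
        for thumb in sorted(Path(thumb_dir).glob("*.jpg"))
        if thumb.stem in exports
    ]


def build_copy_command(thumbnail, export):
    """Build an ExifTool command copying keyword tags from thumbnail to export."""
    return [
        "exiftool",
        "-TagsFromFile", str(thumbnail),
        "-XMP-dc:Subject",
        "-XMP-lr:HierarchicalSubject",
        "-overwrite_original",
        str(export),
    ]


def copy_keywords(thumb_dir, export_dir):
    """Run ExifTool once per matched pair (requires exiftool on the PATH)."""
    for thumb, export in pair_by_stem(thumb_dir, export_dir):
        subprocess.run(build_copy_command(thumb, export), check=True)


cmd = build_copy_command("thumbs/IMG_0001.jpg", "exports/IMG_0001.jpg")
print(" ".join(cmd))
```

Calling `copy_keywords("thumbs", "exports")` would then overwrite the keyword fields of each PL export with LrC's version.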

Easy!

I am not an expert on metadata in any way, but I suppose it’s very easy to make a hierarchical system:
checkbox 1: master
checkbox 2: slave => read-only, for updating and visualisation purposes, and every change would be prompted with the question “change xxxxxx for yyyyy?”
If all metadata programs had this, you could set up and change things as you like.

(master: point of origin of the metadata; changing it means auto-updating files.)

Simple, right?

By the way, “synchronise” means nothing other than read/write both ways.
In my backup system I use a one-way mirror only.
Otherwise it syncs old files back onto your daily-used hard drive :thinking: :sob:

And, as a programmer myself, who has just spent two years writing a keyword management app for Mac, I can reassure you that it takes some getting your head around metadata standards and compatibility.

In brief, it’s a mess but, after extensive research and testing, I can assure you that this is neither over-specified nor unnecessary, especially from a point of view of compatibility with other software, which can behave equally badly.

If we take the case of non-hierarchical keywords, it is simply a matter of adding each required keyword to the xmp-dc:subject tag.

But, with hierarchical keywords, we also need to populate the xmp-lr:hierarchicalSubject tag. In order to do this, we need to write each hierarchical keyword there as well as to the xmp-dc:subject tag. But it is not just a matter of including all keywords in a hierarchy once in an all-encompassing “phrase”.

In your example, you want to use both ungulate and giraffe, but in so doing, you are referring to two different hierarchical keywords:

  • animal|mammal|ungulate
  • animal|mammal|ungulate|giraffe

These hierarchical keywords are integral and indivisible entities - they are not lists of keywords and one cannot be “derived” from the other.

Thus the requirement for both to be included in the xmp-lr:hierarchicalSubject tag.

PL5 effectively does this but it is hidden behind a tooltip that (eventually) appears when you hover over one of the tokens. If duplication is deemed necessary, then, like you, I would rather see the Lr/Bridge style of explicit definition of each hierarchy.

Fine. As long as you don't also change the metadata in the PhotoLab Photo Library and just stick to editing the pictures, you are in a one-way flow. I just don't understand why you have to export even smaller-sized files carrying the metadata. Isn't it possible just to open the Lightroom RAW XMP sidecars associated with your RAW files directly in PhotoLab without having to rely on the smaller files? I don't understand.

If I update metadata on my RAW files in Photo Mechanic, the associated XMP sidecars get updated correctly, and PhotoLab will automatically be able to read them as soon as the same RAW is opened in PhotoLab. If I then export a JPEG from PhotoLab with that RAW as the metadata master, all that metadata will be transferred to JPEG, TIFF or DNG without any further problem or any other action necessary on my part.

It's a long time since I gave up on Lightroom, but I don't understand why you need the thumbnails. Shouldn't the XMP sidecars of your RAW files be sufficient XMP data containers by themselves? I thought it would be enough to turn on the switch in Lightroom that activates the XMP sidecar file function.

Could you elaborate?
XMP files are sidecars for raw files because writing into the raw container (Nikon seems to do that) isn’t a smart thing to do. Does LR produce files other than XMP?

The DxO .dop files are “one-image database pieces” solely for DxO application use.
Reading one with Notepad, you can see what’s in it.
The reason that EXIF data is found in there is that it is a database copy for one image's edit state.
So one image, two files:
XMP and .dop.

This means that organising and moving images around should be done inside DxO PL in order to keep the sidecars with the image. (Bridge does show all files, so when moving you can select them all and move them around, but an error is quickly made.)

That I can accept. But


I disagree. It is simplicity itself to split on the | character. In fact, one could argue that in the perfect world that would be the only field needed as the individual keywords could all be derived.

I would agree with the statement that “these lists of keywords do not get split by software.” But they absolutely could. If you’ve been anywhere near databases you will know about normalisation.
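To illustrate the point about splitting, here is a minimal sketch that derives the flat keywords from only the longest chains, using the invented lion and giraffe example from the start of the thread. The function name `flat_keywords` is made up for the example.

```python
def flat_keywords(hierarchical_entries, sep="|"):
    """Derive the flat keyword list by splitting each chain, keeping first-seen order."""
    seen = []
    for entry in hierarchical_entries:
        for keyword in entry.split(sep):
            if keyword not in seen:
                seen.append(keyword)
    return seen


derived = flat_keywords([
    "animal|mammal|cat|lion",
    "Africa|Kenya|Nairobi",
    "animal|mammal|ungulate|giraffe",
])
print(", ".join(derived))
```

Only the three longest chains are stored here, and every individual keyword (including ungulate) can still be recovered from them.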

However, the basic hierarchy function as implemented by PL is not my end game. LrC takes things further (and without needing to duplicate the hierarchy at all in the files themselves). Which brings me to


It took me a while to understand, too, but the key point I haven’t explained well enough so far is the additional capability LrC has:

  • synonyms
  • non-exporting keywords

Consider this real hierarchy I have.

  • !places > Oceania > New Zealand > Wellington > City

When LrC writes the keywords to the DNG files (or XMP files) you will see every entry that I have listed. However, when it exports files, things change.

What I have not shown above are the extra capabilities.

!places is marked as non-exporting, which means it is there purely for structure in the keyword manager in LrC. There is no keyword I want on my published photos that would logically encompass every region of the world. (I’m not one of the 18 people who might reasonably use ‘Earth’ and ‘Moon’.) I considered just having each region at the top level but then I have to scroll for miles to get to Oceania and then miles back in the other direction for Africa. The !places keyword holds together the related regions and their descendants, and by means of having the initial ! character it is always at the top of my keyword list.

New Zealand has a synonym Aotearoa which is the indigenous name for my country. Again, I could work around this by having both keywords individually, or perhaps one hierarchical under the other, but the concept of a synonym is exactly what represents the relationship. Every photo that has New Zealand or Aotearoa on it should have both. With separate keywords or as a hierarchy it is possible to mess that up.

So in LrC if I add City < Wellington then a PL5 export will have exactly the list I have shown above in the Subject field (unrelated). Whereas a LrC export will have these:

  • Oceania, New Zealand, Aotearoa, Wellington, City
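As a rough model of that export behaviour, the sketch below gives each keyword an export flag and a list of synonyms, then walks from the chosen keyword up to the root. The `Keyword` class and `exported_keywords` function are hypothetical and only mimic the result described above. In particular, skipping the synonyms of a non-exporting keyword is an assumption of the sketch, not something confirmed about LrC.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Keyword:
    name: str
    parent: Optional["Keyword"] = None
    export: bool = True                      # False = structure only, never exported
    synonyms: List[str] = field(default_factory=list)


def exported_keywords(leaf):
    """List the keywords an export would carry for one applied keyword."""
    chain = []
    node = leaf
    while node is not None:                  # walk up to the root
        chain.append(node)
        node = node.parent
    result = []
    for node in reversed(chain):             # root first, leaf last
        if node.export:
            result.append(node.name)
            result.extend(node.synonyms)     # synonyms travel with their keyword
    return result


places = Keyword("!places", export=False)
oceania = Keyword("Oceania", parent=places)
new_zealand = Keyword("New Zealand", parent=oceania, synonyms=["Aotearoa"])
wellington = Keyword("Wellington", parent=new_zealand)
city = Keyword("City", parent=wellington)

exported = exported_keywords(city)
print(", ".join(exported))
```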

The classic case for synonyms for me is airports and airfields. My local airport is named Wellington International Airport, it has an ICAO code of NZWN and an IATA code of WLG. Those latter two are unique in the world, where the former may be but I’m sure there are many airports in the world with the same common names. I have all the airports and airfields under an airfield keyword which is set not to export because it’s not a useful keyword to have on the final product. And when I want to add my airport I just type NZWN and I get Wellington International Airport ‘for free’ without having to deal with suggestions for all other variations of Wellington in my keywords. I currently do not include the IATA codes but I probably should for completeness. While I would have to re-export and re-publish my photos, actually adding WLG to the thousands that already have NZWN is a simple edit of the NZWN keyword to add WLG as a synonym.

Turning off the two-way communication option results in no keywords being shown.

As I have said before, it’s not so much a matter of what can be done, it’s much more about being able to edit keywords in one piece of software and have another piece of software understand them and use them.

Before DxO started doing anything about a DAM, I started writing my own app (macOS) to fill in the gap.

It has been a nightmare! Trying to read and write metadata that works for every other piece of software is meant to be based on standards and the best source of the most universal standard is the MWG (Metadata Working Group) guidelines. The problem is, Adobe, who participated in their drawing up, have decided to diverge from them - to the extent of having two apps, Lr and Bridge, which default to different rules and, what’s more, offer not very easily findable ways of changing how they write stuff!

Then you have all the other software which sort of, possibly, maybe, adhere to the guidelines - at least according to how they have managed to interpret them. And, I must add, that includes my app which, I hope, does its best to bridge those gaps.

To start with, my app has a dedicated Keyword Manager, where keywords can be: added, removed, renamed, arranged in hierarchies, etc. This is key to ease of use, allowing searching for keywords, in their hierarchies, and directly adding them to the keywords box in the main window for one or more images simply by double-clicking on a word in the manager popup window.

Of course, now, @platypus pointed out that a user would have had to enter all their keywords, one at a time, into my app - which, when you see the size of some of the ready-made lists that are available, could take quite some time and effort. So, I am now in the middle of adding in being able to import ready-made lists and, so far, what I have done is working quite well but your post has caused me to think seriously about adding in non-exporting and synonym functionality.

Questions

For synonyms, how do you determine which word “takes the lead”?

Take this example from the Foundation List:

	[~Characteristic]
		beautiful
		cute
			{adorable}
			{lovable}
			{sweet}
		short
		tall
		ugly


 where there are three synonyms for cute which, presumably, are sort of “children” and not “siblings”.
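A small parser for a list in this shape might look like the sketch below. It assumes tab indentation for depth, `{...}` for a synonym of the nearest keyword above it, and `[...]` for a keyword that is not exported, which is how I understand the Lightroom-style keyword list text format. The record layout and the name `parse_keyword_list` are invented for the example.

```python
def parse_keyword_list(text):
    """Parse a tab-indented keyword list into a list of dict records."""
    records = []
    stack = []  # (depth, record) pairs for the current ancestor chain
    for raw in text.splitlines():
        if not raw.strip():
            continue
        depth = len(raw) - len(raw.lstrip("\t"))
        token = raw.strip()
        # Drop anything at the same or a deeper level than this line.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if token.startswith("{") and token.endswith("}"):
            # Synonym: attach to the nearest keyword above it.
            if stack:
                stack[-1][1]["synonyms"].append(token[1:-1])
            continue
        export = not (token.startswith("[") and token.endswith("]"))
        name = token if export else token[1:-1]
        parent = stack[-1][1]["name"] if stack else None
        record = {"name": name, "parent": parent, "export": export, "synonyms": []}
        records.append(record)
        stack.append((depth, record))
    return records


sample = "\n".join([
    "[~Characteristic]",
    "\tbeautiful",
    "\tcute",
    "\t\t{adorable}",
    "\t\t{lovable}",
    "\t\t{sweet}",
    "\tshort",
    "\ttall",
    "\tugly",
])
records = parse_keyword_list(sample)
for record in records:
    print(record)
```

In this reading the synonyms are attributes of cute rather than child keywords, so they never appear as nodes in the hierarchy.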

In your example of airports, which is a domain I have worked in, ICAO, IATA and names could all qualify for being the “master”. Or do we have to have a mutual lookup from each one to the other?

Do you always write the name, or could there be a requirement to write IATA or ICAO? And who decides?

Or would you simply have two synonyms (for IATA and ICAO) linked to that airport name?
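One possible answer to the lookup question is a single "lead" keyword with an alias index pointing back to it, so that typing any of the codes finds the same entry. The sketch below uses the Wellington example from earlier in the thread (that poster does not currently include the IATA code, it is added here purely for illustration). The names `build_alias_index` and `resolve` are hypothetical.

```python
def build_alias_index(canonical_to_synonyms):
    """Map every name and synonym (case-insensitively) to its lead keyword."""
    index = {}
    for canonical, synonyms in canonical_to_synonyms.items():
        for alias in [canonical, *synonyms]:
            index[alias.casefold()] = canonical
    return index


airports = {"Wellington International Airport": ["NZWN", "WLG"]}
index = build_alias_index(airports)


def resolve(term):
    """Return the lead keyword for a typed term, or None if unknown."""
    return index.get(term.casefold())


print(resolve("nzwn"))
print(resolve("WLG"))
```

Which entry takes the lead is then just a data choice per keyword, and no mutual lookup between the synonyms themselves is needed.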

Hi, interesting thread. My own limited research into metadata, specifically keywords, ratings and IPTC fields, with the aim of creating a Rosetta Stone type app, just made me realise that it is all rather a mess. Some apps update JPEGs, TIFFs and DNGs with keywords; others just create an XMP file, which might meet the requirements of one application but not another. Other apps read but do not write, and some get confused if there is metadata in the file along with an XMP file. In the end I gave up, deleted all of the XMP sidecar files and started adding keywords directly to the file names. While this is limited in some ways, at least my collection of images may be searched using almost any recent computer OS you care to mention.

I’m sure that hierarchical keywords have their uses, but I am not sure what problem they are trying to solve, so I have concluded that I don’t need them. For those who do - good luck.

Simon

sounds very much like → Time stamp expansion for image --DXO PL4 Elite - #2 by Wolfgang