XMP files get F*cked up due to hierarchical mismanagement in dual management

Is an exported txt file sufficient?
keyword list.txt (1,1 KB)
right:
step one
FRV: create the XMP by setting a star rating and color label.
_1080786.xmp (560 Bytes)
_1080791.xmp (560 Bytes)
Then step 2:
Bridge 2022:
Apply a metadata IPTC preset and add keywords: one with both main and sub keyword visible in the checkbox list:
visible row
and one with the main keyword hidden and only the sub keyword shown:
no visible main
_1080786 step 2.xmp (4,6 KB)
_1080791 step 2.xmp (4,7 KB)
Then open in PL v5.2,
fill in some fields and add the keyword “planten” from the list (always synchronize is active).
As seen in Bridge:
_1080786 step 3.xmp (8,3 KB)
_1080791 step 3.xmp (8,4 KB)
Since I have always-sync active, I don’t add new keywords to the DxO PL database, because that would pollute it.
If you also need that, I will switch off always-sync, add something to these two files (786/791), manually export the metadata (XMP), then delete the database and the modified XMPs and re-index my archive to ingest all the metadata again. (I won’t add keywords in DxO for now unless I know it’s safe to do so. :wink: )
If you need a bigger test, I’ll do one. :slight_smile:

Peter

@joanna Sorry about the legibility. They are not actually spreadsheets; they are “windows” onto 4 simultaneous windows from PIE, put together in real time by “Sneaky Previews”. Doing it this way saves me the trouble and time of copying to a spreadsheet, and I have found the colour combinations available in PIE not particularly good for screen grabs. Sorry!

Are any of the following easier to read?

The keywords in this file are not being written according to MWG guidelines. For the given hierarchy, they should be…

   <dc:subject>
    <rdf:Bag>
     <rdf:li>woning</rdf:li>
     <rdf:li>Dahliastraat</rdf:li>
     <rdf:li>Type</rdf:li>
     <rdf:li>Kunst en beelden</rdf:li>
    </rdf:Bag>
   </dc:subject>
  …
   <lr:hierarchicalSubject>
    <rdf:Bag>
     <rdf:li>woning</rdf:li>
     <rdf:li>woning|Dahliastraat</rdf:li>
     <rdf:li>Type</rdf:li>
     <rdf:li>Type|Kunst en beelden</rdf:li>
    </rdf:Bag>
   </lr:hierarchicalSubject>

After adding the extra keyword, the XMP should read…

   <dc:subject>
    <rdf:Bag>
     <rdf:li>woning</rdf:li>
     <rdf:li>Dahliastraat</rdf:li>
     <rdf:li>Type</rdf:li>
     <rdf:li>Kunst en beelden</rdf:li>
     <rdf:li>Overige</rdf:li>
     <rdf:li>Planten</rdf:li>
    </rdf:Bag>
   </dc:subject>
  …
   <lr:hierarchicalSubject>
    <rdf:Bag>
     <rdf:li>woning</rdf:li>
     <rdf:li>woning|Dahliastraat</rdf:li>
     <rdf:li>Type</rdf:li>
     <rdf:li>Type|Kunst en beelden</rdf:li>
     <rdf:li>Overige</rdf:li>
     <rdf:li>Overige|Planten</rdf:li>
    </rdf:Bag>
   </lr:hierarchicalSubject>
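
For anyone who wants to reproduce the two lists above from a set of full keyword paths, here is a minimal Python sketch of the convention as described in this post. The function name `build_keyword_tags` and the use of `|` as delimiter are assumptions for illustration; this is not taken from any particular app.

```python
def build_keyword_tags(paths, delimiter="|"):
    """Return (dc_subject, hierarchical_subject) for a list of full keyword paths.

    Every node of every path goes into the flat list, and every ancestor
    path (root, root|child, ...) goes into the hierarchical list.
    """
    dc_subject = []
    hierarchical = []
    for path in paths:
        parts = [p.strip() for p in path.split(delimiter) if p.strip()]
        for depth in range(1, len(parts) + 1):
            prefix = delimiter.join(parts[:depth])
            if prefix not in hierarchical:
                hierarchical.append(prefix)
            if parts[depth - 1] not in dc_subject:
                dc_subject.append(parts[depth - 1])
    return dc_subject, hierarchical


flat, tree = build_keyword_tags(
    ["woning|Dahliastraat", "Type|Kunst en beelden", "Overige|Planten"]
)
print(flat)  # woning, Dahliastraat, Type, Kunst en beelden, Overige, Planten
print(tree)  # woning, woning|Dahliastraat, Type, Type|Kunst en beelden, Overige, Overige|Planten
```

The two printed lists correspond item for item to the `rdf:li` entries in the second pair of tags above.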

Now, what are you getting from PL when you export?

Unfortunately, nowhere can I see the hierarchical structure written explicitly. Am I to assume it is indeed Animal > Mammal > Bear > Black Bear ?

If a “DAM” doesn’t write the full structure, as I have outlined, then it is not MWG compliant and it is no wonder that PL is having problems. My app is fully compliant, as I gather is Capture One, and the resulting exports from such files are fine.

What would be useful is to see screenshots of the input dialogs for each app, to see how the hierarchies are determined and chosen from.

@joanna The keywords entered were

Photo 1 - animal, mammal, bear
Photo 2 - animal|mammal|bear
Photo 3 - animal|mammal|bear|Black Bear (but missing on some)
Photo 4 - animal, mammal, bear, animal|mammal|bear

into all packages, but one of the things I did discover was that the output could (would) be influenced by certain options generally stored in the ‘Preferences’ section of each package. As a consequence, it would be possible for two users of the same package to get a different keyword structure output depending on the choices they made!

My tests originally included JPG but I abandoned that in favour of RAWs because most if not all packages can be configured to use xmp sidecar files or simply do not provide an option to update the embedded xmp in the RAW file itself!

With the xmp sidecar files there is a greater and more easily accessed transparency!

However, I can access JPG using Beyond Compare which uses various add-ons to access the internals of images (including ExifTool) and it threw up this when comparing a PL5.1.2 output JPG with a PL5.1.4 (and PL5.2.0) output JPG with respect to IPTC data?

Photo 1 (animal, mammal, bear) indicates three standalone keywords with no hierarchical relationship and should produce…

  <dc:subject>
   <rdf:Bag>
    <rdf:li>animal</rdf:li>
    <rdf:li>mammal</rdf:li>
    <rdf:li>bear</rdf:li>
   </rdf:Bag>
  </dc:subject>

Photo 2 (animal|mammal|bear) either indicates one keyword animal|mammal|bear or it can indicate an implicit hierarchy, depending on the software. I have seen it wrongly interpreted as…

  <dc:subject>
   <rdf:Bag>
    <rdf:li>animal|mammal|bear</rdf:li>
   </rdf:Bag>
  </dc:subject>
  …
  <lr:hierarchicalSubject>
   <rdf:Bag>
    <rdf:li>animal|mammal|bear</rdf:li>
   </rdf:Bag>
  </lr:hierarchicalSubject>

… which is horrendously wrong in so many ways. Or this…

  <dc:subject>
   <rdf:Bag>
    <rdf:li>animal</rdf:li>
    <rdf:li>mammal</rdf:li>
    <rdf:li>bear</rdf:li>
   </rdf:Bag>
  </dc:subject>
  …
  <lr:hierarchicalSubject>
   <rdf:Bag>
    <rdf:li>animal|mammal|bear</rdf:li>
   </rdf:Bag>
  </lr:hierarchicalSubject>

… which can be acceptable but doesn’t fully transmit the nature of the hierarchy for the upper nodes.

Photo 3 (animal|mammal|bear|Black Bear) suffers from the same problem of possible differing interpretations as found in Photo 2, just with another level.

Photo 4 (animal, mammal, bear, animal|mammal|bear) is simply a mess and a disaster waiting to happen. Essentially, the “natural” way to interpret this would be…

  <dc:subject>
   <rdf:Bag>
    <rdf:li>animal</rdf:li>
    <rdf:li>mammal</rdf:li>
    <rdf:li>bear</rdf:li>
    <rdf:li>animal|mammal|bear</rdf:li>
   </rdf:Bag>
  </dc:subject>
  …
  <lr:hierarchicalSubject>
   <rdf:Bag>
    <rdf:li>animal</rdf:li>
    <rdf:li>animal|mammal</rdf:li>
    <rdf:li>animal|mammal|bear</rdf:li>
   </rdf:Bag>
  </lr:hierarchicalSubject>

… that assumes that animal|mammal|bear is a standalone keyword that just happens to contain pipe symbols. I presume the inclusion of the “combined” keyword is an attempt to convey hierarchical structure within the confines of the dc:subject tag - something which should be a definite no-no.

The big problem with allowing all these non-standard variations to be entered in an app is that it is then up to each app to decide how to write them to the file’s metadata. Which subsequently means that every other app has to know about all possible variants and take an “educated” guess as to which one was intended.
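
To make the guessing problem concrete, here is a small Python sketch of two hypothetical readers looking at the same `dc:subject` entry. The function `read_keywords` and its `pipe_is_hierarchy` switch are invented for illustration and do not describe how any real app decides.

```python
def read_keywords(entries, pipe_is_hierarchy):
    """Interpret raw dc:subject entries under one of two reading policies.

    pipe_is_hierarchy=False: a pipe is just a character in the keyword.
    pipe_is_hierarchy=True:  a pipe separates the levels of an implied hierarchy.
    """
    flat = []
    for entry in entries:
        parts = entry.split("|") if pipe_is_hierarchy else [entry]
        for part in parts:
            if part not in flat:
                flat.append(part)
    return flat


photo_2 = ["animal|mammal|bear"]
print(read_keywords(photo_2, pipe_is_hierarchy=False))  # ['animal|mammal|bear']
print(read_keywords(photo_2, pipe_is_hierarchy=True))   # ['animal', 'mammal', 'bear']
```

Same file, two different keyword sets, and nothing in the metadata says which reading the writer intended.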

On the other hand, if all DAM writers not only properly adhered to MWG standards but enforced them at the point of entry, we wouldn’t be in the parlous mess we find ourselves in today.

At which point I am going to blow my own trumpet again in stating that my app (unfortunately Mac only) does adhere to those standards and I haven’t yet found another app that misinterprets the metadata I write.

The key to the integrity that my app ensures is that the management of keywords and the construction of hierarchies is taken care of in a separate “Keyword Manager” popup window, which provides the only means of constructing hierarchies, it being impossible to imply a hierarchy just by entering a delimited list of words in the keywords entry field. The keywords entry field provides a chooser, populated from the dictionary that is managed by the Keyword Manager, so that it is more difficult to mis-spell a word or choose one from the wrong hierarchy. This dictionary is then used in the construction of the metadata that gets written into the XMP.
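
As a rough illustration of that separation (not the actual implementation of the app, which is not shown in this thread), here is a Python sketch of a dictionary that is the only place where hierarchies can be built, plus a chooser that only offers existing entries. The class and method names are hypothetical.

```python
class KeywordDictionary:
    """Hypothetical keyword dictionary: hierarchies are built only via add()."""

    def __init__(self, delimiter="|"):
        self.delimiter = delimiter
        self.parent_of = {}  # full path -> parent path (None for a root keyword)

    def add(self, name, parent=None):
        # Refuse delimited text, so typing "a|b|c" can never imply a hierarchy.
        if self.delimiter in name:
            raise ValueError("keyword names may not contain the delimiter")
        if parent is not None and parent not in self.parent_of:
            raise KeyError(f"unknown parent: {parent}")
        path = name if parent is None else parent + self.delimiter + name
        self.parent_of[path] = parent
        return path

    def choices(self, typed):
        """Chooser: full paths whose last node starts with the typed text."""
        typed = typed.lower()
        return sorted(
            path for path in self.parent_of
            if path.split(self.delimiter)[-1].lower().startswith(typed)
        )


keywords = KeywordDictionary()
keywords.add("animal")
keywords.add("mammal", parent="animal")
keywords.add("bear", parent="animal|mammal")
print(keywords.choices("be"))  # ['animal|mammal|bear']
```

Because the entry field can only pick from `choices()`, a mistyped or ambiguous keyword never reaches the metadata writer in this sketch.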

What DxO have done is to try and emulate other software in putting both keyword assignment and organisation in an impossibly small tool palette - something which can actively dissuade folks from properly organising a “dictionary” or “thesaurus” in the first place. Have you tried dragging a keyword to be a child of another?

Then you have the problem of being forced to see multiple versions of the same keyword in the entry field if you happen to need the same keyword from more than one hierarchy, with the only way of determining which is which being to hover over the tokens in anticipation of a tooltip to elucidate.

And, in closing this overly long post, we still haven’t got to the limitations the current DxO hierarchical structures place on any chance of a search that can do more than simple AND conditions - but I’ll leave that for another rant :stuck_out_tongue:

@joanna just a quick reply for the minute (I am in the middle of some DIY) with respect to the animal, mammal, bear, animal|mammal|bear “horror” combo which was intended to be just that, a “horror” to see what the various bits of software would make of it, if anything!

I think I managed something even worse but never reported or posted it!

I was testing ‘null’ keys and corrupt ‘null’ keys which DxPL won’t allow to be entered but I was “hacking” the xmp sidecar file and it would accept that.

The nulls then made their way to the front of the keyword list and can then be assigned to an image (or could, haven’t tested it lately). In trying to navigate to those keys I inadvertently picked up a key (I am obviously not a good mouser) and dragged a keyword from its location onto a hierarchy with the ‘null’ keys and then had great difficulty undoing the damage!!

You really like putting your head in a hornet’s nest and then shaking about a bit, don’t you :face_holding_back_tears:

@Joanna if you are going to test software you might as well “torture” the software as well as the tester (me). Well written software should be able to stand up to such aggressive testing (although I think that my testing showed that you can enter blank keys into Lightroom but not into PhotoLab!).

I always remind myself of an incident when I was doing a year in industry. I had to write a program to print, on expensive multipart paper, the yearly summaries for customers. I came in one morning to find a mountain of paper (many boxes) and a message that my program had failed!

I checked my code and it was correct but the results were not! Until I looked at a statement in my code that checked that the account numbers were going up in sequence and if they didn’t the program would fail! The preceding program’s SORT had not been executed correctly and it had strung the quarters instead of merging and sorting.

Two lessons

  1. Trust no-one, even yourself!

  2. If you insist on doing things properly, finish the job! In this case an intelligent report to the operators and a log were needed alongside my test for a truly professional job.
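
A minimal Python sketch of the second lesson, assuming the account numbers are expected to be strictly ascending (the original program is not shown, so the function name, logger name and message are invented): the check reports where the sequence broke and logs a message an operator can act on, rather than just failing.

```python
import logging

logger = logging.getLogger("statements")


def first_out_of_sequence(account_numbers):
    """Return the index of the first record that breaks ascending order, or None."""
    previous = None
    for index, number in enumerate(account_numbers):
        if previous is not None and number <= previous:
            # Say what went wrong and where, so the operators can stop the run early.
            logger.error(
                "record %d: account %s follows %s; input looks unsorted",
                index, number, previous,
            )
            return index
        previous = number
    return None


print(first_out_of_sequence([1001, 1002, 1003]))        # None
print(first_out_of_sequence([1001, 1003, 1002, 1004]))  # 2
```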

It’s a shame you aren’t on Mac - I could do with that kind of testing for my keywording app.

Especially yourself :wink:

Ah yes, error reporting and handling - wassat ? :crazy_face:


P.S. I’d really love to see some screenshots of those other apps and how their keyword input boxes accept and verify hierarchies.

Step 1 doesn’t contain keywords, only star rating and color label.

The step 2 file is updated/created by Adobe Bridge 2022.
The step 3 file is updated by DxO PL after adding the keyword “planten”
and filling in 3 fields inside DxO PL.

If I read you correctly, FRV doesn’t create the right XMP type?
I can run a new test and let Adobe Bridge create the XMP to see if it’s different.

No, FRV creates the XMP file perfectly - no problem there.

The problems start when Bridge tries to add Kunst en beelden without its “parent” Type, presumably because it regards Type as a category and ignores it as a parent keyword.

The problem is then exacerbated when PL5 then adds Planten without its parent Overige.

The lr:hierarchicalSubject tag contains the parent keywords but they are not repeated in the dc:subject tag, as required by MWG.
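
A quick way to spot this in a sidecar is to list every node that appears in `lr:hierarchicalSubject` but not in `dc:subject`. The Python sketch below does that with the standard library; the sample XMP is hand-made to match the description above and is not the content of the attached files, and the function names are my own.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "lr": "http://ns.adobe.com/lightroom/1.0/",
}

# Hand-made sample: parents are present in the hierarchy but not in dc:subject.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:lr="http://ns.adobe.com/lightroom/1.0/">
 <rdf:Description>
  <dc:subject><rdf:Bag>
   <rdf:li>Dahliastraat</rdf:li><rdf:li>Kunst en beelden</rdf:li><rdf:li>Planten</rdf:li>
  </rdf:Bag></dc:subject>
  <lr:hierarchicalSubject><rdf:Bag>
   <rdf:li>woning|Dahliastraat</rdf:li><rdf:li>Type|Kunst en beelden</rdf:li>
   <rdf:li>Overige|Planten</rdf:li>
  </rdf:Bag></lr:hierarchicalSubject>
 </rdf:Description>
</rdf:RDF>"""


def bag_items(root, tag):
    """Text of every rdf:li inside the rdf:Bag of the given tag."""
    return [(li.text or "").strip() for li in root.findall(f".//{tag}/rdf:Bag/rdf:li", NS)]


def missing_flat_keywords(xmp_text):
    """Nodes used in lr:hierarchicalSubject that are absent from dc:subject."""
    root = ET.fromstring(xmp_text)
    flat = set(bag_items(root, "dc:subject"))
    missing = []
    for path in bag_items(root, "lr:hierarchicalSubject"):
        for part in path.split("|"):
            if part not in flat and part not in missing:
                missing.append(part)
    return missing


print(missing_flat_keywords(SAMPLE))  # ['woning', 'Type', 'Overige']
```

To check a real sidecar, read the file as text and pass it in; an empty list means every hierarchy node is also present as a flat keyword.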

The root of all this is due to the full hierarchy list containing categories (as they are called in Lr-speak)

When I import such lists into my software, I regard categories as root keywords and synonyms as child keywords - I find it makes life simpler.
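
Roughly, that import rule could look like the Python sketch below. It assumes a tab-indented text list in which `[brackets]` mark a category and `{braces}` mark a synonym; that is my understanding of Lightroom-style keyword list exports and may not match your file. The function name is hypothetical.

```python
def parse_keyword_list(text, delimiter="|"):
    """Turn a tab-indented keyword list into full hierarchy paths.

    Assumed format: one keyword per line, one tab per level,
    [name] for a category and {name} for a synonym.
    Categories become root/parent keywords, synonyms become children.
    """
    paths = []
    stack = []  # ancestors of the current line, one entry per indent level
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = len(line) - len(line.lstrip("\t"))
        name = line.strip().strip("[]{}")
        del stack[depth:]  # drop anything at this level or deeper
        stack.append(name)
        paths.append(delimiter.join(stack))
    return paths


sample = "[Type]\n\tKunst en beelden\n[Overige]\n\tPlanten\n\t\t{plants}\n"
for path in parse_keyword_list(sample):
    print(path)
# Type
# Type|Kunst en beelden
# Overige
# Overige|Planten
# Overige|Planten|plants
```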

I can set “type” to hidden in Bridge so it doesn’t show in the keyword list (it’s a drawer, so to speak, not an item). If I don’t hide it, “type” also shows in the keyword listing.
Then it shows: drawer, screwdriver, not only screwdriver.

The same goes for “overige” and “plants”, by the way.

Little did I realise, when posting my experience, that I would see such a response, which tells me a few things.

  1. I am in a different time zone to most of you being in New Zealand.
  2. I know very little about keyword structures and how to programme anything.
  3. A lot of DAM suppliers claim to do it “right” but actually don’t.
  4. There are a number of ways we can build a hierarchical system.

But having read all of this, I can say one thing if you are an IMatch user: PL5.2 seems to have written a system that is now compatible with IMatch (albeit I had to reload PL to get it working, another odd thing I can’t explain). My structure parent | child | child seems to be retained and I am happy with that. I have, however, tested a case where my IMatch structure has a synonym, which PL takes and puts as a new child under the correct parent. I can deal with this, but agree it may not be ideal for everyone. Maybe I need to learn a bit more about keywording and getting my structure right; there is lots for me still to learn. Also, a synonym is a child keyword, so I guess PL is dealing with this correctly (although I am not in a position to comment on this), and IMatch just has a unique way of dealing with it.

Those who wish to retain their sanity and retain keyword integrity across years, even decades, and multiple applications, will avoid hierarchical keywords. Sooner or later your hard work will simply get blown up. You probably won’t notice for three or four months and it will be nigh impossible to clean up the mess.

If anyone wants to play in the deep end with hierarchical keywords, s/he should be using an application like PhotoMechanic which is dedicated to handling metadata and resolving metadata conflicts.

That said, I’ve come around to liking the idea of metadata editing in PhotoLab (one work bench, like in the good old days of Aperture), but with the caveat that the metadata is stored where it belongs: in the .xmp sidecar, in standard format, in real time.

Good luck with the hierarchical keywords! I’d love to be proven wrong.


@mwsilvers
I’m with you on this, my experience mirrors yours, Long live simplicity!!!
