Duplicate keywords appear at top of hierarchy

I am doing something to cause a problem with my keywords and I cannot figure out what it is. I’m busy installing PL 5.0.1 to get a precise set of steps to reproduce the problem. In the meantime, I will describe the symptom here:

Suppose I have a top-level keyword “People” in my hierarchical keyword list with a sub-keyword “Fred Mertz”. That is, I have a keyword “People > Fred Mertz” which is attached to, say, 1000 images.

When I work on a collection of 10 photos which are tagged as Fred Mertz, then suddenly sub-keyword “Fred Mertz” is replaced with a new top-level keyword “Fred Mertz”. When I look at the count of photos in the Keywords section of the Photolibrary, I see that 10 photos are tagged with the new fake top-level keyword, while 990 remain tagged with the correct sub-keyword.

When I search for “Fred Mertz”, I see there are two keywords with that spelling - the top-level one and the hierarchical one.

I think this is a bug but, more importantly, I need to understand what I am doing to cause this so I can avoid those gestures and get on with my life. Right now I am in a hellish cycle of creating duplicate keywords at the top level, working hard to fix them, only to have more duplicates appear.

Do you realise that a keyword can exist at more than one level? For example:

Orange
Fruit | Orange | Satsuma
Colour | Orange
Enterprise | Telecommunications | Orange

The way that DxO have decided to implement this is that each keyword is identified in its hierarchical context. So looking for Orange alone will only return those images that have been tagged with the standalone keyword.

In order to look for Colour | Orange you will have to specify that exact hierarchy in the search.

At the moment, the search engine cannot do sophisticated searches for “Orange, wherever it appears” and you need to take care to check how you are marking images in order to get things to work according to the present implementation.
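To make the "identified in its hierarchical context" idea concrete, here is a minimal sketch. It is not how PhotoLab is implemented; it assumes each keyword is stored as its full path, and the function names (`parse`, `exact_matches`, `anywhere_matches`) are made up for illustration.

```python
# Each keyword is its full path, so the same word can appear in several places.
KEYWORDS = [
    ("Orange",),
    ("Fruit", "Orange", "Satsuma"),
    ("Colour", "Orange"),
    ("Enterprise", "Telecommunications", "Orange"),
]

def parse(text):
    """Turn 'Colour | Orange' into ('Colour', 'Orange')."""
    return tuple(part.strip() for part in text.split("|"))

def exact_matches(query, keywords):
    """Match only keywords whose full path equals the query path."""
    wanted = parse(query)
    return [kw for kw in keywords if kw == wanted]

def anywhere_matches(word, keywords):
    """Match any keyword whose path contains the word at any level."""
    return [kw for kw in keywords if word in kw]

print(exact_matches("Orange", KEYWORDS))           # standalone keyword only
print(exact_matches("Colour | Orange", KEYWORDS))  # needs the exact hierarchy
print(len(anywhere_matches("Orange", KEYWORDS)))   # the search that is not available yet
```

The first search behaves like the current implementation described above: a bare "Orange" finds only the standalone keyword. The last one is the "Orange, wherever it appears" search that the post says is missing.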

What you are seeing is not a bug, more an unexpected behaviour. Could you post a screenshot of the keywords section of PL, showing the hierarchies listed?

What are the things that you do before the keywords are shifted up?

  • Customising details?
  • Export and review?
  • Interactions with other applications?
  • Other?

I tried to reproduce - without success - the issue you found.

This is exactly what I am trying to figure out! Thank you for your interest in helping me. I still haven’t found a way to knowingly reproduce the problem. I do have one odd behavior that might be related but I am still working to clarify this.

The odd behavior is this: there are two ways to add a keyword, best as I can tell. 1) Begin typing into the “Add keywords” box and then arrow down to the desired keyword and hit Enter. 2) Open all nodes by clicking twisties until the desired keyword is displayed, followed by checking the box.

In the first case, only the leaf keyword is added. In the second case, all intervening keywords are added. As a real example using places, the first case assigns one keyword, such as “Bertha K. Russel Preserve” to the image. The second case adds: “Places”, “Bertha K. Russel Preserve”, “North America”, “Portsmouth”, “Rhode Island”, “United States” because the keyword I clicked on is actually Places > North America > United States > Rhode Island > Portsmouth > Bertha K. Russel Preserve.

I don’t understand why those two gestures behave so differently and I wonder if it is related to the issue I encounter where top-level keywords are errantly created that duplicate the spellings of leaf keywords.

I should note that the second case adds keywords correctly. In this case “Places”, “Places > North America”, etc. Hence this does NOT duplicate the problem.
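The difference between the two gestures can be sketched as follows. This only models the behaviour described in the posts above, not PhotoLab's actual code; the function names `leaf_only` and `with_ancestors` are hypothetical.

```python
def leaf_only(path):
    """Gesture 1 (type and press Enter): assign just the chosen keyword."""
    return [path]

def with_ancestors(path):
    """Gesture 2 (tick the box in the tree): assign the keyword and every parent."""
    return [path[:i] for i in range(1, len(path) + 1)]

path = (
    "Places", "North America", "United States",
    "Rhode Island", "Portsmouth", "Bertha K. Russel Preserve",
)

for kw in with_ancestors(path):
    print(" > ".join(kw))
```

Every entry produced by `with_ancestors` keeps its full path ("Places", "Places > North America", and so on), which matches the observation that the second gesture adds keywords correctly and does not create top-level duplicates.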

Thanks so much for taking an interest. I do understand that the spelling of a keyword is not unique across the list of keywords. This is exactly the problem because PL creates duplicates of my leaf keywords at the top level. Now I am cleaning up 9 images that got messed up. What was I doing when the mess occurred? I wish I knew!

I wonder if this is important: The images with messed up keywords are old. They were shot with a Pentax *istDS back in 2009. I was working on them to get past the idiotic problem that PL doesn’t support the camera. I could not open the DNG files so I went back into my subscription software and did a mass export to TIFF format. The messed up images are all TIFF files displaying an icon that states that “No DxO Optics Module available for this image”.

As far as I read from your answers, the issue came up when you opened the TIFFs that you exported from Lightroom?

My guess 1: Keywords added in Lr: YES
My guess 2: Keywords added in Lr: NO

Hint: Be as specific as possible in your answers…

I had found this problem between the 2 ways to add keywords, here.

Oy! You want specific? I’ll try my best.

The original keywords were definitely added in Lr. However, I just remembered what I was doing when the error occurred. Well, not exactly what I was doing but generally.

I had misspelled a keyword and I was working on the affected images, adding the correct spelling and removing the incorrect one. I was at the Newport Folk & Jazz Festival and tagged many images with:

People > Celebrities > Pete Seger

I already had a keyword with the correct spelling of Pete’s name, so I was adding the correct keyword to a bunch of images and removing the incorrectly spelled keyword. Well after I completed the task, I noticed that all my place tags for Newport, RI had become top-level tags.

Places > North America > Rhode Island > Newport

had become:

Places
North America
Rhode Island
Newport
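If the keyword list can be written down as paths, candidates for this kind of flattening can be spotted mechanically. The sketch below is only an illustration under that assumption; `find_rogue_top_level` is a made-up helper, not a feature of PL or Lr.

```python
def find_rogue_top_level(keywords):
    """Return top-level keywords whose name also appears below the top level elsewhere."""
    nested_names = {name for kw in keywords if len(kw) > 1 for name in kw[1:]}
    return sorted(kw[0] for kw in keywords if len(kw) == 1 and kw[0] in nested_names)

keywords = [
    ("Places",),
    ("Places", "North America"),
    ("Places", "North America", "Rhode Island"),
    ("Places", "North America", "Rhode Island", "Newport"),
    # Flattened copies like the ones described above:
    ("North America",),
    ("Rhode Island",),
    ("Newport",),
]

print(find_rogue_top_level(keywords))
# ['Newport', 'North America', 'Rhode Island']
```

The result is a list of candidates only. A top-level keyword that legitimately shares its spelling with a nested one (the standalone "Orange" from earlier in the thread, for example) would be flagged too, so each hit still needs a human look.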

I have been playing around with these types of actions on recent images and I cannot reproduce the problem. I will try to compare and contrast PL’s behavior with new vs. old images.

Eureka! The hierarchical keywords were added in Lr. The duplicated keyword problem occurs when I visit the folder containing the files. This is a disaster for me. What do I do? Can I fix this in Lr? I’ve cancelled my subscription (hooray!) but I have it for another two weeks.

Ooookayyy…let me think about it…

  1. Create an empty folder on your desktop.
  2. Direct Lightroom, PL5, PL4 etc. to that empty folder
  3. Quit out of Lightroom, PL5, PL4 etc.
  4. Do something else!

Wait.

Mark, can you possibly either post one of your affected files here or send it to me via a DM? Assuming they are RAW files and have XMP sidecars, both would be useful. I have a couple of ideas about sorting this mess out and need to see what you have at the moment.

Here is what I’d do to restore keyword sanity if I were in your shoes

READ through all before acting. Ask questions before acting.

  1. Open Lightroom, keeping it pointed at the empty folder (preventing further oddities with your files)
  2. Expand the Keyword List - stay away from everything else during all that follows
  3. Check which of your keyword hierarchies are still intact
  4. Fix damaged hierarchies (make sure to stay in the keyword list) and don’t touch anything correct
  5. Repeat steps 3 and 4 until the keyword list is cured
  6. Wait a while
  7. Find any rogue keywords (that might have been created during your “excursion”)
  8. Select all images that have a rogue keyword
  9. Delete the rogue keyword and fill in the correct keyword from the keyword list
  10. Repeat steps 7 to 9 until the compromised files are cured
  11. Tell Lightroom to write metadata to files (if you use Lr in its default settings)
  12. Wait until finished, quit Lightroom.

Trash the PhotoLab 5 database (look for files named DOPDatabaseV5)

  1. Open PhotoLab 5 and let it rest on the empty folder
  2. Expand the Keyword List and delete the rogue keywords from the database
  3. Tell PhotoLab to index the folder(s) that had images with rogue keywords
  4. When done, check the keyword list, it should now be free of rogue keywords
  5. Quit out of DPL if things look okay

If you’ve not customised images in PL5, that’s it.

If you have changed settings under the customize tab, redo the changes tomorrow.
If you have changed settings in PL4, make yourself noticed here.

I’ll check in again in a few hours.

Thanks for hanging in there with me. (I just split several trees for winter stove fuel so I’m not doing anything I may regret!)

I believe that I understand your advice. I have created an empty folder and PL 5 and LrC are both active there. I have quit both PL 5 and LrC. When I restart LrC, I should fix all rogue keywords. (Rogue keywords are the ones that have been duplicated in error. I have verified that these rogue keywords are visible in LrC and can be corrected there.) I must note the folder(s) containing images with rogue keywords to use when I am back in PL later in the process.

When I am done eliminating the rogue keywords and re-assigning the proper hierarchical keywords, I then set LrC to write all metadata to images. I think this step requires me to select all images in the LrC database and perform a Metadata - Save Metadata to File (Ctrl-S) command. (This will take a boatload of time for 100,000 images.)

When LrC is done writing metadata to files, then I quit LrC and say goodbye to Adobe forever.

Then I will delete the PL5 database to start over with a new database.

Next I will restart PL and delete any rogue keywords. I will then revisit the folders that contained the images with (now removed) rogue keywords and re-index those folders. I don’t exactly understand this but I’ll look up how to re-index folders.

At this point, all should be good. If I understand this properly, it occurred because I did not instruct LrC to write all metadata to files.

Hi Joanna, I am planning to follow platypus’s advice, although I’m not sure why the problem wouldn’t recur. Do you have any insight? Basically platypus would fix keywords in LrC, write all metadata to images in LrC, then start up with a new PL database. Thoughts?

Mark, having originally used LR for keywording and then migrated to PL without issues, I would do what platypus suggests as it makes perfect sense. PL reads LR keywords, including hierarchical keywords just fine.

I think that the issue was brought in, in PL5, when you added keywords (in the keyword tool) that were already present in a hierarchy as shown in the keyword list tool. Maybe going to and fro between Lr and PL, both set to automatically sync sidecar files, added to the mess too.

I set Lr and PL to NOT sync metadata automatically. This puts me in charge of metadata transfers.
My keyword master is Lightroom and I only use PL to edit metadata with copies of images for testing. Automatic metadata sync will add issues rather than solve them…

@markinlcri, have you used PL4 for extensive customising, before installing PL5?
If so, we’ll also have to consider bringing PL4 settings data over to PL5…

I shoot Fuji and only switched to PL 5 recently. Never used earlier versions. I am spent so will pick this up tomorrow. It is clear that I have been a bit naughty with updating keywords with PL and LrC, though I intended to shut down my use of LrC as soon as my evaluation of PL was over.

I do have another question that might be relevant. In LrC I added leaf nodes but never added the parent keywords in the hierarchy. Should I have? LrC had a setting to allow searches based on parents that would include all children. I used that feature and decided to avoid tagging parent keywords. I suspect that I should get LrC to add all parents to all images before switching to PL.

Another way to phrase my previous question: LrC implicitly assigns parent keywords. Should parent keywords be explicitly assigned before migrating over to PL?

I don’t think that is necessary as PL reads the hierarchy just fine with only the leaf nodes assigned. You can assign parents and all intermediate nodes in PL later if you wish.

I personally prefer to just have the leaf nodes assigned and I do not like the addition of all combinations of the hierarchy cluttering my keywords.

…you’d also want to delete the .dop files before indexing again. PL5 stores keywords in these files too and might therefore reintroduce the rogue keywords…
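For anyone who prefers to script that last step, here is a cautious sketch. It assumes the sidecars are simply files ending in `.dop` next to the images; the helper names are made up, and the file names in the demo are invented. It runs against a throwaway temporary folder and defaults to a dry run, so nothing real is touched unless you point it at your own folder and pass `dry_run=False`.

```python
import tempfile
from pathlib import Path

def find_dop_sidecars(folder):
    """Return all files ending in .dop under folder, including subfolders."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() == ".dop"
    )

def remove_sidecars(paths, dry_run=True):
    """Print each sidecar; delete it only when dry_run is False."""
    for p in paths:
        print(("would delete " if dry_run else "deleting ") + str(p))
        if not dry_run:
            p.unlink()
    return len(paths)

# Demonstration on a temporary folder with invented file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for name in ["a.tif", "a.tif.dop", "sub/b.dng", "sub/b.dng.dop"]:
        (root / name).touch()

    found = find_dop_sidecars(root)
    remove_sidecars(found)                 # dry run: only prints
    remove_sidecars(found, dry_run=False)  # actually deletes the sidecars
    remaining = sorted(p.name for p in root.rglob("*") if p.is_file())

print(remaining)
```

Keep in mind that .dop files also hold the corrections made under the Customize tab, so deleting them discards those edits as well. Do a dry run first and back up anything you are not sure about.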