Skip to content

Conversation

@labkey-adam
Copy link
Contributor

@labkey-adam labkey-adam commented Jan 11, 2026

Rationale

Improve the core.Documents query by joining to each document's parent and providing ParentDescription and Orphaned columns. This makes it easier to find each parent document and to identify attachments that have been orphaned. https://github.com/LabKey/internal-issues/issues/711

Also, fix the extraction of entity IDs from data class LSIDs. At some point, the extraction method got out-of-sync with the LSIDs we produce, which resulted in data class attachment rows not getting correctly populated with their parent type. Once we've tested this PR, I'll add a new SQL script to re-run the upgrade code that populates the ParentType column.

Some notes and caveats:

  • All orphaned attachments have a null parent description, so filtering on ParentDescription IS BLANK is a reasonable way to find orphans. However, it may be possible for a parent to have a null description (e.g., a parent document with a null title); the hidden boolean field Orphaned provides a definitive check.
  • For now, "global" attachments all resolve to "Global Attachment Parent" instead of the specific Notebook or Notebook Entry. This will be reconciled once the global attachment approach is integrated into the standard attachment service.
  • Some attachments (e.g., look and feel resources, file system snapshots) use the container as their parent, so the parent description is simply the name of the container (or <Root>).

Changes

  • Add AttachmentParentType.getSelectEntityIdAndDescriptionSql() that returns SQL that selects the parents' EntityId and a Description that identifies that parent. Rework getSelectParentEntityIdsSql() to simply wrap the new method, selecting only EntityId (to allow use within an IN clause).
  • Implement getSelectEntityIdAndDescriptionSql() on every attachment parent. Stop overriding getSelectParentEntityIdsSql() and disallow return of null.
  • Fix Lsid.getSqlExpressionToExtractObjectId() so it correctly extracts EntityIds from data class LSIDs. Add a junit test to keep this working.
  • Add LabKeyCollectors.joining(SQLFragment) that joins a Stream<SQLFragment> into a single SQLFragment with a delimiter.
  • Simplify and adjust SQLFragment.join() to correctly handle corner cases like CTEs and temp tokens.

Tasks 📍

  • Manual test AdjudicationAssayResultType @cnathe
  • Manual test PanoramaPublicLogoResourceType & CatalogImageAttachmentType @labkey-jeckels
  • Manual test ExpDataClassType @labkey-jeckels
  • Testing a specific attachment parent type typically entails:
    1. Add one or more attachments of the appropriate type
    2. Clear the core.Documents.ParentType column via direct DB tool (e.g., UPDATE core.Documents SET ParentType = NULL)
    3. Visit Admin Console -> SQL Scripts -> Upgrade Code -> Invoke "Core: populateAttachmentParentTypeColumn (@DeferredUpgrade)"
    4. View core.Documents to ensure all attachments of that type are tagged with the appropriate ParentType and show a reasonable ParentDescription
  • General manual testing @labkey-tchad
    • Ensure orphaned attachments are highlighted (likely requires direct database delete of parent rows)
    • Exercise Attachments action from admin console. This is also available as (hidden query) core.DocumentsGroupedByParentType, which allows querying specific projects and folders with a container filter applied.
    • Exercise core.Documents query and verify reasonable parent types and descriptions
    • Verify that hidden action admin-findAttachmentParents.view resolves attachments (except orphaned and global)
  • Needs Automation - junit test added for verifying getSqlExpressionToExtractObjectId(). Tests already exist for core.Documents.
  • Teamcity review and merge

@labkey-adam labkey-adam self-assigned this Jan 12, 2026
@labkey-adam labkey-adam added this to the 25.03 milestone Jan 12, 2026
@labkey-adam labkey-adam removed this from the 25.03 milestone Jan 13, 2026
@labkey-robert labkey-robert added this to the 26.02 milestone Jan 13, 2026
@labkey-jeckels
Copy link
Contributor

@labkey-adam I tested those attachment parent types and they seemed to work as I'd expect. However, I don't understand your suggested validation steps. When I null out ParentType via direct SQL, it doesn't come back as your third step implies. I wouldn't expect it to. Am I misunderstanding the final step?

Also, FWIW, the core.Documents grid renders very slowly now, taking 20-30 seconds. It's doing tens of thousands of queries against exp.DomainDescriptor with stacks like this. I have enough containers that it must be thrashing the cache. Maybe we don't care because the adjudication module is not commonly deployed, but perhaps this should be switched to query for all domains that match instead of asking each container if it has one.

at org.labkey.api.data.dialect.StatementWrapper.afterExecute(StatementWrapper.java:2821)
at org.labkey.api.data.dialect.StatementWrapper.executeQuery(StatementWrapper.java:1260)
at org.labkey.api.data.SqlExecutingSelector$ExecutingResultSetFactory.executeQuery(SqlExecutingSelector.java:501)
at org.labkey.api.data.SqlExecutingSelector$ExecutingResultSetFactory.handleResultSet(SqlExecutingSelector.java:403)
at org.labkey.api.data.BaseSelector.getArrayList(BaseSelector.java:113)
at org.labkey.api.data.BaseSelector.getArrayList(BaseSelector.java:108)
at org.labkey.api.exp.OntologyManager.fetchDomainDescriptorFromDB(OntologyManager.java:176)
at org.labkey.api.exp.OntologyManager.fetchDomainDescriptorFromDB(OntologyManager.java:162)
at org.labkey.api.exp.OntologyManager.lambda$static$0(OntologyManager.java:156)
at org.labkey.api.cache.BlockingCache.load(BlockingCache.java:190)
at org.labkey.api.data.DatabaseCache$BlockingDatabaseCache.load(DatabaseCache.java:86)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:162)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:90)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:228)
at org.labkey.api.exp.OntologyManager.getDomainDescriptor(OntologyManager.java:2482)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:139)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:132)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:106)
at org.labkey.adjudication.AdjudicationSchema.getAssayResultsDomainIfExists(AdjudicationSchema.java:168)
at org.labkey.adjudication.AdjudicationAssayResultType.lambda$getSelectEntityIdAndDescriptionSql$0(AdjudicationAssayResultType.java:54)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1118)
at org.labkey.adjudication.AdjudicationAssayResultType.getSelectEntityIdAndDescriptionSql(AdjudicationAssayResultType.java:53)
at org.labkey.core.query.DocumentsTable.lambda$getFromSQL$0(DocumentsTable.java:49)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:214)
at java.base/java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1788)

@labkey-adam
Copy link
Contributor Author

@labkey-adam I tested those attachment parent types and they seemed to work as I'd expect. However, I don't understand your suggested validation steps. When I null out ParentType via direct SQL, it doesn't come back as your third step implies. I wouldn't expect it to. Am I misunderstanding the final step?

Also, FWIW, the core.Documents grid renders very slowly now, taking 20-30 seconds. It's doing tens of thousands of queries against exp.DomainDescriptor with stacks like this. I have enough containers that it must be thrashing the cache. Maybe we don't care because the adjudication module is not commonly deployed, but perhaps this should be switched to query for all domains that match instead of asking each container if it has one.

at org.labkey.api.data.dialect.StatementWrapper.afterExecute(StatementWrapper.java:2821)
at org.labkey.api.data.dialect.StatementWrapper.executeQuery(StatementWrapper.java:1260)
at org.labkey.api.data.SqlExecutingSelector$ExecutingResultSetFactory.executeQuery(SqlExecutingSelector.java:501)
at org.labkey.api.data.SqlExecutingSelector$ExecutingResultSetFactory.handleResultSet(SqlExecutingSelector.java:403)
at org.labkey.api.data.BaseSelector.getArrayList(BaseSelector.java:113)
at org.labkey.api.data.BaseSelector.getArrayList(BaseSelector.java:108)
at org.labkey.api.exp.OntologyManager.fetchDomainDescriptorFromDB(OntologyManager.java:176)
at org.labkey.api.exp.OntologyManager.fetchDomainDescriptorFromDB(OntologyManager.java:162)
at org.labkey.api.exp.OntologyManager.lambda$static$0(OntologyManager.java:156)
at org.labkey.api.cache.BlockingCache.load(BlockingCache.java:190)
at org.labkey.api.data.DatabaseCache$BlockingDatabaseCache.load(DatabaseCache.java:86)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:162)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:90)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:228)
at org.labkey.api.exp.OntologyManager.getDomainDescriptor(OntologyManager.java:2482)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:139)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:132)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:106)
at org.labkey.adjudication.AdjudicationSchema.getAssayResultsDomainIfExists(AdjudicationSchema.java:168)
at org.labkey.adjudication.AdjudicationAssayResultType.lambda$getSelectEntityIdAndDescriptionSql$0(AdjudicationAssayResultType.java:54)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1118)
at org.labkey.adjudication.AdjudicationAssayResultType.getSelectEntityIdAndDescriptionSql(AdjudicationAssayResultType.java:53)
at org.labkey.core.query.DocumentsTable.lambda$getFromSQL$0(DocumentsTable.java:49)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:214)
at java.base/java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1788)

Thanks... I omitted the important step of invoking the upgrade code manually. I've updated the steps to include that.

I'll look at querying the domains globally, although I didn't see a service method for that.

@labkey-adam
Copy link
Contributor Author

@labkey-adam I tested those attachment parent types and they seemed to work as I'd expect. However, I don't understand your suggested validation steps. When I null out ParentType via direct SQL, it doesn't come back as your third step implies. I wouldn't expect it to. Am I misunderstanding the final step?
Also, FWIW, the core.Documents grid renders very slowly now, taking 20-30 seconds. It's doing tens of thousands of queries against exp.DomainDescriptor with stacks like this. I have enough containers that it must be thrashing the cache. Maybe we don't care because the adjudication module is not commonly deployed, but perhaps this should be switched to query for all domains that match instead of asking each container if it has one.

at org.labkey.api.data.dialect.StatementWrapper.afterExecute(StatementWrapper.java:2821)
at org.labkey.api.data.dialect.StatementWrapper.executeQuery(StatementWrapper.java:1260)
at org.labkey.api.data.SqlExecutingSelector$ExecutingResultSetFactory.executeQuery(SqlExecutingSelector.java:501)
at org.labkey.api.data.SqlExecutingSelector$ExecutingResultSetFactory.handleResultSet(SqlExecutingSelector.java:403)
at org.labkey.api.data.BaseSelector.getArrayList(BaseSelector.java:113)
at org.labkey.api.data.BaseSelector.getArrayList(BaseSelector.java:108)
at org.labkey.api.exp.OntologyManager.fetchDomainDescriptorFromDB(OntologyManager.java:176)
at org.labkey.api.exp.OntologyManager.fetchDomainDescriptorFromDB(OntologyManager.java:162)
at org.labkey.api.exp.OntologyManager.lambda$static$0(OntologyManager.java:156)
at org.labkey.api.cache.BlockingCache.load(BlockingCache.java:190)
at org.labkey.api.data.DatabaseCache$BlockingDatabaseCache.load(DatabaseCache.java:86)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:162)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:90)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:228)
at org.labkey.api.exp.OntologyManager.getDomainDescriptor(OntologyManager.java:2482)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:139)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:132)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:106)
at org.labkey.adjudication.AdjudicationSchema.getAssayResultsDomainIfExists(AdjudicationSchema.java:168)
at org.labkey.adjudication.AdjudicationAssayResultType.lambda$getSelectEntityIdAndDescriptionSql$0(AdjudicationAssayResultType.java:54)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1118)
at org.labkey.adjudication.AdjudicationAssayResultType.getSelectEntityIdAndDescriptionSql(AdjudicationAssayResultType.java:53)
at org.labkey.core.query.DocumentsTable.lambda$getFromSQL$0(DocumentsTable.java:49)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:214)
at java.base/java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1788)

Thanks... I omitted the important step of invoking the upgrade code manually. I've updated the steps to include that.

I'll look at querying the domains globally, although I didn't see a service method for that.

@labkey-jeckels I added PropertyService.getContainersWithDomains() and used it for both lists and adjudication assay results. Let me know if that improved query performance for you.

@labkey-jeckels
Copy link
Contributor

@labkey-adam I tested those attachment parent types and they seemed to work as I'd expect. However, I don't understand your suggested validation steps. When I null out ParentType via direct SQL, it doesn't come back as your third step implies. I wouldn't expect it to. Am I misunderstanding the final step?
Also, FWIW, the core.Documents grid renders very slowly now, taking 20-30 seconds. It's doing tens of thousands of queries against exp.DomainDescriptor with stacks like this. I have enough containers that it must be thrashing the cache. Maybe we don't care because the adjudication module is not commonly deployed, but perhaps this should be switched to query for all domains that match instead of asking each container if it has one.

at org.labkey.api.data.dialect.StatementWrapper.afterExecute(StatementWrapper.java:2821)
at org.labkey.api.data.dialect.StatementWrapper.executeQuery(StatementWrapper.java:1260)
at org.labkey.api.data.SqlExecutingSelector$ExecutingResultSetFactory.executeQuery(SqlExecutingSelector.java:501)
at org.labkey.api.data.SqlExecutingSelector$ExecutingResultSetFactory.handleResultSet(SqlExecutingSelector.java:403)
at org.labkey.api.data.BaseSelector.getArrayList(BaseSelector.java:113)
at org.labkey.api.data.BaseSelector.getArrayList(BaseSelector.java:108)
at org.labkey.api.exp.OntologyManager.fetchDomainDescriptorFromDB(OntologyManager.java:176)
at org.labkey.api.exp.OntologyManager.fetchDomainDescriptorFromDB(OntologyManager.java:162)
at org.labkey.api.exp.OntologyManager.lambda$static$0(OntologyManager.java:156)
at org.labkey.api.cache.BlockingCache.load(BlockingCache.java:190)
at org.labkey.api.data.DatabaseCache$BlockingDatabaseCache.load(DatabaseCache.java:86)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:162)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:90)
at org.labkey.api.cache.BlockingCache.get(BlockingCache.java:228)
at org.labkey.api.exp.OntologyManager.getDomainDescriptor(OntologyManager.java:2482)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:139)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:132)
at org.labkey.experiment.api.property.PropertyServiceImpl.getDomain(PropertyServiceImpl.java:106)
at org.labkey.adjudication.AdjudicationSchema.getAssayResultsDomainIfExists(AdjudicationSchema.java:168)
at org.labkey.adjudication.AdjudicationAssayResultType.lambda$getSelectEntityIdAndDescriptionSql$0(AdjudicationAssayResultType.java:54)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1118)
at org.labkey.adjudication.AdjudicationAssayResultType.getSelectEntityIdAndDescriptionSql(AdjudicationAssayResultType.java:53)
at org.labkey.core.query.DocumentsTable.lambda$getFromSQL$0(DocumentsTable.java:49)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:214)
at java.base/java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1788)

Thanks... I omitted the important step of invoking the upgrade code manually. I've updated the steps to include that.
I'll look at querying the domains globally, although I didn't see a service method for that.

@labkey-jeckels I added PropertyService.getContainersWithDomains() and used it for both lists and adjudication assay results. Let me know if that improved query performance for you.

Perf is much better now. <1 sec to render the grid across all containers.

Adding that missing repro step repopulates ParentType and enables the join to the Description as expected.

@labkey-tchad
Copy link
Member

"Ensure orphaned attachments are highlighted"

This just means "orphaned=true", right? Not that there's some conditional formatting on the grid or something.

@labkey-adam
Copy link
Contributor Author

"Ensure orphaned attachments are highlighted"

This just means "orphaned=true", right? Not that there's some conditional formatting on the grid or something.

Correct

@labkey-tchad
Copy link
Member

findAttachmentParents seems confused by "LookAndFeelResource" attachments. It ties them to multiple tables.

image

@labkey-tchad
Copy link
Member

Verify that hidden action admin-findAttachmentParents.view resolves attachments (except orphaned and global)

Both orphaned and global attachments appear on findAttachmentParents. Not sure whether this is actually a concern.

@labkey-adam
Copy link
Contributor Author

labkey-adam commented Jan 15, 2026

Verify that hidden action admin-findAttachmentParents.view resolves attachments (except orphaned and global)

Both orphaned and global attachments appear on findAttachmentParents. Not sure whether this is actually a concern.

Not a concern; they're meant to appear on the page but won't resolve to any tables.

@labkey-adam
Copy link
Contributor Author

labkey-adam commented Jan 15, 2026

findAttachmentParents seems confused by "LookAndFeelResource" attachments. It ties them to multiple tables.

image

@labkey-tchad Good catch. Attachments with a Container as parent were resolving to any table with rows in that container (via that table's Container column or similar). I now exclude the container columns to avoid these spurious joins. Try updating and building the latest. There's still a bit of duplication (e.g., data class attachments will be associated with exp.Data in addition to the data class provisioned table), but this hidden action is really just an orphan debugging tool, so it doesn't need to be perfect.

@cnathe
Copy link
Contributor

cnathe commented Jan 15, 2026

@labkey-adam Issue found with core.Documents query grid view locally. Per chat with Adam, this looks to be a list issue with a casing discrepancy for the key name:
Screenshot 2026-01-15 at 8 47 25 AM

@cnathe
Copy link
Contributor

cnathe commented Jan 15, 2026

@labkey-adam Issue found with core.Documents query grid view locally. Per chat with Adam, this looks to be a list issue with a casing discrepancy for the key name: Screenshot 2026-01-15 at 8 47 25 AM

Fix verified. Looks good. Tested in PG and MSSQL

@labkey-tchad
Copy link
Member

@labkey-tchad Good catch. Attachments with a Container as parent were resolving to any table with rows in that container (via that table's Container column or similar). I now exclude the container columns to avoid these spurious joins. Try updating and building the latest. There's still a bit of duplication (e.g., data class attachments will be associated with exp.Data in addition to the data class provisioned table), but this hidden action is really just an orphan debugging tool, so it doesn't need to be perfect.

Fix verified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants