From 865bb91f7ff9364d96dd4c2f06df911d23ed10a7 Mon Sep 17 00:00:00 2001
From: Michael Gutteridge
Date: Thu, 23 Apr 2026 18:27:16 -0700
Subject: [PATCH 1/2] Add section for regulated data

---
 _compdemos/tmpdir.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/_compdemos/tmpdir.md b/_compdemos/tmpdir.md
index df149888..78b2c80e 100644
--- a/_compdemos/tmpdir.md
+++ b/_compdemos/tmpdir.md
@@ -32,6 +32,14 @@ As part of the job allocation process, Slurm will set up a bespoke temporary dir
 
 Login nodes- the rhino and maestro hosts- don't have any special handling of temporary files. In most cases, temporary files are created in `/tmp`. This path is larger than on the compute nodes, but is still limited in size. Since these hosts aren't rebooted frequently, this path can fill up. Please take care when running your applications on these nodes- consider using networked file systems like [temp](/scicomputing/store_temp/), particularly as the size of your datasets gets larger.
 
+## Regulated Data and TMPDIR
+
+When using regulated data you'll need to determine how to handle temporary files. If a dataset requires storage in a specific location (as NIH GDS covered data does), you'll want to make sure to configure your environment and tools to use directories in those paths. This isn't always necessary; it depends on the nature of the temporary data and the regulations covering that data.
+
+### NIH GDS Covered Data
+
+Data covered by the NIH GDS needs to be stored in the _regulated_ storage service. Temporary files generated from those datasets may also need to be stored in _regulated_. In those cases, there is a `temp` subdirectory configured for storage of this data. Each user with access to a regulated dataset will have a temporary directory configured in _regulated_.
+
 ## Application Specific Notes
 
 > This section describes how to configure temporary directories for applications which don't honor the TMPDIR environment variable convention.
From c1212a81baf5412bfcc1ce50dd2db6addff9bbf4 Mon Sep 17 00:00:00 2001
From: Michael Gutteridge
Date: Thu, 23 Apr 2026 18:34:18 -0700
Subject: [PATCH 2/2] Add note on proof

---
 _compdemos/tmpdir.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/_compdemos/tmpdir.md b/_compdemos/tmpdir.md
index 78b2c80e..30e42e93 100644
--- a/_compdemos/tmpdir.md
+++ b/_compdemos/tmpdir.md
@@ -40,6 +40,8 @@ When using regulated data you'll need to determine how to handle temporary files
 
 Data covered by the NIH GDS needs to be stored in the _regulated_ storage service. Temporary files generated from those datasets may also need to be stored in _regulated_. In those cases, there is a `temp` subdirectory configured for storage of this data. Each user with access to a regulated dataset will have a temporary directory configured in _regulated_.
 
+PROOF can automatically handle temporary file placement for regulated data. More information on using GDS covered data in PROOF can be found [here](https://sciwiki.fredhutch.org/datademos/proof-regulated/).
+
 ## Application Specific Notes
 
 > This section describes how to configure temporary directories for applications which don't honor the TMPDIR environment variable convention.
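
The NIH GDS section added in this series asks users to configure their environment to use a directory in _regulated_, but does not show how. A minimal sketch of one way to do that in a shell session or job script follows. The path `/path/to/regulated/temp` and the names `REGULATED_TMP` and `use_regulated_tmp` are hypothetical placeholders, not values taken from the patch; the actual location of the per-user `temp` directory would need to be confirmed before this is used.

```shell
#!/usr/bin/env bash
# Sketch only: the path below is a hypothetical placeholder, not the real
# location of the regulated temp directory. Substitute the path you were given.
REGULATED_TMP="${REGULATED_TMP:-/path/to/regulated/temp/${USER:-$(id -un)}}"

# use_regulated_tmp DIR
# Point TMPDIR at DIR, but only if DIR is an existing, writable directory.
# On failure, TMPDIR is left unchanged and a warning is written to stderr.
use_regulated_tmp() {
    local dir="$1"
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        export TMPDIR="$dir"
    else
        echo "warning: $dir is not a writable directory; TMPDIR left unchanged" >&2
        return 1
    fi
}

# Try the placeholder location; do not abort the script if it is unavailable.
use_regulated_tmp "$REGULATED_TMP" || true
```

When the function succeeds, tools that honor `TMPDIR`, such as `mktemp` called without a template, create their files under that directory. Checking that the directory is writable before exporting avoids silently sending temporary files to a path that does not exist.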