Commit f9bb485

Update passing_files_to_DCP.md
1 parent 851e295 commit f9bb485

1 file changed

+8
-5
lines changed

documentation/DCP-documentation/passing_files_to_DCP.md

Lines changed: 8 additions & 5 deletions
@@ -2,6 +2,12 @@

 Distributed-CellProfiler can be told what files to use through LoadData.csv, Batch Files, or file lists.

+## Metadata use in DCP
+
+Distributed-CellProfiler requires metadata and grouping in order to split jobs. This means that, unlike in a generic CellProfiler workflow, metadata and grouping are NOT optional for pipelines you wish to use in Distributed-CellProfiler.
+- If using LoadData, this means ensuring that your input CSV has some metadata to use for grouping and that "Group images by metadata?" is set to "Yes".
+- If using batch files or file lists, this means ensuring that the Metadata and Groups modules are enabled, and that you are extracting metadata from file and folder names _that will also be present in your remote system_ in the Metadata module in your CellProfiler pipeline. You can pass additional metadata to CellProfiler by clicking `Add another extraction method`, setting the method to `Import from file`, and setting Metadata file location to `Default Input Folder`. Metadata of either type can be used for grouping.
+
 ## Load Data

 ![LoadData.csv](images/LoadDataCSV.png)
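
For the LoadData case in the added "Metadata use in DCP" section, a quick pre-submission check can confirm that the input CSV actually carries metadata columns to group on. The sketch below is only an illustration: it assumes the CellProfiler convention that LoadData metadata columns are named with a `Metadata_` prefix, and the helper name `metadata_columns`, the channel name `OrigDNA`, and the sample row are all hypothetical.

```python
import csv
import io


def metadata_columns(csv_text: str) -> list[str]:
    """Return the header names that start with 'Metadata_' (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty CSV yields an empty header
    return [name for name in header if name.startswith("Metadata_")]


# Hypothetical LoadData-style CSV; the path and file name are placeholders.
example = (
    "FileName_OrigDNA,PathName_OrigDNA,Metadata_Plate,Metadata_Well\n"
    "img_A01.tif,/bucket/example_project/images,Plate1,A01\n"
)

found = metadata_columns(example)
if not found:
    raise SystemExit("No Metadata_ columns found: nothing to group by.")
print(found)
```

A check like this only covers the CSV side; the "Group images by metadata?" setting still has to be enabled in the pipeline itself.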
@@ -58,15 +64,12 @@ Note that if you do not follow our standard file organization, under **#not proj

 ## File lists

-You can also simply pass a list of absolute file paths (not relative paths) with one file per row in `.txt` format.
-Note that file lists themselves do not associate metadata with file paths (in contrast to LoadData.csv files where you can enter any metadata columns you desire.)
-Therefore, you need to extract metadata for Distributed-CellProfiler to use for grouping by extracting metadata from file and folder names in the Metadata module in your CellProfiler pipeline.
-You can pass additional metadata to CellProfiler by `Add another extraction method`, setting the method to `Import from file` and setting Metadata file location to `Default Input Folder`.
+You can also simply pass a list of absolute file paths (not relative paths) with one file per row in `.txt` format. These must be the absolute paths as Distributed-CellProfiler will see them, that is, paths starting from the root of your bucket (which will be mounted as `/bucket`).

 ### Creating File Lists

 Use any text editing software to create a `.txt` file where each line of the file is a path to a single image that you want to process.

 ### Using File Lists

-To use a file list with submitJobs, put the path to the `.txt` file in **data_file:**.
+To use a file list with submitJobs, put the path to the `.txt` file in **data_file:**.
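
As an alternative to typing the file list by hand, the paths can be generated. The following is a minimal sketch, assuming you already have the images' locations relative to the bucket root and that the bucket is mounted at `/bucket` as the updated text states. The helper names `to_mounted_path` and `build_file_list` and the example keys are hypothetical, not part of Distributed-CellProfiler.

```python
from pathlib import PurePosixPath

# Mount point taken from the documentation text; adjust if your setup differs.
BUCKET_MOUNT = PurePosixPath("/bucket")


def to_mounted_path(bucket_key: str) -> str:
    """Turn a path relative to the bucket root into the absolute mounted path."""
    # Strip any leading slash so the join does not discard the mount point.
    key = bucket_key.lstrip("/")
    return str(BUCKET_MOUNT / key)


def build_file_list(bucket_keys: list[str]) -> str:
    """Return file-list text with one absolute path per line."""
    return "\n".join(to_mounted_path(k) for k in bucket_keys) + "\n"


# Hypothetical image locations relative to the bucket root.
keys = [
    "example_project/images/plate1/A01_s1.tif",
    "example_project/images/plate1/A02_s1.tif",
]

file_list_text = build_file_list(keys)
print(file_list_text, end="")
```

The resulting text can be saved as a `.txt` file and referenced from **data_file:**. Because file lists carry no metadata themselves, the Metadata and Groups modules in the pipeline still need to extract grouping metadata from these paths.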
