===================================
Workflow Parameters
===================================

The workflow parameters should be included in a configuration file, an example of which can be found at
https://raw.githubusercontent.com/mriffle/nf-carafe-ai-ms/main/resources/pipeline.config

The parameters in this file should be changed to indicate the locations of your data, the options you'd like
to use for the software included in the workflow, and the capabilities and configuration of the system on
which you are running the workflow steps.

The configuration file is roughly organized as:

.. code-block:: groovy

    params {
        ...
    }

    profiles {
        ...
    }

    mail {
        ...
    }

- The ``params`` section includes locations of data and configuration options for a specific run of the workflow.
- The ``profiles`` section includes parameters that describe the capabilities of the systems that run the steps of the workflow. For example, if running on your local system, this will include things like how many cores and how much RAM may be used by the steps of the workflow. This will not need to be changed for each run of the workflow.
- The ``mail`` section includes configuration options for sending email. This is optional and only necessary if you wish to send emails when the workflow completes. This will not need to be changed for each run of the workflow.

Below is a complete description of all parameters that may be included in these sections.

.. note::

    This workflow can process files stored in **PanoramaWeb**. When specifying directories or file locations,
    any paths that begin with ``https://`` will be interpreted as being PanoramaWeb locations. For example,
    to process a single raw file stored in PanoramaWeb, you would have the following in your
    ``pipeline.config`` file:

    .. code-block:: bash

        spectra_file = 'https://panoramaweb.org/_webdav/path/to/@files/RawFiles/my_file.raw'

    To process multiple files from a PanoramaWeb directory:

    .. code-block:: bash

        spectra_dir = 'https://panoramaweb.org/_webdav/path/to/@files/RawFiles'
        spectra_dir_glob = '*.raw'

    Here, the URL is the WebDAV URL of the file or directory on the Panorama server.

    **Bruker data on PanoramaWeb:** Only ``.d.zip`` files can be downloaded from PanoramaWeb (not ``.d``
    directories). Use a glob pattern like ``*.d.zip`` when processing Bruker data from PanoramaWeb.

The ``params`` Section
^^^^^^^^^^^^^^^^^^^^^^^

.. list-table:: Parameters for the ``params`` section
   :widths: 5 20 75
   :header-rows: 1

   * - Req?
     - Parameter Name
     - Description
   * - ✓
     - ``carafe_fasta_file``
     - FASTA file used by Carafe to generate the final spectral library.
   * - \*
     - ``spectra_file``
     - Path to a single spectra file or directory to process. Supported types: Thermo RAW (``.raw``), mzML (``.mzML``), Bruker raw directory (``.d``), or Bruker zipped raw (``.d.zip``). May be a local path, S3 URI, or PanoramaWeb URL. Note: Bruker ``.d`` directories cannot be downloaded from PanoramaWeb; use ``.d.zip`` files instead. Mutually exclusive with ``spectra_dir``.
   * - \*
     - ``spectra_dir``
     - Path to a directory containing spectra files (local path or PanoramaWeb WebDAV URL). Supported file types: ``.raw``, ``.mzML``, ``.d``, or ``.d.zip``. Note: Bruker ``.d`` directories cannot be downloaded from PanoramaWeb; use ``.d.zip`` files instead. Mutually exclusive with ``spectra_file``. Use with ``spectra_dir_glob`` to select which files to process.
   * -
     - ``spectra_dir_glob``
     - Glob pattern to select files from ``spectra_dir``. All matched files must be the same type (``.raw``, ``.mzML``, ``.d``, or ``.d.zip``). Default: ``'*.raw'``
   * -
     - ``output_format``
     - The final output format of the generated spectral library. Must be one of ``'diann'`` or ``'encyclopedia'``. Default: ``'diann'``
   * -
     - ``cli_options``
     - Command line options to pass to Carafe. The default includes sensible settings for most general DIA searches. Do not set the ``-mode``, ``-varMod``, ``-maxVar``, ``-ms``, ``-db``, ``-i``, ``-se``, ``-lf_type``, or ``-device`` parameters; these are handled by the workflow. See https://github.com/Noble-Lab/Carafe for more details.
   * -
     - ``include_phosphorylation``
     - Set to ``true`` to include phosphorylation (STY) as a variable modification in the Carafe spectral library. Default: ``false``.
   * -
     - ``include_oxidized_methionine``
     - Set to ``true`` to include oxidized methionine (M) as a variable modification in the Carafe spectral library. Default: ``false``.
   * -
     - ``max_mod_option``
     - The maximum number of variable modifications allowed per peptide, specified as a Carafe CLI argument. Ignored if no variable modifications are enabled. Default: ``'-maxVar 1'``.
   * -
     - ``diann_fasta_file``
     - The FASTA file used by DIA-NN. If not set, ``carafe_fasta_file`` will be used. Default: not set.
   * -
     - ``diann_params``
     - The command line parameters passed to DIA-NN. Default: ``'--unimod4 --qvalue 0.01 --cut \'K*,R*,!*P\' --reanalyse --smart-profiling'``
   * -
     - ``peptide_results_file``
     - The path to a ``.tsv`` or ``.parquet`` file output by DIA-NN containing peptide identifications. If this parameter is set, the DIA-NN search will be skipped and this file will be used instead. Default: none (run DIA-NN).
   * -
     - ``msconvert.do_demultiplex``
     - If starting with raw files, this is the value used by ``msconvert`` for the ``do_demultiplex`` parameter. Default: ``true``.
   * -
     - ``msconvert.do_simasspectra``
     - If starting with raw files, this is the value used by ``msconvert`` for the ``do_simasspectra`` parameter. Default: ``true``.
   * -
     - ``email``
     - The email address to which a notification should be sent upon workflow completion. If no email is specified, no email will be sent. To send email, you must configure mail server settings (see below).

The ``profiles`` Section
^^^^^^^^^^^^^^^^^^^^^^^^

The example configuration file includes this ``profiles`` section:

.. code-block:: groovy

    profiles {

        // "standard" is the profile used when the steps of the workflow are run
        // locally on your computer. These parameters should be changed to match
        // your system resources (that you are willing to devote to running
        // workflow jobs).
        standard {
            params.max_memory = '8.GB'
            params.max_cpus = 4
            params.max_time = '240.h'

            params.mzml_cache_directory = '/data/mass_spec/nextflow/nf-carafe-ai-ms/mzml_cache'
            params.panorama_cache_directory = '/data/mass_spec/nextflow/panorama/raw_cache'
        }
    }

These parameters describe the capabilities of your local computer for running the steps of the workflow. Below is a description of each parameter:

.. list-table:: Parameters for the ``profiles/standard`` section
   :widths: 5 20 75
   :header-rows: 1

   * - Req?
     - Parameter Name
     - Description
   * - ✓
     - ``params.max_memory``
     - The maximum amount of RAM that may be used by steps of the workflow. Default: 8 gigabytes.
   * - ✓
     - ``params.max_cpus``
     - The number of cores that may be used by the workflow. Default: 4 cores.
   * - ✓
     - ``params.max_time``
     - The maximum amount of time a step in the workflow may run before it is stopped and an error is generated. Default: 240 hours.
   * - ✓
     - ``params.mzml_cache_directory``
     - When ``msconvert`` converts a RAW file to mzML, the mzML file is cached for future use. This specifies the directory in which the cached mzML files are stored.
   * - ✓
     - ``params.panorama_cache_directory``
     - If the RAW files to be processed are in PanoramaWeb, the RAW files will be downloaded to and cached in this directory for future use.

The ``mail`` Section
^^^^^^^^^^^^^^^^^^^^^^^

This is a more advanced and entirely optional set of parameters. When the workflow completes, it can optionally send an email to the address specified above in the ``params`` section. For this to work, the following parameters must be changed to match the settings of your email server. You may need to contact your IT department to obtain the appropriate settings.

The example configuration file includes this ``mail`` section:

.. code-block:: groovy

    mail {
        from = 'address@host.com'
        smtp.host = 'smtp.host.com'
        smtp.port = 587
        smtp.user = 'smtp_user'
        smtp.password = 'smtp_password'
        smtp.auth = true
        smtp.starttls.enable = true
        smtp.starttls.required = false
        mail.smtp.ssl.protocols = 'TLSv1.2'
    }

Below is a description of each parameter:

.. list-table:: Parameters for the ``mail`` section
   :widths: 5 20 75
   :header-rows: 1

   * - Req?
     - Parameter Name
     - Description
   * - ✓
     - ``from``
     - The email address **from** which the email should be sent.
   * - ✓
     - ``smtp.host``
     - The internet address (host name or IP address) of the email SMTP server.
   * - ✓
     - ``smtp.port``
     - The port on the host to connect to. Most likely will be ``587``.
   * -
     - ``smtp.user``
     - If authentication is required, this is the username.
   * -
     - ``smtp.password``
     - If authentication is required, this is the password.
   * - ✓
     - ``smtp.auth``
     - Whether or not (true or false) authentication is required.
   * - ✓
     - ``smtp.starttls.enable``
     - Whether or not to enable TLS support.
   * - ✓
     - ``smtp.starttls.required``
     - Whether or not TLS is required.
   * - ✓
     - ``smtp.ssl.protocols``
     - SSL protocol to use for sending SMTP messages.

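
Example Configuration
^^^^^^^^^^^^^^^^^^^^^

To show how the sections fit together, here is a minimal sketch of a ``pipeline.config`` for a local run over a directory of Thermo RAW files. It is an illustration only: every path is a hypothetical placeholder, and the optional values are shown only as examples of the parameters described above. The ``mail`` section is left out because ``email`` is not set, so no notification is sent.

.. code-block:: groovy

    params {
        // Required: FASTA file used by Carafe (hypothetical path)
        carafe_fasta_file = '/path/to/proteome.fasta'

        // Use spectra_dir or spectra_file, never both (hypothetical path)
        spectra_dir = '/path/to/raw_files'
        spectra_dir_glob = '*.raw'

        // Optional settings; unset parameters fall back to their defaults
        output_format = 'diann'
        include_oxidized_methionine = true
    }

    profiles {
        standard {
            // Resources you are willing to devote to workflow jobs
            params.max_memory = '8.GB'
            params.max_cpus = 4
            params.max_time = '240.h'

            // Cache locations (hypothetical paths)
            params.mzml_cache_directory = '/path/to/mzml_cache'
            params.panorama_cache_directory = '/path/to/panorama_cache'
        }
    }

To read the same data from PanoramaWeb instead, only ``spectra_dir`` would change, to the WebDAV URL of the directory.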