Description of the bug
Hi there. It looks like the DCC module expects circtools-detect to produce an output file matching "_tmp_circtools/tmp_printcirclines*", but as far as I can tell circtools only writes this file when paired-end fastqs are analyzed. As a result, the pipeline fails at line 42 of circrna/modules/local/dcc/dcc/main.nf when the input files are single-end:
mv: can't rename '_tmp_circtools/tmp_printcirclines.[0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z]': No such file or directory
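One possible workaround, which I have not tested against the actual module, would be to guard the rename so the task does not exit with an error when the glob matches nothing. The sketch below is only an illustration: the function name `move_junction_reads` is hypothetical, and writing an empty placeholder file is an assumption on my part. I have not checked whether an empty file is acceptable to downstream steps.

```shell
# Hypothetical helper (not part of nf-core/circrna): rename the circtools
# temp file only if the glob actually matches something.
move_junction_reads() {
    prefix="$1"
    for f in _tmp_circtools/tmp_printcirclines.[0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z]; do
        # An unmatched glob is passed through literally, so test for existence.
        if [ -e "$f" ]; then
            mv "$f" "${prefix}_reads.junctions"
            return 0
        fi
    done
    # Assumption: no match (as seen with single-end input) yields an empty file.
    : > "${prefix}_reads.junctions"
}
```

In the failing task this would be called as `move_junction_reads fust1_3` in place of the bare `mv` line.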
I'm not too familiar with DCC/circtools, so I'm not sure whether another output from DCC/circtools would be better to use. Here are the output specifications of circtools-detect: https://github.com/dieterich-lab/circtools/blob/c8b7f8447faa2d8081fcbbb13e91cb8e8f18a88c/docs/Detect.rst#output-files
Output files
The output of circtools detect consists of the following four files: CircRNACount, CircCoordinates, LinearCount and CircSkipJunctions.
- CircRNACount: a table containing read counts for circRNAs detected. First three columns are chr, circRNA start, circRNA end. From fourth column on are the circRNA read counts, one sample per column, shown in the order given in your samplesheet.
- CircCoordinates: circular RNA annotations in BED format. The columns are chr, start, end, genename, junctiontype (based on STAR; 0: non-canonical; 1: GT/AG, 2: CT/AC, 3: GC/AG, 4: CT/GC, 5: AT/AC, 6: GT/AT), strand, circRNA region (startregion-endregion), overall regions (the genomic features circRNA coordinates interval covers).
- LinearCount: host gene expression count table, same setup as the CircRNACount file.
- CircSkipJunctions: circSkip junctions. The first three columns are the same as in LinearCount/CircRNACount, the following columns represent the circSkip junctions found for each sample. circSkip junctions are given as chr:start-end:count, e.g. chr1:1787-6949:10. It is possible that for one circRNA multiple circSkip junctions are found due to the fact that the circular RNA may arise from different isoforms. In this case, multiple circSkip junctions are delimited with semicolon. A 0 implies that no circSkip junctions have been found for this circRNA.
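To make the CircSkipJunctions cell format described above concrete, here is a small sketch that splits one cell into tab-separated chr, start, end, count rows. The function name `parse_circskip` is made up for illustration, and it assumes chromosome names contain neither ":" nor "-"; it is not taken from circtools itself.

```shell
# Hypothetical helper: split one CircSkipJunctions cell into one row per junction.
# Input form: chr:start-end:count, multiple junctions separated by ';', or 0 for none.
parse_circskip() {
    field="$1"
    # A bare 0 means no circSkip junctions were found: print nothing.
    if [ "$field" = "0" ]; then
        return 0
    fi
    # One junction per line, then split on ':' and '-' into four columns.
    printf '%s\n' "$field" | tr ';' '\n' |
        awk -F'[:-]' '{ print $1 "\t" $2 "\t" $3 "\t" $4 }'
}
```

For example, `parse_circskip 'chr1:1787-6949:10'` prints the four fields chr1, 1787, 6949 and 10 separated by tabs.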
Command used and terminal output
nextflow run nf-core/circrna -r dev -profile test,singularity --tools dcc --input samples_single.csv
-[nf-core/circrna] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_CIRCRNA:CIRCRNA:BSJ_DETECTION:DCC:MAIN (fust1_3)'
Caused by:
Process `NFCORE_CIRCRNA:CIRCRNA:BSJ_DETECTION:DCC:MAIN (fust1_3)` terminated with an error exit status (1)
Command executed:
printf "paired.junctions" > samplesheet
circtools detect @samplesheet -D -an gtf.filtered.gtf -F -M -k -Nr 1 1 -A chrI.fa -N -T 4
mv _tmp_circtools/tmp_printcirclines.[0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z] fust1_3_reads.junctions
mv CircCoordinates fust1_3_coordinates.tsv
mv CircRNACount fust1_3_counts.tsv
cat <<-END_VERSIONS > versions.yml
"NFCORE_CIRCRNA:CIRCRNA:BSJ_DETECTION:DCC:MAIN":
circtools: $(circtools -V)
END_VERSIONS
Command exit status:
1
Command output:
Output folder ./ already exists, reusing
circtools 2.0 started
28 CPU cores available, using 4
started circRNA detection from file paired.junctions
=> locating circRNAs (unstranded mode) [paired.junctions]
=> sorting circRNAs (unstranded mode) [paired.junctions]
finished circRNA detection from file paired.junctions
WARNING: non-stranded data, the strand of circRNAs guessed from the strand of host genes
Combining individual circRNA read counts
Using files _tmp_DCC/tmp_circCount and _tmp_DCC/tmp_coordinates for filtering
Filtering by read counts
Remove ChrM
Count CircSkip junctions
Command error:
INFO: Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
Output folder ./ already exists, reusing
circtools 2.0 started
28 CPU cores available, using 4
started circRNA detection from file paired.junctions
=> locating circRNAs (unstranded mode) [paired.junctions]
=> sorting circRNAs (unstranded mode) [paired.junctions]
finished circRNA detection from file paired.junctions
WARNING: non-stranded data, the strand of circRNAs guessed from the strand of host genes
Combining individual circRNA read counts
Using files _tmp_DCC/tmp_circCount and _tmp_DCC/tmp_coordinates for filtering
Filtering by read counts
Remove ChrM
Count CircSkip junctions
mv: can't rename '_tmp_circtools/tmp_printcirclines.[0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z][0-9A-Z]': No such file or directory
Work dir:
/*****/projects/nf_circ_test/bug/work/15/7ec763644d95e985dbf218370d3707
Container:
/*****/scratch/nxf_sing_cache/depot.galaxyproject.org-singularity-circtools-2.0--pyhdfd78af_0.img
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
-- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
-- Check '.nextflow.log' file for details
Relevant files
samples_single.csv
System information
No response