# oncoanalyser

    Ryan

    04/14/2025, 4:45 PM
    Hi! I was very excited to see that version 2.0.0 was released recently, and I happened to have a few tumor-only DNA samples to run on this pipeline. I ran the test profile first and had no issue. I then ran some WGS data through it and encountered an issue with LINX Visualiser. I ran the pipeline as follows:
    nextflow run nf-core/oncoanalyser \
        -profile docker \
        -revision 2.0.0 \
        --mode wgts \
        --input samplesheet.csv \
        --outdir output_OncoAnalyzer \
        -c ../oncoanalyzer_ref.config \
        --genome GRCh38_hmf \
        -resume \
        -c ../OA_resource.config \
        -c ../nfcore_resource.config
    Here is the error I receive:
    -[nf-core/oncoanalyser] Pipeline completed with errors-
    ERROR ~ Error executing process > 'NFCORE_ONCOANALYSER:WGTS:LINX_PLOTTING:LINX_VISUALISER (C42B_9DR_beta)'
    
    Caused by:
      Process `NFCORE_ONCOANALYSER:WGTS:LINX_PLOTTING:LINX_VISUALISER (C42B_9DR_beta)` terminated with an error exit status (1)
    
    
    Command executed:
    
      # NOTE(SW): the output plot directories are always required for ORANGE, which is straightfoward to handle with POSIX
      # fs but more involved with FusionFS since it will not write empty directories to S3. A placeholder file can't be
      # used in the plot directory to force FusionFS to create the directory as ORANGE will treat the placeholder as a PNG
      # and fail. Optional outputs are possible but requires further channel logic and output to detect when complete.
      # Instead I place the two plot output directories under a parent directory, only operating on that to allow use of a
      # placeholder and support empty outputs when using FusionFS. Handling missing/non-existent directories are deferred
      # to downstream processes, bypassing the need to implement further channel operations.
    
      mkdir -p plots/
    
      # NOTE(SW): LINX v1.24.1 require trailing slashes for the -plot_out and -data_out arguments since no filesystem
      # separator is used when constructing fusion plot output filepaths.
    
      # https://github.com/hartwigmedical/hmftools/blob/linx-v1.24.1/linx/src/main/java/com/hartwig/hmftools/linx/visualiser/circos/ChromosomeRangeExecution.java#L22-L29
      # https://github.com/hartwigmedical/hmftools/blob/linx-v1.24.1/linx/src/main/java/com/hartwig/hmftools/linx/visualiser/circos/FusionExecution.java#L18-L23
    
      # Generate all chromosome and cluster plots by default
    
      linx \
          -Xmx12884901888 \
          com.hartwig.hmftools.linx.visualiser.SvVisualiser \
           \
          -sample C42B_9DR_beta \
          -vis_file_dir linx_somatic \
          -ref_genome_version 38 \
          -ensembl_data_dir ensembl_data \
          -circos $(which circos) \
          -threads 16 \
          -plot_out plots/all/ \
          -data_out data/all/
    
      # Rerun LINX to render only reportable cluster plots in a separate directory. While this is regenerating existing
      # cluster plots, the number of reportable plots is generally very small and I prefer to rely on the internal LINX
      # logic to determine whether a cluster is reportable rather than attempting to infer manually to copy out target
      # plot files.
    
      # The ORANGE report receives only reportable clusters while the gpgr LINX report receives chromosome and all cluster
      # plots.
    
      # https://github.com/hartwigmedical/hmftools/blob/linx-v1.24.1/linx/src/main/java/com/hartwig/hmftools/linx/visualiser/SampleData.java#L220-L236
    
      linx \
          -Xmx12884901888 \
          com.hartwig.hmftools.linx.visualiser.SvVisualiser \
           \
          -sample C42B_9DR_beta \
          -vis_file_dir linx_somatic \
          -ref_genome_version 38 \
          -ensembl_data_dir ensembl_data \
          -circos $(which circos) \
          -plot_reportable \
          -threads 16 \
          -plot_out plots/reportable/ \
          -data_out data/reportable/
    
      # Create placeholders to force FusionFS to create parent plot directory on S3
      if [[ $(ls plots/ | wc -l) -eq 0 ]]; then
          touch plots/.keep;
      fi;
    
      cat <<-END_VERSIONS > versions.yml
      "NFCORE_ONCOANALYSER:WGTS:LINX_PLOTTING:LINX_VISUALISER":
          linx: $(linx -version | sed -n '/^Linx version / { s/^.* //p }')
      END_VERSIONS
    
    Command exit status:
      1
    
    Command output:
      (empty)
    
    Command error:
          intersect
    
      The following objects are masked from 'package:IRanges':
    
          collapse, desc, intersect, setdiff, slice, union
    
      The following objects are masked from 'package:S4Vectors':
    
          first, intersect, rename, setdiff, setequal, union
    
      The following objects are masked from 'package:BiocGenerics':
    
          combine, intersect, setdiff, union
    
      The following objects are masked from 'package:stats':
    
          filter, lag
    
      The following objects are masked from 'package:base':
    
          intersect, setdiff, setequal, union
    
      Linking to ImageMagick 7.1.1.36
      Enabled features: cairo, fontconfig, freetype, fftw, heic, rsvg, webp, x11
      Disabled features: ghostscript, lcms, pango, raw
      03:44:50.635 [FATAL] error executing R script
      03:44:50.636 [WARN ] error adding chromosomal context
      03:44:50.637 [INFO ] generating C42B_9DR_beta.cluster-333.sv2.003.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-333.sv2.circos.003.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-333.sv2.003.png
      03:44:54.010 [INFO ] generating C42B_9DR_beta.cluster-335.sv2.000.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-335.sv2.circos.000.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-335.sv2.000.png
      03:44:56.508 [INFO ] generating C42B_9DR_beta.cluster-338.sv2.005.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-338.sv2.circos.005.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-338.sv2.005.png
      03:44:56.524 [INFO ] generating C42B_9DR_beta.cluster-339.sv2.001.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-339.sv2.circos.001.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-339.sv2.001.png
      03:44:56.921 [INFO ] generating C42B_9DR_beta.cluster-349.sv2.005.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-349.sv2.circos.005.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-349.sv2.005.png
      03:44:57.463 [INFO ] generating C42B_9DR_beta.cluster-350.sv2.001.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-350.sv2.circos.001.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-350.sv2.001.png
      03:44:57.518 [INFO ] generating C42B_9DR_beta.cluster-356.sv2.005.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-356.sv2.circos.005.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-356.sv2.005.png
      03:44:59.249 [INFO ] generating C42B_9DR_beta.cluster-360.sv3.003.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-360.sv3.circos.003.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-360.sv3.003.png
      03:45:00.152 [INFO ] generating C42B_9DR_beta.cluster-363.sv2.000.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-363.sv2.circos.000.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-363.sv2.000.png
      03:45:00.362 [INFO ] generating C42B_9DR_beta.cluster-367.sv2.007.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-367.sv2.circos.007.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-367.sv2.007.png
      03:45:02.119 [INFO ] generating C42B_9DR_beta.cluster-371.sv2.003.png via command: /usr/local/bin/circos -nosvg -conf data/all/C42B_9DR_beta.cluster-371.sv2.circos.003.conf -outputdir plots/all -outputfile C42B_9DR_beta.cluster-371.sv2.003.png
      java.util.concurrent.ExecutionException: java.lang.Exception: plotting error
            at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
            at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
            at com.hartwig.hmftools.linx.visualiser.SvVisualiser.run(SvVisualiser.java:127)
            at com.hartwig.hmftools.linx.visualiser.SvVisualiser.main(SvVisualiser.java:379)
      Caused by: java.lang.Exception: plotting error
            at com.hartwig.hmftools.linx.visualiser.SvVisualiser.createImageFrame(SvVisualiser.java:362)
            at com.hartwig.hmftools.linx.visualiser.SvVisualiser.lambda$submitFrame$29(SvVisualiser.java:331)
            at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
            at java.base/java.lang.Thread.run(Thread.java:1583)
    
    Work dir:
      /home/ryan/NGS_Data/L_WGS_4-10-25/work/11/8205a42a4ccf8ffdfd6fc5cd3dbb27
    
    Container:
      quay.io/biocontainers/hmftools-linx:2.0--hdfd78af_0
    
    Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
    
     -- Check '.nextflow.log' file for details
    ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
    
     -- Check '.nextflow.log' file for details
    I looked around the work dir and see data in data/all but I don't know where to look further to troubleshoot this! Thanks for any help you can provide!

    Timo Eberhardt

    04/15/2025, 11:55 AM
    Hi there, my name is Timo and in our group we are establishing a branch to offer cancer patients better medical care. We will design a custom targeted panel and are considering oncoanalyser as our pipeline. Therefore we have a few questions, and it would be very nice if the developers could answer them. A) In the training procedure doc you say that you recommend an initial sample set of at least 20 and that the gene median copy number should not deviate too much. Do you have a number of samples beyond which you would say additional samples don't have a big effect? B) Is my understanding correct that the input samples for the training have to be somatic FFPE samples if the analysed samples will also be somatic FFPE samples? And is it okay if these are all from the same entity and not normal tissue?

    valleinclan

    04/16/2025, 4:14 PM
    Hi all, is there any specific reason why Teal is not included in oncoanalyser?

    valleinclan

    04/17/2025, 6:32 PM
    I was eager to try OA 2.0 before the long weekend, but got:
    N E X T F L O W   ~  version 24.10.3
    Project `nf-core/oncoanalyser` contains uncommitted changes -- Cannot switch to revision: 2.0.0
    from command:
    nextflow run nf-core/oncoanalyser \
        -r 2.0.0 \
        -profile $profile \
        --mode wgts \
        --genome GRCh38_hmf \
        -c path/to/refdata.local.config \
        --input test.oncoanalyser.csv \
        --outdir test/ \
        --max_fastq_records 0 \
        -c /path/to/oa.nextflow.config \
        -w test/work
    Something wrong on my side, or is anyone else having problems? Happy long weekend if you have it!
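    One possible cause, not confirmed in this thread, is a locally modified copy of the pipeline under ~/.nextflow/assets. A minimal sketch of a workaround, assuming no local changes need to be kept:
    # drop the cached copy of the pipeline, then re-pull it and rerun with -r 2.0.0
    nextflow drop -f nf-core/oncoanalyser
    nextflow pull nf-core/oncoanalyser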

    Pierre Levy

    05/09/2025, 9:14 AM
    Hello all! I ran OA 2.0 on 3 tumor-normal WGS pairs. Everything seems to have gone well but I'm concerned by this warning in the log that I don't think I got before:
    May-09 07:44:28.080 [main] WARN  nextflow.util.ThreadPoolHelper - Exiting before file transfers were completed -- Some files may be lost
    The context:
    May-09 07:34:33.064 [main] INFO  nextflow.util.ThreadPoolHelper - Waiting for file transfers to complete (4 files)
    May-09 07:35:33.066 [main] DEBUG nextflow.util.ThreadPoolHelper - Waiting for file transfers to complete (4 files)
    May-09 07:36:33.067 [main] DEBUG nextflow.util.ThreadPoolHelper - Waiting for file transfers to complete (4 files)
    May-09 07:37:33.069 [main] DEBUG nextflow.util.ThreadPoolHelper - Waiting for file transfers to complete (4 files)
    May-09 07:38:33.070 [main] DEBUG nextflow.util.ThreadPoolHelper - Waiting for file transfers to complete (4 files)
    May-09 07:39:33.072 [main] DEBUG nextflow.util.ThreadPoolHelper - Waiting for file transfers to complete (4 files)
    May-09 07:40:33.073 [main] DEBUG nextflow.util.ThreadPoolHelper - Waiting for file transfers to complete (4 files)
    May-09 07:41:33.075 [main] DEBUG nextflow.util.ThreadPoolHelper - Waiting for file transfers to complete (4 files)
    May-09 07:42:33.076 [main] DEBUG nextflow.util.ThreadPoolHelper - Waiting for file transfers to complete (4 files)
    May-09 07:43:33.078 [main] DEBUG nextflow.util.ThreadPoolHelper - Waiting for file transfers to complete (4 files)
    May-09 07:44:28.080 [main] WARN  nextflow.util.ThreadPoolHelper - Exiting before file transfers were completed -- Some files may be lost
    May-09 07:44:28.080 [main] DEBUG nextflow.util.ThreadPoolManager - Thread pool 'PublishDir' shutdown completed (hard=false)
    May-09 07:44:28.117 [main] INFO  nextflow.Nextflow - -[nf-core/oncoanalyser] Pipeline completed successfully-
    May-09 07:44:28.152 [main] DEBUG n.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=84; failedCount=0; ignoredCount=0; cachedCount=0; pendingCount=0; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=50d 7h 11m 6s; failedDuration=0ms; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=12; peakCpus=114; peakMemory=684 GB; ]
    May-09 07:44:28.152 [main] DEBUG nextflow.trace.TraceFileObserver - Workflow completed -- saving trace file
    May-09 07:44:28.206 [main] DEBUG nextflow.trace.ReportObserver - Workflow completed -- rendering execution report
    May-09 07:44:41.649 [main] DEBUG nextflow.trace.TimelineObserver - Workflow completed -- rendering execution timeline
    May-09 07:44:44.033 [main] DEBUG nextflow.cache.CacheDB - Closing CacheDB done
    May-09 07:44:44.111 [main] INFO  org.pf4j.AbstractPluginManager - Stop plugin 'nf-schema@2.3.0'
    May-09 07:44:44.112 [main] DEBUG nextflow.plugin.BasePlugin - Plugin stopped nf-schema
    May-09 07:44:44.113 [main] DEBUG nextflow.util.ThreadPoolManager - Thread pool 'FileTransfer' shutdown completed (hard=false)
    May-09 07:44:44.115 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye
    It stayed for more than 1h on the "Waiting for file transfers to complete (4 files)" step. It looks like I'm missing 4 files in the output folder? One other odd thing is that the Slurm job I used to run the pipeline is not ending, even though the pipeline says it's complete and there are no processes still running. This is the Slurm script I used to run OA:
    #!/bin/bash
    #SBATCH --job-name=oncoanalyser
    #SBATCH --mem=12gb
    #SBATCH -c 1 # max nr threads
    #SBATCH --time=10-00:00:00
    #SBATCH --output=R-%x.%j.out
    #SBATCH --error=R-%x.%j.err
    #SBATCH --partition=highmem
    
    # Nextflow Run (OA v2.0.0)
    nextflow run nf-core/oncoanalyser \
      -profile singularity \
      --mode wgts \
      -revision 2.0.0 \
      -params-file /mnt/bioinfnas/immuno/plevy/proj/hmftools/hmf38.yaml \
      --genome GRCh38_hmf \
      --ref_data_hmf_data_path /mnt/bioinfnas/immuno/Jonatan/References/hmftools_ref/hmf_pipeline_resources.38_v6.0--2 \
      --input input.csv \
      --outdir /mnt/petasan_immuno/.pierre/RToledo_WGS/output \
      --max_cpus 30 \
      -c /mnt/bioinfnas/immuno/plevy/proj/hmftools/hmf.local.config
    Could the file transfer issue be due to the fact that my outdir is on another drive? Thanks! Pierre

    George Seed

    05/13/2025, 8:43 AM
    Odd issue: a recent OA run with virusbreakend fails to find HPV33 in my sample, but it's definitely present as I've aligned reads to a hybrid human-virus genome and the coverage is wildly high. In my HPV16 cases they are all detected just fine.

    Anders Sune Pedersen

    06/04/2025, 7:03 AM
    Hi guys. When do you expect the next release of oncoanalyser to be out?

    Anders Sune Pedersen

    06/04/2025, 12:14 PM
    I’m trying to run dev-2.2.0-beta.10 and I bump into the following problem:
    singularity pull  --name docker.io-hartwigmedicalfoundation-sage-4.1-beta.8.img.pulling.1749038749594 docker://docker.io/hartwigmedicalfoundation/sage:4.1-beta.8
    INFO:    Converting OCI blobs to SIF format
    INFO:    Starting build...
    INFO:    Fetching OCI image...
    FATAL:   While making image from oci registry: error fetching image: while building SIF from layers: conveyor failed to get: while checking OCI image: GET https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/d0/d04792c0388bee37289b2387038cbb5a6fae8919b8869ac55388a9862824da9f/data?expires=REDACTED&signature=REDACTED&version=REDACTED: unexpected status code 301 Moved Permanently: <?xml version="1.0" encoding="UTF-8"?>
    <Error><Code>PermanentRedirect</Code><Message>The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.</Message><Endpoint>undefined.s3-us-west-2.amazonaws.com</Endpoint><Bucket>undefined</Bucket><RequestId>686ARQM30EJSBN6X</RequestId><HostId>vrm67e5FsL88aCey+uqFcBRcWUxvmytFHC62f5ia+nwbnMUdZlqJGdA0P0KCG/xp25K/JIzihKI=</HostId></Error>
    How would I get around that?
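    The 301/PermanentRedirect comes back from the Docker Hub CDN rather than from oncoanalyser itself, and such pulls often succeed on retry. A rough sketch of a manual retry, assuming NXF_SINGULARITY_CACHEDIR points at your image cache and that the final image name is the .pulling name minus its suffix:
    cd "$NXF_SINGULARITY_CACHEDIR"
    singularity pull --name docker.io-hartwigmedicalfoundation-sage-4.1-beta.8.img \
        docker://docker.io/hartwigmedicalfoundation/sage:4.1-beta.8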

    George Seed

    06/05/2025, 1:50 PM
    Running from OA-produced .bam files, I want to do a rerun to tweak some parameters for CNA calling - my samplesheet is 2 patients, 4 bams that have gone through alignment and markdups by OA already. I tried to do process_exclude markdups (like in the documentation) but I don't get any output at all, which is a bit weird. The pipeline starts and doesn't submit any jobs.

    Sophie Herbst

    06/06/2025, 1:55 PM
    I am trying to run oncoanalyser but it always crashes. I get the message _sambamba-sort: /local/userID/39444752/cluster_tmp: Read-only file system_, so I assume that sambamba is trying to write to a file path that I don’t have write access to. I am running this on a cluster and don’t have write access to the folder /local/. I tried setting $TMPDIR before submitting the job, but it keeps trying to use /local/userID as the temporary directory, even though, to my knowledge, the default in sambamba is to use the system temporary directory. Is there a way in the oncoanalyser workflow to change this, or am I missing/misunderstanding something? Thank you very much for your help!
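    A minimal sketch of one possible workaround, assuming the underlying problem is the cluster's /local scratch area not being bound into the Singularity container (the 07/14 message further down notes that /local not being bound into the image was behind the 2.0.0 issues). Passed as a custom config with -c:
    // make the host scratch area visible and writable inside the container
    singularity.runOptions = '--bind /local'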

    Anders Sune Pedersen

    06/06/2025, 2:25 PM
    Hi guys. I’m new to nf-core/oncoanalyser. Thanks for all the time and energy you're putting into developing the pipeline. Just a detail - the error message could be a bit more informative in the following case:
    $ nextflow run nf-core/oncoanalyser --profile test,singularity --outdir foo
    
     N E X T F L O W   ~  version 25.04.3
    
    Pulling nf-core/oncoanalyser ...
     downloaded from https://github.com/nf-core/oncoanalyser.git
    Launching `https://github.com/nf-core/oncoanalyser` [small_moriondo] DSL2 - revision: d1218d24a1 [master]
    
    WARN: Access to undefined parameter `genome_version` -- Initialise it to a default value eg. `params.genome_version = some_value`
    ERROR ~ currently only the GRCh37_hmf and GRCh38_hmf genomes are supported but got GATK.GRCh38, please adjust the --genome argument accordingly or override with --force_genome.
    
     -- Check '.nextflow.log' file for details
    I guess that might be improved when you implement the new template.

    George Seed

    06/11/2025, 8:26 AM
    Has anyone explored tuning COBALT to avoid the massive oversegmentation it sometimes falls into? Some of my samples are experiencing 'high copy number noise', but a run from an alternative tool produces entirely normal and usable profiles. The hardcoded kmin parameter of 1 is a bit of an odd design choice, as it permits the smallest segment to be a single bin - a change from the default. Also the log2ratios don't seem to be corrected by the reference values, as far as I can tell... missing a trick there I think.

    Andrew Wallace

    06/13/2025, 7:54 PM
    Hi all -- I have a quick question on thresholds for use with *.purple.cnv.somatic.tsv (or the gene-level file for that matter). Does anyone have any suggestions on copyNumber values, or values of other fields, to restrict the output to true somatic copy number alterations? At a certain point it becomes obvious, but if, for instance, I have a focal CN estimate of 2.15 (without a supporting SV), I'm not sure what to make of it. The PURPLE docs mention using 1.8 -> 2.2 as diploid-defining thresholds at certain decision points in the algorithm, but I'm not sure if these are still the right values for thinking about the final output. Any advice would be much appreciated!
  • p

    Peter Priestley

    06/16/2025, 9:23 PM
    Hi Andrew. PURPLE does not round any copy numbers. Deviations from integer values may be noise or may indicate subclonal copy number events. Lower purity, lower depth, FFPE or damaged samples are all associated with increased noise. Our approach at Hartwig is to simply round to the nearest integer number for most applications. We normally define gain or loss as a multiple of ploidy, and we define homozygous loss as CN < 0.5.
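    A rough sketch of how those heuristics could be applied to the segment file; this is an assumed recipe rather than an official Hartwig one, and the ploidy value and gain cutoff are placeholders to adjust per sample:
    # classify *.purple.cnv.somatic.tsv segments by simple ploidy-relative rules
    awk -v ploidy=2 'BEGIN { FS = OFS = "\t" }
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "copyNumber") cn = i; print $0, "call"; next }
    {
        call = "neutral"
        if ($cn < 0.5)               call = "homozygous_loss"
        else if ($cn < ploidy - 0.5) call = "loss"
        else if ($cn > 3 * ploidy)   call = "gain"   # example cutoff expressed as a multiple of ploidy
        print $0, call
    }' SAMPLE.purple.cnv.somatic.tsv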

    Adam Talbot

    06/17/2025, 10:31 AM
    Hey guys, I was trying to run oncoanalyser and I find the samplesheet a bit weird. Having arbitrary data separated by semi-colons makes it so much harder to validate and type the samplesheet, let alone write it in the first place. I know it's a breaking change, but is there any reason we can't use 1 column = 1 value (a la tidy data)? fastq input:
    group_id,subject_id,sample_id,sample_type,sequence_type,filetype,library,lane,fastq_1,fastq_2
    PATIENT1,PATIENT1,PATIENT1-T,tumor,dna,fastq,S1,001,/path/to/PATIENT1-T_S1_L001_R1_001.fastq.gz,/path/to/PATIENT1-T_S1_L001_R2_001.fastq.gz
    PATIENT1,PATIENT1,PATIENT1-T,tumor,dna,fastq,S1,002,/path/to/PATIENT1-T_S1_L002_R1_001.fastq.gz,/path/to/PATIENT1-T_S1_L002_R2_001.fastq.gz
    BAM input:
    group_id,subject_id,sample_id,sample_type,sequence_type,filetype,bam,bai
    PATIENT1,PATIENT1,PATIENT1-T,tumor,dna,bam,/path/to/PATIENT1-T.dna.bam,/path/to/PATIENT1-T.dna.bam.bai
    This would be much simpler to validate early (shift left), allow simple validation of data types and be much, much easier to write, even with automation. It would also help make splitting and grouping easier.

    Chiamaka Jessica Okeke

    06/20/2025, 9:32 AM
    Hey everyone, I'm running into an issue with oncoanalyser, specifically during the 'virusbreakend' step for one sample. The rest of my samples finished successfully. Here's the error file that I'm getting:

    Chiamaka Jessica Okeke

    06/20/2025, 9:33 AM
    Thu Jun 19 17:33:18 SAST 2025: Full log file is: ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/virusbreakend.20250619_173318.srvrochpc110.uct.ac.za.89.log
    Thu Jun 19 17:33:18 SAST 2025: Found /usr/bin/time
    Thu Jun 19 17:33:18 SAST 2025: Using GRIDSS jar /env/share/gridss-2.13.2-3/gridss.jar
    Thu Jun 19 17:33:18 SAST 2025: Using reference genome "GRCh38_masked_exclusions_alts_hlas.fasta"
    Thu Jun 19 17:33:19 SAST 2025: Using output VCF PATIENT10-T.virusbreakend.vcf
    Thu Jun 19 17:33:19 SAST 2025: Using 12 worker threads.
    Thu Jun 19 17:33:19 SAST 2025: Using input file PATIENT10-T.redux.bam
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/kraken2
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/gridss
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/gridss_annotate_vcf_kraken2
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/gridss_annotate_vcf_repeatmasker
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/samtools
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/bcftools
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/java
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/bwa
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/Rscript
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/RepeatMasker
    Thu Jun 19 17:33:19 SAST 2025: Found /env/bin/gridsstools
    Thu Jun 19 17:33:19 SAST 2025: gridsstools version: gridsstools 1.0
    Thu Jun 19 17:33:19 SAST 2025: samtools version: 1.19.2+htslib-1.19.1
    Thu Jun 19 17:33:19 SAST 2025: R version: Rscript (R) version 4.3.1 (2023-06-16)
    Thu Jun 19 17:33:19 SAST 2025: bwa Version: 0.7.17-r1188
    Thu Jun 19 17:33:19 SAST 2025: Kraken version 2.1.3
    Thu Jun 19 17:33:19 SAST 2025: time version: /usr/bin/time: unrecognized option '--version'
    BusyBox v1.32.1 (2021-04-13 11:15:36 UTC) multi-call binary.
    
    Usage: time [-vpa] [-o FILE] PROG ARGS
    
    Run PROG, display resource usage when it exits
    
    	-v	Verbose
    	-p	POSIX output format
    	-f FMT	Custom format
    	-o FILE	Write result to FILE
    	-a	Append (else overwrite)
    Thu Jun 19 17:33:19 SAST 2025: bash version: GNU bash, version 5.0.3(1)-release (x86_64-pc-linux-gnu)
    Thu Jun 19 17:33:19 SAST 2025: java version: openjdk version "20.0.2-internal" 2023-07-18	OpenJDK Runtime Environment (build 20.0.2-internal-adhoc..src)	OpenJDK 64-Bit Server VM (build 20.0.2-internal-adhoc..src, mixed mode, sharing)	
    Thu Jun 19 17:33:19 SAST 2025: Identifying viral sequences
    Thu Jun 19 17:33:19 SAST 2025: Treating 1 reference sequences as viral sequences.
    Loading database information...15:33:20.413 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/env/share/gridss-2.13.2-3/gridss.jar!/com/intel/gkl/native/libgkl_compression.so
    [Thu Jun 19 15:33:20 GMT 2025] SubsetToTaxonomy --INPUT /dev/stdin --OUTPUT ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.readnames.txt.tmp --FORMAT READ_NAME --NCBI_NODES_DMP virusbreakend/taxonomy/nodes.dmp --TAXONOMY_IDS 10239 --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
    [Thu Jun 19 15:33:20 GMT 2025] Executing as t0090720@srvrochpc110.uct.ac.za on Linux 5.14.0-427.24.1.el9_4.x86_64 amd64; OpenJDK 64-Bit Server VM 20.0.2-internal-adhoc..src; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: Version:2.13.2-gridss
    INFO	2025-06-19 15:33:20	KrakenClassificationChecker	Loading NCBI taxonomy from virusbreakend/taxonomy/nodes.dmp
    INFO	2025-06-19 15:33:23	SubsetToTaxonomy	Performing taxonomy lookup on /dev/stdin
     done.
    728923 sequences (87.07 Mbp) processed in 7.409s (5902.7 Kseq/m, 705.10 Mbp/m).
      545315 sequences classified (74.81%)
      183608 sequences unclassified (25.19%)
    [Thu Jun 19 15:33:40 GMT 2025] gridss.kraken.SubsetToTaxonomy done. Elapsed time: 0.34 minutes.
    Runtime.totalMemory()=536870912
    Thu Jun 19 17:33:40 SAST 2025: Identifying viral taxa in sample based on kraken2 summary report
    15:33:41.479 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/env/share/gridss-2.13.2-3/gridss.jar!/com/intel/gkl/native/libgkl_compression.so
    [Thu Jun 19 15:33:41 GMT 2025] IdentifyViralTaxa --INPUT_KRAKEN2_REPORT ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.kraken2.report.all.txt --OUTPUT ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.summary.taxa.tsv --REPORT_OUTPUT ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.kraken2.report.viral.txt --SUMMARY_REPORT_OUTPUT ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.kraken2.report.viral.extracted.txt --TAXONOMY_ID_LIST ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.host_taxids.txt --NCBI_NODES_DMP virusbreakend/taxonomy/nodes.dmp --SEQID2TAXID_MAP virusbreakend/seqid2taxid.map --KRAKEN_REFERENCES virusbreakend/library/viral/library.fna --KRAKEN_REFERENCES virusbreakend/library/added/rzW5MDJa1g.fna --MIN_SUPPORTING_READS 50 --TAXONOMIC_DEDUPLICATION_LEVEL Genus --TAXONOMY_IDS 10239 --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
    [Thu Jun 19 15:33:41 GMT 2025] Executing as t0090720@srvrochpc110.uct.ac.za on Linux 5.14.0-427.24.1.el9_4.x86_64 amd64; OpenJDK 64-Bit Server VM 20.0.2-internal-adhoc..src; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: Version:2.13.2-gridss
    INFO	2025-06-19 15:33:42	IdentifyViralTaxa	Loading taxonomy IDs of interest from ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.host_taxids.txt
    INFO	2025-06-19 15:33:42	IdentifyViralTaxa	Loaded 2933 taxonomy IDs
    INFO	2025-06-19 15:33:42	IdentifyViralTaxa	Loading seqid2taxid.map from virusbreakend/seqid2taxid.map
    INFO	2025-06-19 15:33:42	IdentifyViralTaxa	Loading NCBI taxonomy from virusbreakend/taxonomy/nodes.dmp
    INFO	2025-06-19 15:33:45	IdentifyViralTaxa	Parsing Kraken2 report from ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.kraken2.report.all.txt
    INFO	2025-06-19 15:33:45	IdentifyViralTaxa	Writing abridged report to ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.kraken2.report.viral.txt
    INFO	2025-06-19 15:33:45	IdentifyViralTaxa	Writing summary kraken report to ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.kraken2.report.viral.extracted.txt
    INFO	2025-06-19 15:33:45	IdentifyViralTaxa	Found viral presence for 1 genera. Writing summary to  ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.summary.taxa.tsv
    [Thu Jun 19 15:33:45 GMT 2025] gridss.kraken.IdentifyViralTaxa done. Elapsed time: 0.07 minutes.
    Runtime.totalMemory()=2663383040
    Thu Jun 19 17:33:45 SAST 2025: Extracting viral reads	PATIENT10-T.redux.bam
    Found 14340 distinct read name in ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.readnames.txt
    Thu Jun 19 17:33:53 SAST 2025: Determining viral references to use
    15:33:53.414 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/env/share/gridss-2.13.2-3/gridss.jar!/com/intel/gkl/native/libgkl_compression.so
    [Thu Jun 19 15:33:53 GMT 2025] ExtractBestViralReference --INPUT_SUMMARY ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.summary.taxa.tsv --INPUT_VIRAL_READS /dev/fd/63 --OUTPUT ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.kraken2.fa --OUTPUT_SUMMARY ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.summary.references.tsv --OUTPUT_MATCHING_KMERS ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.viral.kmercounts.tsv --NCBI_NODES_DMP virusbreakend/taxonomy/nodes.dmp --SEQID2TAXID_MAP virusbreakend/seqid2taxid.map --KRAKEN_REFERENCES virusbreakend/library/viral/library.fna --KRAKEN_REFERENCES virusbreakend/library/added/rzW5MDJa1g.fna --KMER 16 --STRIDE 16 --CONTIGS_PER_TAXID 1 --FAVOUR_EARLY_KRAKEN_REFERENCES true --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 5 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
    [Thu Jun 19 15:33:53 GMT 2025] Executing as t0090720@srvrochpc110.uct.ac.za on Linux 5.14.0-427.24.1.el9_4.x86_64 amd64; OpenJDK 64-Bit Server VM 20.0.2-internal-adhoc..src; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: Version:2.13.2-gridss
    INFO	2025-06-19 15:33:54	ExtractBestViralReference	Loading seqid2taxid.map from virusbreakend/seqid2taxid.map
    INFO	2025-06-19 15:33:54	ExtractBestViralReference	Loading NCBI taxonomy from virusbreakend/taxonomy/nodes.dmp
    INFO	2025-06-19 15:33:59	ExtractBestViralReference	Parsing ./PATIENT10-T.virusbreakend.vcf.virusbreakend.working/PATIENT10-T.virusbreakend.vcf.summary.taxa.tsv
    INFO	2025-06-19 15:33:59	ExtractBestViralReference	Identifying best viral reference genomes from /dev/fd/63
    [Thu Jun 19 15:33:59 GMT 2025] gridss.kraken.ExtractBestViralReference done. Elapsed time: 0.11 minutes.
    Runtime.totalMemory()=2847932416
    Exception in thread "main" htsjdk.samtools.SAMException: Missing Quality Line at line 41 in fastq /dev/fd/63
    	at htsjdk.samtools.fastq.FastqReader.checkLine(FastqReader.java:190)
    	at htsjdk.samtools.fastq.FastqReader.readNextRecord(FastqReader.java:126)
    	at htsjdk.samtools.fastq.FastqReader.next(FastqReader.java:152)
    	at gridss.kraken.ExtractBestViralReference.doWork(ExtractBestViralReference.java:123)
    	at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
    	at gridss.kraken.ExtractBestViralReference.main(ExtractBestViralReference.java:257)

    Chiamaka Jessica Okeke

    06/20/2025, 9:36 AM
    After inspecting the generated 'PATIENT10-T.redux.bam.viral.R1.fq', I found at least one read missing the 4th line (quality string). This likely makes the FASTQ file invalid and causes 'virusbreakend' to crash. Has anyone seen this before, or have tips on dealing with this?
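    A small check for spotting such records, assuming the standard 4-line FASTQ layout (it flags places where the '+' separator line is out of position because a line is missing):
    awk 'NR % 4 == 3 && $0 !~ /^\+/ { print "malformed record near line " NR }' PATIENT10-T.redux.bam.viral.R1.fq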

    Rike Hanssen

    06/23/2025, 12:35 PM
    Adding reports to the tower.yml: https://github.com/nf-core/oncoanalyser/pull/220

    Ning Wang

    06/28/2025, 9:02 AM
    Hi everyone, I am running oncoanalyser 2.0.0, and my samples are WGS, tumour-only, ctDNA, Illumina paired-end BAM files. I have 30 samples; all the other samples were OK with oncoanalyser, but 2 samples failed at LILAC. I noticed virusbreakend may not be able to compile correctly on our computer cluster, so it is the only process that I manually excluded. How should I solve this problem? Thanks everyone here! The error message is:
    -[nf-core/oncoanalyser] Pipeline completed with errors-
    ERROR ~ Error executing process > 'NFCORE_ONCOANALYSER:WGTS:LILAC_CALLING:LILAC (SAMPLE_3)'
    
    Caused by:
      Process `NFCORE_ONCOANALYSER:WGTS:LILAC_CALLING:LILAC (SAMPLE_3)` terminated with an error exit status (1)
    
    
    Command executed:
    
      lilac \
          -Xmx28991029248 \
           \
          -sample SAMPLE_3-T \
           \
          -tumor_bam SAMPLE_3-T.redux.bam \
           \
          -purple_dir purple \
          -ref_genome GRCh38_masked_exclusions_alts_hlas.fasta \
          -ref_genome_version 38 \
          -resource_dir lilac_resources \
          -threads 6 \
          -output_dir lilac/
      
      cat <<-END_VERSIONS > versions.yml
      "NFCORE_ONCOANALYSER:WGTS:LILAC_CALLING:LILAC":
          lilac: $(lilac -version | sed -n '/^Lilac version / { s/^.* //p }')
      END_VERSIONS
    
    Command exit status:
      1
    
    Command output:
      01:32:43.532 [INFO ] Lilac version 1.6
      01:32:43.556 [INFO ] loaded 230 allele frequencies from file(lilac_resources/lilac_allele_frequencies.csv)
      01:32:43.572 [INFO ] key parameters:
      01:32:43.572 [INFO ] sample(SAMPLE_3-T) inputs: referenceBam(false) tumorBam(true) somaticVCF(true) geneCopyNumber(true) rnaBam(false)
      01:32:43.573 [INFO ] reading nucleotide file: lilac_resources/hla_ref_nucleotide_sequences.csv
      01:32:45.654 [INFO ] loaded 17540 sequences from file lilac_resources/hla_ref_nucleotide_sequences.csv
      01:32:45.655 [INFO ] reading protein file: lilac_resources/hla_ref_aminoacid_sequences.csv
      01:32:46.221 [INFO ] loaded 13196 sequences from file lilac_resources/hla_ref_aminoacid_sequences.csv
      01:32:46.224 [INFO ] loaded 128 common alleles
      01:32:46.282 [INFO ] finding read support in tumor bam SAMPLE_3-T.redux.bam
      01:32:46.883 [INFO ] totalFrags(179) minEvidence(2.0) minHighQualEvidence(0.1)
      01:32:47.072 [INFO ] gene(HLA-A) determining un-phased candidates from frags(143)
      01:32:47.102 [WARN ]   no candidates after amino acid filtering - reverting to common allele gene candidates
      01:32:47.103 [INFO ] gene(HLA-B) determining un-phased candidates from frags(131)
      01:32:47.118 [WARN ]   no candidates after amino acid filtering - reverting to common allele gene candidates
      01:32:47.118 [INFO ] gene(HLA-C) determining un-phased candidates from frags(139)
      01:32:47.156 [WARN ]   no candidates after amino acid filtering - reverting to common allele gene candidates
      01:32:47.212 [INFO ] gene(HLA-A) has 35 candidates after phasing: A*01:01, A*01:02, A*02:01, A*02:02, A*02:03, A*02:05, A*02:06, A*02:07, A*02:20, A*03:01, A*03:02, A*11:01, A*23:01, A*24:02, A*24:03, A*24:07, A*25:01, A*26:01, A*26:08, A*29:01, A*29:02, A*30:01, A*30:02, A*30:04, A*31:01, A*32:01, A*33:01, A*33:03, A*34:01, A*34:02, A*66:01, A*68:01, A*68:02, A*69:01, A*74:01
      01:32:47.217 [INFO ] gene(HLA-B) has 61 candidates after phasing: B*07:02, B*07:05, B*07:06, B*08:01, B*13:01, B*13:02, B*14:01, B*14:02, B*15:01, B*15:02, B*15:03, B*15:07, B*15:13, B*15:17, B*15:18, B*15:21, B*18:01, B*27:02, B*27:05, B*27:06, B*27:07, B*35:01, B*35:02, B*35:03, B*35:05, B*35:08, B*37:01, B*38:01, B*38:02, B*39:01, B*39:06, B*40:01, B*40:02, B*40:06, B*41:01, B*41:02, B*42:01, B*44:02, B*44:03, B*44:04, B*44:05, B*44:27, B*45:01, B*46:01, B*47:01, B*48:01, B*49:01, B*50:01, B*50:02, B*51:01, B*51:07, B*51:08, B*52:01, B*53:01, B*55:01, B*56:01, B*57:01, B*57:03, B*58:01, B*73:01, B*81:01
      01:32:47.220 [INFO ] gene(HLA-C) has 32 candidates after phasing: C*01:02, C*02:02, C*02:10, C*03:02, C*03:03, C*03:04, C*04:01, C*04:03, C*04:09N, C*04:82, C*05:01, C*06:02, C*07:01, C*07:02, C*07:04, C*07:06, C*07:18, C*08:01, C*08:02, C*12:02, C*12:03, C*12:143, C*14:02, C*15:02, C*15:05, C*15:06, C*16:01, C*16:02, C*16:04, C*17:01, C*17:03, C*18:01
      01:32:47.257 [INFO ] building frag-alleles from aminoAcids(frags=179 candSeq=128) nucFrags(hetLoci=3 candSeq=3717 nucs=1101) knownIndels(0)
      01:32:47.257 [INFO ] building frag-alleles from aminoAcids(frags=179 candSeq=128) nucFrags(hetLoci=3 candSeq=3717 nucs=1101) knownIndels(0)
      01:32:47.387 [INFO ] HLA-Y fragments(16 unique=0) shared=16) below threshold
      01:32:47.388 [INFO ] filtering candidates from fragAlleles(163) candidates(128) recovered(0)
      01:32:47.407 [INFO ]   confirmed 1 unique groups: C*07[41,36,5,0]
      01:32:47.408 [INFO ]   found 3 insufficiently unique groups: A*02[7,2,5,0], B*44[4,1,3,0], B*08[3,1,2,0]
      01:32:47.425 [INFO ]   confirmed 0 unique proteins
      01:32:47.426 [INFO ]   found 1 insufficiently unique proteins: B*08:01[2,1,1,0]
      01:32:47.430 [INFO ] building frag-alleles from aminoAcids(frags=179 candSeq=128) nucFrags(hetLoci=3 candSeq=3717 nucs=1101) knownIndels(0)
      01:32:47.624 [INFO ] candidate permutations exceeds threshold, candidates(A=630 B=1891 C=150) common(128)
      01:32:48.225 [INFO ]   discarding 0 unlikely candidates: 
    
    Command error:
      /usr/local/bin/lilac: line 6: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8): No such file or directory
      Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
            at java.base/java.util.Arrays.copyOf(Arrays.java:3482)
            at java.base/java.util.ArrayList.toArray(ArrayList.java:369)
            at java.base/java.util.ArrayList.addAll(ArrayList.java:752)
            at com.hartwig.hmftools.lilac.coverage.ComplexBuilder.combineComplexes(ComplexBuilder.java:261)
            at com.hartwig.hmftools.lilac.coverage.ComplexBuilder.buildAlleleComplexes(ComplexBuilder.java:220)
            at com.hartwig.hmftools.lilac.coverage.ComplexBuilder.buildComplexes(ComplexBuilder.java:193)
            at com.hartwig.hmftools.lilac.LilacApplication.run(LilacApplication.java:394)
            at com.hartwig.hmftools.lilac.LilacApplication.main(LilacApplication.java:731)
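    The failure itself is a Java heap OutOfMemoryError while LILAC builds allele complexes. One assumed mitigation, not a confirmed fix given the very low fragment count reported above, is to grant the LILAC process more memory through a process selector in a custom config, similar to the resource overrides shown elsewhere in this channel:
    // custom config passed with -c; the memory value is a placeholder
    process {
        withName: 'LILAC' {
            memory = 64.GB
        }
    }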

    Stephen Watts

    06/30/2025, 6:01 AM
    @here the oncoanalyser 2.1.0 release is now out! In this minor release we have focused on implementing and making available the PEACH, CIDER, and TEAL tools from WiGiTS. Also included in the release is a new metromap-style pipeline overview diagram 🎉

    Stephen Watts

    06/30/2025, 6:01 AM
    Please see the changelog and release notes for more details

    Chiamaka Jessica Okeke

    07/01/2025, 3:09 PM
    Hey everyone, I'm also working with 10 tumor-normal paired DNA samples (WES) and matched tumor RNA-seq. From the May 15th 2024 thread with @Stephen Watts and others, I understand that WES is treated as targeted and I need to generate panel-specific resource files. I'm currently looking to generate the panel-specific reference files, and learn if the support for tumor-normal WES in targeted mode is now merged or if there’s a patch I can apply. Would appreciate guidance or example configurations. Thanks

    Kairi Tanaka

    07/07/2025, 12:55 PM
    Hi everyone, I'm looking to run just LILAC from the oncoanalyser suite. I currently have 36 tumor–normal paired WES DNA samples, with somatic variants called using a consensus approach and copy number variants generated with CNVkit. I noticed on the GitHub that LILAC was originally designed to take SAGE somatic mutation output and PURPLE allele-specific CNV calls as input, but that other input sources are also supported. I'm wondering: • What are the implications of using alternative somatic variant and CNV callers? • Are there any best practices or known limitations when using non-SAGE or non-PURPLE inputs with LILAC? Any guidance or examples from others who have used custom inputs would be greatly appreciated.

    Tobi Agbede

    07/08/2025, 10:33 PM
    Hello everyone, I'm trying to run nf-core/oncoanalyser offline using Singularity on my Ultima CRAM files. I have set up everything needed, including configs, and when I run the pipeline I keep getting an error in REDUX. I have tried to fix this error by filtering for paired reads, but I have 0 paired reads, and I believe the pipeline works for single-end reads too? Can anyone please let me know how to navigate this when you can?
    002522.314 [Thread-49] [ERROR] read(id(413777_2-4215745261) coords(chr4:131253272-131253304) isPaired(false) cigar(110S33M75S) flags(2048)) exception: java.lang.IllegalStateException: Inappropriate call if not paired read
    java.lang.IllegalStateException: Inappropriate call if not paired read
          at htsjdk.samtools.SAMRecord.requireReadPaired(SAMRecord.java:892)
          at htsjdk.samtools.SAMRecord.getFirstOfPairFlag(SAMRecord.java:950)
          at com.hartwig.hmftools.redux.common.FragmentCoords.fromRead(FragmentCoords.java:231)
          at com.hartwig.hmftools.redux.ReadCache.processRead(ReadCache.java:67)
          at com.hartwig.hmftools.redux.PartitionReader.processSamRecord(PartitionReader.java:301)
          at com.hartwig.hmftools.redux.BamReader.sliceRegion(BamReader.java:82)
          at com.hartwig.hmftools.redux.PartitionReader.processRegion(PartitionReader.java:165)
          at com.hartwig.hmftools.redux.PartitionReader.processPartition(PartitionReader.java:119)
          at com.hartwig.hmftools.redux.PartitionThread.run(PartitionThread.java:59)
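    A quick sanity check on the pairing flags, assuming samtools and the matching reference FASTA are available (file names below are placeholders):
    samtools view -c -f 1 -T ref.fasta SAMPLE.cram    # count reads with the 'paired' flag set
    samtools view -c -F 1 -T ref.fasta SAMPLE.cram    # count reads without it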

    Sophie Herbst

    07/14/2025, 11:06 AM
    Hi, I need some help with running oncoanalyzer2.1.0. After dealing with this issue I was able to run the 2.0.0 version successfully. I am now trying to run 2.1.0, starting from the redux bam files which were created as output from 2.0.0. Somehow I keep getting this error within samtools within TEAL_prep:
    htsjdk.samtools.util.RuntimeIOException: IOException seeking to unmapped reads
    and have no clue how to solve this. Attached you can find my .nextflow.log and .command.log files. For the issues that I had with 2.0.0, where
    /local
    wasn’t getting bound to the singularity image, the issue also always occured in samtools, so I am unsure whether this might also have something to do with it? I would appreciate any help!

    Lipika

    07/14/2025, 11:37 AM
    Hi, I am trying to run oncoanalyser 2.1.0 on my tumor-only WGS BAM files. I am using the command below, along with the params file. I would appreciate any help.
    nextflow run nf-core/oncoanalyser \
     -profile docker \
     -r 2.1.0 \
     --mode wgts \
     --genome GRCh38_hmf \
     --input /home/isilon/patho_anemone-meso/fastq/dedup/samtools/Samplesheet.csv \
     --outdir results/ \
     -work-dir work/ \
     --processes_manual \
     --processes_include isofox,redux,amber,cobalt,purple,linx,sage,pave,esvee,lilac,cuppa \
     --redux_umi \
     --redux_umi_duplex_delim "-" \
     -c params.config \
     --resume
    
    params file:
    process {
        withName: '.*ALIGN'        { cpus = 12; memory = 72.GB; }
        withName: AMBER            { cpus = 16; memory = 24.GB; }
        withName: BAMTOOLS         { cpus = 16; memory = 24.GB; }
        withName: CHORD            { cpus = 4;  memory = 12.GB; }
        withName: COBALT           { cpus = 16; memory = 24.GB; }
        withName: CUPPA            { cpus = 4;  memory = 16.GB; }
        withName: 'ESVEE.*'        { cpus = 16; memory = 64.GB; }
        withName: LILAC            { cpus = 16; memory = 24.GB; }
        withName: 'LINX.*'         { cpus = 16; memory = 16.GB; }
        withName: REDUX            { cpus = 16; memory = 64.GB; }
        withName: ORANGE           { cpus = 4;  memory = 16.GB; }
        withName: 'PAVE.*'         { cpus = 8;  memory = 32.GB; }
        withName: PURPLE           { cpus = 8;  memory = 40.GB; }
        withName: 'SAGE.*'         { cpus = 16; memory = 64.GB; }
        withName: VIRUSBREAKEND    { cpus = 8;  memory = 64.GB; }
        withName: VIRUSINTERPRETER { cpus = 2;  memory = 8.GB;  }
    }
    
    process {
        resourceLimits = [
            cpus:   16,
            memory: 124.GB,
            disk:   1500.GB,
            time:   48.h
        ]
    }
    process {
        executor = "slurm"
    }
    
    params {
        // FASTQ UMI processing
        fastp_umi = true
        fastp_umi_location = "per_read"
        fastp_umi_length = 7
        fastp_umi_skip = 0
    
        // BAM UMI processing
        redux_umi = true
        redux_umi_duplex_delim = "-"
    
    }
    But I'm getting this error. I would appreciate any help.
    N E X T F L O W  ~ version 25.04.6
    Launching `https://github.com/nf-core/oncoanalyser` [nostalgic_ritchie] DSL2 - revision: 0d0dc258ce [2.1.0]
    
    ERROR ~ Cannot invoke method toUpperCase() on null object
    
     -- Check script '/home/gpfs/o_lipika/.nextflow/assets/nf-core/oncoanalyser/workflows/targeted.nf' at line: 13 or see '.nextflow.log' file for more details

    Kairi Tanaka

    07/19/2025, 7:39 PM
    Hi Oncoanalyser team, I'm encountering an issue while running the test pipeline — it seems the CUPPA image is corrupted. The command I ran was:
    nextflow run nf-core/oncoanalyser -profile test,singularity --outdir test_results/ -r 2.1.0
    Are there any best practices or recommended steps to prevent image corruption issues like this when using Singularity? Thanks in advance for your help!
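    One commonly used safeguard, offered as an assumption rather than an official recommendation: keep Singularity images in a persistent cache, so that a partially downloaded or corrupted .img can simply be deleted and re-pulled without touching the work directory. The cache path below is a placeholder:
    export NXF_SINGULARITY_CACHEDIR=/path/to/singularity_cache
    nextflow run nf-core/oncoanalyser -profile test,singularity --outdir test_results/ -r 2.1.0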

    Kairi Tanaka

    07/24/2025, 2:13 PM
    Hey team, I had a question about the custom panel training. I currently have matched tumor/normal WES as well as tumor bulk RNA-seq (melanoma). If my main goal is looking at results from LILAC to see if there is any loss of heterozygosity, as well as any effects on the antigen presentation pathway, does it make sense for me to do this training? I was able to run nextflow in targeted mode with the TSO500 panel. However, the documentation in README_TARGETED seems a bit confusing when trying to use a custom BED/reference. I was mainly curious whether, from what you have seen, there is a drastic difference in results between a WES custom panel and TSO500. Also, in terms of the training set, I'm assuming for the copy number it needs to be a pool of tumors? For SAGE, does it need to be a pool of normals, so 2 different training sets? If I have about 100 tumor samples, is there any recommendation on how many I should use to train (does it make sense to use all 100 or just a subset)? Thanks!