TDKC (Target Distilled K-mer Classifier): Ultrafast and Memory-Efficient Sequence Classification for Target Pathogen Diagnostics

Preprint (bioRxiv). Created on 07 Jun 2026.

Metagenomic sequencing can identify pathogens from clinical samples without prior knowledge of the causative agent. Yet, as sequencing workflows scale to process thousands of multiplexed samples simultaneously, classifying these samples against massive reference databases creates a significant computational bottleneck. Furthermore, large-scale applications such as screening public sequence repositories remain computationally challenging. Existing metagenomic classifiers are designed for full-taxon classification, where the goal is to identify all organisms in a sample. However, many diagnostic applications focus on detecting a specific set of clinically relevant pathogens. This constraint can be exploited to significantly lower computational costs. Here we present TDKC (Target Distilled K-mer Classifier), a method for targeted metagenomic classification. TDKC constructs a compact index by distilling target-specific k-mers from a full-taxon reference database. When classifying clinical samples, TDKC uses 16.9-33.6x less memory and is 5.2-34.3x faster than per-read full-taxon and targeted classifiers (Kraken2, Centrifuger, CLARK), while maintaining high sensitivity and low false positive rates. Against the sketch-based profiler Sylph, TDKC remains 4.2x faster and uses 8.5x less memory. TDKC also supports per-k-mer accession tracking across over 3 million source accessions for downstream subtype analysis, and domain-level detection of bacteria, archaea, and viruses. By reducing the index to only the pathogens of interest, TDKC makes targeted pathogen detection feasible at scale.

Lee, S., Agarwal, V., O'Brien, W., Eskin, E.
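
The abstract describes distilling target-specific k-mers from a full-taxon reference database into a compact index. The sketch below illustrates that general idea only; it is not TDKC's implementation, which the abstract does not detail. It assumes exact-match k-mers held in a plain Python dictionary, and every name in it (`kmers`, `distill_index`, `classify`, `min_hits`, the toy sequences) is hypothetical. A k-mer is kept only if it occurs in exactly one taxon and that taxon is a target, so reads from non-target organisms find no hits in the reduced index.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def distill_index(full_db, targets, k):
    """Build a reduced index of target-specific k-mers.

    full_db maps a taxon name to a reference sequence (a toy stand-in
    for a full-taxon database). A k-mer survives only if it is found
    in exactly one taxon and that taxon is in targets.
    """
    owners = {}
    for taxon, seq in full_db.items():
        for km in set(kmers(seq, k)):
            owners.setdefault(km, set()).add(taxon)
    index = {}
    for km, taxa in owners.items():
        if len(taxa) == 1:
            (taxon,) = taxa
            if taxon in targets:
                index[km] = taxon
    return index


def classify(read, index, k, min_hits=2):
    """Return the target taxon with the most k-mer hits, or None.

    min_hits is an assumed threshold that guards against calls based
    on a single chance match.
    """
    hits = Counter(index[km] for km in kmers(read, k) if km in index)
    if not hits:
        return None
    taxon, count = hits.most_common(1)[0]
    return taxon if count >= min_hits else None


# Toy database: one target pathogen and two non-target organisms.
full_db = {
    "pathogen_A": "ACGTACGGTTCA",
    "commensal_B": "ACGTACCCCCCC",
    "organism_C": "TTTTGGGGAAAA",
}
index = distill_index(full_db, targets={"pathogen_A"}, k=4)
```

With this toy data, the three k-mers that pathogen_A shares with commensal_B are dropped, so the index holds only the k-mers unique to the target. A production classifier would also need canonical (strand-independent) k-mers, a compact on-disk encoding, and error-tolerant thresholds, all of which are omitted here.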
