# 00_Contig screen ## Fastp :was used to filter adapter sequences, primers and other low quality sequence from raw sequencing reads. SPART/00_Contig_screen/fastp.sh $HiFi_reads $ONT_reads ## Hifiasm SPART/00_Contig_screen/hifiasm.sh $HiFi_reads $ONT_reads $output_prefix ## Verkko SPART/00_Contig_screen/verkko.sh $output_prefix $HiFi_reads $ONT_reads $threads $memory ## Flye SPART/00_Contig_screen/flye.sh $ONT_reads $output_prefix $threads ## Remove MT & CP SPART/00_Contig_screen/rm_mt_cp.sh $mitochondrion $chloroplast $ref # 01_Contig scaffolding ## Bionano SPART/01_Contig_scaffolding/Bionano_DLS_map.sh threads bnx ref_cmap prefix xml Bio_dir cluster_xml ref bio_camp merge_xml RefAligner ## Hi-C SPART/01_Contig_scaffolding/HiC-Pro.sh ref ref_prefix hicpro_data hicpro_config hicpro_outdir SPART/01_Contig_scaffolding/yahs.sh enzyme ref bed/bam/bin profix # 02_Gap patching SPART/02_Gap_patching/wfmash_ragtag.sh prefix ref region ## Manual operation cd ragtag_output perl SPART/02_Gap_patching/paf_filter.pl -i ragtag.patch.debug.filtered.paf -minlen 10000000 -iden 0.5 **Manually editing the ragtag.patch.debug.filtered.paf file.Keep the high-quality contig and preserve the location of the only high confidence match in ragtag.patch.debug.filtered.paf that matches the sequence at both ends of the gap.** perl SPART/02_Gap_patching/renameagp.pl -i ragtag.patch.ctg.agp -i1 ragtag.patch.debug.filtered.paf -start seq00000000 -end seq00000001 -o test.agp **Test.agp is merged into ragtag.patch.agp and fasta is generated.** ## telomere patching We used _submit_telomere.sh in ONT reads >100kb.ONT reads with telomere sequence mapping to this locus based on minimap2 alignments were manually identified. The longest was selected as template , all others aligned to it and polished with Medaka: medaka -v -i ONT_tel_reads.fasta -d longest_ont_tel.fasta -o ont_tel_medaka.fasta Telomere signal in all HiFi reads was identified with the commands: _submit_telomere.sh hifi_reads.fasta Additional HiFi reads were recruited from a manual analysis. We looked for trimmed tips that could extend. All reads had telomere signal and were aligned to the medaka consensus and polished with Racon with the commands: minimap2 -t16 -ax map-pb ont_tel_medaka.fasta hifi_tel.fasta > medaka.sam racon hifi_tel.fasta medaka.sam ont_tel_medaka.fasta > racon.fasta Finally, the polished result was patched into the assembly with ragtag patch or manually patched. ### Citation https://github.com/marbl/CHM13-issues/blob/main/error_detection.md. ## Centromeric region analysis SPART/02_Gap_patching/Centromeric_region_analysis.sh workdir FASTA INDEX prefix CHIP1 CHIP2 threads # 03_Polishing SPART/03_Polishing/calsv_snv.sh workdir ref threads # 04_Evaluation ## BUSCO SPART/04_Evaluation/BUSCO.sh ref prefix ## mapping rates & coverages SPART/04_Evaluation/mapping_rates_coverages.sh hybrid_bam single_bam ont_bam ## LTR SPART/04_Evaluation/ltr.sh ref prefix ## QV SPART/04_Evaluation/qv.sh query ref ## BACs SPART/04_Evaluation/bac.sh bac_reads ref_chr