Basic Usage
One command pipeline of C-Phasing
pipeline
The -n 8:4 parameter in the following commands means assembling a tetraploid (4) with a basic chromosome number of 8. Setting -n 0:0 partitions both rounds automatically; -n 8:0 and -n 0:4 are also supported.
Note
CPhasing also supports monoploid scaffolding when you set a single group number, e.g. -n 8. The pipeline will then automatically skip step 1.alleles and run only one round of partitioning.
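The forms of -n described above can be summarised in a small shell sketch. The helper describe_n is hypothetical and not part of C-Phasing. It also assumes that a 0 in either position means that value is chosen automatically, which extends the documented meaning of -n 0:0 to -n 8:0 and -n 0:4.

```shell
# Hypothetical helper (not part of C-Phasing): explains how a -n value is read.
# Assumption: a 0 in either position means "determine automatically".
describe_n() {
    local value="$1" chrom ploidy
    case "$value" in
        *:*)
            chrom="${value%%:*}"    # part before the colon: basic chromosome number
            ploidy="${value##*:}"   # part after the colon: ploidy
            if [ "$chrom" = "0" ]; then chrom="auto"; fi
            if [ "$ploidy" = "0" ]; then ploidy="auto"; fi
            echo "chromosomes=$chrom ploidy=$ploidy rounds=2"
            ;;
        *)
            # A single number: monoploid scaffolding, one round of partitioning.
            echo "chromosomes=$value monoploid rounds=1"
            ;;
    esac
}

describe_n 8:4   # tetraploid with 8 basic chromosomes
describe_n 0:0   # both rounds partitioned automatically
describe_n 8     # monoploid, step 1.alleles is skipped
```

The helper only prints a description; the value itself is passed unchanged to cphasing pipeline through -n.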
Start from Pore-C data:
Start from multiple Pore-C data files:
Specify multiple -pcd parameters, one per input file.
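As a minimal sketch of repeating -pcd, the snippet below builds one -pcd option per input file and prints the resulting command for review instead of running it. The helper build_pcd_args and all file names are placeholders; the -f, -t and -n values are borrowed from the other examples on this page.

```shell
# Hypothetical helper: emit one "-pcd FILE" pair per input file.
# Note: file names containing spaces are not handled in this sketch.
build_pcd_args() {
    local args="" f
    for f in "$@"; do
        args="$args -pcd $f"
    done
    echo "${args# }"   # drop the leading space
}

# Placeholder file names; replace with your own Pore-C read files.
pcd_args=$(build_pcd_args sample1.fastq.gz sample2.fastq.gz)

# Print the command so it can be checked before it is run.
echo "cphasing pipeline -f draft.asm.fasta $pcd_args -t 10 -n 8:4"
```

Printing the command first makes it easy to confirm that each file received its own -pcd option.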
Note
If you want to run on a cluster and distribute the jobs across multiple nodes, you can use cphasing mapper and cphasing-rs porec-merge to generate a merged porec.gz file and pass it with the -pct parameter.
Start from a Pore-C table (porec.gz):
The table is generated by cphasing mapper.
Start from paired-end Hi-C data
Note
- If you want to run multiple samples, you can use cphasing hic mapper and cphasing-rs pairs-merge to generate a merged pairs.gz file and pass it with the -prs parameter.
- If the total length of your input genome is larger than 8 Gb, specify -hic-mapper-k 27 -hic-mapper-w 14 to avoid a chromap error.
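The 8 Gb rule above can be checked before launching a run. The following sketch sums the sequence length of a FASTA file and prints the extra mapper options only when the threshold is exceeded. The helper names are hypothetical, and 8 Gb is assumed to mean 8,000,000,000 bases.

```shell
# Hypothetical helper: total sequence length of a FASTA file.
# Header lines (starting with ">") are skipped; every other line is counted.
genome_length() {
    awk '!/^>/ { total += length($0) } END { printf "%.0f\n", total }' "$1"
}

# Hypothetical helper: print the extra Hi-C mapper options only when the
# genome is larger than 8 Gb (assumed here to be 8,000,000,000 bases).
hic_mapper_opts() {
    local len
    len=$(genome_length "$1")
    if [ "$len" -gt 8000000000 ]; then
        echo "-hic-mapper-k 27 -hic-mapper-w 14"
    fi
}

# Usage (draft.asm.fasta is a placeholder name):
#   extra_opts=$(hic_mapper_opts draft.asm.fasta)
#   echo "extra options: ${extra_opts:-none}"
```

If the function prints the options, append them to your cphasing pipeline command; otherwise no extra options are needed.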
Start from a 4DN pairs file
Skip some steps
# skip steps 1.alleles and 2.prepare
cphasing pipeline -f draft.asm.fasta -pct sample.porec.gz -t 10 -ss 1,2
Perform only specified steps
Improve performance
Add the -hcr parameter to remove greedy contacts (a few regions that contact the whole genome) and improve phasing quality.
Curation by Juicebox
- Generate the .assembly and .hic files (depends on 3d-dna)
cphasing pairs2mnd sample.pairs.gz -o sample.mnd.txt
cphasing utils agp2assembly groups.agp > groups.assembly
bash ~/software/3d-dna/visualize/run-assembly-visualizer.sh groups.assembly sample.mnd.txt
Note
If chimeric contigs were corrected, please use groups.corrected.agp and generate a new corrected.pairs.gz with cphasing-rs pairs-break.
- After curation
## convert assembly to agp
cphasing utils assembly2agp groups.review.assembly -n 8:4
## or haploid or a homologous group
cphasing utils assembly2agp groups.review.assembly -n 8
## extract contigs from agp
cphasing agp2fasta groups.review.agp draft.asm.fasta --contigs > contigs.fasta
## extract chromosome-level fasta from agp
cphasing agp2fasta groups.review.agp draft.asm.fasta > groups.review.asm.fasta
Rename
Rename and orient chromosomes according to a monoploid reference (or the genome of a closely related species).
Note
To reduce runtime, we align only the first haplotype (g1) to the monoploid, because the orientation of the different haplotypes has already been made consistent in the scaffolding step. If this is not the case, you can set --unphased to align all haplotypes to the monoploid and adjust the orientation.