Run SeekArcTools
Run tests
Example 1: Basic usage
Set up the configuration needed for the analysis, including the sample data paths and the reference genome path, then run SeekArcTools with the following command:
seekarctools_py arc run \
--rnafq1 /path/to/demo/demo_GE_S1_L001_R1_001.fastq.gz \
--rnafq2 /path/to/demo/demo_GE_S1_L001_R2_001.fastq.gz \
--atacfq1 /path/to/demo/demo_arc_S2_L002_R1_001.fastq.gz \
--atacfq2 /path/to/demo/demo_arc_S2_L002_R2_001.fastq.gz \
--samplename demo \
--outdir /path/to/outdir \
--refpath /path/to/reference/GRCh38 \
--include-introns \
--core 16
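A mistyped FASTQ path may only surface once the run is under way, so a quick pre-flight check before launching can be useful. The following is a minimal sketch, not part of SeekArcTools: `check_fastqs` is a hypothetical helper name, and it only assumes that the input paths are passed as arguments.

```shell
# Hypothetical helper (not provided by SeekArcTools):
# verify that every FASTQ path given as an argument exists and is non-empty.
# Returns 0 if all files pass, 1 if any file is missing or empty.
check_fastqs() {
    local status=0
    local fq
    for fq in "$@"; do
        if [ ! -s "$fq" ]; then
            echo "missing or empty FASTQ: $fq" >&2
            status=1
        fi
    done
    return "$status"
}

# Intended use (placeholder paths taken from the example above):
#   check_fastqs \
#       /path/to/demo/demo_GE_S1_L001_R1_001.fastq.gz \
#       /path/to/demo/demo_GE_S1_L001_R2_001.fastq.gz \
#       /path/to/demo/demo_arc_S2_L002_R1_001.fastq.gz \
#       /path/to/demo/demo_arc_S2_L002_R2_001.fastq.gz \
#     && echo "inputs look fine, ready to launch the run"
```

The check reports every problematic path rather than stopping at the first one, so all path errors can be fixed in a single pass.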
Example 2: Adjusting thresholds for re-running peak calling or cell calling
If the peaks or cells identified under the default parameters do not meet your requirements, adjust the parameters and re-run the workflow with the retry subcommand. This skips the preceding steps (e.g., alignment), so results can be optimized without repeating the full analysis.
seekarctools_py arc retry \
--samplename demo \
--outdir /path/to/outdir \
--refpath /path/to/reference/GRCh38 \
--core 16 \
--qvalue 0.01 \
--snapshift 0 \
--extsize 200 \
--min_len 200 \
--min_atac_count 1000 \
--min_gex_count 500
Note: Ensure all files from previous runs remain intact and are not deleted or relocated. `--outdir` specifies the directory path from the initial SeekArcTools analysis. `--min_atac_count` must be used together with `--min_gex_count`; it will not take effect if used alone.
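The pairing rule in the note is easy to overlook when editing a long command line. The sketch below shows one way to catch it before a retry is launched. `check_count_flags` is a hypothetical helper, not part of SeekArcTools. As a conservative assumption it rejects either flag when used without the other, although the note only states this for `--min_atac_count`.

```shell
# Hypothetical helper (not provided by SeekArcTools):
# scan a planned argument list and confirm that --min_atac_count and
# --min_gex_count are either both present or both absent.
# Assumption: a lone --min_gex_count is also treated as an error.
check_count_flags() {
    local has_atac=0
    local has_gex=0
    local arg
    for arg in "$@"; do
        case "$arg" in
            --min_atac_count) has_atac=1 ;;
            --min_gex_count)  has_gex=1 ;;
        esac
    done
    if [ "$has_atac" -ne "$has_gex" ]; then
        echo "--min_atac_count and --min_gex_count must be given together" >&2
        return 1
    fi
    return 0
}

# Intended use: pass the same arguments planned for the retry command, e.g.
#   check_count_flags --samplename demo --min_atac_count 1000 --min_gex_count 500
```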
Parameter descriptions
Parameter | Description |
---|---|
`--rnafq1` | Paths to the R1 FASTQ files of the RNA library. |
`--rnafq2` | Paths to the R2 FASTQ files of the RNA library. |
`--atacfq1` | Paths to the R1 FASTQ files of the ATAC library. |
`--atacfq2` | Paths to the R2 FASTQ files of the ATAC library. |
`--samplename` | Sample name. |
`--outdir` | Output directory. Default: ./ |
`--skip_misB` | If enabled, no base mismatch is allowed in the barcode. By default, 1 mismatch is allowed. |
`--skip_misL` | If enabled, no base mismatch is allowed in the linker. By default, 1 mismatch is allowed. |
`--skip_multi` | If enabled, discard reads that can be corrected to multiple whitelisted barcodes. By default, such barcodes are corrected to the barcode with the highest frequency. |
`--skip_len` | Skip the filtering of short reads after adapter filtering; short reads will be used. |
`--core` | Number of threads used for the analysis. |
`--include-introns` | When disabled, only exon reads are used for quantification. When enabled, intron reads are also used for quantification. |
`--refpath` | Path to the reference genome. |
`--star_path` | Path to an external STAR installation. If the index in the reference genome was built by another STAR, specify its path. |
`--qvalue` | Minimum FDR (q-value) cutoff for peak detection. Default: 0.05. |
`--nolambda` | If set, MACS3 uses the background lambda as the local lambda, which means MACS3 does not consider the local bias at peak candidate regions. |
`--snapshift` | MACS3 peak detection shift size. Default: 0. |
`--extsize` | MACS3 peak detection extension size. Default: 400. |
`--min_len` | Minimum length of peaks. If not set, it is set to the value of `--extsize`. Default: 400. |
`--broad` | If enabled, perform broad peak calling and generate results in UCSC gappedPeak format, which encapsulates the nested structure of peaks. |
`--broad_cutoff` | Threshold for broad peak calling. Default: 0.1. |
`--min_atac_count` | Cell caller override: the minimum number of ATAC transposition events in peaks (ATAC counts) for a cell barcode. |
`--min_gex_count` | Cell caller override: the minimum number of GEX UMI counts for a cell barcode. |
`-h`, `--help` | Show these parameter descriptions. |