How to Use HISAT2

Published: 2022-03-19 09:40:35 | Source: Yisu Cloud | Author: iii | Category: Development Technology

Many users find HISAT2 hard to get started with, so this article summarizes how to use it: the command-line synopsis, the meaning of each argument, and the available options grouped by purpose.

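Usage Notes for the Transcriptome Aligner HISAT2

The standard transcriptome analysis workflow has largely moved from the TopHat + Cufflinks combination to HISAT + StringTie. The protocol for this combination is described in the Nature Protocols article "Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown".

We start with the aligner, HISAT, which is considerably faster and more accurate than TopHat.

Its usage is as follows: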

hisat2 [options]* -x <ht2-idx> {-1 <m1> -2 <m2> | -U <r> | --sra-acc <SRA accession number>} [-S <sam>]

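<ht2-idx> Prefix of the index files (the file names minus the trailing .X.ht2)

<m1> Read 1 (mate 1) files (gzip and bzip2 compressed files are supported)

<m2> Read 2 (mate 2) files (gzip and bzip2 compressed files are supported)

<r> Input files with unpaired reads (gzip and bzip2 compressed files are supported)

<SRA accession number> Reads are downloaded directly from NCBI SRA; separate multiple accession numbers with commas

<sam> Output file for the SAM alignments (default: standard output)

<m1>, <m2>, <r> can each be a comma-separated list of files, and each option can be given more than once, e.g. '-U file1.fq,file2.fq -U file3.fq'.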
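
As a rough illustration of the synopsis above, the following Python sketch assembles a hisat2 argument list for a paired-end and a single-end run and prints it as a shell command. The helper name `hisat2_command` and all file names (`genome_index`, `sample_R1.fq.gz`, and so on) are placeholders invented for this example, not part of HISAT2; the script only prints the commands and does not run the aligner.

```python
import shlex


def hisat2_command(index_prefix, sam_out, mates1=None, mates2=None,
                   unpaired=None, extra=()):
    """Build a hisat2 argument list that follows the synopsis (does not run it)."""
    has_pairs = bool(mates1) and bool(mates2)
    if not has_pairs and not unpaired:
        raise ValueError("need -1/-2 mate files or -U unpaired files")
    cmd = ["hisat2", *extra, "-x", index_prefix]
    if has_pairs:
        # Several files for one option are joined with commas.
        cmd += ["-1", ",".join(mates1), "-2", ",".join(mates2)]
    if unpaired:
        cmd += ["-U", ",".join(unpaired)]
    cmd += ["-S", sam_out]
    return cmd


# Hypothetical file names, used only for illustration.
paired = hisat2_command("genome_index", "sample.sam",
                        mates1=["sample_R1.fq.gz"],
                        mates2=["sample_R2.fq.gz"])
single = hisat2_command("genome_index", "single.sam",
                        unpaired=["file1.fq", "file2.fq"])

print(shlex.join(paired))
print(shlex.join(single))
```

To actually run a command built this way, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where hisat2 is installed and the index exists.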

Options (defaults in parentheses):

Input:

-q Input files are FASTQ .fq/.fastq (default)

--qseq Input files are in Illumina's qseq format

-f Input files are (multi-)FASTA .fa/.mfa

-r Input files are raw sequences, one sequence per line

-c <m1>, <m2>, <r> are the sequences themselves, not files

-s/--skip <int> Skip the first <int> reads/pairs in the input (none)

-u/--upto <int> Stop after the first <int> reads/pairs (no limit)

-5/--trim5 <int> Trim <int> bases from the 5'/left end of reads (0)

-3/--trim3 <int> Trim <int> bases from the 3'/right end of reads (0)

--phred33 Quality values are encoded as Phred+33 (default)

--phred64 Quality values are encoded as Phred+64

--int-quals Quality values are given as space-delimited integers

--sra-acc SRA accession number
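
The difference between --phred33 and --phred64 is only the ASCII offset used to store each quality score as a character. The small Python sketch below decodes a quality string under either offset; the function name and the example strings are made up for illustration and are not part of HISAT2.

```python
def decode_qualities(qual_string, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores.

    offset=33 corresponds to --phred33, offset=64 to --phred64.
    """
    return [ord(ch) - offset for ch in qual_string]


# Made-up quality strings, for illustration only.
phred33_scores = decode_qualities("II5?#")
phred64_scores = decode_qualities("hhTc", offset=64)

print(phred33_scores)
print(phred64_scores)
```

Decoding a Phred+64 file with the Phred+33 offset would inflate every score by 31, which is why the encoding flag has to match the input data.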

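Alignment:

--n-ceil <func> Function giving the maximum number of non-A/C/G/T characters permitted in an alignment (L,0,0.15)

--ignore-quals Ignore the read quality values and treat every quality as 30 on the Phred scale (off)

--nofw Do not align the forward (original) version of the read (off)

--norc Do not align the reverse-complement version of the read (off)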

Spliced alignment:

--pen-cansplice <int> Penalty for a canonical splice site (0)

--pen-noncansplice <int> Penalty for a non-canonical splice site (12)

--pen-canintronlen <func> Penalty function for long introns with canonical splice sites (G,-8,1)

--pen-noncanintronlen <func> Penalty function for long introns with non-canonical splice sites (G,-8,1)

--min-intronlen <int> Minimum intron length (20)

--max-intronlen <int> Maximum intron length (500000)

--known-splicesite-infile <path> Provide a file of known splice sites

--novel-splicesite-outfile <path> Write the splice sites found during alignment to <path>

--novel-splicesite-infile <path> Provide a file of novel splice sites

--no-temp-splicesite Disable the use of splice sites found during alignment

--no-spliced-alignment Disable spliced alignment

--rna-strandness <string> Specify the strand-specific information of the RNA library (unstranded)

--tmo Report only alignments that fall within the known transcriptome

--dta Report alignments tailored for transcript assemblers

--dta-cufflinks Report alignments tailored specifically for Cufflinks
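
For the HISAT + StringTie workflow mentioned at the start, the spliced-alignment options are typically combined. The sketch below assembles such an option list in Python. It assumes, following the HISAT2 manual, that --rna-strandness takes F or R for single-end reads and FR or RF for paired-end reads; the helper name `spliced_options` and the file names (`splicesites.txt`, `r1.fq`, and so on) are hypothetical.

```python
import shlex

# Accepted --rna-strandness values per library layout (per the HISAT2 manual).
VALID_STRANDNESS = {"single": {"F", "R"}, "paired": {"FR", "RF"}}


def spliced_options(layout, strandness=None, known_sites=None,
                    for_assembly=False):
    """Return spliced-alignment options as a list of arguments."""
    opts = []
    if strandness is not None:
        if strandness not in VALID_STRANDNESS[layout]:
            raise ValueError(f"{strandness!r} is not valid for {layout} reads")
        opts += ["--rna-strandness", strandness]
    if known_sites:
        opts += ["--known-splicesite-infile", known_sites]
    if for_assembly:
        # --dta tailors the reported alignments for transcript assemblers.
        opts.append("--dta")
    return opts


opts = spliced_options("paired", strandness="RF",
                       known_sites="splicesites.txt", for_assembly=True)
cmd = ["hisat2", *opts, "-x", "genome_index",
       "-1", "r1.fq", "-2", "r2.fq", "-S", "out.sam"]
print(shlex.join(cmd))
```

Leaving `strandness` unset keeps the default unstranded behaviour, so the option only needs to be added for strand-specific libraries.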

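Scoring:

--ma <int> Match bonus (0 for --end-to-end, 2 for --local)

--mp <int>,<int> Maximum and minimum penalties for a mismatch; lower quality = lower penalty <2,6>

--sp <int>,<int> Maximum and minimum penalties for soft-clipping; lower quality = lower penalty <1,2>

--np <int> Penalty for non-A/C/G/T characters in the read or reference (1)

--rdg <int>,<int> Read gap open and extend penalties (5,3)

--rfg <int>,<int> Reference gap open and extend penalties (5,3)

--score-min <func> Minimum acceptable alignment score, as a function of read length (L,0.0,-0.2)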
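
Options that take a <func> value, such as --score-min, --n-ceil and --pen-canintronlen, use a three-part notation: a letter for the function type followed by a constant term and a coefficient. Under the convention HISAT2 inherits from Bowtie 2, L means linear, S square root and G natural logarithm, so L,0.0,-0.2 stands for f(x) = 0.0 + (-0.2) * x. The Python sketch below evaluates such a specification; the function name `eval_func` is made up for this example, and how HISAT2 applies the result internally (rounding, clamping) is not modelled here.

```python
import math


def eval_func(spec, x):
    """Evaluate a '<type>,<constant>,<coefficient>' function spec at x."""
    kind, constant, coefficient = spec.split(",")
    a, b = float(constant), float(coefficient)
    if kind == "L":          # linear: a + b * x
        return a + b * x
    if kind == "S":          # square root: a + b * sqrt(x)
        return a + b * math.sqrt(x)
    if kind == "G":          # natural log: a + b * ln(x)
        return a + b * math.log(x)
    raise ValueError(f"unsupported function type: {kind!r}")


read_length = 100
min_score = eval_func("L,0.0,-0.2", read_length)   # default --score-min
n_ceiling = eval_func("L,0,0.15", read_length)     # default --n-ceil
intron_term = eval_func("G,-8,1", 500000)          # default --pen-canintronlen

print(min_score, n_ceiling, round(intron_term, 2))
```

With the defaults and a 100 bp read, the minimum acceptable score works out to about -20 and the ceiling on ambiguous bases to about 15, which gives a feel for how these functions scale with read length.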

Reporting:

(default) Search for multiple alignments and report only the best one

-k <int> Report up to <int> alignments per read

-a/--all Report all alignments

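Paired-end:

--fr/--rf/--ff Expected orientation of the -1 and -2 mates: fw/rev, rev/fw, fw/fw (--fr)

--no-mixed Suppress unpaired alignments for paired reads

--no-discordant Suppress discordant alignments for paired reads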

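Output:

-t/--time Print the wall-clock time taken by the search phases

--un <path> Write unpaired reads that did not align to <path>

--al <path> Write unpaired reads that aligned at least once to <path>

--un-conc <path> Write pairs that did not align concordantly to <path>

--al-conc <path> Write pairs that aligned concordantly at least once to <path>

(Note: for --un, --al, --un-conc, or --al-conc, add '-gz' to the option name, e.g. --un-gz <path>, to gzip compress output, or add '-bz2' to bzip2 compress output.)

--quiet Print nothing to standard error except serious errors

--met-file <path> Save metrics to the file <path> (off)

--met-stderr Print metrics to standard error (off)

--met <int> Report internal counters and metrics every <int> seconds (1)

--no-head Suppress the header lines in the SAM output

--no-sq Suppress the @SQ header lines in the SAM output

--rg-id <text> Set the read group ID

--rg <text> Add <text> to the read group (@RG) line of the SAM header

--omit-sec-seq Put '*' in the SEQ and QUAL fields for secondary alignments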
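
Two of the output options are often used together in practice: saving read pairs that failed to align concordantly, and tagging the alignments with a read group. The Python sketch below builds those arguments. The helper name `output_options`, the sample name, and the read group fields (`SM`, `PL`) are illustrative assumptions; the "-gz" suffix follows the note above.

```python
import shlex


def output_options(unaligned_path=None, compress=True,
                   read_group_id=None, read_group_fields=None):
    """Return output-related hisat2 options as a list of arguments."""
    opts = []
    if unaligned_path:
        # Appending '-gz' to the option name requests gzip-compressed output.
        flag = "--un-conc-gz" if compress else "--un-conc"
        opts += [flag, unaligned_path]
    if read_group_id:
        opts += ["--rg-id", read_group_id]
        # Each --rg adds one field to the @RG header line.
        for key, value in (read_group_fields or {}).items():
            opts += ["--rg", f"{key}:{value}"]
    return opts


# Hypothetical sample name and paths, for illustration only.
out_opts = output_options(unaligned_path="unaligned.fq.gz",
                          read_group_id="sample1",
                          read_group_fields={"SM": "sample1", "PL": "ILLUMINA"})
cmd = ["hisat2", *out_opts, "-x", "genome_index",
       "-1", "r1.fq", "-2", "r2.fq", "-S", "out.sam"]
print(shlex.join(cmd))
```

The read group fields are only added when an ID is given, since a set of --rg fields without --rg-id has no group to attach to.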

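Performance:

-o/--offrate <int> Override the offrate of the index

-p/--threads <int> Number of alignment threads (1)

--reorder Force the SAM output to keep the same read order as the input

--mm Use memory-mapped I/O for the index, so that several aligner processes can share it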

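Other:

--qc-filter Filter out reads that fail the QSEQ quality filter

--seed <int> Seed for the random number generator (0)

--non-deterministic Seed the random number generator arbitrarily instead of deriving it from the read attributes

--remove-chrname Remove 'chr' from reference sequence names in the alignments

--add-chrname Add 'chr' to reference sequence names in the alignments

--version Print the software version

-h/--help Print the usage message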

