breseq

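 breseq is a mutation-analysis tool for small genomes of up to about 10 Mb. It detects mutations such as indels in haploid bacterial genomes and reports them down to the affected gene locus.
 A usage example follows.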

Trimming with fastp

(base) ~ % cd ~/Desktop
(base) Desktop % cd 20210927
(base) 20210927 % ls
A_baumannii_ATCC19606_AP022836.fa DNBSEQ_GT1_Read1.fq.gz
A_baumannii_ATCC19606_AP022836.gbk DNBSEQ_GT1_Read2.fq.gz
(base) 20210927 % gunzip DNBSEQ_GT1_Read1.fq.gz # decompress the gzip file with gunzip
(base) 20210927 % gunzip DNBSEQ_GT1_Read2.fq.gz
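(base) 20210927 % fastp -i DNBSEQ_GT1_Read1.fq -I DNBSEQ_GT1_Read2.fq -o DNBSEQ_GT1_Read1_trimmed.fq -O DNBSEQ_GT1_Read2_trimmed.fq -t 2 -T 2 -A -L -Y 0 # trim with fastp: -t/-T 2 trim 2 bases from the tail of read 1/read 2, -A disables adapter trimming, -L disables the length filter, -Y sets the low-complexity threshold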
Read1 before filtering:
total reads: 4619466
total bases: 692919900
Q20 bases: 657161743(94.8395%)
Q30 bases: 577169912(83.2953%)
Read2 before filtering:
total reads: 4619466
total bases: 692919900
Q20 bases: 630907173(91.0505%)
Q30 bases: 524428115(75.6838%)

Read1 after filtering:
total reads: 4555819
total bases: 674261212
Q20 bases: 640675291(95.0189%)
Q30 bases: 563651767(83.5955%)

Read2 aftering filtering:
total reads: 4555819
total bases: 674261212
Q20 bases: 617098128(91.5221%)
Q30 bases: 513925131(76.2205%)

Filtering result:
reads passed filter: 9111638
reads failed due to low quality: 63338
reads failed due to too many N: 63956

Duplication rate: 0.342311%

Insert size peak (evaluated by paired-end reads): 264

JSON report: fastp.json
HTML report: fastp.html

fastp -i DNBSEQ_GT1_Read1.fq -I DNBSEQ_GT1_Read2.fq -o DNBSEQ_GT1_Read1_trimmed.fq -O DNBSEQ_GT1_Read2_trimmed.fq -t 2 -T 2 -A -L -Y 0
fastp v0.20.1, time used: 27 seconds
(base) 20210927 % ls
A_baumannii_ATCC19606_AP022836.fa DNBSEQ_GT1_Read2.fq
A_baumannii_ATCC19606_AP022836.gbk DNBSEQ_GT1_Read2_trimmed.fq
DNBSEQ_GT1_Read1.fq fastp.html
DNBSEQ_GT1_Read1_trimmed.fq fastp.json
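
The "Filtering result" block in the fastp log above can be turned into a single pass rate. The following is a minimal sketch, assuming the console output of fastp was saved to a text file (for example by redirecting stderr; the file name fastp.log in the comment is hypothetical). The function name fastp_pass_rate is also made up for illustration.

```shell
#!/bin/sh
# Sketch: percentage of reads that passed fastp filtering, computed from the
# "Filtering result" lines of a saved fastp console log.
# Assumption: the log was captured to a file, e.g. `fastp ... 2> fastp.log`.
fastp_pass_rate() {
    # $1: text file holding fastp's console output
    awk -F': ' '
        /^reads passed filter/ { passed = $2 }    # reads kept
        /^reads failed due to/ { failed += $2 }   # sum over every failure reason
        END {
            if (passed + failed == 0) { print "no filtering result found"; exit 1 }
            printf "%.2f\n", 100 * passed / (passed + failed)
        }
    ' "$1"
}

# Demo with the numbers printed in the log above.
log=$(mktemp)
cat > "$log" <<'EOF'
Filtering result:
reads passed filter: 9111638
reads failed due to low quality: 63338
reads failed due to too many N: 63956
EOF
fastp_pass_rate "$log"   # prints 98.62
rm -f "$log"
```

Here 9111638 of 9238932 reads (2 x 4619466) pass, so only about 1.4% of the reads are discarded by the quality and N filters.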

Installing breseq in a virtual environment

Create a virtual environment named breseq_py3.6 with Python 3.6, then activate it.

(base) 20210927 % conda create -n breseq_py3.6 python=3.6 # create the breseq_py3.6 environment with Python 3.6
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##

environment location: /Users/ocumbacteriology/opt/anaconda3/envs/breseq_py3.6

added / updated specs:
- python=3.6


The following packages will be downloaded:

package | build
---------------------------|-----------------
python-3.6.13 |haf480d7_2_cpython 20.5 MB conda-forge
setuptools-58.0.4 | py36h79c6626_2 960 KB conda-forge
------------------------------------------------------------
Total: 21.4 MB

The following NEW packages will be INSTALLED:

ca-certificates conda-forge/osx-64::ca-certificates-2021.5.30-h033912b_0
libcxx conda-forge/osx-64::libcxx-12.0.1-habf9029_0
libffi conda-forge/osx-64::libffi-3.4.2-he49afe7_4
ncurses conda-forge/osx-64::ncurses-6.2-h2e338ed_4
openssl conda-forge/osx-64::openssl-1.1.1l-h0d85af4_0
pip conda-forge/noarch::pip-21.2.4-pyhd8ed1ab_0
python conda-forge/osx-64::python-3.6.13-haf480d7_2_cpython
python_abi conda-forge/osx-64::python_abi-3.6-2_cp36m
readline conda-forge/osx-64::readline-8.1-h05e3726_0
setuptools conda-forge/osx-64::setuptools-58.0.4-py36h79c6626_2
sqlite conda-forge/osx-64::sqlite-3.36.0-h23a322b_2
tk conda-forge/osx-64::tk-8.6.11-h5dbffcc_1
wheel conda-forge/noarch::wheel-0.37.0-pyhd8ed1ab_1
xz conda-forge/osx-64::xz-5.2.5-haf1e3a3_1
zlib conda-forge/osx-64::zlib-1.2.11-h7795811_1010


Proceed ([y]/n)? y


Downloading and Extracting Packages
python-3.6.13 | 20.5 MB | ############################################################################ | 100%
setuptools-58.0.4 | 960 KB | ############################################################################ | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate breseq_py3.6
#
# To deactivate an active environment, use
#
# $ conda deactivate
(base) 20210927 % conda activate breseq_py3.6 # activate the breseq_py3.6 environment

Install breseq into the newly created environment.

(breseq_py3.6) 20210927 % conda install -c bioconda breseq # install breseq into the environment from the bioconda channel
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##

environment location: /Users/ocumbacteriology/opt/anaconda3/envs/breseq_py3.6

added / updated specs:
- breseq


The following packages will be downloaded:
package | build
---------------------------|-----------------
bowtie2-2.3.5.1 | py36h2dec4b4_0 1.5 MB bioconda
breseq-0.36.0 | hfd59bb5_0 3.0 MB bioconda
cctools_osx-64-973.0.1 | h3e07e27_2 2.0 MB conda-forge
clang-12.0.1 | h694c41f_4 125 KB conda-forge
clang-12-12.0.1 |default_he082bbe_4 696 KB conda-forge
clang_osx-64-12.0.1 | hb91bd55_1 16 KB conda-forge
clangxx-12.0.1 |default_he082bbe_4 125 KB conda-forge
clangxx_osx-64-12.0.1 | h7e1b574_1 15 KB conda-forge
curl-7.79.1 | hb861fe1_0 141 KB conda-forge
gettext-0.19.8.1 | hd1a6beb_1008 3.3 MB conda-forge
gfortran_impl_osx-64-9.3.0 | h9cc0e5e_23 19.2 MB conda-forge
gsl-2.7 | h93259b0_0 3.1 MB conda-forge
harfbuzz-2.9.1 | h159f659_0 1.8 MB conda-forge
krb5-1.19.2 | hcfbf3a7_1 1.3 MB conda-forge
ld64_osx-64-609 | h2487922_2 1.5 MB conda-forge
ldid-2.1.2 | h6a69015_3 55 KB conda-forge
libclang-cpp12-12.0.1 |default_he082bbe_4 12.8 MB conda-forge
libcurl-7.79.1 | hf45b732_0 317 KB conda-forge
libgfortran-devel_osx-64-9.3.0| h6c81a4c_23 333 KB conda-forge
libglib-2.68.4 | hf1fb8c0_1 2.8 MB conda-forge
libllvm12-12.0.1 | hd011deb_2 24.1 MB conda-forge
libnghttp2-1.43.0 | h6f36284_1 867 KB conda-forge
libssh2-1.10.0 | h52ee1ee_2 221 KB conda-forge
llvm-tools-12.0.1 | hd011deb_2 12.3 MB conda-forge
mpc-1.2.1 | hbb51d92_0 103 KB conda-forge
mpfr-4.1.0 | h0f52abe_1 400 KB conda-forge
pango-1.48.10 | ha05cd14_1 385 KB conda-forge
r-base-4.1.1 | h65845f3_1 24.3 MB conda-forge
------------------------------------------------------------
Total: 116.5 MB

The following NEW packages will be INSTALLED:

_r-mutex conda-forge/noarch::_r-mutex-1.0.1-anacondar_1
bowtie2 bioconda/osx-64::bowtie2-2.3.5.1-py36h2dec4b4_0
breseq bioconda/osx-64::breseq-0.36.0-hfd59bb5_0
bwidget conda-forge/osx-64::bwidget-1.9.14-h694c41f_0
bzip2 conda-forge/osx-64::bzip2-1.0.8-h0d85af4_4
c-ares conda-forge/osx-64::c-ares-1.17.2-h0d85af4_0
cairo conda-forge/osx-64::cairo-1.16.0-he43a7df_1008
cctools_osx-64 conda-forge/osx-64::cctools_osx-64-973.0.1-h3e07e27_2
clang conda-forge/osx-64::clang-12.0.1-h694c41f_4
clang-12 conda-forge/osx-64::clang-12-12.0.1-default_he082bbe_4
clang_osx-64 conda-forge/osx-64::clang_osx-64-12.0.1-hb91bd55_1
clangxx conda-forge/osx-64::clangxx-12.0.1-default_he082bbe_4
clangxx_osx-64 conda-forge/osx-64::clangxx_osx-64-12.0.1-h7e1b574_1
compiler-rt conda-forge/osx-64::compiler-rt-12.0.1-he01351e_0
compiler-rt_osx-64 conda-forge/noarch::compiler-rt_osx-64-12.0.1-hd3f61c9_0
curl conda-forge/osx-64::curl-7.79.1-hb861fe1_0
font-ttf-dejavu-s~ conda-forge/noarch::font-ttf-dejavu-sans-mono-2.37-hab24e00_0
font-ttf-inconsol~ conda-forge/noarch::font-ttf-inconsolata-3.000-h77eed37_0
font-ttf-source-c~ conda-forge/noarch::font-ttf-source-code-pro-2.038-h77eed37_0
font-ttf-ubuntu conda-forge/noarch::font-ttf-ubuntu-0.83-hab24e00_0
fontconfig conda-forge/osx-64::fontconfig-2.13.1-h10f422b_1005
fonts-conda-ecosy~ conda-forge/noarch::fonts-conda-ecosystem-1-0
fonts-conda-forge conda-forge/noarch::fonts-conda-forge-1-0
freetype conda-forge/osx-64::freetype-2.10.4-h4cff582_1
fribidi conda-forge/osx-64::fribidi-1.0.10-hbcb3906_0
gettext conda-forge/osx-64::gettext-0.19.8.1-hd1a6beb_1008
gfortran_impl_osx~ conda-forge/osx-64::gfortran_impl_osx-64-9.3.0-h9cc0e5e_23
gfortran_osx-64 conda-forge/osx-64::gfortran_osx-64-9.3.0-h18f7dce_14
gmp conda-forge/osx-64::gmp-6.2.1-h2e338ed_0
graphite2 conda-forge/osx-64::graphite2-1.3.13-h2e338ed_1001
gsl conda-forge/osx-64::gsl-2.7-h93259b0_0
harfbuzz conda-forge/osx-64::harfbuzz-2.9.1-h159f659_0
icu conda-forge/osx-64::icu-68.1-h74dc148_0
isl conda-forge/osx-64::isl-0.22.1-hb1e8313_2
jbig conda-forge/osx-64::jbig-2.1-h0d85af4_2003
jpeg conda-forge/osx-64::jpeg-9d-hbcb3906_0
krb5 conda-forge/osx-64::krb5-1.19.2-hcfbf3a7_1
ld64_osx-64 conda-forge/osx-64::ld64_osx-64-609-h2487922_2
ldid conda-forge/osx-64::ldid-2.1.2-h6a69015_3
lerc conda-forge/osx-64::lerc-2.2.1-h046ec9c_0
libblas conda-forge/osx-64::libblas-3.9.0-11_osx64_openblas
libcblas conda-forge/osx-64::libcblas-3.9.0-11_osx64_openblas
libclang-cpp12 conda-forge/osx-64::libclang-cpp12-12.0.1-default_he082bbe_4
libcurl conda-forge/osx-64::libcurl-7.79.1-hf45b732_0
libdeflate conda-forge/osx-64::libdeflate-1.7-h35c211d_5
libedit conda-forge/osx-64::libedit-3.1.20191231-h0678c8f_2
libev conda-forge/osx-64::libev-4.33-haf1e3a3_1
libgfortran conda-forge/osx-64::libgfortran-5.0.0-9_3_0_h6c81a4c_23
libgfortran-devel~ conda-forge/noarch::libgfortran-devel_osx-64-9.3.0-h6c81a4c_23
libgfortran5 conda-forge/osx-64::libgfortran5-9.3.0-h6c81a4c_23
libglib conda-forge/osx-64::libglib-2.68.4-hf1fb8c0_1
libiconv conda-forge/osx-64::libiconv-1.16-haf1e3a3_0
liblapack conda-forge/osx-64::liblapack-3.9.0-11_osx64_openblas
libllvm12 conda-forge/osx-64::libllvm12-12.0.1-hd011deb_2
libnghttp2 conda-forge/osx-64::libnghttp2-1.43.0-h6f36284_1
libopenblas conda-forge/osx-64::libopenblas-0.3.17-openmp_h3351f45_1
libpng conda-forge/osx-64::libpng-1.6.37-h7cec526_2
libssh2 conda-forge/osx-64::libssh2-1.10.0-h52ee1ee_2
libtiff conda-forge/osx-64::libtiff-4.3.0-h1167814_1
libwebp-base conda-forge/osx-64::libwebp-base-1.2.1-h0d85af4_0
libxml2 conda-forge/osx-64::libxml2-2.9.12-h93ec3fd_0
llvm-openmp conda-forge/osx-64::llvm-openmp-12.0.1-hda6cdc1_1
llvm-tools conda-forge/osx-64::llvm-tools-12.0.1-hd011deb_2
lz4-c conda-forge/osx-64::lz4-c-1.9.3-he49afe7_1
make conda-forge/osx-64::make-4.3-h22f3db7_1
mpc conda-forge/osx-64::mpc-1.2.1-hbb51d92_0
mpfr conda-forge/osx-64::mpfr-4.1.0-h0f52abe_1
pango conda-forge/osx-64::pango-1.48.10-ha05cd14_1
pcre conda-forge/osx-64::pcre-8.45-he49afe7_0
pcre2 conda-forge/osx-64::pcre2-10.37-ha16e1b2_0
perl conda-forge/osx-64::perl-5.32.1-0_h0d85af4_perl5
pixman conda-forge/osx-64::pixman-0.40.0-hbcb3906_0
r-base conda-forge/osx-64::r-base-4.1.1-h65845f3_1
tapi conda-forge/osx-64::tapi-1100.0.11-h9ce4665_0
tbb conda-forge/osx-64::tbb-2021.3.0-h940c156_0
tktable conda-forge/osx-64::tktable-2.10-h49f0cf7_3
zstd conda-forge/osx-64::zstd-1.5.0-h582d3a0_0


Proceed ([y]/n)? y
Downloading and Extracting Packages
libclang-cpp12-12.0. | 12.8 MB | ############################################################################ | 100%
clangxx-12.0.1 | 125 KB | ############################################################################ | 100%
gsl-2.7 | 3.1 MB | ############################################################################ | 100%
mpc-1.2.1 | 103 KB | ############################################################################ | 100%
r-base-4.1.1 | 24.3 MB | ############################################################################ | 100%
libglib-2.68.4 | 2.8 MB | ############################################################################ | 100%
libssh2-1.10.0 | 221 KB | ############################################################################ | 100%
libgfortran-devel_os | 333 KB | ############################################################################ | 100%
harfbuzz-2.9.1 | 1.8 MB | ############################################################################ | 100%
llvm-tools-12.0.1 | 12.3 MB | ############################################################################ | 100%
clangxx_osx-64-12.0. | 15 KB | ############################################################################ | 100%
clang-12.0.1 | 125 KB | ############################################################################ | 100%
gettext-0.19.8.1 | 3.3 MB | ############################################################################ | 100%
ldid-2.1.2 | 55 KB | ############################################################################ | 100%
cctools_osx-64-973.0 | 2.0 MB | ############################################################################ | 100%
libcurl-7.79.1 | 317 KB | ############################################################################ | 100%
bowtie2-2.3.5.1 | 1.5 MB | ############################################################################ | 100%
clang_osx-64-12.0.1 | 16 KB | ############################################################################ | 100%
krb5-1.19.2 | 1.3 MB | ############################################################################ | 100%
libnghttp2-1.43.0 | 867 KB | ############################################################################ | 100%
breseq-0.36.0 | 3.0 MB | ############################################################################ | 100%
pango-1.48.10 | 385 KB | ############################################################################ | 100%
ld64_osx-64-609 | 1.5 MB | ############################################################################ | 100%
clang-12-12.0.1 | 696 KB | ############################################################################ | 100%
mpfr-4.1.0 | 400 KB | ############################################################################ | 100%
gfortran_impl_osx-64 | 19.2 MB | ############################################################################ | 100%
curl-7.79.1 | 141 KB | ############################################################################ | 100%
libllvm12-12.0.1 | 24.1 MB | ############################################################################ | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Running breseq

(breseq_py3.6) 20210927 % breseq -j 8 -o output -r A_baumannii_ATCC19606_AP022836.gbk DNBSEQ_GT1_Read1_trimmed.fq DNBSEQ_GT1_Read2_trimmed.fq
----detailed output omitted----
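
Before launching a run like the one above, it can help to confirm that the reference and read files exist and are non-empty, since a mistyped file name only shows up once the tool starts. The following is a small sketch of such a check. The function names require_files and run_breseq are hypothetical helpers, not part of breseq; the breseq options are the ones used above.

```shell
#!/bin/sh
# Sketch: stop early if any input file is missing or empty.
require_files() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then            # -s: file exists and has size > 0
            echo "missing or empty input: $f" >&2
            return 1
        fi
    done
}

# Hypothetical wrapper: first argument is the reference, the rest are read files.
run_breseq() {
    ref=$1
    shift
    require_files "$ref" "$@" || return 1
    breseq -j 8 -o output -r "$ref" "$@"
}

# Usage (not executed here):
#   run_breseq A_baumannii_ATCC19606_AP022836.gbk \
#       DNBSEQ_GT1_Read1_trimmed.fq DNBSEQ_GT1_Read2_trimmed.fq
```

The wrapper returns a non-zero status without calling breseq when any input fails the check.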

Confirm that the result files have been created

(breseq_py3.6) 20210927 % ls
A_baumannii_ATCC19606_AP022836.fa DNBSEQ_GT1_Read2_trimmed.fq
A_baumannii_ATCC19606_AP022836.gbk fastp.html
DNBSEQ_GT1_Read1.fq fastp.json
DNBSEQ_GT1_Read1_trimmed.fq output
DNBSEQ_GT1_Read2.fq output.gd

Checking and displaying mutations with the gdtools command

 A GenomeDiff file (extension .gd) is one of the text files that breseq writes; it describes every mutation that was detected. The gdtools command runs various operations (such as VALIDATE, APPLY, and COMPARE) on GenomeDiff files.
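
Because a GenomeDiff file is tab-separated text whose first column is an entry type (for example SNP, DEL, or INS for mutations and RA, MC, or JC for evidence), a quick tally per type can be made with standard tools. The following is a minimal sketch; the function name gd_type_counts and the toy entries in the demo are made up for illustration and are not taken from the run above.

```shell
#!/bin/sh
# Sketch: count GenomeDiff entries per type (first tab-separated column).
gd_type_counts() {
    # $1: GenomeDiff (.gd) file
    awk -F'\t' '
        /^#/ || NF == 0 { next }     # skip header/comment lines and blank lines
        { count[$1]++ }
        END { for (t in count) printf "%s\t%d\n", t, count[t] }
    ' "$1" | sort
}

# Demo on a toy file (hypothetical entries, not real results).
gd=$(mktemp)
{
    printf '#=GENOME_DIFF\t1.0\n'
    printf 'SNP\t1\t3\ttoy_ref\t100\tA\n'
    printf 'DEL\t2\t4\ttoy_ref\t200\t5\n'
    printf 'RA\t3\t.\ttoy_ref\t100\t0\tG\tA\n'
    printf 'MC\t4\t.\ttoy_ref\t200\t204\t0\t0\n'
} > "$gd"
gd_type_counts "$gd"
rm -f "$gd"
```

Running gd_type_counts output.gd in the working directory would give the same kind of summary for the real results.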

(breseq_py3.6) 20210927 % gdtools VALIDATE -r A_baumannii_ATCC19606_AP022836.gbk output.gd # validate the GenomeDiff file
----detailed output omitted----
(breseq_py3.6) 20210927 % gdtools APPLY -f GFF3 -o output.gff3 -r A_baumannii_ATCC19606_AP022836.gbk output.gd # write the mutant strain's genome information as GFF3
================================================================================
breseq 0.36.0 http://barricklab.org/breseq

Active Developers: Barrick JE, Deatherage DE
Contact:

breseq is free software; you can redistribute it and/or modify it under the
terms the GNU General Public License as published by the Free Software
Foundation; either version 2, or (at your option) any later version.

Copyright (c) 2008-2010 Michigan State University
Copyright (c) 2011-2017 The University of Texas at Austin

If you use breseq in your research, please cite:

Deatherage, D.E., Barrick, J.E. (2014) Identification of mutations
in laboratory-evolved microbes from next-generation sequencing
data using breseq. Methods Mol. Biol. 1151: 165-188.

If you use structural variation (junction) predictions, please cite:

Barrick, J.E., Colburn, G., Deatherage D.E., Traverse, C.C.,
Strand, M.D., Borges, J.J., Knoester, D.B., Reba, A., Meyer, A.G.
(2014) Identifying structural variation in haploid microbial genomes
from short-read resequencing data using breseq. BMC Genomics 15:1039.
================================================================================

*** Begin APPLY ***
Reading input reference files: A_baumannii_ATCC19606_AP022836.gbk
Reading input GD file: output.gd

Writing output file in GFF3 format

*** End APPLY ***
(breseq_py3.6) 20210927 % gdtools COMPARE -o col10R.html -r A_baumannii_ATCC19606_AP022836.gbk output.gd # compare GenomeDiff files (normally from multiple samples; only one is given here) and write a combined HTML report
----breseq banner omitted (identical to the one printed by gdtools APPLY above)----

*** Begin ANNOTATE/COMPARE ***

Reading input reference sequence files
A_baumannii_ATCC19606_AP022836.gbk

Reading input GD file: output.gd

Annotating mutations
Writing output HTML file: col10R.html

*** End ANNOTATE/COMPARE ***
(breseq_py3.6) 20210927 %