# bless **Repository Path**: wszdc/bless ## Basic Information - **Project Name**: bless - **Description**: No description available - **Primary Language**: Unknown - **License**: GPL-3.0 - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2021-09-26 - **Last Updated**: 2021-09-26 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README ----------------------------------------------------------------------------------------- BLESS: Bloom-filter-based Error Correction Solution for High-throughput Sequencing Reads Developed by: ESCAD Group, Computational Comparative Genomics Lab, and IMPACT Group in Univsersity of Illinois at Urbana-Champaign ----------------------------------------------------------------------------------------- -------------------------------------------------- Change History -------------------------------------------------- V0.11: 10/24/2013 First release. V0.12: 12/02/2013 Single-end (or merged paired-end) reads are supported. A bug in correcting errors in the first k-mer is corrected. V0.13: 04/06/2014 Error correction accuracy has been improved especially for highly repetitive genomes. BLESS supports dumping/loading the Bloom filter contents. Reads are trimmed by default. V0.14: 04/08/2014 Everything is equal to V0.13 except that this version supports Illumina 1.8+ FASTQ files. V0.15: 05/16/2014 A bug that may occurs when overall quality scores are very bad is fixed. Reads that have more than MAX_N_RATIO * Ns are modified (default MAX_N_RATIO: 0.1). V0.16: 05/24/2014 Memory leak that may happen in a read with many low-quality-score bases is fixed. V0.17: 06/03/2014 A bug in setting the maximum allowed Ns for paired reads is fixed. V0.20: 10/16/2014 (Preliminary release of BLESS 2) BLESS is parallelized using OpenMP and MPI. Reads with different length are supported. The k-mer counting part is substituted with KMC (http://sun.aei.polsl.pl/kmc). Even values can be used for k. Some minor bugs are fixed. V0.21: 12/06/2014 White space characters in correted read files are removed. The multi-threaded runtime is improved. V0.22: 12/17/2014 MPI is applied to KMC. V0.23: 12/19/2014 Minor bugs are fixed. V0.24: 01/29/2015 KMC is updated to 2.2.1 (The max memory usage of KMC can be fixed to 4 GB for any genome). A bug in writing a quality score hisgoram file is fixed. V1.00: 05/09/2015 (aka BLESS 2) Error correction algorithm is enhanced. Compressed input/output files are supported. Dependent libraries are included (Boost headers are still needed). Non-ACGTN characters can be handled. Non deterministic behaviors are corrected. V1.01: 05/10/2015 Runtime is improved by 10%. All the dependent libraries are included. Errors in compiling pigz are fixed. V1.02: 06/10/2015 A bug in single-end reads with many Ns is fixed. -------------------------------------------------- System Requirements -------------------------------------------------- Compiling BLESS requires MPI libraries. The latest version of BLESS was tested with GCC 4.9.2, MPICH 3.1.3 (or OpenMPI 1.8.2). If you see the following error message, you need to upgrade your MPI library: error: initializing argument 2 of 'int MPI_File_open(MPI_Comm, char*, int, MPI_Info, ompi_file_t)'** -------------------------------------------------- Dependency -------------------------------------------------- MPI: tested using MPICH 3.1.3 and OpenMPI 1.8.4 (Included dependent libraries) Boost library (http://www.boost.org) Google sparsehash (https://code.google.com/p/sparsehash) klib (https://github.com/attractivechaos/klib) KMC (http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&project=kmc&subpage=about) murmurhash3 (https://code.google.com/p/smhasher) zlib (http://zlib.net) pigz (http://zlib.net/pigz) -------------------------------------------------- How to Install BLESS -------------------------------------------------- Type make. -------------------------------------------------- How to Run BLESS -------------------------------------------------- Single-end reads or merged paired-end reads > ./bless -read -prefix -kmerlength : fastq file name : / : k-mer length (odd number) Paired-end reads > ./bless -read1 -read2 -prefix -kmerlength : first read file of a paired-end fastq file : second read file of a paired-end fastq file : / : k-mer length (odd number) Run bless with no option to see the entire options. -------------------------------------------------- Parallelization -------------------------------------------------- BLESS is parallelized using OpenMP and MPI. The number of threads per each node can be changed using "-smpthread" option (default: the number of cores in a SMP node). Running BLESS on multiple nodes using MPI Launch single process on a node and bind the processes to their nodes Ex) MPICH 3.1.3 Hydra Version > mpirun -ppn 1 -bind-to board ./bless -------------------------------------------------- Choosing k-mer length -------------------------------------------------- Our empirical analysis shows that the k value that satisfies the following two conditions usually generates the results close to the best one: 1) Ns / 4^k <= 0.0001 where Ns represents the number of unique solid k-mers (BLESS reports Ns). 2) Number of corrected bases becomes the maximum at the chosen k value, which means you need to increase k as long as the number of corrected reads increases. -------------------------------------------------- How to Load Existing Bloom Filter Data -------------------------------------------------- If you run BLESS with the "-prefix prefix" option, BLESS automatically generates the files prefix.bf.data and prefix.bf.size that have the contents of the Bloom filter. These files can be loaded into BLESS to skip the time-consuming Bloom filter build process. Single-end reads or merged paired-end reads > ./bless -read -load prefix -prefix -kmerlength Paired-end reads > ./bless -read1 -read2 -load prefix -prefix -kmerlength -------------------------------------------------- Publications -------------------------------------------------- Yun Heo, Xiao-Long Wu, Deming Chen, Jian Ma, Wen-Mei Hwu (2014). BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads. Bioinformatics, 30 (10), 1354-1362. doi: 10.1093/bioinformatics/btu030 -------------------------------------------------- Bug Reports: -------------------------------------------------- Yun Heo