Main Page

Introduction

SeqMapReduce is a parallel program that speeds up short-sequence mapping using the MapReduce framework.


Announcement

Next-generation sequencing technologies are increasing our ability to study genome function. A new and rapidly growing family of assays for measuring genome-wide profiles of mRNAs, small RNAs, transcription-factor binding, chromatin structure and DNA methylation status is now being implemented on massively parallel, ultrahigh-throughput DNA sequencing systems. This rapid growth demands reliable, fast and easy-to-use analysis tools. We present SeqMapReduce, software that parallelizes sequence mapping using cloud computing technology. Its speed scales quasi-linearly with the number of computing nodes available: mapping 6 million sequence reads to the human genome took 4.5 minutes on 32 computing nodes. In a comparison, SeqMapReduce was on average 57.9 times faster than CloudBurst. We also provide a user-friendly web server for non-expert users. The SeqMapReduce software and web service are available at http://www.seqmapreduce.org.
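
To make the idea of mapping reads with MapReduce concrete, here is a minimal single-process sketch in Python. It is not SeqMapReduce's actual algorithm, which this page does not describe. It assumes exact matching only and reads of one fixed length k, and every function name in it is hypothetical. The map step emits each k-mer of the reference and each read keyed by its sequence, the shuffle step groups values by key, and the reduce step pairs reads with reference positions that share a key.

```python
from collections import defaultdict


def map_reference(name, seq, k):
    """Map step for a reference sequence: emit every k-mer with its position."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k], ("ref", name, i)


def map_read(read_id, seq):
    """Map step for a read: emit the whole read, keyed by its sequence."""
    yield seq, ("read", read_id, None)


def shuffle(pairs):
    """Group values by key, as the MapReduce framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_hits(values):
    """Reduce step: pair each read with each reference position under one key."""
    refs = [(name, pos) for kind, name, pos in values if kind == "ref"]
    reads = [name for kind, name, _ in values if kind == "read"]
    for read_id in reads:
        for ref_name, pos in refs:
            yield read_id, ref_name, pos


def map_reads(reference, reads, k):
    """Run map, shuffle and reduce in one process (toy driver, exact matches only)."""
    pairs = []
    for name, seq in reference.items():
        pairs.extend(map_reference(name, seq, k))
    for read_id, seq in reads.items():
        pairs.extend(map_read(read_id, seq))
    hits = []
    for values in shuffle(pairs).values():
        hits.extend(reduce_hits(values))
    return sorted(hits)


if __name__ == "__main__":
    reference = {"chr1": "ACGTACGT"}
    reads = {"r1": "ACGT", "r2": "GTAC", "r3": "TTTT"}
    for hit in map_reads(reference, reads, k=4):
        print(hit)
```

Because each key is reduced independently, the reduce work can be spread across nodes, which is the property a MapReduce-based mapper relies on to scale with cluster size. A practical mapper would also need to handle mismatches and both strands, which this sketch leaves out.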

Benchmark


Service Demo

Human
Demo page: http://www.seqmapreduce.org/demo_human.php
Reads data set: human Illumina/Solexa dataset from the 1000 Genomes Project (accession SRR001113)
Data set link: http://www.seqmapreduce.org/demo/human/probes/probes.fa (7 million reads, each 36 bp long)

Mouse
Demo page: http://www.seqmapreduce.org/demo_mouse.php
Reads data set: mouse Illumina/Solexa ChIP-seq dataset (2 million reads)
Data set link: http://www.seqmapreduce.org/demo/mouse/probes/probes.fa (2 million reads, each 36 bp long)


Source Code Download

The source code is available upon request.

Sample Data

Input data sample: http://www.seqmapreduce.org/sample/sample_probes.fa.zip
Output sample: http://www.seqmapreduce.org/sample/sample_output.zip
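
Before submitting reads, it can help to check the read count and read lengths of an input file. The sketch below assumes the unzipped input sample is a standard FASTA file (a ">" header line followed by one or more sequence lines), which the .fa extension suggests but this page does not state. The function names and the local file name sample_probes.fa are hypothetical.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def summarize_reads(lines):
    """Return (number of reads, {read length: count}) for FASTA input."""
    lengths = Counter(len(seq) for _, seq in read_fasta(lines))
    return sum(lengths.values()), dict(lengths)


if __name__ == "__main__":
    # Hypothetical local path to the unzipped input sample.
    with open("sample_probes.fa") as handle:
        total, by_length = summarize_reads(handle)
    print(total, by_length)
```

For a data set like the demo ones, where every read is described as 36 bp long, the length table would be expected to contain a single entry.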
