FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high-throughput sequencing pipelines. It provides a modular set of analyses which you can use to get a quick impression of whether your data has any problems you should be aware of before doing any further analysis. The main functions of FastQC are:
  • Import of data from BAM, SAM or FastQ files (any variant)
  • Providing a quick overview to tell you in which areas there may be problems
  • Summary graphs and tables to quickly assess your data
  • Export of results to an HTML based permanent report
  • Offline operation to allow automated generation of reports without running the interactive application
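
The offline mode listed above can be driven from a script. The following is a minimal sketch, not an official recipe: it assumes the `fastqc` launcher is on the PATH and uses the `--outdir`, `--threads` and `--quiet` command-line options. The helper name `build_fastqc_command` and the file names are hypothetical, chosen only for illustration.

```python
import shlex


def build_fastqc_command(files, outdir, threads=1, executable="fastqc"):
    """Assemble an argument list for a non-interactive FastQC run.

    `executable` is assumed to be the FastQC launcher script on the PATH.
    """
    if not files:
        raise ValueError("at least one input file is required")
    cmd = [
        executable,
        "--outdir", str(outdir),     # directory that receives the reports
        "--threads", str(threads),   # number of files processed in parallel
        "--quiet",                   # suppress progress messages
    ]
    cmd.extend(str(f) for f in files)
    return cmd


if __name__ == "__main__":
    # Hypothetical input files, used only to show the resulting command line.
    cmd = build_fastqc_command(
        ["sample_R1.fastq.gz", "sample_R2.fastq.gz"], "qc_reports", threads=2
    )
    print(shlex.join(cmd))
    # To actually run it (needs FastQC installed):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces. The output directory most likely needs to exist before the run, so creating it first is a safe default.
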
Documentation

A copy of the FastQC documentation is available for you to try before you buy (well, download).

Example Reports

  • Good Illumina Data
  • Bad Illumina Data
  • Adapter dimer contaminated run
  • Small RNA with read-through adapter
  • Reduced Representation BS-Seq
  • PacBio

Changelog

  • 08-01-19: Version 0.11.9 released
  • Fixed a bug when analysing empty files
  • Added support for multi-read fast5 files
  • Fixed a corner case bug in adapter detection
  • Bundled a JRE with the OSX build so you don't have to install it
  • Fixed a hang if the program runs out of memory
  • 04-10-18: Version 0.11.8 released
  • Fixed a performance bug in highly duplicated sequences
  • Changed the behaviour of the sequence length module when run with --nogroup
  • Other minor bug fixes
  • Disabled the Kmer plot by default
  • Fixed a bug when long custom adapters were being used
  • Changed the tile number cutoff to accommodate the NovaSeq
  • Fixed various format changes in nanopore data from ONT
  • Added new Clontech sequences to the contaminant list
  • Added a --min-length option to remove short sequences
  • Added an option to specify the output name of data streamed into the program
  • Fixed the smallRNA adapter sequence so that abundance isn't under-represented in the adapter content plot
  • Fixed a bug in the warn / error code for the per-base sequence content plot
  • Fixed a typo in the documentation for the duplication plot
  • Changed the OSX launcher to not rely on the internal JVM framework, but use any command line java which is found
  • Fixed a typo in one of the adapter sequences
  • Fixed a bug which meant that some file extensions weren't removed from report names in non-interactive mode
  • Made the per-tile module not collect any stats if it's disabled in limits.txt
  • Fixed a bug in the calculation of duplication for highly duplicated, ordered files with very small numbers of sequences
  • Fixed an incorrect error flag in the per-base quality module where there were fewer than 100 observations in a read group
  • Fixed a bug when disabling the per-tile plot from limits.txt
  • Fixed a bug which caused the program to continue when processing of multiple files was actually complete
  • Fixed a bug which meant format selection in the interactive application didn't work
  • Added checks for misidentifying tile numbers in confusing sample IDs
  • Added the SOLID smallRNA adapter to the standard search set
  • Fixed a bug when extracting Casava names from uncompressed fastq files
  • Added support for processing files of Oxford Nanopore reads
  • Fixed incorrect warn/fail defaults for per-seq quality plot
  • Fixed memory leaks in Kmer and per-seq quality modules
  • Added an option to use a custom limits file
  • Fixed a bug in the naming of the folder inside the zip output file
  • Fixed a bug in the --extract option
  • 2-6-14: Version 0.11.1 released
  • Added configurable warn/fail thresholds for all modules
  • Allow modules to be selectively turned off
  • Added a per-tile quality plot for Illumina libraries
  • Added an adapter content plot
  • Improved the duplication plot
  • Improved the Kmer module
  • Used embedded graphics in the HTML output so you can distribute a single file
  • Added the ability to read data from stdin
  • Changed how base grouping works to better accommodate long reads
  • Dropped support for Solexa64 format (NB not Phred 64 which is still supported)
  • 3-5-12: Version 0.10.1 released
  • Added a workaround to allow the analysis of concatenated gzipped files
  • Fixed a bug when FastQC was installed in a path containing characters needing to be escaped in a URL
  • Added an option to specify the location of the java interpreter on the command line
  • 9-9-11: Version 0.10.0 released
  • Added a Casava mode to sanely process the multiple fastq files produced by the latest Illumina pipeline
  • Fixed a bug in Kmer analysis which missed the last possible Kmer in each sequence
  • Fixed a classpath bug if using the wrapper script under windows
  • 31-8-11: Version 0.9.6 released
  • Fixed a crash in libraries where every sequence ended in poly-N
  • Fixed the launch wrapper to set the classpath correctly on OSX
  • 16-8-11: Version 0.9.5 released
  • Fixed a bug in text output for the per-base sequence content module
  • Made progress reporting absolute, and not approximate
  • Added a print CSS style so reports are printable again
  • 13-7-11: Version 0.9.4 released
  • Improved the error reporting for failed files in the offline application
  • 16-6-11: Version 0.9.3 released
  • Added support for bzip2 compressed fastq files
  • Added new CSS theme for HTML reports, contributed by Phil Ewels
  • 16-5-11: Version 0.9.2 released
  • Fixed a bug where grouped base numbers weren't reported in the per-base quality text report
  • Fixed a crash in the Kmer analysis when analysing small files
  • 30-3-11: Version 0.9.1 released
  • Added --quiet and --nogroup options to command line
  • Added encoding type to the basic stats
  • Added detection of Illumina <1.3, 1.3, 1.5 and 1.9 encodings
  • 10-2-11: Version 0.9.0 released
  • Added support for very long reads (esp 454 and PacBio)
  • Duplication detection now uses only the first 50bp of each read
  • 21-1-11: Version 0.8.0 released
  • Made all graphs easier to interpret
  • Added an option to analyse only mapped sequences from a BAM/SAM file
  • Added an option to analyse two or more files in parallel
  • 24-11-10: Version 0.7.2 released
  • Fixed a bug when analysing libraries with no unique sequences
  • Added an option to specify a custom contaminant list on the command line
  • 24-11-10: Version 0.7.1 released
  • Improved the command line interface with proper options and error handling
  • Added an option to force the file format where guessing from the filename doesn't work
  • 27-10-10: Version 0.7.0 released
  • Added a Kmer enrichment analysis to find non-aligned enriched sequences
  • Cleaned up axis labels on all graphs
  • 27-10-10: Version 0.6.1 released
  • Fixed a bug which caused some sequences and qualities from BAM/SAM files to be reversed
  • 18-10-10: Version 0.6.0 released
  • Sequences can now be read from SAM/BAM format files
  • Added smoother lines to the graphs
  • 29-09-10: Version 0.5.1 released
  • Fixed a formatting bug in the text output
  • Fixed the %GC plot to work well with reads over 100bp
  • Improved the fitting of the modelled curve to the %GC plot
  • Added more Illumina oligos to the contaminants file
  • 16-09-10: Version 0.5.0 released
  • Improved the fitting of the normal distribution to %GC plot
  • Calculated the total duplicated sequence % in the duplicate sequence module
  • Added pass/fail/warn icons next to each section of the HTML report
  • Put Icons and Images into subfolders in the HTML report
  • 30-07-10: Version 0.4.3 released
  • Fixed the reporting of sequence counts in the Basic Stats module
  • Added a warning before overwriting reports in the interactive application
  • 26-07-10: Version 0.4.2 released
  • Fixed y-axis scale on per-base quality plot
  • Added fail / warn checks to modules which lacked them and improved existing checks
  • Added a modelled distribution to the per-sequence GC plot
  • Scale the width of report graphs for long sequence reads
  • 24-06-10: Version 0.4.1 released
  • Changed the duplicate module to reduce memory usage for long sequences
  • Changed the way duplicate levels are counted to be more realistic
  • 18-06-10: Version 0.4 released
  • Added a sequence duplication level module
  • Added a launch wrapper for easier use from the command line
  • Added full machine parsable output for integration into pipelines
  • 28-05-10: Version 0.3.1 released
  • Fixed a bug where invalid template files caused a crash
  • Non-interactive use now correctly reports progress for all files, not just the first one
  • Added some missing documentation
  • 13-05-10: Version 0.3 released
  • Added support for gzip compressed fastq files
  • Added identification of overrepresented sequences
  • Improved colorspace support
  • Added an option to save non-interactive reports to a specific directory
  • 06-05-10: Version 0.2 released
  • Added support for colorspace fastq files
  • Added templating support to allow customisation of HTML reports
  • Unzipped non-interactive reports by default, and added an option to turn this off
  • Added an easily computer-readable summary file to reports
  • 28-04-10: Version 0.1.1 released
  • Fixed a bug which prevented non-interactive use on a headless system
  • 26-04-10: Version 0.1 released
  • Initial set of 9 modules
  • Interactive and offline operation functional
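
Several entries above mention machine-readable output for pipeline integration (the summary file added in 0.2 and the parsable output added in 0.4). As a rough illustration of how a pipeline might consume it, the sketch below reads a per-report summary. It assumes the summary is a tab-separated text file with one line per module, holding a status (PASS, WARN or FAIL), the module name and the input file name. The function names and the sample lines are hypothetical and are not real FastQC results.

```python
def parse_summary(text):
    """Parse summary text into (status, module, filename) tuples.

    Assumes one tab-separated record per line: status, module name, file name.
    """
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        status, module, filename = line.split("\t", 2)
        records.append((status, module, filename))
    return records


def modules_with_status(records, wanted):
    """Return the module names whose status equals `wanted`."""
    return [module for status, module, _ in records if status == wanted]


if __name__ == "__main__":
    # Made-up sample content, for illustration only.
    sample = (
        "PASS\tBasic Statistics\texample.fastq\n"
        "WARN\tPer base sequence content\texample.fastq\n"
        "FAIL\tAdapter Content\texample.fastq\n"
    )
    records = parse_summary(sample)
    print("Failed modules:", modules_with_status(records, "FAIL"))
```

A pipeline step could use the list of failed modules to decide whether to stop, trim adapters, or simply flag the sample for manual review.
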