Abstract:
Normalization of read counts is an important data processing step in detecting differentially expressed (DE) genes between two treatments in RNA-Seq data. One popular normalization method, the trimmed mean of M-values (TMM), requires selecting a reference sample against which all other samples are compared. This selection is often made somewhat arbitrarily and can introduce unnecessary variability into DE detection results. We propose a simple method of averaging normalization vectors that reduces this variability with minimal loss of performance.