Two methods of data normalization
Why normalize?
When drawing heat maps or fitting models such as linear regression or neural networks, we often need to normalize the data first. The reason is that different variables can sit at very different magnitudes, for example one variable around 10,000 and another around 100. Because the 10,000-scale variable dominates, changes or sample differences in the 100-scale variable become insignificant: they will not show up on the heat map, and they contribute little to model training. Therefore, we normalize the variables so that they are all on the same scale.
Normalization methods
There are two common methods: z-score normalization and 0-1 (min-max) normalization.
- Z-score normalization: subtract the variable's mean from each value, then divide by the variable's standard deviation. This is the z-score of a normal distribution, so the normalized values are centered at 0, with negative and positive values on either side.
- 0-1 normalization: subtract the variable's minimum from each value, then divide by the variable's range (maximum minus minimum), so the normalized values fall between 0 and 1.
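The two formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the original post; the toy matrix and its values are made up, and the sample standard deviation (`ddof=1`) is assumed so the result matches R's `sd()`.

```python
import numpy as np

# Toy data: one column around 10,000 and one around 100 (illustrative values)
x = np.array([[9800.0,  90.0],
              [10100.0, 110.0],
              [10100.0, 100.0]])

# z-score normalization: subtract each column's mean, divide by its
# sample standard deviation (ddof=1, matching R's sd())
z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

# 0-1 normalization: subtract each column's minimum, divide by its range
m = (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))
```

After z-score normalization each column has mean 0 and standard deviation 1; after 0-1 normalization every value lies in [0, 1], with the minimum mapped to 0 and the maximum to 1.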
Code
R language code implementation
```r
# Note: the input x should be a numeric matrix or a data.frame of numbers

# 0-1 normalization
scale01 <- function(x, low = min(x), high = max(x)) {
  (x - low) / (high - low)
}

# z-score normalization
# The original script defined both variants under the name `scale`, so the
# second definition overwrote the first (and both masked base::scale);
# they are renamed here to keep both usable.

# Normalize each column
scale_cols <- function(x) {
  rm <- colMeans(x, na.rm = TRUE)   # column means
  x <- sweep(x, 2, rm)              # center each column
  sx <- apply(x, 2, sd, na.rm = TRUE)  # column standard deviations
  sweep(x, 2, sx, "/")              # scale each column
}

# Normalize each row
scale_rows <- function(x) {
  rm <- rowMeans(x, na.rm = TRUE)   # row means
  x <- sweep(x, 1, rm)              # center each row
  sx <- apply(x, 1, sd, na.rm = TRUE)  # row standard deviations
  sweep(x, 1, sx, "/")              # scale each row
}
```
Original code
https://github.com/DavidQuigley/WCDT_WGBS/blob/master/scripts/2019_05_15_WGBS_figure_1B.R