scDenorm

scDenorm: a denormalization tool for single-cell transcriptomics data

Install

pip install scDenorm

# or

conda install -c changebio scdenorm

How to use

Using pbmc3k as an example dataset

import scanpy as sc
from scipy.io import mmwrite
from scDenorm.denorm import *
ad=sc.datasets.pbmc3k()
ad.layers['count']=ad.X.copy()
ad
AnnData object with n_obs × n_vars = 2700 × 32738
    var: 'gene_ids'
    layers: 'count'
sc.pp.normalize_total(ad, target_sum=1e4)
sc.pp.log1p(ad)
smtx = ad.X.tocsr().asfptype()
smtx.data
array([1.6352079, 1.6352079, 2.2258174, ..., 1.7980369, 1.7980369,
       2.779648 ], dtype=float32)
ad.write_h5ad('data/pbmc3k_norm.h5ad')

Write rows 1 to 9 (nine cells) out as a sparse matrix in Matrix Market format:

mmwrite('data/scaled.mtx', smtx[1:10,])

In Jupyter

Input AnnData

scdenorm('data/pbmc3k_norm.h5ad',fout='data/pbmc3k_denorm.h5ad',verbose=1)
INFO:my_logger:Reading input file: data/pbmc3k_norm.h5ad
INFO:my_logger:The dimensions of this data are (2700, 32738).
INFO:my_logger:Selecting base
INFO:my_logger:Denormlizing ...the base is 2.718281828459045
b is 2.718281828459045
100%|██████████| 2700/2700 [00:02<00:00, 1071.27it/s]
INFO:my_logger:Writing output file: data/pbmc3k_denorm.h5ad
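
The log above shows a base being selected before denormalization. The following is only a sketch of one plausible way such a choice could be made, not scDenorm's actual implementation: undo the log with each candidate base and keep the base for which the values, divided by the smallest nonzero value, land closest to integers. The names `integer_error` and `select_base`, the candidate list, and the toy counts are all hypothetical.

```python
import math

def integer_error(values, base):
    """Mean distance from the nearest integer after undoing log_base(1 + x)."""
    unlogged = [base ** v - 1 for v in values if v > 0]
    smallest = min(unlogged)  # assumed to correspond to a raw count of 1
    ratios = [u / smallest for u in unlogged]
    return sum(abs(r - round(r)) for r in ratios) / len(ratios)

def select_base(values, candidates=(math.e, 2.0, 10.0)):
    """Pick the candidate base whose un-logged ratios look most like counts."""
    return min(candidates, key=lambda b: integer_error(values, b))

# One toy cell, scaled to 10,000 total counts and logged with two different bases.
counts = [1, 2, 1, 3, 7, 1, 4]
scale = 1e4 / sum(counts)
ln_values = [math.log1p(c * scale) for c in counts]
log2_values = [math.log2(1 + c * scale) for c in counts]

print(select_base(ln_values))
print(select_base(log2_values))
```

This sketch relies on the cell containing at least one gene with a raw count of 1; without that, the ratios would not be integers even for the correct base.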

If no output path is given, scdenorm returns a new AnnData object.

new_ad=scdenorm('data/pbmc3k_norm.h5ad')
new_ad
View of AnnData object with n_obs × n_vars = 2700 × 32738
    var: 'gene_ids'
    uns: 'log1p'
ad.layers['count'].data
array([1., 1., 2., ..., 1., 1., 3.], dtype=float32)
new_ad.X.data
array([1.       , 1.       , 2.0000002, ..., 1.       , 1.       ,
       3.       ], dtype=float32)
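
The comparison above shows the denormalized values matching the raw counts up to float32 rounding. A minimal sketch of why this round trip works for a single cell, assuming the data were normalized with a total-count scaling followed by a natural-log `log1p`, and assuming the cell's smallest nonzero raw count is 1. The function names here are illustrative and are not part of the scDenorm API.

```python
import math

def normalize_log1p(counts, target_sum=1e4):
    """Scale one cell's counts to target_sum, then apply log(1 + x)."""
    total = sum(counts)
    return [math.log1p(c * target_sum / total) for c in counts]

def denormalize(values):
    """Invert log1p, then rescale so the smallest nonzero value becomes 1.

    Assumes the cell has at least one gene with a raw count of exactly 1.
    """
    unlogged = [math.expm1(v) for v in values]
    smallest = min(u for u in unlogged if u > 0)
    # Rounding removes floating-point noise such as 2.0000002.
    return [round(u / smallest) for u in unlogged]

counts = [1, 0, 2, 1, 3, 0, 5]
normalized = normalize_log1p(counts)
recovered = denormalize(normalized)
print(recovered)
```

Because the per-cell scaling factor cancels when dividing by the smallest nonzero value, the original `target_sum` does not need to be known to recover the counts.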

Input a sparse matrix (cell by gene)

The matrix is read as cell by gene. If it is gene by cell, set gxc=True.

scdenorm('data/scaled.mtx',fout='data/scd_scaled.h5ad')
100%|██████████| 9/9 [00:00<00:00, 2883.12it/s]

On the command line

Input AnnData

!scdenorm data/pbmc3k_norm.h5ad --fout data/pbmc3k_denorm.h5ad
b is 2.718281828459045
100%|█████████████████████████████████████| 2700/2700 [00:02<00:00, 1090.85it/s]

Input a sparse matrix (cell by gene)

!scdenorm data/scaled.mtx --fout data/scd_scaled_c.h5ad
100%|███████████████████████████████████████████| 9/9 [00:00<00:00, 1333.31it/s]

Or write the output in mtx format:

!scdenorm data/scaled.mtx --fout data/scd_scaled_c.mtx
100%|███████████████████████████████████████████| 9/9 [00:00<00:00, 1290.78it/s]