This notebook uses CellphoneDB to infer cell-cell communication in the scRNA-Seq dataset GSE96981 ("Multilineage communication regulates human liver bud development from pluripotency").

In the original publication, the authors screened for potential ligand-receptor interactions between mesenchymal cells (MC) and hepatic cells (HE) in a late-stage human liver bud. By counting how often each MC-HE cell pair expresses both the ligand and the receptor of an interaction, they derived a list of interactions for experimental verification.

In this notebook, we use CellphoneDB v3 to screen for such potential interactions. One of the interactions that we found using CellphoneDB, VEGFA signalling (VEGFA-KDR), was confirmed experimentally in the publication to be a driver of human liver bud development.

In [27]:
# Import required packages
import pandas as pd
import scanpy as sc
import numpy as np

import sys
import os

Set the PATH environment variable so that the cellphonedb CLI can be found

In [2]:
os.environ['PATH'] = os.path.dirname(sys.executable) + ":" + os.environ['PATH']
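A slightly more defensive variant of this step is sketched below. It uses `os.pathsep` instead of a hard-coded separator and then checks whether the executable can actually be found. The helper name `prepend_to_path` is hypothetical, and the sketch assumes the `cellphonedb` executable sits next to the Python interpreter, as the cell above implies.

```python
import os
import shutil
import sys


def prepend_to_path(directory, env=None):
    """Return a PATH string with `directory` placed first (no duplicates)."""
    env = os.environ if env is None else env
    parts = [p for p in env.get("PATH", "").split(os.pathsep)
             if p and p != directory]
    return os.pathsep.join([directory] + parts)


# Put the directory of the running interpreter first, as the cell above does.
os.environ["PATH"] = prepend_to_path(os.path.dirname(sys.executable))

# shutil.which returns None when the executable cannot be found on PATH.
cli = shutil.which("cellphonedb")
print("cellphonedb CLI:", cli if cli else "not found on PATH")
```

If the last line reports that the CLI was not found, CellphoneDB is probably installed in a different environment than the notebook kernel.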

Import the expression data

We will use the dataset GSE96981_data.lb.late, which contains the scRNA-Seq data of late-stage human liver buds at five time points.

In [ ]:
os.system('wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE96nnn/GSE96981/suppl/GSE96981_data.lb.late.csv.gz')
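Where `wget` is not available, the same download can be sketched with the standard library. The helper `download_if_missing` is a hypothetical name; it skips the download when the file already exists, so re-running the notebook does not fetch the data again.

```python
import os
import urllib.request


def download_if_missing(url, dest):
    """Download `url` to `dest` unless the file already exists; return `dest`."""
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest


URL = ("https://ftp.ncbi.nlm.nih.gov/geo/series/GSE96nnn/GSE96981/suppl/"
       "GSE96981_data.lb.late.csv.gz")
# Uncomment to fetch the file (requires network access):
# download_if_missing(URL, "GSE96981_data.lb.late.csv.gz")
```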

Read the expression data into a dataframe

In [13]:
liverbud_late = pd.read_csv('GSE96981_data.lb.late.csv.gz', index_col=0)
meta_cols = ['ident', 'tSNE_1', 'tSNE_2', 'PC1', 'PC2', 'PC3', 'PC4', 'PC5', 'nGene', 'orig.ident']
metadata = liverbud_late[meta_cols]
matrix = liverbud_late.drop(meta_cols, axis=1)
In [14]:
metadata
Out[14]:
ident tSNE_1 tSNE_2 PC1 PC2 PC3 PC4 PC5 nGene orig.ident
cell_id
A1_LTLB5 3 9.185785 -2.943974 0.063168 0.001287 0.030374 0.000921 -0.023990 3368 LTLB5
A10_LTLB1 4 -9.749909 13.431675 -0.085401 0.018756 0.011004 -0.089729 -0.025971 2621 LTLB1
A11_LTLB1 4 -3.404610 2.608864 -0.011080 -0.002221 -0.026479 0.041365 -0.068409 1107 LTLB1
A11_LTLB5 3 14.326743 -4.125117 0.065062 0.002582 0.052660 -0.004864 -0.014561 2539 LTLB5
A11B_LTLB3 4 -7.878148 16.415919 -0.089762 0.018900 0.005106 -0.217028 -0.079014 2569 LTLB3
... ... ... ... ... ... ... ... ... ... ...
H6_LTLB5 4 -11.113188 4.054029 -0.094064 0.022639 -0.030310 0.072431 -0.049583 1145 LTLB5
H6B_LTLB3 3 6.955985 -9.320041 0.064418 -0.000289 0.029056 -0.002495 0.000450 2899 LTLB3
H7_LTLB5 3 4.645720 -14.311555 0.058418 0.000714 0.045110 -0.014251 0.034402 5574 LTLB5
H9_LTLB2 3 8.911134 -10.052128 0.073433 -0.001560 0.041183 -0.007789 0.005272 2769 LTLB2
H9B_LTLB4 4 -9.746657 9.532939 -0.056634 0.035100 0.003585 -0.004613 0.001969 2263 LTLB4

173 rows × 10 columns

In [15]:
matrix
Out[15]:
SAMD11 NOC2L HES4 ISG15 AGRN C1orf159 SDF4 B3GALT6 UBE2J2 ACAP3 ... CMC4 BRCC3 VBP1 F8A2 F8A3 TMLHE VAMP7 RPS4Y1 USP9Y DDX3Y
cell_id
A1_LTLB5 0.000000 0.0 0.000000 0.000000 0.0 0.0 5.198471 0.000000 0.078979 0.000000 ... 0.00000 0.000000 6.164533 0.000000 0.000000 0.000000 7.126818 0.000000 0.0 0.0
A10_LTLB1 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.000000 0.000000 0.000000 ... 0.00000 3.909533 0.000000 0.000000 0.000000 0.000000 0.000000 7.708065 0.0 0.0
A11_LTLB1 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.000000 0.000000 0.000000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0
A11_LTLB5 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.000000 0.000000 0.000000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0
A11B_LTLB3 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.000000 0.000000 0.000000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 8.906865 0.0 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
H6_LTLB5 0.000000 0.0 0.000000 0.000000 0.0 0.0 0.000000 0.000000 0.000000 0.000000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 3.781152 0.0 0.0
H6B_LTLB3 0.000000 0.0 4.796837 0.000000 0.0 0.0 6.687915 0.000000 8.521856 3.083361 ... 0.00000 0.000000 7.978241 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0
H7_LTLB5 2.101479 0.0 4.051259 6.210874 0.0 0.0 4.937858 0.000000 5.630111 2.105819 ... 5.68764 4.010887 8.102138 2.834539 4.267543 5.964902 0.000000 0.000000 0.0 0.0
H9_LTLB2 0.000000 0.0 0.000000 7.881493 0.0 0.0 5.884554 0.000000 0.126352 0.000000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.255706 4.512960 0.000000 0.0 0.0
H9B_LTLB4 0.000000 0.0 0.000000 0.000000 0.0 0.0 5.317695 1.284917 0.000000 1.751879 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 3.702349 0.0 0.0

173 rows × 11715 columns

Create an AnnData object to store the data

In [34]:
adata_lblate = sc.AnnData(matrix, dtype=np.float32)
adata_lblate.obs = metadata.loc[:, ['ident', 'nGene', 'orig.ident']]
adata_lblate.obsm['X_pca'] = metadata.loc[:, ['PC1', 'PC2', 'PC3', 'PC4', 'PC5']].values
adata_lblate.obsm['X_tsne'] = metadata.loc[:, ['tSNE_1', 'tSNE_2']].values
adata_lblate
Out[34]:
AnnData object with n_obs × n_vars = 173 × 11715
    obs: 'ident', 'nGene', 'orig.ident'
    obsm: 'X_pca', 'X_tsne'

Get the cell type annotation from the authors

In [ ]:
os.system('wget -O lb_late_celltype.tsv https://www.dropbox.com/s/1dh5b46ik6noqit/lb_late_celltype.tsv?dl=1')
In [51]:
adata_lblate.obs['Cell type'] = pd.read_csv('lb_late_celltype.tsv', sep='\t')['Cell type'].to_list()
sc.pl.tsne(adata_lblate, color=['orig.ident', 'Cell type'], size=50)
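The assignment above relies on the rows of the annotation file being in the same order as the cells in the AnnData object. If the annotation table also carries cell identifiers, aligning by ID is safer. The sketch below uses toy data: the `cell_id` column of the annotation table is an assumption, and the cell type labels are made up for illustration.

```python
import pandas as pd

# Toy stand-ins: `obs` plays the role of adata_lblate.obs, `annot` the role of
# the annotation table. The 'cell_id' column in `annot` is an assumption and
# the labels are illustrative, not the real annotation.
obs = pd.DataFrame({"nGene": [3368, 2621, 1107]},
                   index=["A1_LTLB5", "A10_LTLB1", "A11_LTLB1"])
annot = pd.DataFrame({
    "cell_id": ["A11_LTLB1", "A1_LTLB5", "A10_LTLB1"],  # deliberately shuffled
    "Cell type": ["MSC", "Hepatic", "MSC"],
})

# Look up each cell's label by ID instead of relying on row order.
labels = annot.set_index("cell_id")["Cell type"]
obs["Cell type"] = labels.reindex(obs.index)

# Fail early if any cell has no annotation.
missing = obs.index[obs["Cell type"].isna()]
assert len(missing) == 0, f"cells without annotation: {list(missing)}"
print(obs)
```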

We plot some marker genes provided in the publication. The marker genes define three main populations in the dataset: mesenchymal cells (MC), endothelial cells (EC), and hepatic cells (HC).

In [41]:
sc.pl.tsne(adata_lblate, color=['AFP', 'PECAM1', 'TTR', 'COL1A2'], size=50, ncols=2, color_map='YlOrRd')

Run CellphoneDB

Save the AnnData object to an h5ad file

In [42]:
adata_lblate.write_h5ad('liverbud_late.h5ad')

Save the cell annotations into a text file as well

In [43]:
# CellphoneDB expects cell IDs in the first column and cell types in the second
adata_lblate.obs[['Cell type']].to_csv('lb_late_meta.tsv', sep='\t')

Run cellphonedb through the system shell

In [ ]:
os.system('cellphonedb method statistical_analysis  \
    lb_late_meta.tsv                                \
    liverbud_late.h5ad                              \
    --counts-data hgnc_symbol                       \
    --output-path lb_late_cellphonedb               \
    --threshold 0.1')
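`os.system` does not raise an error if the command fails, so a failed run can go unnoticed. As a sketch, the same call can be assembled as an argument list and run with `subprocess.run(..., check=True)`. The helper name `build_cellphonedb_cmd` is hypothetical; the subcommand and flags are exactly those used in the cell above.

```python
import shlex
import subprocess


def build_cellphonedb_cmd(meta, counts, out_dir, threshold=0.1):
    """Assemble the same CLI call as above as an argument list."""
    return [
        "cellphonedb", "method", "statistical_analysis",
        meta, counts,
        "--counts-data", "hgnc_symbol",
        "--output-path", out_dir,
        "--threshold", str(threshold),
    ]


cmd = build_cellphonedb_cmd("lb_late_meta.tsv", "liverbud_late.h5ad",
                            "lb_late_cellphonedb")
print(shlex.join(cmd))
# Uncomment to run; check=True raises CalledProcessError on a non-zero exit.
# subprocess.run(cmd, check=True)
```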
In [42]:
os.system('ls -lah ./lb_late_cellphonedb')
total 1004K
drwxr-sr-x.  2 jovyan users 4.0K Dec 12 16:54 .
drwxrwsr-x. 10 jovyan users 4.0K Dec 12 17:17 ..
-rw-r--r--.  1 jovyan users  62K Dec 12 17:17 deconvoluted.txt
-rw-r--r--.  1 jovyan users  58K Dec 12 17:17 means.txt
-rw-r--r--.  1 jovyan users  55K Dec 12 17:17 pvalues.txt
-rw-r--r--.  1 jovyan users  48K Dec 12 17:17 significant_means.txt
-rw-r--r--.  1 jovyan users   52 Dec 12 16:48 top_pair.col.txt
-rw-r--r--.  1 jovyan users  713 Dec 12 16:48 top_pair.row.txt
-rw-r--r--.  1 jovyan users   52 Dec 12 16:52 top_pairs.col.txt
-rw-r--r--.  1 jovyan users  33K Dec 12 16:53 top_pairs.dotplot.pdf
-rw-r--r--.  1 jovyan users 697K Dec 12 16:54 top_pairs.dotplot.png
-rw-r--r--.  1 jovyan users  713 Dec 12 16:52 top_pairs.row.txt

Visualization

Explore the relevant interactions

The p-values for all interacting partners are stored in the pvalues.txt file. Each p-value refers to the enrichment of the interacting ligand-receptor pair in the corresponding pair of cell types.

In [43]:
pvalues = pd.read_csv('lb_late_cellphonedb/pvalues.txt', sep='\t')
pvalues.head()
Out[43]:
id_cp_interaction interacting_pair partner_a partner_b gene_a gene_b secreted receptor_a receptor_b annotation_strategy is_integrin EC|EC EC|Hepatic EC|MSC Hepatic|EC Hepatic|Hepatic Hepatic|MSC MSC|EC MSC|Hepatic MSC|MSC
0 CPI-SS00A8596B5 PVR_TNFSF9 simple:P15151 simple:P41273 PVR TNFSF9 True True False InnateDB-All False 0.154 0.154 1.000 0.388 0.466 1.0 0.314 0.275 1.0
1 CPI-SS09C06644C CADM3_CADM1 simple:Q8N126 simple:Q9BY67 CADM3 CADM1 False False False curated False 1.000 1.000 1.000 1.000 1.000 1.0 1.000 1.000 1.0
2 CPI-SS0889E281D CADM1_CADM1 simple:Q9BY67 simple:Q9BY67 CADM1 CADM1 False False False curated False 0.639 0.092 1.000 0.092 0.000 1.0 1.000 1.000 1.0
3 CPI-SS0B3829651 CADM3_EPB41L1 simple:Q8N126 simple:Q9H4G0 CADM3 EPB41L1 False False False curated False 1.000 1.000 1.000 1.000 1.000 1.0 1.000 1.000 1.0
4 CPI-SC07ECBFCA7 COL1A1_a10b1 complex simple:P02452 complex:a10b1 complex COL1A1 NaN True False False curated True 0.811 1.000 0.957 0.882 1.000 1.0 0.000 1.000 0.0
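CellphoneDB's statistical method is documented as deriving these p-values from random permutations of the cell type labels. The toy sketch below illustrates that idea only and is not the library's implementation. The function names, the toy expression values and the number of permutations are all assumptions made for illustration.

```python
import random


def pair_mean(expr_a, expr_b, labels, type_a, type_b):
    """Average of the ligand mean in type_a cells and the receptor mean in type_b cells."""
    a = [x for x, lab in zip(expr_a, labels) if lab == type_a]
    b = [x for x, lab in zip(expr_b, labels) if lab == type_b]
    return (sum(a) / len(a) + sum(b) / len(b)) / 2


def permutation_pvalue(expr_a, expr_b, labels, type_a, type_b,
                       n_perm=1000, seed=0):
    """Fraction of label shuffles whose pair mean is >= the observed one."""
    rng = random.Random(seed)
    observed = pair_mean(expr_a, expr_b, labels, type_a, type_b)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between cells and cell types
        if pair_mean(expr_a, expr_b, shuffled, type_a, type_b) >= observed:
            hits += 1
    return hits / n_perm


# Toy data: 10 "Hepatic" cells express the ligand, 10 "EC" cells the receptor.
labels = ["Hepatic"] * 10 + ["EC"] * 10
ligand = [5.0] * 10 + [0.0] * 10
receptor = [0.0] * 10 + [5.0] * 10
p = permutation_pvalue(ligand, receptor, labels, "Hepatic", "EC")
print(f"p-value for Hepatic|EC: {p:.3f}")
```

A small p-value here means that the observed ligand-receptor mean is rarely matched when cell type labels are assigned at random, which is the sense in which the pair is "enriched" for that cell type pair.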

Mean values for all interacting partners are stored in the file means.txt. Each value is the mean of the average expression of the two partners in the corresponding pair of cell types. If either partner's average is 0, the mean is set to 0.

In [49]:
means = pd.read_csv('lb_late_cellphonedb/means.txt', sep='\t')
means.set_index('interacting_pair', inplace=True)
means.head()
Out[49]:
id_cp_interaction partner_a partner_b gene_a gene_b secreted receptor_a receptor_b annotation_strategy is_integrin EC|EC EC|Hepatic EC|MSC Hepatic|EC Hepatic|Hepatic Hepatic|MSC MSC|EC MSC|Hepatic MSC|MSC
interacting_pair
PVR_TNFSF9 CPI-SS00A8596B5 simple:P15151 simple:P41273 PVR TNFSF9 True True False InnateDB-All False 0.968 0.910 0.823 0.641 0.583 0.496 0.707 0.649 0.562
CADM3_CADM1 CPI-SS09C06644C simple:Q8N126 simple:Q9BY67 CADM3 CADM1 False False False curated False 0.000 0.000 0.000 0.378 1.191 0.085 0.413 1.225 0.120
CADM1_CADM1 CPI-SS0889E281D simple:Q9BY67 simple:Q9BY67 CADM1 CADM1 False False False curated False 0.661 1.473 0.368 1.473 2.286 1.180 0.368 1.180 0.075
CADM3_EPB41L1 CPI-SS0B3829651 simple:Q8N126 simple:Q9H4G0 CADM3 EPB41L1 False False False curated False 0.000 0.000 0.000 0.000 0.238 0.180 0.000 0.272 0.215
COL1A1_a10b1 complex CPI-SC07ECBFCA7 simple:P02452 complex:a10b1 complex COL1A1 NaN True False False curated True 1.363 0.675 1.376 1.460 0.773 1.473 3.699 3.012 3.712
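The rule for the mean values can be written as a small function. This is a sketch of the rule as described above, with the hypothetical name `interaction_mean`; it is not taken from the CellphoneDB source.

```python
def interaction_mean(mean_a, mean_b):
    """Mean of the two partner averages, forced to 0 if either partner is 0."""
    if mean_a == 0 or mean_b == 0:
        return 0.0
    return (mean_a + mean_b) / 2


print(interaction_mean(2.0, 4.0))  # both partners expressed
print(interaction_mean(1.2, 0.0))  # one partner absent
```

The zero rule keeps an interaction from looking active when only one of the two partners is expressed.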

Extract the p-values of all cell type pairs

In [46]:
pvalues_matrix = pvalues[[
   'interacting_pair', 'EC|EC', 'MSC|MSC', 'MSC|EC', 'EC|MSC',
   'Hepatic|MSC', 'MSC|Hepatic', 'Hepatic|Hepatic', 'EC|Hepatic', 'Hepatic|EC'
]].set_index('interacting_pair')
pvalues_matrix
Out[46]:
EC|EC MSC|MSC MSC|EC EC|MSC Hepatic|MSC MSC|Hepatic Hepatic|Hepatic EC|Hepatic Hepatic|EC
interacting_pair
PVR_TNFSF9 0.154 1.0 0.314 1.000 1.0 0.275 0.466 0.154 0.388
CADM3_CADM1 1.000 1.0 1.000 1.000 1.0 1.000 1.000 1.000 1.000
CADM1_CADM1 0.639 1.0 1.000 1.000 1.0 1.000 0.000 0.092 0.092
CADM3_EPB41L1 1.000 1.0 1.000 1.000 1.0 1.000 1.000 1.000 1.000
COL1A1_a10b1 complex 0.811 0.0 0.000 0.957 1.0 1.000 1.000 1.000 0.882
... ... ... ... ... ... ... ... ... ...
SEMA5A_PLXNB3 1.000 1.0 0.012 1.000 1.0 1.000 1.000 1.000 1.000
IGFL2_IGFLR1 1.000 1.0 1.000 1.000 1.0 1.000 1.000 1.000 1.000
CXCL17_GPR35 1.000 1.0 1.000 1.000 1.0 1.000 1.000 1.000 1.000
EDA_EDA2R 1.000 1.0 1.000 1.000 0.0 1.000 1.000 1.000 1.000
ESAM_ESAM 0.000 1.0 1.000 1.000 1.0 1.000 1.000 1.000 1.000

379 rows × 9 columns

Search for interactions related to one cell type pair

In [47]:
# We want to find significant interactions from hepatic to endothelial cells
PAIR='Hepatic|EC'
PVALUE_THRESHOLD=0.05
sig_pair = pvalues_matrix[pvalues_matrix[PAIR]< PVALUE_THRESHOLD]
sig_pair[PAIR].sort_values()
Out[47]:
interacting_pair
FGFR4_EPHA4             0.000
TNFRSF10B_TNFSF10       0.000
TNFRSF10C_TNFSF10       0.000
FGFR2_EPHA4             0.000
IGF2_IDE                0.000
CEACAM1_SELE            0.000
EFNA1_EPHA4             0.000
EFNA4_EPHA4             0.000
EFNA5_EPHA4             0.000
VTN_aVb1 complex        0.000
CDH1_a2b1 complex       0.000
MDK_SORL1               0.000
SEMA3E_PLXND1           0.000
COL4A6_a2b1 complex     0.000
EPHB4_EFNB1             0.000
COL18A1_a2b1 complex    0.000
COL4A5_a2b1 complex     0.000
TNFSF15_TNFRSF6B        0.000
FGFR3_EPHA4             0.000
CXCL5_ACKR1             0.004
CXADR_FAM3C             0.004
EFNB2_EPHA4             0.004
COL9A2_a2b1 complex     0.004
BMR1A_AVR2B_BMP4        0.004
COL4A3_a2b1 complex     0.004
IL33 receptor_IL33      0.004
TNFSF15_TNFRSF25        0.008
IGF2_IGF2R              0.008
MET_HGF                 0.008
COL27A1_a2b1 complex    0.008
COL4A4_a2b1 complex     0.008
COL4A6_a10b1 complex    0.008
CXCL1_ACKR1             0.008
BMR1A_ACR2A_BMP4        0.012
BMPR1A_BMPR2_BMP4       0.012
COL22A1_a2b1 complex    0.012
COL4A5_a10b1 complex    0.012
VEGFA_KDR               0.019
VEGFA_FLT1              0.020
LRP6_CKLF               0.023
COL4A1_a2b1 complex     0.024
CD46_JAG1               0.032
CD74_APP                0.032
INHA_TGFBR3             0.039
ERBB4_HBEGF             0.040
Name: Hepatic|EC, dtype: float64

Get the mean expression of both ligand and receptor

In [51]:
means.loc[sig_pair.index, PAIR].sort_values()
Out[51]:
interacting_pair
INHA_TGFBR3             0.676
ERBB4_HBEGF             0.819
BMR1A_AVR2B_BMP4        0.846
BMR1A_ACR2A_BMP4        0.846
BMPR1A_BMPR2_BMP4       0.846
TNFSF15_TNFRSF25        0.892
IL33 receptor_IL33      0.976
FGFR2_EPHA4             1.026
EFNA4_EPHA4             1.171
EFNB2_EPHA4             1.178
FGFR3_EPHA4             1.214
EFNA5_EPHA4             1.234
COL22A1_a2b1 complex    1.261
CEACAM1_SELE            1.271
COL4A4_a2b1 complex     1.315
FGFR4_EPHA4             1.318
COL4A5_a10b1 complex    1.352
COL9A2_a2b1 complex     1.366
EPHB4_EFNB1             1.371
COL4A3_a2b1 complex     1.391
COL4A6_a10b1 complex    1.424
COL4A1_a2b1 complex     1.489
MET_HGF                 1.520
COL27A1_a2b1 complex    1.621
CDH1_a2b1 complex       1.738
COL18A1_a2b1 complex    1.762
COL4A5_a2b1 complex     1.771
TNFRSF10C_TNFSF10       1.837
CXCL5_ACKR1             1.839
COL4A6_a2b1 complex     1.843
EFNA1_EPHA4             1.920
CD74_APP                2.027
TNFSF15_TNFRSF6B        2.072
CXCL1_ACKR1             2.435
SEMA3E_PLXND1           2.454
IGF2_IGF2R              2.497
TNFRSF10B_TNFSF10       2.538
VEGFA_KDR               2.840
LRP6_CKLF               2.923
CXADR_FAM3C             3.088
IGF2_IDE                3.145
MDK_SORL1               3.201
VEGFA_FLT1              3.392
CD46_JAG1               3.468
VTN_aVb1 complex        4.328
Name: Hepatic|EC, dtype: float64

Draw the dotplot

Construct the dataframe for the dotplot

In [54]:
pair_df = pd.DataFrame.from_dict({
    'mean': means.loc[sig_pair.index][PAIR],
    'pvalue': sig_pair[PAIR]
})
pair_df['pair'] = PAIR
pair_df.head()
Out[54]:
mean pvalue pair
interacting_pair
COL4A5_a10b1 complex 1.352 0.012 Hepatic|EC
COL4A6_a10b1 complex 1.424 0.008 Hepatic|EC
COL4A1_a2b1 complex 1.489 0.024 Hepatic|EC
COL4A5_a2b1 complex 1.771 0.000 Hepatic|EC
COL18A1_a2b1 complex 1.762 0.000 Hepatic|EC
In [57]:
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib as mpl

Helper function for the dot plot

In [58]:
def dotplot(df, figsize=(20, 1)):
    """Dot plot of interactions: dot size encodes the mean, colour the p-value."""
    mpl.rcParams['axes.grid'] = False
    mpl.rcParams['legend.loc'] = 'upper center'
    mpl.rcParams['legend.frameon'] = False
    fig, ax = plt.subplots(figsize=figsize)
    sns.set_style('whitegrid')
    sns.scatterplot(x=df.index, y='pair',
                    size='mean', sizes=(100, 300),
                    hue='pvalue', data=df, palette='crest', ax=ax)
    # Rotate the interaction names so that they stay readable
    plt.setp(ax.get_xticklabels(), rotation=90)
    # Place the legend outside the axes, to the right
    ax.legend(bbox_to_anchor=(1.04, 0.5), loc='center left', borderaxespad=0)
    plt.show()

We plot the potential interactions between two cell types: hepatic and endothelial cells. The size of each dot scales with the mean expression of the ligand and receptor. The color of each dot encodes the p-value of the interaction.

In [60]:
dotplot(pair_df)
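Several interactions above have a p-value of 0.000, so a linear colour scale squeezes most dots into one end of the palette. One optional variation is to colour by -log10(p) with a floor. The sketch below assumes a floor of 0.001; that value should be adjusted to the resolution of the permutation test actually run. The helper name `neg_log10_pvalue` is hypothetical.

```python
import math


def neg_log10_pvalue(p, floor=1e-3):
    """Return -log10(p), clamping p to `floor` so that p == 0 stays finite."""
    return -math.log10(max(p, floor))


for p in (0.0, 0.004, 0.019, 0.04):
    print(p, round(neg_log10_pvalue(p), 2))
```

With pandas, the transformed values could be stored in a new column, for example `pair_df['pvalue'].map(neg_log10_pvalue)`, and passed as the hue.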

We can filter out the interactions that involve a protein complex to retain only simple ligand-receptor pairs

In [61]:
# True for interactions whose name ends with 'complex'
is_complex = pair_df.index.str.endswith('complex')
simple_df = pair_df.loc[~is_complex]
simple_df.head()
Out[61]:
mean pvalue pair
interacting_pair
INHA_TGFBR3 0.676 0.039 Hepatic|EC
CXADR_FAM3C 3.088 0.004 Hepatic|EC
BMPR1A_BMPR2_BMP4 0.846 0.012 Hepatic|EC
BMR1A_ACR2A_BMP4 0.846 0.012 Hepatic|EC
BMR1A_AVR2B_BMP4 0.846 0.004 Hepatic|EC
In [62]:
dotplot(simple_df)
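Filtering on the name suffix keeps entries such as BMPR1A_BMPR2_BMP4, which look like receptor complexes but do not end in "complex". A possible alternative, sketched here on a toy excerpt of the pvalues table, is to test the `complex:` prefix in the `partner_a` and `partner_b` columns. Whether every complex is marked this way in the output files is an assumption based on the rows shown earlier.

```python
import pandas as pd

# Toy excerpt with the columns used here; values copied from the pvalues table.
toy = pd.DataFrame({
    "interacting_pair": ["PVR_TNFSF9", "CADM1_CADM1", "COL1A1_a10b1 complex"],
    "partner_a": ["simple:P15151", "simple:Q9BY67", "simple:P02452"],
    "partner_b": ["simple:P41273", "simple:Q9BY67", "complex:a10b1 complex"],
}).set_index("interacting_pair")

# An interaction involves a complex if either partner has the 'complex:' prefix.
is_complex_partner = (toy["partner_a"].str.startswith("complex:")
                      | toy["partner_b"].str.startswith("complex:"))
simple_only = toy[~is_complex_partner]
print(simple_only.index.tolist())
```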

We can also filter the interactions by the mean expression threshold

In [63]:
THRESHOLD = 2
hvg_df = simple_df[simple_df['mean'] > THRESHOLD]
dotplot(hvg_df)

The significant interactions above are candidates for downstream verification. Indeed, the VEGFA-KDR interaction (VEGFA signalling) in the list was confirmed experimentally in the publication: liver bud (LB) development is impaired in the presence of a KDR inhibitor.

Plot the top interactions of the cell type pairs of interest

Helper function to extract the top interactions

In [64]:
def get_top_interactions(pair, pvalue=0.05, mean_exp=2, only_simple=True):
    """
    Extract the top interactions of one celltype pair
    @param pair: cell type pair, e.g. 'MSC|EC'
    @param pvalue: the p-value threshold to filter significant interactions
    @param mean_exp: the mean expression threshold to filter interactions
    @param only_simple: whether only simple interactions are considered
    @return list: the top interactions
    """
    ### filter by pvalues
    sig_df = pvalues_matrix[pvalues_matrix[pair]< pvalue]
    sig_df = sig_df[pair]
    mean_df = means.loc[sig_df.index][pair]
    pair_df = pd.DataFrame.from_dict({'mean': mean_df, 'pvalue': sig_df})
    pair_df['pair'] = pair
    ### filter by mean 
    hvg_df = pair_df[pair_df['mean'] > mean_exp]
    ### filter by complex interaction
    if only_simple:
        simple_int = hvg_df.index.str.contains("complex")
        hvg_df = hvg_df[~simple_int]
    return hvg_df.index.to_list()

Get the top interactions of the cell type pairs of interest

In [65]:
pairs = ['Hepatic|Hepatic', 'Hepatic|EC', 'Hepatic|MSC', 'MSC|EC', 'EC|MSC']
top_inters = {p: get_top_interactions(p) for p in pairs}

Get the unique set of interactions

In [66]:
shared_inters = []
for p in pairs:
    shared_inters += top_inters[p]
shared_inters = list(set(shared_inters))
print("Total shared interactions: ", len(shared_inters))
Total shared interactions:  59
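The loop above collects the union of the per-pair lists, that is, every interaction that is a top hit in at least one cell type pair, rather than only those common to all pairs. The toy sketch below shows the difference; the membership of the lists is made up for illustration. Sorting the result also gives a reproducible order, which a plain `set` does not guarantee.

```python
# Toy stand-in for `top_inters`; the list contents are illustrative only.
top_inters_toy = {
    "Hepatic|EC": ["VEGFA_KDR", "VEGFA_FLT1", "MDK_SORL1"],
    "MSC|EC": ["VEGFA_KDR", "CD46_JAG1"],
}

# Union: interactions that are top hits in at least one cell type pair.
union = sorted(set().union(*top_inters_toy.values()))

# Intersection: interactions that are top hits in every cell type pair.
shared = sorted(set.intersection(*map(set, top_inters_toy.values())))

print("union:", union)
print("shared by all pairs:", shared)
```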

Draw heatmaps of the top interactions, first of the p-values and then of the mean values

In [67]:
top_df_pvalues = pvalues_matrix.loc[shared_inters, pairs]
sns.clustermap(top_df_pvalues, yticklabels=True, cmap="YlGnBu")
Out[67]:
<seaborn.matrix.ClusterGrid at 0x7f7b49560490>
In [68]:
top_df_means = means.loc[shared_inters, pairs]
sns.clustermap(top_df_means, yticklabels=True, cmap="YlGnBu")
Out[68]:
<seaborn.matrix.ClusterGrid at 0x7f7a9c667820>

Draw the dotplot of the top interactions

In [69]:
### set the interacting_pair column
top_df_means['interacting_pair'] = top_df_means.index
top_df_pvalues['interacting_pair'] = top_df_pvalues.index

### Melt the dataframes
melt_df_means = top_df_means.melt(id_vars=['interacting_pair'], value_name="mean").set_index("interacting_pair")
melt_df_pvalues = top_df_pvalues.melt(id_vars=['interacting_pair'], value_name="pvalue").set_index("interacting_pair")

top_df = melt_df_means
top_df['pvalue'] = melt_df_pvalues['pvalue']
top_df = top_df.rename(columns={"variable": "pair"})
top_df = top_df.sort_values(["pair", "pvalue"])
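The cell above attaches the p-values by position, which works because both melted frames have the same row order. A variant that does not depend on row order is to merge the two long tables on their keys. The sketch below uses toy frames: the Hepatic|EC values are copied from the tables above, while the MSC|EC values are made up for illustration.

```python
import pandas as pd

idx = pd.Index(["VEGFA_KDR", "VEGFA_FLT1"], name="interacting_pair")
# Hepatic|EC values come from the outputs above; MSC|EC values are invented.
means_toy = pd.DataFrame({"Hepatic|EC": [2.840, 3.392], "MSC|EC": [1.0, 2.0]},
                         index=idx)
pvals_toy = pd.DataFrame({"Hepatic|EC": [0.019, 0.020], "MSC|EC": [0.5, 0.1]},
                         index=idx)

# Wide -> long: one row per (interaction, cell type pair).
long_means = means_toy.reset_index().melt(
    id_vars="interacting_pair", var_name="pair", value_name="mean")
long_pvals = pvals_toy.reset_index().melt(
    id_vars="interacting_pair", var_name="pair", value_name="pvalue")

# Join on both keys; validate guards against duplicated key combinations.
top = long_means.merge(long_pvals, on=["interacting_pair", "pair"],
                       validate="one_to_one")
top = top.sort_values(["pair", "pvalue"]).set_index("interacting_pair")
print(top)
```

The resulting frame has the same `pair`, `mean` and `pvalue` columns that the `dotplot` helper expects.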

Draw the dotplot of all cell type pairs

In [70]:
dotplot(top_df, figsize=(20, 4))