Part I: Exploring BiG-SLICE query result
Overview
Teaching: 0 min
Exercises: 20 minQuestions
How do I explore BiG-SLICE query result?
Objectives
Visualize query hits to BiG-FAM GCF models with Cytoscape
Exploring BiG-SLICE Query result
In this episode, we will explore BiG-SLICE query hits of S. venezuelae genomes with the BiG-FAM database (version 1.0.0, run 6). First, let us grab the BiG-SLICE query result from bgcflow run.
If you haven’t done so, generate a symlink to the bgcflow run on the VM:
ln -s /datadrive/bgcflow bgcflow
Open up this folder /home/bgcflow/datadrive/bgcflow/data/processed/s_venezuelae/bigslice/query_as_6.0.1
and download the query_network.csv
and gcf_summary.csv
to your local computer.
Using Cytoscape, load and visualize your BiG-FAM hits following this video:
1. Importing network and annotation from tables
2. Filtering top n hits to disentagle the network
Enriching the annotation with other results from BGCflow run
As we can see, without further annotation about the BGC information, the network does not give us much information. In the next step, we will combine summary tables from two other BGCflow outputs:
- BiG-SCAPE
- GTDB taxonomy
We have prepared a jupyter notebook table that will generate the annotation required to enrich our network.
Getting Jupyter up and running
On your home directory, create a folder called s_venezuelae_tutorial/notebook
. You can then download the .ipynb
file of this episode and run it from the directory you just made.
mkdir -p s_venezuelae_tutorial/notebook
wget -O s_venezuelae_tutorial/notebook/05-bigslice_query.ipynb {link to .ipynb file}
If you’re using the VM for the workshop, activate the bgc_analytics
conda environment by:
conda activate bgc_analytics
Then, run jupyter lab with:
jupyter lab --no-browser
VS code will forward a link to the jupyter session that you can open in your local machine.
Go to your notebook directory and start exploring the notebook to generate the table
Key Points
BGCflow returns an edge table of your BGC query to the top 10 hits of GCF models in the BiG-FAM database