Install
Setup
PyPHLAWD should be easy to set up (or that is the hope). It requires a few programs (listed below), which need to be in your PATH (so that when you type the program name, like mafft, it just runs). Use the instructions on each program's page to install it for your machine. If you are running Linux, this is significantly simpler because many of these are in your repositories (e.g., sudo apt-get ...). If you are running Mac, you can probably install many if not all of these using brew (e.g., brew install ...).
Requirements
- a database created by phlawd_db_maker (COMING SOON: or your own sequences). You can grab a prebuilt one for many databases here.
- python : version 2
- mafft : You will need a recent version (>=v7.3 works well) that has threading and merging. If you are running Linux, you can probably run sudo apt install mafft, or brew install mafft on Mac (with Homebrew).
- FastTree : (if you have treemake on)
- blast+ : Currently, this runs blastn and makeblastdb from the blast+ package. Soon, it will use blastp as well. You can install this with sudo apt install ncbi-blast+, or brew install blast on Mac (with Homebrew).
- mcl : Markov clustering for the clustering runs (you won’t need this if you only bait). If you are on Linux, you can run sudo apt install mcl, or brew install brewsci/bio/mcl on Mac (with Homebrew).
- phyx
  - Relies upon pxssort, pxrevcomp, pxrmt, pxcat, and pxrms. If you like, you can install only these by specifying make pxssort pxrevcomp pxrmt pxcat pxrms instead of just make or make all when installing phyx. Then run sudo make install so these go in your PATH. More instructions can be found at the phyx website.
- cython : This is optional but will speed up some functions
  - cython can be installed using pip (sudo pip install cython) or, if you are running Linux, with sudo apt install cython
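Before moving on, it can help to confirm that the programs above really are on your PATH. The following is a minimal sketch, not part of PyPHLAWD: check_programs is a hypothetical helper name, and the executable names are assumptions taken from the list above (some packages install FastTree under a different name, such as fasttree, so adjust the list to match your system).

```shell
#!/usr/bin/env bash
# Report which programs are on the PATH.
# Returns the number of missing programs (0 means everything was found).
check_programs() {
    local missing=0
    for prog in "$@"; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "found:   $prog -> $(command -v "$prog")"
        else
            echo "missing: $prog"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Executable names assumed from the requirements list; trim the list if you
# skip optional steps (e.g., drop mcl if you only bait).
check_programs python mafft FastTree blastn makeblastdb mcl \
    pxssort pxrevcomp pxrmt pxcat pxrms \
    || echo "Install the programs marked missing, or add them to your PATH."
```

Anything reported as missing either is not installed or lives in a directory that is not in your PATH.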
Database
There are some prebuilt databases here. The files are big (many GBs), so downloading may take time. If the one you need isn’t listed, open an issue on GitHub or make one with the instructions below.
We recommend that you use phlawd_db_maker to make the necessary sequence database. You will want to make a database (e.g., phlawd_db_maker pln ~/PHLAWD_DBS/pln.db). This will be used if you are using NCBI sequences (if you are using something else, see that section below).
Post installation setup
Now that you have a database and the dependencies installed, clone the repository with git clone https://github.com/FePhyFoFum/PyPHLAWD.git. Next, compile the cython part, which makes operations on larger trees much faster. To compile it, go into the src directory (cd PyPHLAWD/src) and type bash compile_cython.sh. If cython was installed, you should see some output but no errors. You don’t need to do anything else; things will just run faster. The only other thing you need to do (discussed more in the runs page) is change the line in the conf.py file that starts with DI=... (it should be the second line). Make sure that it points to your PyPHLAWD src directory. You should then be ready for the runs page.
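If you would rather not edit conf.py by hand, the DI line can be rewritten with sed. This is a sketch under assumptions: set_di is a hypothetical helper, the DI line is assumed to hold a quoted Python string (DI = "..."), and the trailing slash on the path is a guess at what the code expects, so look at your copy of conf.py before relying on it.

```shell
#!/usr/bin/env bash
# Point the DI line of a conf.py at a PyPHLAWD src directory.
# Usage: set_di <path/to/conf.py> <path/to/PyPHLAWD/src>
# Assumption: the line looks like  DI = "..."  (not confirmed by the docs).
# The directory must not contain the characters | or &, which sed would
# treat specially in this substitution.
set_di() {
    local conf="$1" src_dir="$2"
    # Keep a backup, then rewrite only the line that starts with DI.
    cp "$conf" "$conf.bak" || return 1
    sed "s|^DI *=.*|DI = \"${src_dir%/}/\"|" "$conf.bak" > "$conf"
}

# Example, run from the directory where you cloned the repository
# (the location of conf.py inside the clone is an assumption):
#   set_di PyPHLAWD/src/conf.py "$PWD/PyPHLAWD/src"
```

The original file is kept next to the edited one as conf.py.bak, so a wrong guess about the line format is easy to undo.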
Updating PyPHLAWD
Occasionally, there will be updates to PyPHLAWD and you will want to pull the most recent version. You can do that by running git pull inside your PyPHLAWD directory. However, if you have changed files (you have probably changed conf.py), save those somewhere first (I would just copy the file somewhere temporarily). Then run git checkout conf.py (or whatever other files you have changed) to revert them. Then do git pull and move your edited files back. Check the git pull output to make sure the files you changed were not also changed in the source code. If they were, you probably want to merge them. If there are more questions about this, post an issue and we will add a gif or more instructions.
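The save, revert, pull, and restore sequence can be wrapped in a small function. This is one possible sketch rather than an official script: update_keeping is a hypothetical name, and writing your copy to a .local file when upstream has changed the same file is a design choice made here so that nothing gets overwritten silently.

```shell
#!/usr/bin/env bash
# Update a git clone while preserving local edits to one tracked file.
# Usage: update_keeping <repo_dir> <file relative to the repo root>
update_keeping() {
    local repo="$1" file="$2" backup before after
    backup="$(mktemp)" || return 1
    cp "$repo/$file" "$backup" || return 1                      # save your edited copy
    before="$(git -C "$repo" rev-parse "HEAD:$file")" || return 1
    git -C "$repo" checkout -- "$file" || return 1              # revert to the tracked version
    git -C "$repo" pull || return 1                             # bring in upstream changes
    after="$(git -C "$repo" rev-parse "HEAD:$file")" || return 1
    if [ "$before" = "$after" ]; then
        # Upstream did not touch the file, so your copy can go straight back.
        cp "$backup" "$repo/$file"
    else
        # Upstream changed it too: keep both versions and merge by hand.
        cp "$backup" "$repo/$file.local"
        echo "upstream changed $file; merge your edits from $file.local"
    fi
    rm -f "$backup"
}

# Example (the path to conf.py inside the clone is an assumption):
#   update_keeping ~/PyPHLAWD src/conf.py
```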
Using virtualenv for clusters or other systems
There are some situations where it might be easier to install PyPHLAWD using virtualenv. In particular, this may be the easiest way to install on a cluster. To do this, you will need pip (install it if you don’t already have it). If you have a Linux distribution with a package manager, you can install it with that. If you have to get and install pip yourself, you can use the instructions here. Basically, run wget --no-check-certificate https://bootstrap.pypa.io/get-pip.py -O - | python - --user, then edit ~/.bashrc or ~/.bash_profile to add export PATH=$HOME/.local/bin:$PATH. When you run which pip after this, it should point to ~/.local/bin.
After pip is installed, you can follow the instructions here. Basically, run pip install virtualenv, mkdir PyPHLAWD_env, virtualenv PyPHLAWD_env, source PyPHLAWD_env/bin/activate, and cd PyPHLAWD_env. Then you can use pip to install the other python bits (cython, networkx, clint, and sqlite3). Thanks to Javier Igea for working some of this out.