Install
Setup
PyPHLAWD should be easy to setup (or that is the hope). There are a few programs that it requires (listed below). These need to be in the PATH (so when you type the program, like mafft, it just runs). You should use the instructions on these pages to help you install for your machine. If you are running linux, it is significantly simpler because many of these are in your repos (e.g., sudo apt-get ...). If you are running mac, you can probably install many of not all of these using brew (e.g., brew install ...).
Requirements
- a database created by
phlawd_db_maker(COMING SOON: or your own sequences). You can grab a premake one for many databases here. - python : version 2
- mafft : You will need a recent version (>=v7.3 works well) that has threading and merging. If you are running linux, you can probably run
sudo apt install mafftorbrew install maffton Mac (with homebrew) - FastTree : (if you have
treemakeon) - blast+ : Currently, this runs
blastnandmakeblastdbfrom the blast+ package. Soon, it will useblastpas well. You can install this withsudo apt install ncbi-blast+orbrew install blaston Mac (with homebrew) - mcl : Markov clustering for the clustering runs (you won’t need this if you only bait). If you are on linux, you can run
sudo apt install mclorbrew install brewsci/bio/mclon Mac (with homebrew) - phyx
- Relies upon
pxssort,pxrevcomp,pxrmt,pxcat, andpxrms. You can only install these if you like by specifyingmake pxssort pxrevcomp pxrmt pxcat pxrmsinstead of justmakeormake allwhen installingphyx. Then you want to dosudo make installso these go in your PATH. More instructions can be found at thephyxwebsite.
- Relies upon
- cython : This is optional but will speed up some functions
-cython can be installed using pip
sudo pip install cythonor if you are running linux, you can dosudo apt install cython
Database
There are some prebuilt databases here. The files are big (many GBs so it may take time). If there is one that isn’t listed, put an issue in github or make one with the instructions below.
We would recommend that you use phlawd_db_maker to make the necessary sequence database. You will want to make a database (e.g., phlawd_db_maker pln ~/PHLAWD_DBS/pln.db). This will be used if you are using NCBI sequences (if you are using something else, see that section below).
Post installation setup
Now that you have a database and you have the dependencies installed, you should clone the repository with this command git clone https://github.com/FePhyFoFum/PyPHLAWD.git. Then we should compile the cython part. This will make operations on larger trees much faster. To compile this, go into the src directory (cd PyPHLAWD/src) and type bash compile_cython.sh. Hopefully cython was installed and there was just some output but no error. You don’t need to do anything else. Stuff will just work faster. The only other thing you will need to do (and this is discussed more in the runs page) and that is change the conf.py file line that starts with DI=... (should be the second line). Make sure that that points to your PyPHLAWD src directory. You should be ready for the runs page.
Updating PyPHLAWD
Occasionally, there will be updates to PyPHLAWD and you will want to pull to the most recent version. You can do that by simply running git pull inside your PyPHLAWD directory. However, if you have changed files (you probably have changed conf.py), you will want to save those somewhere first (I would just copy the file somewhere temporarily). Then you can run git checkout conf.py (or whatever other files you have changed) to revert them. Then do git pull and move your edited files back. You may want to check the git pull output to make sure the files you have changed were not changed in the source code. If so, you probably want to merge them. If there are more questions about this, post an issue and we will add a gif or more instructions.
Using virtualenv for clusters or other systems
There are some situations where it might be easier to install PyPHLAWD using virtualenv. In particular, this may be the easiest way to install on a cluster. To do this, you will need to install pip (or already have pip installed). If you have a linux distribution with a package manager, you can install it with that. If you have to get and install pip, you can use the instructions here. Basically, wget --no-check-certificate https://bootstrap.pypa.io/get-pip.py -O - | python - --user, then edit ~/.bashrc or ~/.bashrc_profile with export PATH=$HOME/.local/bin:$PATH. When you do which pip after this, it should point to ~/.local/bin.
After pip is installed, you can then follow the instructions here. Basically, pip install virtualenv, mkdir PyPHLAWD_env, virtualenv PyPHLAWD_env, source PyPHLAWD_env/bin/activate, and cd PyPHLAWD_env. Then you can use pip to install the other python bits (cython, networkx, clint, and sqlite3). Thanks to Javier Igea for working some of this out.
