What is Phagescan?¶
Phagescan is an open source virus scan aggregator for all you malware analysts and infosec types. It is designed to be an in-house replacement for, or augmentation of, some cloud-based services out there. The current code allows you to:
- Submit a file through the web interface for scanning by several different AV and other analysis engines
- Look at the results of each scan
- Add new engines/analysis tools by writing a thin Python wrapper (tools can run on Linux or Windows VMs)
- Isolate and monitor your "worker" VMs which actually execute the engines and perform scans (Openstack or EC2)
Quick Start¶
The fastest way to get started is to follow the steps in our Quick Installation Guide.
What do I need?¶
- Python 2.7
- Django >=1.5, <1.6
- Ubuntu 12.04 recommended for Master
- At least one scanner listed in the engines directory
Please note that some of the engine adapters are for commercial engines. This means you'll need to buy your own licenses or find trial versions to test things out. We have extensive experience in setting up custom installs of this project and you can contact us via the contact info on our site.
If you're developing on this project we highly recommend PyCharm and Vagrant. Maintaining a development environment and scanners directly on your own machine is a pain.
Getting Help¶
- Check out our extensive documentation.
- Look us up on the PhageScan Google Group.
High-level Design¶
Phagescan is split into two major components: a scan master and multiple scan workers. The workers receive jobs from the master through RabbitMQ and the awesome Celery framework for RPC. Currently, samples are pushed across the queue, although this can be adapted to different storage models. This means that either both the master and workers, or just the workers, can be isolated from the Internet. We're also working on functionality to update the workers online and then move them (one-way) to an isolated network where they do their scanning.
Scan Master¶
This is the front end of Phagescan that houses the samples and the web interface and is responsible for the following tasks:
- Sending scan tasks to worker queues
- Storing the results of scan tasks
- Monitoring worker and scanner engine queue health
- Updating workers' virus definitions if Internet connectivity is allowed
- Summarizing scan results
The scan master consists of a Django application providing the web interface and a few necessary Celery tasks for keeping everything moving. The master is the only architectural entity that has permanent access to both the samples and the Postgres database. PostgreSQL is used as the DB backend and is accessed through Django's ORM. Postgres is required at this point because we're using the hstore extension to store arbitrary key/value pairs from the scan results.
Scan Workers¶
This is the backend of the system and can scale horizontally. Each scan worker is a VM that has one or more scan engines (e.g. ClamAV, PEID) loaded on it. These scan engines can be an external binary or a Python module but should not require access to the Master's database. In either case, a scan engine needs a Python adapter in the engines/ directory. Adapters in the engines directory are automatically enumerated on the worker and can automatically test which engines are installed. The worker uses this functionality to consume only from queues that it has engines for. Check out the AbstractEvilnessEngine or AbstractMDEngine classes and the other concrete engines in the engines/ directory if you want to develop your own. Please note that both the master and workers need to have the same engines directory.
Types of Engines¶
We have two types of "scan engines" from above. They are: evilness engines and metadata engines. Metadata engines are things like PEID or OPAF. The key trait of a metadata engine is that it doesn't make a judgement on a sample's goodness.
An evilness engine does make a judgement on a file's good/bad state and needs to provide two pieces of data: a boolean indicating infection and a string indicating the type of infection. The evilness engine can also provide an arbitrary Python dict back to the scan master which contains any other traits that need to be stored in the DB for later retrieval/analysis.
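To make the engine contract concrete, here is a minimal sketch of an evilness engine. It is illustrative only: the real base classes are AbstractEvilnessEngine and AbstractMDEngine in the engines/ directory, and the class and method names used here (ToyEvilnessEngine, is_installed, scan) are hypothetical, not Phagescan's actual API.

```python
# Illustrative sketch only: the real base classes live in the engines/
# directory, and their actual method names may differ from these.

class ToyEvilnessEngine(object):
    """Hypothetical evilness engine: judges a sample and returns
    (infected, infection_type, extra_metadata)."""

    name = "toyav"

    def is_installed(self):
        # A real adapter would check for its binary or module here;
        # the worker uses this to pick which queues to consume from.
        return True

    def scan(self, sample_bytes):
        infected = b"EVIL" in sample_bytes           # boolean verdict
        infection = "Toy.TestSignature" if infected else ""
        extras = {"sample_size": len(sample_bytes)}  # arbitrary dict stored in the DB
        return infected, infection, extras


engine = ToyEvilnessEngine()
print(engine.scan(b"EVIL payload"))  # (True, 'Toy.TestSignature', {'sample_size': 12})
```

A metadata engine would look the same minus the verdict: it would return only the extras dict, since it makes no judgement on the sample's goodness.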
Installation¶
PhageScan is a very flexible framework. In its simplest form, it can be installed on a single laptop. In its most robust form, it requires a cloud computing environment. And you can select anything in between.
Installation Types¶
There are three primary installation types for PhageScan:
- Everything in a single VM.
- Easiest way to test drive Phagescan.
- Good for development; master and worker using same code base in one VM.
- Good for Incident Response teams; it will easily run on an average laptop.
- Only Ubuntu is currently supported as the guest OS. So, you can only use/develop engines that run on Ubuntu.
- Lowest resource requirements.
- A single ScanMaster VM and a small number of manually controlled ScanWorker VMs.
- Ideal for more extensive development and testing. You can test network functionality as well as multiple ScanWorker OS's.
- Ideal for Incident Response teams; you can run engines on multiple ScanWorker OS's.
- Multiple VMs take more resources.
- A single ScanMaster Host with a collection of ScanWorkers running in a Cloud Computing environment.
- Ideal for enterprise CIRT or IR environments.
- Currently supports Amazon EC2 and OpenStack cloud computing environments.
- Can be fully installed in one or more on-premise devices.
- Higher resource requirements.
Quick Installation¶
The fastest way to get up and running is to have Vagrant build/setup the VM(s) for you. This works for installation types 1 and 2, but the following instructions will perform installation type 1. Your host OS can be any OS that runs Salt and Vagrant. By default, Phagescan expects an Ubuntu host, but we have also used Mac OS X.
On your host, install your preferred virtualization software and Vagrant. We use VirtualBox on Ubuntu or VMWare Fusion on Mac OS X for virtualization software.
- Use the latest version (>= 4.2) of VirtualBox and the VM Extension Pack from Oracle downloadable from the VirtualBox website.
- The version distributed by Ubuntu is too old; see Ubuntu VirtualBox help.
- Use the latest version (>= v1.3.0) of Vagrant from the Vagrantup website.
- The version distributed by Ubuntu is too old.
- While Vagrant does have some support for Windows guests, we use Windows XP, which is not supported.
- Ensure your Vagrant installation has the Salt provisioner. New versions of Vagrant (>= v1.3.0) have it by default. If you are using a 1.2.x version of Vagrant, you will have to install the vagrant-salt plugin to get the Salt provisioner.
Download phagescan onto your host by cloning the phagescan git repo:
$ git clone git@github.com:scsich/phagescan.git
You should now have a phagescan directory. The rest of the documentation will refer to this directory as [Project_root_dir].
Prepare the directories that store the files that will be installed into the Master and Minion VMs by Salt (licenses and installation media):
$ cd [Project_root_dir]
$ dev/vagrant_prep.py
- vagrant_prep.py has useful defaults, but you can customize it if you wish. See the Salt Directories documentation for further reference.
Update the VM configuration settings in [Project_root_dir]/installation/salt-masterless/pillar/settings.sls. For a development setup, the defaults are sufficient. For a production setup, you should update the ps_root variable at the top and the passwords in the SERVERS section.
We are about ready to start up our first VM phagedev. If you are not using VirtualBox, you should edit the [Project_root_dir]/Vagrantfile and update the settings for the phagedev config. Test your Vagrant config:
$ cd [Project_root_dir]
$ vagrant status
It should output something like the following:
Current machine states:
uworker not created (virtualbox)
cworker not created (virtualbox)
phagedev not created (virtualbox)
There will be several machines listed, but we are interested in phagedev. It is an Ubuntu Precise 64 Phagescan worker and master combined.
You can have at most one of each of the VMs running at any one time.
You will run the vagrant commands on each VM by specifying the VM name:
$ vagrant up [ uworker | cworker | phagedev ]
$ vagrant ssh [ uworker | cworker | phagedev ]
$ vagrant halt [ uworker | cworker | phagedev ]
Start up the phagedev vm:
$ vagrant up phagedev
- When you run vagrant up, if you have not already downloaded the box for it, it will be downloaded automatically. Once it is downloaded, it will use the Salt provisioner to install and configure the respective set of base packages.
SSH into your vagrant host to verify build:
$ vagrant ssh phagedev
Ensure that all salt states are set:
[phagedev]$ sudo salt-call state.highstate
At this point, there are some important things you need to know.
- The phagedev vm has all software and libraries installed for the master and worker, but only 2 engines are installed by default: clamav and yara.
- The [Project_root_dir] directory on your host will be mapped read/write into each vagrant VM as /vagrant. So you can use an editor/IDE on your development host and execute your code/tests inside your vagrant VM.
- When you ssh into the vagrant vm, you will be user 'vagrant' which has no password and has sudo privileges.
- These vagrant VMs should not be used for production; the privileges and file sharing are very open.
- The python virtualenv on each vagrant vm is in /opt/psvirtualenv.
- Once your VM is fully built, it is a good idea to halt it and take a snapshot. Then you can quickly revert to a clean VM should you experience problems during use/development.
Few of the Phagescan services are started by default, so starting them is the next step. Configuring the Master and Worker is all done on the phagedev VM, so you will need at least 5 terminals logged into phagedev. Remember, you can create more terminals in the VM by ssh'ing through Vagrant:
$ vagrant ssh phagedev
For the Master, you need to configure and start Django and 3 celery workers. We have yet to automate these steps, so you'll have to do it manually.
- Create Django database tables, cache, and superuser.
The first command will prompt you to create a Django superuser. Do so. For development, define devuser/devpass. Give a fake email address:
[phagedev]$ cd [Project_root_dir]
[phagedev]$ source /opt/psvirtualenv/bin/activate
[phagedev]$ python manage.py syncdb --settings=scaggr.settings
[phagedev]$ python manage.py migrate --settings=scaggr.settings
[phagedev]$ python manage.py createcachetable --settings=scaggr.settings cache
Copy the appropriate Celery config files to the [Project_root_dir]:
[phagedev]$ cp installation/scanmaster/masterceleryconfig.py masterceleryconfig.py
[phagedev]$ cp installation/scanmaster/resultsceleryconfig.py resultsceleryconfig.py
[phagedev]$ cp installation/scanmaster/periodicceleryconfig.py periodicceleryconfig.py
Collect Static files:
[phagedev]$ python manage.py collectstatic
Start the celery processes, each in separate terminals:
[phagedev]$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=masterceleryconfig -E -B -l info --hostname=master.master
[phagedev]$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=resultsceleryconfig -E -B -l info --hostname=master.results
[phagedev]$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=periodicceleryconfig -E -B -l info --hostname=master.periodic
Start the django development web server.
Run django as the same user that you used to start celeryd:
[phagedev]$ python manage.py runserver -v 3 0.0.0.0:8000 --settings=scaggr.settings
For the Worker, you only need to configure and start one celery worker. Take advantage of the salt states to automate this step:
[phagedev]$ sudo salt-call state.sls celery.worker
Now we have the Django Web Interface listening on port 8000 in the VM, which is mapped to port 8090 on your host. To connect to the django instance:
From your host: http://localhost:8090
From other vagrant vms: http://192.168.33.10:8000
Log in to the Django Web User Interface with the Django superuser username/password that you created.
When you are finished and want to shutdown the phagedev VM, do the following:
Shutdown celery services
Shutdown Django service
Logout of the phagedev VM.
Halt the phagedev VM:
$ vagrant halt phagedev
Some final notes.
The Master services will not start on boot by default.
- For Django to start at boot, you'll want to install gunicorn and supervisord. You will also want a real web server in front of django, like Apache or Nginx.
- For the 3 celery services to start at boot, you can use the worker's default and init.d scripts as a template. See [Project_root_dir]/installation/salt-masterless/salt/celery/[master and worker]
The Worker celery service will start on boot.
If you want to add additional Worker engines, you can use Salt to add them. It is generally a simple salt-call command to install and start it, but remember that you need to do a few things first:
Copy the installation media to the install-media directory. See the Salt Directories documentation for further reference.
Copy the license to the license directory. See the Salt Directories documentation for further reference.
Ensure all variables in settings.sls are updated for that engine. See [Project_root_dir]/installation/salt-masterless/pillar/settings.sls.
Then you can install an engine like this:
[phagedev]$ sudo salt-call state.sls <engine name>
For example:
[phagedev]$ sudo salt-call state.sls avira
You need to restart the Master and Worker celery services after adding a new engine.
Building A Single Master or Worker¶
Master¶
We have partially completed the Salt states to build a scan master, but for now, you should do it manually. To manually build a scan master, the following instructions will guide you:
Worker¶
ScanWorkers can be Ubuntu, CentOS, or Windows VMs. Ubuntu instructions were tested on 12.04 x86_64, Desktop edition. CentOS instructions were tested on 6.3 x86_64 and 6.4 x86_64. Windows instructions were tested on Windows XP SP3.
We have Salt states to automatically build Ubuntu and CentOS Workers, but Windows ScanWorkers require a fully manual build. To use Salt states to automatically build Ubuntu and CentOS workers, select the uworker or cworker VMs from Vagrant and start them up. They will build to a base Worker with no engines installed, so you'll simply have to add engines to them.
To manually build a scan worker, the following instructions will guide you:
Salt Directories for Dependency Files¶
For general Salt configuration and usage, see Salt's Documentation.
In Phagescan, Salt needs 4 directories.
- States - State files (.sls) define the configuration actions Salt will perform.
- Pillar - Pillar files (.sls) define protected global variables (usernames, passwords, paths, etc)
- Installation Media - Media files (.tar, .deb, .rpm) contain packages of software and files to be installed into a VM.
- Licenses - License files are for commercial software licenses; many engines are commercial software.
The States are a part of the Phagescan repo, so there is nothing else to do. The Phagescan repo has a pillar-sample directory that is used to create the Pillar directory. The Installation Media and Licenses directories have to be created.
Content of Salt Directories¶
Salt¶
This directory stores all of the salt states and some service configuration files. You should not have to make any changes to them.
Pillar¶
This directory stores all of the protected information and global variables that Salt uses when configuring a VM. The Phagescan repo has a pillar-sample directory that has a default set of files, which are sufficient for development use. That is why the vagrant_prep.py script simply copies pillar-sample to pillar.
Pillar-sample can be found at [Project_root_dir]/installation/salt-masterless/pillar-sample
In a production environment, the settings.sls file should be updated to change the ps_root and usernames/passwords.
The bottom portion of the file is for engines and licenses, some of which require defining a variable as a text string. One example is the ESET update username and password.
If you want to move the Pillar directory to a different location, update the Master or Minion config file. See Salt's Documentation for assistance.
Licenses¶
This directory stores all of the licenses for the software that is installed by Salt. The directory structure should be:
licenses/<salt state name>/<engine license file>
As an example, the license that goes with salt states in salt/avira/ would be:
licenses/avira/hbedv.key
As an alternative, you can keep licenses in a GIT repository. To do that, create a repository that uses an SSH shared key for access. Name the repository anything you want; we'll call it my-lic for this example. In the repository, you should have a master branch containing your licenses. It should use the following directory structure pattern:
my-lic/<salt state name>/<engine license file>
As an example, the license that goes with salt states in salt/avira/ would be:
my-lic/avira/hbedv.key
Then, in salt-production/master, add your repo to the gitfs_remotes section. Assuming we are also using GIT to store our salt states in a repo named salt-states, your gitfs_remotes would look like this:
gitfs_remotes:
- git+ssh://git@github.com/myuser/salt-states.git
- git+ssh://git@github.com/myuser/my-lic.git
The user running the salt master must have an ssh key connected with the repository.
Each of your git repos must have a single branch in it that is named to match the env name that you use in top.sls:
- 'lic' for licenses
- 'media' for install-media
- 'master' for states
- 'master' for pillar
Lastly, install the Python module GitPython >= 0.3.0. After you restart the salt-master, the gitfs remote file sources will be active (they are cached locally, but checked each time a salt command is run). See the Salt gitfs reference.
Installation Media¶
This directory stores all of the local installation media that is installed by Salt. The directory structure should be:
install-media/<salt state name>/<engine files>
As an example, the media that goes with the states in salt/avira/ would be:
install-media/avira/antivir-server-prof.tar.gz
As with Licenses, the install-media files can be backed up by a git repo, but realize the repo will be several GB, so it may not be the best idea.
Define Location of Salt Directories¶
The Salt master and minion configuration files are preconfigured with a default location for each of these directories. Those files can be found in two locations.
- For a masterless minion, the minion config is: [Project_root_dir]/installation/salt-masterless/salt/minion.
- For a mastered minion, the master and minion configs are: [Project_root_dir]/installation/salt-production/[master | minion].
The default location for each of the Salt directories is as follows:
- States are in [Project_root_dir]/installation/salt-masterless/salt
- Pillar are in [Project_root_dir]/installation/salt-masterless/pillar
- Installation Media are in [Project_root_dir]/installation/install-media
- Licenses are in [Project_root_dir]/installation/licenses
The script, [Project_root_dir]/dev/vagrant_prep.py, is used to automatically create these directories for you. It has some variables at the top that allow you to create these Salt directories from other existing sources. See the PILLAR, MEDIA, and LICENSES variables in the conf dictionary at the top of vagrant_prep.py.
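Putting the defaults above together, the conf dictionary in vagrant_prep.py maps each Salt directory to a destination path. The sketch below is an illustration of that shape only, using the default paths listed in this section; the real script may carry additional keys (source paths, options) beyond DST_PATH.

```python
# Sketch of the conf dictionary at the top of dev/vagrant_prep.py,
# built from the default Salt directory locations documented above.
# Key names other than PILLAR, MEDIA, LICENSES, WORKER, MASTER and
# DST_PATH are not guaranteed to match the real script.
conf = {
    'PILLAR':   {'DST_PATH': 'installation/salt-masterless/pillar'},
    'MEDIA':    {'DST_PATH': 'installation/install-media'},
    'LICENSES': {'DST_PATH': 'installation/licenses'},
    'WORKER':   {'DST_PATH': 'installation/install-media/scan_worker'},
    'MASTER':   {'DST_PATH': 'installation/install-media/scan_task_master'},
}
```

Editing these DST_PATH values is how the production Salt Master instructions later in this document redirect everything under /srv/.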
Creating Salt Directories¶
Automatic Creation¶
The fastest way to create the Salt directories is to use [Project_root_dir]/dev/vagrant_prep.py. By default, it will copy pillar-sample to pillar, create an empty license directory, and create an install-media directory containing the scan_task_master and scan_worker source code .zip files. If you want anything else, see the PILLAR, MEDIA, and LICENSES variables in the conf dictionary at the top of vagrant_prep.py.
Manual Creation¶
To do it manually, do the following:
- installation/salt-masterless/pillar
Copy pillar-sample to pillar. Update settings.sls and top.sls files with your values.
- installation/install-media/{scan_task_master, scan_worker}
Create the directory installation/install-media. In there you should place your installation media in separate sub-directories. Most importantly, you MUST create the two sub-directories scan_task_master and scan_worker. Run installation/scanmaster/make_scanmaster_zip.sh and place the .zip into scan_task_master. Run installation/scanworker/make_scanworker_zip.sh and place the .zip into scan_worker.
- installation/licenses
In there you should place your commercial licenses in separate sub-directories.
Note: For the install-media and licenses directories, the sub-directories should be named similarly to the names of the salt states that are using those files. Refer to the salt states that you intend to use for the proper naming of your license and install-media sub-directories and files. For example, salt states in:
installation/salt-masterless/salt/avast/
Would have install-media and licenses in:
installation/install-media/avast/
installation/licenses/avast/
Build Production Worker VMs with Salt¶
This document describes how to use Salt and the states in salt-masterless/salt to build Ubuntu and CentOS VMs for your production environment.
In these instructions, you first build a Salt Master, and then use the Salt Master to automatically build Production Scan Worker VMs.
You can build a Salt Master Manually or with Vagrant. The following instructions walk you through both options.
Build a Salt Master VM with Vagrant¶
Before starting these instructions:
1. You must have a working Phagescan Scan Master setup.
2. You must have created the Salt support directories and populated them with any install-media, licenses, and pillar configs.
This is the less secure option, but is faster. It is less secure because it uses the Salt states and support files via a file share mounted at /vagrant. That file share is the [Project_root_dir] from the Scan Master.
On the scan master, you will note the production Salt configuration files in:
[Project_root_dir]/installation/salt-production/
The files in this directory are intended to be used by the production Salt master and production Scan Workers.
You'll also note the Vagrant file in:
[Project_root_dir]/Vagrantfile
It has definitions for VMs that start with prod_. You'll notice that they are defined to use the minion file from salt-production instead of the one from salt-masterless. The production minion file defines the Salt Master's IP address, which by default is the IP address of the salt VM that is also defined in the Vagrantfile.
So, we need to first build the salt VM with Vagrant; then we can use Vagrant to spin up prod_<os> VMs.
Perform the following steps on the Phagescan Scan Master.
Start up the saltmaster VM with Vagrant:
$ cd phagescan
$ vagrant up saltmaster
Once the Salt Master is up and running, ssh onto it:
$ vagrant ssh saltmaster
Copy autosign.conf to Salt's config directory:
$ sudo cp installation/salt-production/autosign.conf /etc/salt/
Edit /etc/autosign.conf as necessary. Ensure the host patterns match what your scan workers' hostnames will be. If you are using the Vagrant-created VMs, the defaults will work for you.
Copy the Salt master config to Salt's config directory:
$ sudo cp installation/salt-production/master /etc/salt/
- If you created the Salt support directories on the Scan Master in a location other than the defaults, you'll have to update the Salt Master config with those paths. By default the Salt Master config looks for all of the support directories under /vagrant, which translates to the Scan Master's phagescan/ directory.
Restart the Salt Master service:
$ sudo service salt-master restart
Build a Salt Master VM Manually¶
This is the more secure option, but takes longer.
These instructions assume you are using an Ubuntu host/VM as your Salt Master. However, you can use any OS that is supported by Salt.
This host is typically named salt.
First, install the Salt Master software. Refer to the Salt Install docs for instructions to do that. We generally use the bootstrap script and provide the option to only install the Salt Master.
Clone the phagescan repo to a user directory on your salt master host. The actual location doesn't matter, because we will be moving it.
$ git clone git@github.com:scsich/phagescan.git
Move the phagescan directory to /srv/:
$ sudo mv phagescan /srv/
Copy autosign.conf to Salt's config directory:
$ sudo cp phagescan/installation/salt-production/autosign.conf /etc/salt/
Edit /etc/autosign.conf as necessary. Ensure the host patterns match what your scan workers' hostnames will be. If you are using the Vagrant-created VMs, the defaults will work for you.
- Ensure /srv/phagescan/installation/salt-masterless/salt/top.sls has an entry matching that same hostname pattern. The defaults match, so you should only have to change things if you change the defaults.
Now we have to create the Salt support directories. This is where the Pillar, Install Media, and Licenses will be stored. These can be local files or git repos. These instructions assume local files for each of those. See Salt Master Directories for more details.
We are going to use the /srv/ directory to store all of our Salt files locally on the Salt Master. We will use /srv/phagescan/dev/vagrant_prep.py to setup our directories for us.
Go to /srv/phagescan:
$ cd /srv/phagescan
Edit dev/vagrant_prep.py:
$ vi dev/vagrant_prep.py
In vagrant_prep.py, make the following changes to the conf dictionary:
In the 'PILLAR' config, set 'DST_PATH' to /srv/pillar:
'DST_PATH': '/srv/pillar',
In the 'MEDIA' config, set 'DST_PATH' to /srv/install-media:
'DST_PATH': '/srv/install-media',
In the 'LICENSES' config, set 'DST_PATH' to /srv/licenses:
'DST_PATH': '/srv/licenses',
In the 'WORKER' config, set 'DST_PATH' to /srv/install-media/scan_worker:
'DST_PATH': '/srv/install-media/scan_worker',
In the 'MASTER' config, set 'DST_PATH' to /srv/install-media/scan_task_master:
'DST_PATH': '/srv/install-media/scan_task_master',
Now, run vagrant_prep.py to create those directories:
$ sudo dev/vagrant_prep.py
Check your work:
$ ls -R /srv/
- You should see the 4 directories.
Populate the licenses and install-media directories and update the settings in pillar/settings.sls. For settings.sls, make sure to at least update ps_root and any usernames and passwords.
- Note: ps_root is the Phagescan root as it will exist on other Worker VMs, not the Salt Master.
- Refer to the Salt Master Directories docs for guidance.
Now, update the file permissions and ownership for each of the /srv/ directories. They should be owned by the same user that the Salt-Master service is run as; generally 'salt'. They should NOT be editable by other users. And pillar should NOT be readable/editable by other users. To make it simple, make salt the owner and group and remove access for all other users.
$ sudo chown -R salt:salt /srv/phagescan
$ sudo chmod -R o-rwx /srv/phagescan
$ sudo chown -R salt:salt /srv/pillar
$ sudo chmod -R o-rwx /srv/pillar
$ sudo chown -R salt:salt /srv/licenses
$ sudo chmod -R o-rwx /srv/licenses
$ sudo chown -R salt:salt /srv/install-media
$ sudo chmod -R o-rwx /srv/install-media
Update the Salt master config /etc/salt/master. Most importantly, you should update the variables file_roots, pillar_roots, gitfs_root, and gitfs_remotes to ensure they match your directory structure.
The following shows what those variables will look like if you use the defaults provided above:
file_roots:
base:
- /srv/phagescan/installation/salt-masterless/salt
media:
- /srv/install-media
lic:
- /srv/licenses
pillar_roots:
base:
- /srv/pillar
#gitfs_remotes:
# - git+ssh://git@github.com/myuser/phagescan.git
#gitfs_root: installation/salt-masterless/salt
If you ever want to get updated versions of the files in /srv/phagescan, you can do a git pull. In that case, you'll want to re-run dev/vagrant_prep.py. It will create the updated .zip files for the master and worker code and place them into the respective install-media dirs as previously configured.
Restart salt-master service:
$ sudo service salt-master restart
Once you restart the salt-master service, you can start using salt to build scanworkers.
Build Production Scan Worker VMs with Salt¶
Now that we have a Salt Master setup, we can move on and build the Scan Worker VMs. We cannot use Salt to build a WinXP VM and we have not added the ability to build newer Windows VMs either. Thus, this currently applies to Ubuntu and CentOS.
To set ourselves up for success, we need to ensure that the hostname of our VMs is known by the Salt Master. There are three files on the Salt Master that need the hostname or at least a regex pattern that will match the hostname:
- autosign.conf
- pillar/top.sls
- salt/top.sls
By default, these files are set up to recognize two patterns:
prod.worker*
.*-.*-.*-.*-.*
The first can be used to name a host something like prod.worker.ubuntu or prod.worker.centos.
The second can be used when a hostname is a UUID.
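As a quick sanity check of what those two patterns accept, you can exercise them with Python's fnmatch (for the glob) and re (for the regex). This is an illustration only; Salt applies its own matching logic to autosign.conf and top.sls entries.

```python
import re
from fnmatch import fnmatch

# The two default hostname patterns from autosign.conf and the top.sls files.
glob_pattern = "prod.worker*"
uuid_pattern = re.compile(r".*-.*-.*-.*-.*")  # needs at least four hyphens

print(fnmatch("prod.worker.ubuntu", glob_pattern))                           # True
print(bool(uuid_pattern.fullmatch("0f6e3183-9c11-4a12-a163-7d01e2a0b143")))  # True
print(bool(uuid_pattern.fullmatch("prod.worker.centos")))                    # False
```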
Knowing that, build your Scan Workers:
Manually Build a Scan Master¶
These instructions are for manually building/configuring a scan master host, instead of using Vagrant/Salt to build a VM. This must be an Ubuntu host. This is how you would manually build a production scan master. It is only lacking the automated startup scripts for celery and django.
If you are building a production scan master, you should create a user specifically to run Phagescan and then run everything as that user. We use the user avuser by default. If this is not for a production scan master, you can select a user of your choice.
Prepare your Environment¶
Create user avuser and set a password:
$ sudo useradd -m -U avuser
$ sudo passwd avuser
Clone the GitHub repo, move it into /opt, and set ownership:
$ git clone git@github.com:scsich/phagescan.git
$ sudo mv phagescan /opt/
$ sudo chown -R avuser:avuser /opt/phagescan
You now have a /opt/phagescan directory, which we will refer to as [Project_root_dir].
Install necessary OS packages:
If running Ubuntu:
$ sudo apt-get install $(< [Project_root_dir]/PACKAGES.ubuntu)
$ sudo apt-get install $(< [Project_root_dir]/installation/scanmaster/PACKAGES.ubuntu)
Build & activate a virtual environment:
$ sudo su
$ virtualenv --setuptools /opt/psvirtualenv
[root@host]$ source /opt/psvirtualenv/bin/activate
Your prompt should look like this after:
(psvirtualenv)[root@host]$
If you need to deactivate the virtual env (don't do this now):
(psvirtualenv)[root@host]$ deactivate
Install Python requirements into Virtualenv:
(psvirtualenv)[root@host]$ pip install -r [Project_root_dir]/installation/scanmaster/PACKAGES.pip
You are done with the root user, so return to your standard user:
(psvirtualenv)[root@host]$ exit
If not already running, start rabbitmq and postgresql:
$ sudo /etc/init.d/rabbitmq-server start
$ sudo /etc/init.d/postgresql start
Configuration¶
Unless otherwise specified, assume the commands listed here are to be executed from [Project_root_dir].
Set up rabbitmq:
Replace the username and password with the credentials that you would like to use and run the following commands:
$ sudo rabbitmqctl add_user phagemasteruser longmasterpassword
$ sudo rabbitmqctl add_user phageworkeruser longworkerpassword
$ sudo rabbitmqctl add_vhost phage
$ sudo rabbitmqctl set_permissions -p phage phagemasteruser ".*" ".*" ".*"
$ sudo rabbitmqctl set_permissions -p phage phageworkeruser ".*" ".*" ".*"
If the master and worker hosts use different user/pass combinations to communicate with the broker, you must use the commands as written above. However, if the master and worker hosts use the same user/pass, you may add one user and set_permissions on that one user. In that case, there is no need for the second user. A production system should use separate credentials for the master and workers.
Note: These credentials are the BROKER_CONF for the master and workers. So, make sure the username and password you created here are set in both the master and worker BROKER_CONFs:
For the master it is in scaggr/settings.py. For the worker it is in workerceleryconfig.py.
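The exact BROKER_CONF structure is project-specific (check scaggr/settings.py and workerceleryconfig.py for the real format). As an illustrative sketch only, the credentials created above combine into a standard Celery AMQP broker URL like this:

```python
# Sketch: build a Celery AMQP broker URL from the RabbitMQ credentials
# created above. The BROKER_CONF format itself is project-specific;
# verify against scaggr/settings.py and workerceleryconfig.py.

def make_broker_url(user, password, host, vhost, port=5672):
    """Return an amqp:// URL in the form Celery's BROKER_URL expects."""
    return "amqp://%s:%s@%s:%d/%s" % (user, password, host, port, vhost)

master_broker = make_broker_url(
    "phagemasteruser", "longmasterpassword", "localhost", "phage")
# -> "amqp://phagemasteruser:longmasterpassword@localhost:5672/phage"
```

If the master and workers use separate credentials, each side simply builds its URL from its own user/password pair.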
Set up postgres:
Replace the username and password with the credentials that you would like to use and run the following commands:
$ sudo su postgres
$ psql
postgres=# create user citestsuper createdb superuser password 'sup3rdup3r';
postgres=# create database phage owner citestsuper;
postgres=# \q
$ psql -d phage
phage=# create extension hstore;
phage=# \q
$ exit
The remaining configuration should be done as user avuser, so switch now:
$ sudo su avuser
Create Django database tables, cache and superuser:
[avuser@host]$ python manage.py syncdb --settings=scaggr.settings
[avuser@host]$ python manage.py migrate --settings=scaggr.settings
[avuser@host]$ python manage.py createcachetable --settings=scaggr.settings cache
- Note: The first command prompts you to create a Django superuser. Do so, and use a strong password. For development you can use devuser/devpass and a fake e-mail address.
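For reference, the pieces of Django configuration that these commands rely on look roughly like the following sketch. The key names follow standard Django conventions and the values mirror the postgres and createcachetable steps above; the actual scaggr/settings.py may differ:

```python
# Sketch of the Django settings the commands above depend on.
# Verify against the real scaggr/settings.py before relying on this.

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",  # Django 1.5-era postgres backend
        "NAME": "phage",            # database created in the postgres step
        "USER": "citestsuper",
        "PASSWORD": "sup3rdup3r",
        "HOST": "localhost",
    }
}

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.db.DatabaseCache",
        "LOCATION": "cache",        # table created by `createcachetable ... cache`
    }
}
```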
Copy the appropriate config files to [Project_root_dir]:
[avuser@host]$ cp installation/scanmaster/masterceleryconfig.py masterceleryconfig.py
[avuser@host]$ cp installation/scanmaster/resultsceleryconfig.py resultsceleryconfig.py
[avuser@host]$ cp installation/scanmaster/periodicceleryconfig.py periodicceleryconfig.py
Collect Static files:
[avuser@host]$ python manage.py collectstatic
Start the celery processes each in separate terminals:
[avuser@host]$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=masterceleryconfig -E -B -l info --hostname=master.master
[avuser@host]$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=resultsceleryconfig -E -B -l info --hostname=master.results
[avuser@host]$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=periodicceleryconfig -E -B -l info --hostname=master.periodic
Start the django development web server:
Run as the same user that you used to start the 3 celeryd processes.
[avuser@host]$ python manage.py runserver -v 3 127.0.0.1:9000 --settings=scaggr.settings
You can now access the Phagescan Web User Interface:
http://127.0.0.1:9000
http://127.0.0.1:9000/admin
Optional production extras:
- To automatically start celeryd processes, you can use init.d scripts. See installation/salt-masterless/salt/celery/master for reference versions.
- To automatically start django on boot, you can use gunicorn or supervisord. See installation/salt-masterless/salt/[gunicorn|supervisord] for reference versions.
- In production, you should have a full webserver (Apache or Nginx) in front of Django. The step that processed installation/scanmaster/PACKAGES.ubuntu installs Nginx by default. See installation/salt-masterless/salt/[nginx] for reference configs.
- In production, you should enable the EngineActiveMarkerTask periodic task in virusscan/tasks.py.
- Also, schedule update_definitions (virusscan/models.py:ScannerType.update_definitions()) to run periodically.
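If you schedule update_definitions through Celery's beat scheduler, an entry along the following lines would run it every 4 hours. The task path shown is an assumption for illustration; use the actual task name defined in the project:

```python
# Sketch: a CELERYBEAT_SCHEDULE entry to refresh engine signatures every
# 4 hours. "virusscan.tasks.update_definitions" is an assumed task path;
# check virusscan/tasks.py for the real registered task name.
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    "update-engine-definitions": {
        "task": "virusscan.tasks.update_definitions",
        "schedule": timedelta(hours=4),
    },
}
```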
Manually Build a CentOS Scan Worker¶
These instructions are for manually building/configuring a CentOS scan worker host (or VM), instead of using Vagrant/Salt to build a VM. This is how you would manually build a production scan worker. It is only lacking the automated startup scripts for celery.
If you are building a production scan worker, you should create a user specifically to run Phagescan and then run everything as that user. We use the user avuser by default.
This guide was developed against a CentOS 6 / RHEL 6 based system.
CentOS 6.3 was installed with the "Minimal" option selected and then updated with yum update. CentOS 6.4 has been tested successfully as well.
NOTE: The vm/host must have 4GB of RAM or more, else your later step to install the Symantec engine will fail!
Prepare your Environment¶
Install commonly used packages (openssh-server is installed by default):
$ su root
[root@host]$ yum install sudo openssh-clients acpid unzip htop bash-completion vim-enhanced
[root@host]$ exit
Now we can use sudo.
Create user avuser and set a password:
$ sudo adduser -U avuser
$ sudo passwd avuser
On the scan master, create scan_worker.zip by using the script:
installation/scanworker/make_scanworker_zip.sh
Transfer that .zip file from the scan master to this host.
Unzip scan_worker.zip, move the extracted phagescan directory into /opt, and set ownership:
$ unzip scan_worker.zip
$ sudo mv phagescan /opt/
$ sudo chown -R avuser:avuser /opt/phagescan
You now have a /opt/phagescan directory, which we will refer to as [Project_root_dir].
Install necessary OS packages:
If running CentOS:
$ sudo yum install $(< [Project_root_dir]/PACKAGES.centos)
Unless otherwise specified, assume the commands listed here are to be executed from [Project_root_dir].
Install python 2.7:
$ sudo yum groupinstall "Development tools"
$ sudo yum install zlib-devel bzip2-devel openssl-devel ncurses-devel
$ curl -O http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tar.bz2
$ tar xjf Python-2.7.3.tar.bz2
$ cd Python-2.7.3
$ ./configure --prefix=/usr/local
$ sudo make && sudo make altinstall
$ cd ..
$ curl -O http://pypi.python.org/packages/source/d/distribute/distribute-0.6.32.tar.gz
$ tar xzf distribute-0.6.32.tar.gz
$ cd distribute-0.6.32
$ sudo /usr/local/bin/python2.7 setup.py install
$ sudo /usr/local/bin/easy_install-2.7 virtualenv
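CentOS 6 ships Python 2.6 as the system interpreter, so it is worth confirming that the freshly built binary really is in the 2.7 series. A small sketch that checks a version string like the one printed by /usr/local/bin/python2.7 -V:

```python
# Sanity-check a version string such as the one printed by
# `/usr/local/bin/python2.7 -V` (e.g. "Python 2.7.3").

def is_python_27(version_output):
    """Return True if the reported version is in the 2.7 series."""
    parts = version_output.strip().split()[-1].split(".")
    return parts[:2] == ["2", "7"]

# is_python_27("Python 2.7.3")  -> True
# is_python_27("Python 2.6.6")  -> False (the CentOS 6 system python)
```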
Build & activate a virtual environment:
$ sudo su
[root@host]$ /usr/local/bin/virtualenv-2.7 --setuptools /opt/psvirtualenv
[root@host]$ source /opt/psvirtualenv/bin/activate
Your prompt should look like this after:
(psvirtualenv)[root@host]$
If you need to deactivate the virtual env (don't do this now):
(psvirtualenv)[root@host]$ deactivate
Install Python requirements into Virtualenv:
(psvirtualenv)[root@host]$ pip install -r [Project_root_dir]/installation/scanworker/PACKAGES.pip
You are done with the root user, so return to your standard user, su to avuser and activate virtual env:
(psvirtualenv)[root@host]$ exit
$ sudo su avuser
$ source /opt/psvirtualenv/bin/activate
Copy the Celery config file to the [Project_root_dir]:
(psvirtualenv)[avuser@host]$ cp installation/scanworker/workerceleryconfig.py workerceleryconfig.py
Edit workerceleryconfig.py as necessary. In particular, tailor BROKER_CONF to your environment.
Install chosen engines¶
Refer to the following files:
[Project_root_dir]/engines/[engine_name]/INSTALL
- Currently, only the Symantec engine is supported on CentOS
Start the Celery worker process¶
Use the following command to manually start celeryd:
(psvirtualenv)[avuser@host]$ celeryd -l INFO -E --config=workerceleryconfig --hostname=worker.centos
To start celery on boot, see the init.d/default scripts located in the salt state tree. See installation/salt-masterless/salt/celery/worker for reference versions.
Build a CentOS Scan Worker with Salt¶
These instructions are for building/configuring a CentOS scan worker host (or VM) using Salt. This is how you would build a production scan worker.
This guide was developed against a CentOS 6 / RHEL 6 based system.
CentOS 6.3 was installed with the "Minimal" option selected and then updated with yum update. CentOS 6.4 has been tested successfully as well.
Prepare your Base VM¶
Create the base VM in any way that you desire.
* Use the "Minimal" option and install all updates with yum update.
* Use the hostname prod.worker.centos to take advantage of default Salt Master configuration settings.
* The vm/host must have 4GB of RAM or more, else your later step to install the Symantec engine will fail!
Install the Salt Minion client (salt-minion) onto your VM. Refer to the Salt install documentation for reference.
Edit the /etc/salt/minion file and define the master variable as the IP address of the Salt Master VM. Then restart the salt-minion service:
$ sudo service salt-minion restart
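For reference, the minion configuration is plain YAML and the only required change is the master entry. A minimal sketch (the IP address is an example; substitute your Salt Master's address):

```yaml
# /etc/salt/minion -- point this minion at your Salt Master
master: 192.168.33.10
```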
At this point, the base VM is ready for the Salt Master to install and configure it as a Scan Worker.
Install Scan Worker States¶
The Salt states are set up to create a Scan Worker with celery automatically started on boot. You only have to run this one command on the VM, and Salt will do the build for you:
$ sudo salt-call state.highstate
The output of this command will be colored Green/Red/Teal. If you see any Red, then you have a problem that you'll have to investigate and resolve. If you only see Green/Teal, your VM should be ready to go.
Install chosen engines¶
Refer to the following files:
[Project_root_dir]/engines/[engine_name]/INSTALL
- Currently, only the Symantec engine is supported on CentOS
Manually Build an Ubuntu Scan Worker¶
These instructions are for manually building/configuring an Ubuntu scan worker host (or VM), instead of using Vagrant/Salt to build a VM. This is how you would manually build a production scan worker. It is only lacking the automated startup scripts for celery.
If you are building a production scan worker, you should create a user specifically to run Phagescan and then run everything as that user. We use the user avuser by default.
Prepare your Environment¶
Create user avuser and set a password:
$ sudo adduser -U avuser
$ sudo passwd avuser
On the scan master, create scan_worker.zip by using the script:
installation/scanworker/make_scanworker_zip.sh
Transfer that .zip file from the scan master to this host.
Unzip scan_worker.zip, move the extracted phagescan directory into /opt, and set ownership:
$ unzip scan_worker.zip
$ sudo mv phagescan /opt/
$ sudo chown -R avuser:avuser /opt/phagescan
You now have a /opt/phagescan directory, which we will refer to as [Project_root_dir].
Install necessary OS packages:
If running Ubuntu:
$ sudo apt-get install $(< [Project_root_dir]/PACKAGES.ubuntu)
Build & activate a virtual environment:
$ sudo su
[root@host]$ virtualenv --setuptools /opt/psvirtualenv
[root@host]$ source /opt/psvirtualenv/bin/activate
Your prompt should look like this after:
(psvirtualenv)[root@host]$
If you need to deactivate the virtual env (don't do this now):
(psvirtualenv)[root@host]$ deactivate
Install Python requirements into Virtualenv:
(psvirtualenv)[root@host]$ pip install -r [Project_root_dir]/installation/scanworker/PACKAGES.pip
You are done with the root user, so return to your standard user, su to avuser and activate virtual env:
(psvirtualenv)[root@host]$ exit
$ sudo su avuser
$ source /opt/psvirtualenv/bin/activate
Copy the Celery config file to the [Project_root_dir]:
(psvirtualenv)[avuser@host]$ cp installation/scanworker/workerceleryconfig.py workerceleryconfig.py
Edit workerceleryconfig.py as necessary. In particular, tailor BROKER_CONF to your environment.
Install chosen engines¶
Refer to the following files:
[Project_root_dir]/engines/[engine_name]/INSTALL
Start the Celery worker process¶
Use the following command to manually start celeryd:
(psvirtualenv)[avuser@host]$ celeryd -l INFO -E --config=workerceleryconfig --hostname=worker.ubuntu
To start celery on boot, see the init.d/default scripts located in the salt state tree. See installation/salt-masterless/salt/celery/worker for reference versions.
Build an Ubuntu Scan Worker with Salt¶
These instructions are for building/configuring an Ubuntu scan worker host (or VM) using Salt. This is how you would build a production scan worker.
This guide was developed against an Ubuntu 12.04 system.
Prepare your Base VM¶
Create the base VM in any way that you desire.
- Use the "Basic Server Install" option and install all updates.
- Use the hostname prod.worker.ubuntu to take advantage of default Salt Master configuration settings.
- The vm/host must have 2GB of RAM or more. Essentially, you need to increase RAM as you increase the number of engines running on that VM.
Install the Salt Minion client (salt-minion) onto your VM. Refer to the Salt install documentation for reference.
Edit the /etc/salt/minion file and define the master variable as the IP address of the Salt Master VM. Then restart the salt-minion service:
$ sudo service salt-minion restart
At this point, the base VM is ready for the Salt Master to install and configure it as a Scan Worker.
Install Scan Worker States¶
The Salt states are set up to create a Scan Worker with celery automatically started on boot. You only have to run this one command on the VM, and Salt will do the build for you:
$ sudo salt-call state.highstate
The output of this command will be colored Green/Red/Teal. If you see any Red, then you have a problem that you'll have to investigate and resolve. If you only see Green/Teal, your VM should be ready to go.
Install chosen engines¶
Refer to the following files:
[Project_root_dir]/engines/[engine_name]/INSTALL
- Currently, every engine is supported on an Ubuntu Worker except Symantec and Panda.
Manually Build a WinXP Scan Worker¶
These instructions are for manually building/configuring a Windows XP scan worker host (or VM). We do not currently support building Windows XP scan workers using Vagrant/Salt, so this is the only method. This is how you would manually build a production scan worker. It is only lacking the automated startup scripts for celery.
If you are building a production scan worker, you should create a user specifically to run Phagescan and then run everything as that user. We use the user avuser by default.
Prepare your Environment¶
On the scan master, create scan_worker.zip by using the script:
installation/scanworker/make_scanworker_zip.sh
Transfer scan_worker.zip to the Windows XP host.
Note: One way to easily transfer files to the WinXP vm is to use a rudimentary webserver on the VM Host. On the VM Host, run this command in the directory containing files that need to be transferred onto the XP VM:
$ python -m SimpleHTTPServer 7878
Then in the XP VM, you can use the web browser to go to the IP of the VM Host on port 7878:
http://192.168.33.10:7878/scan_worker.zip
Extract scan_worker.zip to c:\ to create c:\phagescan. We will refer to this directory as [Project_root_dir].
Install python 2.7.x, setuptools, pip and virtualenv
Follow these instructions: http://docs.python-guide.org/en/latest/starting/install/win/
- Make sure you get the latest version of python in the 2.7.x series. Get the python x86 msi installer. Install for all users and accept all defaults.
- Once python is installed and your PATH updated, install setuptools and pip: download their bootstrap .py files and double-click them (setuptools first, then pip).
- To install Virtualenv, use the windows command terminal to run pip.
Start a cmd terminal, create a virtual environment, and activate it:
$ virtualenv c:\psvirtualenv
$ c:\psvirtualenv\Scripts\activate
Your prompt should look like this after:
(psvirtualenv) C:\<path> >
If you need to deactivate the virtual env:
$ deactivate
If all of the above worked, close that terminal window.
Install & Setup MinGW:
Some requisite python packages must be compiled. We will utilize MinGW to compile these packages on Windows. Follow the Graphical User Interface Installer instructions here: http://www.mingw.org/wiki/Getting_Started
Select the following packages for installation:
mingw-developer-tools, mingw32-base, mingw32-gcc-g++, msys-base
Once the installation is complete, refer to the "After Installing You Should ..." section on the Getting_Started page. Follow the instructions to create the fstab file for MSYS.
Then move to the "Environment Settings" section and add MinGW and MSYS to your system PATH:
Add ";C:\MinGW\bin;C:\MinGW\MSYS\1.0\bin;C:\MinGW\MSYS\1.0\local\bin;" to the system PATH.
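To double-check that edit, here is a small sketch (a hypothetical helper, not part of Phagescan) that reports which of the required MinGW directories are missing from a PATH string:

```python
# Report which of the MinGW/MSYS directories listed above are absent
# from a Windows-style PATH string. Hypothetical helper for verification.
REQUIRED = [
    r"C:\MinGW\bin",
    r"C:\MinGW\MSYS\1.0\bin",
    r"C:\MinGW\MSYS\1.0\local\bin",
]

def missing_path_entries(path_value):
    """Return the required MinGW entries absent from a PATH string."""
    entries = [p.strip().lower() for p in path_value.split(";") if p.strip()]
    return [req for req in REQUIRED if req.lower() not in entries]

# Usage on the XP machine: missing_path_entries(os.environ["PATH"])
```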
Lastly, instruct Python to use MinGW as the compiler.
Open (or create) c:\psvirtualenv\Lib\distutils\distutils.cfg
Write the following into that file:
[build]
compiler=mingw32
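If you prefer to generate that file rather than edit it by hand, this sketch writes the same two lines with the stdlib config parser (run it from the directory where distutils.cfg should live, or adjust the path):

```python
# Sketch: write the distutils.cfg shown above programmatically instead
# of editing it by hand.
try:
    import configparser                      # Python 3
except ImportError:
    import ConfigParser as configparser      # Python 2.7, which this guide targets

cfg = configparser.RawConfigParser()
cfg.add_section("build")
cfg.set("build", "compiler", "mingw32")

# Write to the current directory; place the result at
# c:\psvirtualenv\Lib\distutils\distutils.cfg
with open("distutils.cfg", "w") as fh:
    cfg.write(fh)
```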
For the remainder of this document, you should open a new cmd window and activate the virtual environment. You should be in the [Project_root_dir] directory.
Install necessary Python packages:
(psvirtualenv) $ pip install -r installation\scanworker\PACKAGES.pip
For testing you can also install these:
(psvirtualenv) $ pip install -r installation\dev\PACKAGES.pip
Copy the appropriate celery config file to the Project_root_dir:
(psvirtualenv) $ copy installation\scanworker\workerceleryconfig.py workerceleryconfig.py
Edit workerceleryconfig.py as necessary. In particular, tailor BROKER_CONF to your environment.
Install chosen engines¶
Refer to the following files:
[Project_root_dir]/engines/[engine_name]/INSTALL
- Only the Panda engine is currently supported on Windows XP
Start the Celery worker process¶
Use the following command to manually start celeryd:
(psvirtualenv) $ celeryd -l INFO -E --config=workerceleryconfig -P solo
Running celeryd at boot¶
You can use a scheduled task to run the batch file celeryd.winxp.bat at boot.
Get batch file from the scan master. It is in: installation/salt-masterless/salt/celery/worker/celeryd.winxp.bat.
Copy the batch file to [Project_root_dir].
Schedule a task to run this script at boot using the following steps:
First, set a password on the user account so that login requires a password:
Start -> Control Panel -> User Accounts
Click on your user.
Click on 'Create a password' and set a password. Make your files private.
Click Apply, then Ok.
Next, create the task:
Start -> All Programs -> Accessories -> System Tools -> Scheduled Tasks
Double-click Add Scheduled Task, then click Next.
Browse to find celeryd.winxp.bat.
Select 'When my computer starts'.
Specify the user/password to run the task: run it as the user 'HOSTNAME\avuser' and enter the password for user 'avuser'.
Check the box to open the Advanced Properties, then click Finish.
- If you get an error about Access denied, click Ok. When the Advanced Properties window appears, click the Set password.. button to re-enter the password for avuser.
In the Advanced Properties, go to the Settings tab and uncheck all boxes.
Click Apply, then Ok.
It is possible to create a proper Windows service to start celery at boot, but we have yet to do so. It would probably involve the use of pywin32.
Phagescan Web User Interface¶
The Phagescan web user interface is where a user spends most of his time.
Accessing¶
Access the Phagescan Web User Interface on localhost by going to this URL:
http://127.0.0.1:9000
A production master will have an Nginx front-end, so you'll want to access via the public IP address or domain name that Nginx is listening on. It may be on port 80 or 443 depending on SSL configuration:
http://1.2.3.4
http://phagescan.example.com
https://1.2.3.4
https://phagescan.example.com
Usage¶
Most of the links at the top should be self-explanatory. However, the Server Status page needs some explanation. See the Server Status documentation for assistance.
TODO: Add more about the UI.
Server Status¶
The Server Status page in the Phagescan web interface provides a status display of each of the celery queues; this includes queues for engines and the master.
For the engine queues, there are 2 columns. The first column displays the number of workers that are on-line, have a queue for that engine, and are under the management of Phagescan. The second column simply displays the number of queues that are running for each engine. In a production environment, both columns should be equal. In a development environment, it can vary.
Starting a Worker¶
When Phagescan is first started, both columns will show zero active queues. The first column will have an UP ARROW icon next to each engine queue. To start a worker, and thus the queues for a set of engines, click the UP ARROW icon. Many of the queues are handled by a single worker, so clicking the UP ARROW next to one queue per worker type is sufficient. When starting a worker, Phagescan will start 1 or more actual worker VMs. The number of VMs per worker type is configured in the Admin interface. (TODO: add a reference to the Admin interface.) Starting a new set of workers can take several minutes, so be patient.
For example, the current design has three (3) worker types: Ubuntu, CentOS, and WinXP. The WinXP worker only handles the Panda engine. The CentOS worker only handles the Symantec engine. The Ubuntu worker handles the rest of the engines. So, in this example, you would start all workers and thus queues by clicking three UP ARROWs. 1) Symantec, 2) Panda, and 3) One of any other engine.
Stopping a Worker¶
When Phagescan is up and running, both columns will show active queues. The first column will have a RED X icon next to each engine queue. To stop a worker, and thus the queues for the set of engines on that worker, click the RED X icon. This will shut down and destroy all of the VMs for the worker type associated with that queue.
For example, the current design has three VMs running for the Ubuntu worker type. If you click the RED X, it will destroy all 3 VMs.
Phagescan Master¶
The Phagescan Master is the front-end and includes the following services: Django, Celery, RabbitMQ, PostgreSQL, Nginx, Gunicorn, and Supervisord.
- Django
- Runs the Phagescan UI and Admin UI on localhost.
- Celery
- The tasking system that communicates with the workers.
- RabbitMQ
- The broker between the master and worker celery tasks.
- PostgreSQL
- The database used by Django.
- Nginx
- The external-facing service that proxies requests to Django.
- Gunicorn
- Used to start Django automatically.
- Supervisord
- Used to start CeleryCAM automatically.
Starting/Stopping All Django and Celery Services¶
The script dev/ps_services.sh can be used to start, stop, or restart all of the Master Django and Celery services at one time.
Start¶
Assuming the RabbitMQ service is running and configured, you can start Django and Celery as follows:
sudo rabbitmqctl start_app
sudo dev/ps_services.sh start
Stop¶
You can stop the Django and Celery services and the RabbitMQ app as follows:
sudo dev/ps_services.sh stop
sudo rabbitmqctl stop_app
Restart¶
You can restart the Django and Celery services and reset the RabbitMQ app as follows:
sudo dev/ps_services.sh stop
sudo rabbitmqctl stop_app
sudo rabbitmqctl start_app
sudo dev/ps_services.sh start
This can be helpful to reset the Celery queues.
Django¶
Django can be started manually or with startup scripts. During development it is best to start it manually. A production environment will use the startup scripts.
Either way, Django should only listen on localhost. To access Phagescan from outside the host, Nginx is used as a proxy.
Manual Start/Stop¶
Run as the same user that you used to start the 3 celeryd processes:
[avuser@host]$ python manage.py runserver -v 3 127.0.0.1:9000 --settings=scaggr.settings
Stop it with CTRL+C.
Automated Start/Stop¶
In Phagescan, we use gunicorn to automatically start Django on boot.
Start with:
$ sudo service gunicorn start
Stop with:
$ sudo service gunicorn stop
See installation/salt-masterless/salt/gunicorn for sample configuration files and startup script.
Accessing¶
Access the Phagescan Web User Interfaces at:
http://127.0.0.1:9000
http://127.0.0.1:9000/admin
Celery¶
The Scan Master has 4 different Celery services running at the same time.
- CeleryCAM - this is only used in production
- Periodic Runner - run periodic tasks
- Result Collector - collect results from workers
- Master Task - manage tasks and send tasks to workers
The Periodic Runner, Result Collector, and Master Task services require a celery config file located in [Project_root_dir]. The CeleryCAM uses a config file within the /etc/supervisor/conf.d/ directory.
Manual Start/Stop¶
Start the 3 celery processes each in separate terminals:
[avuser@host]$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=masterceleryconfig -E -B -l info --hostname=master.master
[avuser@host]$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=resultsceleryconfig -E -B -l info --hostname=master.results
[avuser@host]$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=periodicceleryconfig -E -B -l info --hostname=master.periodic
Stop them with CTRL+C.
If you want debug output, change -l info to -l debug.
Automated Start/Stop¶
In Phagescan, we use celeryd startup scripts and supervisord to start the 4 Celery services automatically.
Start with:
$ sudo service celeryd-master start
$ sudo service celeryd-periodic start
$ sudo service celeryd-result start
$ sudo service supervisord start
Stop with:
$ sudo service celeryd-master stop
$ sudo service celeryd-periodic stop
$ sudo service celeryd-result stop
$ sudo service supervisord stop
See installation/salt-masterless/salt/[celery/master | supervisord] for sample configuration files and startup script.
RabbitMQ¶
RabbitMQ is installed as an Ubuntu package and will start automatically on boot, by default. For Phagescan, all we need to do is add users, permissions, and vhosts for Celery to use.
Adding Users, Permissions, and Vhosts¶
Replace the username and password with the credentials that you would like to use and run the following commands:
$ sudo rabbitmqctl add_user phagemasteruser longmasterpassword
$ sudo rabbitmqctl add_user phageworkeruser longworkerpassword
$ sudo rabbitmqctl add_vhost phage
$ sudo rabbitmqctl set_permissions -p phage phagemasteruser ".*" ".*" ".*"
$ sudo rabbitmqctl set_permissions -p phage phageworkeruser ".*" ".*" ".*"
If the master and worker hosts use different user/pass combinations to communicate with the broker, you must use the commands as written above. However, if the master and worker hosts use the same user/pass, you may add one user and set_permissions on that one user. In that case, there is no need for the second user. A production system should use separate credentials for the master and workers.
Note: These credentials are the BROKER_CONF for the master and workers. So, make sure the username and password you created here are set in both the master and worker BROKER_CONFs:
For the master it is in scaggr/settings.py. For the worker it is in workerceleryconfig.py.
Deleting Users, Permissions, and Vhosts¶
Delete the users and the vhost; their permissions are deleted automatically.
$ sudo rabbitmqctl del_user phagemasteruser
$ sudo rabbitmqctl del_user phageworkeruser
$ sudo rabbitmqctl del_vhost phage
Starting the RabbitMQ Application¶
You only have to do this if you've manually stopped the application. Starting the RabbitMQ service does this automatically.
$ sudo rabbitmqctl start_app
Stopping the RabbitMQ Application¶
This is how you would clear/delete the Celery Queues from RabbitMQ after you stop the Celery services:
$ sudo rabbitmqctl stop_app
Examining the Queues¶
RabbitMQ is the broker for the Celery queues, so you can examine many details about the Celery queues using rabbitmqctl.
See the rabbitmqctl documentation for guidance.
One of the more useful commands is:
$ sudo rabbitmqctl -p phage list_queues name consumers messages
This will list all of the queues on the phage vhost; specifically the name, number of consumers, and number of messages for each queue.
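If you want to consume that output programmatically, each data row is tab-separated as name, consumers, messages, surrounded by "Listing queues ..." style banner lines. A hedged parsing sketch (the exact banner text varies between rabbitmqctl versions):

```python
# Parse the output of:
#   rabbitmqctl -p phage list_queues name consumers messages
# Data rows are tab-separated: <name> <consumers> <messages>.

def parse_list_queues(output):
    """Return {queue_name: {"consumers": int, "messages": int}}."""
    queues = {}
    for line in output.splitlines():
        parts = line.strip().split("\t")
        if len(parts) != 3:
            continue  # skip "Listing queues ..." / "...done." banners
        name, consumers, messages = parts
        if not (consumers.isdigit() and messages.isdigit()):
            continue
        queues[name] = {"consumers": int(consumers), "messages": int(messages)}
    return queues
```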
PostgreSQL¶
Configure¶
Replace the username and password with the credentials that you would like to use and run the following commands:
$ sudo su postgres
$ psql
postgres=# create user citestsuper createdb superuser password 'sup3rdup3r';
postgres=# create database phage owner citestsuper;
postgres=# \q
$ psql -d phage
phage=# create extension hstore;
phage=# \q
$ exit
Starting/Stopping¶
This is a standard service, so start and stop by doing:
$ sudo service postgresql start
$ sudo service postgresql stop
Nginx¶
A default Nginx config file is provided in installation/salt-masterless/salt/nginx/. Once it is configured, you can start or stop the nginx service as follows:
$ sudo service nginx start
$ sudo service nginx stop
Note: after changing the configuration file, you should restart the service.
Phagescan Workers¶
The Phagescan Worker is one of a collection of VMs running in a virtualization infrastructure. We currently support OpenStack and EC2 as cloud computing providers. You can also run VMs in any virtualization framework, such as VMware or VirtualBox.
Using the Phagescan Workers¶
Once the VM is built and configured, there are only a few things that you may want to do on the VM.
Start/Stop the Celery Service¶
In a production environment, there will be a single celery service that will start automatically. On Linux, you can start and stop the service using the standard service scripts:
$ sudo service celeryd start
$ sudo service celeryd stop
In a development environment, it is often easiest to start it manually in debug mode to observe the output.
Refer to the OS-based installation documentation for the specifics:
Ubuntu
CentOS
Windows XP
Start/Stop Engine Services¶
Each engine will have its own start and stop script if it is run as a service. Some engines are simple scripts that are only executed on demand. Refer to the Engine Installation Documentation in [Project_root_dir]/engines/[engine_name]/INSTALL for assistance.
Update Engine Signatures¶
Some engines are signature-based, which we refer to as Evilness engines. On a production system, they are configured to automatically install signature updates multiple times per day, provided the workers have access to the public Internet. There is also a periodic Celery service on the Scan Master that attempts to update signatures every 4 hours. However, if you want to update them manually, refer to the Engine Installation Documentation in [Project_root_dir]/engines/[engine_name]/INSTALL for assistance.
OpenStack Web User Interface¶
The OpenStack Web UI is where you will do initial setup and possibly troubleshooting.
Log onto UI¶
The OpenStack UI is at something like:
https://10.10.10.10:8080/horizon/project/instances
Note: This UI requires separate credentials from the Phagescan Web UI.
Find IP Addresses for Virtual Network Devices¶
Log into the OpenStack interface and look at the Network page for the description of the virtual network hardware. For an individual VM, you should look on the Instances page.
Log onto a Worker VM¶
Go to the OpenStack UI, and click on the "Instances" link. On the Instances page, click on the Instance that you want to log onto. Then at the top there are three tabs, click the right-most tab, which is a VNC session to that Instance. Here you can log in.
Phagescan Admin Web User Interface¶
The Phagescan Admin web user interface is where the Admin can make manual adjustments to the Phagescan setup.
Accessing¶
Access the Phagescan Admin Web User Interface on localhost by going to this URL:
http://127.0.0.1:9000/admin
A production master will have an Nginx front-end, so you'll want to access via the public IP address that Nginx is listening on. It may be on port 80 or 443 depending on SSL configuration:
http://1.2.3.4/admin
https://1.2.3.4/admin
Configuration¶
There are only a couple of places where configuration is needed.
- first place
- second place
TODO: Finish this page
Phagescan Development¶
To develop on Phagescan, first go through the Quick Installation to get a phagedev VM set up. If you do not want to use VMs, we have Manual Instructions, so you can do everything on your development host.
There are some important facts about these vagrant VMs to note:
- The [Project_root_dir] directory on your development host will be mapped read/write into the vagrant VM as /vagrant. So you can use an editor/IDE on files on your development host and execute your code/tests inside your vagrant VM.
- When you ssh into the vagrant vm, you will be user 'vagrant' which has no password and has sudo privileges.
- The python virtualenv on each vagrant vm is in /opt/psvirtualenv.
- The Vagrantfile is in [Project_root_dir].
- Once your VM is fully built, it is a good idea to halt it and take a snapshot. Then you can quickly revert to a clean VM should you experience problems during development.
- Port 8000 in the vagrant VM can be accessed by localhost:8090 on your development host.
Prepare your Environment¶
Setup Git¶
Make sure you have git fully set up on your development host:
$ git config --global user.name "yourUserName"
$ git config --global user.email you@domain.com
Fork our GitHub repo in your own GitHub account. If you haven't already, clone your forked repo:
$ git clone git@github.com:<YOUR GITHUB USER>/phagescan.git
$ cd [Project_root_dir]
$ git remote add upstream git@github.com:scsich/phagescan.git
If you already cloned the repo, update your Git remotes:
$ cd [Project_root_dir]
$ git remote rm origin
$ git remote add upstream git@github.com:scsich/phagescan.git
$ git remote add origin git@github.com:<YOUR GITHUB USER>/phagescan.git
Refresh your repo and select your git branch:
$ git pull
$ git branch --all
$ git checkout -b <mybranch> origin/<mybranch>
For your Python SDK, you have the choice of working with an SDK that is located within a Vagrant VM, or working with an SDK that is local to your development host. The following two sections walk you through those options.
Remote Python SDK in VirtualEnv in a Vagrant VM¶
If you use the SDK in a Vagrant VM, it will be fully set up by the automated Salt provisioner when you start the VM. The only other thing you need to do is configure your IDE to use it.
We recommend that you use PyCharm as your Python IDE. It is able to use the SDK located within a Vagrant VM and execute code and tests within that Vagrant VM. This leaves your development host free of extra python libs and other tools. Use the PyCharm Guide to create a new Python Project and connect it with your Vagrant VM.
Local Python SDK in VirtualEnv on development host¶
If you do not want to use the SDK in a Vagrant VM and instead wish to use one local to your development host, this section will help you install everything in a Python Virtual Environment.
We still recommend PyCharm, because it is one of the best Python development IDEs around.
Install necessary OS packages:
If running Ubuntu:
$ sudo apt-get install $(< [Project_root_dir]/PACKAGES.ubuntu)
$ sudo apt-get install $(< [Project_root_dir]/installation/scanmaster/PACKAGES.ubuntu)
Build & activate a virtual environment:
$ virtualenv --setuptools ~/psvirtualenv
$ source ~/psvirtualenv/bin/activate
Your prompt should look like this after:
(psvirtualenv)[user@host]$
If you need to deactivate the virtual env (don't do this now):
(psvirtualenv)[user@host]$ deactivate
Install Python requirements into Virtualenv:
(psvirtualenv)[user@host]$ pip install -r [Project_root_dir]/installation/dev/PACKAGES.pip
(psvirtualenv)[user@host]$ pip install -r [Project_root_dir]/installation/scanmaster/PACKAGES.pip
(psvirtualenv)[user@host]$ pip install -r [Project_root_dir]/installation/scanworker/PACKAGES.pip
Now you can start development. There are some handy scripts and config files in [Project_root_dir]/dev/.
Remember that on the VM you have to manually start/stop celery and django.
The simplest thing to do is to run everything as the user 'vagrant', which is the user you are logged in as when you connect to a Vagrant VM using ssh.
Creating new Engines¶
All you are doing when creating an engine is creating a Python wrapper around a tool that you want to run. There are two types of engines:
- Metadata engines - return data about files, but do not make a good/bad judgement.
- Evilness engines - make a good/bad judgement, but may also return other data.
TODO.. add more..
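As a conceptual sketch only (the real base classes and method names are defined in this repo's engines/ directory — follow an existing engine there for the actual interface), a minimal metadata engine might look like this. The class and method names here are illustrative, not the project's API:

```python
import hashlib


class HashMetadataEngine(object):
    """Sketch of a metadata engine: returns data about a file,
    but makes no good/bad verdict. Names are hypothetical; see
    engines/[engine_name]/ for a real wrapper to copy from."""

    name = "hashmeta"

    def scan(self, sample_bytes):
        # Compute a few common hashes plus the sample size.
        return {
            "md5": hashlib.md5(sample_bytes).hexdigest(),
            "sha256": hashlib.sha256(sample_bytes).hexdigest(),
            "size": len(sample_bytes),
        }


if __name__ == "__main__":
    print(HashMetadataEngine().scan(b"hello"))
```

An evilness engine would look similar, but its result would also include a good/bad verdict field derived from the wrapped tool's output.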
Make Pycharm use Python SDK in a Vagrant VM¶
Ensure you've completed the steps for setting up a development system before setting up PyCharm.
These instructions will help you start a new development project in PyCharm. You need PyCharm 2.7.x or newer (ideally a 3.x version). We'll configure your project to use the Python virtualenv in the phagedev VM as your SDK.
First, make sure the phagedev VM is running.
Add a Remote Project SDK¶
Start up PyCharm and we'll add a Remote Project SDK.
- NOTE: You will have to re-do this each time you shut down your VM or create a new VM. Thankfully, it only takes 30 seconds.
Go to the Project Structure settings
- From the Quick Start window, Configure->Project Defaults->Project Structure
- From an open project, File->Project Structure
In the Project Structure settings, add a new Project SDK of type Python SDK
In the Select Interpreter Path window, select "Remote..."
In the Configure Remote Python Interpreter window do the following.
Enter these values:
Host: 127.0.0.1
Port: 2222
User name: vagrant (not anonymous login)
Auth type: Key pair (OpenSSH)
Private key file: ~/.vagrant.d/insecure_private_key
Passphrase: <leave it blank>
Python interpreter path: /opt/psvirtualenv/bin/python
Click the Test connection... button to verify it works. Click OK.
At this point, PyCharm will scp all of the PyCharm helpers into the VM. This is why you have to re-do this process for each new VM.
Notes:
- You get the Port value from the output when you ran vagrant up <vmname>. The line looks like this:
[vmname] -- 22 => 2222 (adapter 1)
- The "Fill from Vagrant config" button will only work if you are using a properly configured Vagrantfile.
- In the Project Structure settings, enter a path for the Project compiler output. Use something like: [Project_root_dir]/output
- Click Apply and then OK to close the Project Structure window.
Update Project to use New SDK¶
If you already have a project created for PhageScan, you can update the SDK used by your project to be the new Remote Python SDK.
If you don't have a project created, create one now and select the remote SDK when it asks.
Create a New Python Project¶
Creating a new Python project for Phagescan is quick and easy.
Create a new Project
- From the Quick Start window, Create New Project
- From an open project, File->New Project
In the New Project window, select "Python Module" in the left frame. Enter phagescan as the Project Name. The Project location should be [Project_root_dir]. Click Next.
In the Specify Python Interpreter window, select the Remote Python interpreter.
In the desired technologies window, check the box next to Django and enter these values:
Project name: phagescan
Application name: scaggr
Templates folder: [Project_root_dir]/templates
Click Finish.
Creating Test to Run in the VM¶
You can create/run tests on the remote VM, but first you have to map [Project_root_dir] on your host to the /vagrant dir on the VM.
From within your project, go to Run-> Edit Configurations.
Select the sideways wrench icon. It should display the Defaults.
Under Defaults, go to Python tests and select Nosetests.
At the bottom of the nosetests window is a field named "Path mappings"
Add an entry that maps your [Project_root_dir] on your development host to /vagrant on the vm. It will be something like this:
[Project_root_dir]=/vagrant
Click Apply, then OK.
Now you can create and run nosetests and they will contain this default mapping.
Manually Build Development VM¶
These instructions are for manually building/configuring a development host that is both a scanmaster and scanworker, instead of using Vagrant/Salt to build a phagedev VM. This must be an Ubuntu host.
Prerequisites¶
Start by following the steps to Prepare your Environment. Specifically, you want to complete the steps to Setup Git and Local Python SDK in VirtualEnv on development host.
If not already running, start rabbitmq and postgresql:
$ sudo /etc/init.d/rabbitmq-server start
$ sudo /etc/init.d/postgresql start
Configuration¶
Unless otherwise specified, assume the commands listed here are to be executed from [Project_root_dir].
Set up rabbitmq:
$ sudo rabbitmqctl add_user test test
$ sudo rabbitmqctl add_vhost phage
$ sudo rabbitmqctl set_permissions -p phage test ".*" ".*" ".*"
Set up postgres:
$ sudo su postgres
$ psql
postgres=# create user citestsuper createdb superuser password 'sup3rdup3r';
postgres=# create database phage owner citestsuper;
postgres=# \q
$ psql -d phage
phage=# create extension hstore;
phage=# \q
$ exit
Resolve 'scanmaster' to localhost:
$ sudo su
# echo "127.0.0.1 scanmaster" >> /etc/hosts
# exit
Install the scanner engines you wish to test.
Refer to the following files:
[Project_root_dir]/engines/[engine_name]/INSTALL
NOTE: Since we're only setting up a development environment, ignore any 'avuser' instructions. We will not create this user.
For the Master, you need to configure and start Django and 3 celery workers.
Create Django database tables, cache, and superuser.
The first command will prompt you to create a django superuser. Do so. Use devuser/devpass and give a fake e-mail address:
$ cd [Project_root_dir]
$ source /opt/psvirtualenv/bin/activate
(psvirtualenv)$ python manage.py syncdb --settings=scaggr.settings
(psvirtualenv)$ python manage.py migrate --settings=scaggr.settings
(psvirtualenv)$ python manage.py createcachetable --settings=scaggr.settings cache
Copy the appropriate Celery config files to the [Project_root_dir]:
(psvirtualenv)$ cp installation/scanmaster/masterceleryconfig.py masterceleryconfig.py
(psvirtualenv)$ cp installation/scanmaster/resultsceleryconfig.py resultsceleryconfig.py
(psvirtualenv)$ cp installation/scanmaster/periodicceleryconfig.py periodicceleryconfig.py
Collect Static files:
(psvirtualenv)$ python manage.py collectstatic
Update scaggr/settings.py.
- Set DEBUG to True
- Update the BROKER_CONF values to match the user and vhost values we configured in the rabbitmq step above.
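As an illustration (the variable names below are hypothetical — use whatever names BROKER_CONF already defines in scaggr/settings.py, and match the values to the rabbitmqctl commands above):

```python
# Hypothetical sketch: align these with the names already present in
# scaggr/settings.py and with the rabbitmq setup performed earlier.
BROKER_USER = "test"        # from: rabbitmqctl add_user test test
BROKER_PASSWORD = "test"
BROKER_HOST = "scanmaster"  # resolves to 127.0.0.1 via /etc/hosts
BROKER_PORT = 5672          # RabbitMQ's default AMQP port
BROKER_VHOST = "phage"      # from: rabbitmqctl add_vhost phage

# Equivalent single-URL form understood by Celery:
BROKER_URL = "amqp://test:test@scanmaster:5672/phage"
```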
Start the celery processes, each in separate terminals:
(psvirtualenv)$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=masterceleryconfig -E -B -l info --hostname=master.master
(psvirtualenv)$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=resultsceleryconfig -E -B -l info --hostname=master.results
(psvirtualenv)$ DJANGO_SETTINGS_MODULE=scaggr.settings celeryd --config=periodicceleryconfig -E -B -l info --hostname=master.periodic
Start the django development web server.
Be sure to run django as the same user that you used to start celeryd:
(psvirtualenv)$ python manage.py runserver -v 3 0.0.0.0:8000 --settings=scaggr.settings
For the Worker, you only need to configure and start one celery worker. Copy the worker Celery config:
(psvirtualenv)$ cp installation/scanworker/workerceleryconfig.py workerceleryconfig.py
Edit the Worker celery config and define the BROKER_CONF variables to match those from the settings.py above.
Start the Celery Worker:
(psvirtualenv)$ celeryd --config=workerceleryconfig -E -l INFO -n worker
Now the Django Web Interface is listening on port 8000 on your host. To connect to the Django Web User Interface:
http://localhost:8000
Login to the Django Web User Interface with the django superuser user/password that you created.
You can reach the Admin interface by going to:
http://localhost:8000/admin/
Some helpful tips:
When you start Django for the first time, it will create a logs directory: [Project_root_dir]/logs. That is where celery and django logging is written.
If you're experiencing weird DB-related issues, drop and re-add the DB. It might be some latent problem:
$ [Project_root_dir]/dev/dev_reset_db.sh
Or, if you prefer, reset the DB and remove all uploaded samples and MD engine-generated files:
$ [Project_root_dir]/dev/dev_reset_all.sh
NOTE: these scripts will require sudo ability to the postgres user.
Troubleshooting¶
The troubleshooting section is designed to guide you through solving some of the more common problems. Phagescan has a number of moving parts, so this guide is sub-divided accordingly. Unless explicitly stated, this guide refers to a production deployment of Phagescan.
Troubleshooting the Phagescan Master¶
If you are having issues with the scanmaster (the Phagescan front-end), the Phagescan master troubleshooting page provides assistance for the most common issues.
Troubleshooting the Phage Workers¶
If you are having issues with the scanworkers (the Phagescan back-end VMs), the Phagescan worker troubleshooting page provides assistance for the most common issues.
Troubleshooting the Cloud Backend in OpenStack¶
If you are having issues with communication between the Master and the Phagescan back-end VMs, the Phagescan OpenStack troubleshooting page provides assistance for the most common issues.
Troubleshooting the Master¶
Troubleshooting the master.
Where To Find Logs¶
Celery and Django logs will be in one of these directories:
/opt/phagescan/logs
c:/phagescan/logs
/var/log/celery/
RabbitMQ, Nginx, and PostgreSQL logs will be in:
/var/log/
Rebooting The Master Host¶
If you are running everything on a single host, including OpenStack, for a production environment, it can be complicated to do a controlled reboot. This is not a common situation, but here is what you do:
First, you should shutdown/destroy all VM instances. Then you can do a standard reboot command.
Log into the Phagescan Status page and click the "X" icon next to one engine from each VM type to destroy the workers. (i.e. clicking the 'X' for Panda, Symantec, and Kaspersky will destroy all of the 3 types of VMs, because they each run on a different VM type).
Log into the OpenStack UI, go to the "Instances" page, and Force Destroy any instances in the "shutdown" state. !!! DO NOT remove images from the Images or Snapshots pages !!!
Properly reboot the master box:
$ sudo shutdown -r now
Upon reboot, log into the Phagescan server status page and click the Up arrow next to one engine from each VM type to start up the workers. (i.e. clicking the up arrow for Panda, Symantec, and Kaspersky will start up the 3 types of VMs, because they each run on a different VM type).
Wait about 15 minutes for everything to stabilize.
Problems Starting Worker VMs After a Power Outage¶
This problem arises because OpenStack only allows as many VM instances as the physical hardware can support. The odd thing is that it counts instances in the shutdown state. When the power goes out and the host boots, all of the previously running VMs will be in the shutdown state. None of those VMs will be restarted by Phagescan, and none of them will be automatically removed. The solution is to manually remove all of them and tell Phagescan to start up new VMs.
If you experience a forced poweroff, here is what you should do upon rebooting:
Log into the Phagescan server status page and click the "X" icon next to one engine from each VM type to destroy the workers. (i.e. clicking the 'X' for Panda, Symantec, and Kaspersky will destroy all of the 3 types of VMs, because they each run on a different VM type).
Log into the OpenStack interface, go to the "Instances" page, and Force Destroy any instances in the "shutdown" state.
Properly reboot the master box:
$ sudo shutdown -r now
Upon reboot, log into the Phagescan server status page and click the Up arrow next to one engine from each VM type to start up the workers. (i.e. clicking the up arrow for Panda, Symantec, and Kaspersky will start up the 3 types of VMs, because they each run on a different VM type).
Scanners Are Failing¶
There are a number of reasons why this could happen. The most common reasons are:
- One or more of the worker VMs is out of disk space.
- Celery tasks are not functioning properly.
If the worker VM is out of disk space, the easiest thing to do is to destroy that worker VM type and re-create it.
If the Celery tasks are not functioning properly, the easiest thing to do is to Stop all Scan Master Celery and Django services and the RabbitMQ application and then restart them.
See restart under Starting/Stopping All Django and Celery Services.
Error Messages¶
Troubleshooting OpenStack¶
OpenStack is a complex framework. It is beyond the scope of our documentation to go into its setup and configuration. However, we do have a few tips that are helpful when dealing with Phagescan Workers running in OpenStack.
Find Network IPs in Use by Virtual Network Hardware¶
The easiest way is to use the OpenStack UI. Alternatively, you can use the command line.
On the command line, you can do the following.
Get the list of devices:
sudo ip netns list
At a minimum, you should see a qrouter and a qdhcp device.
Query the status of each device:
sudo ip netns exec <qrouter-blah-uuid> ifconfig -a
sudo ip netns exec <qdhcp-blah-uuid> ifconfig -a
Log onto a Worker VM¶
This can be done either using the OpenStack UI's VNC relay or the command line.
For the command line, this will only work on the Linux VMs, and only ssh with public key auth is allowed. Make sure you have the private key associated with the public key stored in the VM.
If you do not know the IP address of the VM that you want to connect to, use the OpenStack UI to find the IP and OS Type of the VM that you want to connect to.
Find the qrouter by doing this:
$ sudo ip netns list
Using the qrouter value and the username for that VM type, do this:
$ sudo bash
$ ip netns exec qrouter-<lots of UUID chars> ssh -i /path/to/.ssh/id_rsa username@vm_ipaddr
Remember this: If you change anything on a VM, your changes will not persist for the long term. Phagescan destroys VMs and re-creates them from a template VM when you click the "X" or "UP arrow" icons on the server status page.
Odd Networking Issues¶
Make sure the virtual router and dhcp service are running:
$ sudo ip netns list
You should see both a qrouter-<long uuid> and a qdhcp-<long uuid> entry.
Make sure both of them have IP addresses. If they don't or they're not up then the workers probably can't get to RabbitMQ:
$ sudo ip netns exec <qrouter-long-uuid> ifconfig -a
$ sudo ip netns exec <qdhcp-long-uuid> ifconfig -a
Make sure the worker VMs have an IP address and in the proper IP subnet matching the qdhcp and qrouter. You can log in using the VNC method and then check the interface settings.
Make sure you can ping the Phagescan UI IP Address from the Worker. Log onto the worker using either VNC or CLI and ping the IP of the Phagescan UI.
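Note that a ping can succeed while the broker port is still blocked. A minimal TCP reachability check, runnable from the worker, tells you more (a sketch; it assumes RabbitMQ is on its default AMQP port, 5672):

```python
import socket


def can_reach(host, port, timeout=3):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        s = socket.create_connection((host, port), timeout=timeout)
        s.close()
        return True
    except (socket.error, socket.timeout):
        return False


if __name__ == "__main__":
    # Substitute your master's hostname or IP here.
    print(can_reach("scanmaster", 5672))
```

If this returns False while ping works, suspect security group rules or the qrouter rather than basic IP connectivity.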
While logged onto the worker, look at the worker celery logs.
If none of those seems to help, it might be best to do a proper reboot of the master host. See "Rebooting The Master Host" above.
Where To Find Logs¶
OpenStack logs are in:
/var/log/
You'll often want the horizon or nova logs if you are investigating VM issues.
Troubleshooting the Workers¶
Troubleshooting the workers.
Errors¶
Worker VM Out of Memory¶
There are a number of reasons why a worker VM could be out of memory.
- One of the Engines could have a memory leak.
- The Celery service might be running in DEBUG mode.
- There are other reasons.
First, check to see if the Celery service is running in DEBUG mode.
If you are running Celery on the command line, the log level is set with the -l flag. If you are running Celery as a service, look in /etc/default/celeryd for the CELERYD_LOG_LEVEL parameter. In production, it should be INFO.
If you change the logging level, restart the Celery service.
If Celery is not the problem, just reboot that worker VM.