==================
Processing Modules
==================

Cuckoo's processing modules are Python scripts that let you define custom
ways to analyze the raw results generated by the sandbox and append
the outcome to a global container, which is later used by the
signatures and the reporting modules.

You can create as many modules as you want, as long as they follow a
predefined structure that we will present in this chapter.

Global Container
================

After an analysis is completed, Cuckoo will invoke all the processing
modules available in the ``cuckoo/processing/`` directory, all of which
fall under the ``cuckoo.processing`` module. Any additional module you decide
to create must be placed inside that directory.

Every module should also have a dedicated section in the
``$CWD/conf/processing.conf`` file. For example, if you create a module
``cuckoo/processing/foobar.py``, you will have to append the following
section to ``$CWD/conf/processing.conf``::

    [foobar]
    enabled = yes

Every module will then be initialized and executed, and the data it returns
is added to a data structure that we'll call the **global container**.

This container is simply a large Python dictionary that holds
the abstracted results produced by all the modules, indexed by each
module's identification key.
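
As a rough illustration (not Cuckoo's actual implementation), the container
can be pictured as a dictionary built by running each module and storing its
return value under the module's key. The two stand-in classes and the
``build_container`` helper below are hypothetical names used only for this
sketch:

.. code-block:: python

    class FakeTargetInfo(object):
        """Stand-in for a processing module; not a real Cuckoo class."""
        key = "target"

        def run(self):
            return {"category": "file", "name": "sample.exe"}


    class FakeStrings(object):
        """Second stand-in module, returning a list instead of a dict."""
        key = "strings"

        def run(self):
            return ["MZ", "kernel32.dll"]


    def build_container(modules):
        """Run each module and file its result under the module's key."""
        container = {}
        for module in modules:
            data = module.run()
            # Read the key after run(), since a module may set it there.
            container[module.key] = data
        return container


    results = build_container([FakeTargetInfo(), FakeStrings()])

Under these assumptions, a consumer of the container would look up a value
with an expression such as ``results["target"]["name"]``.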

Cuckoo already provides a default set of modules which will
generate a *standard* global container. It's important for the existing
reporting modules (HTML report etc.) that these default modules are
not modified, otherwise the resulting global container structure would
change and the reporting modules wouldn't be able to recognize it and
extract the information used to build the final reports.

The currently available default processing modules are:
    * **AnalysisInfo** (``cuckoo/processing/analysisinfo.py``) - generates some basic information on the current analysis, such as timestamps, version of Cuckoo and so on.
    * **ApkInfo** (``cuckoo/processing/apkinfo.py``) - generates some basic information on the current APK analysis (Android analysis).
    * **Baseline** (``cuckoo/processing/baseline.py``) - baseline results from gathered information.
    * **BehaviorAnalysis** (``cuckoo/processing/behavior.py``) - parses the raw behavioral logs and performs some initial transformations and interpretations, including complete process tracing, a behavioral summary and a process tree.
    * **Buffer** (``cuckoo/processing/buffer.py``) - dropped buffer analysis.
    * **Debug** (``cuckoo/processing/debug.py``) - includes errors and the *analysis.log* generated by the analyzer.
    * **Droidmon** (``cuckoo/processing/droidmon.py``) - extracts dynamic API call information from Droidmon logs.
    * **Dropped** (``cuckoo/processing/dropped.py``) - includes information on the files dropped by the malware and dumped by Cuckoo.
    * **DumpTls** (``cuckoo/processing/dumptls.py``) - cross-references TLS master secrets extracted from the monitor and key information extracted from the PCAP to dump a master secrets file.
    * **GooglePlay** (``cuckoo/processing/googleplay.py``) - Google Play information about the analysis session.
    * **Irma** (``cuckoo/processing/irma.py``) - IRMA connector.
    * **Memory** (``cuckoo/processing/memory.py``) - executes Volatility on a full memory dump.
    * **Misp** (``cuckoo/processing/misp.py``) - MISP connector.
    * **NetworkAnalysis** (``cuckoo/processing/network.py``) - parses the PCAP file and extracts some network information, such as DNS traffic, domains, IPs, HTTP requests, IRC and SMTP traffic.
    * **ProcMemory** (``cuckoo/processing/procmemory.py``) - performs analysis of process memory dumps. **Note**: the module is able to process user-defined Yara rules from ``data/yara/memory/index_memory.yar``. Edit this file to add your own Yara rules.
    * **ProcMon** (``cuckoo/processing/procmon.py``) - extracts events from procmon.exe output.
    * **Screenshots** (``cuckoo/processing/screenshots.py``) - screenshot and OCR analysis.
    * **Snort** (``cuckoo/processing/snort.py``) - Snort processing module.
    * **StaticAnalysis** (``cuckoo/processing/static.py``) - performs some static analysis of PE32 files.
    * **Strings** (``cuckoo/processing/strings.py``) - extracts strings from the analyzed binary.
    * **Suricata** (``cuckoo/processing/suricata.py``) - Suricata processing module.
    * **TargetInfo** (``cuckoo/processing/targetinfo.py``) - includes information on the analyzed file, such as hashes.
    * **VirusTotal** (``cuckoo/processing/virustotal.py``) - searches VirusTotal.com for antivirus signatures of the analyzed file. **Note**: the file is not uploaded to VirusTotal.com; if it was not previously submitted to the website, no results will be retrieved.

Getting started
===============

In order to make them available to Cuckoo, all processing modules must be
placed inside the ``cuckoo/processing/`` directory.

A basic processing module could look like:

.. code-block:: python
    :linenos:

    from cuckoo.common.abstracts import Processing

    class MyModule(Processing):

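        def run(self):
            # Name of the sub container that will hold the returned data.
            self.key = "key"
            # do_something() is a placeholder for your own analysis logic.
            data = do_something()
            return data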

Every processing module should contain:
    * A class inheriting ``Processing``.
    * A ``run()`` method.
    * A ``self.key`` attribute defining the name to be used as a sub container
      for the returned data.
    * A set of data (list, dictionary, string, etc.) that will be appended to
      the global container.

You can also specify an ``order`` value, which allows you to run the available
processing modules in an ordered sequence. By default all modules are set with
an ``order`` value of ``1`` and are executed in alphabetical order.

If you want to change this value, your module would look like:

.. code-block:: python
    :linenos:

    from cuckoo.common.abstracts import Processing

    class MyModule(Processing):
        order = 2

        def run(self):
            self.key = "key"
            data = do_something()
            return data
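
The resulting execution sequence can be pictured as a sort on ``order`` with
the module name as a tie-breaker. This is only a sketch of the behaviour
described above, not Cuckoo's scheduling code; the three classes and the
``execution_sequence`` helper are hypothetical:

.. code-block:: python

    class Alpha(object):
        order = 1


    class Zulu(object):
        order = 1


    class Bravo(object):
        order = 2


    def execution_sequence(modules):
        """Sort by ``order`` first, then alphabetically by class name."""
        return sorted(modules, key=lambda m: (m.order, m.__name__))


    sequence = [m.__name__ for m in execution_sequence([Bravo, Zulu, Alpha])]

Here ``Bravo`` runs last despite its name, because its ``order`` value is
higher than that of the other two modules.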

You can also manually disable a processing module by setting the ``enabled``
attribute to ``False``:

.. code-block:: python
    :linenos:

    from cuckoo.common.abstracts import Processing

    class MyModule(Processing):
        enabled = False

        def run(self):
            self.key = "key"
            data = do_something()
            return data

The processing modules are provided with some attributes that can be used to
access the raw results for the given analysis:

    * ``self.analysis_path``: path to the folder containing the results (e.g., ``$CWD/storage/analysis/1``).
    * ``self.log_path``: path to the *analysis.log* file.
    * ``self.file_path``: path to the analyzed file.
    * ``self.dropped_path``: path to the folder containing the dropped files.
    * ``self.logs_path``: path to the folder containing the raw behavioral logs.
    * ``self.shots_path``: path to the folder containing the screenshots.
    * ``self.pcap_path``: path to the network pcap dump.
    * ``self.memory_path``: path to the full memory dump, if created.
    * ``self.pmemory_path``: path to the process memory dumps, if created.

With these attributes you should be able to easily access all the raw results
stored by Cuckoo and perform your analytic operations on them.
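
For instance, a module could walk the dropped files folder and collect basic
metadata for each file. The helper below is a minimal sketch written as a
plain function so that it can be tried outside Cuckoo; ``summarize_dropped``
and the keys of the returned dictionaries are hypothetical names, not part of
Cuckoo's API:

.. code-block:: python

    import hashlib
    import os


    def summarize_dropped(dropped_path):
        """Return name, size and SHA-256 for every file below dropped_path."""
        entries = []
        # os.walk() yields nothing for a missing folder, so the result is
        # simply an empty list when no files were dropped.
        for root, _, files in os.walk(dropped_path):
            for name in sorted(files):
                path = os.path.join(root, name)
                # Reads the whole file at once; fine for a sketch, but large
                # files would be better hashed in chunks.
                with open(path, "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
                entries.append({
                    "name": name,
                    "size": os.path.getsize(path),
                    "sha256": digest,
                })
        return entries

Inside a processing module, ``run()`` could then simply return
``summarize_dropped(self.dropped_path)``.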

As a last note, it is good practice to raise the ``CuckooProcessingError``
exception whenever the module encounters an issue you want to report to Cuckoo.
Import the class and raise it like this:

.. code-block:: python
    :linenos:

    from cuckoo.common.exceptions import CuckooProcessingError
    from cuckoo.common.abstracts import Processing

    class MyModule(Processing):

        def run(self):
            self.key = "key"

            try:
                data = do_something()
            except SomethingFailed:
                # SomethingFailed stands for whatever exception your code raises.
                raise CuckooProcessingError("Failed")

            return data
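
A slightly more concrete variant of the same pattern is sketched below: a
helper that loads a JSON file from the analysis folder and converts read or
parse failures into a ``CuckooProcessingError``. The ``load_json_result``
helper is hypothetical, and the fallback exception class is only a stand-in
that lets the sketch run outside a Cuckoo installation:

.. code-block:: python

    import json
    import os

    try:
        from cuckoo.common.exceptions import CuckooProcessingError
    except ImportError:
        # Stand-in so the sketch also runs without Cuckoo installed.
        class CuckooProcessingError(Exception):
            pass


    def load_json_result(analysis_path, filename):
        """Load a JSON file from the analysis folder or raise a processing error."""
        path = os.path.join(analysis_path, filename)
        try:
            with open(path, "r") as handle:
                return json.load(handle)
        except (IOError, OSError) as e:
            # Missing or unreadable file.
            raise CuckooProcessingError("Unable to read %s: %s" % (path, e))
        except ValueError as e:
            # json raises a ValueError subclass on malformed input.
            raise CuckooProcessingError("Invalid JSON in %s: %s" % (path, e))

Catching only the specific exceptions you expect keeps genuine programming
errors visible instead of hiding them behind a generic processing failure.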