
Overview

The goal of Nemesis is to create an extensible file-processing system for Adversary Simulation operations: it takes files collected from C2 agents, provides automated analysis, and assists with triage.

Project Structure

  • ./docs/ - Documentation that's published to the GitHub page
  • ./infra/ - Infrastructure files (Dapr, Postgres, etc.)
  • ./libs/ - Shared libraries between projects and the file_enrichment_modules
  • ./projects/ - Applications that comprise the various Nemesis services
  • ./tools/ - Miscellaneous helper scripts

Design Choices

Many of the decisions made in Nemesis 1.X resulted in an over-engineered system that was inflexible and difficult to expand and maintain. Nemesis 2.0 takes those lessons learned and simplifies the entire architecture:

  • Docker/Docker-Compose is used instead of k8s for speed of development and general ease of use, especially as we didn't experiment with scaling in the previous version (we may move back to k8s at some point).
  • Dapr is now used to increase reliability and to offload infrastructure plumbing concerns
  • Strict protobuf schemas were dropped in favor of a flexible schema
  • Overall project code/approaches were greatly simplified
  • Dropped Elasticsearch (the largest resource hog) in favor of consolidating with PostgreSQL

HTTP Endpoint

The HTTP endpoint makes it easy for people to create consumers without needing to structure their messages with protobuf.
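As an illustration, a consumer can build and submit a plain JSON message with ordinary HTTP tooling. The endpoint path and payload field names below are hypothetical examples, not Nemesis's actual API schema:

```python
import json


def build_file_event(agent_id: str, file_path: str, file_size: int) -> str:
    """Build a plain-JSON message body. Field names here are illustrative,
    not Nemesis's real schema."""
    payload = {
        "agent_id": agent_id,
        "path": file_path,
        "size": file_size,
    }
    return json.dumps(payload)


# Submitting it is then an ordinary HTTP POST (no protobuf tooling needed), e.g.:
#   urllib.request.urlopen(urllib.request.Request(
#       "http://localhost:7443/api/files",   # hypothetical endpoint
#       data=build_file_event("agent-1", "loot/doc.docx", 1024).encode(),
#       headers={"Content-Type": "application/json"}))
```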

RabbitMQ

We still use RabbitMQ as the main queuing system for Nemesis. While RabbitMQ lacks some of Kafka's features, such as long-term message retention and replay, it is significantly lighter weight and still scales well.

With Dapr pub/sub integration, this can easily be swapped out.

Dapr

Nemesis 2.0 makes heavy use of Dapr, the Distributed Application Runtime. The Dapr components that Nemesis utilizes are detailed in the following sections. Images on this page were pulled from the Dapr documentation.

Pubsub

Nemesis utilizes the Dapr Publish & subscribe building block for its internal queueing system. Currently, Nemesis utilizes RabbitMQ for the queue, but this can easily be swapped for alternative systems like Kafka or Redis Streams by standing the provider up in docker-compose.yml, modifying the pubsub.yaml file to reference the alternative provider, and ensuring the connection string is passed through via an environment variable, as in the current pubsub.yaml example.
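For illustration, a Dapr pubsub component pointed at Redis Streams instead of RabbitMQ might look roughly like the following. This is a sketch based on Dapr's documented component format, not a tested Nemesis configuration; the host value and secret name are assumptions:

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: pubsub
spec:
  type: pubsub.redis
  version: v1
  metadata:
    - name: redisHost
      value: "redis:6379"          # assumes a "redis" service in docker-compose.yml
    - name: redisPassword
      secretKeyRef:
        name: REDIS_PASSWORD       # resolved from the Dapr secret store,
                                   # i.e. an environment variable in the local setup
```

Because application code publishes and subscribes through the Dapr sidecar rather than a broker-specific client library, this component swap requires no code changes in the services themselves.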

Dapr Pubsub

Workflows

Dapr Workflows enable developers to build reliable, long-running business processes as code. They provide a way to orchestrate microservices with built-in state management, error handling, and retry logic for complex distributed applications.

Dapr Workflow Overview

Nemesis uses workflows in two specific places/services. First, in the file_enrichment project, Dapr workflows are used to control the main file enrichment processing logic. The enrichment_workflow() function controls the main enrichment workflow, with the enrichment_module_workflow() function invoked as a child workflow.

The document_conversion project also implements a Dapr workflow in the document_conversion_workflow() function to handle converting documents and extracting text. This is broken out into a separate project as it's a time-consuming task.
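The parent/child structure described above can be sketched in plain Python. This is a conceptual illustration of the orchestration pattern only: the function bodies are stand-ins, not Nemesis's real logic, and the real implementation goes through the Dapr workflow SDK rather than direct function calls:

```python
def enrichment_module_workflow(file_meta: dict, module: str) -> dict:
    """Child workflow: run a single enrichment module against one file.
    (Stand-in logic; the real child workflow performs actual analysis.)"""
    return {"module": module, "file": file_meta["path"], "status": "ok"}


def enrichment_workflow(file_meta: dict, modules: list[str]) -> list[dict]:
    """Parent workflow: fan out to one child workflow per enrichment module.
    In Dapr each child call is a durable, retryable child workflow; here it
    is an ordinary function call for illustration."""
    results = []
    for module in modules:
        results.append(enrichment_module_workflow(file_meta, module))
    return results
```

The value of running this under Dapr Workflows rather than plain function calls is durability: each completed step is checkpointed, so a crash mid-run resumes from the last completed module instead of reprocessing the whole file.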

Secrets

Nemesis uses the Dapr Secrets management building block to protect secrets internally (like PostgreSQL connection parameters). Currently the Local environment variables component is used. These secrets are also referenced within some Dapr files such as pubsub.yaml.

The reason for using this abstraction is so alternative secret management systems like Vault or Kubernetes secrets can be used in the future:

Dapr Secrets

An example of retrieving secrets is in libs/common/common/db.py which retrieves individual PostgreSQL connection parameters (POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_HOST, POSTGRES_PORT, POSTGRES_DB, POSTGRES_PARAMETERS) and constructs the connection string.
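With the local environment-variables secret store, the retrieval described above amounts to reading the POSTGRES_* values and assembling a connection string. The sketch below shows the general idea; the exact string format and error handling in libs/common/common/db.py may differ:

```python
import os


def build_postgres_url(env: dict[str, str]) -> str:
    """Assemble a PostgreSQL connection string from the individual secret
    values. Illustrative only; not the exact db.py implementation."""
    user = env["POSTGRES_USER"]
    password = env["POSTGRES_PASSWORD"]
    host = env["POSTGRES_HOST"]
    port = env.get("POSTGRES_PORT", "5432")
    db = env["POSTGRES_DB"]
    params = env.get("POSTGRES_PARAMETERS", "")
    url = f"postgresql://{user}:{password}@{host}:{port}/{db}"
    return f"{url}?{params}" if params else url


# In Nemesis these values come from the Dapr secret store; with the local
# component that store is backed by environment variables, so locally this
# is equivalent to build_postgres_url(dict(os.environ)).
```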

Service Invocation

In a few places in Nemesis, Dapr's Service Invocation building block is used to ease the complexity of some API invocations. This building block is specifically used when calling the Gotenberg API and when calling some of the internal file enrichment APIs by the web API.
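Service invocation routes calls through the local Dapr sidecar, so a caller only needs the target's Dapr app ID rather than its network address. The URL shape below is Dapr's documented HTTP invoke route; the app ID and method in the usage comment are hypothetical examples:

```python
def dapr_invoke_url(app_id: str, method: str, dapr_port: int = 3500) -> str:
    """Build the sidecar URL for Dapr's HTTP service-invocation API:
    http://localhost:<dapr-port>/v1.0/invoke/<app-id>/method/<method>"""
    return f"http://localhost:{dapr_port}/v1.0/invoke/{app_id}/method/{method}"


# e.g. an HTTP POST to dapr_invoke_url("gotenberg", "forms/libreoffice/convert")
# would be routed by the sidecar to the Gotenberg service, with Dapr handling
# service discovery and retries instead of the calling code.
```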