Hosting the Compete With Team Europe CTF
The Compete With Team Europe CTF was a multi-national capture-the-flag contest organized by ENISA and the Team Europe Trainers. It was created to give Team Europe a chance to refine their skills under pressure.
It consisted of two events: a Jeopardy CTF and an Attack-Defense CTF.
- 2023-07-04: Right-size the VMs
- 2023-07-05: Cloud done
- 2023-07-06: Source NAT / networking done
- 2023-07-07: All but 1 checker green
- 2023-07-07: Recover from cluster downtime, learn why m6i instances are better
We chose the ECSC2022 / Faust CTF engine early on, to align with the tooling used at the ECSC 2022 and ICC 2022 competitions, which would be familiar to the players.
Ideating the challenges
The challenge ideas were discussed by the organizers early on, and validated against the format of the game. Challenge authors were CTF players from Team Europe.
Authoring the challenges
The challenges were built over the course of a few weeks by the authors working
closely together. The deliverable format was a
Since we deployed the vulnbox ourselves, the easiest way to set it up was to use Packer and generate an AMI. From there, it's trivial to roll out the same AMI to all 25 teams.
For repeated / iterative work, we generated a new AMI and replaced the running test copies of the vulnbox with the new image.
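As a rough sketch of what that Packer setup could look like (the region, AMI filter, and provisioner script names here are my own illustrative assumptions, not the actual template):

```hcl
# Hypothetical Packer template for baking a vulnbox AMI.
# Region, filters and script names are illustrative assumptions.
packer {
  required_plugins {
    amazon = {
      source  = "github.com/hashicorp/amazon"
      version = ">= 1.0.0"
    }
  }
}

source "amazon-ebs" "vulnbox" {
  region        = "eu-central-1"            # assumption
  instance_type = "t3.medium"
  ami_name      = "vulnbox-{{timestamp}}"   # unique name per build

  source_ami_filter {
    filters = {
      name = "Windows_Server-2022-English-Full-Base-*"
    }
    owners      = ["amazon"]
    most_recent = true
  }

  communicator   = "winrm"
  winrm_username = "Administrator"
}

build {
  sources = ["source.amazon-ebs.vulnbox"]

  provisioner "powershell" {
    script = "install_challenges.ps1"   # hypothetical challenge setup
  }
}
```

Rebuilding the AMI and swapping the test instances then maps directly onto the iterative workflow described above.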
In most respects the 25 VMs are identical, so they can all be spawned from the same AMI. However, each instance also needs some individualized configuration, and this is where user data comes in. Both Windows and Linux AWS instances support user data, which is essentially a text field stored by AWS that the VM can access. Official AMIs are configured to read the user data as an EC2Launch or cloud-init config file, on Windows or Linux respectively.
User data is read from the EC2 Instance Metadata Service (IMDS).
EC2Launch is a system feature integrated into AWS Windows AMIs that essentially performs first-boot setup. The default configuration for EC2Launch sets the admin password and uploads it to EC2.
The syntax of EC2Launch user data is either a PowerShell script wrapped in XML-like tags, or a YAML file detailing the steps to perform. Critically for us, EC2Launch includes a feature to reboot the instance and continue the script where it left off. To trigger it, you use the same exit code as when triggering a reboot from an AWS Systems Manager (SSM) Run Command: exit 3010. Exiting with that code will reboot the VM and simply restart the script.
We found that for the most part, creating marker files (stage1.txt, stage2.txt) and testing for their existence is sufficient to restart the script at an appropriate position.
Unfortunately for us, user data also creates a bunch of logs on the system: in the Windows Event Viewer, under C:\ProgramData\Amazon\EC2Launch\logs, and in %TEMP%. The files are saved to disk after the PowerShell script exits.
So, this way we can configure 25 Windows domains automatically. Nice!
cloud-init is a system feature built into Linux AWS AMIs that likewise initializes the VM on first boot. There are many more options, but the main ones we're interested in are in the user data.
Building the checkers
ENISA’s AWS account was used. Since they deploy Control Tower, the default VPC is deleted, so a VPC and subnets had to be created.
We opted to go for the following AWS architecture:
```
VPC
|
+--- VPN subnet 10.20.150.0/24
|    |
|    +---- VPN (m6i.2xlarge)
|
+--- Gameserver subnet 10.20.151.0/24
|    |
|    +---- Gameservers (x 11)
|    |
|    +---- Windows Checker VMs (x 6)
|
+--- Player 1 subnet 10.20.1.0/24
|    |
|    +---- Player 1 dc1  10.20.1.4 (t3.medium)
|    +---- Player 1 srv1 10.20.1.5 (t3.medium)
|    +---- Player 1 lin1 10.20.1.6 (t3.medium)
|
+--- Player 2 subnet 10.20.2.0/24
...
```
The natural tool for the job is Terraform, due to its modules support and simple iteration features. It also naturally fit into the model, where one person could prepare the changes and another could apply them to the cluster.
The setup was quite simple: a module was used for the player subnets, and by iterating over a map of players we avoided repeating the player block over and over.
One of the biggest goals was to offload as much as possible to the cloud provider. That way, there are fewer single points of failure. However, there’s a core part of the Faust CTF architecture that doesn’t lend itself to resiliency: the router.
In the CTF, we hosted the router on a single EC2 instance (“VPN”) and scaled the instance up to sufficient size. This way the chance of the router quitting on us was reduced.
In the subnet configuration, the route for 0.0.0.0/0 was pointed at the ENI of the VPN instance. We disabled source/destination checking on the router, and made sure that the router had two network interfaces. (In hindsight, that might have been
The AWS Reachability Analyzer was crucial for traversing the nesting doll of security groups, routes, and so on.
One of my main learnings here: no matter how much you want it to, AWS will not route intra-subnet traffic through your appliance based on subnet route tables.
Why a Kubernetes cluster
Initializing the database
Deploying the gameserver
Deploying the checkers
Deploying the Windows checker
Deploying the TCP submission server
Deploying the network
Deploying the reverse proxy
Closing the network
Deploying the VPN
Deploying ticketer (again)
Fixing the checkers
Updating the scoreboard design
Fixing the VMs
Scoping-in Jeopardy server
Scaling up to 25 teams
We reset all the VMs on the night before the CTF, so that no logs or similar artifacts were left behind.
Resetting the VMs was simple: terminate them in AWS and re-deploy. At that point we had not figured out how to clear the EC2Launch metadata from the VMs, so we opted for the manual approach: @JaGoTu RDP’d into each box and deleted the EC2Launch artifacts.
We started the CTF by simply setting the gamecontrol times correctly. The first 5 ticks were failing for basically everyone, because of “usual Faust problems”.