Argonne National Labs Cyber Defense Competition 2017 Debrief
This post highlights my contributions to Iowa State University’s 2017 team for the Argonne National Labs Cyber Defense Competition.
I was drawn to compete at the Argonne National Labs 2017 CDC because it was based off of the CDC format hosted at Iowa State University, it sounded fun to visit and compete at a national lab, and I thought it would be a good opportunity to try out some defensive strategies I never got a chance to test before my switch to participating on White and Red teams at Iowa State. I joined a team with five other undergraduate students and was nominated as team captain. Two undergraduate members were experience CDC participants and the remaining three had no prior experience competing. None of us knew each other when we signed up to compete in the competition.
The competition scenario had a SCADA spin to it and involved managing a physical power and water Industrial Control System (ICS). Along with SSH access to the Raspberry Pi that controlled a water pump and some LEDs to simulate the ICS, we were given remote access to a network of virtual machines including a Windows domain controller, mail server, web server, and file server. Aside from the mail server we were not allowed to change or reinstall any of the operating systems, but we were allowed to add additional virtual machines and services.
We replaced the mail server and added a firewall, DNS/NTP server, and log monitoring server. Our firewall guy did an awesome job segmenting the network into three DMZs based on High, Low, and Trusted risk levels. We later added a fourth DMZ for a Remote Desktop Server, which is not captured by this diagram.
My strength has always been securing whatever web applications were thrown into the mix for a CDC. This competition had two web applications that I took responsibility for securing.
We were given a Drupal website with a custom theme, LDAP user authentication, SAMBA file browser, and a help desk support ticketing system that was all running on an old Debian server running a MySQL database. Every CDC I like to take the opportunity to learn a recent technology. This CDC I thought it might be fun to try using Docker to sandbox the exposed services on the web server. This way if an attacker compromises the website and obtains remote code execution he still needs to figure out how to escape from the virtualized Docker machine.
My motivation to using Docker for the web server was also because it offered a fun loophole to the competition rules. I wanted to migrate the MySQL database to an SQLite database because SQLite is simply a file and I could remove another attack surface by disabling MySQL services. The rules, however, stated that we could not upgrade any software on the system aside from minor versions. Since the installed version of PHP was too old to support the Drupal’s requirements for SQLite database support, I was in a bind. Since we could add new software and “change how user’s access our services” installing Docker as another level of abstraction allowed me to host the web server inside a modern OS with the latest software dependencies.
After a few minutes of looking at the provided Drupal installation, I found multiple file and directory permission issues as well as one obvious and one less obvious C99 shell. I decided to not take any chances. I made a backup of the MySQL database and committed the original installation files to a private Github repository and setup a local virtual machine for development purposes.
In my development machine I installed Docker and used the community recommended Dockerfile to install a fresh Drupal installation. With the help of a project I found to port MySQL to SQLite and some manual work to migrate the MySQL database to SQLite and configure Drupal to use the migrated database. Next I systematically search for and installed the latest versions of each Drupal module that was running on the original Drupal installation. This was actually a lot of work. I fixed the remaining Drupal migration issues and then Google’d for a list of recommended Drupal security modifications. I added session limits, session timeouts, fine grained access controls, and general web application security improvements.
The scenario provided several user roles in the company so I created a new access control group for each role and scoped permissions accordingly.
Finally, with the magic of Docker I committed my finished Docker container I built locally, pushed the changes to DockerHub, and pulled the docker container down on the competition web server. After running both versions side by side for a while and feeling confident of my Dockerized version of the web server we disabled the MySQL and Apache services. Another team member went so far as to recompile and harden the kernel so we wouldn’t be vulnerable to the DirtyCow attack that affected our kernel version.
The web server host mounted a SAMBA share from the file server, which we had to use to host a copy of our White and Green team documents through the website. Since these documents contained our network configuration and user passwords (a competition requirement), we decided to take extra measures to protect it. To make matters worse the mount point contained the plain text password of the file share, so if an attacker could access the file system he could potentially pivot his attack into the file server. Since the web server only needed read only access to the mount we decided to mount the share in the web server host machine and then use Docker volumes to add the read only mount access to the web server. This way if an attacker gained access to the file system, he would be sandboxed in the Docker machine without any ability to write to the file share or access the host file system.
HMI Web Application
We were given a Python Flask Human Machine Interface (HMI) web app to control the power and pump ICS running on a Raspberry Pi. The application contained no authentication whatsoever and contained a status page that must be publicly accessible by unauthenticated users. I created two roles (water and power) and added a login and logout route to the application. The login route queried against LDAP to authenticate and assign the user’s role to a session. Every endpoint except the status and login page was protected by an authentication session that would timeout after two minutes of inactivity. As an extra measure, we assigned every Green team user (company employee) a four digit pin number along with their badge. Using Twilio we created a toll-free phone call-in service where users could enter their badge number followed by their PIN number followed by the # key for a time-based access token that was required to complete a login. Based on the user’s role they could access the Power, Water, or Kill Switch (which required both roles).
(click to view album)
While I was able to start working on the competition in my free time right away, that was not true of all my team. The Green and White team documents were due 1.5 weeks before the competition. All together, we had about four weeks to setup, which sounds like plenty of time, but the first week collided with midterms for most team members. The second week was Spring break and most team members (including myself) were traveling. Another experienced team member was involved working as White team on two other CDCs at Iowa State University and didn’t have time to work on the competition until the day before the competition started. While some of the inexperienced members were available to work earlier we had difficulty coordinating times when experienced members could help mentor. Up until the day before the competition the network was setup almost exclusively by our extremely talented and ambitious firewall guy and myself.
Last Minute Preparations
The remote setup for the competition ended on Wednesday the week before the competition and reopened on Friday afternoon before the competition on Saturday. We left Iowa State University at 1:30PM and arrived at Argonne National Labs around 7pm after checking into our shared team AirBnB residence.
Argonne National Labs kicked everyone out at 9PM so we went to grab dinner and game plan. At this point we still had no working mail server and several of our services were offline. After dinner, we found a McDonald’s and took over their poor little WiFi network. After a while we stayed until 2AM when the WiFi shutdown and decided to head back to the AirBnB. Working very quietly so as not to wake up our hosts, we worked through the night. Nobody had more than 2 hours of sleep in total.
We were required to make printed copies of Green team documents (which included passwords) for Green team to use. This really bothered us, so we ended up adding a Remote Desktop Server (RDP) to our network in the middle of the night and created a separate “Green team bootstrap document” that provided instructions for Green team members to login to the RDP server, where a complete digital copy of the Green team documents was provided.
About twenty minutes before the competition all of our services were working and showing up Green on the scoreboard. That’s when the trouble started…
As soon as the rest of the competitors showed up our web server became really slow and connections kept timing out. We saw that the requests were getting through the firewall and the web server was even responding but often the request would timeout (but not always!). Anyway, long story short, we never quite figured out the root issue to this problem. We had several theories from the unmanned switches Argonne provided to Chrome caching SSL certificates. After reconfiguring our network for the better part of the morning we eventually resolved the problem, but the damage was done.
For the better part of the morning Green team could not access our website and we lost a lot of points in our Green team usability score. Funny enough Red team seemed to have no problem accessing our services the entire day and we had a near 99% uptime for all services according to the service scanner (we accidentally rebooted the mail server for a tweak right before a service scan).
At the end of the competition we ended in 6th place out of 15 teams. The top teams (including the winning team I believe) had their sites defaced and had other compromises, but we were never compromised once. The worst thing that happened was that I forgot to disable account registration on the Drupal website, so Red team created a few new users, but those users had no privileges and that was the end of it. I disabled account registration and blocked extra users Red team created. Red team threw a lot of hail Mary autopwn like attacks at us, but of course nothing out of the box was going to work (we had already scanned our network and fixed known attacks).
The Red team score was worth 50% of the competition points and we got 90% of those points. We also scored well on White and Green team documentation as well as the intrusion reports due every other hour. We did well on all the anomalies, except one. There was a timed anomaly to turn off the water pump, which we did not complete in time. In the end, the Green team usability and our missed water pump anomaly is what killed us.
We missed 60 points for an anomaly to turn off the water pump, which if we had gotten would have put us in first place by itself. This was due to a combination of us having a little too much security (we had to run to get the Firewall guy from his car while he tried to take a 15 minute nap because he was the only one with the SSH keys to log in and the web UI was currently timing out to the previously described issue). Once we got in we found that Raspberry Pi’s GPIO pins had become unmapped from the application because we did not have a shutdown hook to call the
GPIO.cleanup() function. Since we had force quit the application earlier in an attempt to fix our network issues, we got the pin mappings out of state. The application still runs fine, but the Raspberry Pi will silently ignore all signals to the GPIO pins! Dope! I think normally I would have caught this during my development process, but I had mocked out the GPIO pins so I could develop it locally (since I didn’t have access to the Raspberry Pi setup directly).
If we could have solved our networking issue earlier or have finished the water pump anomaly in time, I think we would have won the competition for sure, but this isn’t the reason we lost. We lost because we did not prepare early enough. Every CDC that I have won, my team has won because we completed our services we built our services far in advance and were able to test the system over and over until we knew it inside out. This year everything came together last minute and it was a blast, but we were tried and prone to making mistakes.
I was incredibly proud of how the team came together in the last moments, despite our inability to get started as a team early. As things came together the inexperienced members were forced into the flurry and really ended up pulling their own weight. The RDP and mail servers were setup in the middle of the night each by Freshman team members with nothing but persistence and Googling on McDonald’s terrible WiFi and both servers were completely untouched by Red team at the end of the competition.
We walked a very fine line between security and usability during this competition. In the past, my team has always error on the side of usability and had just enough security to stop Red team. However, this year we went to the other end of the spectrum and arguably had too much security. This is a tricky balance and could be what stopped us from winning if out network issues were really our fault.
As for all my personal contributions, it seems my extra security measures were unnecessary. Red never got any of the credentials to users that could operate water or pumps and the only user that could operate the kill switch was fired within the first half hour of the competition. Green team didn’t operate the pumps or power generator in the Green team checks so the phone system went unused the entire competition. Red also never even came close to getting control of the Drupal site, so the Docker container was basically just overkill.
All in all it was a fun competition and I’d like to come back next year to win it.
Thoughts on the CDC Scenario
I was a bit disappointed in CDC’s anomalies. There were very few anomalies and they were all simple. In fact, the first anomaly was just to fire the only user that had access to the kill switch feature on the HMI (which I was more than happy to do). To me anomalies are about forcing Blue teams to make a mistake, so why not add another user or a service that causes Blue teams to rush and misconfigure the network? The anomalies were also weighted poorly in my opinion. One anomaly was worth 40 points to add 200 LDAP users and another had 60 points to shutoff a water pump, however shutting off the pump is arguably much easier. Both of these tasks had huge weights compared to other anomalies, so failing either was a game changer.
I felt Green team really should have been operating the power and water pumps (not Blue via the pump shutdown anomaly), but Green team checks appear to stop just short of checking if the pumps are active. I also felt anomalies were not very verifiable. For instance, what if we had just unplugged or rebooted the Raspberry Pi to shutdown the pump? Our team played very honest and as close to the rules as we could, but we could have played dirty to win. Another example was an anomaly that was due right at the end of the competition to add 200 users from a spreadsheet to LDAP and the anomaly was checked by logging into the website (which uses LDAP). If the anomaly was to add users to LDAP (which is an internal network service) then White team has no tangible way to validate LDAP was ever updated. Drupal has a user import module that can import a spreadsheet of users. The LDAP Drupal module has an option to enable failover authentication by looking in the database if the LDAP fails.
As for the scenario itself, I felt the user roles were incredibly vague. The Prime Minister of some foreign country?? had a login account, but it’s unclear that they should have had access to do anything at all. Same goes for the intern account. Clarifying with the White team on IRC didn’t help much either. There seemed to be a too many cooks in the kitchen problem where many of the volunteers were not involved in creating the scenario or were having more fun trolling participants…which is fine, but I am only concerned that if the interpretation of the scenario is left completely to the Blue team then the Blue team will just make the most conservative choices possible in order restrict access to the network. Is it fair to another team if they make more realistic interpretations than a team that just wants to lock everything down? The Green team checks usually clarify this to Blue team indirectly, but since the Green team checks were so shallow in this competition (ex: go to a web site, click a link, download a file) there was no penalty for making unrealistic assumptions.
There were also some rules that really didn’t make sense to me. For example, the Green team documents had to have user passwords and had to be shared on a public file share accessible through our website. We ended up just encrypting the documents with the employee assigned PIN number scheme we devised and making the link to the file share displayed conditionally based on the role of the user. However, we also had to print the Green team documents and give to Green team, but Red team later admitted to walking past and stealing the documents since attacks on Green team were allowed in this competition. Our “bootstrap documentation” protected us here, but seemed like an unrealistic challenge that needed to be solved, especially since White team leaked our passwords to Red team during the competition anyway (good thing we took extra steps to jail each user).
Finally, I felt the HMI application really missed an opportunity to do something interesting. First let me say that in my opinion the web application is the only real practical entry-point into the network for Red team to attack. I say this because almost every CDC uses off the shelf components for all services except web applications. There is room for Blue teams to leave a stock service misconfigured, but these settings are easily Google-able and any average Blue team will be able to lock the services down. This leaves custom code (such as a web application) that Blue teams must take special care to secure. However, in this CDC the main web application was Drupal, which falls into the stock off-the-shelf software category. The second web application (HMI) was quite small (only 177 lines of code) and only contained one vulnerability, which was just a complete lack of user authentication. So, for Red team, there was really very little attack surface area for any team that knew the basics of how to secure their network. Even if there was something juicy to attack, I found it odd that the competition didn’t have any flags placed on Blue team machines that Red team was supposed to capture.
The HMI web application operations consisted of turning power on and off and turn water on and off. That’s it. The logic was so small it wasn’t much effort to rewrite the whole thing in one sitting. Why not have more power and water pumps. Should power affect water? What happens if I increase power a bit, could it affect some other system? I’m imagining two Raspberry Pi’s. One that has some interesting model of a complex system and controls the physical pumps and LEDs, and one that we have access to. The code that we have access to should just control the GPIO outputs that physically feed into the inputs of the second machine running a simulator of a complex model that ultimately controls the outputs of the LEDs and pumps. The Drupal website had complexity in its code for sure, but Drupal is such a well-known system that it is easy to secure. The web application needs to unique (probably homebrewed), provide complexity, challenge participants to think about authentication roles, and provide ample attack surface area to Red team. The HMI application did none of these things.
All things considered, the Argonne National Labs CDC was very exciting and attracted competent teams to compete against. It’s only the second year the competition has been running and has shown enormous growth already. I’m really looking forward to what comes next, since there is still room to grow and the organizers seem serious about pushing the competition even further.
- We used Slack to communicate outside of email and to coordinate tasks during the competition.
- We used Google Docs for collaboratively editing reports, updating passwords, sharing larger files, etc.
- We used GitHub for versioning web application code changes
- We used DockerHub for versioning website container
- Our AirBnB host was awesome cost us $22 a person for the night
- McDonald’s won’t physically kick you out of the lobby if you just stay there all night moochin’ WiFi and not ordering a single item…eventually they just shut off the WiFi.
Documents and Resources
Our Green team and White team documents as well as our Intrusion Reports are linked below.
- White Team Documentation
- Green Team “Bootstrap” Documentation
- Green Team Documentation
- 10AM Intrusion Report
- Noon Intrusion Report
- 2PM Intrusion Report
- 4PM Intrusion Report
The version controlled web application source for the Drupal site and the HMI application is available at: https://github.com/benjholla/anlcdc2017-web.
The Drupal Docker image is available at: https://hub.docker.com/r/benjholla/anlcdc2017-web. Note that we did turn off some additional services in the Docker image as well as install SSL certificates but I never committed that version to DockerHub.