You may have seen a few more geek notes on here of late. I’ve really enjoyed jumping into CTFs. My objective isn’t to win, but to find more ways to solve puzzles.

This weekend’s adventures were a little different, though. My company sponsors UMBC’s CyberDawgs team, and they’ve asked us to contribute challenges to their upcoming CTF. I tasked our IRAD team with coming up with a few and I wrote a couple, as well. So this weekend I spent some time normalizing our submissions’ README files and doing a final test of the submissions.

One of the submissions was really giving me trouble. The IRAD team member who’d developed it had demonstrated it to us, but the solution instructions in the README just weren’t “clicking” well enough for me to reproduce a solve, much less help anyone else understand how to solve it. It’s customary in CTFs to have a Discord channel where mentors can offer assistance to those on the right track; given that I don’t want to be up all night providing that support myself, I thought it best to write a walkthrough for someone else to use.

Not only did I “crack” it (helped, of course, by the solution instructions in his README), but then I was able to provide a linked reproducible recipe using a tool called CyberChef that is really useful for a lot of CTF grunt work. I’m avoiding linking to the recipe or giving any more info on the challenge, of course, given that there’ll be hopefully lots of folks taking a crack at it in early May. I’m now more confident, though, that there may be some folks who solve it AND I better understand a particular kind of encryption approach.

I gave a talk in November to a local high school about computer science as a career field. Aha, I think – I’ve given this talk before – I’ll just brush up my well-prepared slide deck.

My slide deck has a graphic in it that looks something like the below. All credit to Daniel van der Ende and his work on the GitHub Data Challenge in 2014. It’s an interesting way to show the various combinations of languages that are used in projects today. It’s actually common nowadays that a project has multiple types of code in it. Often there’ll be the front-end (often JavaScript + HTML + CSS) with some sort of back-end. The point I wanted to convey in the original presentation was that software engineers often don’t just need to know one language. I then would riff lightly on which of the languages they could see in my slide I’d worked with in some form or fashion. (In the snippet you can see of the image, Perl, Scala, Go, JavaScript, Ruby, and Lua. I did just enough of CoffeeScript to not want to do it anymore…)

Well, now it’s 2021. The slide information needs to be updated, and Mr. van der Ende has not updated his image, but he was kind enough to make available his source code and a handy README file which walks (loosely) through how to get the data.

Challenges solved so far:

  • getting access to BigQuery
  • finding new sources of the data, since the dataset van der Ende references doesn’t seem to exist anymore
  • making BigQuery convinced that I have permission to run queries
  • updating the query to match the new data source, including figuring out how to flatten arrays – really not in his original flow
  • downloading MySQL to my developer machine and setting up a database and username/password combo
  • updating van der Ende’s code to read directly from a CSV, rather than assuming I’m using a JSON file
  • getting PHP to work on my developer workstation – this particular box has done lots of things for me lately, but PHP hasn’t been one of them
  • figuring out how to populate the languages list the code asked for, given the languages represented in the dataset I downloaded. (For the record, awk, sort, uniq was the happy combo.)
  • uh, figuring out a better way to ingest the CSV, since pulling in the full file at once took up too much memory for my computer
  • (more to come undoubtedly to get it working…)
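For the languages list, the awk, sort, uniq combo can be sketched roughly as below. This is a minimal sketch under assumptions that are not from the original export: the CSV has a header row, fields are plain comma-separated with no quoted commas, and the language name sits in column 2. The function name and the file name in the usage note are hypothetical.

```shell
# list_languages CSV_FILE [COLUMN]
# Print the distinct values found in one column of a simple CSV file.
# Assumes a header row and no quoted commas (hypothetical layout).
list_languages() {
  local csv_file=$1
  local column=${2:-2}
  # awk reads the file a line at a time, skips the header (NR > 1),
  # and emits the chosen column; sort + uniq collapse the duplicates.
  awk -F, -v col="$column" 'NR > 1 && $col != "" { print $col }' "$csv_file" \
    | sort | uniq
}
```

Usage would look like `list_languages repo_languages.csv 2 > languages.txt`. Because awk handles one line at a time, the same pattern is also a gentler way to walk a big CSV than reading the whole file into memory.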

Note: I ultimately ran into enough things with it that I left the original image. Still on my todo list to bring this to resolution…

My masters classes keep sending us into Wireshark to analyze packet files. I thought I had a decent understanding of how to use Wireshark from some previous experience through work, but I keep finding new tricks as I try to figure out things about unknown protocols. Note that I’m using Wireshark 3.0.3, because that’s what’s installed in the lab infrastructure. I am aware that Wireshark 3.4 is out: my plan is to play with that version on my personal computer to see new goodies.

Copy and Paste

We keep needing to fill out spreadsheets of interesting things learned. We’re running Wireshark through a VDI infrastructure and I’m typically doing my homework on a laptop, so with limited screen real estate, even my touch typing skills aren’t helpful enough. The Copy capability in Wireshark lets me capture just the value for the field – highly useful for things like MAC addresses.

Protocol Hierarchy

Forget about randomly traversing files which include 100K packets – let the protocol hierarchy show likely interesting data points within the file. Filter by said protocol, and data patterns emerge. Also worth calling out: the Conversations and Endpoints statistics areas. They’re nice ways to get a holistic view of what’s going on in the file and what might be worth diving into.

Statistics -> …

We’re looking at SCADA pcap files, including BACnet. Delighted to find a traversal means for BACnet that let me inspect the devices and services seen in the pcap. I was less happy to see that iFix wasn’t in the list, and that Wireshark just treats it as plain TCP (again, with my older version of Wireshark, with its default set of dissectors, etc). Possibilities for expansion.

Expert Analysis

There’s a menu option for ‘Expert Analysis’ that I hadn’t played with before. Add its data, and then allow it to create filters to show just that data – voila. Evidence of TCP retransmissions? Yes, please.
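For anyone who would rather script these views than click through them, tshark (the command-line tool that ships with Wireshark) can produce similar output. This is a sketch, assuming tshark is installed and on the PATH; the function names and the capture file name in the usage note are placeholders of mine, not anything from the lab.

```shell
# protocol_hierarchy PCAP
# Print the protocol hierarchy statistics for a capture file.
protocol_hierarchy() {
  # -q suppresses the per-packet listing; -z io,phs requests the hierarchy.
  tshark -r "$1" -q -z io,phs
}

# tcp_retransmissions PCAP
# List only the packets that Wireshark's TCP analysis flags as retransmissions.
tcp_retransmissions() {
  tshark -r "$1" -Y 'tcp.analysis.retransmission'
}
```

Usage would be along the lines of `protocol_hierarchy capture.pcap` followed by `tcp_retransmissions capture.pcap`. The display filter is the same one the GUI applies when you filter on a retransmission entry.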

My masters class had us writing Yara rules for our project lab. Given that I recently gave a talk at DataWorks MD that took a brief foray into describing the use of Yara rules for static malware analysis – well – I was prepared for and looking forward to this particular lab.

The challenging part of the lab: to help us understand how analysts decide which byte(s) to check for hex strings, the lab had us use the Linux utility hexeditor. As instructed, we were to

  • sudo hexeditor
  • use the keyboard’s arrows to navigate into a particular file
  • press Ctrl-W to invoke ‘search’
  • use the arrows to navigate to the hex search option, as opposed to text search
  • type in the appropriate hex string. Note: the hex string could be longer than the editor would show us in its entry window. With a long enough string, we were then working blind with typos
  • if the hex string was found, jot down at what byte position so that we could later use that in our Yara rules

Bleah… Too many opportunities for typos. Too slow, as we needed to iterate across five files. _Really_ too slow when you consider we were doing this in a VM hosted on university infrastructure, using its GUI via NoMachine.

Improvement 1: sudo hexeditor filename at least got me into a particular file, and importantly, let my file history show me what files I had already interacted with.

I then looked for command-line options to target hexeditor with a search string. That would at least let me repeat previous commands and edit the filename or the hexstring. Unfortunately, hexeditor doesn’t support anything of that sort. grep would apparently have gotten me to whether the pattern existed in the file, but not given me the byte location.

Long-ish story short, although the lab itself had no reason to cause me to do this, and it certainly took me longer to work this out than to just hand jam it, I now have scripts to iterate over a set of files and a set of hex strings to determine if the hex string is represented in the files, and if so, where. My geek demon is satisfied this evening, and I’m holding onto the files here to help in CTFs or other future geekish fun. Credit to here for the general approach for finding hex data locations in files, and here for helping work out the problem of iterating over lines that contain spaces.

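#!/bin/bash

# test_hex_find.sh
# Examine file for hex value
# Argument 1: file name to check
# Argument 2: hex string to look for (space-separated bytes, e.g. "C6 45 F4")

# Dump the file as one long line of space-separated hex bytes (3 characters
# per byte), then take the character offset of the first case-insensitive match.
position=$(od -v -t x1 "$1" | sed 's/[^ ]* * //' | tr '\012' ' ' | grep -b -i -o -- "$2" | head -n 1 | sed 's/:.*//')

if [ -n "$position" ]
then
  # Each byte occupies 3 characters ("xx "), so divide to get the byte offset.
  position=$(( position/3 ))

  echo "filename: $1, hex value: $2"
  printf '%06X\n' "$position"
fi

#!/bin/bash

# find_hex.sh
# Run test_hex_find.sh for every .exe in the current directory against every
# hex string listed in hex_strings.txt.

# xargs -n1 strips the surrounding quotes and emits one hex string per line;
# mapfile (bash 4+) keeps each line, spaces included, as one array element.
mapfile -t hex_strings < <(xargs -n1 <hex_strings.txt)

# Show the hex strings that will be searched for.
for hex_string in "${hex_strings[@]}"; do
  echo "$hex_string"
done

for file in *.exe; do
  for hex_string in "${hex_strings[@]}"; do
    ./test_hex_find.sh "$file" "$hex_string"
  done
done

hex_strings.txt (one quoted hex string per line):
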
"C6 45 F4 74 C6 45 F5 6C C6 45 F6 76 C6 45 F7 63 C6 45 F8 2E C6 45 F9 6E C6 45 FA 6C C6 45 FB 73"
"8A 04 17 8B FB 34 A7 46 88 02 83 C9 FF"
"5C EC AB AE 81 3C C9 BC D5 A5 42 F4 54 91 04 28 34 34 79 80 6F 71 D5 52 1E 2A 0D"
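To sanity-check the divide-by-3 arithmetic, here is a compact variant of the same idea wrapped in a function, so it can be tried on a tiny file with known contents. It is a sketch rather than a replacement for the scripts above: the function name is mine, and it uses od's -An option to drop the offset column instead of stripping it with sed.

```shell
# hex_offset FILE "HEX STRING"
# Print the decimal byte offset of the first match, or return 1 if none.
hex_offset() {
  local pos
  # od -An prints only the hex bytes; tr squeezes newlines and repeated
  # spaces so every byte becomes exactly "xx " (3 characters).
  pos=$(od -An -v -t x1 "$1" | tr -s ' \n' ' ' | sed 's/^ //' \
        | grep -b -o -i -- "$2" | head -n 1 | cut -d: -f1)
  if [ -n "$pos" ]; then
    echo $(( pos / 3 ))
  else
    return 1
  fi
}
```

For example, after `printf 'ABCDEFGH' > sample.bin`, running `hex_offset sample.bin "44 45"` looks for the bytes for "DE". The match starts at character 9 of the hex dump, and 9 / 3 gives byte offset 3.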

Yeah, this kind of joke is just my kind. Thank you, Ian Coldwater, for enlivening my day. Thinking about posting it at work, too.

As the leader of our Women In Tech group for work, I particularly appreciate the pun-blaming on MOM! All the better that it’s the capitalized, exclamation-pointed version.

While my thoughts are fresh on my latest CTF…

Pluses:

  • Throughout the event, in top 3. Currently in top 2, but closing out for the day to get other things done.
  • Figured out a few things: interrogating VMDKs via extracting them; linking up a shared drive in Kali
  • Had some success with Python scripting to interrogate Word documents to find hidden data, as well as to find MD5 and SHA-1 hashes. The SHA-1 grep pattern was: '[0-9A-Fa-f]{40}'
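As a rough sketch of how that pattern can be used from the shell: the helpers below read text on stdin and print anything that looks like a hash. The function names are made up for illustration, and the 32-digit MD5 variant is my extrapolation from the SHA-1 pattern above. The -w flag is an addition so that a 40-digit run inside a longer hex string is not reported.

```shell
# find_sha1: print every standalone 40-hex-digit token read from stdin.
find_sha1() {
  grep -E -o -w '[0-9A-Fa-f]{40}'
}

# find_md5: same idea for 32-hex-digit tokens (the length of an MD5 digest).
find_md5() {
  grep -E -o -w '[0-9A-Fa-f]{32}'
}
```

Usage would be something like `strings suspicious.docx | find_sha1`, where the file name is a placeholder. A match only means "this looks like a hash"; it says nothing about what was hashed.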

Need to learn:

  • reverse engineering to interrogate malware or other executables
  • faster ways to traverse Wireshark data. Getting protocol statistics is a good starting point – want to get better there
  • executing random files – need VMs stood up for Windows to have them ready to roll…

Hmmm – I thought the CTF was closing out tonight, but it’s not until Sunday night. I need to tread carefully here, for the sake of my health and marriage.

Folks who are paying closer attention to this blog than it warrants may have caught notice of a link in the left navigation to a ‘Kubernetes 101’ presentation. That link came about when I was asked a year or two ago to give a presentation at work on Kubernetes. I built the presentation deck based on a presentation I’d put together at a previous company which they were kind enough to give me access to again, and THAT presentation was a recap of some training materials I’d built out for a customer. So, I’ve gotten to present on Kubernetes a few times.

I’m now on my third project making use of Kubernetes, or k8s for short. The first go-round, I helped developers understand how to deploy things to it and someone else stood up and maintained the cluster. The second project, I built tools (“operators”) to run within k8s, as well as built scripts that automated the deployment of our clusters. This go-round, we’re using a new k8s distribution, with its own tooling for deployment and administration, and part of my role is to figure out whether our team found all the bits I’d been able to turn on in previous installations. (Auditing, for the record, is a good thing…). With each new project, k8s has matured and my angle for working with it has changed, so I get to learn and try new things.

That’s generally how software and systems development works… no one (or at least, vanishingly few) ever really knows a tool or language inside and out completely, particularly in connection with its full ecosystem. I’ve gotten to write Golang, Ansible, and Java (via k8s’s client-sdk). I’ve used REST APIs invoked via curl or hit the same endpoints using kubectl and its command syntax to interrogate k8s internal state. I’ve figured out how to query Prometheus using PromQL, and then how to interact with a time series database to which we’d exported the Prometheus data. Oh, and with each new release of k8s (they’re about to release 1.18), the capabilities and APIs change.

I got to interview an internship candidate today, and she (yay!) asked me what sorts of things you have to know to be a good candidate for our company. I told her a few of the technologies our current interns are using, but tried to make clear that the biggest thing about a career in technology is that you have to keep learning. That you have to keep humbly realizing you don’t (and can’t!) know it all. That you keep plugging away at deepening and widening your experience. That sometimes your experience tells you to bring in someone whose breadth and depth hits the problem from a different angle than your own.

Today was a fun day. Can’t wait to see what projects 3, 4, … and n, in k8s or other things, bring my way.

I spent most of last weekend at my alma mater, UMBC. Friday night, I met some new mentees through the CWIT mentoring program, and Saturday and Sunday were spent at HackUMBC. So, lots of opportunities to observe undergraduates in action and answer questions about what sorts of things my company does and who we hire.

The hackathon was a very interesting experience for me. Participants got started after lunch on Saturday and turned in their projects Sunday at 1. There was no guidance on what to build or who to build it with, other than that teams could consist of 1-4 participants. There were a few prizes offered by sponsors such as ourselves that a team could go after – ours was for best data visualization but others sought best hack using Docker, best use of public financial data, or best use of Google Cloud Platform, just to name a few. There was nothing stopping a project from applying for multiple categories: I know we saw a project for our data visualization judging that used financial data and Docker containers – not sure if they hosted anything on Google Cloud Platform.

The goal of a hackathon isn’t only to win prizes, of course. It’s also supposed to give teams a chance to learn and apply new skills. The team that won our prize used Unity, a gaming engine. Other teams used d3.js or plot.ly or Google Maps + some HTML or even Minecraft (linking directly to that project – innovative idea). Some teams got farther than others: one team had a great concept and a locally installed Jupyter notebook (via Docker, if I remember correctly: check off a potential prize category) with a well-built out machine learning model that they could reason about and defend. But they just hadn’t gotten to hooking up their prototype UI to their data. Another team had a drop-down list to trigger a visualization, but could only as yet talk to their concept of the visualization. That didn’t win them our prize, but still gave those teams a good bit of interesting experiences to talk to us about.

Remember, these students had 24 hours to bring together a team, put together a project concept, and then execute on their concept. Now, I know practically that some of these folks team regularly together. And at least one team indicated they’d been scraping Twitter data ahead of the event to give them a leg up on building out their display that needed geo-located tweets. Still, though: I saw team formation happening in the hackathon Slack channel and at the tables in front of our sponsor area.

What was more amazing to me was that a few teams came up to our table and asked for my guidance on what tools to use. Some of that happened late in the afternoon on Saturday. Meaning, they were picking their toolkits on the fly, and then building out their app without prior experience in at least portions of the stack. For a project that had a hard timeline, though admittedly loose requirements. Wow – the very thought gives me the shudders, were I in their shoes. Uh, I’d want to form my team knowing that folks had complementary skills that could come together to solve a generic set of problems. One team told me they didn’t know how to interact with databases and knew they wanted one, so they coded up a flat file database on the fly. I have to believe I’d have taken a different route, but kudos to them for pulling something off with it.

I’m trying to imagine how to use that hackathon idea for an event at my company or through BWIC. I’d have a hard time personally carving out a full weekend: attending the event during the day was a big enough lift, but many of the students stayed overnight. One indicated to me she’d had a great idea and burst of energy after her 20 minute power nap. Ugh. Been there, done that, don’t wanna go back! But maybe spreading it out over a week would work. Or constraining it to a day. It just looked like so much fun!