I don’t think anyone in the Digital Forensics world would dispute that Python is the most used language in forensic programming today. In fact, many of its more fanatical followers frequently remind us of its ostensibly long list of superior characteristics, to the extent that I think people sometimes forget that other programming languages exist at all. Recognizing this, I knew I wanted to write a post discussing one of my favorite technologies (C# and .NET as a whole), and I could think of no better guest contributor to bring into that conversation than Eric Zimmerman, one of the most recognizable names in forensic coding, if not the most, and a staunch supporter of the tech.
Eric is the mastermind behind KAPE, Registry Explorer, JumpList Explorer, AmCacheParser, and so many more. Like many readers, I was introduced to Eric’s work early on in my forensics career – right at the beginning, in fact, as part of the curriculum of my “forensics 101” course at the Canadian Police College. I am honored to chat with him about one of my favorite subjects!
forensicmike1: Thanks so much for taking part in this conversation, Eric! I am curious to hear what brought you into the .NET world initially, and what it is that’s kept you there for all these years.
Eric Zimmerman: I initially started my development career in Access. When I outgrew that, I moved on to VB6 (way back in the pre-.NET days). Once .NET came out, I slowly switched to VB.NET because I already knew VB. I always wanted to do C#, but did not want to have to re-learn things, so I stuck with VB for a long time. In fact, osTriage v1 and v2 were both written in VB! Soon after osTriage v2 came out, I decided to force myself into C# for a few projects and I have never looked back from that point.
So for me, it is a matter of wanting to use a first class language on the platform I deal with the most, which is Windows. I am a big believer in the concept of doing Windows forensics on Windows, Mac forensics on a Mac, and so on. You are just asking for issues when you do not do things this way. For example, a very popular method for accessing volume shadow copies for Windows does not, at least in some cases, present the data for access the same way as native methods in Windows do. This leads to corrupt files being exported and obviously, that’s a problem when it comes time to process them. Does this happen all the time? No, but even once is enough that I would be hesitant to trust that method in any case that matters, unless I also verified getting the data in exactly the same way from Windows natively. At this point however, you are now doubling your work, so why bother with the non-Windows method at all?
I stay with .NET because it’s what I know and what works for a wide range of needs. I know it’s not going anywhere, and it has great IDEs and other resources for efficient development, debugging, logging, and so on.
The other huge advantage is its range of third-party controls for creating amazing graphical user interfaces (GUIs) that just do not exist anywhere else. Things like grids, tree views, and a ton of other controls I use in my GUIs aren’t available elsewhere, so I wouldn’t be able to write something like Registry Explorer in Python, and if I did, it wouldn’t do what it can do on the Windows side.
forensicmike1: I couldn’t agree more! And I’ve seen this happen over and over to people as they make their way to C#. Forensically speaking, can you think of any other advantages to writing code in .NET?
Eric Zimmerman: With .NET, I know the runtime I need is going to be in place by default — or will be in the vast majority of cases. I do not have to worry about making a self-contained executable, or not handling Unicode correctly, or not being able to install something where I need it.
Going back to what I said earlier, I feel you should do Windows forensics on a Windows box, so this makes things a lot easier for end users of my software. With my stuff you can download and unzip my programs on any machine and it will most likely work the first time without issues. This can be on a forensics box doing dead box work, or live response stuff against a running system in the field.
Speed is also a big thing for me. I tend to do a lot of work to tune my code so that it is, first and foremost, as accurate as it can be. Once this is done, I tune for performance. As the old saying goes, speed is fine, but accuracy is final. When you look at forensics programs written in other languages (Rust being an exception that comes to mind), the performance is often terrible and it takes a lot of work to get the environment ready to even run an application. Sure, the developer can do some work to package a Perl or Python program into a self contained Windows executable, but that process can be painful and it still does not address the performance issues. Can performant code be written in Python? Maybe, but it involves redoing parts in Cython, or writing critical sections in C++ and so on. So while it is possible, to me it’s just not worth it, especially in light of the issues I mentioned above. Getting accurate data is of course paramount, so even one time where you might not get that accurate data is one too many to take the chance.
When writing forensic tools that target Windows artifacts, what Windows does and says should be the target we aim for. If you can exceed what Windows lets you see and do, all the better. Shedding light on data in a different way is always a good thing, but not at the expense of excluding or missing things (or the risk of doing so).
At the end of the day, I would rather my code run amazingly well on one platform, than poorly on five platforms.
forensicmike1: Aside from not many people in forensics being familiar with it, can you speak to any disadvantages?
Eric Zimmerman: The funny thing about that is, most people are using .NET all over the place every day if they use a Windows box. Just because they may not be aware of it, does not mean it isn’t there.
I don’t really see any disadvantages for it in the tool chains I design and use, but obviously running .NET code on non-Windows platforms has been an issue in the past. This is becoming less and less of an issue with Microsoft becoming more involved in the open source world (remember that .NET Core is open source now), and this is furthered by being able to run PowerShell on Linux too.
So at some point in the not so distant future, the code I write would be cross platform (at least the CLI tools). In some cases, the code can already run on .NET Core and Standard. The big holdup for me personally in this regard is that .NET Core and Standard do not have a seamless way to make a single executable for each platform. I hate distributing 38 DLLs and the executable for programs, so until I can do this on Linux or a Mac the same way I can on Windows (i.e. giving you a single executable to run) I won’t be doing cross platform stuff full time.
For a lot of people, the biggest hurdle when it comes to using .NET is not a technical one, but rather a bias against Microsoft or Windows for some reason. Given how easy it is to stand up a VM these days, “I can’t run X because it is Windows only” just shouldn’t be a valid excuse anymore.
forensicmike1: Do you think programming is a legitimate specialization within the field of Digital Forensics, or is it something every examiner should at least dabble in at this point?
Eric Zimmerman: Well, I don’t know if it’s forensic programming that is a specialty, or the ability to program in a way that is necessary for use in the kinds of work we do in forensics that is more important. In other words, you do not have to be IN forensics to be able to look at programming in the way I am speaking of. What does this look like in practical terms? For me, it means failing early and often (i.e. NEVER, EVER eat error messages or other “unknown” conditions), programming defensively (i.e. protecting the end user from themselves to a degree), sanitizing input, providing the ability to see diagnostic and trace messages for debugging purposes, robust output options, and so on. (Forensicmike1: This is great advice and I hope some vendors are reading!)
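As a minimal sketch of what failing early, sanitizing input, and exposing diagnostics can look like in practice, here is a small Python parser for a made-up 12-byte record header. The layout, the `REC1` signature, the version numbers, and every name in it are hypothetical and chosen only for illustration; none of it corresponds to a real artifact or to Eric's own code.

```python
import logging
import struct

log = logging.getLogger("record_parser")

MAGIC = b"REC1"                   # hypothetical signature
HEADER = struct.Struct("<4sII")   # signature, version, payload length
KNOWN_VERSIONS = {1, 2}           # hypothetical supported versions


class ParseError(ValueError):
    """Raised when input does not match the expected layout."""


def parse_record(data):
    # Sanitize input: reject wrong types and short buffers up front.
    if not isinstance(data, (bytes, bytearray)):
        raise TypeError(f"expected bytes, got {type(data).__name__}")
    if len(data) < HEADER.size:
        raise ParseError(f"need {HEADER.size} bytes, got {len(data)}")

    magic, version, length = HEADER.unpack_from(data, 0)
    # Trace output for debugging; visible when the log level is DEBUG.
    log.debug("magic=%r version=%d length=%d", magic, version, length)

    # Fail early: never guess at unknown conditions or swallow them.
    if magic != MAGIC:
        raise ParseError(f"bad signature {magic!r}")
    if version not in KNOWN_VERSIONS:
        raise ParseError(f"unknown version {version}")
    if length > len(data) - HEADER.size:
        raise ParseError(f"payload length {length} exceeds available data")

    payload = bytes(data[HEADER.size:HEADER.size + length])
    return {"version": version, "payload": payload}
```

The point of the design is that every unexpected condition surfaces as an explicit error the examiner can see, rather than producing a plausible-looking but wrong result.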
Not everyone is wired to be able to program at higher levels and I am certainly no expert in the field. In fact, not even 10 years ago I started looking for a way to process LNK files natively in one of my live response programs. Looking at a LNK file in a hex editor, I said to myself “I would never be able to program something to read these things”, but now I have native parsers for just about every key Windows artifact out there, all of which I did in C#. I learned how to code and parse things partly out of necessity (the tools didn’t exist prior to my work) and partly because the existing tools did not do the job (incomplete, inaccurate, slow, etc.) and I thought I could do better. Of course, curiosity and wanting to solve a problem come into it too (I do not want to even think about how many hours I have spent looking at shellbags).
With that said, no one is expected to walk into DFIR and be able to write a forensic parser for an artifact on day one. In fact, most people just don’t have a reason to do so. It is certainly beneficial to have at least some level of proficiency with programming so you can whip up some code to automate the mundane though, so this is a good reason to at least get familiar with something like PowerShell, C#, Python, etc., even if it is limited to looping over thousands of log files looking for things and saving yourself the pain of doing it manually.
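The kind of small automation described here, looping over many log files looking for things, might look something like the following Python sketch. The function name `search_logs`, the `*.log` file pattern, and the UTF-8 encoding are assumptions made for illustration; real logs may need a different pattern or encoding.

```python
import re
from pathlib import Path


def search_logs(root, pattern, glob="*.log"):
    """Yield (path, line_number, line) for every line matching pattern."""
    regex = re.compile(pattern)
    # rglob walks the whole directory tree under root.
    for path in sorted(Path(root).rglob(glob)):
        # errors="replace" keeps undecodable bytes visible as U+FFFD
        # instead of aborting the run on the first bad file.
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")
```

Calling `search_logs("exports", r"login failed")` and printing each result gives a file, line number, and matching line for every hit, which is usually enough to decide where to look more closely.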
forensicmike1: In your view, are the major forensic software vendors doing enough to provide ways for established developers who do forensics as a primary job to integrate their creations? If not, any thoughts on what they could do better?
Eric Zimmerman: This is a tough one because of the different languages vendors write their programs in. Does a vendor use .NET, C++, or Delphi? Each in turn would have different ways for external users to hook into it when writing code.
My suggestion to vendors is to provide the ability to write plugins that can be used by the vendor’s product. X-Ways for example has an API that lets you write such things. Several of my tools do as well (plugins in Registry tools, maps in event logs, targets and modules in KAPE).
(Forensicmike1: Funny that the vendor that uses Delphi is also the only one who has done any .NET Plugin work!)
The other avenue is to come up with a non-programming means (or a balance of programming and non-programming) to interact with and extend programs. Things like maps in EvtxECmd or batch file mode in RECmd are good examples here. Both allow end users to wield the capabilities of tools and extend them as far as they see fit, all without me being involved.
I think the biggest benefit for end users is designing open ended and extensible tools that people can then take to places the developers never thought of before. It is pretty cool to hear about some of the use cases and ways people have put my stuff to use. They find all kinds of new uses and ways to do things I never envisioned when I designed the programs.
By doing this, it’s not about the author of the program anymore, but rather it’s about the end-user and making their job easier, the data more clear, the work more efficient, and so on. Letting the end-user reduce the noise in order to find the signal THEY want to find is what is important.
forensicmike1: Final word goes to you: any advice for up-and-coming forensic coders who may be hesitant to share their work with the world?
Eric Zimmerman: Throw that code out there! Remember, there will always be a first for everything and you were not good at anything the first time you did it (or even the first 100 times!). Put that work out there, get it into people’s hands, let them play with it, make suggestions, break it, and so on.
Do not let anyone tell you anything in this space is a “solved problem” because the best way by far to learn about an artifact is to write a parser for it. And you never know, you may just find long-standing bugs in major products that people have just taken for granted and assumed were right for the past 20 years.
Even if no one ever uses your code on a case, the fact that you created something from nothing is a great feeling. Seeing your code do what you intended it to do, seeing all your unit tests pass for the first time, seeing the output come out of a program you wrote from start to finish is a magical thing. It still excites me when I get into a new project.
Share that code, talk about that project, seek out the experts in your field to review and help and provide feedback. I cannot tell you how valuable peers are to bounce ideas off of, test things, and push my ideas to even better places. Two people (among many) that come to mind for me and have done these kinds of things hundreds of times for me over the years are David Cowen and Matt Seyer. Why are they in a position to do this? Because they too took that chance way back in the day to put out code, take a risk, be vulnerable, and EXPLORE THAT DATA in an effort to understand how it works, why it works, and the best ways to leverage that data to help us tell the story of what happened on a computer. As Matt and I like to say, “Every byte counts!” There is a reason for those bytes to be there. Seek to find out exactly why they are there.
So, in summation, my advice would be:
- Take calculated risks.
- Learn from your mistakes.
- Leverage peers.
- Move the ball forward.
- Leave things better than you found them.