Sunday, November 2, 2014

Security Analysis of a Glacier Backup tool for Windows

Ok, so my career has shifted more to mobile app development and security. I've been looking into ways to back up personal stuff on my non-work laptop, and I like the idea of using Amazon Glacier. It's cheap and reasonably available (if you can wait half a day while they fetch your archives, that is). Just what I was looking for!

I found a great Windows client for Amazon Glacier. It has all the features I need, but in this day and age, one has to be increasingly suspicious of apps that might also contain malware. So, in this post, I want to talk about some skills I've picked up in the last 10 years at my previous job that are paying dividends in checking software to determine whether I trust it or not.

Note that this is fairly high level, except for the parts pertaining to reverse engineering .NET code.

First step in malware analysis. Search online! Has anyone already done this? No? Ok, step 2...

Download the app. You need to set up a virtual machine to run it in so that you can isolate your "real" computer from any potential malware infection. There are numerous ways to do this, and I'll mention a few and let you research what works for you. If you can install a copy of Windows inside a virtual machine, then you can play with it there (using Hyper-V, VMWare, VirtualBox, etc). You can also spin up a Windows copy in the cloud (using Windows Azure, Amazon AWS, Rackspace, etc). That will cost some money, but is a simple way to get Windows if you don't have a license to activate a local copy in a VM. If you work for a software engineering firm, you can almost certainly get a Windows key through your MSDN subscription. Ask around in your IT department. Students, same goes for you at many universities. Also, look into the DreamSpark program.

Now, install the app on your isolated environment. If it's an MSI file, you can open that and inspect the various tables of custom actions, install files, registry changes, etc using a tool called "Orca" from Microsoft. If you have Visual Studio and/or the Windows SDK installed, you can probably find an MSI file for Orca already on your hard drive.

If the installer is not MSI based (or even if it is and you want to be paranoid or don't know enough about MSI technologies to benefit from pre-install static analysis), then fire up SysInternals' ProcMon. You can log all registry, file, and TCP/IP activity for any process on your system. If you run that while the installer does its work, you'll have a good idea of whether it's REALLY doing what it says it is doing. Plus, you'll know where on the system to find the installed app (it may not just be in Program Files, for example) to begin studying further.

At this point, my particular app was discovered to be a .NET Windows desktop application (get a good PE information utility to look at your file, and you can tell that way...or also if you open ILDASM and attempt to disassemble the code, it will inform you when it is not a .NET DLL/EXE file). So, I immediately opened ILDASM and tried to convert it from a binary to MSIL source code. I got this cryptic message saying "Protected module -- cannot disassemble". What?!? That was a new one to me. After some googling, I discovered you can set an attribute (SuppressIldasmAttribute) in your .NET assemblies that tells ILDASM not to disassemble them. Well...that keeps away the newbs, but all you have to do is undo that attribute and you'll be able to disassemble. Use some kind of PE file editor to zero out the attribute as described here.

With the disassembly, you'll likely notice that the code was obfuscated. This means that, during the build process, a tool was used to take the nice meaningful names that classes, methods, fields, etc. have in .NET and mangle them into cryptic one- or two-letter names. In other words, the .NET metadata is obscured so as not to let you infer meaning from a class named "InstallMalwareOnBackGroundThread". :) Instead it will be named "nca" or "pbe2" or something!

In addition to this, many obfuscation tools screw up the MSIL sequence so that instead of just loading a string on the stack, you load an int, call an obscure method that returns a string, and then call the actual method of interest. Plus dozens of other annoying things...

But for us, that's no problem. Although seeing the code would be nice, my main focus is to know what .NET base class libraries the tool interacts with. This is often enough to infer whether you can reasonably trust a program or not.

So, I searched the IL code for "call" and "callvirt" to make a list of functions called within the program. I then further narrowed that list down to things that started with "System." to know what base class libraries are used. I came away with this list:

System.Management (WMI)
System.Net (probably HTTP/TCP stuff mostly, plus transparent SSL support by .NET)
System.IO (file/disk)
Microsoft.Win32 (registry access)
System.Security.Cryptography (encryption used by the app)
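If you want to script that triage, something like the following works. This is just a sketch: app.il is a hypothetical stand-in for the real ILDASM output (which ildasm /out:app.il YourApp.exe would produce), so the snippet fabricates a two-line sample first.

```shell
# Create a tiny stand-in for real ILDASM output so the pipeline below is
# runnable; in practice you'd point it at the file ILDASM wrote for you.
cat > app.il <<'EOF'
IL_0001:  call       void [mscorlib]System.Console::WriteLine(string)
IL_0006:  callvirt   instance void [System]System.Net.WebClient::DownloadFile(string, string)
EOF
# Keep only the call/callvirt lines, then pull out the unique System.*
# namespaces those instructions target.
grep -E 'call(virt)? ' app.il | grep -oE 'System\.[A-Za-z]+' | sort -u
```

On the sample input this prints the namespaces System.Console and System.Net, one per line, which is exactly the kind of list I built by hand.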

Ok, those are all somewhat reasonable things for my Glacier app to do in certain cases. Let's make sure they're being used properly.

So, I launched Sysinternals' ProcMon and began monitoring file/registry/networking for my program (filtering to include processes that start with the program's name). I also opened Sysinternals' DebugView tool in Administrator mode, and enabled capture of global Win32 debug output. I opened the app, and gave code execution control to this new acquaintance of a tool! Shudder...

I then studied what happened at startup before I ever clicked around in the tool. A roaming profile was created, a log file was created, some non-threatening things were stored in the registry, and some polling thread kept looking for a queue.xml file on disk. Ok, whatever. I then attached to the process with WinDbg and ran the command ".dump /ma C:\Dumps\myfile.dmp". Then I shut the tool down, and shut down my monitoring tools.

The app's own log file had some nice helper debug messages for what it was up to, but of course those shouldn't be 100% trusted. I also looked in the queue file, but nothing interesting was there (presumably because I wasn't trying to download or upload to Glacier).

Next, I decided a good old fashioned string search on the file might be handy. But this one is obfuscated, so I decided to search the dump file instead. Dump files are disk copies of exactly how memory was in the process at the moment you executed the dump. So, I loaded the dump file in WinDbg, loaded SOS (I knew the app was 64 bit, because it was in Program Files and not Program Files (x86), and I knew it was .NET 2.0 from ILDASM's output) from Framework64\v2.xxxxx\sos.dll. Then, I ran "!DumpHeap -strings". That produces an output of about 2-3 thousand strings, truncated after the first 100 characters of each line or so. Perusing through these led to some interesting insights.

For example:
"Unable to decrypt your Access Keys (wrong password?)"
Hmm...apparently my credentials are protected by password using symmetric cryptography on disk somewhere.

The dump also contained the names of several crypto algorithms. Perhaps these are used to store the Amazon credentials? Kind of funny, because elsewhere in the tool there's an options screen where you can encrypt the files going to Glacier with AES-256, which is the best choice. Why use Triple DES, I wonder? Or freaking RC2? Really?

"SELECT * From Win32_OperatingSystem"
Obviously a WMI query for what version of Windows you're on...

"SELECT * From Win32_processor"
A WMI query for processor info (maybe used to properly thread and throttle your Glacier access?)

Other strings hinted that it may use WCF services...

One string referenced Dropbox. What the heck? That's nowhere in the UI.

Another string looked like a local SQL database connection. Perhaps it uses a local SQL DB? Hmm...
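As a cross-check on !DumpHeap -strings, a plain strings sweep over the dump file can surface the same kind of text, since .NET stores strings as UTF-16. A minimal sketch (the dump file name is a stand-in, and I plant a fake UTF-16 string so the commands have something to find):

```shell
# Plant one UTF-16LE string in a stand-in "dump" file; in practice myfile.dmp
# would be the file WinDbg wrote via ".dump /ma".
printf 'SELECT * From Win32_Process' | iconv -f ASCII -t UTF-16LE > myfile.dmp
# GNU strings with -e l decodes 16-bit little-endian text; grep for whatever
# looks interesting (URLs, SQL, "password", etc.).
strings -e l myfile.dmp | grep -i 'select'
```

This finds raw string data anywhere in the process image, not just live objects on the managed heap, so it can turn up things !DumpHeap misses (and vice versa).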

I decided that SSL might get in the way of good information gathering, so I opted for making a new file named "myapp.exe.config" containing a System.Net tracing configuration. This lets .NET log traffic to a file on disk, BEFORE SSL kicks in! Sweet! I could see everything going to/from Glacier and validate that only what I wanted was going across the wire.
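For reference, a typical System.Net tracing configuration looks roughly like this (a sketch from memory rather than my exact file; the listener and log file names are arbitrary):

```xml
<configuration>
  <system.diagnostics>
    <sources>
      <!-- Trace System.Net activity, including hex dumps of payloads -->
      <source name="System.Net" tracemode="includehex" maxdatasize="1024">
        <listeners>
          <add name="System.Net"/>
        </listeners>
      </source>
    </sources>
    <switches>
      <add name="System.Net" value="Verbose"/>
    </switches>
    <sharedListeners>
      <!-- Write the trace to a plain text file next to the exe -->
      <add name="System.Net"
           type="System.Diagnostics.TextWriterTraceListener"
           initializeData="network.log"/>
    </sharedListeners>
    <trace autoflush="true"/>
  </system.diagnostics>
</configuration>
```

Because the trace hooks in at the System.Net layer, the payloads are logged before they are encrypted on the way out (and after they are decrypted on the way in).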

At this point, I knew what kinds of things I wanted to look for, so I continued to fine tune my ProcMon filters for file/registry/network activity to alert me of unusual things and filter out expected things. Continuing to click throughout the app, I came to the conclusion I could trust this one. Here are my general findings.

The only web URLs it used were directed to Amazon AWS IP's or DNS names.
The log file only sent data I told it to over to Amazon.
My credentials were stored with...adequate, TripleDES crypto.
WMI was used appropriately, changing behavior based on Windows 8 vs 7, or single/multi core.
The local SQL usage was never found. Maybe the app doesn't use a SQL DB? This was a risk point, but one I decided was worth taking.
The app never contacted Dropbox. I have no idea why that string is in the code. I'm sufficiently convinced the app won't upload my keys to the developer's Dropbox, so I'm going to let this one go.

So in all, the app passes! And I have some peace of mind now about using it. At least far more than I originally did. I hope this is helpful to you. What RE tools do you use to measure whether you trust an app or not? I'd love to hear your thoughts in the comments!

Saturday, December 7, 2013

Off-topic: Wiping your hard drive

I want to document how I did this in case I need it in the future. This is a good way to wipe an entire drive quickly with any Linux LiveCD or USB stick. It only uses a program called dd.

On most computers, the primary hard drive is /dev/sda (older IDE-era systems may show it as /dev/hda). So, to wipe it, issue this command:

dd if=/dev/zero of=/dev/sda bs=32M

Note that 32MB is a good buffer size for most modern drives. I'm getting 100 MB/s wiping with this. For a 500GB drive, that works out to roughly an hour and a half.

You can optionally check for a status in a separate terminal tab like so:
while true; do kill -USR1 1234; sleep 5; done

Replace "1234" with the PID of dd (which you can get with a "ps -A | grep dd"). When you flip back to the tab where dd is running, you will get a status update every five seconds. It tells you how much data has been copied, as well as your bytes / sec copied.
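On newer systems there's a simpler option: GNU coreutils 8.24 and later let dd report its own progress, so the kill -USR1 loop isn't needed. A sketch against a harmless scratch file (for a real wipe you'd point of= at the disk itself, which is destructive):

```shell
# status=progress makes dd print bytes copied and throughput as it runs.
# scratch.img is a stand-in target; substitute of=/dev/sda (DESTRUCTIVE!)
# to actually wipe a drive.
dd if=/dev/zero of=scratch.img bs=1M count=8 status=progress
```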


Wednesday, July 31, 2013

Off Topic: Supporting Javascript you didn't write

Hi everyone,

Today's topic doesn't deal with .NET / CLR at all, which is what this blog tends to focus on. But since most .NET developers I know tend to use JS/HTML/CSS with .NET (or ASP.NET), this post should still be beneficial to the "usual" reader of this blog.

I am a support engineer. That means I often get asked to fix code I didn't write. If it's .NET code, I usually turn to debugger breakpoints and/or tracepoints (good info about those here if you are not familiar) or occasionally use a .NET profiler. The first goal behind these tools is often to answer the question "where am I in the code base?" By knowing what code at the function or line level is involved, you can narrow down what things might be broken as you troubleshoot.

And honestly, that is the name of the game with all debugging / troubleshooting, regardless of language or environment. We have a product with a fairly large javascript code base that is difficult for me to keep up with, because I only find myself troubleshooting problems related to it once every 3-6 months. As a result, I often have a large "learning curve" at the beginning of my support issue, as I find out what has changed and what has stayed the same in the code base. Tools like Firebug for Firefox or Chrome's JS debugger are certainly helpful, but sometimes it takes a lot of effort just to figure out what parts of the code base are involved when you click a button, or go to page xyz. You can always view the HTML and see what JS is running, but if the product is designed with a lot of architectural "fluff", you have a hard time deciphering what exactly is going to happen when you click the button. As an example, maybe the button runs a JS function named processCommand() that takes a GUID: processCommand('12345-123-123-1234...');

Well that's just not helpful. :) How do I know what that GUID is related to? In our product, it changes every time you hit the page! Ugh.

Enter JSCoverage, a great tool I recently found that can be used to identify functions (or even lines of code) that execute whenever you navigate somehow in your product. This manual page gives you a rundown of how it works. It essentially boils down to these high level steps:

1) Instrument one or more .js source files. These are output in a folder of your choosing.
2) Make sure your output folder is hosted by a web server (such as IIS, i.e. mapped to a virtual directory, or in a subdir of a virtual directory).
3) Navigate to the generated jscoverage.html file.
4) Use the Browser tab to launch your app in a separate window or tab.
5) The jscoverage.html tab lets you view live information about coverage as you perform actions in the app launched in step 4.

The architecture is fairly simple. There is a global object named _$jscoverage that contains counters for various JS source code files and different blocks within those files.


All of the counters start at 0 and go up from there. The great thing about this architecture is that you can exploit it in a way that the original author may or may not have intended! The _$jscoverage object is only initialized once. Subsequent loads of the page or JS actions/functions that execute just accumulate more hit counts on the counters. They don't clear out. QA people who are interested in *total* code coverage probably like this, but a support developer may be more interested in precise coverage of one or two behaviors instead of all behaviors performed in the web browsing session. So, we can use Chrome's Console to execute some ad-hoc javascript that clears the counters, like so:

for (var propName in _$jscoverage['my-source-file.js']) {
    _$jscoverage['my-source-file.js'][propName] = 0;
}

Using this technique, we can clear the counters immediately prior to performing our action of interest. Then, once the action completes, switch back to the jscoverage pane to view coverage data for just that action. You may need to navigate off and back onto the Summary tab to see the results update.

For me, this technique greatly simplifies the support process. Now after reproducing my issue, I know that the problem lies within 10 functions instead of 1000 potential functions! That's a money maker, folks. Now, I know where to place breakpoints via Chrome's JS debugger to get the most mileage out of my support discovery process! Some days I just love my job. :)

Many thanks to the developers of JSCoverage.

Thursday, July 18, 2013

System.Xml ImportNode and the .NET Heap(s)

If you work with XML in .NET a lot, you have probably bumped into a scenario a few times in your career where you need to move or copy some XML fragment out of one document and into another. The typical coding sequence for that might look something like this:

// Sample documents so the snippet is self-contained
XmlDocument source = new XmlDocument();
source.LoadXml("<source><myxml>hello</myxml></source>");

XmlDocument dest = new XmlDocument();
dest.LoadXml("<dest/>");

// Move the fragment from source to dest
XmlNode n = source.SelectSingleNode("/source/myxml");
XmlNode nImported = dest.ImportNode(n, true);
XmlNode nAppended = dest.DocumentElement.AppendChild(nImported);

I have often wondered (but, until today, been too lazy to investigate) why ImportNode() and AppendChild() return XmlNode objects, and what one is supposed to do with them. Are they the same object? Clones? Etc.

Well, IronPython happens to have this nice feature where it shows you the memory address of the .NET object you reference in an expression via the ipy console.

Excerpt from an IronPython Console session

Wednesday, August 1, 2012

RedGate activation blues

So, my company makes use of the RedGate SQL Compare SDK tools. And let me just say, if you haven't bought these yet...totally worth doing. If you've ever wished for SQL functionality related to diff, merge, etc on your schema structures or your data itself, RedGate can do it. Period. I drink their kool-aid.

But this week, I hit a snag as I upgraded to a new developer laptop. It came time to re-activate my RedGate DLLs so that I could use them in development projects in Visual Studio. Typically, you simply compile the project, and a little dialog box asking you to activate your trial install of RedGate pops up while Visual Studio builds. Our projects wouldn't popup the dialog box. Instead, we received a build error in the Error List window:

Error 2: Exception occurred creating type 'RedGate.SQLDataCompare.Engine.ComparisonSession, RedGate.SQLDataCompare.Engine, Version=, Culture=neutral, PublicKeyToken=7f465a1c156d4d57' System.IO.FileNotFoundException: Could not load file or assembly 'file:///C:\Src\...\Bin\RedGate.Licensing.Client.UI.resources.dll' or one of its dependencies. The system cannot find the file specified. (C:\Src\...\licenses.licx, line 2)

("Real" paths omitted to protect what I work on)

So, I searched for this satellite assembly on my entire hard drive. It was nowhere to be found! Now what do I do?

I had never used RedGate support before, so I decided to give it a try. I e-mailed, and explained my predicament. Within 24 hours, I received a response from their internal support team.

"Thanks for contacting Red Gate. The issue you've run into is that the Windows Form that asks for a serial number was written against .NET v2 and therefore needs to be invoked using a .NET v2 resource reader. There is no satellite DLL that is missing.
To make a long story short, create a new VS project and ensure the target Framework version is 2. Then add in a sample application and build it. The serial number dialog should pop up and ask for your serial number. Once you have activated successfully, you can then go back and compile your .NET 4 projects, since the licence has been created and there is no more need to ask for the serial number again."

Sure enough, my code was based on .NET 4 and therefore was not allowing RedGate's licensing mechanism to work properly. As suggested, I used a .NET v2 project to activate the copy. Another coworker pointed out that their installer places some samples at this path:

C:\Program Files (x86)\Red Gate\SQL Comparison SDK 10\Samples\Automating SQL Compare

So, I opened one of those, built, was prompted for my license, activated, and the sample built successfully.  I then opened my original project (.NET v4), and compiled it successfully! Yes!

So, I was unable to find much online about this issue. I wanted to blog so that others might experience less pain. This is for RedGate v 10.x, and possibly other versions.

Wednesday, July 13, 2011

ScintillaNET starter kit

I have searched long and hard across the Internet for a simple "hello world" style example of using ScintillaNET's control. This thing has it all. You name the language, and it will give you a textbox that can syntax highlight for that language, plus dozens of other features like code folding and so forth.
But how to use it? For those of us without much "unmanaged code" experience, the Scintilla web site's documentation can be very intimidating. ScintillaNET's documentation (at least at this point in time) is fairly thin. It simply says "for the details, look at Scintilla's documentation". Today I discovered the reason for that--it's actually surprisingly easy to use the ScintillaNET control! With a very flexible licensing agreement, it's a wonder this thing hasn't gone mainstream in the .NET community. Perhaps with this simple example, I can get that started.

I used Visual Studio 2010 to do this, but any version for .NET 2.0 or higher should suffice.

Step 1): Get the source code, build ScintillaNET, and reference ScintillaNET.dll from a Windows Forms project. (Note: WPF may or may not work with the WindowsFormsHost XAML tag, which allows Windows Forms controls to be hosted in WPF. I haven't tried that.)

Step 2): Go to your visual designer, and look in the toolbox. There is a control named "Scintilla". Drag that to your form.

Step 3): You'll need to tell it what language to use for syntax highlighting. Scintilla supports a TON of languages.
In my case, I was making an XML editing control, so I chose to set the language to "xml" in my form's onload event (note, in this code snippet, scnMain is the name of the Scintilla control I dragged onto the form):
private void ctlScintillaNETXml_Load(object sender, EventArgs e)
{
    scnMain.ConfigurationManager.Language = "xml";
}

Step 4): Set or get scnMain.Text as appropriate to get a string of code (or in my case, XML markup). That's it! Fire it up and watch the awesomeness that is Scintilla. How easy was that? Enjoy.

Friday, December 3, 2010

Powershell Functions - Evil Calling Convention Problem

In powershell, one might think that calling a function would look like this:

FindReplaceMany_Directory($SubDir, $fileType, $recursive, $findArray, $replaceArray)

given a function like this:
#Find replace files of a certain file type in a given directory.
#$recursive = true to parse sub-dirs as well
#$fileType = "*.*", "*.txt", etc.
function FindReplaceMany_Directory($dir, $fileType, $recursive, $findArray, $replaceArray)
... (omitted) ...

This is not the correct syntax, but surprisingly it still "works": PowerShell treats the parenthesized, comma-separated list as a single array expression, so the whole array is passed as the first argument. That means only the first parameter ($dir) gets a value (the array!), and every other parameter gets $null.

To properly call this function, do this:
FindReplaceMany_Directory $SubDir $fileType $recursive $findArray $replaceArray