The Slightly Disgruntled Scientist

...now 7% more viral!


Valgrind and GDB: Tame the Wild C

One thing I get asked a lot — almost daily, in fact — is: hey, why are you so amazing


…at identifying bugs related to undefined behaviour in C?

The answer is simple: by using Valgrind and GDB!

This tutorial is all about using Valgrind as part of your development workflow. Valgrind is an amazing tool for debugging, and I’ll start off by showing you what it actually does just as a standalone tool. From there I’ll show you how to use it in a systematic way to find errors via a debugger. Finally, you’ll see how you can actually add it to your code, so that you can catch runtime errors that might otherwise be concealed by the logic of your code.

So if, like me, you spend most of your waking life writing, maintaining and debugging embedded C code, then it’s time for you to crack open a console and put on your learning hat, and discover a few tricks that will make your life a great deal easier.

What are all these tools and concepts?

What is undefined behaviour?

This post assumes a basic level of knowledge about C, the standards that govern it, and the concept of undefined behaviour... but if you're new to these concepts, here's a quick summary and some references.

Unlike other languages (for example, Java), C programs are not required to keep runtime information about array bounds or whether memory accesses are valid. Neither are they required to initialise data to default values (except in very specific cases). If a programmer is not diligent about these things, their program can do something that is completely invalid: that is, undefined behaviour.
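To make the point about array bounds concrete, here is a minimal sketch. The helper name `checked_read` is my own invention for illustration, not something from a standard library. Because C keeps no record of an array's length at runtime, the only way to refuse a bad index is for the caller to pass the length along and check it by hand.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: C stores no length alongside an array, so the
 * caller has to supply it. Returns false instead of reading out of bounds. */
bool checked_read(const int *array, size_t length, size_t index, int *out)
{
    if (array == NULL || out == NULL || index >= length) {
        return false;   /* refuse rather than invoke undefined behaviour */
    }
    *out = array[index];
    return true;
}
```

A plain `array[index]` with `index >= length` compiles without complaint and may even appear to work, which is exactly what makes this class of bug so unpleasant.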

Undefined behaviour in a running C program means nothing less than: it is no longer possible to reason about your program. It simply isn’t. Cries of “but it caaaaan’t be doing that!” or “shouldn’t x just be the last value?” or “I didn’t even have monkeys living in the server to begin with!” mean nothing in the face of undefined behaviour.

This makes debugging very, very hard.

If you want to know more about undefined behaviour, refer to:

What is Valgrind? What is Memcheck?

Valgrind is not a single tool, but rather a set of tools for checking memory errors, cache usage, heap usage and other runtime behaviours, usually in C programs. This post focuses on Memcheck, a tool for identifying invalid or incorrect use of memory (stack or heap).

I am actually going to use the terms "Valgrind" and "Memcheck" interchangeably, since Memcheck is the default tool Valgrind uses when you run the command valgrind. Just be aware that there are other tools in there too.
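If you want something small to point Memcheck at, the sketch below is one option. The function name `duplicate` and its `buggy` switch are assumptions of mine, added only so the broken path can be turned on deliberately. With `buggy` set, the allocation is one byte too short for the string's terminating `'\0'`, so `strcpy` writes past the end of the heap block.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo function. Copies `text` into a fresh heap buffer.
 * With `buggy` non-zero it forgets the byte for the terminating '\0',
 * so strcpy writes one byte past the end of the allocation. */
char *duplicate(const char *text, int buggy)
{
    size_t size = strlen(text) + (buggy ? 0 : 1);
    char *copy = malloc(size);
    if (copy != NULL) {
        strcpy(copy, text);   /* invalid heap write when buggy != 0 */
    }
    return copy;
}
```

Call `duplicate("hello", 1)` from a `main`, build with debug symbols (for example `cc -g`), and run the result as `valgrind ./your_program`. Run natively, the program will most likely look fine; under Memcheck the out-of-bounds write should be reported along with the stack trace that led to it.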

The bugs that I’ve found using Valgrind are the worst of the worst, straight out of the C hall of shame. We’re talking about bugs that:

  • only appear on one person’s machine
  • seem to happen randomly, even in the same environment
  • don’t cause crashes, just give you the wrong output
  • crash, but the stack trace looks totally wrong (How did it crash there? I changed code somewhere else entirely!)
  • only occur at certain optimisation levels
  • only occur with newer compiler versions
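As a sketch of how a bug ends up on that list, consider the following. `make_counters` and its `buggy` flag are hypothetical names used only for this illustration. `malloc` leaves the contents of the block indeterminate, while `calloc` zero-fills it, so code that assumes the counters start at zero is only correct on the `calloc` path.

```c
#include <stdlib.h>

/* Hypothetical demo function. Returns an array of `count` counters.
 * The buggy path uses malloc, whose contents are indeterminate; the
 * correct path uses calloc, which guarantees zero-filled memory. */
int *make_counters(size_t count, int buggy)
{
    if (buggy) {
        return malloc(count * sizeof(int));   /* contents NOT initialised */
    }
    return calloc(count, sizeof(int));        /* contents are all zero */
}
```

On the buggy path, fresh memory often happens to be zero, so everything appears to work until a different machine, allocator, or optimisation level hands back recycled memory instead. Memcheck tracks which bytes have been initialised, so it can flag the point where such a value goes on to influence a branch or the program's output.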

Valgrind works by running your executable on a synthetic processor, and whichever tool you’ve selected inserts its own instrumentation code as it runs. You don’t need to recompile with Valgrind, or link with special libraries, or even run debugging builds (although it’s almost always the case that you should). Valgrind runs are significantly slower than normal runs though: about 50 times slower.

But since it cuts your debugging time down by a factor of about a thousand, it’s probably worth it.

The reality of basic science: technology is not alive

This is a partial rebuttal of Matt Ridley’s The Myth of Basic Science, which makes the argument that technological progress is not driven by publicly funded scientific research (and presumably that we therefore don’t need it). I would like to focus on the claim that technology is akin to a living thing, and that because it is alive, it will inevitably progress whether basic science is funded or not.

Because that is bizarre.

For example, Ridley claims that:

technology is developing the kind of autonomy that hitherto characterized biological entities

No, it’s not.

Technology will find its inventors, rather than vice versa.

What does this even mean? What is the process by which this occurs? This really is starting to seem like personification taken way too literally.

By 2010, the Internet had roughly as many hyperlinks as the brain has synapses.

Rocks have many more atoms. Mycoplasma genitalium have many fewer genes. So what?

a significant proportion of the whispering in the cybersphere originates in programs […] rather than in people

None of that is occult, or beyond explanation, or even unexpected. Feeling mystical about programs you don’t understand doesn’t mean they’re anything like a living thing.

(Also, “cyber” — drink!)

Please, oh singularity, save us all from science writers and economists harping on about the “evolving living organism that is technium.”

Technology, even considered as a discrete entity, however you’d define it, is not alive. No, I don’t have a definition of “life.” You don’t either. But whatever it might be, it won’t include (a) rocks, (b) things made of rocks, (c) really intricate things made of rocks, or (d) abstract concepts.

Yes, I sometimes personify technology. No, that doesn’t mean I secretly think it’s alive.

Emergence

The concept Ridley is groping towards is that of emergence. Emergence happens when a system with simple rules and massive numbers of participants shows complex behaviour at a higher level. The behaviour of the system may be unpredictable and yet show little pockets of order (in short periods of time, or over short distances). Sometimes these pockets are ordered enough that we can model them with a new set of laws that have little to do with the microscopic ones… but we must always remember that we are still dealing with order emerging from chaos.

Board games, spots on a leopard, mathematics itself, Conway’s game of life, and the weather are all examples of emergence. So is the entire universe, since it’s made up of simple particles obeying simple rules, and yet shows every class of complex behaviour we know about, a lot of which we can simplify when we need to.
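Conway’s game of life is small enough to show the whole idea in code. What follows is only a sketch: the fixed 5x5 board, the choice to treat everything off the edge as dead, and the names `live_neighbours` and `life_step` are all assumptions of mine. The entire rule set is the one line that decides `next[r][c]`; everything interesting the game does comes from applying it over and over.

```c
#include <string.h>

#define ROWS 5
#define COLS 5

/* Count the live neighbours of cell (r, c). Cells off the board count as dead. */
static int live_neighbours(unsigned char grid[ROWS][COLS], int r, int c)
{
    int count = 0;
    for (int dr = -1; dr <= 1; dr++) {
        for (int dc = -1; dc <= 1; dc++) {
            int nr = r + dr;
            int nc = c + dc;
            if ((dr != 0 || dc != 0) &&
                nr >= 0 && nr < ROWS && nc >= 0 && nc < COLS) {
                count += grid[nr][nc];
            }
        }
    }
    return count;
}

/* Advance the board by one generation, in place. */
void life_step(unsigned char grid[ROWS][COLS])
{
    unsigned char next[ROWS][COLS];
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            int n = live_neighbours(grid, r, c);
            /* Birth on exactly 3 neighbours; survival on 2 or 3. */
            next[r][c] = (n == 3 || (n == 2 && grid[r][c])) ? 1 : 0;
        }
    }
    memcpy(grid, next, sizeof next);
}
```

Seed the middle row with three live cells in a line and call `life_step` repeatedly: the line flips between horizontal and vertical forever. Nothing in the rule mentions oscillation; it simply falls out.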

Life itself is an example of emergence, but here’s the important point: not all examples of emergence are alive.

Rubbish Review Debut: The Noontec N5 NAS

I recently became the proud owner of a Noontec N5 network attached storage (NAS) enclosure. I bought it because I needed:

  1. Network access to the contents of a large hard drive.
  2. USB access to the contents of a large hard drive.

It’s hard to tell where to start with this amazing device, so let’s go with the all-important first impression. Nothing says factory quality control quite like a few dead cockroaches stuck to a random sticky pad inside the enclosure. From that point on, I knew I was in for a treat.

The cockroaches could not be removed. It’d probably void the warranty anyway.

Network setup

It assembled fine, so I powered it up and connected it to my network. It then insisted on hijacking my router’s IP address, acting as a DNS server, and generally screwing up my entire network. Seems reasonable. In order to access it I had to remove it from my network, connect a Linux box directly via ethernet, use ifconfig/route/etc to manually set up network access to it, and then configure it to not be monumentally stupid.

Easy as.

Then it was time to set up SMB. Seemed to go easy enough: my Mac machine could connect, my Windows 8 machine could connect, my Linux machine… not so much. I progressed through using smbclient, mount.cifs, and eventually even Wireshark to figure out what the problem was. You might think, “well, Linux has never been great at SMB, of course you need to do some work there.” But hold your judgement until you hear the problem: to authenticate SMB connections, the N5 uses NTLMv1. NTLMv1 has a number of terrific vulnerabilities that could be exploited by a 13-year-old with a graphics calculator, so NTLMv2 was created in 1996 to address some of these issues. The N5 does not support NTLMv2. That is, the N5’s level of network security predates Internet Explorer v3.

No matter. I’ll just explicitly downgrade my security settings. Cool.

Side note: the N5’s web interface exposes all passwords in plain text. Super useful feature that.

During this process, by the way, I contacted Noontec for help. They have a website, of course: the support email address listed there is for another company, and they offer firmware downloads off Dropbox. Seems legit. When I contacted them via this address, they suggested I start by updating the firmware, and sent me a link to do so. The firmware completely changed the branding of the box (as reported by the web UI and network protocol responses). I initially worried about the potential for malware, but realised that even running a botnet off a NAS could only improve the functionality of the N5.

So now I can check off item one on my list, and all it took was manual network routing and byte-for-byte packet inspection. On to item two: USB access!

Qanda's Razor

Imagine this situation: you and someone else, perhaps a friend or relative, are on different sides of a political issue. You both go to watch a debate, or panel discussion, or some similar public forum. You hear both sides argue their cases and it gradually occurs to you that it’s really one sided. The organisers have picked speakers for your side whose expertise isn’t really relevant, or who don’t really know what they’re talking about, or can’t really articulate a case.

You come out of the event, ready to say this, but your companion tells you first: “that was really one sided. They really set up my side to fail.”

How could both of you feel this way? Is it necessarily the case that one of you is right about the debate having an agenda, and one of you is wrong?

Well, maybe, sometimes. But I think we leap to this conclusion far more often than it actually applies. I think what’s also likely is something I call the Qanda Illusion.

The illusion

The Qanda illusion applies to a situation where a debate, forum, discussion etc. features such poor presentation of both sides of the argument that people on either side will see bias against them.

This happens because there’s a fair chance you don’t know the opposing case as well as your own. You’ll notice every time your own advocates screw up, but you won’t notice the other side’s omissions. You will assume that the other side is fully utilising the chance to present its best arguments. You’ll hear the same arguments you’ve rebutted in your own head a thousand times, and wonder why no one on your side is addressing them.

But anyone on the other side of this will see exactly the same problem applied to their case!

The illusion is that although the debate appears to be skewed, there is no bias. There is only the pretence of evidence or other information and a failure to deliver all around.