writing safe code

January 27, 2016

What with all the hoopla these days about the Internet of Things and the simply horrible (horrible, I tell you!) lack of security in many of these devices, I thought I’d throw my two cents into the discussion about simple but effective things you can do to help make such devices more secure. No, really. I’ll start my pontificating off by doing what I do best, writing a small, sloppy C++ program to illustrate a point.

#include <cstdio>	// gets() is declared here; it was removed from the standards in C11 and C++14
#include <cstring>
#include <iostream>

int main() {
	char *inputBuffer = new char[10];
	char *inputBuffer2 = new char[10];

	std::cout << "Type something: ";
	gets(inputBuffer2);
	std::cout << "You typed 1st: " << inputBuffer2 << " : " << std::strlen(inputBuffer2) << std::endl;

	std::cout << "Type something: ";
	gets(inputBuffer);
	std::cout << "You typed 2nd: " << inputBuffer << " : " << std::strlen(inputBuffer) << std::endl;
	std::cout << "You typed 1st: " << inputBuffer2 << " : " << std::strlen(inputBuffer2) << std::endl;

	return 0;
}

The program prompts you twice for a string of text. It places each string into one of two character arrays, inputBuffer and inputBuffer2, using standard C’s gets(), one of the most unsafe ways to do this. As a consequence this program is Unsafe; build it with gcc/g++ on a system whose C library checks for this (macOS and the BSDs, for instance) and the resulting application will actually print “warning: this program uses gets(), which is unsafe.” when executed. Programmers who use standard C’s gets() are just begging to get mugged on dark streets in the big city.

The single biggest problem with this program is that it uses standard C’s gets() to read data and place it into the two arrays. Why is that a problem, you innocently ask? Both arrays are limited to 10 characters each, and one of those is needed for the terminating null. Standard C’s gets() doesn’t care one whit how many you enter. You can blow right past that limit by typing any string of 10 or more characters and gets() will happily put all that input into memory, starting with the zeroth character location of whichever array you handed it. Absolutely no attempt is made to limit input. Now watch what happens when my small sloppy example’s run.

Type something: first input string
You typed 1st: first input string : 18
Type something: SECOND INPUT STRING
You typed 2nd: SECOND INPUT STRING : 19
You typed 1st: ING : 3

Oh my goodness, what happened? Because of the way I allocated those character buffers, and then the way I used them, my second input string overwrote my first. Each array was allocated with only ten characters, so the 19-character second string ran off the end of inputBuffer and kept going into the memory where inputBuffer2 lives. Judging by the output, the two buffers sat 16 bytes apart: the 17th through 19th characters of the second string (“ING”) plus the terminating null landed at the start of inputBuffer2. Thus printing my first input string actually printed the last three characters of the second input string. The exact spacing is up to the allocator, so your results may vary. How should it look if it were safe? Let’s look at the same code, but written not nearly as sloppily. I’ll replace the bare allocated character arrays with instances of C++’s std::string and use C++’s getline() standard library call.

#include <iostream>
#include <string>

int main() {
	std::string inputBuffer;
	std::string inputBuffer2;

	std::cout << "Type something: ";
	getline(std::cin, inputBuffer2);
	std::cout << "You typed 1st: " << inputBuffer2 << " : " << inputBuffer2.length() << std::endl;

	std::cout << "Type something: ";
	getline(std::cin, inputBuffer);
	std::cout << "You typed 2nd: " << inputBuffer << " : " << inputBuffer.length() << std::endl;
	std::cout << "You typed 1st: " << inputBuffer2 << " : " << inputBuffer2.length() << std::endl;

	return 0;
}

Now let’s run that code and see what happens:

Type something: first input string
You typed 1st: first input string : 18
Type something: SECOND INPUT STRING
You typed 2nd: SECOND INPUT STRING : 19
You typed 1st: first input string : 18

Now the strings are behaving the way you’d expect them to. Both are read in and neither one overwrites the other. The key to this is reasonable resource management. I used the C++ string class to provide simple but effective character resource management. No matter how many characters I typed in at the prompts, both string instances would have provided the necessary space internal to each instance to hold those characters, without having to worry that one would corrupt the other. So remember folks, in spite of what Linus says about C++, it’s actually better to use C++ than bare-bones C, even if fgets() is better than gets() in standard C.
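For anyone stuck with fixed-size buffers, here’s a rough sketch of what the fgets() route might look like. The helper name readLineBounded and its discard-the-rest policy for overlong lines are my own assumptions for illustration, not part of the programs above; the point is only that fgets() is told how big the buffer is and gets() is not.

```cpp
#include <climits>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not from the original programs): reads one line into a
// fixed-size buffer without ever writing past its end.
// Returns false on end-of-file, read error, or an unusable buffer size.
bool readLineBounded(std::FILE *in, char *buffer, std::size_t size) {
    if (size < 2 || size > static_cast<std::size_t>(INT_MAX)) {
        return false;
    }

    // fgets() stores at most size - 1 characters plus the terminating '\0'.
    if (std::fgets(buffer, static_cast<int>(size), in) == nullptr) {
        return false;
    }

    std::size_t len = std::strlen(buffer);
    if (len > 0 && buffer[len - 1] == '\n') {
        // The whole line fit: drop the trailing newline.
        buffer[len - 1] = '\0';
    } else {
        // The line was too long (or input ended): throw away the remainder
        // so it does not show up as the next "line".
        int c;
        while ((c = std::fgetc(in)) != '\n' && c != EOF) {
        }
    }
    return true;
}
```

Called as readLineBounded(stdin, buffer, sizeof buffer) with a char buffer[10], typing SECOND INPUT STRING would leave just "SECOND IN" in the buffer: truncated, but nothing else in memory gets stepped on. You still have to decide what truncation means for your application, which is one more reason std::string is the easier path.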

I’m writing about this specific kind of programming because a lot of security exploits occur when resources aren’t properly managed and buffer overflows occur. Buffer overflows can result in everything from data corruption to the execution of arbitrary code, leading to privilege escalation and all sorts of mean nasty things.
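To make the data corruption case concrete without actually invoking undefined behavior, here’s a small sketch that replays the first program’s overflow inside a single array that the code legitimately owns. The function name replayOverflow, the 64-byte arena, and especially the 16-byte gap between the two pretend buffers are assumptions on my part; the gap is simply what the “ING : 3” output suggests, and a real allocator is free to do something else.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical replay of the first program's overflow. Both "buffers" live
// inside one arena, so every write stays within memory this function owns.
// ASSUMPTION: the buffers sit 16 bytes apart, as inferred from the output.
std::string replayOverflow(const std::string &first, const std::string &second) {
    constexpr std::size_t arenaSize = 64;
    constexpr std::size_t gap = 16;

    // Refuse anything that would leave the arena; this demo must stay legal.
    if (first.size() + 1 > arenaSize - gap || second.size() + 1 > arenaSize) {
        throw std::length_error("input does not fit in the demo arena");
    }

    char arena[arenaSize] = {};
    char *inputBuffer = arena;         // stands in for the first allocation
    char *inputBuffer2 = arena + gap;  // stands in for the second allocation

    // Unbounded copies, in the same order the sloppy program called gets().
    std::memcpy(inputBuffer2, first.c_str(), first.size() + 1);
    std::memcpy(inputBuffer, second.c_str(), second.size() + 1);

    // Whatever now sits at inputBuffer2 is what "You typed 1st" would print.
    return std::string(inputBuffer2);
}
```

Feeding it "first input string" and "SECOND INPUT STRING" gives back "ING", matching the sloppy program’s run. Feed it a second string shorter than the gap and the first string survives untouched, which is exactly why this kind of bug can hide for a long time.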

Good resource management can take you a long way in programming. Now if you can just remember not to hard-code back doors into the code…