How to Dereference a nullptr without Crashing or Why You Should Check Pointers for nullptr
It “Works”
Let’s say there is a factory in your code that produces objects based on the requested type:
enum FOO_TYPE {
    TYPE_A = 0,
    TYPE_B
};

class FooFactory {
public:
    std::unique_ptr<Foo> createFoo(const FOO_TYPE type) {
        switch (type) {
            case TYPE_A:
                return std::make_unique<FooA>();
            case TYPE_B:
                return std::make_unique<FooB>();
            default:
                return nullptr;
        }
        return nullptr;
    }
};
This looks pretty legitimate and does the job. If an incorrect type is provided, the factory returns nullptr.
The requested type may come from a network buffer: some legacy code in your code base reads an integer and converts it to FOO_TYPE. For some reason, it doesn’t validate that the integer value actually corresponds to one of the FOO_TYPE enumerators.
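The missing validation step can be sketched as a small conversion function. The name toFooType and the 32-bit wire integer are assumptions made for illustration; the original legacy code is not shown.

```cpp
#include <cstdint>
#include <optional>

enum FOO_TYPE {
    TYPE_A = 0,
    TYPE_B
};

// Hypothetical helper (not part of the original code base): converts a raw
// integer read from the wire into FOO_TYPE and rejects any value that has
// no matching enumerator.
std::optional<FOO_TYPE> toFooType(const std::int32_t raw) {
    switch (raw) {
        case TYPE_A:
            return TYPE_A;
        case TYPE_B:
            return TYPE_B;
        default:
            return std::nullopt;  // unknown value: the caller decides what to do
    }
}
```

With a check like this at the boundary, an out-of-range integer never reaches the factory as a FOO_TYPE in the first place.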
Let’s take a look at our Foo, FooA, and FooB implementations:
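// Caveat: Foo has no virtual destructor, so destroying a FooA or FooB
// through std::unique_ptr<Foo> is undefined behavior as well; production
// code should declare one.
class Foo {
public:
    Foo(const int f) : foo{f} {}
    int getSomeValue() const { return 42; }
private:
    int foo;
};

class FooA : public Foo {
public:
    FooA() : Foo(314) {}
private:
    int bar{7};
};

class FooB : public Foo {
public:
    FooB() : Foo(27) {}
private:
    std::string baz{"some data"};
};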
Using inheritance just to share data isn’t the best pattern, but it still shows up in the wild.
Factory client code can look like this:
FooFactory f{};
std::cout << "TYPE_A: " << f.createFoo(TYPE_A)->getSomeValue() << std::endl;
std::cout << "TYPE_B: " << f.createFoo(TYPE_B)->getSomeValue() << std::endl;
std::cout << "TYPE_?: " << f.createFoo(static_cast<FOO_TYPE>(TYPE_B + 1))->getSomeValue() << std::endl;
When the input is correct, everything works as expected:
TYPE_A: 42
TYPE_B: 42
TYPE_?: 42
But wait… Something is wrong. Shouldn’t we get a crash here because of the nullptr dereference?
f.createFoo(static_cast<FOO_TYPE>(TYPE_B + 1))->getSomeValue()
No, It Doesn’t
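The Foo code contains one detail that is easy to miss: the Foo::getSomeValue() method always returns a constant value and never uses this to access data members. Formally, calling a member function through a null pointer is undefined behavior, so nothing guarantees a crash; here the compiled code simply never reads through the pointer.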
int getSomeValue() const { return 42; }
Assembly generated for this function:
Foo::getSomeValue() const:
        push    rbp
        mov     rbp, rsp
        mov     QWORD PTR [rbp-8], rdi
        mov     eax, 42                     ; <-- the constant goes straight into the return value
        pop     rbp
        ret
One small change, accessing the data member, makes it crash:
int getSomeValue() const { return foo; }
After this change, the expected crash appears:
Program returned: 139
Program stdout
TYPE_A: 314
TYPE_B: 27
Program stderr
Program terminated with signal: SIGSEGV
Let’s take a look at the assembly this time:
Foo::getSomeValue() const:
        push    rbp
        mov     rbp, rsp
        mov     QWORD PTR [rbp-8], rdi
        mov     rax, QWORD PTR [rbp-8]      ; rax = this (nullptr for the invalid type)
        mov     eax, DWORD PTR [rax]        ; <---- member access: reads foo through this
        pop     rbp
        ret
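Because the crash depends entirely on what the member function happens to compile to, the call site cannot rely on it to surface the problem. A minimal sketch of a checked call follows. The helper describeFoo is hypothetical, the factory is reduced to a free function for brevity, and the fixed underlying type on the enum and the virtual destructor are assumptions added so that the sketch itself is well-defined.

```cpp
#include <memory>
#include <string>

// Assumption: a fixed underlying type, so an out-of-range integer cast to
// FOO_TYPE is a well-defined value (the original enum has none).
enum FOO_TYPE : int { TYPE_A = 0, TYPE_B };

class Foo {
public:
    explicit Foo(const int f) : foo{f} {}
    virtual ~Foo() = default;  // assumption: added for safe polymorphic deletion
    int getSomeValue() const { return foo; }
private:
    int foo;
};

class FooA : public Foo {
public:
    FooA() : Foo(314) {}
};

class FooB : public Foo {
public:
    FooB() : Foo(27) {}
};

// Same contract as the factory method above: nullptr for an unknown type.
std::unique_ptr<Foo> createFoo(const FOO_TYPE type) {
    switch (type) {
        case TYPE_A:
            return std::make_unique<FooA>();
        case TYPE_B:
            return std::make_unique<FooB>();
        default:
            return nullptr;
    }
}

// Hypothetical helper: checks the factory result before dereferencing it,
// so the outcome no longer depends on what getSomeValue() touches.
std::string describeFoo(const FOO_TYPE type) {
    const std::unique_ptr<Foo> foo = createFoo(type);
    if (!foo) {
        return "unknown type";
    }
    return std::to_string(foo->getSomeValue());
}
```

The check costs one branch per call and turns the invalid type into an ordinary, visible result instead of undefined behavior.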
Conclusion
This kind of issue can be really confusing, especially because, at first glance, the code “works.”
This is why, in such cases, std::optional (C++17) or std::expected (C++23) should be used instead of returning nullptr from a function when the input is unexpected.
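One possible reading of that advice, as a sketch rather than a definitive design, is to wrap the factory result in std::optional so that the empty case has to be handled explicitly. The names tryCreateFoo and valueOrThrow are hypothetical, and the fixed underlying type and virtual destructor are assumptions that are not in the code above.

```cpp
#include <memory>
#include <optional>

// Assumption: a fixed underlying type, so an out-of-range integer cast to
// FOO_TYPE is a well-defined value (the original enum has none).
enum FOO_TYPE : int { TYPE_A = 0, TYPE_B };

class Foo {
public:
    explicit Foo(const int f) : foo{f} {}
    virtual ~Foo() = default;  // assumption: added for safe polymorphic deletion
    int getSomeValue() const { return foo; }
private:
    int foo;
};

class FooA : public Foo {
public:
    FooA() : Foo(314) {}
};

class FooB : public Foo {
public:
    FooB() : Foo(27) {}
};

// Hypothetical variant of the factory method: an empty optional, not a null
// pointer, signals an unknown type.
std::optional<std::unique_ptr<Foo>> tryCreateFoo(const FOO_TYPE type) {
    switch (type) {
        case TYPE_A:
            return std::make_unique<FooA>();
        case TYPE_B:
            return std::make_unique<FooB>();
        default:
            return std::nullopt;
    }
}

// value() throws std::bad_optional_access on an empty optional, so an
// unknown type fails loudly regardless of what getSomeValue() does.
int valueOrThrow(const FOO_TYPE type) {
    return tryCreateFoo(type).value()->getSomeValue();
}
```

A caller that prefers not to use exceptions can test has_value() first; either way, the failure no longer depends on whether the member function reads through this.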