SumatraPDF is a Windows GUI application, written in C++, for viewing PDF, ePub and comic book files.
A common need in GUI programs is a callback. E.g. when a button is clicked we need to call a function with some data identifying which button was clicked. A callback is therefore a combo of a function and data, and we need to call the function with the data as an argument.
In programming language lingo, a code + data combo is called a closure.
C++ has std::function<> and lambdas (i.e. closures). Lambdas convert to std::function<> and capture local variables.
Lambdas can be used as callbacks, so problem solved?
Not for me.
I’ve used std::function<> and I’ve used lambdas, and what pushed me away from them were crash reports.
The problem with lambdas is that they are implemented as compiler-generated functions. They get non-descriptive, auto-generated names. When I look at the call stack of a crash I can’t map the auto-generated closure name to a function in my code. It makes it harder to read crash reports.
Simplest solution that could possibly work
You should know up front that my solution is worse than std::function<> in most ways. It’s not as nice to type as a lambda and it supports only a small subset of std::function<> functionality.
On the other hand it’s small, fast and I can understand it.
One thing you need to know about me is that despite working on the SumatraPDF C++ code base for 16 years, I don’t know 80% of C++.
I get by thanks to sticking to a small subset that I do understand.
I don’t claim I’ve invented this particular method. It seems obvious in retrospect but it did take me 16 years to arrive at it.
Implementation of a simple callback in C++
A closure is code + data
A closure is conceptually simple. It combines code (function) and data:
using func0Ptr = void (*)(void*);
struct Func0 {
    func0Ptr fn;
    void* userData;
    void Call() { fn(userData); }
};
There are 2 big problems with this.
First is annoying casting. You have to do:
struct MyFuncData {};
void MyFunc(void* voidData) {
    // cast back to the real type before use
    MyFuncData* data = (MyFuncData*)voidData;
}
auto data = new MyFuncData;
auto fn = Func0{MyFunc, (void*)data};
Second is lack of type safety:
struct MyFuncData {};
struct MyOtherFuncData {};
void MyOtherFunc(void* voidData) {
    MyOtherFuncData* data = (MyOtherFuncData*)voidData;
}
auto data = new MyFuncData;
// compiles, but MyOtherFunc will receive a MyFuncData*
auto fn = Func0{MyOtherFunc, (void*)data};
We will call MyOtherFunc with the data meant for MyFunc. This will likely crash.
The good thing is that pointer types are compatible. The machine instructions to call void Foo(void*) are exactly the same as calling void Foo(FooData*).
We can solve the above annoyances with a bit of cleverness in the form of MkFunc0():
template <typename T>
Func0 MkFunc0(void (*fn)(T*), T* d) {
    auto res = Func0{};
    res.fn = (func0Ptr)fn;
    res.userData = (void*)d;
    return res;
}
void MyFunc(MyFuncData* data) {}
auto data = new MyFuncData;
auto fn = MkFunc0(MyFunc, data);
We no longer need to cast data from void* in MyFunc.
Trying to create a mismatched auto fn = MkFunc0(MyFunc, new MyOtherFuncData) will error out. The compiler will notice that the fn and data arguments don’t match.
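The compile-time rejection can be checked directly. What follows is a minimal sketch, not SumatraPDF's code: it uses a simplified Func0, and the MakeVoid helper, the CanMkFunc0 trait and DemoTypeSafety are hypothetical names introduced only for this illustration. CanMkFunc0 asks the compiler whether a call to MkFunc0(MyFunc, ...) would compile for a given data type.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

using func0Ptr = void (*)(void*);

// simplified Func0 for this sketch
struct Func0 {
    func0Ptr fn = nullptr;
    void* userData = nullptr;
    void Call() const {
        if (fn) {
            fn(userData);
        }
    }
};

template <typename T>
Func0 MkFunc0(void (*fn)(T*), T* d) {
    Func0 res;
    // Assumption: void* and T* are passed the same way, so the cast function
    // pointer can be called. True on mainstream ABIs, not promised by the standard.
    res.fn = (func0Ptr)fn;
    res.userData = (void*)d;
    return res;
}

struct MyFuncData {
    int calls = 0;
};
struct MyOtherFuncData {};

void MyFunc(MyFuncData* data) {
    data->calls++;
}

// hypothetical helper: maps any type list to void (like C++17 std::void_t)
template <typename...>
struct MakeVoid {
    using type = void;
};

// hypothetical trait: true only if MkFunc0(MyFunc, D*) compiles
template <typename D, typename = void>
struct CanMkFunc0 : std::false_type {};

template <typename D>
struct CanMkFunc0<D, typename MakeVoid<decltype(MkFunc0(MyFunc, std::declval<D*>()))>::type>
    : std::true_type {};

static_assert(CanMkFunc0<MyFuncData>::value, "matching fn and data are accepted");
static_assert(!CanMkFunc0<MyOtherFuncData>::value, "mismatched fn and data are rejected");

// calls MyFunc twice through the closure and returns the call count
int DemoTypeSafety() {
    MyFuncData data;
    Func0 fn = MkFunc0(MyFunc, &data);
    fn.Call();
    fn.Call();
    assert(data.calls == 2);
    return data.calls;
}
```

The mismatch is caught because T has to be deduced to the same type from both arguments; when the two deductions disagree, no MkFunc0 overload is viable.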
We’ll make one improvement: the ability to also create a closure for functions without any arguments:
void MyFuncNoData() {}
Func0 fn = MkFuncVoid(MyFuncNoData);
The implementation cleverness: use a special, impossible value of a pointer (-1) to indicate a function without arguments.
The full implementation is:
using func0Ptr = void (*)(void*);
using funcVoidPtr = void (*)();
#define kVoidFunc0 ((void*)-1)
// the simplest possible function that ties a function and a single argument to it
// we get type safety and convenience with MkFunc0()
struct Func0 {
    void* fn = nullptr;
    void* userData = nullptr;
    Func0() = default;
    Func0(const Func0& that) {
        this->fn = that.fn;
        this->userData = that.userData;
    }
    ~Func0() = default;
    bool IsEmpty() const {
        return fn == nullptr;
    }
    void Call() const {
        if (!fn) {
            return;
        }
        if (userData == kVoidFunc0) {
            auto func = (funcVoidPtr)fn;
            func();
            return;
        }
        auto func = (func0Ptr)fn;
        func(userData);
    }
};
template <typename T>
Func0 MkFunc0(void (*fn)(T*), T* d) {
    auto res = Func0{};
    // fn is stored as void*, so cast to void* (same as in MkFuncVoid)
    res.fn = (void*)fn;
    res.userData = (void*)d;
    return res;
}
Func0 MkFuncVoid(funcVoidPtr fn) {
    auto res = Func0{};
    res.fn = (void*)fn;
    res.userData = kVoidFunc0;
    return res;
}
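To see both kinds of closure in use, here is a minimal usage sketch. It repeats a trimmed version of the implementation so that it stands alone; Counter, Increment, Beep, Button and DemoButtons are hypothetical names made up for this illustration. The sentinel is written via intptr_t only to avoid an integer-to-pointer size warning on some compilers.

```cpp
#include <cassert>
#include <cstdint>

using func0Ptr = void (*)(void*);
using funcVoidPtr = void (*)();
// sentinel stored in userData to mark a function that takes no arguments
#define kVoidFunc0 ((void*)(intptr_t)-1)

struct Func0 {
    void* fn = nullptr;
    void* userData = nullptr;

    bool IsEmpty() const {
        return fn == nullptr;
    }
    void Call() const {
        if (!fn) {
            return;
        }
        if (userData == kVoidFunc0) {
            ((funcVoidPtr)fn)();
            return;
        }
        ((func0Ptr)fn)(userData);
    }
};

template <typename T>
Func0 MkFunc0(void (*fn)(T*), T* d) {
    Func0 res;
    res.fn = (void*)fn;
    res.userData = (void*)d;
    return res;
}

inline Func0 MkFuncVoid(funcVoidPtr fn) {
    Func0 res;
    res.fn = (void*)fn;
    res.userData = kVoidFunc0;
    return res;
}

// everything below is hypothetical and exists only for this illustration
struct Counter {
    int n = 0;
};
void Increment(Counter* c) {
    c->n++;
}

static int gBeeps = 0;
void Beep() {
    gBeeps++;
}

struct Button {
    Func0 onClick;
    void Click() {
        onClick.Call();
    }
};

// returns how many times Increment ran
int DemoButtons() {
    gBeeps = 0;
    Counter counter;
    Button withData, withoutData, unset;
    withData.onClick = MkFunc0(Increment, &counter);
    withoutData.onClick = MkFuncVoid(Beep);
    assert(unset.onClick.IsEmpty());

    withData.Click();
    withData.Click();
    withoutData.Click();
    unset.Click(); // empty closure: Call() does nothing
    return counter.n;
}
```

Because Call() checks for a null fn, a control can leave its callback unset and still call it unconditionally.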
Closure with additional caller-provided argument
Func0 only addresses the use case of packaging a function and its own data.
Most use cases for callbacks require passing additional arguments.
For example, a list view control has an onItemSelected(int itemIndex) callback.
For that we need Func1:
template <typename T>
struct Func1 {
    void (*fn)(void*, T) = nullptr;
    void* userData = nullptr;
    Func1() = default;
    ~Func1() = default;
    bool IsEmpty() const {
        return fn == nullptr;
    }
    void Call(T arg) const {
        if (fn) {
            fn(userData, arg);
        }
    }
};
template <typename T1, typename T2>
Func1<T2> MkFunc1(void (*fn)(T1*, T2), T1* d) {
    auto res = Func1<T2>{};
    using fptr = void (*)(void*, T2);
    res.fn = (fptr)fn;
    res.userData = (void*)d;
    return res;
}
We can now do:
struct OnListItemSelectedData {};
void OnListItemSelected(OnListItemSelectedData* d, int selectedIdx) {
}
struct ListView {
    Func1<int> onListItemSelected;
    void listItemSelected(int idx) {
        onListItemSelected.Call(idx);
    }
};
auto lv = new ListView;
auto data = new OnListItemSelectedData;
lv->onListItemSelected = MkFunc1(OnListItemSelected, data);
In Func0 the argument must be a pointer because the type is forgotten when we put it in a struct. We rely on the fact that void foo(void*) and void foo(Foo*) are compatible and we can cast the argument and function.
But Func1 retains the type of the second argument, so it can be any type and the right call will happen.
We also don’t want to erase the second type, both to avoid casts when calling it and to serve as documentation.
We could write Func2 for 2 arguments, Func3 for 3 arguments etc. but I didn’t bother. If I need more than one argument, I can always use a struct to pack any number of arguments into a single one.
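Packing several arguments into one struct could look like the sketch below. It is an illustration under assumptions, not code from SumatraPDF: MouseClick, ClickLog, OnMouseClick, Canvas and DemoPackedArgs are hypothetical names, and Func1 is repeated in trimmed form so the example stands alone.

```cpp
#include <cassert>

template <typename T>
struct Func1 {
    void (*fn)(void*, T) = nullptr;
    void* userData = nullptr;
    void Call(T arg) const {
        if (fn) {
            fn(userData, arg);
        }
    }
};

template <typename T1, typename T2>
Func1<T2> MkFunc1(void (*fn)(T1*, T2), T1* d) {
    Func1<T2> res;
    using fptr = void (*)(void*, T2);
    res.fn = (fptr)fn;
    res.userData = (void*)d;
    return res;
}

// hypothetical: three values the caller wants to pass, packed into one struct
struct MouseClick {
    int x;
    int y;
    bool isLeftButton;
};

// hypothetical closure data owned by whoever registers the callback
struct ClickLog {
    int leftClicks = 0;
    int lastX = 0;
    int lastY = 0;
};

void OnMouseClick(ClickLog* log, MouseClick* click) {
    if (!click->isLeftButton) {
        return;
    }
    log->leftClicks++;
    log->lastX = click->x;
    log->lastY = click->y;
}

// hypothetical control exposing a single-argument callback
struct Canvas {
    Func1<MouseClick*> onClick;
    void SimulateClick(int x, int y, bool isLeftButton) {
        MouseClick click{x, y, isLeftButton};
        onClick.Call(&click);
    }
};

// returns the number of left clicks seen by the callback
int DemoPackedArgs() {
    ClickLog log;
    Canvas canvas;
    canvas.onClick = MkFunc1(OnMouseClick, &log);
    canvas.SimulateClick(10, 20, true);
    canvas.SimulateClick(30, 40, false); // ignored by OnMouseClick
    canvas.SimulateClick(5, 7, true);
    assert(log.lastX == 5 && log.lastY == 7);
    return log.leftClicks;
}
```

The struct is passed by pointer here, so it only needs to live for the duration of the Call().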
Fringe benefits
So is it worth it to use this over std::function<>?
For me it is, and I’ve refactored SumatraPDF to get rid of most std::function<> uses in favor of Func0 and Func1.
Yes, std::function<> is better in many ways.
It’s more flexible. My solution only supports void Foo(), void Foo(T*) and void Foo(T1*, T2). std::function<> supports an arbitrary number of arguments of any type.
Compared to writing a lambda with variable capture, I need to write more code:
- define a struct for closure data
- allocate and initialize struct
- construct Func0 or Func1
- delete the data (typically at the end of closure)
I decided writing this boilerplate doesn’t bother me.
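The four steps of that boilerplate can be sketched as follows. This is a minimal illustration with a simplified Func0; ScaleData, Scale and DemoBoilerplate are hypothetical names, and letting the callback delete its own data is just one way to handle ownership.

```cpp
#include <cassert>

using func0Ptr = void (*)(void*);

// simplified Func0 for this sketch
struct Func0 {
    func0Ptr fn = nullptr;
    void* userData = nullptr;
    void Call() const {
        if (fn) {
            fn(userData);
        }
    }
};

template <typename T>
Func0 MkFunc0(void (*fn)(T*), T* d) {
    Func0 res;
    res.fn = (func0Ptr)fn;
    res.userData = (void*)d;
    return res;
}

// step 1: define a struct for closure data (hypothetical example)
struct ScaleData {
    int value;
    int factor;
    int* result; // where the callback writes its answer
};

void Scale(ScaleData* d) {
    *d->result = d->value * d->factor;
    // step 4: the closure owns its data and frees it when done
    delete d;
}

int DemoBoilerplate() {
    int result = 0;
    // step 2: allocate and initialize the struct
    auto d = new ScaleData;
    d->value = 21;
    d->factor = 2;
    d->result = &result;
    // step 3: construct Func0
    Func0 fn = MkFunc0(Scale, d);
    // must be called exactly once, because Scale deletes its data
    fn.Call();
    assert(result == 42);
    return result;
}
```

A lambda would capture value, factor and result in one line; the trade-off is that Scale shows up under its own name in a call stack.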
There are fringe benefits of my approach.
On MSVC 64-bit std::function<> is 64 bytes. Func0 and Func1 are 16 bytes.
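The sizes are easy to check on your own toolchain. The sketch below uses data-only copies of Func0 and Func1; SizeOfStdFunction and ClosuresAreTwoPointers are hypothetical helper names. The size of std::function<> is implementation-defined, so the sketch reports it instead of assuming a number.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// data-only copies of the closures, enough to measure their size
struct Func0 {
    void* fn = nullptr;
    void* userData = nullptr;
};

template <typename T>
struct Func1 {
    void (*fn)(void*, T) = nullptr;
    void* userData = nullptr;
};

// Assumption: function pointers and data pointers have the same size,
// which holds on mainstream desktop platforms.
static_assert(sizeof(Func0) == 2 * sizeof(void*), "Func0 is two pointers");
static_assert(sizeof(Func1<int>) == 2 * sizeof(void*), "Func1 is two pointers");

// implementation-defined: differs between standard libraries
std::size_t SizeOfStdFunction() {
    return sizeof(std::function<void()>);
}

// returns true if both closure types are exactly two pointers wide
bool ClosuresAreTwoPointers() {
    assert(sizeof(Func0) == sizeof(Func1<int>));
    return sizeof(Func0) == 2 * sizeof(void*);
}
```

On a 64-bit build two pointers come to 16 bytes, which matches the figure above.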
Templated code is a highway to bloat. For every unique type, the compiler generates a new class definition and set of methods. The implementation of std::function<> is gigantic compared to Func0 and Func1.
Templated code is also a highway to slow compilation. Again, std::function<> is at least an order of magnitude more complicated, so it’ll take an order of magnitude longer to compile.
Finally, I understand my implementation. I don’t understand the std::function<> implementation. It’s scarier than Freddy Krueger. It’s scarier than Frankenstein’s monster.
In fact, I don’t think anyone understands std::function<>, including the 3 people who implemented it.