How to compile a decorator: C++, Python, and a custom implementation. Part 1

This series of articles is devoted to implementing decorators in C++, to how they work in Python, and to one way of implementing this functionality in our own compiled language, using the general approach to creating closures: closure conversion and modification of the syntax tree.




Decorators in C++


It all started when my friend VoidDruid decided to write a small compiler as his diploma project, with decorators as its key feature. Even during the pre-defense, while he was outlining all the advantages of his approach, which involved modifying the AST, I kept wondering: is it really impossible to implement these same decorators in the great and powerful C++ without any complicated terms and techniques? Googling the topic, I did not find any simple and general approach to the problem (by the way, I only came across articles about implementing the design pattern), so I sat down to write my own decorator.


However, before describing my implementation, I would like to talk a little about how lambdas and closures work in C++ and how they differ. Let me note right away that whenever no specific standard is mentioned, I mean C++20 by default. In short, lambdas are anonymous functions, and closures are functions that use objects from their environment. For example, starting with C++11, a lambda can be declared and called like this:

#include <iostream>

int main() 
{
    // Declare the lambda and call it immediately with the argument 10
    [] (int a) 
    {
        std::cout << a << std::endl;
    }(10);
}

Or we can assign it to a variable and call it later:

#include <iostream>

int main() 
{
    // Store the lambda in a variable first...
    auto lambda = [] (int a) 
    {
        std::cout << a << std::endl;
    };
    // ...and call it later
    lambda(10);
}

But what happens during compilation, and what is a lambda really? To look inside a lambda, just go to cppinsights.io and run our first example. Below is one possible output:

class __lambda_60_19
{
public: 
    inline void operator()(int a) const
    {
        std::cout.operator<<(a).operator<<(std::endl);
    }
    
    using retType_60_19 = void (*)(int);
    inline operator retType_60_19 () const noexcept
    {
        return __invoke;
    };
    
private: 
    static inline void __invoke(int a)
    {
        std::cout.operator<<(a).operator<<(std::endl);
    }    
};


So, at compile time the lambda turns into a class, or more precisely a functor (an object for which operator() is defined), with an automatically generated unique name. Its operator() takes the parameters we passed to the lambda, and its body contains the code the lambda must execute. That much is clear, but what are the other two methods for? The first is a conversion operator to a function pointer whose signature matches our lambda. The second is the code that is executed when the lambda is called through such a pointer, for example like this:

void (*p_lambda) (int) = lambda;
p_lambda(10);
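
This conversion is what lets a captureless lambda be handed to APIs that only accept plain function pointers. Here is a minimal sketch of that; apply_twice and function_pointer_demo are hypothetical names invented for the illustration, not part of the original example.

```cpp
#include <cassert>

// Hypothetical C-style API that only accepts plain function pointers.
int apply_twice(int (*callback)(int), int value)
{
    return callback(callback(value));
}

int function_pointer_demo()
{
    // No captures, so the generated class has a conversion operator
    // to int (*)(int).
    auto add_one = [](int x) { return x + 1; };

    // Implicit conversion through that operator.
    int (*as_pointer)(int) = add_one;
    assert(as_pointer(1) == 2);

    // The same conversion happens when the lambda is passed as an argument.
    return apply_twice(add_one, 10);
}
```

The lambda object itself is never needed after the conversion: the pointer refers to the static function that duplicates the lambda body.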

Well, that is one riddle fewer, but what about closures? Let's write the simplest example of a closure that captures the variable "a" by reference and increases it by one.

int main()
{
    int a = 10;
    auto closure = [&a] () { a += 1; };
    closure();
}

As you can see, the mechanism for creating closures and lambdas in C++ is almost the same, which is why the two concepts are often confused and both are simply called lambdas.

But let's return to the internal representation of a closure in C++:

class __lambda_61_20
{
public:
    inline void operator()()
    {
        a += 1;
    }
private:
    int & a;
public:
    __lambda_61_20(int & _a)
    : a{_a}
    {}
};

As you can see, a new non-default constructor has been added; it takes our variable by reference and stores it as a class member. This is why you need to be careful with [&] and [=]: every variable from the surrounding context that the body uses is stored inside the closure, which can be quite wasteful in terms of memory. In addition, we have lost the conversion operator to a function pointer, because a call now needs the captured context. As a result, the following code will not compile:

int main()
{
    int a = 10;
    auto closure = [&a] () { a += 1; };
    closure();
    void (*ptr)() = closure; // error: a capturing closure does not convert to a function pointer
}

However, if you still need to pass the closure somewhere, you can always use std::function:

std::function<void()> function = closure;
function();
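
To make this a bit more concrete, here is a small sketch of passing a capturing closure through std::function. The names repeat and closure_in_function_demo are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: calls any stored callable the given number of times.
void repeat(const std::function<void()>& action, int times)
{
    for (int i = 0; i < times; ++i)
        action();
}

int closure_in_function_demo()
{
    int a = 10;
    auto closure = [&a]() { a += 1; };

    // std::function stores a copy of the closure object; that copy still
    // holds a reference to the local variable a.
    std::function<void()> function = closure;
    repeat(function, 3);

    return a;  // incremented three times
}
```

Note that the closure captures a by reference, so neither it nor the std::function holding it should outlive a.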

Now that we have a rough idea of what lambdas and closures are in C++, let's move on to writing the decorator itself. But first, we need to settle on our requirements.

So, the decorator should take our function or method as input, add the functionality we need to it (omitted in this example), and return a new function; when that function is called, both our code and the code of the original function or method are executed. At this point any self-respecting Pythonista will say: "But wait! The decorator must replace the original object, and any call to it by name should call the new function!" This is precisely the main limitation in C++: we cannot stop the user from calling the old function. Of course, we could get its address in memory and overwrite it (in which case calling it would terminate the program abnormally), or replace its body with a console warning that it should not be used, but both are fraught with serious consequences. The first option is rather harsh in itself, and the second can also lead to a crash once compiler optimizations come into play, so we will use neither. I also consider any macro magic redundant here.

So, let's move on to writing our decorator. The first option that came to my mind was this:

#include <functional>
#include <iostream>
#include <utility>

namespace Decorator
{
    template<typename R, typename ...Args>
    static auto make(const std::function<R(Args...)>& f)
    {
        return [=](Args... args) 
        {
            // Runs on every call of the decorated function
            std::cout << "Do something" << std::endl;
            return f(std::forward<Args>(args)...);
        };
    }
}

Let it be a namespace with a function template that takes a std::function and returns a closure. The closure takes the same parameters as our function and, when called, simply calls our function and returns its result.

Let's create a simple function that we want to decorate.

void myFunc(int a)
{
    std::cout << "here" << std::endl;
}

And our main will look like this:

int main()
{
    std::function<void(int)> f = myFunc;
    auto decorated = Decorator::make(f);
    decorated(10);
}


Everything works, everything is fine, hurray.

In reality, this solution has several problems. Let's go through them in order:

  1. This code compiles only with C++14 and later, because the return type cannot be spelled out in advance and has to be deduced. Unfortunately, we have to live with this; I did not find any alternatives.
  2. make requires a std::function argument, and passing a function by name leads to a compilation error, because the template parameters R and Args cannot be deduced through an implicit conversion to std::function. This is far less convenient than we would like! We cannot write code like this:

    Decorator::make([](){});
    Decorator::make(myFunc);
    void(*ptr)(int) = myFunc;
    Decorator::make(ptr);

  3. Also, it is not possible to decorate a class method.
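
For completeness, the second problem can be worked around at the call site by constructing the std::function explicitly. The sketch below assumes C++17, whose deduction guides let std::function(...) work out the signature on its own. DecoratorV1, twice and explicit_wrapping_demo are hypothetical names used only here, and the printing is left out to keep the sketch short.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical copy of the first decorator version, without the logging.
namespace DecoratorV1
{
    template<typename R, typename... Args>
    auto make(const std::function<R(Args...)>& f)
    {
        return [=](Args... args) { return f(std::forward<Args>(args)...); };
    }
}

int twice(int x) { return 2 * x; }

int explicit_wrapping_demo()
{
    // DecoratorV1::make(twice);  // does not compile: R and Args cannot be deduced

    // C++17 class template argument deduction yields std::function<int(int)>.
    auto from_function = DecoratorV1::make(std::function(twice));
    auto from_lambda   = DecoratorV1::make(std::function([](int x) { return x + 1; }));

    return from_function(10) + from_lambda(10);
}
```

Even so, every call site needs the extra wrapper, and class methods remain out of reach.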

Therefore, after a short conversation with colleagues, we came up with the following version for C++17 and above:

namespace Decorator
{
    template<typename Function>
    static auto make(Function&& func)
    {
        return [func = std::forward<Function>(func)] (auto && ...args) 
        {
            std::cout << "Do something" << std::endl;
            return std::invoke(
                func,
                std::forward<decltype(args)>(args)...
            );
        };
    }
};

The advantage of this version is that we can now decorate any callable object. For example, we can pass the name of a free function, a function pointer, a lambda, any functor, a std::function, and of course a class method. In the last case, the object on which the method is called must also be passed as the first argument when calling the decorated function.

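Usage examples

// Assumption for this listing: myFunc is declared as void myFunc(int, int),
// and MyClass has a constructor taking an int and a member function func().
int main()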
{
    auto decorated_1 = Decorator::make(myFunc);
    decorated_1(1,2);

    auto my_lambda = [] (int a, int b) 
    { 
        std::cout << a << " " << b <<std::endl; 
    };
    auto decorated_2 = Decorator::make(my_lambda);
    decorated_2(3,4);

    void (*ptr)(int, int) = myFunc;
    auto decorated_3 = Decorator::make(ptr);
    decorated_3(5,6);

    std::function<void(int, int)> fun = myFunc;
    auto decorated_4 = Decorator::make(fun);
    decorated_4(7,8);

    auto decorated_5 = Decorator::make(decorated_4);
    decorated_5(9, 10);

    auto decorated_6 = Decorator::make(&MyClass::func);
    decorated_6(MyClass(10));
}
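
Since MyClass is not shown here, the method case deserves a self-contained sketch. Counter, make_decorated and member_function_demo below are hypothetical stand-ins; the decorator has the same shape as the C++17 version above, only without the printing.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Same shape as the C++17 decorator, minus the logging.
template<typename Function>
auto make_decorated(Function&& func)
{
    return [func = std::forward<Function>(func)](auto&&... args)
    {
        return std::invoke(func, std::forward<decltype(args)>(args)...);
    };
}

// Hypothetical class standing in for MyClass.
class Counter
{
public:
    explicit Counter(int start) : value_(start) {}
    int add(int delta) { value_ += delta; return value_; }
    int value() const { return value_; }
private:
    int value_;
};

int member_function_demo()
{
    auto decorated_add = make_decorated(&Counter::add);

    Counter counter(10);
    decorated_add(counter, 5);    // object passed by reference: counter is modified
    decorated_add(&counter, 5);   // a pointer to the object works as well

    // A temporary is accepted too; only the temporary is modified.
    int temporary_result = decorated_add(Counter(1), 1);
    assert(temporary_result == 2);

    return counter.value();
}
```

std::invoke is what makes this uniform: it treats the first argument as the object (or a pointer to it) whenever the callable is a pointer to a member function.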


In addition, the decorator can be compiled as C++14 if the standard library offers a non-standard equivalent of std::invoke, for example the internal std::__invoke, which then has to be used in its place. If there is no such extension, std::invoke has to be replaced with a plain call, and the ability to decorate class methods is lost.

To avoid writing the cumbersome "std::forward<decltype(args)>(args)...", you can use a feature available since C++20 and make our lambda a template!

namespace Decorator
{
    template<typename Function>
    static auto make(Function&& func)
    {
        return [func = std::forward<Function>(func)] 
        <typename ...Args> (Args && ...args) 
        {
            return std::invoke(
                func,
                std::forward<Args>(args)...
            );
        };
    }
};

Everything is perfectly safe and even works the way we want (or at least pretends to). The code compiles with version 10 of both gcc and clang, and you can find it here. Implementations for other standards will be there as well.
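
So far the added functionality has been a single placeholder print before the call. As a hedged sketch of a slightly more realistic decorator, here is one way to run code both before and after the wrapped call in C++17, including for functions that return void. make_traced, traced_demo and the log vector are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical decorator: records "enter" and "leave" around every call.
template<typename Function>
auto make_traced(Function&& func, std::vector<std::string>& log)
{
    return [func = std::forward<Function>(func), &log](auto&&... args)
    {
        using Result = decltype(std::invoke(func, std::forward<decltype(args)>(args)...));

        log.push_back("enter");
        if constexpr (std::is_void_v<Result>) {
            // Nothing to store when the wrapped function returns void.
            std::invoke(func, std::forward<decltype(args)>(args)...);
            log.push_back("leave");
        } else {
            // Keep the result so that "leave" is recorded before returning it.
            auto result = std::invoke(func, std::forward<decltype(args)>(args)...);
            log.push_back("leave");
            return result;
        }
    };
}

int traced_demo()
{
    std::vector<std::string> log;
    auto traced_sum  = make_traced([](int a, int b) { return a + b; }, log);
    auto traced_noop = make_traced([] {}, log);

    int sum = traced_sum(2, 3);
    traced_noop();

    assert(log.size() == 4);  // two calls, two entries each
    return sum;
}
```

The if constexpr branch is needed because a void result cannot be stored in a variable. The sketch returns results by value, so a wrapped function that returns a reference would have its result copied.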

In the next articles, we will move on to the canonical implementation of decorators, using Python as the example, and to their internal structure.
