Wednesday, March 3, 2010

When new features collide with old syntax


My C++ times are over. But Roker keeps on feeding me with input. And I could not let this one go by unnoticed.
To make it short, what do you think does the following program print?
#include <iostream>

struct X { };

class Y {
public:
    Y (const X& x) {
        std::cout << "fnord" << std::endl;
    }
};

int main() {
    Y d(X());
    return 0;
}
At first sight it looks like everything is ok. We know that references to temporaries must be const, which is the case here. So we think, d is going to be a Y object which was initilized with a temporary X object. In this case, the program should output "fnord".
But it doesn't. Instead it prints nothing. Mysterious you think? So did I. So I started debugging the issue while Roker was giggling. So, if d is not a Y object, what is it then? Let's ask the compiler and the C++ runtime:
#include <typeinfo>
//...
int main() {
    Y d(X());
    std::cout << typeid(d).name() << std::endl;
    return 0;
}
The program now prints "F1YPF1XvEE". Hmm, I can not decrypt this, can you? We need to demangle the name in order to see it's real C++ nature:
$ c++filt -t F1YPF1XvEE
Y ()(X (*)())
Who can decrypt this? If you can't, here is a howto on reading C type declarations, because the unfamiliar syntax for function types is a C heritage. And what does this type mean after all? It means, that d is a function which returns a Y object and expects a pointer to a function which returns a X object and expects nothing. Got it? If not, here is how I would write it, in case I would need such a type:
#include <iostream>
#include <typeinfo>

struct X { };
class Y { /*...*/ }; 

typedef X (Inner)();
typedef Y (Outer)(Inner);

int main()
{
    Outer d;
    std::cout << typeid(d).name() << std::endl;
    return 0;
}
This program prints Y ()(X (*)()) which is the type of d from the original program. Ok, so now we know, what the program does. It declares a function. That's why there is no X object ever created. So, although the code could mean what we originally intended, the compiler chooses to understand something else, which is also possible due to the C syntax rules. Although I still don't know what function declarations inside other functions are good for, the compiler again prefers them. Anyway, what's the C way of removing ambiguity from syntax? Right, it's parantheses:
//...
int main()
{
    Y o((X()));
    std::cout << typeid(o).name() << std::endl;
    return 0;
}
Now, the program prints "fnord" and "1Y" which is the mangled name for the type Y. To sum it up, this is yet another example for why I consider the C heritage to be a major source of trouble for C++ developers. The C syntax was never designed for temporary variables and combining both yields to what we have seen here: ambiguity, ugly workarounds and surprise and furstration for the developer. Don't get me wrong, though. I still think that C++ could hardly have been made better given the design goals which where taken those days. I truly have a love-hate for that language.

No comments:

Post a Comment