
Advanced Serialization

Polymorphic serialization
Pointer tracking
Interchangeable types
Unicode strings

User-defined C++ objects can be quite complex to serialize. This section describes some of the more advanced features of RCF's internal serialization framework, SF.

SF will automatically serialize polymorphic pointers and references, as fully derived types. However, to do this, SF needs to be configured with two pieces of information about the polymorphic type being serialized. First, SF needs a runtime identifier string for the polymorphic type. Second, it needs to know which base classes the polymorphic type will be serialized through.

As an example, consider the following polymorphic class hierarchy.

class A
{
public:
    A() : mA() {}
    virtual ~A() {}

    void serialize(SF::Archive &ar)
    {
        ar & mA;
    }

    int mA;
};

class B : public A
{
public:
    B() : mB() {}

    void serialize(SF::Archive &ar)
    {
        SF::serializeParent<A>(ar, *this);
        ar & mB;
    }

    int mB;
};

class C : public B
{
public:
    C() : mC() {}

    void serialize(SF::Archive &ar)
    {
        SF::serializeParent<B>(ar, *this);
        ar & mC;
    }

    int mC;
};

Note that SF::serializeParent<>() is used to invoke the base class serialization code. If you try to serialize the parent directly, e.g. by calling ar & static_cast<A&>(*this), SF will detect that the referenced object is actually an instance of a derived class, and will try to serialize the derived class all over again.

We want to implement polymorphic serialization of A * pointers, for use in the X class:

class X
{
public:
    X() : mpA() 
    {}
    
    void serialize(SF::Archive &ar)
    {
        ar & mpA;
    }

    A * mpA;
};

RCF_BEGIN(I_Echo, "I_Echo")
    RCF_METHOD_R1(X, Echo, X)
RCF_END(I_Echo)

First we need to supply runtime identifiers for the B and C classes.

SF::registerType<B>("B");
SF::registerType<C>("C");

The runtime identifiers will be included in serialized archives when B and C objects are serialized, and will allow the deserialization code to construct objects of the appropriate type.
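To illustrate why a runtime identifier is needed, here is a minimal standalone sketch of an identifier-to-factory registry. It is not SF's implementation; the registry_sketch namespace, the TypeRegistry class, its create() function and the typeName() member are hypothetical names invented for this illustration, and std::unique_ptr is used only to keep the sketch free of external dependencies.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

namespace registry_sketch {

// Hypothetical stand-ins for the A/B/C hierarchy above (not RCF types).
struct A { virtual ~A() {} virtual std::string typeName() const { return "A"; } };
struct B : A { std::string typeName() const override { return "B"; } };
struct C : B { std::string typeName() const override { return "C"; } };

// Maps a runtime identifier string to a factory for the matching type.
class TypeRegistry
{
public:
    template<typename T>
    void registerType(const std::string & id)
    {
        assert(!id.empty());
        mFactories[id] = [] { return std::unique_ptr<A>(new T()); };
    }

    // Returns a new object of the registered type, or a null pointer
    // if the identifier has not been registered.
    std::unique_ptr<A> create(const std::string & id) const
    {
        auto it = mFactories.find(id);
        if (it == mFactories.end())
        {
            return std::unique_ptr<A>();
        }
        return it->second();
    }

private:
    std::map<std::string, std::function<std::unique_ptr<A>()> > mFactories;
};

} // namespace registry_sketch
```

When reading an archive, a deserializer built along these lines would read the identifier string first, call create() with it, and then let the newly constructed object read its own members. Without the identifier, the reader would only know the static type A and could not tell whether to construct a B or a C.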

We also need to specify which base types the B and C classes will be serialized through. In this case, B and C objects will be serialized through an A pointer.

SF::registerBaseAndDerived<A, B>();
SF::registerBaseAndDerived<A, C>();

With the SF runtime now configured, we can run this code:

RcfClient<I_Echo> client(( RCF::TcpEndpoint(port)));

X x1;
x1.mpA = new B();
X x2 = client.Echo(x1);

x1.mpA = new C();
x2 = client.Echo(x1);

The polymorphic A pointers contained in the X objects will be serialized and deserialized, as fully derived polymorphic types.

Finally, it's the application's responsibility to delete the X::mpA pointer that SF creates upon deserialization. The easiest way to ensure this happens is to use a C++ smart pointer rather than a raw pointer. SF supports a number of smart pointers, including:

  • std::auto_ptr<>
  • boost::scoped_ptr<>
  • boost::shared_ptr<>

So for example, we can write X as:

class X
{
public:
    X() : mpA() 
    {}

    void serialize(SF::Archive &ar)
    {
        ar & mpA;
    }

    typedef boost::shared_ptr<A> APtr;
    APtr mpA;
};

If you serialize a pointer to the same object twice, SF will by default serialize the entire object twice. This means that when the pointers are deserialized, they will point to two distinct objects. In most applications this is not an issue. However, some applications may want the deserialization code to instead create two pointers to the same object.

SF supports this through a pointer tracking concept, where an object is serialized in its entirety, only once, regardless of how many pointers to it are serialized. Upon deserialization, only a single object is created, and multiple pointers can then be deserialized, pointing to the same object.
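The idea can be sketched in standalone C++ as follows. This is only an illustration of the concept under simplifying assumptions (non-null pointers to int, an in-memory "archive"), not SF's actual mechanism or wire format; the tracking_sketch namespace, Record, writeTracked() and readTracked() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

namespace tracking_sketch {

// One entry in a toy archive: either a full value, or a back-reference
// to a value that was written earlier.
struct Record
{
    bool isBackRef;
    std::size_t index;  // index of the earlier record, if isBackRef
    int value;          // payload, if !isBackRef
};

// Writes each distinct int once; repeated pointers become back-references.
inline std::vector<Record> writeTracked(
    const std::vector<std::shared_ptr<int> > & ptrs)
{
    std::vector<Record> archive;
    std::map<const int *, std::size_t> seen;  // address -> record index
    for (std::size_t i = 0; i < ptrs.size(); ++i)
    {
        assert(ptrs[i]);  // null pointers are out of scope for this sketch
        const int * addr = ptrs[i].get();
        std::map<const int *, std::size_t>::const_iterator it = seen.find(addr);
        if (it != seen.end())
        {
            Record r = { true, it->second, 0 };
            archive.push_back(r);
        }
        else
        {
            seen[addr] = archive.size();
            Record r = { false, 0, *ptrs[i] };
            archive.push_back(r);
        }
    }
    return archive;
}

// Rebuilds the pointers; a back-reference shares the earlier object.
inline std::vector<std::shared_ptr<int> > readTracked(
    const std::vector<Record> & archive)
{
    std::vector<std::shared_ptr<int> > out;
    for (std::size_t i = 0; i < archive.size(); ++i)
    {
        const Record & r = archive[i];
        if (r.isBackRef)
        {
            std::shared_ptr<int> shared = out[r.index];
            out.push_back(shared);
        }
        else
        {
            out.push_back(std::make_shared<int>(r.value));
        }
    }
    return out;
}

} // namespace tracking_sketch
```

The address-to-index map that the writer has to maintain for every serialized object is the kind of bookkeeping that makes tracking more costly than plain value serialization.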

To demonstrate pointer tracking, here is an I_Echo interface with an Echo() function that takes a pair of boost::shared_ptr<>'s:

typedef 
    std::pair< boost::shared_ptr<int>, boost::shared_ptr<int> > 
    TwoPointers;

RCF_BEGIN(I_Echo, "I_Echo")
    RCF_METHOD_R1(TwoPointers, Echo, TwoPointers)
RCF_END(I_Echo)

Here is the client-side code:

boost::shared_ptr<int> spn1( new int(1));
boost::shared_ptr<int> spn2( spn1 );

TwoPointers ret = client.Echo( std::make_pair(spn1,spn2));

If we make a call to Echo() with a pair of shared_ptr<>'s pointing to the same int, we'll find that the returned pair points to two distinct int's. To get them to point to the same int, we enable pointer tracking on the client-side:

RcfClient<I_Echo> client(( RCF::TcpEndpoint(port)));

client.getClientStub().setEnableSfPointerTracking(true);

Pointer tracking must also be enabled on the server side, since the server returns the pointers to the client:

class EchoImpl
{
public:
    template<typename T>
    T Echo(T t)
    {
        RCF::getCurrentRcfSession().setEnableSfPointerTracking(true);
        return t;
    }
};

The two returned shared_ptr<>'s will now point to the same instance.

It is worth keeping in mind that pointer tracking is relatively expensive. It requires the serialization framework to track all pointers and values that are being serialized, with significant performance overhead.

SF guarantees that certain types are interchangeable, as far as serialization is concerned. For example, it is possible to serialize a pointer, and subsequently deserialize it into a value:

// Serializing a pointer.
int * pn = new int(5);
std::ostringstream ostr;
SF::OBinaryStream(ostr) << pn;
delete pn;

// Deserializing a value.
int m = 0;
std::istringstream istr(ostr.str());
SF::IBinaryStream(istr) >> m;
// m is now 5

Essentially, T * and T have compatible serialization formats, for any T. SF also guarantees that smart pointers are interchangeable with native pointers, so it is fine to serialize a boost::shared_ptr<> and then deserialize it into a value. Here is another example:

// Client-side implementation of X, using int.
class X
{
public:
    int n;

    void serialize(SF::Archive & ar)
    {
        ar & n;
    }
};

// Server-side implementation of X, using shared_ptr<int>.
class X
{
public:
    boost::shared_ptr<int> spn;

    void serialize(SF::Archive & ar)
    {
        ar & spn;
    }
};

// Even with different X implementations, client and server can still 
// interact through this interface.

RCF_BEGIN(I_EchoX,"I_EchoX")
RCF_METHOD_R1(X, Echo, X)
RCF_END(I_EchoX)

The following list gives the sets of types that SF guarantees to be interchangeable:

Sets of interchangeable types

  • T, T *, std::auto_ptr<T>, boost::shared_ptr<T>, boost::scoped_ptr<T>
  • Integral types of equivalent size, e.g. unsigned int and int
  • 32 bit integral types, enums
  • STL containers of T (including std::vector<> and std::basic_string<>), where T is non-primitive
  • STL containers of T (except std::vector<> and std::basic_string<>), where T is primitive
  • std::vector<T> and std::basic_string<T>, where T is primitive
  • std::string, std::vector<char>, RCF::ByteBuffer

The reason for the exceptions concerning std::vector<> and std::basic_string<> is that SF implements fast memcpy()-based serialization for these containers if their elements are of primitive type.
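As a rough illustration of what memcpy()-based serialization of a contiguous container looks like, here is a standalone sketch. It does not reproduce SF's wire format (it omits any length prefix and ignores byte order), and the memcpy_sketch namespace, packBulk() and unpackBulk() are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

namespace memcpy_sketch {

// Copies the whole element array with a single memcpy(). This is only
// valid for trivially copyable element types, such as int or double.
template<typename T>
std::vector<unsigned char> packBulk(const std::vector<T> & v)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    std::vector<unsigned char> bytes(v.size() * sizeof(T));
    if (!v.empty())
    {
        std::memcpy(&bytes[0], &v[0], bytes.size());
    }
    return bytes;
}

// Reverses packBulk() on the same platform.
template<typename T>
std::vector<T> unpackBulk(const std::vector<unsigned char> & bytes)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    assert(bytes.size() % sizeof(T) == 0);
    std::vector<T> v(bytes.size() / sizeof(T));
    if (!v.empty())
    {
        std::memcpy(&v[0], &bytes[0], bytes.size());
    }
    return v;
}

} // namespace memcpy_sketch
```

A bulk copy like this depends on the elements being stored contiguously, which is not the case for node-based containers such as std::list<>. A container serialized element by element would therefore plausibly end up with a different byte layout, which is consistent with the separate interchangeability sets described above.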

When a std::wstring is serialized through SF, it is serialized as a UTF-8 encoded string, in order to ensure portability. On platforms with 16 bit wchar_t, SF assumes that any std::wstring passed to it is encoded in UTF-16, and converts between UTF-16 and UTF-8. On platforms with 32 bit wchar_t, SF assumes that any std::wstring passed to it is encoded in UTF-32, and converts between UTF-32 and UTF-8.
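The two conversions differ only in how code points are extracted from the input, as the following standalone sketch shows. It is not SF's conversion code: it assumes well-formed input, uses std::u16string and std::u32string in place of std::wstring so that both cases can be exercised on any platform, and the utf_sketch namespace and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace utf_sketch {

// Appends the UTF-8 encoding of one Unicode code point to 'out'.
inline void appendUtf8(std::string & out, char32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80)
    {
        out += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// 32 bit code units: each unit is already a complete code point.
inline std::string utf32ToUtf8(const std::u32string & s)
{
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        appendUtf8(out, s[i]);
    }
    return out;
}

// 16 bit code units: a high surrogate followed by a low surrogate
// encodes a single code point above U+FFFF.
inline std::string utf16ToUtf8(const std::u16string & s)
{
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        char32_t cp = s[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < s.size())
        {
            char32_t low = s[++i];
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        }
        appendUtf8(out, cp);
    }
    return out;
}

} // namespace utf_sketch
```

Because both paths produce the same UTF-8 bytes for the same text, a string written on a platform with 16 bit wchar_t can be read back on a platform with 32 bit wchar_t, and vice versa.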

If you have std::wstring objects encoded in something other than UTF-16 or UTF-32, you need to either convert them to an 8-bit representation (std::string) yourself before serializing, or write a wrapper class around std::wstring with a customized serialization function.

