RefPtr and PassRefPtr Basics

Darin Adler
first draft, 2007-03-24

History

Many objects in WebKit are reference counted. The pattern used is that classes have member functions ref and deref that increment and decrement the reference count. Each call to ref has to be matched by a call to deref. When the reference count hits 0, the object is deleted. Many of these classes create new objects with a reference count of 0; this is referred to as the floating state. An object in floating state must have ref and then deref called on it before it will be deleted. Many classes in WebCore implement this by inheriting from the Shared class template.
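The ref and deref pattern described above can be sketched in a few lines. This is a minimal illustration, not the actual Shared class template: the names SharedSketch and s_liveTitles are hypothetical, and the Title class here is a stand-in that exists only to count live objects.

```cpp
#include <cassert>

// Hypothetical stand-in for the Shared class template described above.
template<typename T> class SharedSketch {
public:
    SharedSketch() : m_refCount(0) { }   // new objects start in the floating state

    void ref() { ++m_refCount; }

    void deref()
    {
        assert(m_refCount > 0);          // every deref must match an earlier ref
        if (--m_refCount == 0)
            delete static_cast<T*>(this);
    }

    int refCount() const { return m_refCount; }

private:
    int m_refCount;
};

// Illustrative reference-counted class.
class Title : public SharedSketch<Title> {
public:
    Title() { ++s_liveTitles; }
    ~Title() { --s_liveTitles; }
    static int s_liveTitles;             // instrumentation for this example only
};

int Title::s_liveTitles = 0;

// Returns the number of live Title objects after a balanced ref/deref pair.
int liveTitlesAfterBalancedUse()
{
    Title* title = new Title;            // reference count 0 (floating)
    title->ref();                        // reference count 1
    title->deref();                      // reference count 0, object deleted
    return Title::s_liveTitles;
}
```

If the ref and deref calls in liveTitlesAfterBalancedUse were omitted, the object would stay in the floating state and never be deleted, which is the kind of leak described in the next paragraph.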

Back in 2005, we discovered that there were many memory leaks, especially in WebCore editor code, caused either by mismatches of ref and deref calls or by objects that were created with new that never got a ref call at all and remained in the floating state.

We decided that we’d like to use smart pointers to mitigate the problem. However, some early experiments showed that smart pointers led to additional manipulation of reference counts that hurt performance. For example, if a function took a smart pointer as a parameter and returned that same smart pointer as a return value, just passing the parameter and returning the value would increment and then decrement the reference count two to four times as the object moved from one smart pointer to another. So we looked for an idiom that would let us use smart pointers and avoid this reference count churn.

The inspiration for a solution came from the C++ standard class template auto_ptr. These objects implement a model where assignment is transfer of ownership. When you assign from one auto_ptr to another, the donor becomes 0.

Maciej Stachowiak devised a pair of class templates, RefPtr and PassRefPtr, that implement this scheme for WebCore’s intrusive reference counting.

Raw pointers

When discussing smart pointers such as the RefPtr class template, we use the term raw pointer to refer to the C++ language’s built-in pointer type. Here’s the canonical setter function, written with raw pointers:

// example, not preferred style

class Document {
    [...]
    Title* m_title;
};

Document::Document()
    : m_title(0)
{
}

Document::~Document()
{
    if (m_title)
        m_title->deref();
}

void Document::setTitle(Title* t)
{
    if (t)
        t->ref();
    if (m_title)
        m_title->deref();
    m_title = t;
}

RefPtr

RefPtr is a simple smart pointer class that calls ref on incoming values and deref on outgoing values. RefPtr works on any object with both a ref and a deref member function. Here’s the setter function example, written with RefPtr:

// example, not preferred style
 
class Document {
    [...]
    RefPtr<Title> m_title;
};

void Document::setTitle(Title* t)
{
    m_title = t;
}
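To make "ref on incoming values and deref on outgoing values" concrete, here is a simplified sketch of what a RefPtr-like class might look like. It is an assumption-laden illustration rather than WebKit's implementation: RefPtrSketch is a hypothetical name, and the Title and Document classes are minimal stand-ins with an added title() accessor.

```cpp
#include <cassert>

// Illustrative reference-counted class; a stand-in, not WebKit's Title.
class Title {
public:
    Title() : m_refCount(0) { }
    void ref() { ++m_refCount; }
    void deref()
    {
        assert(m_refCount > 0);
        if (--m_refCount == 0)
            delete this;
    }
    int refCount() const { return m_refCount; }
private:
    int m_refCount;
};

// Simplified sketch of a RefPtr-like smart pointer (hypothetical name).
template<typename T> class RefPtrSketch {
public:
    RefPtrSketch() : m_ptr(0) { }
    RefPtrSketch(T* ptr) : m_ptr(ptr) { if (m_ptr) m_ptr->ref(); }
    RefPtrSketch(const RefPtrSketch& other) : m_ptr(other.m_ptr) { if (m_ptr) m_ptr->ref(); }
    ~RefPtrSketch() { if (m_ptr) m_ptr->deref(); }

    RefPtrSketch& operator=(T* ptr)
    {
        if (ptr)
            ptr->ref();           // ref the incoming value first, so self-assignment is safe
        T* old = m_ptr;
        m_ptr = ptr;
        if (old)
            old->deref();         // deref the outgoing value
        return *this;
    }
    RefPtrSketch& operator=(const RefPtrSketch& other) { return *this = other.m_ptr; }

    T* get() const { return m_ptr; }
    T* operator->() const { return m_ptr; }

private:
    T* m_ptr;
};

class Document {
public:
    void setTitle(Title* t) { m_title = t; }
    Title* title() const { return m_title.get(); }   // accessor added for the example
private:
    RefPtrSketch<Title> m_title;
};
```

With this sketch the Document constructor and destructor need no explicit code for m_title: the smart pointer starts at 0 and calls deref when the Document is destroyed.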

Use of RefPtr alone can lead to reference count churn.

// example, not preferred style
 
RefPtr<Node> createSpecialNode()
{
    RefPtr<Node> a = new Node;
    a->setSpecial(true);
    return a;
}

RefPtr<Node> b = createSpecialNode();

The node object starts with a reference count of 0. When it’s assigned to a, the reference count is incremented to 1. The reference count is incremented to 2 to create the return value, then decremented back to 1 when a is destroyed. Then the reference count is incremented to 2 to create b, and then decremented back to 1 when the return value of createSpecialNode is destroyed.

(This analysis ignores the possibility that the compiler might implement the return value optimization. If the compiler does, some of the reference count churn may be mitigated.)

The overhead of reference count churn is even greater when both function arguments and return values are involved. The solution is PassRefPtr.
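The churn can be observed by counting ref and deref calls. The following sketch assumes a copy-only smart pointer in the spirit of RefPtr; RefPtrSketch, passThrough, ChurnReport, and the static counters are hypothetical names introduced for the illustration. The exact number of calls depends on how much copying the compiler elides, so the comments state only what holds in every case.

```cpp
#include <cassert>

// Illustrative node type that counts every ref and deref call.
class Node {
public:
    Node() : m_refCount(0) { }
    void ref() { ++m_refCount; ++s_refCalls; }
    void deref()
    {
        assert(m_refCount > 0);
        ++s_derefCalls;
        if (--m_refCount == 0)
            delete this;
    }
    int refCount() const { return m_refCount; }

    static int s_refCalls;     // instrumentation for this example only
    static int s_derefCalls;
private:
    int m_refCount;
};

int Node::s_refCalls = 0;
int Node::s_derefCalls = 0;

// Copy-only smart pointer in the spirit of RefPtr (hypothetical name).
template<typename T> class RefPtrSketch {
public:
    RefPtrSketch(T* ptr) : m_ptr(ptr) { if (m_ptr) m_ptr->ref(); }
    RefPtrSketch(const RefPtrSketch& other) : m_ptr(other.m_ptr) { if (m_ptr) m_ptr->ref(); }
    ~RefPtrSketch() { if (m_ptr) m_ptr->deref(); }
    T* operator->() const { return m_ptr; }
private:
    RefPtrSketch& operator=(const RefPtrSketch&);   // assignment not needed here
    T* m_ptr;
};

// Takes a smart pointer as an argument and returns the same smart pointer.
RefPtrSketch<Node> passThrough(RefPtrSketch<Node> node)
{
    return node;
}

struct ChurnReport {
    int refCalls;
    int derefCalls;
    int finalRefCount;
};

ChurnReport measureChurn()
{
    RefPtrSketch<Node> a(new Node);
    int refsBefore = Node::s_refCalls;
    int derefsBefore = Node::s_derefCalls;

    // One copy to create the argument and at least one more to create the result.
    RefPtrSketch<Node> b = passThrough(a);

    ChurnReport report = {
        Node::s_refCalls - refsBefore,
        Node::s_derefCalls - derefsBefore,
        a->refCount()
    };
    return report;
}
```

Only one new owner (b) exists after the call, yet the counters record at least two ref calls and one deref call; the extra pair is the churn.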

PassRefPtr

PassRefPtr is like RefPtr with one difference: when you copy a PassRefPtr or assign the value of a PassRefPtr to a RefPtr or another PassRefPtr, the original pointer value is set to 0, and the operation is done without any change to the reference count. Let’s take a look at a new version of our example:

// example, not preferred style

PassRefPtr<Node> createSpecialNode()
{
    PassRefPtr<Node> a = new Node;
    a->setSpecial(true);
    return a;
}

RefPtr<Node> b = createSpecialNode();

The node object starts with a reference count of 0. When it’s assigned to a, the reference count is incremented to 1. Then a gets set to 0 when the return value PassRefPtr is created. Then the return value is set to 0 when b is created.
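One way to get this transfer-on-copy behavior is a copy constructor that takes the pointer away from its source. The sketch below is a simplified illustration under that assumption, not the actual class templates: PassRefPtrSketch, RefPtrSketch, TransferReport, and the s_refCalls counter are hypothetical names. It uses direct initialization for b so that the count of ref calls does not depend on copy elision.

```cpp
#include <cassert>

// Illustrative node type that counts ref calls.
class Node {
public:
    Node() : m_refCount(0), m_special(false) { }
    void ref() { ++m_refCount; ++s_refCalls; }
    void deref()
    {
        assert(m_refCount > 0);
        if (--m_refCount == 0)
            delete this;
    }
    int refCount() const { return m_refCount; }
    void setSpecial(bool special) { m_special = special; }

    static int s_refCalls;   // instrumentation for this example only
private:
    int m_refCount;
    bool m_special;
};

int Node::s_refCalls = 0;

// Sketch of a PassRefPtr-like class: copying transfers ownership.
template<typename T> class PassRefPtrSketch {
public:
    PassRefPtrSketch(T* ptr) : m_ptr(ptr) { if (m_ptr) m_ptr->ref(); }
    // Takes the pointer from the source and sets the source to 0; no ref or deref.
    PassRefPtrSketch(const PassRefPtrSketch& other) : m_ptr(other.releaseRef()) { }
    ~PassRefPtrSketch() { if (m_ptr) m_ptr->deref(); }

    T* get() const { return m_ptr; }
    T* operator->() const { return m_ptr; }
    T* releaseRef() const { T* ptr = m_ptr; m_ptr = 0; return ptr; }
private:
    PassRefPtrSketch& operator=(const PassRefPtrSketch&);   // not needed here
    mutable T* m_ptr;   // mutable so that a const source can be set to 0
};

// Sketch of the receiving side: takes over the reference without a ref call.
template<typename T> class RefPtrSketch {
public:
    RefPtrSketch(const PassRefPtrSketch<T>& source) : m_ptr(source.releaseRef()) { }
    ~RefPtrSketch() { if (m_ptr) m_ptr->deref(); }
    T* operator->() const { return m_ptr; }
private:
    RefPtrSketch(const RefPtrSketch&);                      // not needed here
    RefPtrSketch& operator=(const RefPtrSketch&);
    T* m_ptr;
};

PassRefPtrSketch<Node> createSpecialNode()
{
    PassRefPtrSketch<Node> a = new Node;   // the only ref call happens here
    a->setSpecial(true);
    return a;
}

struct TransferReport {
    int refCalls;
    int finalRefCount;
};

TransferReport measureTransfer()
{
    int refsBefore = Node::s_refCalls;
    RefPtrSketch<Node> b(createSpecialNode());
    TransferReport report = { Node::s_refCalls - refsBefore, b->refCount() };
    return report;
}

// Shows the hazard of transfer on copy: the source no longer points at the object.
bool sourceIsNullAfterCopy()
{
    PassRefPtrSketch<Node> first(new Node);
    PassRefPtrSketch<Node> second(first);   // ownership moves to second
    return first.get() == 0 && second.get() != 0;
}
```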

However, as the Safari team learned when we started programming with PassRefPtr, the rule that a pointer becomes 0 when it’s assigned to another variable can easily lead to mistakes.

// example, not preferred style
 
static RefPtr<Ring> g_oneRingToRuleThemAll;

void finish(PassRefPtr<Ring> ring)
{
    g_oneRingToRuleThemAll = ring;
    ...
    ring->wear();
}

By the time wear is called, ring is already 0. To avoid this, we recommend PassRefPtr only for function argument and result types, copying arguments into RefPtr local variables.

static RefPtr<Ring> g_oneRingToRuleThemAll;

void finish(PassRefPtr<Ring> prpRing)
{
    RefPtr<Ring> ring = prpRing;
    g_oneRingToRuleThemAll = ring;
    ...
    ring->wear();
}

Mixing RefPtr and PassRefPtr

Since we recommend use of RefPtr in all cases except when passing arguments to or returning values from a function, there will be times when you have a RefPtr and wish to transfer ownership as PassRefPtr does. RefPtr has a member function named release which does the trick. It sets the value of the original RefPtr to 0 and constructs a PassRefPtr, without changing reference counts.

PassRefPtr<Node> createSpecialNode()
{
    RefPtr<Node> a = new Node;
    a->setSpecial(true);
    return a.release();
}

RefPtr<Node> b = createSpecialNode();

This keeps the efficiency of PassRefPtr while reducing the chance that its relatively tricky semantics will cause problems.
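A release function along these lines could be sketched as follows. This is an illustration under stated assumptions, not WebKit's code: RefPtrSketch, PassRefPtrSketch, the static adopt helper, and the s_refCalls counter are hypothetical names, and b is direct-initialized so the count of ref calls does not depend on copy elision.

```cpp
#include <cassert>

// Illustrative node type that counts ref calls.
class Node {
public:
    Node() : m_refCount(0), m_special(false) { }
    void ref() { ++m_refCount; ++s_refCalls; }
    void deref()
    {
        assert(m_refCount > 0);
        if (--m_refCount == 0)
            delete this;
    }
    int refCount() const { return m_refCount; }
    void setSpecial(bool special) { m_special = special; }

    static int s_refCalls;   // instrumentation for this example only
private:
    int m_refCount;
    bool m_special;
};

int Node::s_refCalls = 0;

template<typename T> class PassRefPtrSketch {
public:
    PassRefPtrSketch(T* ptr) : m_ptr(ptr) { if (m_ptr) m_ptr->ref(); }
    PassRefPtrSketch(const PassRefPtrSketch& other) : m_ptr(other.releaseRef()) { }
    ~PassRefPtrSketch() { if (m_ptr) m_ptr->deref(); }
    T* releaseRef() const { T* ptr = m_ptr; m_ptr = 0; return ptr; }

    // Wraps a pointer whose reference the caller already holds; no ref call.
    static PassRefPtrSketch adopt(T* ptr)
    {
        PassRefPtrSketch result(0);
        result.m_ptr = ptr;
        return result;
    }
private:
    PassRefPtrSketch& operator=(const PassRefPtrSketch&);   // not needed here
    mutable T* m_ptr;
};

template<typename T> class RefPtrSketch {
public:
    RefPtrSketch(T* ptr) : m_ptr(ptr) { if (m_ptr) m_ptr->ref(); }
    RefPtrSketch(const PassRefPtrSketch<T>& source) : m_ptr(source.releaseRef()) { }
    ~RefPtrSketch() { if (m_ptr) m_ptr->deref(); }

    // Sets this pointer to 0 and hands its reference to the result.
    PassRefPtrSketch<T> release()
    {
        T* ptr = m_ptr;
        m_ptr = 0;
        return PassRefPtrSketch<T>::adopt(ptr);
    }

    T* get() const { return m_ptr; }
    T* operator->() const { return m_ptr; }
private:
    RefPtrSketch(const RefPtrSketch&);                      // not needed here
    RefPtrSketch& operator=(const RefPtrSketch&);
    T* m_ptr;
};

PassRefPtrSketch<Node> createSpecialNode()
{
    RefPtrSketch<Node> a(new Node);   // the only ref call happens here
    a->setSpecial(true);
    return a.release();               // a becomes 0; the count stays at 1
}

struct ReleaseReport {
    int refCalls;
    int finalRefCount;
};

ReleaseReport measureRelease()
{
    int refsBefore = Node::s_refCalls;
    RefPtrSketch<Node> b(createSpecialNode());
    ReleaseReport report = { Node::s_refCalls - refsBefore, b->refCount() };
    return report;
}
```

Unlike a PassRefPtr local variable, the RefPtr local a keeps its value until the explicit release call, so any code added between its creation and the return statement can still use it safely.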

Mixing with raw pointers

When using a RefPtr to call a function that takes a raw pointer, use get.

printNode(stderr, a.get());

However, there are operations that can be done on a RefPtr or PassRefPtr directly, without resorting to an explicit get call.

RefPtr<Node> a = createSpecialNode();
Node* b = getOrdinaryNode();

// the * operator
*a = value;

// the -> operator
a->clear();

// null check in an if statement
if (a)
    log("not empty");

// the ! operator
if (!a)
    log("empty");

// the == and != operators, mixing with raw pointers
if (a == b)
    log("equal");
if (a != b)
    log("not equal");

// some type casts
RefPtr<DerivedNode> d = static_pointer_cast<DerivedNode>(a);

Normally, RefPtr and PassRefPtr enforce a simple rule: they always balance ref and deref calls, guaranteeing that a programmer can’t miss a deref. But when we have a raw pointer to an object that already has a reference count, and we want to transfer ownership of that reference to a smart pointer, the adoptRef function should be used.

// warning, requires a pointer that already has a ref
RefPtr<Node> node = adoptRef(rawNodePointer);

To transfer from a RefPtr to a raw pointer without changing the reference count, PassRefPtr provides the releaseRef function.

// warning, results in a pointer that must get an explicit deref
RefPtr<Node> node = createSpecialNode();
Node* rawNodePointer = node.release().releaseRef();

Since releaseRef is rarely used, it’s provided only in the PassRefPtr class, hence the need to call release, then releaseRef. If we find this is used often we could provide releaseRef for RefPtr too.
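The round trip between raw pointers and smart pointers can be sketched as below. This is a simplified illustration, assuming the semantics described above; adoptRefSketch, RefPtrSketch, PassRefPtrSketch, and roundTripKeepsCountAtOne are hypothetical names, not WebKit's functions.

```cpp
#include <cassert>

// Illustrative reference-counted node.
class Node {
public:
    Node() : m_refCount(0) { }
    void ref() { ++m_refCount; }
    void deref()
    {
        assert(m_refCount > 0);
        if (--m_refCount == 0)
            delete this;
    }
    int refCount() const { return m_refCount; }
private:
    int m_refCount;
};

template<typename T> class PassRefPtrSketch {
public:
    PassRefPtrSketch(T* ptr) : m_ptr(ptr) { if (m_ptr) m_ptr->ref(); }
    PassRefPtrSketch(const PassRefPtrSketch& other) : m_ptr(other.releaseRef()) { }
    ~PassRefPtrSketch() { if (m_ptr) m_ptr->deref(); }

    // Gives up ownership: the caller must eventually call deref on the result.
    T* releaseRef() const { T* ptr = m_ptr; m_ptr = 0; return ptr; }

    // Takes over a reference the caller already holds; no ref call.
    static PassRefPtrSketch adopt(T* ptr)
    {
        PassRefPtrSketch result(0);
        result.m_ptr = ptr;
        return result;
    }
private:
    PassRefPtrSketch& operator=(const PassRefPtrSketch&);   // not needed here
    mutable T* m_ptr;
};

// Free function in the spirit of adoptRef (hypothetical name).
template<typename T> PassRefPtrSketch<T> adoptRefSketch(T* ptr)
{
    return PassRefPtrSketch<T>::adopt(ptr);
}

template<typename T> class RefPtrSketch {
public:
    RefPtrSketch(const PassRefPtrSketch<T>& source) : m_ptr(source.releaseRef()) { }
    ~RefPtrSketch() { if (m_ptr) m_ptr->deref(); }
    PassRefPtrSketch<T> release()
    {
        T* ptr = m_ptr;
        m_ptr = 0;
        return PassRefPtrSketch<T>::adopt(ptr);
    }
    T* get() const { return m_ptr; }
private:
    RefPtrSketch(const RefPtrSketch&);                      // not needed here
    RefPtrSketch& operator=(const RefPtrSketch&);
    T* m_ptr;
};

// Moves one reference from a raw pointer into a smart pointer and back out.
bool roundTripKeepsCountAtOne()
{
    Node* rawNodePointer = new Node;
    rawNodePointer->ref();                                    // reference held by hand

    RefPtrSketch<Node> node(adoptRefSketch(rawNodePointer));  // count stays at 1
    bool adoptedWithoutRef = rawNodePointer->refCount() == 1;

    Node* released = node.release().releaseRef();             // count stays at 1
    bool releasedWithoutDeref = released->refCount() == 1 && node.get() == 0;

    released->deref();                                        // explicit deref deletes the node
    return adoptedWithoutRef && releasedWithoutDeref;
}
```

Both directions bypass the automatic balancing of ref and deref, which is why the examples above carry warnings: the code on each side of the transfer is responsible for the reference it hands over or receives.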

Guidelines

We’ve developed these guidelines for use of RefPtr and PassRefPtr in WebKit code.

Local variables

Data members

Function arguments

Function results

New objects

Improving this document

What frequently asked questions are not covered by this document?

Which of these topics should also be covered by this document?

Any other ideas about improving the clarity, scope, or presentation?