Friday, July 18, 2008

Restricting method access in Perl objects

This a geeks-only entry in the "Perl: Handy, but Ugly" series...

I often want to restrict access to certain methods in a class. One classic example is public and private methods. As another, I've written a class for data storage with both read and write methods, and sometimes I want an instance to be read-only, and other times write-only. I could implement this with an internal read/write flag. However, while I want that flag to be flippable, I don't want just anyone flipping it. That sort of thing is hard to do in Perl because it doesn't believe in enforced privacy.

Fortunately, Perl does believe in being powerful and flexible. So I've found a neat way of wrapping object instances in what I call adapters, which expose only a subset of the object's methods.

The basic desiderata are as follows:

  1. The adapter should be an object wrapping another object.
  2. It should only define the methods it exposes, so that the wrapped object's unexposed methods aren't even there
  3. There should be no way of getting to the wrapped object through the adapter (otherwise, you can get to the unexposed methods)
  4. Finally, I don't want to write a new adapter for every class I want to wrap, or every subset of methods I want to expose

Wait a second, you say. I want adapters to be classes defining a custom set of methods, but I don't want to write a new adapter each time? Yes. And because Perl is "Handy, but Ugly", I can do it.

The trick is that Perl gives you direct access to the symbol table: that magical hash that knows what reference you mean when you use a variable or subroutine name in your code. And since a class is just a set of symbols, it's possible to create a class entirely on the fly just by inserting the proper subroutine references into the symbol table.

With that, I present my AdapterFactory perl module. It's fairly well commented, so I'll leave groking it as an exercise for the reader. A couple of hints:

  • With no strict, a string can be dereferenced as if it were a reference to the variable whose name is the string's value. This works only for non-lexical variables (i.e., those not defined with "my"). For instance, $h = "hash"; %$h is equivalent to $h = \%hash; %$h, or %hash
  • For some reason, even with use strict, strings on either side of the arrow operator can be dereferenced to the package or method whose name is the string value. For instance, $p = "Package"; $m = "new"; $p->$m() is equivalent to Package->new()
  • The symbols for a package are kept in a hash with the name of the package plus "::". Thus, symbols for package "foo" are kept in hash %foo::
  • The * sigil is used to set values in the symbol table
# Author: Pedro DeRose
# Creates adapters, or objects that wrap another object, but expose only a
# subset of its methods. Useful for separating public/private methods, or
# restricting functionality. Does not provide any handle to the object itself.
# Usage example:
#     use AdapterFactory qw(defineAdapter adapt)
#     defineAdapter('Foo::Public', [ qw(get set print) ]);
#     my $fooAdapter = AdapterFactory::Foo::Public->new($fooObj);
#     my $barAdapter = adapt('Foo::Public', $barObj);
#     defineAdapter('Foo::Private', { secret => [ 'default' ] });
#   Defines the AdapterFactory::Foo::Public adapter exposing the get(), set(),
#   and print() methods, then creates adapters wrapping $fooObj and $barObj.
#   Finally, defines the AdapterFactory::Foo::Private adapter exposing the
#   secret() method, and specifies that "default" should always be passed to it.
package AdapterFactory;
use Exporter 'import';
@EXPORT_OK = qw(defineAdapter adapterDefined adapt);

use strict;

# Keep map of adapter to object as a lexical variable so that adapter objects
# don't store the object themselves, where other code can get to it.
my %adapterToObj;

# Defines a new adapter class whose name is the name of this class, plus "::"
# then the given name appended (e.g., given name "Foo::Bar", the name is
# "AdapterFactory::Foo::Bar"). It wraps the object passed to its new()
# constructor, exposing the specified methods. Methods can be specified in two
# ways. When an array reference of method names, they are called directly. When
# a map from method name to an array reference of arguments, the adapter's
# methods call the wrapped object's methods with the given arguments always
# appended. See the usage example above for how to use the adapter.
#   name: the name of the adapter class
#   methods_r: reference to methods to expose
#   returns true if the definition was successful, false otherwise
sub defineAdapter {
    my ($name, $methods_r) = @_;
    $name or die "Missing name";
    ref($methods_r) eq 'HASH' or ref($methods_r) eq 'ARRAY' or die "Bad methods";

    if(adapterDefined($name)) {
        warn "Adapter $name already exists.";
        return undef;

    # Lots of symbol table manipulation, so stop yer whining
    no strict;

    # Compose the adapter class name
    my $class = __PACKAGE__."\::$name";

    # Turn method array ref into method hash ref with no method arguments
    if(ref($methods_r) eq 'ARRAY') { $methods_r = { map { ($_ => []) } @$methods_r }; }

    # Directly create symbol table entry for each exposed method.
    foreach my $method (keys %$methods_r) {
        my @args = defined($methods_r->{$method})? @{$methods_r->{$method}} : ();
        *{"$class\::$method"} = sub {
            # Look up object using adapter's reference, then call the method
            my $self = shift;
            return $adapterToObj{$self}->$method(@_, @args)

    # Create the constructor last, so it clobbers any "new" method in methods_r
    *{"$class\::new"} = sub {
        my ($class, $obj_r) = @_;

        # Map the given obj to this adapter
        my $self = {};
        bless($self, $class);
        $adapterToObj{$self} = $obj_r;

        return $self;

    return 1;

# Returns whether an adapter with the given name is already defined
#   name: the name of the adapter class
#   returns true if an adapter with the name is defined, false otherwise
sub adapterDefined {
    my ($name) = @_;
    no strict;
    return scalar(%{__PACKAGE__."\::$name\::"});

# Creates and returns an adapter for a given object. Equivalent to calling the
# new() constructor on the adapter created with the given name, and passing the
# given object.
#   name: the name of the adapter class
#   obj_r: reference to the object being wrapped
sub adapt {
    my ($name, $obj_r) = @_;
    $name or die "Missing name";
    UNIVERSAL::isa($obj_r, 'UNIVERSAL') or die "Object must be a blessed reference";

    # Create and return the adapter
    my $class = __PACKAGE__."\::$name";
    return $class->new($obj_r);


Thursday, July 3, 2008

Perl: Handy, but Ugly

In what will probably be a many-part series, here's an oddity of Perl that had me tearing out my hair for a couple of hours...

If you know Perl well, feel free to skip this paragraph. Perl has a handy but ugly notion of context. Specifically, code execute in either a scalar or a list context: if a single value is expected, the code executes in scalar context; if a list of values is expected, it executes in list context (that's vague, but good enough for now). Then, code behaves differently depending on the context.

One example of context is getting the length of a list. Given a list @foo = ('a', 'b', 'c'), then @foo in scalar context is the length of @foo. Thus, $x = @foo sets the single value $x to 3 (the code executes in scalar context because $x is a single value, so Perl expects a single value assigned to it).

Now for a pop quiz. If @foo = ('a', 'b', 'c'); $x = @foo sets $x to 3, what does $x = ('a', 'b', 'c') do? Turns out it sets $x to c. Fascinating, isn't it?

The reason is that the comma does different things in list and scalar contexts. In a list context, comma is the list building operator. Thus, ('a', 'b', 'c') in list context (such as when assigned to the list variable @foo) returns a list with three items. However, in scalar context, comma is like C's comma: it executes both its left and right operands, then returns the result of the right. For instance, 'a', 'b' returns b, and 'a', 'b', 'c' returns c. Thus, when we assign ('a', 'b', 'c') to a single value, the code executes in scalar context, returning c.

Of course, I wasn't lucky enough to have this bite me in such a simple form. Instead, consider this (still heavily simplified) example:

sub foo {
    $a = "hello";
    $b = "world";
    return ($a, $b);

print join(" ", foo()) . "\n";
print scalar(foo()) . "\n";
I naively thought this would print hello world then 2. Instead, we get hello world then world. Today's lesson, then: when returning lists from functions, assign them to a list variable first.