Finding Circular Reference leaks in Perl
Dynamically typed languages such as Perl, Ruby, PHP, and Python free you, the developer, from managing memory in your application. However, automatic memory management is not a foolproof guarantee against memory leaks. You should know how your language's garbage collector works so that you can compensate for the weaknesses of its algorithm.
Broadly, there are two approaches to garbage collection: tracing collectors such as mark and sweep, and reference counting. The Perl interpreter uses the latter. Reference counting is a fairly simple technique. Every value carries a count of the references pointing at it. Each new reference increments the count by one, and each reference that goes away (for example, a variable reaching the end of its scope) decrements it. When the count drops to zero, the value is freed immediately. If anything else still refers to it, the count stays above zero and the value is kept. The main drawback of reference counting is that it can't deal with circular references: when two objects point to each other, neither count ever reaches zero, so neither is ever collected.
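To make the counting concrete, here is a minimal sketch that reads the reference count of a hash as references to it come and go. It uses the core B module to peek at the interpreter's internal count; the variable names are only for illustration.

```perl
use strict;
use warnings;
use B ();   # core module that exposes interpreter internals

my $data = { name => 'victor' };                     # one reference: $data
my $count_one = B::svref_2object($data)->REFCNT;

my $alias = $data;                                   # a second reference to the same hash
my $count_two = B::svref_2object($data)->REFCNT;

undef $alias;                                        # the second reference goes away
my $count_after = B::svref_2object($data)->REFCNT;

print "with one reference:    $count_one\n";
print "with two references:   $count_two\n";
print "after dropping one:    $count_after\n";
```

The count rises when $alias is created and falls again when it is cleared. Only when the last reference disappears does Perl free the hash.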
Ruby and Java, on the other hand, use mark and sweep garbage collectors. I personally have mixed feelings about this approach, since I don’t know exactly when my objects will be collected. A mark and sweep collector does not free anything at the moment an object becomes unreachable. Instead, it runs periodically, typically when the heap fills up. The downside is that you don’t know exactly when this will happen, and if there are lots of objects to collect, the pause can cause “stutters” and make the application unresponsive. If you have ever used a Java Swing application you might have noticed these stutters; that is garbage collection taking place. However, mark and sweep is not as gloomy as I have made it sound. Unlike reference counting, it can handle cyclic references, which is a huge boon to its usefulness. Much work has gone into mark and sweep collectors, specifically generational collectors that try to fix the unresponsiveness issue. Java currently uses a generational GC, and Ruby hopes to get one for the Ruby 2.0 interpreter. Ideally, a generational garbage collector would be the preferred GC for a long-running process.
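As a rough illustration of why mark and sweep copes with cycles, here is a toy model written in Perl. It is not how the JVM or the Ruby interpreter is implemented: the "heap" is just a hash of made-up object ids, and the names mark and sweep are hypothetical helpers for this sketch.

```perl
use strict;
use warnings;

# Toy heap: each object id maps to the ids it references.
my %heap = (
    root_obj => ['a'],
    a        => ['b'],
    b        => ['a'],   # a <-> b cycle, reachable from the root
    c        => ['d'],
    d        => ['c'],   # c <-> d cycle, not reachable from any root
);
my @roots = ('root_obj');

# Mark phase: walk everything reachable from the roots.
sub mark {
    my ($heap, $roots) = @_;
    my %marked;
    my @stack = @$roots;
    while (@stack) {
        my $id = pop @stack;
        next if $marked{$id}++;                  # already visited
        push @stack, @{ $heap->{$id} || [] };
    }
    return \%marked;
}

# Sweep phase: free everything that was not marked.
sub sweep {
    my ($heap, $marked) = @_;
    my @freed = sort grep { !$marked->{$_} } keys %$heap;
    delete @{$heap}{@freed};
    return @freed;
}

my $marked = mark(\%heap, \@roots);
my @freed  = sweep(\%heap, $marked);

print "freed: @freed\n";
print "kept:  ", join(' ', sort keys %heap), "\n";
```

The c/d pair refer to each other, yet both are swept because nothing reachable from a root points at them. A reference counter would see a count of one on each and keep both forever.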
With that little garbage collection background out of the way, let’s look at the life cycle of an instance under a reference counting garbage collector.
Here is an example of how reference counting works ideally:
    foreach (1..5) {
        my $i = 5;
        $i += 5;
        print $i . "\n";
    }   # $i should be garbage collected when it goes out of scope.
Unlike with mark and sweep garbage collection, with reference counting you know exactly when your instance gets collected.
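One way to see this predictability for yourself is to give a class a DESTROY method and record when it fires. The Tracked class and the @log array below are made up for this sketch.

```perl
use strict;
use warnings;

my @log;

{
    package Tracked;   # hypothetical class, only for this demonstration
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { my ($self) = @_; push @log, "destroyed $self->{name}" }
}

for my $n (1 .. 3) {
    my $obj = Tracked->new("obj$n");
    push @log, "end of iteration $n";
}   # $obj loses its last reference here, so DESTROY runs right away

push @log, 'after loop';
print "$_\n" for @log;
```

Each object is destroyed at the end of the iteration that created it, before the next iteration starts, rather than at some later point chosen by a collector.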
Here is a very simple problematic case for reference counting:
    foreach (1..5) {
        my $a = {};    # both must be hash references before they are linked
        my $b = {};
        $a->{b} = $b;
        $b->{a} = $a;
    }   # since both are pointing to each other they will never get collected.
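If you want to confirm a leak like this without any extra modules, one possible approach is to keep a weak "probe" reference to the structure and check whether it is still defined after the scope ends. This sketch uses only Scalar::Util, which ships with Perl; the probe variables are hypothetical names.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my ($probe_plain, $probe_cycle);

{
    my $plain = { name => 'no cycle' };
    $probe_plain = $plain;
    weaken($probe_plain);   # the probe does not keep the hash alive
}

{
    my $x = {};
    my $y = {};
    $x->{y} = $y;
    $y->{x} = $x;           # circular reference
    $probe_cycle = $x;
    weaken($probe_cycle);
}

# A weak reference becomes undef once its target has been freed.
print 'plain hash:  ', (defined $probe_plain ? 'still alive (leaked)' : 'freed'), "\n";
print 'cyclic hash: ', (defined $probe_cycle ? 'still alive (leaked)' : 'freed'), "\n";
```

The plain hash is freed when its block ends, while the cyclic pair survives even though no variable in the program can reach it through a strong reference anymore.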
This is a fairly simple case where reference counting falls flat on its face. Usually this isn’t a problem, since most Perl programs are short-lived scripts and all of their memory is returned to the operating system when the process exits. However, with frameworks such as Catalyst, which run as long-lived Perl processes, it quickly becomes an issue. Thankfully, Perl makes it extremely easy to nail down memory leaks, more so than Ruby or Java. Enter Devel::Cycle and Devel::Peek. Devel::Cycle can be installed from CPAN, and Devel::Peek ships with Perl itself. Together they can help you track down a memory leak in a relatively short time.
    use Devel::Cycle;
    use Devel::Peek;

    foreach (1) {
        my $parent = { name => 'victor' };
        my $child  = { name => 'victor jr' };
        $parent->{child} = $child;
        $child->{parent} = $parent;

        # find_cycle belongs to Devel::Cycle;
        # it prints out the circular reference to STDOUT
        find_cycle($parent);

        # Dump belongs to Devel::Peek; it is extra verbose
        # and prints out the reference counts to STDERR
        Dump($parent);
    }
# Sample output
# ibook:~/Desktop victori$ perl blah.pl
# Cycle (1): <-- find_cycle tells you literally where the cyclic reference leak is.
# $A->{'child'} => %B
# $B->{'parent'} => %A
#
# SV = RV(0x1817898) at 0x1800ec8
# REFCNT = 1
# FLAGS = (PADBUSY,PADMY,ROK)
# RV = 0x18006dc
# SV = PVHV(0x1830980) at 0x18006dc
# REFCNT = 2 <-- Notice the reference count of 2: one from $parent and one from $child->{parent}, so it can never drop to 0 and we know we have a leak
# FLAGS = (SHAREKEYS)
# IV = 2
# NV = 0
# ARRAY = 0x404e60 (0:6, 1:2)
# hash quality = 125.0%
# KEYS = 2
# FILL = 2
# MAX = 7
# RITER = -1
# EITER = 0x0
# Elt "name" HASH = 0xe6e17f14
# SV = PV(0x1801460) at 0x1800ea4
# REFCNT = 1
# FLAGS = (POK,pPOK)
# PV = 0x401730 "victor"
# CUR = 6
# LEN = 8
# Elt "child" HASH = 0x33ec6b5
# SV = RV(0x1817870) at 0x1832ca4
# REFCNT = 1
# FLAGS = (ROK)
# RV = 0x1800484
# SV = PVHV(0x18309b0) at 0x1800484
# REFCNT = 2
# FLAGS = (SHAREKEYS)
# IV = 2
# NV = 0
# ARRAY = 0x404db0 (0:6, 1:2)
# hash quality = 125.0%
# KEYS = 2
# FILL = 2
# MAX = 7
# RITER = -1
# EITER = 0x0
# Elt "parent" HASH = 0xa99c4651
# SV = RV(0x18178a0) at 0x1832c44
# REFCNT = 1
# FLAGS = (ROK)
# RV = 0x18006dc
So how do we fix this? Quite simply: we weaken one of the references using weaken() from Scalar::Util. A weak reference does not count toward the reference count of the thing it points at. Here is a proper way of patching up the memory leak we introduced in our program.
    use Devel::Cycle;
    use Devel::Peek;
    use Scalar::Util qw/weaken/;

    foreach (1) {
        my $parent = { name => 'victor' };
        my $child  = { name => 'victor jr' };

        # we weaken the reference at the parent and all is well
        weaken($parent->{child} = $child);
        $child->{parent} = $parent;

        # find_cycle belongs to Devel::Cycle;
        # it no longer has a circular reference to report
        find_cycle($parent);

        # Dump belongs to Devel::Peek; it is extra verbose
        # and prints out the reference counts to STDERR
        Dump($parent);
    }
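Weakening the reference held by the parent means it no longer counts toward the child's reference count, which drops back to 1. When $parent and $child reach the end of scope, the child is freed, which releases its reference to the parent, so the parent is freed as well and the memory leak is no more. Note that in a real parent/child structure it is more common to weaken the child's back-reference ($child->{parent}) instead, so that the parent keeps its children alive.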
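If Devel::Cycle is not at hand, the fix can also be checked with Scalar::Util alone. This is a minimal sketch: isweak() confirms the reference really was weakened, and two weak probe variables (hypothetical names) show whether the hashes were freed once the scope ended.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my ($probe_parent, $probe_child, $was_weak);

{
    my $parent = { name => 'victor' };
    my $child  = { name => 'victor jr' };

    $parent->{child} = $child;
    weaken($parent->{child});        # same effect as weaken($parent->{child} = $child)
    $child->{parent} = $parent;

    $was_weak = isweak($parent->{child}) ? 1 : 0;

    # Weak probes observe the hashes without keeping them alive.
    weaken($probe_parent = $parent);
    weaken($probe_child  = $child);
}

print "parent->{child} was weak: $was_weak\n";
print 'parent: ', (defined $probe_parent ? 'leaked' : 'freed'), "\n";
print 'child:  ', (defined $probe_child  ? 'leaked' : 'freed'), "\n";
```

Both probes turn undef once the block ends, which is what you would expect if the cycle has truly been broken.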
Hopefully this is a good primer for other Perl coders out there who are facing memory leaks in their long-running Perl scripts.
Note that you can do automated testing of these leaks with Test::Memory::Cycle, which wraps around Devel::Cycle.
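A test along those lines might look like the sketch below. It assumes Test::Memory::Cycle is installed from CPAN and skips itself otherwise; the parent/child structure is the one from the article, except that here the child's back-reference is the one weakened.

```perl
use strict;
use warnings;
use Test::More;

BEGIN {
    # Test::Memory::Cycle is a CPAN module; skip cleanly if it is missing.
    eval { require Test::Memory::Cycle; Test::Memory::Cycle->import; 1 }
        or plan skip_all => 'Test::Memory::Cycle is not installed';
}
use Scalar::Util qw(weaken);

my $parent = { name => 'victor' };
my $child  = { name => 'victor jr' };
$parent->{child} = $child;
$child->{parent} = $parent;
weaken($child->{parent});   # the back-reference is weak, so no strong cycle remains

# Fails the test (and prints the cycle) if a strong circular reference is found.
memory_cycle_ok($parent, 'parent/child structure has no strong cycles');

done_testing();
```

Removing the weaken() line should make the test fail, which is a handy way to guard against a leak creeping back in later.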
Thanks for the article.
It is unbelievable in this day and age that a memory-managed language would use reference counting. Bad, Perl. Bad.
@Ted
Why is it unbelievable? Python uses it too. I imagine quite a few other interpreted languages do as well. They all have practical reasons for it.
Most notably, low overhead and ease of implementation.
I would have to agree with j_king: reference counting garbage collection is extremely predictable compared to, say, the JVM GC algorithms. Sure, some patterns such as the strategy pattern might be more difficult to implement correctly without leaking, but at the end of the day garbage collection times are predictable.
** By strategy pattern I mean a strategy implementation that needs access to the parent instance, creating a circular reference leak.
The problem with Perl is that it has no permanently weak reference. In OO code, each time you write ($self) = @_ you get a copy that is a strong reference, so you have to weaken it again whenever you pass it along, and check whether it has been weakened. Unfortunately, even blessing a reference does not create an instance that could be permanently weakened through some real instance method (that is, a method belonging only to a particular instance), because a Perl instance is only a data structure associated with a package, without internals of its own. You can use Moose as a workaround, since Moose has a weak_ref attribute option, and Moose may be a nice thing to use anyway; still, it is a workaround.