Not all malloc implementations are created equal
I recently blogged about swapping malloc implementations for the JVM to help boost multi-threaded performance. Well, Solaris ships with yet another malloc implementation, bsdmalloc, that is optimized for single-threaded performance. I recently switched our Perl interpreter to bsdmalloc and got 33% faster performance from our Perlbal proxy.
You can try out multiple malloc implementations by setting the LD_PRELOAD environment variable:
LD_PRELOAD="/usr/lib/libbsdmalloc.so" perl somecode.pl
So here is the rule of thumb for which malloc implementation to use for your application.
libumem = For multithreaded applications. umem avoids thread heap contention and is highly optimized for multi-threaded applications.
bsdmalloc = For single threaded applications. PHP/Perl/Python and Ruby will fall into this category.
Matching the malloc implementation to your resource-intensive application can yield a nice performance benefit.
For those interested, here is how we run Fabulously40:
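If you want to compare allocators before committing to one, a loop like the following can help. This is only a sketch: the helper names are made up for illustration, the library paths are the ones mentioned in these posts, and bash is assumed.

```shell
#!/bin/bash
# Sketch: time one command under each malloc implementation.
# Library paths are the ones used in these posts; adjust for your system.

# Map a short allocator name to the library to preload (helper name is made up).
preload_for() {
    case "$1" in
        default)  echo "" ;;
        bsd)      echo "/usr/lib/libbsdmalloc.so" ;;
        umem)     echo "/usr/lib/libumem.so" ;;
        mtmalloc) echo "/usr/lib/libmtmalloc.so" ;;
        *)        return 1 ;;
    esac
}

# Run the given command once per allocator and print timing for each run.
compare_mallocs() {
    local name lib
    for name in default bsd umem mtmalloc; do
        lib=$(preload_for "$name") || continue
        echo "== $name =="
        if [ -n "$lib" ]; then
            time env LD_PRELOAD="$lib" "$@"
        else
            time "$@"
        fi
    done
}
```

Something like `compare_mallocs perl somecode.pl` would then run the script four times. A single run is noisy, so repeat it a few times before trusting the numbers.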
1. Single server, OpenSolaris / 8Gigs RAM / Quad Xeon x5355 / 100Mbit line.
2. Static and dynamic data cached up front on varnish
3. Even though Nginx can handle L7 load balancing, Perlbal offers better flexibility with its plugin system
4. Jetty application servers easily scale out by using memcached as the session store
5. Write intensive operations are done asynchronously via the ActiveMQ message store system
6. One PostgreSQL database on RAID1 with a hot standby database on a third disk.
The application can do 6,000+ req/sec with the Varnish cache and 80-120 req/sec without it. The platform uses Wicket, Hibernate and Spring for its internals.
There you have it.

Apparently Solaris comes with some crummy settings for web hosting. Here are the settings I have used to improve our web performance at our service.
victori@fab40:/etc/rc2.d# netstat -sP tcp | grep -i drop
        tcpTimRetransDrop   =  6029     tcpTimKeepalive     =  2467
        tcpListenDrop       = 27327     tcpListenDropQ0     =     0
        tcpHalfOpenDrop     =     0     tcpOutSackRetrans   = 99988
If tcpListenDrop is above 0, the system has been dropping connections because the listen queue filled up under the default settings. Increasing tcp_conn_req_max_q should fix the issue. Raise the value incrementally until tcpListenDrop stops increasing.
The tcp_conn_req_max_q default is 128; the tcp_conn_req_max_q0 default is 1024.
/usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q 8192
/usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q0 8192
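To tell whether the counter is still climbing after a change, it helps to pull out just the tcpListenDrop value and print the delta every few seconds. A rough sketch, assuming bash and the netstat output layout shown above; the function names are made up.

```shell
#!/bin/bash
# Sketch: extract tcpListenDrop from `netstat -sP tcp` output read on stdin.
# The pattern does not match tcpListenDropQ0, because "Q0" follows the name there.
listen_drops() {
    sed -n 's/.*tcpListenDrop[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Print the counter and its growth every N seconds (default 10). Stop with Ctrl-C.
watch_listen_drops() {
    local interval=${1:-10} prev cur
    prev=$(netstat -sP tcp | listen_drops)
    while sleep "$interval"; do
        cur=$(netstat -sP tcp | listen_drops)
        echo "tcpListenDrop=$cur (+$((cur - prev)) in ${interval}s)"
        prev=$cur
    done
}
```

If the delta stays at +0 under normal load, the queue is probably large enough.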
Lower the bottom of the anonymous port range, which widens the range, to support the larger connection queue defined above.
/usr/sbin/ndd -set /dev/tcp tcp_smallest_anon_port 2048
Up the buffer size for transmissions to, you know... actually make use of that 100 Mbit connection.
/usr/sbin/ndd -set /dev/tcp tcp_xmit_hiwat 1048576
/usr/sbin/ndd -set /dev/tcp tcp_recv_hiwat 1048576
/usr/sbin/ndd -set /dev/tcp tcp_max_buf 2097152
To persist these settings across a reboot, write the commands out to a bash script at /etc/rc2.d/S99netoptimize.
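As a rough idea of what that file could look like, here is a sketch that reapplies the six settings from this post when the script is run with `start`. The function name and the NDD override variable are my own additions for illustration, not anything Solaris requires.

```shell
#!/bin/bash
# /etc/rc2.d/S99netoptimize (sketch): reapply the TCP tunables at boot.
# NDD can be overridden for a dry run, e.g. NDD=echo (illustrative convenience).
NDD=${NDD:-/usr/sbin/ndd}

# Apply each "name value" pair listed below to /dev/tcp.
apply_tcp_tunables() {
    local name value
    while read -r name value; do
        [ -n "$name" ] || continue
        "$NDD" -set /dev/tcp "$name" "$value"
    done <<'EOF'
tcp_conn_req_max_q 8192
tcp_conn_req_max_q0 8192
tcp_smallest_anon_port 2048
tcp_xmit_hiwat 1048576
tcp_recv_hiwat 1048576
tcp_max_buf 2097152
EOF
}

# Only act when invoked with "start", as rc scripts are at boot.
case "${1:-}" in
    start) apply_tcp_tunables ;;
esac
```

Running `NDD=echo /etc/rc2.d/S99netoptimize start` prints the ndd arguments without changing anything, which is a cheap way to check the list before a reboot.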
I was told Solaris was configured out of the box for today’s hardware? wtf?
OpenSolaris uses a malloc tuned for single-threaded use by default for all applications. The JDK compiled for Solaris is not linked against mtmalloc or the newer umem malloc implementation, both of which are optimized for multithreading. In a multithreaded application, using a single-threaded malloc can degrade performance. When memory is allocated concurrently from multiple threads, all the threads must wait in a queue while malloc() handles one request at a time. This is called heap contention. To get around this contention point, you can force the JDK to use the umem malloc.
LD_PRELOAD=/usr/lib/libumem.so /opt/jdk1.7.0/bin/java -jar start.jar
or
LD_PRELOAD=/usr/lib/libmtmalloc.so /opt/jdk1.7.0/bin/java -jar start.jar
This simple fix has really improved performance on our web service, Fabulously40. The application went from serving 120 req/sec uncached to 170 req/sec. Not bad, no?
This also works wonders for MySQL and Varnish, two applications that really put those threads to use. We have dropped 100ms in response time with Varnish just by using umem as the malloc implementation.
Our logs at Fabulously40 grow quite large each day. I needed a way to rotate our nginx logs so they do not grow uncontrollably.
Here is the recipe:
sh# logadm -w /opt/extra/nginx/logs/fab40.access.log -s 100m -a 'kill -USR1 `cat /opt/extra/nginx/logs/nginx.pid`'
# check /etc/logadm.conf
sh# cat /etc/logadm.conf
# if you wish to test it out right away, run logadm
sh# /usr/sbin/logadm
Simple, no?
I *really* needed session affinity for our Wicket application. HAProxy does session affinity but can't be reconfigured at runtime without a restart. Perlbal is much more configurable: it lets you add and remove nodes in a pool at runtime. This makes deploying a new version of our web application a lot easier, because I can test a new version of our application before putting it back into the pool of active nodes.
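For reference, pulling a node out of a pool at runtime means sending a `POOL <name> REMOVE <ip:port>` line to Perlbal's management service, and `POOL <name> ADD <ip:port>` to put it back. The sketch below only builds those command lines. The pool name, node address and management port in the usage note are placeholders, and the helper name is made up.

```shell
#!/bin/bash
# Sketch: build a Perlbal pool management command.
# Usage: pool_cmd POOLNAME add|remove IP:PORT
pool_cmd() {
    local pool=$1 action=$2 node=$3
    case "$action" in
        add|remove) ;;
        *) echo "usage: pool_cmd POOL add|remove IP:PORT" >&2; return 1 ;;
    esac
    # Perlbal commands are conventionally written in upper case.
    action=$(printf '%s' "$action" | tr '[:lower:]' '[:upper:]')
    printf 'POOL %s %s %s\n' "$pool" "$action" "$node"
}
```

Assuming a management service listening on 127.0.0.1:60000 and a pool called webpool (both hypothetical), `pool_cmd webpool remove 10.0.0.2:8080 | nc 127.0.0.1 60000` would take that node out before a deploy, and the matching `add` would return it afterwards.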
This is my first attempt at writing a sticky sessions plugin for Perlbal.
Update 06/26/09 Now on github perlbal-plugin-stickysessions
Update 04/30/09 Added Perlbal::XS::HTTPHeaders Support. Faster header parsing performance.
Update Fixed the Set-Cookies merge bug with the way Perlbal handles headers.
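# Perlbal looks plugins up under the Perlbal::Plugin:: namespace.
package Perlbal::Plugin::StickySessions;

use Perlbal;
use strict;
use warnings;
use Data::Dumper;
use HTTP::Date;
use CGI qw/:standard/;
use CGI::Cookie;
use Scalar::Util qw(blessed reftype);
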
# LOAD StickySessions
# SET plugins = stickysessions
#
# Add $self->{service}->run_hook('modify_response_headers', $self);
# To sub handle_response in BackendHTTP after Content-Length is set.
#

sub load {
    my $class = shift;
    return 1;
}

sub unload {
    my $class = shift;
    return 1;
}

# Map a backend to its 1-based position in the pool's node list.
sub get_backend_id {
    my $be = shift;
    for ( my $i = 0 ; $i <= $#{ $be->{ service }->{ pool }->{ nodes } } ; $i++ )
    {
        my ( $nip, $nport ) = @{ $be->{ service }->{ pool }->{ nodes }[$i] };
        my $nipport = $nip . ':' . $nport;
        if ( $nipport eq $be->{ ipport } ) {
            return $i + 1;
        }
    }

    # default to the first backend in the node list.
    return 1;
}

# Convert the 1-based cookie value back to a 0-based node index.
sub decode_server_id {
    my $id = shift;
    return ( $id - 1 );
}

# Return the "ip:port" of the node named by the X-SERVERID cookie, or undef.
sub get_ipport {
    my ( $svc, $req ) = @_;
    my $cookie  = $req->header('Cookie');
    my %cookies = ();
    my $ipport  = undef;
    %cookies = parse CGI::Cookie($cookie) if defined $cookie;
    if ( defined $cookie && defined $cookies{ 'X-SERVERID' } ) {
        my $val =
          $svc->{ pool }
          ->{ nodes }[ decode_server_id( $cookies{ 'X-SERVERID' }->value ) ];

        # the cookie may point at a node that is no longer in the pool
        if ( defined $val ) {
            my ( $ip, $port ) = @{ $val };
            $ipport = $ip . ':' . $port;
        }
    }
    return $ipport;
}

sub find_or_get_new_backend {
    my ( $svc, $req, $client ) = @_;
    my Perlbal::BackendHTTP $be;
    my $ipport = get_ipport( $svc, $req );
    my $now    = time;
    while ( $be = shift @{ $svc->{ bored_backends } } ) {
        next if $be->{ closed };

        # now make sure that it's still in our pool, and if not, close it
        next unless $svc->verify_generation($be);

        # don't use connect-ahead connections when we haven't
        # verified we have their attention
        if ( !$be->{ has_attention } && $be->{ create_time } < $now - 5 ) {
            $be->close("too_old_bored");
            next;
        }

        # don't use keep-alive connections if we know the server's
        # just about to kill the connection for being idle
        if ( $be->{ disconnect_at } && $now + 2 > $be->{ disconnect_at } ) {
            $be->close("too_close_disconnect");
            next;
        }

        # give the backend this client
        if ( defined $ipport ) {
            if ( $be->{ ipport } eq $ipport ) {
                if ( $be->assign_client($client) ) {
                    $svc->spawn_backends;
                    return 1;
                }
            }
        } else {
            if ( $be->assign_client($client) ) {
                $svc->spawn_backends;
                return 1;
            }
        }

        # assign client can end up closing the connection, so check for that
        return 1 if $client->{ closed };
    }
    return 0;
}

# called when we're being added to a service
sub register {
    my ( $class, $gsvc ) = @_;
    my $check_cookie_hook = sub {
        my Perlbal::ClientProxy $client = shift;
        my Perlbal::HTTPHeaders $req    = $client->{ req_headers };
        return 0 unless defined $req;
        my $svc = $client->{ service };

        # we define where to send the client request
        $client->{ backend_requested } = 1;
        $client->state('wait_backend');
        return unless $client && !$client->{ closed };
        if ( find_or_get_new_backend( $svc, $req, $client ) != 1 ) {
            push @{ $svc->{ waiting_clients } }, $client;
            $svc->{ waiting_client_count }++;
            $svc->{ waiting_client_map }{ $client->{ fd } } = 1;
            my $ipport = get_ipport( $svc, $req );
            if ( defined($ipport) ) {
                my ( $ip, $port ) = split( /\:/, $ipport );
                $svc->{ spawn_lock } = 1;
                my $be =
                  Perlbal::BackendHTTP->new( $svc, $ip, $port,
                    { pool => $svc->{ pool } } );
                $svc->{ spawn_lock } = 0;
            } else {
                $svc->spawn_backends;
            }
            $client->tcp_cork(1);
        }
        return 0;
    };
    my $set_cookie_hook = sub {
        my Perlbal::BackendHTTP $be  = shift;
        my Perlbal::HTTPHeaders $hds = $be->{ res_headers };
        my Perlbal::HTTPHeaders $req = $be->{ req_headers };
        return 0 unless defined $be && defined $hds;
        my $svc     = $be->{ service };
        my $cookie  = $req->header('Cookie');
        my %cookies = ();
        %cookies = parse CGI::Cookie($cookie) if defined $cookie;
        my $backend_id = get_backend_id($be);
        if ( !defined( $cookies{ 'X-SERVERID' } )
            || $cookies{ 'X-SERVERID' }->value != $backend_id )
        {
            my $backend_cookie =
              new CGI::Cookie( -name => 'X-SERVERID', -value => $backend_id );
            if ( defined $hds->header('set-cookie') ) {
                my $val = $hds->header('set-cookie');
                $hds->header( 'Set-Cookie',
                    $val .= "\r\nSet-Cookie: " . $backend_cookie->as_string );
            } else {
                $hds->header( 'Set-Cookie', $backend_cookie );
            }
        }
        return 0;
    };
    $gsvc->register_hook( 'StickySessions', 'start_proxy_request',
        $check_cookie_hook );
    $gsvc->register_hook( 'StickySessions', 'modify_response_headers',
        $set_cookie_hook );
    return 1;
}

# called when we're no longer active on a service
sub unregister {
    my ( $class, $svc ) = @_;
    $svc->unregister_hooks('StickySessions');
    $svc->unregister_setters('StickySessions');
    return 1;
}

1;
I'll keep it short: SMF > rc.d
# ps -ef | grep -i thttpd
    root  3919  6938   0 15:45:16 pts/6       0:00 grep -i thttpd
webservd 18619   592   0   Jun 11 ?           0:54 /opt/extra/sbin/thttpd -C thttpd.conf
Now…
# kill -9 18619
# ps -ef | grep -i thttpd
webservd  4017   592   0 15:47:33 ?           0:00 /opt/extra/sbin/thttpd -C thttpd.conf
Automatic service restarts. Solaris is like a nice Cadillac compared to Linux.



