<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Onoclea Blog - English Edition</title>
    <link rel="alternate" type="text/html" href="http://blog.onoclea.com/en/" />
    <link rel="self" type="application/atom+xml" href="http://blog.onoclea.com/en/atom.xml" />
    <id>tag:blog.onoclea.com,2008-09-12:/en//1</id>
    <updated>2008-09-15T06:47:59Z</updated>
    
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.21-en</generator>

<entry>
    <title>Backing up &quot;big&quot; Subversion repositories</title>
    <link rel="alternate" type="text/html" href="http://blog.onoclea.com/en/2008/09/backing-up-big-subversion-repositories.html" />
    <id>tag:blog.onoclea.com,2008://1.6</id>

    <published>2008-09-15T05:15:23Z</published>
    <updated>2008-09-15T06:47:59Z</updated>

    <summary>Going big in numbers. Doing regular backups is one thing, but doing them right is sometimes quite a different story, especially when you stumble upon an extreme situation, such as a reasonably small (in terms of...</summary>
    <author>
        <name>Paweł J. Sawicki</name>
        
    </author>
    
        <category term="Knowledge Base" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="subversion" label="subversion" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blog.onoclea.com/en/">
        <![CDATA[<h2>Going big in numbers</h2>

<p>Doing regular backups is one thing, but doing them right is sometimes quite a different story, especially when you stumble upon an extreme situation, such as a reasonably small (in terms of actual disk size) <a href="http://subversion.org">Subversion</a> repository with a rather high number of committed revisions.</p>]]>
        <![CDATA[<h2>The problem</h2>

<p>Some time ago our backup software started to return strange-looking information about one of our <a href="http://subversion.tigris.org"><span class="caps">SVN</span></a> repositories - it reported that the process was taking a very long time to complete. A full weekly backup that started on Monday night had not finished by Wednesday morning. That wasn't normal...</p>

<p>Although the original repo was quite small (500MiB), after being dumped, compressed and encrypted it weighed over 11GiB! Given the final size, the reason why a full backup took so long to complete (remember - dump, compress and finally encrypt) was quite obvious. But why was the dump so enormous? Was there anything wrong with our setup? Other repositories were backed up as normal, and all of them could be restored (we did a full diff, just to make sure).</p>

<h2>A bit of background</h2>

<p>After some investigation it turned out that the repo was small on disk (about 500MiB) but held a reasonably high number of committed revisions (well over 25k).</p>

<p>That was not surprising, as we keep our whole configuration in <a href="http://subversion.tigris.org">Subversion</a>, which is then distributed by <a href="http://reductivelabs.com/projects/puppet">Puppet</a> among all of the servers. As a safeguard, all servers report and commit back every change made to their configuration.</p>

<p>They do so at regular intervals, run asynchronously from <code>cron</code> every 4 to 6 minutes. If anything has changed, the machine sends back the current state of its configuration. Those changes are typically small, but given the number of servers, revisions build up quite fast.</p>

<h2>What can be done about it?</h2>

<p>The "problem" was in the way we did our backups. Initially, all our <a href="http://subversion.tigris.org">Subversion</a> repositories were dumped using <code>svnadmin dump</code> and then piped through a set of other tools like <code>bzip2</code>, <code>gpg</code> etc. Finally, they were distributed among different destinations.</p>

<p><code>svnadmin dump</code> uses a binary-safe and portable text format to store the backup dumps. Unfortunately, if you have loads of revisions and the commits are quite small, the overhead becomes very high - high enough to produce abnormal situations like ours.</p>

<p>Even if a commit changes only a single line in some file, it is still described by a number of attributes that get dumped during the backup process. Multiply that by 25k and you get the point.</p>
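<p>To put rough numbers on it, here is a back-of-the-envelope calculation using only the figures quoted above. The revision count is taken as a round 25,000, which is an assumption on the low side, so the per-revision values are approximate:</p>

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

revisions = 25_000        # "over 25k" in the post; a round lower bound
repo_bytes = 500 * MIB    # size of the repository on disk
backup_bytes = 11 * GIB   # dumped, compressed and encrypted backup

per_rev_repo = repo_bytes / revisions
per_rev_backup = backup_bytes / revisions
blowup = backup_bytes / repo_bytes

print(f"repository: {per_rev_repo / 1024:.1f} KiB per revision")      # 20.5
print(f"backup:     {per_rev_backup / 1024:.1f} KiB per revision")    # 461.4
print(f"backup is {blowup:.1f}x the size of the repository")          # 22.5
```

<p>In other words, each revision costs about 20KiB in the repository but over 460KiB in the finished backup.</p>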

<h2>Solution</h2>

<p>The solution turned out to be rather easy.</p>

<p>Instead of doing <code>svnadmin dump</code> we make a <code>hotbackup</code> copy of the repository, <code>tar</code> it, and then feed the result through the original pipeline (<code>bzip2</code>, <code>gpg</code> and so on).</p>

<p>There is one catch, though. This requires extra space of roughly twice the size of the largest repository being backed up. But given the overall results, it was worth it in our case, since the size of the repository backup dropped dramatically - from 11GiB to something around 100MiB.</p>
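<p>The new pipeline can be sketched roughly as follows. This is an illustration, not our actual backup script: it assumes the hot copy is made with <code>svnadmin hotcopy</code>, that <code>tar</code>, <code>bzip2</code> and <code>gpg</code> are on the <code>PATH</code>, and the function names, paths and GPG recipient are all hypothetical.</p>

```python
import subprocess
from pathlib import Path


def backup_commands(repo, scratch, recipient, output):
    """Build the hotcopy command and the tar | bzip2 | gpg pipeline as argv lists."""
    repo, scratch = Path(repo), Path(scratch)
    copy = scratch / repo.name  # assumed not to exist yet
    hotcopy = ["svnadmin", "hotcopy", str(repo), str(copy)]
    pipeline = [
        ["tar", "-cf", "-", "-C", str(scratch), repo.name],
        ["bzip2", "-c"],
        ["gpg", "--encrypt", "--recipient", recipient, "--output", str(output)],
    ]
    return hotcopy, pipeline


def run_backup(repo, scratch, recipient, output):
    """Copy the repository, then stream the copy through the pipeline."""
    hotcopy, pipeline = backup_commands(repo, scratch, recipient, output)
    subprocess.run(hotcopy, check=True)

    procs, upstream = [], None
    for index, argv in enumerate(pipeline):
        last = index == len(pipeline) - 1
        proc = subprocess.Popen(argv, stdin=upstream,
                                stdout=None if last else subprocess.PIPE)
        if upstream is not None:
            upstream.close()  # the child keeps its own copy of the pipe
        upstream = proc.stdout
        procs.append(proc)

    for proc in procs:
        if proc.wait() != 0:
            raise subprocess.CalledProcessError(proc.returncode, proc.args)
```

<p>A call such as <code>run_backup("/srv/svn/config", "/var/tmp/svn-backup", "backup@example.com", "/backup/config.tar.bz2.gpg")</code> would then need scratch space for one full copy of the repository, which is where the extra disk requirement comes from.</p>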

<h2>Current state</h2>

<p>As a side note, our repository currently holds slightly above 700MiB of data and about 32k revisions. The backup still weighs roughly 100MiB and takes only minutes to complete.</p>

<h2>So what?</h2>

<p>Always look through your toolbox before you start complaining or panicking :)</p>]]>
    </content>
</entry>

<entry>
    <title>Maintenance free machines...</title>
    <link rel="alternate" type="text/html" href="http://blog.onoclea.com/en/2008/05/maintenance-free-machines.html" />
    <id>tag:blog.onoclea.com,2008://1.5</id>

    <published>2008-05-23T10:16:37Z</published>
    <updated>2008-09-12T13:07:37Z</updated>

    <summary>There are some servers that &quot;Just Work&quot; (TM) (R). The trouble is that sometimes you just have to take them down...</summary>
    <author>
        <name>Paweł J. Sawicki</name>
        
    </author>
    
        <category term="Knowledge Base" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="humorunix" label="humor unix" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blog.onoclea.com/en/">
        <![CDATA[<p>There are some servers that "Just Work" (TM) (R). The trouble is that sometimes you just have to take them down...</p>]]>
        <![CDATA[<p>OK, it's geeky, but still - I was almost crying when I had to reboot those two machines in order to make some kernel upgrades:</p>

<textarea name="code" class="bash" cols="60" rows="7">
[root@spot.***.pl ~]# hostname
spot.***.pl
[root@spot.***.pl ~]# uname -a
Linux spot.***.pl 2.6.15-1.1833_FC4smp #1 SMP Wed Mar 1 23:56:51 EST 2006 i686 i686 i386 GNU/Linux
[root@spot.***.pl ~]# uptime
 04:06:16 up 647 days, 17:38,  2 users,  load average: 3.75, 2.42, 1.05
</textarea>

<textarea name="code" class="bash" cols="60" rows="7">
[root@backup.***.pl ~]# hostname
backup.***.pl
[root@backup.***.pl ~]# uname -a
Linux backup.***.pl 2.6.17-1.2142_FC4 #1 Tue Jul 11 22:41:14 EDT 2006 i686 athlon i386 GNU/Linux
[root@backup.***.pl ~]# uptime
 04:08:35 up 545 days, 15:03,  3 users,  load average: 0.03, 0.25, 0.15
</textarea>]]>
    </content>
</entry>

<entry>
    <title>Python 2.4 vs. 2.5 - sys.exit() throwing an Exception.</title>
    <link rel="alternate" type="text/html" href="http://blog.onoclea.com/en/2008/05/python-24-vs-25---sysexit-throwing-an-exception.html" />
    <id>tag:blog.onoclea.com,2008://1.4</id>

    <published>2008-05-18T16:32:19Z</published>
    <updated>2008-09-15T05:56:36Z</updated>

    <summary>A couple of days ago I came across a rather annoying problem - I had a simple Python program that worked perfectly on my computer but failed once deployed on the target machine. What turned out to be the problem?...</summary>
    <author>
        <name>Paweł J. Sawicki</name>
        
    </author>
    
        <category term="Knowledge Base" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="gotcha" label="gotcha" scheme="http://www.sixapart.com/ns/types#tag" />
    <category term="python" label="python" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blog.onoclea.com/en/">
        <![CDATA[<p>A couple of days ago I came across a rather annoying problem - I had a simple Python program that worked perfectly on my computer but failed once deployed on the target machine. What turned out to be the problem?</p>]]>
        <![CDATA[<p>Well, the thing is that I don't read change logs :) If I did, I wouldn't have wasted about half an hour looking for my mistake.</p>

<p><span class="caps">OK, </span>so for those of you who are still reading this, here is the detailed version.</p>

<textarea name="code" class="python" cols="60" rows="10">
$ cat exit.py
import sys

try:
    print "try"
    sys.exit(0)
except Exception, e:
    print "except:", e
    sys.exit(1)
$ python2.4 -V
Python 2.4.5
$ python2.5 -V
Python 2.5.2
</textarea>

<p>And now the results:</p>

<textarea name="code" class="sh" cols="60" rows="10">
$ python2.5 exit.py; echo $?
try
0
$ python2.4 exit.py; echo $?
try
except: 0
1
</textarea>

<p>So 2.5 behaves as I expected, while in 2.4 the <code>except Exception</code> clause catches the exit and the script ends up returning 1. Why, oh why? Here's why:</p>

<p><a href="http://docs.python.org/lib/module-exceptions.html#l2h-122">Python Module Exceptions</a></p>
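<p>The behaviour is easy to check on any current interpreter. The sketch below is written for Python 3 rather than the 2.4/2.5 syntax used above, and the function name <code>run</code> is only illustrative. Since Python 2.5, <code>SystemExit</code> derives from <code>BaseException</code> instead of <code>Exception</code>, so a plain <code>except Exception</code> no longer intercepts <code>sys.exit()</code>:</p>

```python
import sys

# SystemExit sits next to Exception in the hierarchy, not below it.
print(issubclass(SystemExit, Exception))      # False
print(issubclass(SystemExit, BaseException))  # True


def run():
    try:
        sys.exit(0)
    except Exception:
        # Reached on Python 2.4 only; never reached on 2.5 or later.
        return "caught by except Exception"
    except SystemExit as exc:
        # sys.exit() raises SystemExit; the exit status is stored in .code
        return "SystemExit with code %r" % (exc.code,)


result = run()
print(result)  # SystemExit with code 0
```

<p>Code that really does need to intercept the exit has to name <code>SystemExit</code> (or <code>BaseException</code>) explicitly.</p>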

<p>Solution - <span class="caps">RTFM</span>! :)</p>]]>
    </content>
</entry>

<entry>
    <title>Postfix + Dovecot LDA - Impossible is nothing!</title>
    <link rel="alternate" type="text/html" href="http://blog.onoclea.com/en/2008/02/postfix-dovecot-lda---impossible-is-nothing.html" />
    <id>tag:blog.onoclea.com,2008://1.3</id>

    <published>2008-02-01T14:56:48Z</published>
    <updated>2008-09-15T06:53:58Z</updated>

    <summary>For over a year now I&apos;ve been struggling with something I thought would never get solved - using catchall addresses with Dovecot&apos;s deliver (LDA)...</summary>
    <author>
        <name>Paweł J. Sawicki</name>
        
    </author>
    
        <category term="Knowledge Base" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="postfix" label="postfix" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blog.onoclea.com/en/">
        <![CDATA[<p>For over a year now I've been struggling with something I thought would never get solved - using catchall addresses with Dovecot's deliver (LDA)...</p>]]>
        <![CDATA[<p>Though the setup I have (postfix + dovecot) isn't really complicated, it does have some nifty features. One of them is that I use catchall addresses in most of my domains. On top of that I wanted to use Sieve to filter some of my mail on the server side (especially those "you always read them" daily log reports).</p>

<p>But there was a problem. I even posted about it on the Dovecot mailing list:</p>

<p><a href="http://www.dovecot.org/list/dovecot/2006-March/012064.html">http://www.dovecot.org/list/dovecot/2006-March/012064.html</a></p>

<p>In general, dovecot cannot mimic postfix' way of handling catch-all addresses - there was nothing even similar to the "table search order" found in my favorite <span class="caps">MTA.</span> So I made a rather ugly patch... and it worked. But it was such a dirty hack that merging it into dovecot's trunk wasn't really an option, so whenever a new dovecot release was published I had to manually prepare my "own" version. I ended up with two separate setups depending on the client's demands - either a vanilla dovecot without my patch (updated on a regular basis) or a hacked version that... well... should be updated :)</p>

<p>A couple of days ago Maciej Paczesny contacted me, asking whether there had been any progress on the problem I reported back in 2006.</p>

<p>After exchanging some emails and thoughts I think I've finally found a neat way of getting "things done right".</p>

<p><a href="http://www.postfix.org/canonical.5.html">http://www.postfix.org/canonical.5.html</a></p>

<p>In fact, all you have to do is rewrite the recipient address using postfix' <code>recipient_canonical_maps</code> table. That's all. Thanks to this little trick, dovecot-deliver receives an address that is final and unique, so it has no trouble locating the proper message store directory.</p>

<p>Technically speaking, you must define <code>recipient_canonical_maps</code> in such a way that it returns a final and unique address for each of your users (here we benefit from postfix' table search order, so catch-all addresses do work!). Then postfix rewrites the envelope recipient address, passes it to dovecot and voila - problem solved.</p>
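<p>To illustrate why the search order makes catch-alls work, here is a simplified model of a single lookup. It is not Postfix code and not a real configuration: the function name and the addresses are hypothetical, and it only covers the two cases that matter here, the full <code>user@domain</code> key (highest precedence) and the <code>@domain</code> wildcard (lowest precedence).</p>

```python
def canonical_lookup(table, address):
    """Return the rewritten recipient, or the address unchanged if nothing matches."""
    _local, _sep, domain = address.rpartition("@")
    # Most specific key first: the full address, then the @domain wildcard.
    for key in (address, "@" + domain):
        if key in table:
            return table[key]
    return address


# Hypothetical map: one real mailbox that maps to itself, plus a catch-all.
recipient_canonical = {
    "john@example.com": "john@example.com",
    "@example.com": "catchall@example.com",
}

print(canonical_lookup(recipient_canonical, "john@example.com"))   # john@example.com
print(canonical_lookup(recipient_canonical, "typo@example.com"))   # catchall@example.com
print(canonical_lookup(recipient_canonical, "bob@elsewhere.org"))  # bob@elsewhere.org
```

<p>The identity entry for the real mailbox is what stops the wildcard from swallowing it, so every address that reaches dovecot-deliver already names an existing mailbox.</p>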

<p>Thanks Maciej! :)</p>]]>
    </content>
</entry>

<entry>
    <title>HP ProCurve Switches - Pure Beauty</title>
    <link rel="alternate" type="text/html" href="http://blog.onoclea.com/en/2007/01/hp-procurve-swtiches---pure-beauty.html" />
    <id>tag:blog.onoclea.com,2007://1.2</id>

    <published>2007-01-06T22:57:12Z</published>
    <updated>2008-09-15T05:46:10Z</updated>

    <summary>You may say that the subject is exaggerated, but see this piece of &quot;automation&quot;...</summary>
    <author>
        <name>Paweł J. Sawicki</name>
        
    </author>
    
        <category term="Knowledge Base" scheme="http://www.sixapart.com/ns/types#category" />
    
    <category term="howto" label="howto" scheme="http://www.sixapart.com/ns/types#tag" />
    
    <content type="html" xml:lang="en" xml:base="http://blog.onoclea.com/en/">
        <![CDATA[<p>You may say that the subject is exaggerated, but see this piece of "automation"...</p>]]>
        <![CDATA[<textarea name="code" class="sh" cols="60" rows="13">
$ cat download/update
cd os
put H_08_106.swi secondary
$ eval `ssh-agent`
Agent pid 25066
[manthios@adder ~]$ ssh-add id_dsa
Enter passphrase for id_dsa:
Identity added: id_dsa (id_dsa)
$ cd download; sftp -b update host
sftp&gt; cd os
sftp&gt; put H_08_106.swi secondary
Uploading H_08_106.swi to /os/secondary
Connection to host closed by remote host.
$
</textarea>

<p>And that's it - the switch is upgraded!</p>]]>
    </content>
</entry>

</feed>
