Linux containers to use in a startup

To me it is important to have a sustainable server strategy that lets me move my environment from server to server quickly. What is more, everyone needs a new machine with exactly the same set of attributes. Without virtualization it usually takes a good hour to set everything up, and the problem grows as your traffic increases.

What I usually use for small portals is LXC. It is very lightweight: you can have two or more containers on one machine and still access all their files by connecting to the parent machine via SSH or a program like WinSCP.

To create or destroy an LXC container:

sudo lxc-create -t user -n C1
sudo lxc-destroy -n C1

To clone an existing container:

sudo lxc-clone -o C1 -n C2

Files created in container C1 in the directory /home/user are then visible on the host machine in /var/lib/lxc/C1/rootfs/home/user/, which is very convenient.

To start your container in the background:

sudo lxc-start -n C1 -d

If you want to start a container and connect to it immediately, run:

sudo lxc-start -n C1

When sudo lxc-ls lists the container on its second line, the container is running and you can connect to it. There are two ways to do that:

  • ssh <container's IP> (check the IP by running ifconfig inside the container)
  • sudo lxc-console -n C1

To stop a running container:

sudo lxc-stop -n C1

For reference, here is how to rename your containers: http://www.bonusbits.com/main/HowTo:Rename_LXC_Container

Now the interesting part: to move your container to another host you'll most likely do something like this:

  • lxc-create -n <new container>
  • rsync -avzr [old host:]"/var/lib/lxc/<old container>/*" [<new host>:]/var/lib/lxc/<new container>/ --exclude 'home/ubuntu/instance/' --exclude 'home/ubuntu/build/' --exclude 'nohup.out' --exclude "*.log*"
  • rsync -avzr [old host:]/data/lxc/<old container>/rootfs/home/ubuntu/code/.metadata [<new host>:]/var/lib/lxc/<new container>/rootfs/home/ubuntu/code/
  • in file [<new host>:]/var/lib/lxc/<new container>/config change all occurrences of <old container> to <new container>, add the new IP and the new hwaddr
  • in file [<new host>:]/var/lib/lxc/<new container>/rootfs/etc/hostname change <old container> to <new container>
  • in file [<new host>:]/var/lib/lxc/<new container>/rootfs/etc/hosts change <old container> to <new container>
  • in file [<new host>:]/etc/hosts add an entry for <new container>
  • lxc-start -n <new container>
  • Add a file to autostart on host: ln -s /var/lib/lxc/<new container>/config /etc/lxc/auto/<new container>.conf
  • If you want to forward any port (most likely 25) run iptables -t nat -A PREROUTING -p tcp --dport 25 -j DNAT --to-destination <container's ip>:25.
  • If you're using load balancing on nginx, add the new container to your config on load-balancer:/etc/nginx/sites-enabled/routing
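
The config-editing step above is easy to get wrong by hand, so here is a minimal Python sketch of it. It assumes the legacy LXC key names lxc.network.ipv4 and lxc.network.hwaddr (newer LXC releases name these differently), and the function name retarget_config is made up for this example.

```python
import re


def retarget_config(text, old_name, new_name, new_ip, new_hwaddr):
    """Return LXC config text rewritten for the new container.

    Assumes legacy key names (lxc.network.ipv4, lxc.network.hwaddr).
    """
    # Replace the old name only where it stands alone, so that renaming
    # "C1" does not also touch "C10" or "C1-backup".
    pattern = r"(?<![\w-]){}(?![\w-])".format(re.escape(old_name))
    text = re.sub(pattern, lambda match: new_name, text)

    lines = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key == "lxc.network.ipv4":
            line = "lxc.network.ipv4 = {}".format(new_ip)
        elif key == "lxc.network.hwaddr":
            line = "lxc.network.hwaddr = {}".format(new_hwaddr)
        lines.append(line)
    return "\n".join(lines) + "\n"
```

You would read /var/lib/lxc/<new container>/config, pass its contents through this function, and write the result back. The hostname and hosts files still need the same name change.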

References

Command lists and a guide: https://help.ubuntu.com/12.10/serverguide/lxc.html
Introduction link: https://www.ibm.com/developerworks/linux/library/l-lxc-containers/

Some useful ubuntu admin commands for a startup

Below is a bunch of useful commands you might need every day when monitoring your servers.

 

monitor your CPU and memory usage

htop

mass rename files, for example when you want to change all semicolons “;” to a comma

rename -v "s/;/,/g" *.JPG

delete unnecessary duplicated files

find . -name "*(2)*" -delete

count yesterday's file downloads in nginx’s access log

cat /var/log/nginx/access.log.1 | grep -P "GET /files/|GET /yourfolder/[0-9]+/yourfolder" | wc -l
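
If you would rather do the counting from a script, the same thing can be sketched in Python. The pattern is copied from the grep command above, and the function name count_downloads is just something I made up for the example.

```python
import re

# Same alternation as the grep -P pattern above.
DOWNLOAD = re.compile(r"GET /files/|GET /yourfolder/[0-9]+/yourfolder")


def count_downloads(lines):
    """Count the log lines that contain a download request."""
    return sum(1 for line in lines if DOWNLOAD.search(line))


def count_downloads_in_file(path):
    """Count download requests in a log file such as access.log.1."""
    with open(path, errors="replace") as handle:
        return count_downloads(handle)
```

Like grep piped into wc -l, this counts matching lines, not individual matches.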

synchronize files from one server to another

rsync -av --bwlimit=8000 root@<your_IP>:/home/instance/attchs /home/instance/

find and replace “foo” with “bar” in all *.php files (the -print0 and -0 flags keep file names with spaces intact)

find . -type f -name "*.php" -print0 | xargs -0 sed -i 's/foo/bar/g'
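
A rough Python equivalent, in case you want to know which files were touched. This is only a sketch: it replaces literal byte strings and not sed-style regular expressions, and replace_in_php_files is a name invented for the example.

```python
import os


def replace_in_php_files(root, old, new):
    """Replace the byte string `old` with `new` in every *.php file under root.

    Works on bytes so that files in any encoding survive untouched.
    Returns the list of files that were actually changed.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".php"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                data = handle.read()
            if old in data:
                with open(path, "wb") as handle:
                    handle.write(data.replace(old, new))
                changed.append(path)
    return changed
```

Usage would look like replace_in_php_files(".", b"foo", b"bar").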

find files that have a newline in their name, which is useful when you want to fix them (press Enter at the marked spot so that the quoted pattern contains a literal line break)

find / -name "*[Enter here]
*" -print
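
Typing a literal line break into a shell pattern is fiddly, so here is a small Python sketch that does the same search. The function name files_with_newline is my own.

```python
import os


def files_with_newline(root):
    """Yield paths of files and directories under root whose name contains a newline."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if "\n" in name:
                yield os.path.join(dirpath, name)
```

Printing each result with repr() makes the offending line break show up as \n.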

fix the time on ubuntu

ntpdate ntp.ubuntu.com

find the process that occupies port 8080

netstat -tulpn | grep 8080

in fish shell, move a foreground process to the background

Ctrl-Z
bg
exit

Parsing external emails in Python

Our startup application gets a few thousand emails a day from every possible mailbox, including local ones. We parse those raw emails and display them to our users as safe HTML5.

Finding Python libraries suitable for this task was quite challenging. We tested at least 10 different solutions before arriving at the conclusion that a perfect one doesn’t exist.

Having a huge test set of 100 000+ emails from all around the world and from most of the main email clients, I compiled a list of 30 comprehensive tests. For each one I copy-pasted the interesting part of an email and wrote down the parsed result I expected. Over time I added multiple alternative expected results, as different libraries were producing output that was not exactly intuitive but still valid and displayable.
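
To give an idea of how a test can accept several expected results, here is a minimal sketch. The helper names normalize and matches_any are invented for this post and are not taken from our test suite. Whitespace is collapsed first so that indentation differences do not fail a comparison.

```python
import re


def normalize(html):
    """Collapse whitespace so that indentation does not affect comparison."""
    html = re.sub(r">\s+<", "><", html.strip())
    return re.sub(r"\s+", " ", html)


def matches_any(actual, alternatives):
    """Return True if the parsed output equals any accepted alternative."""
    normalized = normalize(actual)
    return any(normalized == normalize(alt) for alt in alternatives)
```

In a unittest test case this would be used as self.assertTrue(matches_any(parsed, [expected_a, expected_b])).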

One test was failed by every single library or combination that we tried.

The first one, stripping Outlook tags, was handled nicely by Genshi:


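def test_should_strip_outlook_tags(self):
    # Outlook wraps its own markup in conditional comments
    msg = u'''
          <p>text</p>
          <!--[if gte mso 9]><o:custom></o:custom><![endif]-->
          <p>after</p>
          '''
    # the whole conditional comment should be gone
    expected = u'''
                 <p>text</p>
                 <p>after</p>
               '''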
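
For illustration only, this is roughly what the stripping amounts to when done with a plain regular expression. It is not how Genshi does it, and strip_outlook_conditionals is a name I made up.

```python
import re

# Matches <!--[if ...]> ... <![endif]--> including everything in between.
CONDITIONAL_COMMENT = re.compile(
    r"<!--\[if [^\]]*\]>.*?<!\[endif\]-->",
    re.DOTALL | re.IGNORECASE,
)


def strip_outlook_conditionals(html):
    """Remove Outlook conditional comments together with their content."""
    return CONDITIONAL_COMMENT.sub("", html)
```

A regex like this is fine for a quick check, but for real email HTML a proper parser is the safer choice.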

 

Then one of the weird things Outlook tends to do is insert an equals sign out of the blue:

 

msg = u'''
         <p>text</p>
         <p = class=3DMsoNormal>inner</p>
        '''

expected = u'''
            <p>text</p>
            <p class="3DMsoNormal">inner</p>
           '''
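
A side note on the 3D part: it looks like a leftover of quoted-printable encoding, where =3D stands for a literal equals sign. That is my guess about where such fragments come from, not something the test above relies on. If the raw part is still quoted-printable encoded, Python's standard quopri module can decode it before any HTML parsing:

```python
import quopri


def decode_quoted_printable(raw):
    """Decode a quoted-printable byte string, turning =3D back into '='."""
    return quopri.decodestring(raw)


print(decode_quoted_printable(b"<p class=3DMsoNormal>inner</p>"))
```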

Unfortunately, no library could cope with this kind of contorted HTML:


@unittest.skip('This is valid for HTML5, however html5lib processes this against expectations')
def test_should_strip_self_closed_b_without_rendering_it_further(self):
    msg = u'''
         <div>
              <div class="bold" />
              not_bold_content
         </div>
         should_not_be_bold
         '''

    expected = u'''
                  <div>
                      <div class="bold"></div>
                      not_bold_content
                  </div>
                  should_not_be_bold
               '''

What we got instead was a self-closed tag. This is actually a pretty big issue, as a self-closed tag like this causes Chrome to display the entire page in bold. So we ended up fixing the library itself.
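
The fix in the library is too long to show here, but the idea behind the expected result can be sketched as a preprocessing step. The sketch below is not the patch we made. It is a regex-based approximation that expands self-closed non-void tags into explicit open and close pairs, and the name expand_self_closed is invented for this post.

```python
import re

# HTML void elements: these may legitimately be written as <br />.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "param", "source", "track", "wbr",
}

# A tag name, optional attributes, then "/>".
SELF_CLOSED = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)((?:\s+[^<>]*?)?)\s*/>")


def expand_self_closed(html):
    """Rewrite <div ... /> as <div ...></div>, leaving void elements alone."""
    def expand(match):
        tag, attrs = match.group(1), match.group(2)
        if tag.lower() in VOID_ELEMENTS:
            return match.group(0)
        return "<{0}{1}></{0}>".format(tag, attrs.rstrip())
    return SELF_CLOSED.sub(expand, html)
```

It will stumble on attribute values that contain angle brackets, so treat it as a demonstration of the expected output and not as a robust sanitizer.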