Backward compatible
Scrolling back in screen

A few years ago I discovered screen, a nice Linux tool that lets you detach from a terminal while commands keep running in the background. You can even connect later from a different computer and continue where you left off. I initially used it for rtorrent, but now I also use it to administer remote computers. For example, when I start something that might take more than a day, I can log back in tomorrow, or log in from home or work to complete a task. Another use is administering remote computers on dial-up (yes, there are some) or on slow and unstable 3G connections. Even if the connection breaks down, I can log in later and pick up where it stopped.

One of the annoying “problems” with screen is that Shift+Page Up/Down does not scroll the buffer. This is because screen keeps its own scrollback buffers. To work with them you need to enter “copy mode” using Ctrl+a followed by [. Since I use a non-English keyboard, that’s Ctrl+a, AltGr+f. Hard to remember when you don’t use it often.

I use Konsole, and I found a way to make it work by adding the following lines to .screenrc (in my home directory):

termcapinfo xterm|xterms|xs|rxvt ti@:te@
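
If you set up several machines, adding that line by hand gets tedious. Here is a minimal sketch of a helper that appends it only when it is not already present. The function name add_scroll_fix is my own invention for this example, not part of screen:

```shell
# Append the termcapinfo line to a screenrc file unless it is already there.
# The file defaults to ~/.screenrc; pass another path as the first argument.
add_scroll_fix() {
    rc_file="${1:-$HOME/.screenrc}"
    line='termcapinfo xterm|xterms|xs|rxvt ti@:te@'
    touch "$rc_file"
    # -x matches whole lines, -F treats the pattern as a fixed string
    grep -qxF -- "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}
```

Running add_scroll_fix twice leaves a single copy of the line. New screen sessions pick up the change; sessions that are already running do not.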

Reducing dentry (slab) usage on machines with a lot of RAM

Recently I switched my main website from a 2-core AMD machine with 4GB RAM to an 8-core Intel i7 one with 16GB RAM. I also switched from CentOS 5 to CentOS 6. I set up everything the same, but suddenly the system was using much more RAM than before. And I’m not talking about filesystem cache here. I thought that increasing RAM would only increase the filesystem cache, but something else was occupying RAM like crazy. Looking at the output of “free”, “top” and “ps”, I simply could not determine what was eating the RAM, because the running processes were fine.

So, I googled a little bit and found that the problem was the dentry cache used by the Linux kernel. You can see the kernel memory usage with the “slabtop” command, and my dentry cache was crazy, something like 5GB and growing. Googling even more, I found horror stories about servers going down, the OOM killer taking out vital processes like Apache or MySQL, etc. So I wanted to stop this.
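
If you want a single number for graphing instead of the interactive slabtop screen, the same data is in /proc/slabinfo. The sketch below assumes the usual slabinfo 2.x layout, where the third and fourth columns are the number of objects and the object size in bytes. The function name dentry_mb is made up for this example:

```shell
# Print the approximate size of the dentry slab in whole megabytes.
# Reads slabinfo-format text on stdin:
#   name <active_objs> <num_objs> <objsize> ...
dentry_mb() {
    awk '$1 == "dentry" { printf "%d\n", ($3 * $4) / 1048576 }'
}

# Typical use (reading /proc/slabinfo normally requires root):
#   dentry_mb < /proc/slabinfo
```

This only multiplies object count by object size, so treat the result as an estimate rather than an exact figure.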

A quick fix is to clear the cache manually (writing 2 to drop_caches frees reclaimable dentries and inodes). Some people even “solved” this problem by adding the command to a cron job.

echo 2 > /proc/sys/vm/drop_caches

On the MRTG screenshot you can see the dentry cache size in megabytes marked as a blue line. 4000 means 4GB of cache. I have 16GB, remember. When you run the drop_caches command above, you get the effect marked by the red arrow.

I did not like the approach of adding this to crontab, so I investigated further, asked on mailing lists, and learned that Linus himself says that “unused memory is dead memory” and that’s why the kernel is hungry. Still, I decided to reduce the hunger and added this to /etc/sysctl.conf:

vm.vfs_cache_pressure=10000

That did slow it down, but it was still growing. (You can run sysctl -p to apply changes to the running kernel without restarting.) Next I added these as well:

vm.overcommit_ratio=2
vm.dirty_background_ratio=5
vm.dirty_ratio=20
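
To confirm that the running kernel actually picked up these values, you can read them back from /proc/sys, where each dotted sysctl key maps to a path. A small sketch follows; the helper names sysctl_path and show_vm_settings are my own:

```shell
# Convert a dotted sysctl key into its /proc/sys path,
# e.g. vm.dirty_ratio -> /proc/sys/vm/dirty_ratio
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print the live values of the settings used in this post.
show_vm_settings() {
    for key in vm.vfs_cache_pressure vm.overcommit_ratio \
               vm.dirty_background_ratio vm.dirty_ratio; do
        printf '%s = %s\n' "$key" "$(cat "$(sysctl_path "$key")")"
    done
}
```

Calling show_vm_settings after sysctl -p should print the same numbers you put into /etc/sysctl.conf.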

However, it was still growing, and I decided to leave it be and see what happens. Was my server going to crash, become unavailable, or something? 24 hours later, dentry was again going up like crazy and suddenly it dropped. By itself. See the blue arrow in the screenshot. It seems the kernel figured out that RAM was going to be exhausted, that the filesystem cache would have to be reduced, etc. After this point, everything went back to normal.

I tried this experiment again, about a week later, with the same results. A steep rise, a drop, and things going back to normal. So, if you’re worried your dentry cache is growing like crazy, don’t. Just tweak those settings in sysctl and wait for at least 48 hours before drawing any conclusions.

Safe way to dual-boot Linux and Windows 7

I had a client’s machine with Windows 7 installed and some free hard disk space for Linux. I decided not to install the Linux boot loader as the main boot loader because:

  • I did not have a Windows install/rescue CD at hand
  • if something went wrong, I could not boot into Windows
  • I had some experience in the past with Windows XP where it simply did not work

Since re-installing Windows, or even fixing Windows if it became unbootable, was not an option, I decided to play it safe: use the Windows boot loader to boot Linux.

I did this in the past with Windows XP. Basically, you save the Linux boot loader into a file (it’s only 512 bytes) and then tell the Windows boot loader to load it. On Windows XP this means editing the boot.ini file in C:\. To create the Linux boot loader file, install the Linux boot loader into the root partition (for example, with LILO, if you installed Linux in /dev/sda4, then lilo.conf should read boot=/dev/sda4) and then read the first sector into a file:

dd if=/dev/sda4 of=linux.boot bs=512 count=1

This will create a file named linux.boot which you need to copy to the C:\ drive of your Windows machine (use a USB stick or the network for this).
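
Before copying the file over, it is worth a quick sanity check. The sketch below assumes that a valid boot sector is exactly 512 bytes and ends with the usual 0x55 0xAA signature, which should normally hold for a LILO boot sector. The function name check_boot_sector is made up for this example:

```shell
# Return success if the file looks like a saved boot sector:
# exactly 512 bytes, with the last two bytes being 0x55 0xAA.
check_boot_sector() {
    file="$1"
    size=$(( $(wc -c < "$file") ))
    sig=$(tail -c 2 "$file" | od -An -tx1 | tr -d ' \n')
    [ "$size" -eq 512 ] && [ "$sig" = "55aa" ]
}

# Example:
#   check_boot_sector linux.boot && echo "looks like a boot sector"
```

If the check fails, the boot loader was probably not installed into the partition before running dd.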

On Windows 7 there is no boot.ini; you have to use Microsoft’s tool named BCDEdit. BCD stands for Boot Configuration Data. You need to run BCDEdit as administrator. Hit the Start button, go to All Programs and then to Accessories. Right-click Command Prompt and choose “Run as administrator”.

Now, we need to enter a couple of commands:

bcdedit /create /d "Linux" /application BOOTSECTOR

It will show something like

The entry {12345678-0000-1111-9999-112233445566} was successfully created.

That number is a unique identifier for the new boot menu entry. You need to use it in the subsequent commands:

bcdedit /set {12345678-0000-1111-9999-112233445566} device boot
bcdedit /set {12345678-0000-1111-9999-112233445566} device partition=c:
bcdedit /set {12345678-0000-1111-9999-112233445566} PATH \linux.boot
bcdedit /displayorder {12345678-0000-1111-9999-112233445566} /addlast

You might need to prepend C: in the second line if it does not work this way.

Reboot and enjoy.

Access computers behind firewall with SSH

At our company we manage 100+ Linux computers remotely. Those are mostly clients for our ERP application, and sometimes you simply need to log in to fix something or help the user. Most of them are behind a firewall. In the past, we always arranged with the client’s IT staff to open a certain port on their firewall and forward it to the SSH port on our machine. This works nicely, but there are cases when the IT guys have a hard time setting it up, or when the ISP simply blocks any possibility of doing so.

Last year I managed to set up reverse SSH to work around this. How does this work? Basically, you need one publicly accessible server. The remote client logs into it using SSH and has a TCP port opened on the server. After that, you can ssh to that port on the server machine and it tunnels back to the SSH server on the remote workstation.
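
As a rough sketch of the manual setup, the function below just prints the ssh arguments for such a reverse tunnel. The port 2222, the account names and server.example.com are placeholders I picked for illustration; the function name reverse_tunnel_args is also my own:

```shell
# Print ssh arguments that open a reverse tunnel:
#   -N  do not run a remote command, only forward ports
#   -R  listen on <remote_port> on the server and forward to local port 22
reverse_tunnel_args() {
    remote_port="$1"   # port to open on the public server
    server="$2"        # user@host of the public server
    printf '%s\n' "-N -R ${remote_port}:localhost:22 ${server}"
}

# On the workstation behind the firewall:
#   ssh $(reverse_tunnel_args 2222 tunnel@server.example.com)
# Then, on the public server, reach the workstation with:
#   ssh -p 2222 user@localhost
```

Here "user" in the last command is an account on the workstation, not on the public server.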

This was easy to set up manually, but we need a permanent connection. You can place the ssh command in a script on the client and make sure it runs, but there are times when this is not very robust. Especially over mobile connections (3G, GPRS, EDGE) the SSH session goes dead: it still looks alive, but no data gets through in either direction.

Enter autossh. This great program starts the tunnel (no need to remember all the parameters to the ssh client) and makes sure it stays up. Every 10 minutes (configurable) it checks whether the connection is still alive, and restarts it if data cannot be sent.
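
A minimal sketch of how such a tunnel could be started with autossh follows. The ports, the account and server.example.com are placeholders, and TUNNEL_CMD is a hypothetical variable of mine (not something autossh knows about) that lets you swap in echo for a dry run. AUTOSSH_POLL is autossh's own setting for the check interval in seconds, and 600 matches the 10 minutes mentioned above:

```shell
# Start a persistent reverse tunnel.
#   $1  monitoring port autossh uses to test the connection (-M)
#   $2  port to open on the public server
#   $3  user@host of the public server
start_autossh_tunnel() {
    monitor_port="$1"
    remote_port="$2"
    server="$3"
    # -f sends autossh to the background; -N and -R are passed on to ssh
    AUTOSSH_POLL="${AUTOSSH_POLL:-600}" "${TUNNEL_CMD:-autossh}" \
        -M "$monitor_port" -f -N \
        -R "${remote_port}:localhost:22" "$server"
}

# Dry run, only prints the arguments:
#   TUNNEL_CMD=echo start_autossh_tunnel 20000 2222 tunnel@server.example.com
```

For unattended use the client needs key-based login to the server, since nobody is there to type a password when the tunnel restarts.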

Things that hamper my productivity

Almost every day I’m facing obstacles that break my workflow, make me
work around them or just make me go mad. Here are some that repeat
every once in a while or just happened recently:

1. Firefox crashing
2. Google Apps multi-login failure
3. Linux terminal
4. Liquidweb routing
5. Xorg server killing keyboard
6. stuck SSH sessions

Ok, let’s go into details:

1. Firefox crashing

One of the best features Firefox has is crash recovery, and with
good reason. However, it still is not perfect. It happens often that I
have 10+ tabs open, one of those crashes FF, and when I restart, all my
GMail sessions are lost (see point 2 for more pain) and I have to log
in again. Same with some other websites. Sending the report to Mozilla
takes forever, and even though there are sites I reported like 50+
times while using FF 2, 3, 3.5 and 3.6, they still crash FF 4. I wish
Chrome were easier to install on Slackware and had all the extensions
I need (TamperData, Firebug, RequestPolicy, Screengrab are a must).

2. Google Apps multi-login failure

Having to log into accounts in an exact order is painful. What’s even
worse, once the browser saves the cookie it is impossible to log into
any of the domain-based accounts directly. You have to log into a
regular GMail account first. Maybe I should take a job offer to go to
work for Google and help them fix this ;)

3. Linux terminal

I spend about 20% of my working time in the terminal, mostly using ssh
to access remote computers or using make to build/install programs and
packages. KDE’s Konsole is the best tool I have used. However, there is
one problem with resizing. I still haven’t been able to determine the
exact way to reproduce it, but switching from 80x25 to fullscreen at
some point starts a strange behavior. Parts of typed text get lost or
overwritten. This happens only when the command cannot fit in 80
characters. Maybe it happens when you get one command that is too long
for the fullscreen terminal, and after that something gets messed up.
I never managed to catch it, but it does annoy me. The only way to get
proper line wrapping back is to restore the window size so that the
terminal is 80x25 again, and then you should forget about fullscreen
until you log out and log in again.

4. Liquidweb routing

For more than a week, there has been some routing problem at
Liquidweb, and it does not look like it’s going to be fixed any time
soon. Searching on Google yielded some results, i.e. other people see
this problem, but it seems to be only sporadic in the US, and I guess
LW does not care about the rest of the world. Some of my websites are
hosted at DTH and I’m accessing them from Europe. Using a different ISP
in Europe makes it work, but traceroute then shows a completely
different path that does not go through LW at all. To cut the story
short, I don’t have access to our main bug/issue tracker, 3 company
websites, and one web service I’m using. I have to build SSH tunnels to
my other servers and reconfigure my local system to deal with this.
It’s not unsolvable, but it’s a major PITA.

5. Xorg server killing keyboard

When I’m working all day at full speed, I get this at least once. It
only happens on my 2 desktop computers; the laptop running the same
version of the Linux kernel and Xorg works just fine. At some point the
keyboard simply stops responding. I can use the mouse though. I tried
replacing the keyboard, mouse and motherboard, and the problem is still
there. This leaves only one conclusion: it must be software. It’s
either Linux or Xorg. My guess is Xorg, because I can use the mouse to
log out of KDE, and then the keyboard magically starts working again
and I can type the password at the KDM login to log back in.

6. stuck SSH sessions

I guess there is some configuration on my client’s network routers to
simply “lose” stale network connections. I log in via SSH and some
20-30 minutes later the session is stuck. The connection is not
dropped, it just stays there, waiting.

Do you have some stuff that really gets on your nerves on a daily
basis? Please share…

Changing timezone on CentOS 5.5

I wanted to change the timezone of my CentOS server to be UTC-3 or -4.
I followed some instructions on the Internet, and did this first:

# rm /etc/localtime 
# ln -s /usr/share/zoneinfo/Etc/GMT-4 /etc/localtime

However, the clock (“date” command) now showed 4 hours more instead of
less. I tried something like:

# ln -s /usr/share/zoneinfo/Atlantic/South_Georgia /etc/localtime

and that worked fine. At least for the system date and time. After
this, I rebooted the system to make sure everything would be all right
afterward. However, PHP’s date() now showed 4 hours more instead of
less. MySQL was OK. I got really confused by this, so I dug into
/etc/php.ini and changed the date.timezone setting to:

date.timezone = Atlantic/South_Georgia

And restarted Apache:

/etc/init.d/httpd restart

It seems to work fine now.
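
The “4 hours more instead of less” surprise with Etc/GMT-4 comes from the sign convention of the Etc/GMT zones: their names follow the POSIX style, where the sign is inverted, so Etc/GMT-4 is actually 4 hours ahead of UTC and Etc/GMT+4 is 4 hours behind. A small sketch that picks the right name for a wanted offset (the function name etc_zone_for_offset is made up for this example):

```shell
# Print the Etc/GMT zone name for a wanted UTC offset in whole hours.
# The sign in Etc/GMT names is inverted: UTC-4 is called Etc/GMT+4.
etc_zone_for_offset() {
    offset="$1"            # e.g. -4 for UTC-4, 3 or +3 for UTC+3
    case "$offset" in
        0|+0|-0) echo "Etc/GMT" ;;
        -*)      echo "Etc/GMT+${offset#-}" ;;
        +*)      echo "Etc/GMT-${offset#+}" ;;
        *)       echo "Etc/GMT-${offset}" ;;
    esac
}

# Example:
#   ln -s "/usr/share/zoneinfo/$(etc_zone_for_offset -4)" /etc/localtime
```

A named zone such as the one used above also carries that region's daylight saving rules, while the Etc/GMT zones are fixed offsets.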

Creating screencast with audio on Linux

By screencast I don’t mean a slideshow, but a real-time recording of
the screen. I used the following software:

- recordmydesktop
- mencoder

Recordmydesktop is a great program; it only has one subtle bug: it
does not allow the X or Y coordinate to be zero, so I had to move all
the windows 1 pixel to the right. No big deal. I recorded a 1024x768
area on a 1680x1050 screen, so there was plenty of space off-camera
that I could use to stage content and record everything in a single
go.

I used mencoder to convert the video from Ogg/Theora to other formats.
Although I prefer Ogg, many hardware DVD players do not support it (in
fact, it’s hard to find one that does).

I had problems with the sound setup: although everything was at max in
KMix, it was not loud enough. I have the same problem using Skype, so
this is some problem with my computer, not the software. Luckily,
mencoder can also manipulate volume, so I increased it during
conversion. I used a line like this one to invoke mencoder:

mencoder -ovc lavc -oac mp3lame -o video.avi -lameopts abr:br=128:vol=9 -mc 0 video.ogv

At first I used -oac lavc, but audio and video were out of sync, so I
switched to lame.

Printing from Windows machine to CUPS printer via Samba

I have a laser printer installed on a Linux box, and it works correctly
from Linux. I can also print from other Linux machines in the
network via CUPS. One of the machines in the network runs Windows. I
shared the printer via Samba, so that Windows can “see” it via
standard Windows networking. Windows has the driver for this printer
installed, but CUPS won’t allow it to print. The trick is to configure
CUPS to allow “raw” data to be sent directly to the printer. To do
this, edit the file /etc/cups/mime.convs and uncomment this line (it’s
near the end of the file):

application/octet-stream application/vnd.cups-raw 0 -

Depending on the default CUPS setup for your machine, you might also
need to edit the file /etc/cups/mime.types

After this, just restart CUPS and you can print from the Windows box:

/etc/rc.d/rc.cups stop
/etc/rc.d/rc.cups start
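
If you prefer to script the mime.convs edit instead of doing it in an editor, here is a minimal sketch. It assumes the line is commented out with a single leading #, as in the stock file, and the function name enable_cups_raw is my own:

```shell
# Uncomment the raw-printing rule in a CUPS mime.convs file.
# The file defaults to /etc/cups/mime.convs; pass another path to test.
enable_cups_raw() {
    conf="${1:-/etc/cups/mime.convs}"
    tmp=$(mktemp) || return 1
    # Strip the leading "#" only from the octet-stream -> cups-raw line
    sed 's|^#[[:space:]]*\(application/octet-stream[[:space:]]\{1,\}application/vnd\.cups-raw\)|\1|' \
        "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

Run it as root, keep a backup copy of the original file, and restart CUPS afterwards as shown above.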

A few useful Linux commands

…I learned listening to TuxRadar radio…

xxd - hexadecimal dump of a file, works both ways (you can edit the
dump and save back to file)

xinput list - list and set up input devices for X window system
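
To illustrate the “works both ways” part of xxd, here is a small sketch of the round trip: dump a file to hex, optionally edit the dump, then turn it back into binary with -r. The function name hex_roundtrip is made up for this example:

```shell
# Dump a file to an editable hex listing, then rebuild a binary from it.
#   $1  input file
#   $2  output file rebuilt from the dump
hex_roundtrip() {
    xxd "$1" > "$1.hex"       # binary -> hex dump (edit this file if you like)
    xxd -r "$1.hex" > "$2"    # hex dump -> binary
}
```

If you do not touch the dump in between, the output file is byte-for-byte identical to the input.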

Linux is easy or hard…

…it depends how much you know about it.

There are 100000 ways to do anything on Linux. Of those, only 3 ways
are doable by mere mortals, of which only 2 ways fit what you might
understand, and only 1 way is the way YOU WANT to do it.

Linux is really hard for people who don’t have time (or a Linux-guru
friend to help them) to find those 2 ways. Linux is easy for those who
find their true path :D