Thursday, February 21, 2008

Long Story Short: The old x86-32 Linux 1GB Memory limit problem [*]

You'll find lots of articles around the internet discussing the old 1GB Linux i386 memory limit, but I found none that presents the problem in a short and direct manner. I hope the four points below can fill this gap:

The usual programming practice is that every address points to a single byte, so a 32-bit address can name at most 2^32 bytes = 4GB. Changing this would mean changing the core code and algorithms of a great many existing programs. --> (1)

The 4GB address space needs to be shared between user-space and kernel space. If it wasn't shared, as in Ingo Molnar's 4G/4G patch, a TLB flush would occur for every transition from user-space to kernel space. --> (2)

The usual split was 3GB for user-space and 1GB for kernel-space. This means the kernel cannot map more than 1GB of virtual addresses in its page table entries. --> (3)

Every memory dereference on an x86 processor in protected mode passes through its MMU, so to access a memory frame, a translation must exist from a virtual address to the frame's physical address. --> (4)

From (3), only 1GB of kernel virtual addresses is available, and from (4), every frame the kernel touches needs one of them. Thus, the kernel cannot access more than 1GB of memory.
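The arithmetic behind points (1) to (3) can be checked in a shell. This is only a small sketch, assuming a shell with 64-bit arithmetic (such as bash) and the usual 3GB/1GB split; the variable names are illustrative and not taken from the kernel.

```shell
#!/bin/bash
# 32 address bits, one byte per address: 2^32 bytes in total.
total_bytes=$(( 1 << 32 ))
gib=$(( 1 << 30 ))

user_gib=3                                        # user-space share of the split
kernel_gib=$(( total_bytes / gib - user_gib ))    # what is left for the kernel

# The kernel's part of the address space starts right after the user part.
kernel_start=$(printf '0x%X' $(( user_gib * gib )))

echo "total address space: $(( total_bytes / gib )) GB"
echo "kernel share:        ${kernel_gib} GB, starting at ${kernel_start}"
```

The start address it prints, 0xC0000000, is the familiar PAGE_OFFSET value of i386 kernels built with the 3GB/1GB split.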

(*) Speaking of kernels with no HIGHMEM support enabled.

Friday, February 08, 2008

[Howto] LXRng on Ubuntu 7.10

It's been a long time since my last post and since my latest linux-kernel development activity. I'm getting close to graduation and college is taking all of my time. My latest work on anything not related to college dates back to November, a very long three-month period. Before that I helped develop some small parts of SMACK [1]. I remember those days very well, since it was my first time contributing something beyond kernel-janitors work and simple one-line bug fixes. I was very happy during that active SMACK development period.

Now I'm trying to get back into the kernel development mood by following LKML, testing development trees and continuing to edit my kernel notes. While hacking the Linux kernel, LXR is my main friend. It gets me info about any function or structure in a matter of seconds. Due to some internet connection problems here in Egypt, I decided to deploy it locally. LXR's main author, Arne Georg Gleditsch, forked a new LXR version named LXRng, which I decided to try. Unfortunately the project is relatively new and the documentation is not very user friendly. The build process made me scratch my head for a while, so I felt it was well worth writing a simple Ubuntu installation howto.

So without further ado, here we go:

1- Install the 'git' SCM and clone LXRng's development git tree as follows:
$ sudo apt-get install git-core
$ git clone git://lxr.linux.no/git/lxrng.git

2- Install the PostgreSQL database and its client apps:
$ sudo apt-get install postgresql-8.2
$ sudo apt-get install postgresql-client-8.2

3- Install the Xapian search engine library and its Perl bindings:
$ sudo apt-get install libxapian15
$ sudo apt-get install libsearch-xapian-perl

4- Install Apache and its Perl module and, as mentioned in the documentation, some of the HTML/Web Perl modules:
$ sudo apt-get install apache2 libapache2-mod-perl2
$ sudo apt-get install libcgi-simple-perl libcgi-ajax-perl libhtml-parser-perl libtemplate-perl

5- Install two more CPAN modules needed by the indexing process. One of them is used to check memory usage and the other to draw a shiny progress bar.
$ sudo apt-get install libterm-progressbar-perl libdevel-size-perl

6- Now, assuming 'ahmed' is the username you are logged in with and the one you ran all of the above commands as, run:
$ sudo -i
$ su - postgres
$ createuser ahmed # Answer 'yes' when asked about superuser privileges
$ exit # Leave the postgres shell
$ exit # Leave the root shell

7- Create a suitable database for LXR and add the HTTP daemon user as an unprivileged DB user as follows:
$ createdb lxrng
$ createuser www-data

8- Now we are getting close to some of the interesting parts. We'll copy the template configuration file and modify it to represent our needs:
$ cp lxrng.conf-dist lxrng.conf
$ vim lxrng.conf

9- A lot of options can be specified in the config file, which is a Perl script that you'll need to modify carefully. First, you have two options: either use a git repository, so you can cross-reference all of your project's history, or cross-reference a single source directory.

You'll need to edit the path of the Xapian search files, initialized in $search. Make sure the specified path exists with suitable permissions for the current user, or the following steps will fail.

For referencing a whole git repository:
You'll need to edit the variables below:
.... a- Your git repository path, initialized in $gitrepo
.... b- Your default version, where LXR will open directly, initialized in the line 'ver_default' => 'v2.6.20.3'.
.... c- If your git repository contains much more history than you actually need, you can restrict cross-referencing to the needed versions by editing the line 'ver_list' => ['v2.6.23', 'v2.6.24'].
.... d- The version names given at points b and c must appear exactly in the output of the command "git tag -l". Otherwise, you'll face several failures at build time and at run time.
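The tag check can be automated. Here is a minimal sketch; the helper name missing_tags is made up for this example, and the tag listing is hard-coded so the snippet runs anywhere.

```shell
#!/bin/bash
# Print each wanted tag that is not in the newline-separated list given as $1.
missing_tags() {
    available=$1
    shift
    for tag in "$@"; do
        printf '%s\n' "$available" | grep -Fxq -- "$tag" || printf '%s\n' "$tag"
    done
}

# Example with a hard-coded tag listing; in practice pass "$(git tag -l)",
# run from inside the repository that $gitrepo points to.
missing_tags "$(printf 'v2.6.23\nv2.6.24\n')" v2.6.23 v2.6.24 v2.6.20.3
# prints: v2.6.20.3
```

Any version name it prints should be removed from 'ver_list' and 'ver_default', or fetched into the repository first.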

For referencing an isolated source directory (like a kernel.org release tar file):
Assuming:
@@Source@@: an absolute path to the desired project's source directory
@@Project_Name@@: Desired project name/alias
@@LXR_ROOT@@: Absolute path to the downloaded lxrng source

$ mkdir source
$ (cd source && ln -s @@Source@@ @@Project_Name@@)
$ sed -i 's/$gitrepo/$plainrepo/g' lxrng.conf

Write directly after the last 'use' statement of the conf file:
use LXRng::Repo::Plain;
my $plainrepo = LXRng::Repo::Plain->new('@@Source@@');

Then modify the variables below as stated:
ver_list => ['@@Project_Name@@']
ver_default => '@@Project_Name@@'

10- Create database tables and cross-reference your source repository:
$ ./lxr-db-admin linux --init
$ ./lxr-genxref linux

11- Have a break and hang out with your friends for a while if you have a huge source repository like the Linux kernel.

12- The last step is to configure Apache to point to the LXR CGI scripts. This is done with the following steps:
$ cp apache2-site.conf-dist-perl apache2-site.conf
$ sudo ln -s "$PWD/apache2-site.conf" /etc/apache2/sites-enabled/010-lxrng
$ vim apache2-site.conf
$ # Replace @@LXRROOT@@ with the LXR source directory root.
$ # Ensure that user www-data has read permission on the LXRROOT files.
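The placeholder replacement can also be done without opening an editor. This is a sketch, assuming GNU sed (for the -i option) and a directory name free of '|', '&' and backslash characters; fill_lxrroot is a name invented for this example.

```shell
#!/bin/bash
# Replace every @@LXRROOT@@ placeholder in file $1 with the directory $2.
# Assumes GNU sed and that $2 contains no '|', '&' or backslash characters.
fill_lxrroot() {
    sed -i "s|@@LXRROOT@@|$2|g" "$1"
}

# Usage, from inside the LXRng source directory:
#   fill_lxrroot apache2-site.conf "$PWD"
#   grep -c '@@LXRROOT@@' apache2-site.conf   # lines still holding the placeholder
```

'|' is used as the sed delimiter so that the slashes in the path need no escaping.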

13- Congratulations! At last, we're done. You can begin exploring the kernel sources at: http://localhost/lxr. Enjoy your time.

Thanks to Arne Georg Gleditsch for providing this amazingly useful piece of software.

See you in the next post :).

[1]: You can learn more about SMACK from the SMACK homepage.

Friday, September 07, 2007

Unix Administration (II): Security (C)

A new project started here at Intel and a secure Unix development server was needed. I took on the task of deploying and administering that server. It's a classic TRAC + SVN setup, but since the server will hold proprietary code, encryption was needed for all outgoing traffic. HTTPS will be the transport for all TRAC pages and SVN transactions.

After securing the internal server components as documented in previous articles, the server's network interfaces needed securing. Iptables is the prime utility for this goal. The steps below assume a little familiarity with the TCP/IP protocols and with iptables. If it's your first time, read this wonderfully written and detailed Iptables tutorial. Here we go:

1- Three kinds of custom chains (four chains in total) will be created for configuration modularity.
valid-src: ensure a sane source address for incoming IP packets, otherwise drop the packet
valid-dst: ensure a sane destination address for outgoing IP packets, otherwise drop the packet
log-drop-{in,out}: log packets in a non-flooding way, then drop them

iptables -N valid-src
iptables -N valid-dst
iptables -N log-drop-in
iptables -N log-drop-out

2- Define rules for each custom chain to meet their objectives.
a- To achieve valid-{src,dst} objectives, we'll:
I) Drop packets claiming a private network source address
II) Drop packets claiming _our_ external IP as their source address
III) Drop packets related to multicast subnets

iptables -A valid-src -s 10.0.0.0/8 -j DROP
iptables -A valid-src -s 172.16.0.0/12 -j DROP
iptables -A valid-src -s 192.168.0.0/16 -j DROP
iptables -A valid-src -s 224.0.0.0/4 -j DROP
iptables -A valid-src -s 240.0.0.0/5 -j DROP
iptables -A valid-src -s 127.0.0.0/8 -j DROP
iptables -A valid-src -s 0.0.0.0/8 -j DROP
iptables -A valid-src -d 255.255.255.255 -j DROP
iptables -A valid-src -s 169.254.0.0/16 -j DROP
iptables -A valid-src -s $EXTERNAL_IP -j DROP
iptables -A valid-dst -d 224.0.0.0/4 -j DROP
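The rules in this post rely on two shell variables, $EXTERNAL_IP and $EXTERNAL_INT, that are never defined here. One possible way to set them is sketched below, assuming the iproute2 "ip" tool is installed and that the Internet-facing interface is called eth0 (both are assumptions; adjust to your machine). The helper name first_ipv4 is made up for this example.

```shell
#!/bin/bash
# Extract the first IPv4 address from `ip -o -4 addr show` output on stdin.
first_ipv4() {
    awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

EXTERNAL_INT=eth0    # assumption: name of the Internet-facing interface
EXTERNAL_IP=$(ip -o -4 addr show dev "$EXTERNAL_INT" 2>/dev/null | first_ipv4)
echo "EXTERNAL_INT=$EXTERNAL_INT EXTERNAL_IP=$EXTERNAL_IP"
```

If EXTERNAL_IP comes out empty, the interface name is wrong or it has no IPv4 address, and the rule using it should not be loaded as is.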

b- log-drop-in and log-drop-out will log the matching packets with a short unique prefix that identifies which chain they came from. They also avoid flooding our system logs by limiting logging to 10 entries/hour. After packets are logged, they are quietly dropped.

iptables -A log-drop-in -m limit --limit 10/hour -j LOG \
--log-prefix "filter (blocked) INPUT:"
iptables -A log-drop-in -j DROP

iptables -A log-drop-out -m limit --limit 10/hour -j LOG \
--log-prefix "filter (blocked) OUTPUT:"
iptables -A log-drop-out -j DROP

3- Allow all traffic passing through the loopback interface

iptables -A INPUT -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT

4- To get the best of our valid-src and valid-dst chains, we'll let all packets that pass through the "filter" table's INPUT, OUTPUT and FORWARD chains on the external interface jump to valid-src/dst before they are processed any further.

iptables -A INPUT -i $EXTERNAL_INT -j valid-src
iptables -A FORWARD -i $EXTERNAL_INT -j valid-src
iptables -A OUTPUT -o $EXTERNAL_INT -j valid-dst
iptables -A FORWARD -o $EXTERNAL_INT -j valid-dst


5- Filter incoming traffic.

a- Prevent SYN floods. Do not accept more than 5 TCP handshakes per second.

iptables -A INPUT -p tcp -m limit --limit 5/second \
--tcp-flags SYN,RST,ACK,FIN SYN --dport 443 -j ACCEPT

b- Accept all other normal (non-handshake) https traffic to our server without limits.

iptables -A INPUT -p tcp -m state --state ESTABLISHED,RELATED \
--dport 443 -j ACCEPT
# Also allow all outgoing apache https packets
iptables -A OUTPUT -p tcp --sport 443 -j ACCEPT

c- Disable accepting pings by dropping echo requests and by not sending echo replies

iptables -A INPUT -p icmp --icmp-type echo-request -j DROP
iptables -A OUTPUT -p icmp --icmp-type echo-reply -j DROP

d- Finish with a catch-all rule that acts as a default DROP policy: all packets that didn't match any INPUT rule will be logged, then dropped.

iptables -A INPUT -j log-drop-in

6- Filter outgoing traffic
a- Allow server to access DNS services

iptables -A OUTPUT -p udp --dport 53 --sport 1024:65535 -j ACCEPT
iptables -A INPUT -p udp --sport 53 --dport 1024:65535 -j ACCEPT

b- Allow server to access http{,s} services. This is done by allowing sending tcp packets only to ports 80 and 443 and receiving non-handshake tcp packets only from ports 80 and 443.

iptables -A OUTPUT -p tcp -m state --state NEW,ESTABLISHED,RELATED \
-m multiport --dports 80,443 \
-m multiport --sports 1024:65535 -j ACCEPT

iptables -A INPUT -p tcp -m state --state ESTABLISHED,RELATED \
-m multiport --sports 80,443 \
-m multiport --dports 1024:65535 -j ACCEPT

c- Finish with a catch-all rule that acts as a default DROP policy: all packets that didn't match any OUTPUT rule will be logged, then dropped.

iptables -A OUTPUT -j log-drop-out
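Steps 5d and 6c speak of a default DROP policy, yet none of the commands above sets one with "iptables -P". The catch-all log-drop rules already drop unmatched INPUT and OUTPUT packets, but the FORWARD chain has no such rule. If an explicit policy is wanted as well, it could be set as sketched here. The snippet only prints the commands by default; the IPTABLES variable and the function name are inventions of this example.

```shell
#!/bin/bash
# Dry run by default: print the commands instead of running them.
# Set IPTABLES=iptables (as root) to apply them for real.
IPTABLES=${IPTABLES:-echo iptables}

# Make DROP the built-in policy of the three "filter" table chains.
set_default_policies() {
    for chain in INPUT OUTPUT FORWARD; do
        $IPTABLES -P "$chain" DROP
    done
}

set_default_policies
```

On a remote machine, apply the policies only after the ACCEPT rules are loaded, otherwise the session that is loading them gets cut off.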

Issues:

Using the above iptables rules will block traceroute traffic. Traceroute uses either UDP or ICMP echo packets with incrementing TTL values to do its job. While scanning the logs to study the blocked traceroute traffic, I discovered that it uses a different UDP destination port for each packet it sends.

filter (blocked) OUTPUT: TTL=2 ID=37822 PROTO=UDP SPT=37816 DPT=33440
filter (blocked) OUTPUT: TTL=3 ID=37823 PROTO=UDP SPT=37816 DPT=33441
filter (blocked) OUTPUT: TTL=3 ID=37824 PROTO=UDP SPT=37816 DPT=33442

Allowing such unreliable traffic would further complicate our iptables rules. To avoid this problem, the tcptraceroute utility can be used instead: it sends TCP SYN packets, which our configuration already allows, rather than the above UDP packets.
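The changing destination port is easy to pull out of the logs. The sketch below feeds the three log lines quoted above to a small sed filter; blocked_dports is a name made up for this example, and real kernel log lines carry more fields than the shortened ones shown in this post.

```shell
#!/bin/bash
# Print the UDP destination port (DPT=...) of each blocked-packet line on stdin.
blocked_dports() {
    sed -n 's/.*PROTO=UDP.* DPT=\([0-9][0-9]*\).*/\1/p'
}

blocked_dports <<'EOF'
filter (blocked) OUTPUT: TTL=2 ID=37822 PROTO=UDP SPT=37816 DPT=33440
filter (blocked) OUTPUT: TTL=3 ID=37823 PROTO=UDP SPT=37816 DPT=33441
filter (blocked) OUTPUT: TTL=3 ID=37824 PROTO=UDP SPT=37816 DPT=33442
EOF
```

It prints 33440, 33441 and 33442, one per line: one new port per probe, which is why a single fixed-port rule cannot cover this traffic.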

Tuesday, August 21, 2007

Constructing a minimal Debian Linux USB disk

So I'm close to the end of my two-month Intel internship and everything is still great. One of the OEMs who were given some of the alpha installation disks created in the previous post asked about a minimal version of the project that can fit in a small 512 MB flash memory. I was given that task so here are the steps to create such a GNU/Linux flash disk:

1- Zero the whole USB disk
$ dd if=/dev/zero of=/dev/sdb

2- Create a new partition table with a single ext2 partition.

Note: At first I decided to divide the USB disk into two partitions. This worked well on all the machines I could reach, and even on some engineering samples from Intel classmates that I found around the corner. Unfortunately, on our board it made GRUB very slow in loading the kernel and the initrd image (about 2 minutes!). This board, an unreleased Intel board from a minimal essential series, may have a buggy BIOS or something new somewhere. I tried to find the reason for the slowness, but time started to run out, so I fell back to the one-partition USB disk method.

$ parted /dev/sdb "mklabel msdos mkpart primary 0 -0 toggle 1 boot"
$ mkfs.ext2 /dev/sdb1

3- Label the partition as "HELIO_ROOT" to avoid depending on the USB disk's device name, which differs from machine to machine. On some machines the USB disk appears as sda1, but on others it appears as sdb1. Accessing the root partition by label solves most of those problems (the UUID technique is actually the most effective method, and the one used by the latest Ubuntu Feisty).
$ e2label /dev/sdb1 HELIO_ROOT

4- Install minimal Debian etch distribution files as follows:
$ debootstrap --arch i386 etch $CHROOT_DIR http://ftp.debian.org/debian

5- Prepare the installation environment under a chrooted folder. Building Linux distributions this way ensures that the newborn distribution won't be tainted by the host environment.
The chrooted environment can't see the real /dev, /proc and /sys folders, which are needed for hardware access and for internet access too. To solve this problem I created the script below, which bind-mounts the needed host folders onto their counterparts in the chroot (dev, proc, sys, tmp). It also lets X applications run easily from the chrooted environment.

#
# chroot.sh - Chroot to a different environment while being able
# to access the internet and run X applications
# Written by: Ahmed S. Darwish under the license of your choice
#
export CHROOT_ENV=$CHROOT_DIR

# We need files under /dev for various block and character devices operations
sudo mount --bind /dev/ $CHROOT_ENV/dev/
# We need /proc and /sys for chrooted environment internet access
sudo mount --bind /proc/ $CHROOT_ENV/proc/
sudo mount --bind /sys/ $CHROOT_ENV/sys/

# Let the chrooted environment X apps see the host X windows session files
sudo mount --bind /tmp/ $CHROOT_ENV/tmp/

# Let our host X accept connections from all local clients (so from the chrooted env. too)
xhost +
sudo cp ~/.Xauthority $CHROOT_ENV/root/
sudo cp ~/.ICEauthority $CHROOT_ENV/root/

# Do the chroot and global variable modification in a subshell not to affect the host session
(
export TERM="xterm"
export SHELL="/bin/bash"
export USER="root"
export USERNAME="root"
export PATH="/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin"
export PWD="/"
# Have the illusion of a first shell level
export SHLVL="1"
export HOME="/root"
export LOGNAME="root"
export DISPLAY=":0"
export XAUTHORITY="$HOME/.Xauthority"
export COLORTERM="$TERM"

sudo chroot "$CHROOT_ENV" /bin/bash -i
)

6- Chroot to the minimal environment by executing the above script. As you may have noticed, the chrooted environment needs a little tweaking.

Let directories that contain temporary files or logs be mounted on tmpfs so they don't eat the limited USB disk space. We won't need those files after shutdown anyway, and logs may grow very large after a long period of use.

Add the following to the chrooted /etc/fstab:
LABEL=HELIO_ROOT / ext2 defaults,errors=remount-ro,noatime 0 1
proc /proc proc defaults 0 0
tmpfs /tmp tmpfs defaults,noatime 0 0
tmpfs /var/lock tmpfs defaults,noatime 0 0
tmpfs /var/log tmpfs defaults,noatime 0 0
tmpfs /var/run tmpfs defaults,noatime 0 0
tmpfs /var/tmp tmpfs defaults,noatime 0 0
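A typo in fstab can leave the stick unbootable, so a quick field count is a cheap sanity check. This is a rough sketch: fstab(5) lets the last two fields be omitted, but every entry above spells out all six, so anything else is flagged. The function name is made up for this example.

```shell
#!/bin/bash
# Print "line-number: text" for each fstab entry on stdin that does not have
# exactly six fields. Blank lines and comments are skipped.
bad_fstab_lines() {
    awk 'NF && $1 !~ /^#/ && NF != 6 { print NR ": " $0 }'
}

# Usage inside the chroot:
#   bad_fstab_lines < /etc/fstab
```

No output means every entry has the device, mount point, type, options, dump and pass fields in place.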

Add the Debian apt repositories to the apt sources file (/etc/apt/sources.list):
deb http://ftp.debian.org/debian etch main non-free contrib
deb-src http://ftp.debian.org/debian etch main non-free contrib

/etc/mtab is written to a lot, which may shorten the flash lifetime. Let it be a symlink to /proc/mounts:
$ rm -f /etc/mtab && ln -s /proc/mounts /etc/mtab

$ echo "127.0.0.1 localhost" > /etc/hosts

7- Recover the disk space wasted on locale files and localized man pages by installing the localepurge utility. After installation, the localepurge script will automatically delete the locale files of any newly installed Debian package.
$ apt-get install localepurge
$ localepurge

8- Install a linux kernel image and the grub boot loader.
$ apt-get install linux-image-386
$ apt-get install grub
$ grub-install --root-directory=/ /dev/sdb

9- Create the file /boot/grub/menu.lst with your favourite options, or with:
# No menu to user by default
timeout 0
# pretty red colours ;)
color red/black black/red

title Mini-Heliocreek USB disk powered by Debian etch
root (hd0,0)
kernel /vmlinuz root=LABEL=HELIO_ROOT init=/sbin/init
initrd /initrd.img
savedefault
boot
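Since the root partition is found by label, the label on the kernel line of menu.lst has to match the one given to e2label and used in /etc/fstab. A small sketch for comparing the two files follows; both function names are made up for this example.

```shell
#!/bin/bash
# Print the label named by root=LABEL=... on the kernel line of a GRUB menu.lst.
menu_root_label() {
    sed -n 's/^[[:space:]]*kernel[[:space:]].*root=LABEL=\([^[:space:]]*\).*/\1/p' "$1"
}

# Print the label of the fstab entry mounted on "/".
fstab_root_label() {
    awk '$2 == "/" && $1 ~ /^LABEL=/ { sub(/^LABEL=/, "", $1); print $1 }' "$1"
}

# Usage inside the chroot:
#   [ "$(menu_root_label /boot/grub/menu.lst)" = "$(fstab_root_label /etc/fstab)" ] \
#       || echo "root label mismatch between menu.lst and fstab"
```

A mismatch here typically shows up only at boot time, when the root file system cannot be found.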

10- Add a default system user named "intel", on whose behalf everything will run:
$ groupadd intel
$ useradd --create-home --home /home/intel --shell /bin/bash --gid intel intel

11- Enable audio by installing the ALSA packages and adding the "intel" user to the audio group:
$ apt-get install alsa-base alsa-oss
$ adduser intel audio

12- Enable autologin for the default user "intel" using `rungetty', a very easy to use getty program already available in the Debian repository.
Note: The plain old getty package can do it, but "rungetty" is less error-prone in the autologin part.
$ apt-get install rungetty
$ sed -i '/getty/d' /etc/inittab
$ cat >> /etc/inittab << EOF
1:2345:respawn:/sbin/rungetty --autologin intel
EOF

13- Let the X server start automatically when the intel user logs in by calling startx in his .bash_profile file:
$ cat > /home/intel/.bash_profile << 'EOF'
if [ -z "$DISPLAY" ] ; then
while ! pidof X > /dev/null ; do
startx -- -ignoreABI
done
fi
EOF

14- When the X server starts, it executes the commands found in the $HOME/.xinitrc file. Let this file launch our application.
The scenario begins when the user is automatically logged in by "rungetty", launched by "init". His default shell "/bin/bash" is then executed and runs the .bash_profile script, which starts X. X, following the orders of the .xinitrc file, runs our application under its context.

$ cat > /home/intel/.xinitrc << 'EOF'
# Disable the default ugly X background, let it be black
/usr/bin/xsetroot -solid black
xrandr -s 0
/usr/bin/xset s noblank
# Disable screensavers!
/usr/bin/xset s off
# Disable DPMS (Energy star) features
/usr/bin/xset -dpms
xhost +
# Our application sometimes launches mplayer, mplayer fails if the
# application was run in the foreground as the main X application.
# The hack is to run the application in the background and run an
# idle always-on application in the foreground.
OUR_APPLICATION &
openbox
EOF

15- Clean the local cached repository of retrieved package files
$ apt-get clean

16- Exit cleanly from the chrooted environment by typing "exit", then:
$ sudo umount $CHROOT_DIR/dev
$ sudo umount $CHROOT_DIR/sys
$ # And so on for the proc/ and tmp/ folders
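All four bind mounts made by chroot.sh can be released in one loop. The sketch below prints the commands by default, so it is safe to try; the UMOUNT variable, the function name and the fallback path /srv/chroot are inventions of this example.

```shell
#!/bin/bash
CHROOT_DIR=${CHROOT_DIR:-/srv/chroot}    # hypothetical path; use your own
UMOUNT=${UMOUNT:-echo sudo umount}       # dry run; set UMOUNT="sudo umount" to apply

# Unmount the bind mounts in the reverse order chroot.sh created them.
unmount_chroot_binds() {
    for dir in tmp sys proc dev; do
        $UMOUNT "$CHROOT_DIR/$dir"
    done
}

unmount_chroot_binds
```

Leaving one of them mounted matters for the next step, because a recursive copy of the chroot would then also copy the host's /dev, /proc, /sys or /tmp contents.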

17- Copy the distribution files to the usb stick:
$ sudo mount /dev/sdb1 /mnt/usb_stick
$ sudo cp -rpvf $CHROOT_DIR/* /mnt/usb_stick
$ sudo umount /mnt/usb_stick

18- Try the usb stick using the qemu emulator
$ qemu /dev/sdb

If any problem is found, post a comment and I'll try to help as much as I can.

Thanks, see you in the next post :)