[HOWTO] iPXE boot a Live diskless Mageia 4 ramdisk

Postby syschuck » Apr 29th, '15, 23:37

Hi fellow magicians. What follows is a tutorial on how to create a diskless Live Mageia 4 that PXE boots from a server. The motivation for this came when trying to build a High Performance Computing cluster using Mageia. The cluster consists of 1 head node with a RAID5 array (8 TB), 16GB RAM and dual quad-core AMD Opterons. It will act as a router and NAT for the 16 slave nodes, which also have dual quad-core Opterons, 16GB RAM and 80GB disks. While not the fastest cluster, it has some muscle when combined. Because of the amount of RAM on each node, I wanted to boot a live Mageia system and use the 80GB disk drives as swap.

There are three major steps. First we need to set up the server. Then we need to configure iPXE. Finally we need a live image to boot. Once this is done, you could PXE boot this live image on any system in your network for a great thin client, provided there is enough RAM on the system. This tutorial will also show you how to install FOG (the Free Opensource Ghost server), how to build an iPXE boot system, and finally how to make an initramfs live image that can be booted out of RAM. So let's get started. The server IP address on the cluster side is 10.0.10.1. Use whatever fits your network topology.

We need several services on the server, such as dhcpd, tftpboot, iPXE, and apache. iPXE is a newer open source network boot firmware with some very cool features, such as scripting, menus and booting over HTTP. You could use pxelinux as an alternative, but those features make iPXE worth it. To make installation of all of these services as simple as possible, I recommend downloading FOG from svn and installing it manually: http://fogproject.org/ It recognizes Mageia and will install all the services you need to network boot. If you service Windows systems, having FOG is a bonus. To download FOG:
Code: Select all
 svn checkout svn://svn.code.sf.net/p/freeghost/code/trunk fog

Once downloaded:
Code: Select all
 cd ./fog/bin; ./installfog.sh

This will use urpmi to download all of the services you will need on the server side to get things going. Once everything is built, you will have a nice FOG server for taking snapshots of your laptops and desktops and rolling out thousands of images across a network. FOG is awesome as its own tool. If you have any trouble with FOG, they have a great forum and knowledgeable experts on iPXE, tftpboot, and dhcpd. Once all of that seems to be working, it's time to play with iPXE.

FOG uses iPXE to do some pretty magical image ghosting. The default boot code for iPXE is located in /var/lib/tftpboot/undionly.kpxe. dhcpd hands this filename to the PXE client, which then fetches it from the server over TFTP. The filename is specified in /etc/dhcpd.conf:
Code: Select all
# DHCP Server Configuration file. /etc/dhcpd.conf
# see /usr/share/doc/dhcp*/dhcpd.conf.sample
# This file was created by FOG

# Definition of PXE-specific options
# Code 1: Multicast IP address of bootfile
# Code 2: UDP port that client should monitor for MTFTP responses
# Code 3: UDP port that MTFTP servers are using to listen for MTFTP requests
# Code 4: Number of seconds a client must listen for activity before trying
#         to start a new MTFTP transfer
# Code 5: Number of seconds a client must listen before trying to restart
#         a MTFTP transfer

option space PXE;
option PXE.mtftp-ip    code 1 = ip-address;
option PXE.mtftp-cport code 2 = unsigned integer 16;
option PXE.mtftp-sport code 3 = unsigned integer 16;
option PXE.mtftp-tmout code 4 = unsigned integer 8;
option PXE.mtftp-delay code 5 = unsigned integer 8;
option arch code 93 = unsigned integer 16; # RFC4578

use-host-decl-names on;
ddns-update-style interim;
ignore client-updates;
next-server 10.0.10.1;

# Specify subnet of ether device you do NOT want serviced.  For systems with
# two or more ethernet devices.  Block dhcpd from internet port for HPC cluster
# Note: 136.165 is ours.   Use yours.
subnet 136.165.0.0 netmask 255.255.0.0 { }

# IP address for cluster slave nodes.
subnet 10.0.10.0 netmask 255.255.255.0 {
        option subnet-mask              255.255.255.0;
        range dynamic-bootp 10.0.10.10 10.0.10.254;
        default-lease-time 21600;
        max-lease-time 43200;
        option domain-name-servers      10.0.10.1;
        option routers      10.0.10.1;
        filename "undionly.kpxe";
# Old school pxeboot would use
#      filename "pxelinux.0";
}
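
Before restarting dhcpd it can help to confirm that the two PXE-relevant statements are what you expect. The following is only a minimal sketch and not part of FOG or dhcpd: the function name pxe_settings is made up for illustration, and it assumes each next-server and filename statement sits on its own line, as in the file above.

```shell
#!/bin/bash
# Hypothetical helper (not part of FOG or dhcpd): print the active
# next-server and filename statements from a dhcpd.conf-style file.
# Lines that start with '#' are treated as comments and skipped.
pxe_settings() {
    local conf="${1:-/etc/dhcpd.conf}"
    awk '
        /^[ \t]*#/ { next }                      # skip comment lines
        $1 == "next-server" || $1 == "filename" {
            value = $2
            gsub(/[";]/, "", value)              # drop quotes and the trailing semicolon
            print $1 "=" value
        }
    ' "$conf"
}
```

Calling pxe_settings /etc/dhcpd.conf on the configuration above should list next-server=10.0.10.1 and filename=undionly.kpxe. For a full syntax check, ISC dhcpd can also be run as dhcpd -t -cf /etc/dhcpd.conf.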


When undionly.kpxe boots, it reads the configuration in
Code: Select all
 /var/lib/tftpboot/default.ipxe
and this is where we can get creative. By default FOG installs these lines in default.ipxe:
Code: Select all
#!ipxe
cpuid --ext 29 && set arch x86_64 || set arch i386
params
param mac0 ${net0/mac}
param arch ${arch}
param product ${product}
param manufacturer ${product}
param ipxever ${version}
param filename ${filename}
isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
:bootme
chain http://10.0.10.1/fog/service/ipxe/boot.php##params


iPXE, however, is a small scripting language in itself. See: http://ipxe.org/ So let's add a menu and some options. My default.ipxe looks like this:
Code: Select all
#!ipxe
:MENU
menu
item --gap -- ---------------- iPXE boot menu ----------------
item TS TerminalSession Live
item Mageia Mageia4 installer Live
item SL Scientific Linux64 7.1 Live Kde
item hostinfo Computer Details
item shell ipxe shell
item ubcd Universal Boot CD
item fog Fog menu
choose --default shell --timeout 5000 target && goto ${target}
 
# Terminal server linux works
:TS
initrd http://10.0.10.1/live/TSLive.iso
chain ${boot-url}/memdisk iso raw ||
goto MENU
 
# Mageia 4 live diskless
:Mageia
initrd ${base-url}/custom-initramfs.cpio.gz
chain  ${base-url}/vmlinuz audit=0 rescue
goto MENU

# Scientific Linux Doesn't work.
:SL
sanboot http://10.0.10.1/live/SL71LiveDVDkde.iso
goto MENU
 
:hostinfo
echo This computer : ||
echo MAC address....${net0/mac} ||
echo IP address.....${ip} ||
echo Netmask........${netmask} ||
echo Serial.........${serial} ||
echo Asset number...${asset} ||
echo Manufacturer...${manufacturer} ||
echo Product........${product} ||
echo BIOS platform..${platform} ||
echo ||
echo press any key to return to Menu ||
prompt
goto MENU
 
:shell
shell ||
goto MENU

# UBCD didn't work either.
:ubcd
initrd http://10.0.10.1/live/ubcdlive02b.iso root=/dev/ram0 rw ramdisk_size=20000000
chain ${boot-url}/memdisk iso ||
goto MENU
 
:fog
cpuid --ext 29 && set arch x86_64 || set arch i386
params
param mac0 ${net0/mac}
param arch ${arch}
param product ${product}
param manufacturer ${product}
param ipxever ${version}
param filename ${filename}
isset ${net1/mac} && param mac1 ${net1/mac} || goto bootme
isset ${net2/mac} && param mac2 ${net2/mac} || goto bootme
:bootme
# chain http://10.0.10.1/fog/service/ipxe/boot.php##params
chain http://10.0.10.1/fog/service/ipxe/boot.php?mac=${net0/mac} ||
prompt
goto MENU

:return
autoboot


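Basically we are just interested in adding the Mageia selection to default.ipxe. Note that ${base-url} is not set anywhere in the script above. An unset iPXE variable should expand to an empty string, in which case the two paths are resolved relative to the place default.ipxe was loaded from, which here is the TFTP root /var/lib/tftpboot. If you keep the files somewhere else, set base-url explicitly.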
Code: Select all
 
:Mageia
initrd ${base-url}/custom-initramfs.cpio.gz
chain  ${base-url}/vmlinuz audit=0 rescue


Now we need to build the custom-initramfs.cpio.gz and get a copy of vmlinuz. First, let's create a directory called /diskless and a mount point /mnt/iso:
Code: Select all
mkdir /diskless
mkdir -p /mnt/iso


I found that the Mageia 4 boot.iso works very well as the base for the initramfs and the vmlinuz kernel. (Also, I'm doing 64 bit.)
Code: Select all
 
# Quote the URL so the shell does not treat the '?' as a glob character.
wget 'https://www.mageia.org/en/downloads/get/?q=Mageia-4.1-Boot-nonfree-x86_64-CD.iso'

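Once downloaded, let's mount it and extract the old initramfs. (If wget saved an HTML redirect page rather than the ISO itself, download the ISO from the mirror that page points to and make sure the file is named as below.)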
Code: Select all
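mount -o loop Mageia-4.1-Boot-nonfree-x86_64-CD.iso /mnt/iso
# Copy the installer kernel to the TFTP root (the iPXE menu loads vmlinuz).
cp /mnt/iso/isolinux/x86_64/vmlinuz /var/lib/tftpboot/
cd /diskless
# Unpack the installer initramfs into /diskless, creating directories as needed.
xzcat /mnt/iso/isolinux/x86_64/all.rdz | cpio --extract --make-directories
umount /mnt/iso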
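
A quick sanity check after this step can save a failed boot later. The sketch below is an illustration only: check_extraction is a made-up name, and the defaults simply assume the paths used in this post, with the kernel stored as vmlinuz (the name the iPXE menu entry asks for).

```shell
#!/bin/bash
# Hypothetical sanity check: confirm the kernel was copied to the TFTP root
# and that the installer initramfs was unpacked. Both paths can be overridden.
check_extraction() {
    local tftp_root="${1:-/var/lib/tftpboot}"
    local root_dir="${2:-/diskless}"
    local status=0

    # The kernel must exist and must not be an empty file.
    if [ ! -s "$tftp_root/vmlinuz" ]; then
        echo "missing or empty: $tftp_root/vmlinuz" >&2
        status=1
    fi
    # The unpacked tree must contain the init script that gets edited later.
    if [ ! -e "$root_dir/init" ]; then
        echo "missing: $root_dir/init" >&2
        status=1
    fi
    return "$status"
}
```

Run check_extraction with no arguments to test the default locations; it prints what is missing and returns a non-zero status if either file is absent.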


This will extract the initramfs from the network installer and build a nearly complete file system. Now let's add your packages. You may want to modify this for your own system's needs; this is what I thought would be good. This is the diskless.sh program:
Code: Select all
 
#!/bin/bash
# This program builds a mageia root filesystem for "diskless" nfs exported filesystem.
# Run as root! CBS: 2013  Public domain.
#
# DESTINATION=$1
DESTINATION="/diskless"
# SERVER_IP=IP4 address of DNS/NFS server hosting diskless filesystem.
SERVER_IP=10.0.10.1

if [ -d "$DESTINATION" ]; then
    echo "Destination ${DESTINATION} exists. Proceed anyway? (Y/N) "
    read -r A
    # Abort on "N" or "n"; any other answer continues.
    if [ "$A" = "N" ] || [ "$A" = "n" ]; then
        exit 1
    fi
else
    echo "Building root filesystem for diskless nfsroot in ${DESTINATION}"
    mkdir -p "$DESTINATION"
fi


# STAGE1  Prepare the destination
rpm --root $DESTINATION --initdb
urpmi --root $DESTINATION --auto basesystem-minimal

# Install basic commands and utilities
urpmi --root $DESTINATION --auto locales-en nfs-utils unfs3 bash-completion colorprompt openssh-clients openssh-server emacs-nox netkit-telnet-server netkit-telnet rsh \
    rsh-server task-c-devel task-c++-devel

# Install developers code, mpi, and what ever, etc.. here.
# urpmi

# STAGE2  Things in /etc
#
echo "nameserver" $SERVER_IP >> $DESTINATION/etc/resolvconf/resolv.conf.d/base
touch $DESTINATION/var/lib/random-seed
echo "node00-0" > $DESTINATION/etc/hostname
cp -f /etc/hosts /etc/passwd /etc/shadow /etc/group /etc/gshadow $DESTINATION/etc
echo "NETWORKING=yes" >> $DESTINATION/etc/sysconfig/network
echo "NEED_IDMAP=yes" >> $DESTINATION/etc/sysconfig/nfs-common
echo "READONLY=yes" > $DESTINATION/etc/sysconfig/readonly-root
#
# Modify rwtab
# Note: This is important for a readonly root filesystem.
#
sed -i '/\/var\/lib\/dbus/d' $DESTINATION/etc/rwtab
echo "empty   /root
files   /var/run
files   /var/lock
files   /var/lib/dbus" > $DESTINATION/etc/rwtab.d/custom
#
# This is /etc/fstab for the nfsroot file system. You may want to edit this as you see fit.
#

echo "# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    defaults        0       0
none            /               tmpfs   defaults        0       0
none            /tmp            tmpfs   defaults,rw,noatime,mode=1777,nosuid,size=512M  0       0
none            /var/run        tmpfs   defaults        0       0
none            /var/lock       tmpfs   defaults        0       0
none            /var/tmp        tmpfs   defaults,rw,noatime,mode=1777,nosuid,size=512M  0       0
none            /root           tmpfs   defaults,rw,noatime,mode=1777,nosuid,size=512M  0       0
# ${SERVER_IP}:/home  /home           nfs     sync,hard,intr,rw,nolock,rsize=8192,wsize=8192  0       0
# ${SERVER_IP}:/usr/local /usr/local  nfs     sync,hard,intr,rw,nolock,rsize=8192,wsize=8192  0       0
# 10.0.0.1:/raid0  /raid0               nfs     sync,hard,intr,rw,nolock,rsize=8192,wsize=8192  0       0
# 10.0.0.1:/raid1  /raid1               nfs     sync,hard,intr,rw,nolock,rsize=8192,wsize=8192  0       0
" > $DESTINATION/etc/fstab


You may want to add yellow pages (NIS) and ssh/rsa keys to your image. Run diskless.sh and you have just about everything set up. Before we are done, we need to make a modification to the file /diskless/init.

init is the shell script run at boot time that gets everything started. Normally it sets up a temporary ramdisk, finds the real Linux root filesystem, and then switches over to it. Well, kind of. It's complicated, but eventually it runs a command called /sbin/switch_root, which makes the real filesystem the new root and then starts systemd. Because we don't want to switch to another filesystem and would rather keep running from our ramdisk, all we need to do is start systemd (/sbin/init is symlinked to systemd). So using vi, or your favorite editor, edit /diskless/init, comment out the whole last section and add the exec statement. This code is at the very bottom of /diskless/init.

Code: Select all
CAPSH=$(command -v capsh)
SWITCH_ROOT=$(command -v switch_root)
PATH=$OLDPATH
export PATH

#if [ -f /etc/capsdrop ]; then
#    . /etc/capsdrop
#    info "Calling $INIT with capabilities $CAPS_INIT_DROP dropped."
#    unset RD_DEBUG
#    exec $CAPSH --drop="$CAPS_INIT_DROP" -- \
#        -c "exec switch_root \"$NEWROOT\" \"$INIT\" $initargs" || \
#    {
#       warn "Command:"
#       warn capsh --drop=$CAPS_INIT_DROP -- -c exec switch_root "$NEWROOT" "$INIT" $initargs
#       warn "failed."
#       action_on_fail
#    }
#else
    unset RD_DEBUG
#    exec $SWITCH_ROOT "$NEWROOT" "$INIT" $initargs || {
#       warn "Something went very badly wrong in the initramfs.  Please "
#       warn "file a bug against dracut."
#       action_on_fail
#    }
#fi
warn "Executing init on initramfs"
exec /sbin/init
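
Since a half-commented init will leave the node hanging at boot, it may be worth checking the edit before packing the image. This is a rough sketch under stated assumptions: init_is_patched is a hypothetical name, and the check only looks for uncommented exec lines that mention switch_root plus an active exec /sbin/init line, which matches the listing above but may not cover other dracut versions.

```shell
#!/bin/bash
# Hypothetical check: succeed only if the init script has no active
# (uncommented) exec of switch_root left and contains an active
# "exec /sbin/init" line.
init_is_patched() {
    local init_file="${1:-/diskless/init}"

    # An uncommented exec line mentioning switch_root means the edit is incomplete.
    # -i also catches the $SWITCH_ROOT variable form.
    if grep -Eiq '^[^#]*exec[[:space:]].*switch_root' "$init_file"; then
        echo "active switch_root exec still present in $init_file" >&2
        return 1
    fi
    # The replacement line added at the bottom of the script.
    grep -Eq '^[[:space:]]*exec[[:space:]]+/sbin/init[[:space:]]*$' "$init_file"
}
```

For example, init_is_patched /diskless/init && echo "init looks patched" prints the message only when both conditions hold.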


Now we are ready to create our new ramdisk image, /var/lib/tftpboot/custom-initramfs.cpio.gz. For that I use this shell script, which I call make_initramfs.sh:
Code: Select all
#!/bin/sh
# Pack /diskless into a gzip-compressed newc cpio archive in the TFTP root.
cd /diskless || exit 1
find . -print0 | cpio --null -ov --format=newc | gzip -9 > /var/lib/tftpboot/custom-initramfs.cpio.gz
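
Because the whole image has to be downloaded and unpacked into RAM on every node, a quick check of the finished archive can be useful. The following is a sketch only: check_initramfs is a made-up helper, and its default path is the image location named in the text, so pass a different path if your script writes elsewhere.

```shell
#!/bin/bash
# Hypothetical post-build check for the compressed initramfs.
# Verifies gzip integrity and prints the compressed size in MiB so it can
# be compared with the RAM available on a node.
check_initramfs() {
    local image="${1:-/var/lib/tftpboot/custom-initramfs.cpio.gz}"

    # gzip -t tests the compressed data without writing anything.
    if ! gzip -t "$image" 2>/dev/null; then
        echo "not a valid gzip file: $image" >&2
        return 1
    fi
    local bytes
    bytes=$(wc -c < "$image")
    echo "$image: $(( bytes / 1024 / 1024 )) MiB compressed"
}
```

The compressed size is only a lower bound, since the unpacked tree is what actually occupies RAM. To list the first few entries of the archive you can also run zcat on the image and pipe it into cpio -it | head.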


That's it. Now you're ready to PXE boot your clients. If all goes well, you should see an iPXE menu; select Mageia and watch. :shock:

You will probably want to modify your /diskless image and add your tweaks to it. I simply wanted text-based nodes on my cluster, so I have systemd just going multi-user without a GUI, but you may want to add the graphical system by removing rescue from the Mageia entry in /var/lib/tftpboot/default.ipxe.

There are a lot of refinements that could be done to make this boot Mageia as a thin client, perhaps with an NFS-mounted /home for your users, and yellow pages (ypclient / ypserver) to manage user passwords. Perhaps we can find methods to run Xen or VirtualBox on the cluster nodes (creating a Citrix-like virtualization cluster environment). I need to see if dkms can be used on the /diskless image for the Nvidia and other cards. Perhaps someone can explore that a little further in the future. Anyway, have fun and enjoy your PXE-booted diskless live Mageia image.

:D

Re: [HOWTO] iPXE boot a Live diskless Mageia 4 ramdisk

Postby doktor5000 » Apr 30th, '15, 12:02

Thanks for sharing :D

You might want to look at http://colin.guthr.ie/2011/06/network-b ... -goodness/ as there are also some other hints related to PXE