Create LiveUSB in a general way

Posted on

We can produce our own LiveUSB manually. The following steps are based on the Arch Linux distro.

  1. Build a kernel that supports AUFS, SquashFS, and optionally vfat, so that we can mount these filesystems. In Arch Linux, we can build linux-pf, which provides this support. This package can be found in the AUR.
  2. Then we can generate the initramfs. In Arch Linux, we can use mkinitcpio to create an initramfs image. For instance,
    mkinitcpio -k /boot/vmlinuz-linux-pf -c mkinitcpio-custom.conf -g initramfs.img

    (In Arch Linux, it is recommended to add some modules to the mkinitcpio configuration file: zram, squashfs, loop, fuse, aufs, vfat.)

  3. Besides that, we also need to prepare a squashfs that will act as a read-only filesystem containing the working programs. To create the squashfs, we can build it from the existing filesystem.
    The important folders are: bin, etc, home, lib, mnt, opt, root, sbin, srv, usr, var.
    The folders that are needed but should be left empty are: dev, media, proc, run, sys, tmp.
  4. Then we can prepare a USB pendrive so that it can boot using syslinux. syslinux reads syslinux.cfg, which can be located at /syslinux or /boot/syslinux. The simplest bootable syslinux.cfg is something like
    DEFAULT myliveusb
    LABEL myliveusb
    KERNEL vmlinuz-linux-pf
    APPEND initrd=initramfs.img

    With this, it will use the kernel vmlinuz-linux-pf built at the beginning (in step 1), and an initrd with the initramfs.img generated in step 2. These files need to be placed in the same path as syslinux.cfg.

  5. In order to boot properly, we need to modify the “init” file in initramfs.img. This image is actually a compressed cpio archive. We can extract the data with
    zcat ../initramfs.img | cpio -i

    The extracted data contains the scripts, kernel modules, and libraries required to boot up. So, we just modify the init script.
    In the init script, we may need to disable fsck_root (in Arch Linux) so that it will not fsck the root device.
    Then, we need to add statements to mount the USB pendrive and the created squashfs.
    Therefore, we add the following statements somewhere before the initramfs switches the root.

    #The statements are based on
    mkdir -p /mnt/usb /squashfs /tmpfs
    mount -t vfat -o ro /dev/disk/by-label/MYPENDRIVE /mnt/usb
    mount -t squashfs -o ro,loop /mnt/usb/path-to/root.sfs /squashfs
    mount -t tmpfs none /tmpfs
    #cd /tmpfs ; mkdir -p tmp var/log ; chmod 1777 tmp ; cd / #This step is optional
    mount -t aufs -o br=/tmpfs:/squashfs none /new_root #where the new_root is the root which will be switched according to Arch Linux

    Finally, we need to re-compress it with

    find . | cpio -o -H newc | gzip > ../initramfs.img

    If we read through the init script, at the end there is a statement that calls “switch_root”, which switches to /new_root. Then, the system will boot successfully.
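Steps 3 and 4 above can be sketched with the following commands. This is only a sketch: the device /dev/sdb1, the mount point /mnt/usb, and the output path /tmp/root.sfs are my own example names, and MYPENDRIVE is the label the init script statements above expect.

```shell
# Step 3 (sketch): pack the running system into root.sfs. The -wildcards
# excludes keep the listed folders present but empty, as described above.
mksquashfs / /tmp/root.sfs -wildcards \
    -e 'dev/*' 'media/*' 'proc/*' 'run/*' 'sys/*' 'tmp/*'

# Step 4 (sketch): prepare the pendrive and install syslinux on it.
mkfs.vfat -n MYPENDRIVE /dev/sdb1
mount /dev/sdb1 /mnt/usb
syslinux --install /dev/sdb1
# Put the kernel, initramfs, squashfs, and config next to each other.
cp /boot/vmlinuz-linux-pf initramfs.img /tmp/root.sfs syslinux.cfg /mnt/usb/
```

Depending on the machine, an MBR boot record may also be needed (syslinux ships one, e.g. under /usr/lib/syslinux/).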

AUFS is required because SquashFS is read-only. The following explains the statements above.

mount -t vfat -o ro /dev/disk/by-label/MYPENDRIVE /mnt/usb

mounts the USB pendrive at /mnt/usb. We use /dev/disk/by-label because referring to the pendrive by its label is the easiest way. /dev/sda, /dev/sdb, and so on are not reliable because different computers may have different numbers of storage devices. After this,

mount -t squashfs -o ro,loop /mnt/usb/path-to/root.sfs /squashfs

we mount root.sfs, which was created by mksquashfs. Then,

mount -t tmpfs none /tmpfs

we mount a tmpfs at a new folder, /tmpfs, so that we can use memory for read-write operations. We do not use the default /tmp because it is already used by the initramfs.

cd /tmpfs ; mkdir -p tmp var/log ; chmod 1777 tmp

This creates a tmp folder with the same 1777 permissions as /tmp in the root. Finally,

mount -t aufs -o br=/tmpfs:/squashfs none /new_root

we union-mount the tmpfs and squashfs at /new_root.

Because tmpfs is writable and squashfs is read-only, but they are union-mounted at /new_root, after switch_root the system can boot and write to the new root filesystem. All written data are actually stored in memory (since tmpfs resides in memory).

Going further, we could also create our own filesystem as a file (formatted with ext4) and union-mount it at /new_root, so that we can “save” data and restore it on the next boot. (I have not yet tested this step by editing the init script.)
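That untested idea could look something like the following. This is a sketch only: persist.img, its 512 MB size, and the /persist mount point are my own example names.

```shell
# One-time preparation: create an ext4 image file on the pendrive.
dd if=/dev/zero of=/mnt/usb/persist.img bs=1M count=512
mkfs.ext4 -F /mnt/usb/persist.img

# In the init script: remount the pendrive read-write, loop-mount the
# image, and stack it as the writable branch instead of the tmpfs.
mount -o remount,rw /mnt/usb
mkdir -p /persist
mount -t ext4 -o loop /mnt/usb/persist.img /persist
mount -t aufs -o br=/persist:/squashfs none /new_root
```

Because the writable branch now lives on the pendrive rather than in memory, changes survive a reboot.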

AlphaOS, a really great LiveUSB

Posted on

These days, I want to compile some old projects. I first compiled all my code using Arch Linux. Then I decided to install my project to a LiveUSB, so that I need not partition the disk or use a virtual machine on the target computer.

So I planned to use KNOPPIX. I also tried Linux Mint in VirtualBox, since it is based on Debian, but failed to compile my code because its CEGUI version is not as recent as Arch Linux's.

As a result, I decided to use ArchPup (Puppy Linux). But ArchPup has been superseded by AlphaOS. And it solved my problem.

However, there are some limitations. Firstly, the latest version only supports 64-bit computers. Secondly, there are hardware driver problems. I wanted to install broadcom-wl, but I could not get the linux-headers, and the linux-headers of AlphaOS are different from Arch's (because of a different build). AlphaOS lacks documentation on how to reproduce or customise it. As a result, I can only use the default kernel and run my project without a network connection.

Since the official site mentioned Linux Live Kit, I tried to create my “dream LiveUSB” from Arch Linux in VirtualBox with it. However, I failed. I compiled the kernel with linux-pf (because it supports AUFS), then built the image using Linux Live Kit. But the OS froze in VirtualBox when I used the linux-pf kernel. This may be caused by virtualbox-guest-modules.

Now, I am still trying to build my dream LiveUSB.

Traditional Chinese (BIG5) in the LANG=zh_CN.UTF-8 locale

Posted on

Recently, I tried to play Tecmo Koei Sangokushi 12 PK Traditional Chinese version (三國志12威力加強版繁體中文) on WINE using PlayOnLinux. It works fine, but there is a problem: the game can only be run in the LANG=zh_CN.UTF-8 locale instead of the zh_TW.UTF-8 locale (there are some reasons behind this). As a result, if I enter some Chinese characters using fcitx, the output is converted; for example, 一 becomes 珨. This is because, when using fcitx in the zh_CN.UTF-8 locale, the characters are encoded for zh_CN.UTF-8 (effectively GB18030 bytes). However, because the game itself is Traditional Chinese, the output is decoded as, most likely, BIG5. This can be demonstrated with iconv using the following command,

echo "一" |iconv -f utf8 -t gb18030|iconv -f big5 -t utf8 #results 珨

Therefore, I need to convert this faulty character back to the character I intended. Thus, iconv can be used to reverse the result with the following command,

echo "珨" |iconv -f utf8 -t big5|iconv -f gb18030 -t utf8 #results 一

Because the game does not allow copy-paste, I can only solve this problem programmatically, by creating an fcitx module that uses iconv. The module is available here.
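The two directions above can be wrapped into small shell helpers (a sketch; the function names are mine, and the pipelines are exactly the iconv commands shown above):

```shell
# Simulate the mojibake: UTF-8 text typed through a zh_CN IME,
# then displayed by a BIG5 (Traditional Chinese) program.
to_mojibake()   { printf '%s' "$1" | iconv -f utf8 -t gb18030 | iconv -f big5 -t utf8; }

# Reverse it: recover the intended character from the garbled output.
from_mojibake() { printf '%s' "$1" | iconv -f utf8 -t big5 | iconv -f gb18030 -t utf8; }

to_mojibake "一"     # prints 珨
from_mojibake "珨"   # prints 一
```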

The fcitx module I created works only partially satisfactorily. It still has a problem that I cannot solve, and I am not sure of the root cause: iconv, fcitx, WINE, the game itself, or something else. Some Chinese characters, such as 自 and 何, cannot be entered and result in question marks (?).

Note: to use the module, which converts GB18030 to BIG5 (still in UTF-8), we need to enable the “Simplified Chinese To Traditional Chinese” module in fcitx, because BIG5 is Traditional Chinese. Only inputting Traditional Chinese characters will then work, such as entering 會 instead of 会, because BIG5 has no character for 会, only for 會.

Sangokushi 12 PK Traditional Chinese in WINE problem with zh_CN.UTF-8 locale


Because of the question mark problem mentioned above, I can only give up playing the game with WINE. There is no choice but to play the game with Windows in VirtualBox, which works fine in the Chinese (Taiwan) locale.

Preserving text information in LibreOffice/OpenOffice Impress when producing PDF

Posted on

PDF is usually my favourite format when distributing documents to end users. In Windows, we can install PDFCreator, which provides a virtual printer that can be used to “print” documents to PDF.

In Linux, there is no PDFCreator. The most common alternative is CUPS-PDF, which depends on CUPS (Common Unix Printing System). If we want to do any printing in Linux, CUPS is the package we need.

However, in recent CUPS-PDF (my current version is 2.6.1), if we print the file as PDF, all the text information is lost. The text is no longer vector data but raster data: if we zoom in, it is pixelated. Moreover, we cannot highlight and copy the text. This makes CUPS-PDF less useful, and it is my last resort for producing PDF files.

When using LibreOffice Writer and Calc, instead of using CUPS-PDF, I can “export directly as PDF”, which is a built-in export function. With this, the text information is all preserved.
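The same built-in export is also available from the command line, which is handy for batch jobs (a sketch; report.odt and the output directory are example names):

```shell
# Convert a document to PDF headlessly, using LibreOffice's own
# PDF export (the same engine as "export directly as PDF").
libreoffice --headless --convert-to pdf --outdir ./pdf report.odt
```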

However, when using a web browser like Firefox or Chromium, there is no “export directly as PDF” function. Fortunately, when we want to print, among the printers there is a “Print to File” printer. This is the solution in a web browser. With “Print to File”, all the text information is preserved in the PDF, so we can select and copy the text if needed.

LibreOffice Impress is a little different. We can also “export directly as PDF”, yet the output is a PDF with each slide as a page. This is not what we intend if we want to print handouts, because each page is a full slide and the background is colourful.

LibreOffice Impress actually has “Print to File” like Firefox or Chromium, but the “Print to File” printer is not visible in the printer list. To get it, go to File > Print. There are several tabs; in the Options tab, there is a “Print to file” option. Check it, and the “OK” button becomes a “Print to File…” button. As a result, we can customise our handouts, such as changing them to “Black & White”, 6 pages per sheet, and so on. After customising the settings, press the “Print to File…” button, and we produce a PDF of handouts instead of slides, with all the text information preserved.

Arch Linux, Sabayon, Gentoo

Arch Linux

I am an Arch Linux user, and I have used Arch Linux since March 2011. So far, Arch Linux works fine for almost everything. However, there are some issues I sometimes face.

  1. Upgrading some libraries, especially glibc, may break Java-related software, because the Java packages are not yet rebuilt. Besides that, a library like “icu” also sometimes breaks LibreOffice.
  2. Sometimes, the latest software with a new file format may not be supported on other computers. Similarly, some of the latest features do not work elsewhere. For instance, in later versions of PHP we can write a statement such as “$item = myFunc()[0];”, where myFunc() is a function returning an array whose first element we want to access immediately. After uploading such a PHP script to a web host running an older PHP version, this statement does not work. That is, the newest is not always the best.
  3. Bluetooth problems. This is quite a long-term problem. Pairing with a Bluetooth device is not smooth: I cannot mount ObexFTP on my Android phone, yet there was no problem at all when I first started using Arch Linux. Furthermore, I have never successfully received a file sent from the phone over Bluetooth. Thirdly, sending a file to the phone over Bluetooth does not work with “blueman”, only with “gnome-bluetooth”.
  4. Missed upgrades. I have a netbook installed with Arch Linux but seldom used. When I want to use it, it requires a “great” upgrade, sometimes with manual configuration. Sometimes I need to download more than 1 GB of packages for the upgrade, so I copy the cached packages from my frequently used computer to this netbook. However, since it is a great leap, some dependencies go missing. This can be diagnosed by checking the dependencies with “testdb”, yet the process is not easy: I once experienced an upgrade that left the OS unbootable, due to the switch to systemd.

However, there are some advantages that I like about Arch Linux, which make me reluctant to look for alternative.

  1. AUR and PKGBUILD. These allow me to make my own packages easily and share them on the AUR, then use a pacman helper such as yaourt to install the packages while resolving their AUR dependencies. In short, AUR and PKGBUILD let me build custom packages far more easily than the Debian packaging tools.
  2. Again, the package manager rules. To use a distro, we must learn its package manager; learning other software is not learning Linux, but learning that software itself. The beauty of pacman is its simplicity. The package categories are also as simple as “core”, “extra”, and “community”. Searching, installing, uninstalling, cleaning, listing installed packages, listing owned files, and so on are all easily done through the pacman command.
  3. Rolling release. Because of this, I need not perform a big version upgrade only to be greeted by a surprising yet useless feature.
  4. Latest software. This is a double-edged sword: the latest, with new and useful features, which may not be supported everywhere.
  5. The Wiki, with comprehensive instructions for configuring the system.
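The day-to-day operations mentioned in point 2 map to short pacman commands (the package name firefox here is just an example):

```shell
pacman -Ss firefox        # search the repositories
pacman -S firefox         # install a package
pacman -Rs firefox        # uninstall, with its unneeded dependencies
pacman -Sc                # clean the package cache
pacman -Q                 # list installed packages
pacman -Ql firefox        # list files owned by a package
pacman -Qo /usr/bin/ls    # find which package owns a file
```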

So, why don’t I try other distros which may have these features and yet be more stable (in the sense of not running the latest software)? I first tried Sabayon.

Sabayon

I tried Sabayon, since it is based on Gentoo, another rolling-release distro. Yet Sabayon is different from Gentoo: its package manager, Entropy, downloads pre-compiled packages, while Gentoo requires compiling the packages (except binary packages like firefox-bin and libreoffice-bin).

I tried Sabayon in VirtualBox. It installed, yet I failed to upgrade, because the virtual hard disk was not big enough, and my actual hard disk's free space was also insufficient. Thus, I gave up.

The installation is easy, as it uses a GUI installer. However, I always wonder: after installation, I still need to do a lot of customisation. There is not much difference between a minimal installation followed by customisation and a GUI installation followed by a lot of customisation.

Gentoo

So, because of insufficient disk space, I installed Gentoo in VirtualBox. I personally found the official handbook not very clear, especially regarding emerge-webrsync and “emerge --sync”. Because my network connection was interrupted, I could not update Gentoo, and this caused a lot of trouble.

Besides that, since I am an Arch Linux user, I did not know that Gentoo uses OpenRC instead of systemd. Thus, I added systemd to the make.conf file and tried to install systemd as written in the official handbook. As a result, emerge produced dependency hell, and because I lacked knowledge of “emerge” usage, I did not know how to resolve these problems.

As a result, I tried again: I re-installed Gentoo and learned more about emerge. Now I have successfully installed the Xfce4 desktop environment. There are more things to go, especially “overlays”, because some software I must use is not provided in the official repositories.
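The basic emerge workflow I had to learn can be sketched as follows (package names are examples):

```shell
# Fetch a portage snapshot over HTTP (useful on unreliable connections),
# or sync the portage tree directly.
emerge-webrsync
emerge --sync

# Install a package, showing what would be built before confirming.
emerge --ask xfce-base/xfce4-meta

# Remove packages that are no longer needed by anything installed.
emerge --ask --depclean
```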

Now, one disadvantage I have found with Gentoo is the compilation time of the packages: compilation takes quite a lot of time and high CPU usage. I wonder how much time I would need if my laptop were less powerful. Is it worth compiling these packages? Do the cached packages require more disk space than pre-compiled ones, or about the same? Or would Sabayon suit me better, since Entropy needs no compilation?


Joining video parts together

Have you downloaded videos online, from sites such as Youku, Tudou, or even YouTube? Have you downloaded videos which the uploader split into several parts?

Whatever your answer is, you may face the same problem as me.

I downloaded some videos to watch later, but they are split into several parts. I wish to watch each as a whole (because it should be one big file). So, I created this script to solve the problem. It requires MP4Box (in the gpac package) and FFmpeg.

To use the script,

./ 'video_part*.mp4' "video.mp4"

where the first argument is treated like “find . -iname 'video_part*.mp4'”, so that if the files are video_part1.mp4, video_part2.mp4, video_part3.mp4, …, they will be joined together; the second argument is the output file. The script uses MP4Box to join the files, which is fast.

However, sometimes the downloaded videos are in FLV format. This is handled by FFmpeg, which converts them to MP4 as well, though the conversion is slow.

./ -flv 'video_part*.flv' "video.mp4"

Just add “-flv” and it will use FFmpeg to convert and join the FLV videos into an MP4. Actually, it converts not only from FLV, but from any format supported by FFmpeg.
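Since the script itself is not inlined here, the underlying commands can be sketched as follows. This is a sketch, not the script's actual code: it assumes MP4Box's -cat option (which appends in place) for the fast path, and FFmpeg's concat demuxer with re-encoding for the FLV path.

```shell
# Fast path: join MP4 parts with MP4Box, no re-encoding.
out="video.mp4"
first=1
for f in video_part*.mp4; do
    if [ "$first" -eq 1 ]; then
        cp "$f" "$out"; first=0       # start from the first part
    else
        MP4Box -cat "$f" "$out"       # append the next part in place
    fi
done

# Slow path (-flv): list the parts, then convert and join with FFmpeg.
for f in video_part*.flv; do printf "file '%s'\n" "$f"; done > list.txt
ffmpeg -f concat -safe 0 -i list.txt -c:v libx264 -c:a aac video.mp4
```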

Therefore, if you have downloaded a series of videos, each split into parts, you may run

for i in {01..20} ; do ./ "video${i}_part*.mp4" "video${i}.mp4" ; done

The command above will join video01_part*.mp4 into video01.mp4, video02_part*.mp4 into video02.mp4, and so on, until video20.mp4.

Gambler’s fallacy

Posted on

Referring to my previous post about the gambler's fallacy, I was totally wrong, as I realised after pondering this more.

In the example of tossing a coin, we know that the probability of getting a “tail” is 0.5 and of getting a “head” is 0.5. That means each result should appear equally often. In an experiment, if we toss the coin 1000 times, we will get “tail” around 500 times and “head” around 500 times.

And in my previous post, I mentioned that if I tossed the coin 10 times and all the results were “tail”, then, as the gambler's fallacy goes, I would feel that the next toss or the next 10 tosses should probably be “head”, so that the probabilities come out 0.5 and 0.5.

However, the problem is that the “time I started tossing” restricted my thinking, which gave me the feeling described above.

In experimental probability, the more we toss the coin and collect results, the more accurate our estimate. For example, estimating the probability from 1000 tosses is better than estimating it from 100 tosses. Thus, it is not valid to toss the coin ONCE and conclude that “tossing the coin will ALWAYS be head (or tail)”.

Therefore, in the situation where I tossed the coin 10 times and all the results were “tail”, that cannot be considered reliable data. “Someone” may have tossed the same coin 10,000,000 times before me and obtained probabilities of 0.5 and 0.5; my 10 tosses of “tail” do not mean anything.

Besides that, experiments are done to estimate the probability, not the reverse: presuming a probability and testing it by experiment, as in the situation above. If I am the first person to toss a specific coin 100 times and all the results are “tail”, then I can say that for that specific coin the probability of getting a “head” is less than 0.5 and of a “tail” more than 0.5. I cannot simply assume that the next 100 tosses have a high probability of being “head”. There are several reasons: i) the coin may be poorly made and may ALWAYS produce “tail”; and ii) each toss of the coin is independent, that is, tossing the coin now does not affect tossing the coin next time.
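The independence claim in (ii) can be checked empirically with a quick simulation (a sketch using awk's rand(); the seed 42 is arbitrary): after a run of 10 tails, the chance of a head is still about 0.5.

```shell
# Simulate a fair coin and compare the overall frequency of heads with
# the frequency of heads immediately after a run of 10 or more tails.
awk 'BEGIN {
    srand(42); n = 1000000
    run = 0; heads = 0; cond_n = 0; cond_h = 0
    for (i = 0; i < n; i++) {
        h = (rand() < 0.5) ? 1 : 0
        if (run >= 10) { cond_n++; cond_h += h }   # toss follows 10+ tails
        heads += h
        run = h ? 0 : run + 1                      # current tail-run length
    }
    printf "P(head)                 = %.3f\n", heads / n
    printf "P(head | 10 tails seen) = %.3f\n", cond_h / cond_n
}'
```

Both frequencies come out near 0.5: a long run of tails tells us nothing about the next toss.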

So, my commenter’s statement is very convincing.

