Joining video parts together

Have you ever downloaded videos from sites such as Youku, Tudou, or even YouTube? Have you downloaded videos that the uploader split into several parts?

Whatever your answer, you may face the same problem I did.

I downloaded some videos to watch later, but they were split into several parts. I wanted to watch each one as a whole (because it should be one big file), so I created this script to solve the problem. The script requires MP4Box (in the gpac package) and FFmpeg.

To use the script:

./video_join.sh 'video_part*.mp4' "video.mp4"

where the first argument is a pattern, used the same way as in find . -iname 'video_part*.mp4', so that if the files are video_part1.mp4, video_part2.mp4, video_part3.mp4, …, they will be joined together; the second argument is the output file. The script uses MP4Box to join the files, which is fast.
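For the curious, the fast path can be sketched roughly as follows. This is not the actual video_join.sh, only a minimal illustration: join_mp4_parts is a made-up name, and I am assuming bash, a sort that supports -V (so that part2 comes before part10), and file names without newlines. As far as I know, MP4Box's -cat option appends each input in turn and -new forces a fresh output file.

```shell
#!/usr/bin/env bash
# Sketch only: join_mp4_parts is a hypothetical helper, not the original script.

# join_mp4_parts PATTERN OUTPUT
# Finds files matching PATTERN (case-insensitive), sorts them in natural
# order, and asks MP4Box to concatenate them into OUTPUT.
join_mp4_parts() {
    local pattern=$1 output=$2
    local -a args=()
    local f

    # sort -V gives natural ordering (part2 before part10).
    # Assumption: file names contain no newline characters.
    while IFS= read -r f; do
        args+=(-cat "$f")
    done < <(find . -iname "$pattern" | sort -V)

    if [ "${#args[@]}" -eq 0 ]; then
        echo "no files match $pattern" >&2
        return 1
    fi

    # One -cat per part, then -new to write a fresh output file.
    MP4Box "${args[@]}" -new "$output"
}
```

Calling join_mp4_parts 'video_part*.mp4' video.mp4 from the directory that holds the parts would then mirror the command above.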

However, sometimes the videos we download are in FLV format. FFmpeg can handle this, but it has to convert them to MP4 while joining, and the conversion is slow.

./video_join.sh -flv 'video_part*.flv' "video.mp4"

Just add -flv, and the script will use FFmpeg to convert and join the FLV videos into one MP4. Actually, it converts not only FLV, but any format supported by FFmpeg.
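One way to do this kind of convert-and-join is FFmpeg's concat demuxer, sketched below. The script may well do it differently; write_concat_list and join_with_ffmpeg are hypothetical names, and I am assuming bash, a sort that supports -V, an FFmpeg build that includes libx264, parts that share the same codecs and resolution, and file names without single quotes or newlines.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical, not from the original script.

# write_concat_list PATTERN LISTFILE
# Writes one "file '/absolute/path'" line per matching part, in natural order.
write_concat_list() {
    local pattern=$1 list=$2 f
    : > "$list"
    while IFS= read -r f; do
        # The concat demuxer reads lines of the form: file 'path'
        printf "file '%s'\n" "$PWD/${f#./}" >> "$list"
    done < <(find . -iname "$pattern" | sort -V)
}

# join_with_ffmpeg PATTERN OUTPUT
join_with_ffmpeg() {
    local pattern=$1 output=$2 list rc
    list=$(mktemp) || return 1
    write_concat_list "$pattern" "$list"
    if [ ! -s "$list" ]; then
        echo "no files match $pattern" >&2
        rm -f "$list"
        return 1
    fi
    # -f concat reads the list; -safe 0 allows the absolute paths in it.
    # Re-encoding to H.264/AAC is the slow part.
    ffmpeg -f concat -safe 0 -i "$list" -c:v libx264 -c:a aac "$output"
    rc=$?
    rm -f "$list"
    return "$rc"
}
```

With these defined, join_with_ffmpeg 'video_part*.flv' video.mp4 would play the role of the -flv mode. Re-encoding instead of stream-copying is what makes it slower than the MP4Box path.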

Therefore, if you downloaded a series of videos and each episode is split into parts, you can run:

for i in {01..20} ; do ./video_join.sh "video${i}_part*.mp4" "video${i}.mp4" ; done

The command above will join video01_part*.mp4 into video01.mp4, video02_part*.mp4 into video02.mp4, …, and so on until video20.mp4.

Gambler’s fallacy

Referring to my previous post about the gambler’s fallacy: after pondering it further, I realized I was totally wrong.

Take tossing a coin as an example. We know that the probability of getting “tail” is 0.5 and the probability of getting “head” is 0.5. That means the two outcomes should appear equally often in the long run. In an experiment, if we toss the coin 1000 times, we expect “tail” to appear around 500 times and “head” around 500 times.

In my previous post, I mentioned that if I tossed the coin 10 times and every result was “tail”, then, following the gambler’s fallacy, I would feel that the next toss, or the next 10 tosses, should probably be “head”, so that the proportions come back to 0.5 and 0.5.

However, the problem is that the “time I started tossing” restricted my thinking, which is why I had the feeling described above.

In experimental probability, the more times we toss the coin and collect the results, the more accurate our estimate becomes. For example, estimating the probability from 1000 tosses is better than estimating it from 100 tosses. Thus, it is not valid to toss the coin ONCE and conclude that “tossing the coin will ALWAYS give head (or tail)”.

Therefore, in the situation where I tossed the coin 10 times and every result was “tail”, those 10 results cannot be considered reliable data. This is because “someone” may have tossed the same coin 10,000,000 times before me and obtained proportions of 0.5 and 0.5. Thus my 10 tosses that all came up “tail” do not mean anything.
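A quick simulation can illustrate this. The sketch below is only an illustration under my own assumptions: tail_fraction is a made-up helper, the coin is assumed fair, awk's rand() stands in for real tosses, and the seed 1 is arbitrary. It prints the observed fraction of “tail” for increasing numbers of tosses.

```shell
#!/usr/bin/env bash
# Sketch only: tail_fraction is a hypothetical helper for illustration.

# tail_fraction N SEED
# Simulates N tosses of a fair coin and prints the observed fraction of tails.
tail_fraction() {
    awk -v n="$1" -v seed="$2" 'BEGIN {
        srand(seed)                     # fixed seed, so runs are repeatable
        tails = 0
        for (i = 0; i < n; i++)
            if (rand() < 0.5) tails++   # count this toss as a tail
        printf "%.4f\n", tails / n
    }'
}

# Compare small and large experiments.
for n in 10 100 1000 100000; do
    printf '%7d tosses: fraction of tails = %s\n' "$n" "$(tail_fraction "$n" 1)"
done
```

The exact numbers depend on the awk implementation and the seed, but the fraction for the larger runs should usually sit much closer to 0.5 than the fraction for 10 tosses.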

Besides that, experiments are done to estimate the probability; we should not do the reverse, presuming a probability and then expecting the experiment to conform to it, as in the situation above. If I am the first person to toss a specific coin 100 times and every result is “tail”, then I can say that, for that specific coin, the probability of getting “head” is less than 0.5 and the probability of getting “tail” is more than 0.5. I cannot simply assume that the next 100 tosses have a high probability of giving “head”. There are two reasons: i) the coin may be poorly designed, so that it ALWAYS produces “tail”, and ii) the tosses are independent events, that is, tossing the coin now does not affect the next toss.
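Reason ii) can be written out as a short calculation. As an assumption for this illustration, take a fair coin with independent tosses, and let X_k denote the result of toss k (my own notation). The probability of “head” on the 11th toss, given ten “tail” results in a row, is then:

```latex
\[
P(X_{11} = \text{head} \mid X_1 = \dots = X_{10} = \text{tail})
  = \frac{P(X_1 = \dots = X_{10} = \text{tail},\ X_{11} = \text{head})}
         {P(X_1 = \dots = X_{10} = \text{tail})}
  = \frac{(1/2)^{11}}{(1/2)^{10}}
  = \frac{1}{2}
\]
```

So the ten earlier results do not make “head” any more likely on the next toss.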

So, my commenter’s statement is very convincing.