C++ future


Recently updating my hobby project Med, memory editor for Linux, still under heavy development with various bugs.

In this project, I use several C++1x features (compiled with C++14 standard). Most recent notable feature is multi-threading scanning. In memory scanning, scan through the accessible memory blocks sequentially is slow. Therefore, I need to scan the memory blocks in parallel. To implement this, I have to create multiple threads to scan through the memory blocks.

How many threads I need? I make it into variable n, default to 4. Meaning, when scanning is started, the n threads will start scanning asynchronously. When one of the thread finish scanning, the next (n+1) thread will start scanning the next (n+1) memory block, until the end.

I design the solution top-down, and implement it bottom-up. In order to design the solution for the requirement above, I created a ThreadManager (header file here). So the ThreadManager basically will let me queue the tasks that I am going to launch in parallel with n threads. After queuing all the tasks, I just need to start, then they will run in parallel as multi-threading. This is what ThreadManager is doing. If mutex is needed, it is the task need to handle with, not the ThreadManager to handle. ThreadManager just make sure the tasks are run in parallel.

This is the simple test that uses the ThreadManager.

Technically, there are several important C++ standard libraries used, vector, functional, future, mutex, and condition_variable. Vector is an STL that allows me to store a list of items as vector (just like array or list).

Since C++11, it supports lambda expression. Then using functional, I can use std::function template to create any function object.

std::function<void()> fn = []() {
  for (int i = 0; i < 4; i++) {
    this_thread::sleep_for(chrono::milliseconds(300));
    cout << "thread1: " << i << endl;
  }
};

The code above initialize a variable fn which stores an anonymous function. Previously, this can be done using callback function, which makes the code difficult to manage. By using std::function and std::vector, I can store all the anonymous functions to the vector.

Future is a very interesting library. If we are familiar with JavaScript promise or C# async, then it is similar to these (futures and promises). Whenever a task is start, it will return a future. Because we don’t know when the task will be ended. We can also do something like using a loop to check the condition of a task whether is ended, but this will be over complicated. Because future will let you handle what should be done when the task is ended.

Using future, I need not to create thread directly (though it is called ThreadManager). I can use async function to run the callback function asynchronously. It is the async that returns future. And this async function allows lambda expression as function argument. Great C++11.

C++11 supports mutex (mutual exclusion) and condition variable. Mutex can prevent race condition. When we are using multi-threading, most of the time the threads are using some shared resource. Read the empty data may crash the program. Therefore, we need to make sure when reading or writing, the resource is not accessible by other threads. This can be done by locking the mutex, so that other threads cannot continue. Then after the operation, unlock the mutex, and the other threads can lock the mutex and continue. Hence, only a single thread can access the resource.

Condition variable is used together with mutex. We can use condition variable to wait if a condition is fulfilled. When a wait is performed, the mutex will be locked (by unique lock). As a result, the thread will be blocked and cannot continue. The thread will wait until the condition variable notifies the thread to perform a condition check. If it is fulfilled, then the mutex will be unlocked and the thread will continue.

In ThreadManager, my previous code uses a loop to check the condition, if the condition doesn’t allows to run the next thread, then it will sleep a while and check again. This method is wasting the CPU resources. Because it keeps checking the condition. By using condition variable and mutex, I can just stop the thread, until it is notified to continue.

Yeah. Modern C++ is cool!

Advertisements

PHP programming


PHP was a great programming language in web development. It surpasses the VBscript for ASP and Perl for CGI. It is favoured because of the syntax based C and C++. It supports procedural programming paradigm and object-oriented paradigm. A lot of functions resemble C functions such as printf, fprintf, sprintf, fopen, etc. Similarly, it can work directly to the C library such as expat, zlib, libxml2, etc. A lot of great content management systems (CMS) are written in PHP, such as Drupal, WordPress, Joomla, etc.

However, a lot of new programming language emerges and surpassing it.

Taken from http://skillprogramming.com/top-rated/php-best-practices-1234

Array is passed by value

Because PHP syntax is very similar to C and C++, it can use “&” reference operator and pass the parameter by reference in a function. But this will be very different from other languages such Python and JavaScript. Python and JavaScript function parameters are passed by value for all primitive data types, such as integer, float, string, and boolean; complex data type like object and array are passed by reference, meaning they are mutable, including Date object in JavaScript.

function change_array($arr) {
    $arr[0] = 100;
}

function change_object($obj) {
    $obj->value = 100;
}

function change_many_objects($arr) {
    $arr[0]->value = 100;
}

function change_object_array($obj) {
    $obj->array[0] = 100;
}

class MyObj {
    var $value;
    var $array;
}

function main() {
    $arr = [1, 2, 3];
    $obj = new MyObj();
    $obj->value = 10;

    change_array($arr);
    change_object($obj);

    echo $arr[0], "\n"; // still 1, not changing
    echo $obj->value, "\n"; // changed to 100

    $arr_obj = [ new MyObj(), new MyObj(), new MyObj() ];
    $arr_obj[0]->value = 10;
    change_many_objects($arr_obj);
    echo $arr_obj[0]->value, "\n"; // changed to 100

    $obj_arr = new MyObj();
    $obj_arr->array = [1, 2, 3];
    change_object_array($obj_arr);
    echo $obj_arr->array[0], "\n"; // changed to 100

    $obj_a = new MyObj();
    $obj_a->value = 10;
    $obj_b = $obj_a;
    $obj_b->value = 20;
    echo $obj_a->value, "\n"; // 20
    echo $obj_b->value, "\n"; // 20

    $obj_c = &$obj_a;
    $obj_c->value = 30;
    echo $obj_a->value, "\n"; // 30
    echo $obj_b->value, "\n"; // 30
    echo $obj_c->value, "\n"; // 30
}

main();

In the example above, the function change_array() will not modify the array that being passed, this is because it is passed by value. Unless we use the “&” reference operator.

The function change_object() will change the object that being passed.

One of the key-points of PHP 5 OOP that is often mentioned is that “objects are passed by references by default”. This is not completely true. […]

(from PHP manual)

So, basically, the function parameters are passed by value, even though it is an array. But the object will be dealt differently. We can treat it as a pointer, if you are familiar with C++ “new” operator. In C++, “new” operator will create an instance and return a pointer to the instance. If we understand this concept, then this is how it works in PHP (according to what I know).

Consequently, the function change_many_objects() though the argument is for an array, and an array is passed into it, but the function changes the value of the object within the array. This is because the array stores the pointer to the object instances. The function does change the instance is pointed by the “pointers” stored in the array.

In summary, PHP deals array as value, which is different from Python, JavaScript, and even C and C++. However, PHP deals object as pointer, that is why object is mutable when it is passed to a function.

Other limitations

PHP was created before the technology of RESTful API. PHP focuses on the GET and POST, but not like PUT and DELETE. The HTTP methods are server dependent. As a result, the HTTP server such as Apache requires some extra configurations for PHP to work. Unlike Node and Ruby on Rails, Node itself has a HTTP module; Ruby on Rails has WEBrick HTTP server.

Comparing to the language like Python, Node with JavaScript, Ruby, Lua, it lacks of REPL (read-eval-print loop). Interactive shell is different from REPL. With REPL, the function to print the result in the console is omitted. REPL will print the result whenever the function returns value.

Closure


In JavaScript, we can create closure like this,

var foo = (() => {
  let x = 0;
  return () => {
    return x++;
  }
})();

for (var i = 0; i < 10; i++) {
  var x = foo();
  console.log(x);
}

But translating above code into C++, it cannot work as expected for the variable, unless the variable is outside the lambda function.

// In a function
int x; // variable here
auto foo = [&x]() {
  x = 0;
  return [&x]() {
    return x++;
  };
}();
for (int i = 0; i < 10; i++) {
  int x = foo();
  cout << x << endl;
}

This is because C++ variable can be only accessed in the function scope. After the function return the value, the variable is not accessible anymore.

Similarly, Python can also return the function, but it cannot work as closure like JavaScript. However, Python can create non-primitive variable such as object or list so that the variable will be accessed by reference.

def _foo():
    x = [0]
    def increment():
        x[0] += 1
        return x[0]
    return increment

foo = _foo()
for i in range(10):
    x = foo()
    print(x)

CSV to something


Just revived my old script into a project in GitHub, csv_to_something. An old simple script that was created to manage some student data. Because the students data were collected through Google Forms, so I convert it to CSV, then from CSV to SQLite. So that I can use the SQL to query whatever data I need.

Using SQLite software such as sqliteman and sqlitebrowser, I can create any new table, grouping, sorting etc.

Then recently, I need some JSON data for the table. So, I revive the script to convert CSV to JSON format. With the functional programming characteristic in JavaScript, manipulating the data becomes more interesting. Using Node or any JavaScript interpreter, we can perform map and reduce. Other useful functions like find and filter can get what I want. Example,

// Read the JSON data as object
let students;
fs.readFile('./table.json', (err, content) => {
  if (err) throw err;
  students = JSON.parse(content);
});
let average = students.reduce((reduced, item) => item.score, 0) / data.length;
let grades = students.map(student => {
  if (student.score >= 90) return 'A';
  else if (student.score < 90 && student.score >= 70) return 'B';
  else return 'Something else';
});

Why SQLite and JSON? Because they are extremely lightweight.

A brief comparison of GTK+ and Qt


I used to like C language, because it is a basic of programming, and it is portable, and it is low-level. When writing program with C language, it is just like showing off your advanced programming skill, how you manage the memory, how you manage the pointers and creating the linked list. However, in terms of efficiency, C++ is much more powerful, because of object-oriented and the syntax.

Because I like C language, so I chose GTK+ over Qt for long time ago. Not only that, I am also fond of GTK+ desktop environments like GNOME, Xfce4, LXDE, Cinnamon, but not Mate. I feel that KDE is heavy weight.

One of my personal projects, Med, which was written with GTK+, had some issues with multi-threading. I believe that my engine part does not have any problem. As a result, the suspicious part is the GTK+. The program crash without symptoms, and difficult to reproduce. Therefore, I decided to re-write the UI with Qt. If, Qt also produces the same problem, meaning my algorithm is problematic.

In my opinion, GTK+ is straight forward, because of procedural paradigm. Therefore, it is easy to learn and implement. The UI can be design with Glade. But I feel that it is still lack of something, such as button click callback function. Besides that, though there is gtkmm (C++ interface of GTK+), the library like WebKitGTK+ does not have the documentation for gtkmm (I believe still can work). In summary, GTK+ is slower as writing the code in C language.

On the other hand, Qt is more complicated. To use it, better to use qmake project file to generate the Makefile, or use the CMake, so that we can include the necessary headers and link with necessary libraries. And we also need to use some macro. This makes me feel that the coding is very Qt-oriented, not just a C++ language. However, in terms of design, I feel that Qt Designer is much easier. But GTK+ and Qt layout uses different concept.

Though GTK+ and Qt are both object oriented, GTK+ is in the library level yet Qt is in the language level. It is easier to write inherited classes in Qt, and easily make changes at the language level.

As a conclusion, Qt is better than GTK+ for development as C++ is better than C.

Heading, anchor, and bookmarking


Sometimes I read online articles, and these articles are usually long pages and have outlines at the beginning. These outlines are the hyperlinks to the subtopics headings. Technically, you click the outline hyperlink, your browser will browse to the “anchor”, the URL will append with hash (#). Therefore, it is useful for bookmarking, so that you can share the URL target on the topic to someone else, or re-visit your bookmark.

Thus, I wrote a script as bookmarklet to solve this issue, so that I can click the headings when I am reading the article. You can create the bookmark with the following URL:

javascript:var elems=document.querySelectorAll("h1[id],h2[id],h3[id],h4[id],h5[id],h6[id],h1>*[id],h2>*[id],h3>*[id],h4>*[id],h5>*[id],h6>*[id],a[name]:not([href])");var array=Array.from(elems);array.forEach(function(elem){elem.style.cursor="pointer";elem.setAttribute("title","anchor:"+(elem.id?elem.id:elem.name));elem.innerHTML+=" *";elem.addEventListener("click",function(el){location.hash=el.target.id?el.target.id:el.target.name})});

(Yes, it starts with javascript: instead of http://. Sadly WordPress.com doesn’t allow to create bookmarklet in the post. Moreover, I was actually using ⚓ instead of asterisk *, but WordPress.com auto convert it to emoji, consequently cannot use it in the script above.)

You can try any Wikipedia page to test it. It also works on Stackoverflow, so that you can share the URL to a certain answer to others.

Lecturer, researcher, hobbyist, and software developer


I am cognitive science student. That is why I learnt AI, computational linguistic, machine learning, expert system, etc. 

Since I was a researcher on Augmented Reality, then I applied my computing skills. After this, I became a non-computer related lecturer, and spent my time doing programming as a hobbyist. Then later I became computer science lecturer, yet still had to do programming as a hobbyist. Now as a software developer, I know what are the differences of these roles: lecturer, researcher, hobbyist, and a true software developer. 

A lecturer, as long as they know how to bluff, they are considered good in programming skill.

A researcher, whether they are good in programming skill is totally not important. Because the focus is the research. As a researcher, you do not need to write programs  or scripts. Though these may help, they are not required. As long as your research methodology is correct, that is the core element of the research.

As a hobbyist, you can learn any programming language and write any program. You can write the code that you yourself think it is elegant. You can write any powerful function and use a simple but accurate solution to solve your own problem. You can share your code and publish your project. You can have very strong knowledge to write wondrous algorithm and solve the problem effectively and efficiently. BUT, you are still lacked of industrial experience. 

An actual software developer who works in industrial, what you need is to produce stable products. Yes. Stable. In order to produce stable products, you need to have your products fully tested. Hire a tester can only perform black box testing. They test only as an end user perspective. But the problem is, the bugs are found after you have written the dependent codes on the bugs. So, it would be best to test before these, during the development level. As a developer, writing test cases becomes exhaustive. That is why, Test Driven Development comes in. A stable product is the actual thing that the customers want. They paid the money to get the product, or specifically, the customers hire you to develop a product to them. That is why your products must be stable. Unlike the commercial products, the government projects usually have no clear objective. As long as you deliver a prototype, you can get the money. That is why, the products standards are very different, the developers quality is also very different.

As a lecturer, you talk.

As a researcher, you do experiments.

As a hobbyist, you learn and write.

As an actual software developer, you write and test.