How to solve C/C++ memory leaks (on Linux)?


My hobby project Med is written in C++. Many of its features rely on dynamic memory allocation and instantiation, because complex data is impractical to pass by value (unlike JavaScript objects and arrays). Since C++ has no garbage collection, it is easy for the developer to forget to free/delete dynamically allocated memory properly.

In my project Med, the program retrieves memory from another process. That means it needs to make a copy of the scanned memory, which involves allocating dynamic memory (using the new operator). When the program filters the memory to narrow down the target result, it takes a fresh copy of the memory with the updated values and compares it with the previous copy. The unmatched entries need to be discarded (freed/deleted); the matched entries need to replace the old ones (the old copies also need to be freed/deleted, because the new copies are dynamically allocated).

Though the algorithm is easy to express verbally, writing it out usually induces human mistakes, such as using the wrong variable name. As a result, we may access memory that was never allocated, or free the same memory twice. These mistakes cause segmentation faults, and the C++ compiler will not tell you where the error comes from. Debugging is hellish in this case. Consequently, I used several solutions to fix the memory errors and leaks.
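For example, a double free like the following (a minimal illustration, not code from Med) compiles cleanly but fails at runtime:

int main() {
  int* data = new int[4];
  delete[] data;
  delete[] data; // double free: undefined behaviour, typically a runtime crash
  return 0;
}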

Adopt TDD

I have covered this in my previous post. Smaller functions are easier to test. Just make sure your functions or methods are testable.

Valgrind (Linux only)

Compile your application with debugging information (the -g option for g++), then run your test suite with valgrind:

valgrind --leak-check=yes ./testSnapshot

It will tell you which function accessed non-allocated memory, how many bytes were leaked, and so on.

Valgrind is super useful for checking memory leaks.
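For example, valgrind will catch the leak in a deliberately broken program like this (a hypothetical leak.cpp, not from Med):

// leak.cpp: allocates with new[] but never deletes
int main() {
  char* buffer = new char[64];
  buffer[0] = 1; // use the buffer so it is not trivially optimized away
  return 0;      // buffer is never deleted: 64 bytes definitely lost
}

Compile it with g++ -g leak.cpp -o leak, then run valgrind --leak-check=yes ./leak to see the leaked bytes together with the stack trace of the allocation.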

Memory Manager

Because I wrote the code with some mistakes, the memory was freed twice and caused the error, but I failed to find the cause (which I did finally find).

Therefore, I wrote a memory manager to make sure the memory is not freed more than once. However, this is not really a solution to memory leaks; it just prevents the program from freeing the same memory twice.

ByteManager::ByteManager() {}

ByteManager::~ByteManager() {
  clear();
}

// Singleton
ByteManager& ByteManager::getInstance() {
  static ByteManager instance;
  return instance;
}

Byte* ByteManager::newByte(int size) {
  Byte* byte = new Byte[size];
  recordedBytes[byte] = byte;
  return byte;
}

void ByteManager::deleteByte(Byte* byte) {
  auto found = recordedBytes.find(byte);
  if (found != recordedBytes.end()) {
    delete[] found->second; // reuse the iterator, avoiding a second map lookup
    recordedBytes.erase(found);
  }
}

void ByteManager::clear() {
  for (auto it = recordedBytes.begin(); it != recordedBytes.end(); ++it) {
    delete[] it->second;
  }
  recordedBytes.clear();
}

std::map<Byte*, Byte*> ByteManager::getRecordedBytes() {
  return recordedBytes;
}

Then I refactored the code so that uses of the new and delete operators are replaced by the newByte() and deleteByte() methods, and a std::map stores all the allocated memory.
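For reference, the corresponding class declaration might look like the following sketch (my reconstruction; the actual header in Med may differ):

#include <map>

typedef unsigned char Byte; // assumption: Byte is a byte-sized type

class ByteManager {
public:
  ~ByteManager();
  static ByteManager& getInstance();

  Byte* newByte(int size);
  void deleteByte(Byte* byte);
  void clear();
  std::map<Byte*, Byte*> getRecordedBytes();

private:
  ByteManager(); // singleton: construct only through getInstance()
  std::map<Byte*, Byte*> recordedBytes;
};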

Smart pointer

C++11 introduced the smart pointers shared_ptr and unique_ptr (before C++11, auto_ptr served a similar purpose). By using a smart pointer, we can use the new operator to instantiate an object and we need not delete it, because it is deleted automatically when it loses its last reference. For example,

typedef std::shared_ptr<SnapshotScan> SnapshotScanPtr;

// some where else
SnapshotScanPtr matched = SnapshotScanPtr(new SnapshotScan(curr.getAddress() + i + currOffset, scanType));

So the program instantiates a new SnapshotScan object held by a shared_ptr, which is then stored in a std::vector. When the object loses its last reference, such as when it is removed from the std::vector, it is deleted automatically.
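A minimal, self-contained illustration of this behaviour (a sketch with a stub SnapshotScan, not the actual Med code):

#include <memory>
#include <vector>

struct SnapshotScan { /* fields omitted */ };
typedef std::shared_ptr<SnapshotScan> SnapshotScanPtr;

int main() {
  std::vector<SnapshotScanPtr> scans;
  scans.push_back(SnapshotScanPtr(new SnapshotScan())); // reference count = 1

  scans.clear(); // last reference dropped: the SnapshotScan is deleted automatically
  return 0;
}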

In my opinion, this is a better solution than the memory manager above. However, my existing project can hardly be refactored to use smart pointers.


C++ Unit Test and Dependency Injection


TDD (test-driven development) is widely adopted in modern development such as web development, because it allows developers to test their solution robustly in order to produce a more stable product.

Higher-level programming languages like JavaScript and Ruby allow developers to easily mock functions and data to test the target specification. However, a language like C++ is not designed for TDD; mocking functions is more complex.

In order to adopt TDD, we need to write functions as small as possible so that they can be easily mocked. As a result, the design of the class needs to be testable, and the methods we are going to test need to be publicly accessible.

Unlike C++, JavaScript is a prototype-based language, so JavaScript methods can easily be mocked by overriding them. In order to mock a C++ class, I have to use inheritance and override the public methods. Here is an example from my project Med.

class SnapshotScanTester : public SnapshotScan {
public:
  SnapshotScanTester() : SnapshotScan() {
    scannedValue = NULL;
  }
  virtual Bytes* getValueAsNewBytes(long pid, ScanType scanType) {
    // Return a fixed 8-byte value instead of reading the target process
    Byte* data = bm.newByte(8); // bm: a ByteManager instance (see the memory manager above)
    memset(data, 0, 8);
    data[0] = 20;
    Bytes* bytes = new Bytes(data, 8);
    return bytes;
  }
};

In order to mock the class SnapshotScan, I need to make the getValueAsNewBytes method public and virtual. Then I test with SnapshotScanTester using the CxxTest framework.

Dependency Injection

I learnt the term Dependency Injection when I was developing an AngularJS project. With dependency injection, we create services and inject them into the client (object).

The Med project is complex. The main feature of Med is to scan the memory of another process, and mocking another process is impractical in a test. To solve this, I refactored the code that takes the PID (process ID) as a parameter and moved it into a service. For example,

class SnapshotScanService {
public:
  SnapshotScanService();
  virtual ~SnapshotScanService();

  virtual bool compareScan(SnapshotScan* scan, long pid, const ScanParser::OpType& opType, const ScanType& scanType);

  virtual void updateScannedValue(SnapshotScan* scan, long pid, const ScanType& scanType);
};

Before the refactoring, my code did something like

snapshotScan.compareScan(pid, opType, scanType);

By refactoring this into a service, I can mock the service to produce any value for testing.

In order to inject the service, I wrote a constructor that accepts the service:

class Snapshot {
public:
  Snapshot(SnapshotScanService* service = NULL);
  // ...
};
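If no service is provided, the constructor instantiates a default one. A sketch of how that might look (my reconstruction, assuming a service member pointer; the actual Med code may differ):

class Snapshot {
public:
  Snapshot(SnapshotScanService* service = NULL) {
    if (service == NULL) {
      service = new SnapshotScanService(); // fall back to the real service
    }
    this->service = service;
  }
private:
  SnapshotScanService* service; // assumed member holding the injected service
};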

Therefore, if I want to mock the service, I just create a new class derived from SnapshotScanService. For example,

class SnapshotScanServiceTester : public SnapshotScanService {
public:
  SnapshotScanServiceTester() {}

  virtual bool compareScan(SnapshotScan*, long, const ScanParser::OpType&, const ScanType&) {
    return true; // always report a match
  }
  virtual void updateScannedValue(SnapshotScan* scan, long pid, const ScanType& scanType) {
    // Replace the scanned value with a fixed 20-byte buffer whose first byte is 60
    Bytes* currentBytes = Bytes::create(20);
    currentBytes->getData()[0] = 60;
    scan->freeScannedValue();
    scan->setScannedValue(currentBytes);
  }
};

Then, in the test (where SnapshotTester is a test subclass of Snapshot):

    SnapshotScanService* service = new SnapshotScanServiceTester();
    SnapshotTester* snapshot = new SnapshotTester(service);

So, by using dependency injection, I can finally test my class properly to make sure its functionality is reliable. Without unit tests, I would need to test my code by running it thousands of times to create different situations. I am not a genius; I don't think my code can function properly without proper tests.

Firefox Legacy version 56.0.2


The latest Firefox, version 57 and above, a.k.a. Firefox Quantum, is fast, but… that is not what I need.

As a developer, I favour Chromium over Firefox, and I use Firefox mainly for downloading. The addon DownThemAll is a must. The greatest feature I love is the ability to highlight selected hyperlinks and download them as a batch, naming the downloaded files by their original filenames or based on the text in the HTML.

DownThemAll in Firefox 56

This is something neither Google Chrome nor Firefox Quantum can do.

Firefox Quantum no longer supports legacy addons. As a result, unless the addon developers update their projects, there is nothing more we can do, other than installing other Firefox variants such as Firefox ESR, Waterfox, Pale Moon, etc. (as suggested by one of the DTA reviewers). In my opinion, with the WebExtensions API, the DTA developer may face some issues adding Quantum support.

I tried to install Firefox ESR (which is actually 52.5.0) in order to use DTA. However, it is extremely slow compared to Firefox 56. When Firefox 54 was released, it was the best Firefox ever and really fast compared to the older versions. Therefore, in order to use the latest Firefox that supports legacy addons, I chose to install Firefox 56 without removing the latest Firefox Quantum.

How to install an older Firefox (Linux only)?

  1. Use Google to search for an older Firefox; you should get a URL like this: https://ftp.mozilla.org/pub/firefox/releases/56.0.2/
  2. Choose the link (directory) that matches your target machine: Windows 32-bit / 64-bit, Linux 32-bit / 64-bit, etc. Then choose the language that you prefer. In my case, I downloaded Firefox 56.0.2 for Linux 64-bit.
  3. Download and extract it to some location, such as /opt/firefox56.

Now you should be able to invoke Firefox 56 with the following (but don't run it yet):

/opt/firefox56/firefox

Running the above command will use the same profile as your default Firefox (if you have Firefox Quantum installed).

Create another profile

Your Firefox profiles should be located at ~/.mozilla/firefox/, named like xxxxx.profilename.

(NOTE: I suggest making a backup of your default profile.)

The easiest way to create another profile for another version of Firefox:

  1. Create a new profile using the command firefox -ProfileManager, because this will update profiles.ini as well.
  2. Exit Firefox Profile Manager.
  3. Go to the directory of the new profile, then delete all the files and directories within it, i.e. make it empty. (NOTE: Do not delete your default profile.)
  4. Copy all the files and directories of your default profile into the emptied profile from step 3.
  5. Now you can run Firefox with the new profile, like /opt/firefox56/firefox -P profilename.

Create a shortcut (application menu, a.k.a. desktop entries, Linux only)

Because we installed this Firefox just by extracting an archive, no desktop entry file was created. Therefore, we need to create our own desktop entry file, e.g. ~/.local/share/applications/firefox-legacy.desktop.

The following is the content:

[Desktop Entry]
Name=Firefox Legacy
GenericName=Web Browser
Icon=firefox
Type=Application
Categories=Application;Network;
MimeType=text/html;text/xml;application/xhtml+xml;application/vnd.mozilla.xul+xml;text/mml;x-scheme-handler/http;x-scheme-handler/https;
Encoding=UTF-8
Exec=/opt/firefox56/firefox -P legacy %u
Terminal=false
MultipleArgs=false
StartupNotify=false
StartupWMClass=Firefox

where, in /opt/firefox56/firefox -P legacy %u, legacy is the profile name. You can modify this line to match your target profile name.

Then restart your gnome-panel (or whatever panel you use); it should show the menu item named Firefox Legacy. Now you can run two different versions of Firefox simultaneously.

 

(Side note: WordPress with Markdown mode enabled is the best!)

NVidia and hibernation issue, partially solved


In my previous post, I mentioned NVidia and xcompmgr; that is not the true reason Chrome stopped updating its display.

The root cause is partially found. The issue is caused by the Optimus laptop (dual graphics cards, NVidia plus Intel). Under unknown conditions, resuming from hibernation causes the Intel graphics card to stop working properly. This can be checked by running “glxgears” after resuming; you will see that OpenGL fails to refresh the display.

However, if bumblebee is installed, we can run “optirun glxgears”, and this works around the graphics card issue.

Child processes

Now there is a tricky issue: because GNOME-Do is not started with “optirun”, applications launched through GNOME-Do do not use the NVidia graphics card. As a result, I need to quit GNOME-Do and start it with “optirun”; then all applications launched by GNOME-Do use the graphics card correctly.

Run with NVidia only

Unfortunately, I failed to start the X window system with the NVidia graphics card only. And I didn't disable the Intel graphics card, because that would defeat the point of an Optimus laptop. As a result, I cannot confirm whether the display refresh issue exists when only the NVidia graphics card is used.

But so far, I use “optirun” to run an application whenever the graphics card fails to refresh its display.

Firefox or Chromium (software development)?


I switched from Chromium to Firefox as my primary web browser recently. Then I switched back to Chromium again.

Chrome is often claimed to consume a lot of memory, and recent Firefox updates claim to be faster and consume less memory. That is why I switched to Firefox, and I agree it is much faster than before. However…

I faced a critical issue. A less important issue I would like to mention first: Firefox does not support Google Hangouts.

The critical issue I faced is related to JavaScript. During web development, or even when visiting CircleCI (which I believe uses JavaScript heavily), if the JavaScript has severe errors, whatever web browser you are using will stop responding or slow down. But Chrome (Chromium, I mean) deals with the issue differently from Firefox: the whole computer slows down temporarily (maybe for several minutes), then in the end the page is shown as “dead” and I regain control over my computer.

In the same condition, Firefox's memory usage expands (possibly exponentially) due to the errors. Then the computer starts slowing down and stops responding until I do a hard reboot. Based on my observation, the memory grows until it uses all the RAM; when no RAM is available, memory is immediately pushed into swap. Because swap lives on the hard drive, it is much slower for me to switch to a terminal to kill Firefox. And even when I successfully switch to a terminal, typing a command and seeing the response takes approximately infinite time, while the swap usage keeps growing non-stop.

As a web developer, I prefer to use Chrome.

NVidia and probably xcompmgr


I have a Dell Vostro 5459 with Arch Linux. Previously, whenever I hibernated, resuming would produce a black screen on which I could do nothing.

Then I believed that one of the NVidia updates had fixed this issue.

However, very soon after, I faced another issue: resuming from hibernation leaves Chromium with frozen content, i.e. the content doesn't redraw. This happens not only to Chromium, but also to Opera and SMPlayer. I thought it was caused by NVidia; I tried a lot of solutions and found nothing on the Internet. I also installed “bbswitch”, which solved nothing.

But just now, before hibernating, I tried exiting every application related to the display or possibly doing graphics work. Then I remembered that I always run “xcompmgr”, as it enables the composite feature on Openbox. I killed it and hibernated. Resuming from hibernation now, Chromium works fine.

So possibly it is “xcompmgr” that has caused the trouble all along since the NVidia fix. To be confirmed.

Complexity and simplicity


When we develop a solution or a system, we are prone to choose a simple solution, because a simple solution is just better than a complex one. However, most of the time we choose a simple solution inappropriately, and this causes more and more trouble as the system grows.

The complexity of a solution should depend on the complexity of the problem itself, not the other way round. For example, we cannot create an operating system with a single programming statement, nor with just a single source file. Because an operating system is very complex (managing devices, memory, processes, etc.), no simple solution can fulfil the requirements.

That is why global variables are usually discouraged: they become difficult to manage as your source code grows. However, if the problem is simple and global variables solve it efficiently, then the approach is acceptable.

The human mind is limited; we cannot process too much information. Hence, if a source file contains a lot of global variables (or, similarly, a function has too many parameters), we cannot process the information well, because it is complex. And when a function is too long, with hundreds of lines of statements, we cannot remember what happened at the beginning of the function. However, if we organize the variables and parameters properly, we can process the source code much better.

As the UNIX philosophy says, “Do One Thing and Do It Well” (DOTADIW) (so do microservices); this is how we ought to design our solutions. We simplify the solution, not the problem, because the problem cannot be changed. As a result, a very complex problem will need a lot of simple solutions or services.

In reality, a life form like a human is complex; that is why we have multiple systems such as the digestive system, respiratory system, circulatory system, etc., each focusing on one task. A lower life form like an amoeba is very simple, and we cannot expect the biological system of an amoeba to work for a human. Similarly, a large organization needs a very complex management system (not in terms of software), compared to a small organization: you cannot expect the CEO to be in contact with thousands of employees every day in a large organization, but in a small organization the CEO can contact everyone on the team.

Therefore, if a problem is complex, or the system requirements are complex, we can only “divide and conquer” by breaking the main problem down into sub-problems, then solving each sub-problem with a smaller and simpler solution.

Pyramid, tree, or pipeline

When a community grows, it ends up becoming a pyramid-like hierarchy. When a file folder grows, it ends up becoming a tree structure. If the data flow is linear, then a pipeline is the appropriate solution. Therefore, as the system grows, information needs to be passed from unit to unit; this is inefficient for conveying messages, but efficient for management.

(But in reality, a pyramid hierarchy is troublesome, because humans are full of flaws and corruption.)

Pure function

Interestingly, while learning ReactJS, I found that using the pure-function style makes the code much simpler to manage. All the inputs of a pure function are immutable, or read-only; that means you will not create a side effect on the parent component or the caller. Similar to microservices, we just need to focus on the functionality of each component.
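The same idea can be sketched in C++ (a hypothetical example, since the post is about ReactJS): a pure function takes immutable input and returns a new value instead of mutating anything.

#include <vector>

// Pure: reads only its immutable input and returns a new value,
// never modifying the caller's data.
std::vector<int> doubled(const std::vector<int>& input) {
  std::vector<int> result;
  result.reserve(input.size());
  for (int value : input) {
    result.push_back(value * 2);
  }
  return result;
}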