Linux Kernel Development - Smatch


#bugfixing   #contributor   #kernel   #linux   #tools  

If you are one of those who think that static analysis is for the weak, you are also one of those who commit silly mistakes that could have been easily avoided! Give up your arrogance for once and let Smatch help you a little bit :wink:

Content:

  1. Yet another cpp checker?
  2. Setup
  3. How to use and short demo

1. Yet another cpp checker?

If you, on the other hand, use code checkers to catch obvious bugs, you might be wondering why you would need yet another tool to complain about that unused variable. There are some well-known tools out there like clangd, cppcheck, etc., that already do that.

If that was the case, I would have not written this article in the first place. You are right, you don’t need a hundred tools telling you exactly the same, most of the times false positives (cppcheck is the master of the false positives: try to check the whole kernel and you will see). But that is not the case, or at least not 100%. Smatch is mainly focused on the Linux kernel (it can be used for other projects, though), and it is able to carry out deeper and kernel-tailored analysis that will potentially unveil bugs that other general-purpose tools might overlook. I also like that it is based on another reliable tool like sparse, and it is actively maintained by Dan Carpenter, who also reports and fixes a lot of bugs in the Linux kernel.

How do I know that smatch catches more bugs? Because I use several static analysis tools in my workflow, and smatch is often the winner. Why do I still use the others? Because they give me immediate feedback while coding, and that is a big time-saver. And who knows if they will ever find something that smatch overlooked… If I can profit from all of them, why should I refuse and wait until someone reaches me to tell me that their tools found something I missed? As you will see during my demo, I know what I am talking about.

2. Setup

The setup I am going to show you is documented in the project, so you can clone the repository and follow the steps from Documentation/smatch.rst, which I converted to from the original .txt version recently after fixing some trivial inaccuracies. A minor contribution, but happy to help anyway!

You can find the repository of the project in several locations, and GitHub is one of them. Let’s clone it first:

git clone https://github.com/error27/smatch/tree/master

If you are too lazy to look for the documentation, just install the dependencies:

# For Debian & co:
apt-get install gcc make sqlite3 libsqlite3-dev libdbd-sqlite3-perl libssl-dev libtry-tiny-perl
# For Fedora & co:
yum install gcc make sqlite3 sqlite-devel sqlite perl-DBD-SQLite openssl-devel perl-Try-Tiny

and build the project with the following complex commands:

cd smatch && make

You don’t have to install anything else as you will run smatch from its directory. Nonetheless, there is another step that is highly recommended: building a cross function database. That will help smatch find more bugs, so let’s do it:

cd ~/path/to/kernel_dir
~/path/to/smatch_dir/smatch_scripts/build_kernel_data.sh

That will take a while depending on your machine and your kernel config, and it should be rebuilt from time to time to keep accuracy, but it is definitely worth it.

3. How to use and short demo

First, a copy+paste from the documentation I mentioned before. You can run smatch for the whole kernel, a specific directory, or a single file:

# Run these commands from the kernel root directory.
# whole kernel:
~/path/to/smatch_dir/smatch_scripts/test_kernel.sh
# directory:
~/path/to/smatch_dir/smatch_scripts/kchecker drivers/whatever/
# file:
~/path/to/smatch_dir/smatch_scripts/kchecker drivers/whatever/file.c

And now let’s take a look at a very simple, yet real example. I am going to tell you a true story about how I started to use smatch.

I have mentioned my kernel driver for the Amphenol Chipcap 2 humidity sensor that was applied to the hwmon subsystem in previous articles. A few days after it got applied, and fortunately before it reached the mainline kernel, I got an email from Dan telling me that he (i.e. smatch) had found an issue in my code.

The first thing I thought when I saw the issue was: how did I miss this silly mistake? Ok, I am a human… but clangd and cppcheck?? Maybe I have some misconfiguration… But then I saw that my tools were catching slightly simpler bugs, as expected. The next step was obviously getting smatch to check if it could find the issue, but it didn’t. Why? Because I had not built the cross function database yet. After building it, smatch found the issue as expected, so I fixed it and made sure that I did not miss any other similar bugs in the rest of the code.

The bug smatch found was a trivial ‘uninitialized variable’, which is obvious when some(one/ tool) tells you, but otherwise easy to overlook. The following code snippet shows two similar functions: the first one requests a threaded interrupt if irq_ready is not 0, and the second one does more or less the same for irq_low and irq_high.

uninitialized variables
clangd is only unhappy about the first ret

Thanks to the error message, I could fix the first one on the fly and keep on programming, which is always nice. On the other hand, the second ret seems ok for clangd. The logic is only slightly more complex (a second if clause), but in this case, that is enough to fool clang.

Let’s see what cppcheck has to say about the same file. I will use the –enable=all option to catch more stuff:

$ cppcheck --enable=all chipcap2.c
Checking chipcap2.c ...
chipcap2.c:143:6: style: The scope of the variable 'err' can be reduced. [variableScope]
 int err;
     ^
chipcap2.c:292:16: style: The scope of the variable 'timeout' can be reduced. [variableScope]
 unsigned long timeout;
               ^
chipcap2.c:371:16: style: The scope of the variable 'timeout' can be reduced. [variableScope]
 unsigned long timeout;
               ^
chipcap2.c:668:9: warning: Uninitialized variable: ret [uninitvar]
 return ret;
        ^
chipcap2.c:659:22: note: Assuming condition is false
 if (data->irq_ready > 0) {
                     ^
chipcap2.c:668:9: note: Uninitialized variable: ret
 return ret;
        ^

Basically the same: only the first uninitialized variable (more verbose), plus variable scope checks that in the kernel are usually false positives, because the common practice is declaring variables a the beginning of the function. Alright, let’s call smatch to save the day:

$ ~/repos/smatch/smatch_scripts/kchecker drivers/hwmon/chipcap2.c
  CHECK   scripts/mod/empty.c
  CALL    scripts/checksyscalls.sh
  DESCEND objtool
  INSTALL libsubcmd_headers
  CC      drivers/hwmon/chipcap2.o
  CHECK   drivers/hwmon/chipcap2.c
drivers/hwmon/chipcap2.c:668 cc2_request_ready_irq() error: uninitialized symbol 'ret'.
drivers/hwmon/chipcap2.c:701 cc2_request_alarm_irqs() error: uninitialized symbol 'ret'.

Bingo! Smatch reported the lines where the two uninitialized ret variables are used (in this case, returned). Not only that, it did not report useless information.

And that’s basically it for a start. If you want to learn more about smatch (e.g. how to add new checks), I could never tell you as much as the project maintainer. Check out his blog, which is also a good place to learn about static analysis in general.

Oddly enough, I did not know that Dan would join us for a mentoring session at the LKMP to talk about smatch, and that session took place a few minutes before I published this article :grin: If you would like to attend such mentoring sessions and learn from the most experienced kernel developres, click on this link and learn how to join the Linux Kernel Mentoring Program.

One disclaimer before I say goodbye, because I can see it coming: I would not be particularly surprised if it was possible to configure clangd and cppcheck to increase their coverage, but it is a fact that they didn’t catch that bug out of the box, and smatch did. But if you know how to configure them to even surpass smatch (for Linux kernel code analysis), please let me know because I am always looking for new improvements to my workflow. Feedback is always welcome, and not only from tools :smiley:



Enjoy and share knowledge!



Footer