Let's Build Chuck Norris! - Part 4: Python and ctypes
Note: This is part 4 of the Let’s Build Chuck Norris! series.
Static and shared libraries #
C and C++ libraries come in two forms: static and shared.
In both cases, a library is a collection of names (the symbols) and executable code. The difference between static and shared libraries is how they are used, for instance by an executable.
The code from a static library is directly integrated into the program: the compiler will take copies of the code the program uses from the static library and make it part of the program.
On the other hand, the code in a shared library is only referenced by the program. When the program is launched, the operating system will try and find the code to run in the shared library file.
We say the code in a static library is used at compile time, whereas the code in a shared library is used at runtime.
For this reason, static libraries are also called archive libraries, and shared libraries are also called dynamic libraries.
Static and shared libraries usually have different extensions depending on the platform:
platform | static | shared |
---|---|---|
Linux | .a | .so |
macOS | .a | .dylib |
Windows | .lib | .dll |
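The naming conventions in the table above can be captured in a small helper. This is only a sketch: the `shared_library_name` function is hypothetical, and dropping the `lib` prefix on Windows is an assumption that matches what MSVC-based builds usually produce.

```python
import platform

# Shared library suffixes from the table, keyed by platform.system() values.
SHARED_SUFFIXES = {"Linux": ".so", "Darwin": ".dylib", "Windows": ".dll"}


def shared_library_name(name, system=None):
    """Return the conventional shared library file name for `name`."""
    system = system or platform.system()
    suffix = SHARED_SUFFIXES[system]  # KeyError for platforms not in the table
    # Assumption: Windows builds usually drop the "lib" prefix.
    prefix = "" if system == "Windows" else "lib"
    return prefix + name + suffix


print(shared_library_name("chucknorris", system="Linux"))
```

Passing `system` explicitly makes the helper easy to check on any machine; leaving it out uses the platform Python is running on.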
So, which type of library do we need to write bindings in Python?
Since we are using Python as a program (the /usr/bin/python3 binary), it’s obviously too late to do anything at compile time. So let’s build a shared library and see what we can do at runtime, shall we?
Building the shared library #
When we described the chucknorris library in our CMakeLists.txt earlier, we did not specify its type.
The type of library built in this case is controlled by a variable called BUILD_SHARED_LIBS, which is OFF by default. So, let’s re-run CMake, setting this variable to ON instead.
$ cd build/default
$ cmake -GNinja -DBUILD_SHARED_LIBS=ON ../..
$ ninja
...
-- Library sqlite3 found /.conan/data/sqlite3/3.21.0/bincrafters/...libsqlite3.a
...
[1/7] Building C object CMakeFiles/c_demo.dir/src/main.c.o
[2/7] Building CXX object CMakeFiles/chucknorris.dir/src/c_wrapper.cpp.o
[3/7] Building CXX object CMakeFiles/cpp_demo.dir/src/main.cpp.o
[4/7] Building CXX object CMakeFiles/chucknorris.dir/src/ChuckNorris.cpp.o
[5/7] Linking CXX shared library lib/libchucknorris.so
FAILED: lib/libchucknorris.so
: && /bin/c++ ...
-o lib/libchucknorris.so
ChuckNorris.cpp.o
c_wrapper.cpp.o
...
libsqlite3.a
..
/bin/ld: libsqlite3.a(sqlite3.o): relocation R_X86_64_PC32
against symbol `sqlite3_version` can not be used when making a shared object;
recompile with -fPIC
The link fails.
What’s going on here is that we are trying to incorporate a static library (libsqlite3.a) inside a shared one. Most of the time this works fine, but on Linux it only works if the static library was compiled as position-independent code.
The linker tells us what needs to be done: we have to recompile libsqlite3.a with -fPIC.
Here’s what man gcc has to say about -fpic:
-fpic
Generate position-independent code (PIC) suitable for use in a shared library.
Fair enough, let’s try to rebuild sqlite3 by generating position-independent code.
Patching a conan recipe #
The first step is to see if we can rebuild sqlite3 ourselves.
We will be using a different user name and channel (@dmerej/test instead of @bincrafters/stable). As we explained earlier, conan is decentralized, so copying and modifying other people’s recipes in order to satisfy your requirements is possible and even encouraged.
Let’s fetch the recipe from the remote:
$ conan copy sqlite3/3.21.0@bincrafters/stable dmerej/test
Downloading conan_sources.tgz
[==================================================] 706B/706B
Copied sqlite3/3.21.0@bincrafters/stable to sqlite3/3.21.0@dmerej/test
Copied sources sqlite3/3.21.0@bincrafters/stable to sqlite3/3.21.0@dmerej/test
Here conan looked for the recipe in the remote and created a copy with a different name, but still inside the conan cache.
Then we copy the sources from the cache and put them in a conan/sqlite3 folder next to the C++ code:
$ cd ChuckNorris/cpp
$ mkdir -p conan/sqlite3
$ cd conan/sqlite3
$ cp -rv ~/.conan/data/sqlite3/3.21.0/bincrafters/stable/export/* .
'../conanfile.py' -> './conanfile.py'
'../conanmanifest.txt' -> './conanmanifest.txt'
'../LICENSE.md' -> './LICENSE.md'
Let’s try to build the package ourselves:
$ conan create . dmerej/test
sqlite3/3.21.0@dmerej/test: Exporting package recipe
sqlite3/3.21.0@dmerej/test: A new conanfile.py version was exported
...
sqlite3/3.21.0@dmerej/test: Installing package
..
sqlite3/3.21.0@dmerej/test: Attempting download of sources from:
https://www.sqlite.org/2017/sqlite-amalgamation-3210000.zip
...
sqlite3/3.21.0@dmerej/test: Calling build()
CMake Error: The source directory "..." does not appear to contain CMakeLists.txt.
Turns out we also need to copy some files from the export_source folder:
$ cp -rv ~/.conan/data/sqlite3/3.21.0/bincrafters/stable/export_source/* .
'../CMakeLists.txt' -> './CMakeLists.txt'
'../FindSQLite3.cmake' -> './FindSQLite3.cmake'
And now we can build:
$ conan create . dmerej/test
sqlite3/3.21.0@dmerej/test: Exporting package recipe
sqlite3/3.21.0@dmerej/test: A new conanfile.py version was exported
...
sqlite3/3.21.0@dmerej/test: Installing package
..
sqlite3/3.21.0@dmerej/test: Attempting download of sources from:
https://www.sqlite.org/2017/sqlite-amalgamation-3210000.zip
...
sqlite3/3.21.0@dmerej/test: Calling build()
...
[1/2] Building C object CMakeFiles/sqlite3.dir/sources/sqlite3.o
[2/2] Linking C static library lib/libsqlite3.a
...
sqlite3/3.21.0@dmerej/test: Package '6ae331b7...' built
...
sqlite3/3.21.0@dmerej/test: Calling package()
sqlite3/3.21.0@dmerej/test package(): Copied 1 '.cmake' files: FindSQLite3.cmake
sqlite3/3.21.0@dmerej/test package(): Copied 2 '.h' files: sqlite3.h, sqlite3ext.h
sqlite3/3.21.0@dmerej/test package(): Copied 1 '.a' files: libsqlite3.a
...
sqlite3/3.21.0@dmerej/test: Package '6ae331b7...' created
Let’s sum up what happened:
- Conan fetched the sources from sqlite.org
- It called CMake using the CMakeLists.txt we copied from export_source
- It built the CMake project using a function named build()
- It copied some files using a function named package().
This is roughly what we did earlier when we built sqlite by hand.
Let’s take a closer look at the conan source files:
CMakeLists.txt:
project(cmake_wrapper)
include(conanbuildinfo.cmake)
conan_basic_setup()
add_library(sqlite3 sources/sqlite3.c)
conanfile.py:
class ConanSqlite3(ConanFile):
    name = "sqlite3"
    version = "3.21.0"
    settings = "os", "compiler", "arch", "build_type"
    options = {"shared": [True, False]}
    ...

    def source(self):
        base_url = "https://www.sqlite.org/" + self.year
        ...
        download_url = "{0}/{1}.{2}".format(base_url, archive_name, archive_ext)
        tools.get(download_url)

    def build(self):
        cmake = CMake(self)
        ...
        cmake.configure()
        cmake.build()

    def package(self):
        self.copy(...)
The conan recipe is just a Python class that derives from the ConanFile class.
It contains the source(), build() and package() methods used to fetch the sources, build the package, and copy the relevant files we saw mentioned in the previous console output.
It uses a CMake class that knows how to run cmake.
Finally, the class contains a few attributes. name and version are self-explanatory, but we have to talk about the settings and options.
Both settings and options are variables the consumers of the package can set. The settings cannot have default values, and if a setting changes, a different package must be produced. This is why the compiler (think gcc versus Visual Studio) is a setting. Settings are set globally, usually inside a profile, and apply to all the recipes. Options are different: they can have default values, they can be set package by package, and they have a predefined list of possible values.
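To get a feel for why a changed setting means a different package, here is a toy model in which each binary package is identified by a hash of its configuration. This is emphatically not conan's real algorithm: the `toy_package_id` function and the way the configuration is serialized are assumptions made up for illustration.

```python
import hashlib
import json


def toy_package_id(settings, options):
    """Hash the configuration. NOT conan's real algorithm, just an illustration."""
    payload = json.dumps({"settings": settings, "options": options}, sort_keys=True)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


gcc = toy_package_id({"os": "Linux", "compiler": "gcc"}, {"shared": False})
msvc = toy_package_id({"os": "Windows", "compiler": "Visual Studio"}, {"shared": False})

# Different settings give different identifiers, so the binaries cannot be mixed up.
print(gcc != msvc)
```

In this toy model, the same configuration always maps to the same identifier, which is the property that lets a cache hand out a previously built binary.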
We can see that conanfile.py already defines a shared option that can be true or false. We do not really want a shared sqlite3 library; we want a static sqlite3 library built with position-independent code.
There are several ways to do this. One of them is to introduce a new option called pic.
CMake knows how to convert the abstract concept of “position-independent code” into concrete compiler flags such as -fPIC for gcc, so we just have to set the correct CMake variable:
class ConanSqlite3(ConanFile):
    settings = "os", "compiler", "arch", "build_type"
    options = {
        "shared": [True, False],
        "pic": [True, False],
    }

    def build(self):
        cmake = CMake(self)
        if self.options.shared:
            cmake.definitions["BUILD_SHARED_LIBS"] = "ON"
        if self.options.pic:
            cmake.definitions["CMAKE_POSITION_INDEPENDENT_CODE"] = "ON"
        ...
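The mapping from recipe options to CMake definitions can be tried out in isolation, without conan installed. The sketch below mirrors the logic of build() above; the `cmake_definitions` and `as_command_line` helpers are hypothetical names, not part of conan.

```python
def cmake_definitions(shared, pic):
    """Return the CMake definitions implied by the two recipe options."""
    definitions = {}
    if shared:
        definitions["BUILD_SHARED_LIBS"] = "ON"
    if pic:
        definitions["CMAKE_POSITION_INDEPENDENT_CODE"] = "ON"
    return definitions


def as_command_line(definitions):
    """Format definitions as -D flags, as they would appear on a cmake command line."""
    return ["-D{}={}".format(key, value) for key, value in sorted(definitions.items())]


# The combination we are after: a static library built as position-independent code.
print(as_command_line(cmake_definitions(shared=False, pic=True)))
```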
So now we can re-create the sqlite3 package:
$ conan create --option 'pic=True' . dmerej/test
Finally, we can change the conanfile.txt in cpp/ChuckNorris to reference our newly built package:
conanfile.txt:
[requires]
sqlite3/3.21.0@dmerej/test
Using the chucknorris shared library #
Let’s re-run conan install, using the --option command line flag again, and see if we can manage to build chucknorris the way we want.
Note that we prefix the pic=True option with the name of the package we want to apply it to. If we did not do that, conan would have tried to set the option on every package.
$ cd build/default
$ conan install ../.. --option 'sqlite3:pic=True'
$ cmake -GNinja -DBUILD_SHARED_LIBS=ON ../..
$ ninja
[1/7] Building C object CMakeFiles/c_demo.dir/src/main.c.o
[2/7] Building CXX object CMakeFiles/chucknorris.dir/src/c_wrapper.cpp.o
[3/7] Building CXX object CMakeFiles/cpp_demo.dir/src/main.cpp.o
[4/7] Building CXX object CMakeFiles/chucknorris.dir/src/ChuckNorris.cpp.o
[5/7] Linking CXX shared library lib/libchucknorris.so
[6/7] Linking C executable bin/c_demo
[7/7] Linking CXX executable bin/cpp_demo
Success!
Side note: specifying the option about sqlite3 each time we call conan is a bit tedious, but we can just specify the option directly in the conanfile.txt:
[requires]
sqlite3/3.21.0@dmerej/test
[options]
sqlite3:pic = True
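To make the package:option scoping concrete, here is a rough sketch that reads the [options] section with Python's configparser and groups the values by package. Conan has its own parser, so this is only an approximation of the file format, and the `scoped_options` helper is a hypothetical name.

```python
import configparser

CONANFILE_TXT = """\
[requires]
sqlite3/3.21.0@dmerej/test

[options]
sqlite3:pic = True
"""


def scoped_options(text):
    """Group the entries of [options] by package name (scoped form only)."""
    # Only "=" is a delimiter here, so the ":" stays part of the key.
    parser = configparser.ConfigParser(allow_no_value=True, delimiters=("=",))
    parser.optionxform = str  # keep keys as written instead of lowercasing them
    parser.read_string(text)
    result = {}
    for key, value in parser["options"].items():
        package, _, option = key.partition(":")
        result.setdefault(package, {})[option] = value
    return result


print(scoped_options(CONANFILE_TXT))
```

The value comes back as the string "True", since this sketch does not try to interpret option values.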
Anyway, we said earlier that it was the operating system that took care of loading code from the shared library at runtime. On Linux, this is done by a special shared library called ld-linux.so.
We can thus check that the libchucknorris.so file does get loaded when we run the cpp_demo executable, by asking ld.so to output debug information about the files it loads:
$ cd build/default
$ export LD_TRACE_LOADED_OBJECTS=1
$ ./bin/cpp_demo
./bin/cpp_demo
linux-vdso.so.1 (0x00007f...)
libchucknorris.so => .../build/default/lib/libchucknorris.so (0x00007f...)
...
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f...)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f...)
...
Chuck Norris knows Victoria's secret
Our friends libpthread.so and libdl.so, which we had to take care of when we linked with sqlite3 by hand, are involved, and we can see the full path of the chucknorris library inside our build folder.
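If we wanted to check this automatically, the loader output is regular enough to parse. The sketch below extracts the "name => path" pairs from a trace shaped like the one above; the sample paths and addresses are made up, and the `resolved_libraries` helper is a hypothetical name.

```python
import re

# Made-up sample in the same shape as the loader output shown above.
TRACE = """\
linux-vdso.so.1 (0x00007f0000000000)
libchucknorris.so => /home/user/build/default/lib/libchucknorris.so (0x00007f0000001000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f0000002000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f0000003000)
"""

LINE = re.compile(r"^\s*(?P<name>\S+) => (?P<path>\S+) \(0x[0-9a-f]+\)\s*$")


def resolved_libraries(trace):
    """Map each library name to the file the loader resolved it to."""
    found = {}
    for line in trace.splitlines():
        match = LINE.match(line)
        if match:  # lines without "=>" (such as linux-vdso) are skipped
            found[match.group("name")] = match.group("path")
    return found


for name, path in resolved_libraries(TRACE).items():
    print(name, "->", path)
```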
Using ctypes #
The Python standard library contains a module called ctypes that lets us do what ld.so does, but from Python code.
The documentation says we can use ctypes.cdll.LoadLibrary to get a “handle” from the .so, and then use the symbols in the shared library simply by calling methods with the right names on the handle.
Let’s try:
import ctypes

handle = ctypes.cdll.LoadLibrary("build/default/lib/libchucknorris.so")
# No types declared yet: ctypes falls back to its defaults.
ck = handle.chuck_norris_init()
fact = handle.chuck_norris_get_fact(ck)
print(fact)
$ python ck.py
zsh: segmentation fault (core dumped) python ck.py
Whoops :/
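The most likely explanation for the crash lies in the ctypes defaults: unless told otherwise, it assumes every foreign function returns a C int. On a typical 64-bit Linux system (an assumption here), a pointer is 64 bits wide while an int is 32, so the pointer returned by chuck_norris_init() loses its upper half before being handed back to chuck_norris_get_fact(). The sketch below reproduces the truncation with a made-up address, without loading any library:

```python
import ctypes

# A made-up address in the range the loader printed earlier (0x00007f...).
real_pointer = 0x00007F1234567890

# ctypes integer types do no overflow checking: only the low 32 bits survive.
as_default_int = ctypes.c_int(real_pointer).value
# c_void_p is pointer-sized, so the whole address fits on a 64-bit system.
as_void_p = ctypes.c_void_p(real_pointer).value

print(hex(as_default_int))
print(hex(as_void_p))
```

Dereferencing the truncated value as if it were the original pointer is what sends the process into unmapped memory.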
Actually, for this to work we have to specify the types of the parameters and return values for every function we call, like this:
import ctypes

handle = ctypes.cdll.LoadLibrary("build/default/lib/libchucknorris.so")

# Declare the C signatures: by default, ctypes assumes an int return value.
handle.chuck_norris_init.restype = ctypes.c_void_p
handle.chuck_norris_get_fact.restype = ctypes.c_char_p
handle.chuck_norris_get_fact.argtypes = [ctypes.c_void_p]

ck = handle.chuck_norris_init()
fact = handle.chuck_norris_get_fact(ck)
print(fact)
$ python ck.py
b'When Chuck Norris enters a rodeo the bull has to try and last 8 seconds.'
Almost there: we still have to get rid of the b' prefix.
ctypes can’t really assume a char * in C code contains text, so the c_char_p type has been translated to a bytes object, suitable for representing binary data.
Assuming we were careful and only inserted valid UTF-8 encoded text in our sqlite3 database, we can call .decode("UTF-8") in our Python code, though:
fact_as_bytes = handle.chuck_norris_get_fact(ck)
fact_text = fact_as_bytes.decode("UTF-8")
print(fact_text)
$ python ck.py
When Chuck Norris enters a rodeo the bull has to try and last 8 seconds.
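If we are less sure about what ends up in the database, the decoding step can be made a bit more defensive. This is a sketch of one possible policy, not something the chucknorris library requires: the `fact_to_text` helper is a hypothetical name, and replacing invalid bytes rather than raising is a design choice.

```python
from typing import Optional


def fact_to_text(raw: Optional[bytes]) -> str:
    """Decode a value returned through a c_char_p restype into text."""
    if raw is None:
        # ctypes gives us None when a c_char_p result is a NULL pointer.
        raise ValueError("the library returned a NULL pointer")
    # errors="replace" swaps invalid byte sequences for U+FFFD instead of raising.
    return raw.decode("UTF-8", errors="replace")


print(fact_to_text(b"Chuck Norris knows Victoria's secret"))
```

Raising on NULL keeps the failure close to its cause, instead of surfacing later as an AttributeError on None.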
And we’re done.
This is all well and good, but there’s a more robust method to write our Python bindings. You can read more about it in part 5.