A Bit Wiser

Background

A lot of Alight’s development work consists of writing interfaces to various APIs. Our clients have social media data they want us to pull together (Twitter mentions, Facebook likes, YouTube shares). They have online media that we report on (Google Analytics, AdWords, and ad serving). And they often have internal CRM and sales data, which they provide to us via secure file transfer.

For that last one, we recently put together a process, written in Python, to interface with a remote SFTP service. Using the Paramiko library, this is simple enough. There’s an interesting sub-problem within this, though, which is amenable to bitwise operators, a topic which often seems esoteric to new developers. I want to give a few examples here of how and why they’re used.

Unix filesystem permissions

On Unix-like operating systems (e.g. Mac OS X and Linux), file system permissions are usually described and modified using a set of bit masks. Combinations of these masks are very succinct to use:

$ touch test
$ chmod 600 test
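Each octal digit of the mode packs three bits: read (4), write (2), and execute (1), for owner, group, and others in turn. As a rough sketch, the helper below (describe_mode is a hypothetical name, not part of any library) decodes a mode such as 600 into the familiar rwx string:

```python
def describe_mode(mode):
    """Render the low nine permission bits as an rwx string, e.g. 'rw-------'."""
    letters = "rwxrwxrwx"
    out = []
    for position, letter in enumerate(letters):
        # Bit 8 is owner-read, bit 0 is others-execute.
        bit = 1 << (8 - position)
        out.append(letter if mode & bit else "-")
    return "".join(out)


print(describe_mode(0o600))  # rw-------
print(describe_mode(0o755))  # rwxr-xr-x
```

The 0o prefix is used because it is accepted by both Python 2.7 and Python 3.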

In Python, it’s simple enough to read these modes from the filesystem:

$ python -V
Python 2.7.7
$ python -c "import os; print(os.stat('test').st_mode)"
33152

However, the default output is in base-10 (decimal) representation, which somewhat obscures the relation to the permission bits. Fortunately, it’s simple enough to convert between numeric representations in Python. It’s also straightforward to create and manipulate numbers in decimal (int), octal (oct) and hexadecimal (hex) formats:

# Results shown are for Python 2.7
int(10)            # 10
oct(10)            # '012'
hex(10)            # '0xa'
0xa                # 10
0xa - 2            # 8
0300               # 192
0b100              # 4
int(0b100)         # 4
int(0b100) - 0x10  # -12
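As a small sketch of those conversions, the same value can be spelled in any base and parsed back from a string. It is written to run on both Python 2.7 and Python 3, which is why it uses the 0o prefix and format() rather than the older 0300-style literal and oct():

```python
value = 0o600  # octal literal; identical to 384 in decimal

# All four spellings name the same integer.
assert value == 384 == 0x180 == 0b110000000

# format() renders an integer in a chosen base, without a prefix.
print(format(value, "o"))  # 600
print(format(value, "x"))  # 180
print(format(value, "b"))  # 110000000

# int() with an explicit base parses a string back into an integer.
print(int("600", 8))   # 384
print(int("180", 16))  # 384
```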

So that’s the background. Knowing this, and using the stat module from the Python standard library, we can view (and modify) these permissions in octal, the format most familiar from Unix. Using the excellent IPython command-line interpreter:

In [1]: import os, stat
In [2]: os.stat('test').st_mode
Out[2]: 33152
In [3]: oct(os.stat('test').st_mode - stat.S_IFREG)
Out[3]: '0600'

The last line subtracts stat.S_IFREG (the file-type flag for regular files, as opposed to directories or symlinks) and then converts the result to octal. The output, 0600, is the original permission assigned using chmod above.
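Subtraction works here because the file is known to be a regular file. A slightly more general sketch masks the permission bits out with a bitwise AND instead, either by hand or with stat.S_IMODE from the standard library. The temporary file is only there to make the example self-contained, and it assumes a Unix-like system where chmod is honoured:

```python
import os
import stat
import tempfile

# Create a throwaway file and give it the same mode as in the chmod example.
handle, path = tempfile.mkstemp()
os.close(handle)
os.chmod(path, 0o600)

mode = os.stat(path).st_mode

# Keep only the low nine permission bits; the file-type bits are masked away.
by_hand = mode & 0o777
# stat.S_IMODE does the same job (it also keeps setuid, setgid and sticky bits).
by_stat = stat.S_IMODE(mode)

print(format(by_hand, "o"))  # 600
print(format(by_stat, "o"))  # 600
print(stat.S_ISREG(mode))    # True

os.remove(path)
```

Masking gives the same answer for directories and symlinks, where subtracting stat.S_IFREG would not.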

Semantic Mnemonics

Perhaps regrettably, these octal bit manipulations are not terribly easy to remember. Getting read, write, and execute permissions correct requires remembering that 0100 is execute for owner — not super obvious.

Luckily, computers are really good at remembering all those sorts of nit-picky details. A common idiom is to create more human-readable static variables for small sets like the permission bits here. If we wrap these permission sets in a class object or module, we get a very clean, literate way of referring to each permission. In the example above, when using stat.S_IFREG, this is precisely what we’re employing.
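As an illustration, the stat module already names each permission bit, so the octal values need not be memorised. The sketch below relies only on constants from the standard library:

```python
import stat

# Each named constant is a single permission bit.
assert stat.S_IRUSR == 0o400  # read, owner
assert stat.S_IWUSR == 0o200  # write, owner
assert stat.S_IXUSR == 0o100  # execute, owner

# OR the names together to build the mode used with chmod earlier.
owner_read_write = stat.S_IRUSR | stat.S_IWUSR
print(owner_read_write == 0o600)  # True

# AND against a name to ask whether a single bit is set.
print(bool(owner_read_write & stat.S_IXUSR))  # False
```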

An Example: Account Permissions

The excellent Flask book has a good example of model permissions as well. In that project, a Permission class is defined, and the set of permissions for each user role is then bitwise OR’d together:

'User': (Permission.FOLLOW |
         Permission.COMMENT |
         Permission.WRITE_ARTICLES, True),
'Moderator': (Permission.FOLLOW |
              Permission.COMMENT |
              Permission.WRITE_ARTICLES |
              Permission.MODERATE_COMMENTS, False),
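To make that concrete, here is a minimal, self-contained sketch of the pattern. The Permission names come from the excerpt above, but the numeric values, the ROLES dictionary and the can() helper are assumptions chosen for illustration and may not match the book’s code:

```python
class Permission(object):
    # Assumed values: each permission occupies its own bit.
    FOLLOW = 0x01
    COMMENT = 0x02
    WRITE_ARTICLES = 0x04
    MODERATE_COMMENTS = 0x08


# Hypothetical role table: OR-ing the flags yields one integer per role.
ROLES = {
    "User": Permission.FOLLOW | Permission.COMMENT | Permission.WRITE_ARTICLES,
    "Moderator": (Permission.FOLLOW | Permission.COMMENT |
                  Permission.WRITE_ARTICLES | Permission.MODERATE_COMMENTS),
}


def can(role_permissions, permission):
    """Return True if every bit of `permission` is set in `role_permissions`."""
    return (role_permissions & permission) == permission


print(ROLES["User"])       # 7
print(ROLES["Moderator"])  # 15
print(can(ROLES["User"], Permission.MODERATE_COMMENTS))       # False
print(can(ROLES["Moderator"], Permission.MODERATE_COMMENTS))  # True
```

Because every flag sits on a distinct bit, a whole role fits in a single integer column, and a permission check is one AND plus a comparison.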

 

Equivalence

If you’re curious to learn more about this, the Codecademy site has a nice interactive tutorial on Bitwise Operators specifically (as well as a lot of other topics).

Hopefully you now know a bit more about the motivation for these operators (or you already know the subject, and I haven’t made any egregious assertions) and have an idea of how to work with them in a straightforward way. I’ll leave you with a few more examples that hint at how the various representations are equivalent, and how you can experiment to build a mental model of what each form means.

$ python
Python 2.7.7 (default, Jun 2 2014, 18:53:46)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 0b01
1
>>> 0b10
2
>>> 0b11
3
>>> 0x01
1
>>> 0x02
2
>>> 0x03
3
>>> 0b11 == 0x03
True
>>> 0b10000 == 0x10 == 16
True
>>> 0b10000 == int('10', 16)
True
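Building on those literals, a short sketch of the operators themselves may help. Nothing here is specific to permissions; it just shows AND, OR, XOR and shifts on small binary values:

```python
a = 0b1100  # 12
b = 0b1010  # 10

print(format(a & b, "04b"))   # 1000  (bits set in both)
print(format(a | b, "04b"))   # 1110  (bits set in either)
print(format(a ^ b, "04b"))   # 0110  (bits set in exactly one)
print(format(a >> 2, "04b"))  # 0011  (shift right by two places)
print(format(1 << 4, "b"))    # 10000 (the same 16 as 0b10000 and 0x10)
```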
