Python on Wheels

The Python packaging infrastructure has long received criticism from both Python developers and system administrators. For a long time even the Python community itself could not agree on which tools to use. We had distutils, setuptools, distribute and distutils2 as basic distribution mechanisms, and then virtualenv, buildout, easy_install and pip as high level tools to deal with this mess.

As distribution formats before setuptools we had source files, and for Windows there were some binary distributions in the form of MSIs. On Linux we had bdist_dumb, which was historically broken, and bdist_rpm, which only worked on Red Hat based systems. But even bdist_rpm did not work well enough for people to actually use it.

A few years ago PJE stepped up and tried to fix the initial distribution problems by providing the mix of setuptools + pkg_resources to improve distutils and to provide metadata for Python packages. In addition to that he wrote the easy_install tool to install packages. For lack of a distribution format that supported the required metadata, the egg format was invented.

Python eggs are basically zip packages that include the Python package plus the metadata that is required. Even though many people probably never built eggs intentionally, the egg metadata is still alive and kicking and everybody deploys things through setuptools now.

Now unfortunately a few years ago the community split in half and part of the community declared the death of binary distributions and eggs. When that happened the replacement for easy_install (pip) stopped accepting eggs altogether.

Fast forward a few years. The removal of binary distributions became very painfully noticeable as people started doing more and more cloud deployment, and having to recompile C libraries on every single machine is no fun. Because eggs at that point were poorly understood, I assume, they were reimplemented on top of newer PEPs and called wheels.

As general information before we dive in: I’m assuming that you are in all cases operating out of a virtualenv.

What are Wheels

So let’s start simple. What exactly are wheels and how do they differ from eggs? Both eggs and wheels are basically just zip files. The main difference is that you could import eggs without having to unpack them. Wheels on the other hand are just distribution archives that you need to unpack upon installation. While there are technically no reasons for wheels not to be importable, that was never the plan to begin with and there is currently no support for importing wheels directly.
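Since a wheel is just a zip archive, the standard zipfile module is enough to look inside one. The following is a minimal sketch and not part of any packaging tool: the helper names are made up for illustration, and the demo builds a tiny stand-in archive (not a complete, installable wheel) so that it runs without any wheel at hand.

```python
import os
import tempfile
import zipfile


def list_wheel_contents(wheel_path):
    """Return the sorted names of all files stored in a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as archive:
        return sorted(archive.namelist())


def make_demo_wheel(directory):
    """Write a tiny stand-in archive; a real wheel carries more metadata."""
    path = os.path.join(directory, "demo-1.0-py2.py3-none-any.whl")
    with zipfile.ZipFile(path, "w") as archive:
        archive.writestr("demo/__init__.py", "VALUE = 42\n")
        archive.writestr("demo-1.0.dist-info/METADATA",
                         "Name: demo\nVersion: 1.0\n")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in list_wheel_contents(make_demo_wheel(tmp)):
            print(name)
```

Pointing list_wheel_contents at a wheel you built yourself should show the package files next to a .dist-info directory holding the metadata.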

The other main difference is that eggs included compiled Python bytecode whereas wheels do not. The biggest advantage of this is that you don’t need to make wheels for different Python versions as long as you don’t ship binary modules that link against libpython. On newer Python 3 versions you can even safely link against libpython as long as you only use the stable ABI.

There are a few problems with wheels however. One of them is that wheels inherit some of the problems that eggs already had. For instance Linux binary distributions are still not an option for most people because of two basic problems: Python itself being compiled in different forms on Linux, and modules being linked against different system libraries. The first problem is caused by Python 2 coming in two flavours that are incompatible with each other: UCS2 Pythons and UCS4 Pythons. Depending on which mode Python is compiled with, the ABI looks different. Presently the wheel format (from what I can tell) does not annotate for which Python unicode mode a library is linked. A separate problem is that Linux distributions are less compatible with each other than you would wish, and concerns have been brought up that wheels compiled on one distribution will not work on others.

The end effect of this is that you presently cannot upload binary wheels to PyPI because of concerns about incompatibility with different setups.

In addition to that, wheel currently only knows two extremes: binary packages and pure Python packages. When something is a binary package it’s specific to a Python version on 2.x. Right now that’s actually not the worst thing in the world because Python 2.x is end of life and we really only need to build packages for 2.7 for a long time to come. If however we were to start considering Python 2.8 then it would be interesting to have a way to say: this package is Python version independent but it ships binaries so it needs to be architecture specific.

The reason why you might have a package like this is that some packages ship shared libraries loaded with ctypes or CFFI. These libraries do not link against libpython and as such would work across Python versions (even across Python implementations, which means you can use them with pypy).

On the bright side: nothing stops you from using binary wheels for your own homogeneous infrastructure.

Building Wheels

So now that you know what a wheel is, how do you make one? Building a wheel out of your own libraries is a very straightforward process. All you need to do is use a recent version of setuptools and the wheel library. Once you have both installed you can build a wheel out of your package by running this command:

$ python setup.py bdist_wheel

This will throw a wheel into your distribution folder. There is however one extra thing you should be aware of, and that’s what happens if you ship binaries. By default the wheel you build (assuming you don’t use any binary build steps as part of your setup.py) is a pure Python wheel. This means that even if you ship a .so, .dylib or .dll as part of your package data the wheel spit out will look like it’s platform independent.

The solution for this problem is to manually subclass the setuptools distribution to flip the purity flag to false:

import os
from setuptools import setup
from setuptools.dist import Distribution


class BinaryDistribution(Distribution):
    def is_pure(self):
        return False


setup(
    ...,
    include_package_data=True,
    distclass=BinaryDistribution,
)

Installing Wheels

Now that you have a wheel, how do you install it? On a recent pip version you can install it this way:

$ pip install package-1.0-cp27-none-macosx_10_7_intel.whl
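The filename in the command above is not arbitrary: wheel filenames follow the pattern name-version(-build)-pythontag-abitag-platformtag.whl, which is how an installer can tell whether a wheel fits the current interpreter and platform. A rough sketch of pulling those parts apart follows; the helper names are hypothetical, and real tools handle more corner cases (compressed tag sets like py2.py3, name normalization) than this does.

```python
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version build python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python, abi, platform = parts
        build = None  # the build tag is optional
    elif len(parts) == 6:
        distribution, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected number of components: %r" % filename)
    return WheelName(distribution, version, build, python, abi, platform)


def is_pure(wheel_name):
    """A pure Python wheel has no ABI requirement and runs on any platform."""
    return wheel_name.abi == "none" and wheel_name.platform == "any"


if __name__ == "__main__":
    info = parse_wheel_filename("package-1.0-cp27-none-macosx_10_7_intel.whl")
    print(info.python, info.abi, info.platform, is_pure(info))
```

For the wheel above this yields the Python tag cp27, the ABI tag none and the platform tag macosx_10_7_intel, so it is tied to a platform even though it does not depend on a specific Python ABI.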

But what about your dependencies? This is where it gets a bit trickier. Generally what you would want is to install a package without ever connecting to the internet. Pip thankfully supports that by disabling downloading from an index and by providing a path to a folder for all the things it needs to install. So assuming you have all the wheels for all your dependencies in just the right version available, you can do this:

$ pip install --no-index --find-links=path/to/wheels package==1.0

This will then install the 1.0 version of package into your virtualenv.

Wheels for Dependencies

Alright, but what if you don’t have the wheels for your dependencies? Pip in theory supports doing that through the wheel command. In theory this is supposed to work:

$ pip wheel --wheel-dir=path/to/wheels package==1.0

In this case wheel will throw all packages that package depends on into the given folder. There are two problems with this.

The first one is that the command currently has a bug and does not actually throw dependencies into the wheel folder if the dependencies are already wheels. What the command is supposed to do is collect all the dependencies, convert them into wheels if necessary and then place them in the wheel folder. What’s actually happening though is that it only places wheels there for things that were not wheels to begin with. So if a dependency is already available as a wheel on PyPI then pip will skip it and not actually put it there.

The workaround is a shell script that goes through the download cache and manually moves downloaded wheels into the wheel directory:

#!/bin/sh
WHEEL_DIR=path/to/wheels
DOWNLOAD_CACHE_DIR=path/to/cache
rm -rf $DOWNLOAD_CACHE_DIR
mkdir -p $DOWNLOAD_CACHE_DIR

pip wheel --use-wheel -w "$WHEEL_DIR" -f "$WHEEL_DIR" \
  --download-cache "$DOWNLOAD_CACHE_DIR" package==1.0
for x in "$DOWNLOAD_CACHE_DIR/"*.whl; do
  mv "$x" "$WHEEL_DIR/${x##*%2F}"
done
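If you would rather do the moving step from Python, the loop at the end of the shell script above can be sketched roughly like this. The function name is made up for illustration, and the assumption (taken from the %2F pattern in the script) is that the cache stores each download under its URL-quoted URL, so the real filename is whatever follows the last quoted slash.

```python
import os
import shutil


def collect_cached_wheels(cache_dir, wheel_dir):
    """Move every *.whl file from cache_dir into wheel_dir.

    Cached files are assumed to be named after their URL-quoted download
    URL, so only the part after the last "%2F" (a quoted "/") is kept.
    Returns the list of filenames that were moved.
    """
    os.makedirs(wheel_dir, exist_ok=True)
    moved = []
    for entry in sorted(os.listdir(cache_dir)):
        if not entry.endswith(".whl"):
            continue  # leave sdists and other cache files alone
        target_name = entry.rsplit("%2F", 1)[-1]
        shutil.move(os.path.join(cache_dir, entry),
                    os.path.join(wheel_dir, target_name))
        moved.append(target_name)
    return moved


if __name__ == "__main__":
    print(collect_cached_wheels("path/to/cache", "path/to/wheels"))
```

Unlike the shell version this only does the moving; running pip wheel itself is left out.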

The second problem is more severe. How can pip wheel find your own package if it’s not on PyPI? The answer is: it cannot. So what the documentation generally recommends is to not run pip wheel package but to run pip wheel -r requirements.txt where requirements.txt includes all the dependencies of the package. Once that is done, manually copy your own package’s wheel in there and distribute the final wheel folder.

DevPI Based Package Building

That workaround with depending on the requirements certainly works in simple situations, but what do you do if you have multiple in-house Python packages that depend on each other? It quickly falls apart.

Thankfully Holger Krekel sat down last year and built a solution for this problem called devpi. DevPI is essentially a practical hack around how pip interacts with PyPI. Once you have DevPI installed on your own computer it acts as a transparent proxy in front of PyPI and you can point pip to install from your local DevPI server instead of the public PyPI. Not only that, it also automatically caches all packages downloaded from PyPI locally so even if you kill your network connection you can continue downloading those packages as if PyPI was still running. In addition to being a proxy you can also upload your own packages into that local server, so once you point pip to that server it will find both public packages and your own ones.

In order to use DevPI I recommend making a local virtualenv, installing it into that and then linking devpi-server and devpi into your search path (in my case ~/.local/bin is on my PATH):

$ virtualenv devpi-venv
$ devpi-venv/bin/pip install --upgrade pip wheel setuptools devpi
$ ln -s `pwd`/devpi-venv/bin/devpi ~/.local/bin
$ ln -s `pwd`/devpi-venv/bin/devpi-server ~/.local/bin

Afterwards all you need to do is start devpi-server and it will continue running until you shut it down or reboot your computer:

$ devpi-server --start

Once it’s running you need to initialize it once:

$ devpi use http://localhost:3141
$ devpi user -c $USER password=
$ devpi login $USER --password=
$ devpi index -c yourproject

In this case, because I use DevPI locally for myself only, I use the same name for the DevPI user as I use for my system. As the last step I create an index named after my project. You can have multiple indexes next to each other to separate your work.

To point pip to your DevPI you can export an environment variable:

$ export PIP_INDEX_URL=http://localhost:3141/$USER/yourproject/+simple/
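If you drive pip from a Python build script instead of a shell, the same setting can be passed through the environment of the subprocess. This is only a sketch: both helper names are invented here, and the host and port defaults simply mirror the localhost:3141 address used above.

```python
import os


def devpi_index_url(user, index, host="localhost", port=3141):
    """Build the simple-index URL for one DevPI index."""
    return "http://%s:%d/%s/%s/+simple/" % (host, port, user, index)


def pip_environment(user, index):
    """Return a copy of the current environment with PIP_INDEX_URL set."""
    env = dict(os.environ)
    env["PIP_INDEX_URL"] = devpi_index_url(user, index)
    return env


if __name__ == "__main__":
    print(devpi_index_url("me", "yourproject"))
```

The returned dictionary can be handed to subprocess.check_call(["pip", "install", "package==1.0"], env=...) so that only that one pip invocation talks to the chosen index.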

Personally I place this in the postactivate script of my virtualenv to not accidentally download from the wrong DevPI index.

To place your own wheels on your local DevPI you can use the devpi binary:

$ devpi use yourproject
$ devpi upload --no-vcs --formats=bdist_wheel

The --no-vcs flag disables some magic in DevPI which tries to detect your version control system and moves some files off first. Personally this does not work for me because I ship files in my projects that I do not want to put into version control (like binaries).

Lastly I would strongly recommend breaking your setup.py files in a way that PyPI will reject them but DevPI will accept them, to not accidentally release your code with setup.py release. The easiest way to accomplish this is to add an invalid PyPI trove classifier to your setup.py:

setup(
    ...,
    classifiers=['Private :: Do Not Upload'],
)

Wrapping it Up

Now with all that done you can start having your own private packages depend on each other and build out wheels in one go. Once you have that, you can zip them up, upload them to another server and install them into a separate virtualenv.

All in all this whole process will get a bit simpler when the pip wheel command stops ignoring already existing wheels. Until then, a shell script is not the worst workaround.

Comparing to Eggs

Wheels currently seem to have more traction than eggs. The development is more active, PyPI started to add support for them and because all the tools are starting to work with them it seems to be the better solution. Eggs currently only work if you use easy_install instead of pip, which seems to be something very few people still do.

I assume the Zope community is still largely based around eggs and buildout and I assume if an egg based deployment works for you, then that’s the way to go. I know that many did not actually use eggs at all to install Python packages and instead built virtualenvs, zipped them up and sent them to different servers. For that kind of deployment, wheels are definitely a much superior solution because it means different servers can have the libraries in different paths. This previously was an issue because the .pyc files were created on the build server for the virtualenv and the .pyc files include the filenames.

With wheels the .pyc files are created upon installation into the virtualenv and will automatically include the correct paths.
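That a .pyc file remembers where its source lived can be checked with a few lines of Python. The sketch below compiles a throwaway module and reads the recorded path back out; the helper names are invented for this example, and skipping a 16 byte header is an assumption that holds for CPython 3.7 and later (older versions use a shorter header).

```python
import marshal
import os
import py_compile
import tempfile


def filename_stored_in_pyc(pyc_path):
    """Return the source path recorded inside a compiled .pyc file."""
    with open(pyc_path, "rb") as handle:
        handle.read(16)  # assumption: 16-byte header, as on CPython 3.7+
        code = marshal.load(handle)
    return code.co_filename


def compile_demo_module(directory):
    """Write a one-line module, compile it, and return (source, pyc) paths."""
    source = os.path.join(directory, "demo.py")
    pyc = source + "c"
    with open(source, "w") as handle:
        handle.write("VALUE = 42\n")
    py_compile.compile(source, cfile=pyc, doraise=True)
    return source, pyc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source, pyc = compile_demo_module(tmp)
        print(filename_stored_in_pyc(pyc))
```

The printed path is the one from the machine that did the compiling, which is why bytecode built on a build server carries the build server's paths along with it.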

So there you have it. Python on wheels. It’s there, it kinda works, and it’s probably worth your time.
