MacPorts is a package
repository that aims to be an easy-to-use system for compiling,
installing, and upgrading [...] open-source software on
the MacOS X
operating system
. Unfortunately, it is not very good. In fact,
I think it sucks quite a bit, in a way that cannot be repaired
easily without some fundamental changes. At the end of this story,
I will propose to let MacPorts die. Here's why.
First Things First: Why a Package System?
Here's the scenario: In order to accomplish some given task, Jane
User wants to use a program, which is not yet installed. The sole
point of a package system is to get Jane to her goal (running and
using said program) as fast and efficient as possible. The fact
that she has to install the program's package before using it is
already overhead. Minimizing this overhead will help Jane focus on
her task, and be more efficient.
With MacPorts, chances are high that instead, Jane has to handhold
the installation, un- or reinstall packages, jump through hoops, and
generally wait a long time before she can use the program. Even
worse, she quite likely has to page in
knowledge about
compilers, build systems, OS idiosyncrasies, etc., if something goes
wrong. Nothing of this is related to her original task. All of it
slows Jane down.
In the remainder of this longish rant, I will try to highlight why
MacPorts fails me. I will use a
recent GNUCash installation as
running example, but similar things happened before with other
packages. Notice that I am not blaming the GNUCash ports package.
Were none of the dependencies installed already, it would have
needed less handholding, I am sure. (Unfortunately, it seems the
installed dependencies are never just right in a production
system...) However, the overall experience would have been only
slightly better.
Don't Waste My Time
Apparently, I am not the only one having trouble to install GNUCash
from MacPorts:
John
had, too. When the efforts to build the package reached the
point of becoming ridiculous, I started to keep
a build
log of the ordeal.
Much of this was trial-and-error,
because I actually wanted to do
something else. However, there are some gems in there,
where the package system tries quite hard to mislead me.
It took me about one-and-a-half days to
get GNUCash installed with
MacPorts. Granted, there were long periods of waiting in between
when the package system was busy doing whatever, and I was not near
that computer most of the time. But it's even worse, the whole
process is not entirely automatic—it required interaction
every other hour or so to help it getting unstuck. I have better
things to do than babysit a stretched-out installation process.
Okay, GNUCash has a lot of dependencies. Perhaps, it is the
ultimate package system benchmark. Good! I applaud the developers
for reusing whatever is possible. Actually, I don't even care, the
package system should shield me from having to think about this at
all. As a user, all I really care about is to have GNUCash
installed.
And GNUCash is not the only behemoth here, this happened before.
Take
GHC, for example. It
takes ages to build with MacPorts.
Download Missing Pieces In One Batch
The Debian equivalent of installing GNUCash looks like this:
% sudo time apt-get -d install gnucash
[...]
0 upgraded, 82 newly installed, 0 to remove and 171 not upgraded.Need to get 24.3MB/32.9MB of archives.
After unpacking 120MB of additional disk space will be used.
Do you want to continue [Y/n]?
[...]
Get: 65 http://ftp.de.debian.org testing/main gnucash 2.2.1-1 [1965kB]
Fetched 24.3MB in 4s (6024kB/s)
Download complete and in download only mode
1.16user 0.24system 0:30.63elapsed 4%CPU (0avgtext+0avgdata 0maxresident)k
So, I need a network connection for all of 4 seconds to get
everything that is needed onto my hard disk for GNUCash to run.
Before that, I am asked whether it is okay to continue, so that I
can intervene if needed.
With a drawn-out build like GNUCash, MacPorts alternates between
downloading a package, compiling it (which can take an arbitrary
amount of time), installing it, and then looking at the next
dependency. Suck! When I am near a fast net connection, I want the
package system to download whatever is possibly needed to complete
my initial installation request in one go. When finished,
I want to disconnect my machine from the net, and go hiking in the
outback while still being able to finish the installation. There is
a reason why I chose a laptop as my main work horse: mobility.
In other words, network access should only be needed right
after the user-initiated command to install a new package.
When all network activity is done, a clear indication of this fact
should be given. See
also: Debian's package manager.
(While not perfect, it gets some things right. I'll spare you
mentioning this in the sections hereafter.)
Don't Install Stuff I Did Not Ask For (Or Explain Why)
No really, why would I ever want firefox-x11 on
MacOS X, considering that I have already
another Firefox
installed,
and Camino,
and Flock, not to mention
Safari, which I actually
use most of the time? Oh
right, yelp, a help
browser for GNOME, depends on firefox-x11.
% port installed gnucash-docs
The following ports are currently installed:
gnucash-docs @2.0.1_0 (active)
Why on earth does GNUCash pull in
the firefox-x11
package? Not to
mention evince and
goffice! If
other packages get pulled in because the package I requested has
declared a dependency (perhaps by transitivity) on them, I want at
least to be told what will be installed. Maybe I would refine my
installation request in light of that information.
On a similar note, MacPorts tries really hard to install gcc, perl,
and some other basic packages, completely ignoring the ones that
ship with MacOS X. This is explained in the
MacPorts
FAQ:
The drawbacks on this behaviour also are minimal:
Wasting 10MB for a Python installation is next to nothing if you
have a GB-harddisk and gain consistency all the way in return.
Right. Except, how long does it take to build gcc, perl and python
from source? Long enough to make me forget what I actually wanted
to install, and why.
Anyway, some programs come bundled with heaps of documentation, or
in different flavors. Sometimes, I just don't care, but sometimes
I'd like to have some influence on what gets installed. Depending
on the circumstances, I would like to choose from a minimal or
a batteries included
installation, or just the documentation,
please. Too much to ask?
Building from Source Considered Harmful
The reason is simple: there are too many variables in the process.
Too much can (and routinely does) go wrong, resulting in broken
programs, or even installations. Another war story: after a gettext
update (which was pulled in as a dependency by some other program,
no less), all of a sudden vital programs started to segfault: awk,
sed, tcl, you name it. Unnecessary to mention, MacPorts needs some of
these programs to function. Chicken, meet Egg.
I hope nobody dares to point to the installable packages
facility of MacPorts. They are utterly lacking in comparison to,
e.g., the Debian package manager. Dependencies, anyone?
I am really astonished that with all the emphasis on optimization,
it is not considered odd that every installation of MacPorts repeats
all the package compilation jobs over again. Why not cache the work
on download servers, and share it? Even if package build bots are
generally untrusted, it should only be a Small Matter Of Programming
to provide a web of trust with cryptographically signed packages,
right?
Build Environment and Run-time Environment Should be Separate
- Build Environment
- The
Build Environment
of a package contains the set of
dependencies needed to successfully build that package from source
code. This includes external packages containing compilers,
linkers, header files, libraries, documentation generators, OS
settings.
- Run-time Environment
- The
Run-time Environment
of a package contains
dependencies needed to use the package. Without them present, the
contents of the package are not functional, at least not fully.
The point here is that the build environment can differ
significantly from the run-time environment, and a package system
should acknowledge this.
Extreme cases are cross-compiled packages: such packages might not
even be usable in the build environment, but they are
usable in the run-time environment. A more common case is that the
build environment has perhaps a compiler or document generation
system like TeX installed, while the run-time environment does not
need a compiler, but instead a PDF reader to view
the documentation (which was not needed on the build environment).
Another reason is that it is usually fine to have several versions
of a single library installed at the same time, with the loader
taking care of choosing the right one for the program at hand.
However, when it comes to compiling and linking C code, one set of
header files and one (matching) shared library should be installed
in standard places in the build environment, such that the brittle
mess that is autotools
(a fine topic for another rant) can find them without getting
confused. When I want to build against the newest library version,
I do not want to touch at all the installed programs which were built
against some older version. They should keep working just fine.
As user of the package, I do not care what was needed to get
it to the point when I can execute the program. Whether it was
written in C or INTERCAL does not matter, as long as I have an
executable to run, which works.
Thus, all I need to ensure is that the run-time environment for the
package is complete. And fortunately, I do not have to take care
about that myself. If the package declares its dependencies, the
package system can make sure that everything that is needed gets
installed alongside the package.
Transactions and Rollback
If something goes wrong during package installation, I want an easy
way out. I want at least the ability to get back to the point
before I started installation. Also, if some dependencies cannot be
fulfilled, I want the whole transaction to fail, or possibly ask the
user whether to proceed with a partial installation.
User Interaction
Even with the ideal of a short installation time, any user
interaction should be at the beginning (or the end) of this
transaction, never interspersed with other activity.
You Failed It
How long are the BSDs around? Is the ports system really the best
package system that came out of this? When I was waiting for the
GNUCash build to finish I saw warnings scrolling by which
effectively said that something could not be undone because
post-remove hooks are not yet implemented. It's 2007 for another
couple of days, the much sneered-at Linux package systems (e.g.,
Debian's) have had {pre,post}-{install,remove} hooks for how many
years now?
So, invoking Fred Brooks here ("Fail fast"): please, let MacPorts
die. It is a waste of resources. Instead, I recommend spending
energy on improving,
e.g., Fink packages. At
least, it has modern, full-featured package management tools.
Speaking of which, I just asked fink to selfupdate via
rsync. It is still... COMPILING?!
p.s.: Another option
is Zero-Install, the package
management nirvana. I managed to get Zero-Install installed on
MacOS X actually (which also was not a smooth process, in parts
thanks to MacPorts). However, before it becomes usable in
day-to-day situations it needs a bigger user base. I will give it a
spin and see how far I get.
Yeah, that totally worked. Not!
Fink sent me through a twisty
maze of virtual packages, all alike (i.e., not working).
Eventually, I caved in and let it install ghostscript and tetex,
too, before it agreed to continue with GNUCash. The end
result is not very impressive.
After the fink selfupdate command and fink -b
install gnucash, this is what I ended up with:
% du -sh /sw
1.2G /sw
1.2 GB worth of (mostly build-time) dependencies.
On the good side, all the downloading appeared to happen at the
beginning of the command. Had fink not bailed out half dozen times,
the installation process could be called almost automatic. On the
bad side, there was again quite some compilation involved. Any
number of factors could or could not be the reason that gnucash
segfaults.
When will the madness stop?