It looks as if my Haskell for Enterprise Linux is set for wider circulation. Unfortunately I have something that will keep me maxed out for a couple of weeks, and the best explanation of what I am trying to do with the release went into a ‘heads up’ email to Simon Marlow, Simon Peyton Jones, Ian Lynagh, Duncan Coutts and Don Stewart. So here is a lightly edited copy of it.
Share and Enjoy
As I think most of you know, I have been threatening to make a RHEL/CentOS Haskell distribution for some time. Well, for the last few months I have been making good on that, and I think I am ready for an alpha release.
The release is a little different from the other one I am aware of, so I want to explain what I am doing here first.
Stability is King for Enterprise
Especially for an Enterprise Linux release I felt that we needed something that allows multiple releases to sit together. For my own production servers I can’t swap over the platform to fit in with distro release cycles. Because the servers use CentOS this has never been an issue, but any distribution that did delete old tools and platforms would not be useful for my day job, except perhaps as a handy source of bootstrap compilers!
I have found that even my own crafted setups are surprisingly cumbersome to work with, especially since I have gotten more heavily into the compiler-building game. Even while working on the client code base I have noticed a marked tendency to stick to a well-proven collection of packages (especially after an accidental cabal update destabilised my setup at an unwelcome time, after which I made a careful note of the exact collection of packages I was using).
So, for the EL distribution I have tried to make it easier to smoothly integrate multiple releases at the system level and manage multiple build contexts at the user level, all using a simple device that organises the excellent mechanisms provided by GHC and Cabal in a way that could prove helpful to the user.
From 6.10.4 to 7.2.1
So this distribution provides many platforms and compilers, all of which can reside on the same system. The repository contains the three major, stable platforms to date (2011.2.0.1, 2010.2.0.0, 2009.2.0.2, though for the moment without Haddock and without OpenGL on the older platforms) and every publicly released GHC since 6.10.4.
How does it work? If you install the Haskell package then you will get the platform du jour (2011.2.0.1), and updating the package will download and install later platforms as they become available. Old platforms won’t go away after a mere update (though they can be manually erased), and any source trees configured to use them will continue to use their designated platform.
If older compilers and platforms are needed from the get go then the individual packages can be installed.
The GNU Tools and 5.5G
Because I have noticed some problems with the old-ish gcc (4.1.2) and binutils (2.17) supplied with RHEL/CentOS 5 the distribution comes with its own inboard gcc (4.6.1) and binutils (2.21). RHEL/CentOS 6 use 4.4.2 and 2.20, which I suspect I could use for the Haskell release for these O/Ss (which I haven’t prepared yet), but I think I will provide the same approach and GNU tools on EL6 as EL5, and prioritise stability in the HP across both O/Ss. The cost is 72M+19M on download, 147M+76M unpacked. In total, it’s 200M for the first full platform and 109M for each extra tool chain and 25M for each of the platform libraries (12M without OpenGL). 1.2G of disc space is needed to install haskell, rising to 5.5G for the works.
So how does it really work?
Out of the box, if you type ghc or cabal you will get the default platform and the tools and they should work exactly as they would anywhere else.
If you want to access a different tool chain, or fix the tool chain that you get (or that your users get by default), you have options. Staying with the user options, you may indicate the platform or compiler you desire through the HUB environment variable, like so:
$ HUB=2010.2.0.0 ghc-pkg list
Or you can run a little tool that will drop a marker in your current directory. Provided the HUB environment variable is not set, then any tools run from this or any of its sub-directories (as defined by getCurrentDirectory) will use the specified platform and tool chain.
$ cd ~/brave-new-world
$ hub set 7.2.1
In these directories it is just as if you are on a system with 7.2.1 installed in /usr/bin.
Obviously /usr/bin contains a set of links from the tools that we know and love to the hub program which looks at the configuration files dropped by the distributions, the HUB environment variable and/or the markers in the current and parent directories and invokes the relevant tool (with HUB set to ensure consistent usage regardless of the changes in directory – important for cabal install).
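The lookup order just described can be sketched in a few lines of shell. This is only an illustration of the precedence (HUB variable first, then the nearest directory marker, then the system default), not the hub program's own code. The marker file name .hub, the hard-wired default and the resolve_hub function are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch of the lookup order only. The marker file name and the default
# below are assumptions for illustration, not the real hub configuration.
HUB_MARKER=".hub"            # hypothetical marker file name
HUB_DEFAULT="2011.2.0.1"     # the platform du jour mentioned in this post

# Print the hub that tools started in directory $1 (default: .) would use.
resolve_hub() {
  # 1. An explicit HUB environment variable always wins.
  if [ -n "${HUB:-}" ]; then
    printf '%s\n' "$HUB"
    return 0
  fi
  # 2. Otherwise look for a marker in this directory, then in each parent.
  dir=$(cd "${1:-.}" && pwd) || return 1
  while :; do
    if [ -f "$dir/$HUB_MARKER" ]; then
      head -n 1 "$dir/$HUB_MARKER"
      return 0
    fi
    [ "$dir" = "/" ] && break
    dir=$(dirname "$dir")
  done
  # 3. No marker anywhere up the tree: fall back to the system default.
  printf '%s\n' "$HUB_DEFAULT"
}
```

A real dispatcher would then export HUB with the resolved value before invoking the tool, so that child processes started from other directories keep using the same hub.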
If the user wishes, this trip through the hub can do some more work and provide cabal-dev type functionality. Until now we have been working with system hubs (those that came with the compiler and platform packages), but users can also create their own hubs with package databases attached…
# create user hub ‘bnw’ based on GHC 7.2.1
$ hub init 7.2.1 bnw
$ hub info bnw
bnw (user hub)
   GHC              : 7.2.1
   Tools            : /usr/hs/ghc/7.2.1/bin
   Cabal            : /usr/hs/cabal/cabal
   Package DBs
      global        : /usr/hs/db/7.2.1.d
      user          : /home/apache/.hubrc/lib/bnw/package.config
# and a user hub based on the H.P. 2011.2.0.1
$ hub init 2011.2.0.1 tt
$ hub info tt
tt (user hub)
   GHC              : 7.0.3
   Haskell Platform : 2011.2.0.1
   Tools            : /usr/hs/ghc/7.0.3/bin
   Platform Tools   : /usr/hs/hp/2011.2.0.1/bin
   Cabal            : /usr/hs/cabal/cabal
   Package DBs
      global        : /usr/hs/db/2011.2.0.1.d
      user          : /home/apache/.hubrc/lib/tt/package.config
Now you can install packages into the hub (with cabal), copy, rename and remove hubs (with hub), analyse the packages (with ghc-pkg), etc.
And how is it working in practice?
I have found this to be quite liberating. My standard smoke test for the repository is to install everything and run this script to prepare a hub for each platform and compiler capable of compiling the hub programme itself, building the compiler and running a simple test to make sure that the executable is functioning. I started with 7.0.3, my natural base. Going forward into 7.2.1 I was browbeaten by the tools into making my code 2010 clean. This turned out to be quite a quick and easy task. Having the tools to hand, and easy to switch between, really broke down my tendency to stick with the tried and trusted.
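For a rough idea of the shape of such a smoke test, here is a minimal sketch and not the actual script. It uses only the hub init, hub set, cabal install and make invocations that appear in this post. The list of bases, the smoke-NNN hub names and the DRY_RUN switch are assumptions for illustration, and the final "run a simple test" step is left out because its command is not given here.

```shell
#!/bin/sh
# Sketch of a per-hub smoke test, meant to be run from the source tree.
# BASES and the smoke-NNN hub names are hypothetical: adjust to taste.
BASES="2010.2.0.0 2011.2.0.1 7.2.1"

# Run a command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create a user hub on one base, point this directory at it, and rebuild.
smoke_one() {
  base=$1
  name="smoke-$(printf '%s' "$base" | tr -d '.')"   # 7.2.1 becomes smoke-721
  run hub init "$base" "$name" &&
  run hub set "$name" &&
  run cabal install hexpat &&
  run make clean all
}

# Try every base in turn, stopping at the first failure.
smoke_all() {
  for b in $BASES; do
    smoke_one "$b" || { echo "smoke test failed for $b" >&2; return 1; }
  done
}
```

Running DRY_RUN=1 smoke_all prints the commands without touching any hubs, which is a cheap way to check the list before committing to a full build.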
Going the other way I discovered some ‘missing’ packages in the 2010 platform.
# we need the platform
$ sudo yum install haskell-platform-2010.2.0.0-hub
# now create a new hub based on the 2010 platform
$ hub init 2010.2.0.0 hub-2010
# and switch the directory to the new hub
$ hub set hub-2010
# install what we need and rebuild
$ cabal install hexpat
$ make clean all
# oops – whining about missing Text.Regex
$ ghc-pkg list
$ HUB=hub-2011 ghc-pkg list
$ HUB=hub-2011 ghc-pkg describe regex-compat
# hmm, missing regex-compat
$ cabal install regex-compat
# we are good to go
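Comparing two package listings by eye gets tedious, so a small helper along these lines could report the packages present in one hub but absent from another. It is a sketch that assumes each listing was saved with ghc-pkg list --simple-output (space-separated name-version entries); the function names and file names are made up for the example.

```shell
#!/bin/sh
# Sketch: which packages does one saved listing have that another lacks?
# Capture the inputs first, for example:
#   HUB=hub-2011 ghc-pkg list --simple-output > new.txt
#   HUB=hub-2010 ghc-pkg list --simple-output > old.txt

# Turn "foo-1.2 bar-baz-0.3" into one sorted package name per line.
pkg_names() {
  tr ' ' '\n' | sed -e '/^$/d' -e 's/-[0-9][0-9.]*$//' | sort -u
}

# Print the package names found in listing $1 but not in listing $2.
missing_in() {
  a=$(mktemp) || return 1
  b=$(mktemp) || { rm -f "$a"; return 1; }
  pkg_names < "$1" > "$a"
  pkg_names < "$2" > "$b"
  comm -23 "$a" "$b"        # lines unique to the first sorted file
  rm -f "$a" "$b"
}
```

With the two files captured as in the comments, missing_in new.txt old.txt would have pointed at regex-compat straight away. Version differences are deliberately ignored, since only the package names matter for this check.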
When it came to 2009.2.0.2 I ran into problems getting hexpat to work. Essentially type errors on 6.10.4 pushed me back to a version of hexpat that would have required modifying the hub source code. Now I have no substantive reason for making hub 6.10.4-friendly so I stopped at 6.12.1, but it was a relatively pleasant exercise.
Perhaps contrary to expectation, by making it easy for people to hold onto their old compilers while acquiring the latest tools, we could see the wider code base tracking the tools more closely.
Cor blimey this email [post] is getting a bit long!
Naturally it is all a little rough, but I think still quite usable. If there are any flaws in the scheme I think you folks will be quick to point them out, which is why I wanted to run it past you before taking it wider.
If you would like to try it out the instructions should be complete. (Let me know if they are not!)