Oscar Bonilla: Installing the ACS on a Linux Box

A very dated article on Installing the ACS

This document is a log of the joyful weekend I spent at the computer science department of Universidad Francisco Marroquín in Guatemala installing the ArsDigita Community System on a Linux Machine. I figure it might prove useful for others.

Which Machine?

The easiest part of the whole enchilada was choosing the machine in which we would install the software. I had only one machine which could run the Oracle RDBMS. It was a PC I had built myself. It had the following specs:

Pentium II CPU Running at 450MHz
ASUS P2B Motherboard
256 MB of RAM
6.3 GB Hard Disk Drive (just one!)

I think the most important feature your machine's got to have is lots of RAM. From what I've heard in the Web/DB Forum, Oracle won't even install on a machine with less than 128 MB of RAM.

Installing the OS

The installation of the operating system was pretty straightforward. We have a mirror of Red Hat Linux 6.2 on a local SunSITE, so I downloaded the install disk and did a plain network install.

Usually, Unix systems create various partitions that are mounted in different directories (mount points) in the logical filesystem. The only one really needed is the root filesystem, which is mounted on /. All of the other filesystems are created to "organize" things. Ancient Unix systems would choke pretty badly if the / filesystem filled up. Modern ones still choke, but not as badly. This was particularly annoying since the unattended log files in /var usually filled up pretty quickly on a busy machine, so people decided to have different filesystems for different things. In general, splitting the logical filesystem into many physical ones is a trade-off.

Unless you have a big machine, or at least something with SCSI, you won't be able to implement the Optimal Flexible Architecture (described in Chapter 12 of the book). If you're stuck with IDE, the best you can do is mirror the disks. If you only have one disk, why partition it? Not only will you see no performance improvement, but daily administration will be more difficult.

With this in mind, I settled for a single partition occupying the whole disk and mounted as the root filesystem.

The default filesystem shipped with Red Hat Linux is ext2fs. The problem with this filesystem is that it updates the filesystem metadata asynchronously. If the computer crashes, the metadata can be left in an inconsistent state, hence the need to run the dreaded fsck program. This program basically scans the whole filesystem and compares the results of building the metadata by hand to what is actually stored on disk. The time this process takes is directly proportional to the size of the filesystem. An alternative is to install a journaling filesystem such as ReiserFS. This type of filesystem records all metadata changes in a log before actually making them. Thus, when hell breaks loose, there's a nice log telling the system exactly what was left in an inconsistent state. This saves a lot of time when the machine crashes (no fsck). I got one of our local Linux gurus, Otto Solares, to install ReiserFS. Note that this step is entirely optional.

Red Hat 6.2 ships by default with Linux kernel 2.2.14, but this version of the kernel has a bug in the UDP code, so Otto also upgraded the kernel to version 2.2.16pre3. This is not required either.

There are some nice utilities that Red Hat doesn't have by default. For instance, it doesn't come with OpenSSH, which Otto gladly installed. All in all, it took us about 2 hours to set up the machine from scratch, and most of the time was spent waiting for the bits to be copied to the disk.

Installing Oracle

So I had my wimpy installation of Linux running and, already feeling great about myself, set out to install the Oracle RDBMS. Well, things started to get ugly as soon as I tried to download Oracle: it took me about ten minutes just to be able to log into the FTP server and almost a full day to get all of the bits.

Fresh from a good night's sleep, I unpacked the Oracle tarball in /orainstall and ran the runInstaller script. Of course it didn't work. It seems the installer script doesn't like to be run as root. After reading the Oracle Installation Notes and the Oracle DBA Handbook, asking a friend, and paying for $250 of Oracle consulting time, I came up with the following instructions:

Create a group named dba
Create a user named oracle
Create the user's home dir
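
On Red Hat, the three account steps above come down to something like the following. This is only a sketch: the home directory /home/oracle is an assumption (the article does not say where it lived), and the run helper with its DRY_RUN switch is a hypothetical convenience so the commands can be previewed before they are executed as root.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints each command instead of running it.
# Set DRY_RUN=0 and run as root to actually create the account.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

create_oracle_account() {
    run groupadd dba
    # -g sets the primary group, -d the home directory, -m creates it.
    # /home/oracle is an assumed location, not taken from the article.
    run useradd -g dba -d /home/oracle -m oracle
}

create_oracle_account
```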

Set the oracle user's environment to include

	  PATH=/oracle/bin:$PATH
	  LD_LIBRARY_PATH=/oracle/lib:/oracle/ctx/lib:$LD_LIBRARY_PATH
	  ORACLE_HOME=/oracle
	  ORACLE_SID=WEB

Now, as the oracle user, I gave runInstaller another try. Sure enough, after a couple of seconds a nice screen popped up and offered to install the Oracle Relational Database Management System on my machine. Heeding the defaults, I chose custom install and selected all the options. I chose /oracle as the target install directory, and /db as the target for the database files. At the prompt that asks if you want to create a new database once Oracle is installed I said yes. After crunching for a while, Oracle was all set and the Net8 Configuration Assistant came up. I chose not to have Directory Services support, since I have none, and left most of the remaining stuff untouched. Only one word of caution: do not change the name of the LISTENER. You'll regret it later, as I did on a previous install.

It seems to me that the default parameters of the Oracle Database Assistant have been very carefully engineered to screw users up. Had they chosen too small a value for the default sizes of the tablespaces and redo logs, most applications would overflow them as soon as the data models were loaded. Had the values been larger, they would probably be sufficient for most applications. No, the values are engineered in exactly such a way that once you have data loaded, the tables fill up, the rollback segments choke and your database dies. The moral of the story is: choose custom when the database assistant offers you the option.

Once you're in the custom path, it will ask if you want to do data warehousing or On-Line Transaction Processing. Since I think the ACS looks more like an OLTP application, I chose the latter. In the next screen, I left the default of 15 users. After that, I was offered the choice between a Dedicated Server Mode or a Shared Server Mode. Since I was going to use the AOLServer, which pools database connections, I chose the Dedicated Server Mode. I then proceeded to select every possible option for the database, making especially sure I selected the InterMedia option, which is needed for the Site Wide Search. The next screen asked for a database name, a SID name, and a bunch of other stuff. I'd heard that it's a Good Thing(tm) to choose the same name for both the database name and the SID, so out of pure ignorance and superstition I chose "web" for both and left the rest of the stuff alone. The next screen looked good; I didn't really care where the Control Files were stored as long as they were on /db. The datafiles are the files associated with the tablespaces you create, so 254 seemed plenty. I thought for a second about bumping up the Maximum Log Files to 64 from the default of 32, but left it alone after all. The only change I made was to increase the Maximum Log Members from 2, the default, to 4.

I was already gaining confidence in the default parameters when I clicked on Next and was presented with the tablespace parameters section. This is where you really have to change stuff. I suggest doubling every tablespace size and setting autoextend and unlimited on for each and every one of them. Once you're done with that, you can press Next and wonder what sort of wimpy application would use 500K redo logs. Set each redo log to at least 10M (10240K). At the next screen, I enabled the Archive Log and left the rest of the parameters untouched. After Next'ing that screen, I had fun figuring out the block size of the ReiserFS filesystem. After trying all of the obvious tools like tunefs, df, and newfs -N, I finally had to dig through the source code of the driver to realize it's 4K, the default! Note that for most filesystems this is 8K, and it's best to have the same block size in the database as in the filesystem. The next screen asked about the location of the trace files, which I left untouched. At last the Next button turned gray and I was offered the choice of immediately creating the database or saving all of my choices in a script. I chose to create it right away and left the machine crunching for about an hour.
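
For readers who would rather not read driver source, here is a small sketch of the two numbers this step needs: the block size of the filesystem and the redo log size in kilobytes. It assumes a system with the GNU coreutils stat command, which is not one of the tools tried above; the function names are made up for illustration.

```shell
#!/bin/sh
# Print the fundamental block size, in bytes, of the filesystem holding PATH.
# Assumes GNU coreutils stat; -f queries the filesystem rather than the file.
fs_block_size() {
    stat -f -c %S "${1:-/}"
}

# Convert megabytes to the kilobyte figure the assistant asks for.
mb_to_kb() {
    echo $(( $1 * 1024 ))
}

echo "block size of /: $(fs_block_size /) bytes"
echo "10M redo log: $(mb_to_kb 10)K"
```

Whatever the first line reports for the filesystem that will hold the datafiles is the value to match in the database block size screen.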

So was that it? Didn't look too hard, did it? Actually, you're reading about my third Oracle installation! I just thought it would be more useful to write about the stuff that worked and not about the failed attempts. I was able to derive a simple premise from all the failed installs though: don't mess with Oracle if you don't know what you're doing. You'll only make it worse. If you've made it bad enough and you no longer have a clue, wipe it all and reinstall.

Installing the ACS

I pointed my browser to software.arsdigita.com to download the latest version of the ACS. Lucky me! A new version had just been released. I downloaded the 3.2.3 release of the ACS to my home dir.

I had done my homework and read how to use CVS for web development, so I created a root repository in /cvsroot:

	$ export CVSROOT=/cvsroot
	$ cvs init

after which I proceeded to import the ACS in the repository:

	$ tar xvfz acs-3.2.3.tar.gz
	$ cd acs
	$ cvs import -m "Initial Import of ACS 3.2.3" web arsdigita acs-3_2_3

After a little while, I had the ACS under the control of CVS and proceeded to check out a copy of it into the development directory, namely /webroot/web/fisicc-dev/, using the checkout option of cvs.

According to the CVS guide I had to have two tablespaces and two web users for the development and production systems, so after feeding this script to the sqlplus program as the system user, I was finally ready to load the data model.
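
The script referred to above is not reproduced here. As a rough stand-in, the sketch below generates SQL for one tablespace and one user per system. Every name, password and datafile path in it is hypothetical, the 50M initial size is arbitrary, and it assumes the database already has a temporary tablespace called temp.

```shell
#!/bin/sh
# Emit the SQL that creates one tablespace and one user of the same name.
# Usage: make_user_sql NAME PASSWORD DATAFILE
make_user_sql() {
    cat <<EOF
CREATE TABLESPACE $1 DATAFILE '$3' SIZE 50M AUTOEXTEND ON;
CREATE USER $1 IDENTIFIED BY $2
    DEFAULT TABLESPACE $1
    TEMPORARY TABLESPACE temp
    QUOTA UNLIMITED ON $1;
GRANT connect, resource TO $1;
EOF
}

# Hypothetical names, passwords and datafile paths for the two systems.
make_user_sql fisicc_dev changeme_dev /db/fisicc_dev01.dbf
make_user_sql fisicc     changeme     /db/fisicc01.dbf
```

Saved to a file, the output can be fed to sqlplus as the system user in the same way as the other .sql files in this article.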

Following the instructions from the installation guide I managed to have all the data model plus Site Wide Search loaded and without errors. Here's basically what you need to do:

	$ script acs-install.log
	$ cd acs/www/install
	$ sqlplus system/password < load-geo-tables.sql
	$ cd ../doc/sql
	$ sqlplus user/passwd < load-data-model.sql
	$ ./load-site-wide-search user password ctxpassword
	$ exit

Remember to do it for both the development user and the production user. The nice thing about using script is that it lets you check for errors afterwards using something like

	$ cat acs-install.log | grep -i error | more

If you find errors, you screwed up at some point.

Installing AOLServer (chroot'd of course!)

I downloaded AOLServer 3.0 and untarred it in my home dir. I also downloaded the OpenSSL module for AOLServer from Stefan Arentz's site. Compiling the AOLServer was truly a breeze. I just changed the install directory in include/Makefile.global to point to /aolserver and typed make && make install. The nsopenssl tarball has a precompiled nsopenssl.so which I just copied to /aolserver/bin. Now for the chroot part.

Your best friend for installing the AOLServer in a chroot environment is ldd. This nifty utility shows you on which shared libraries a certain executable depends. So armed only with ldd and a statically compiled shell, I started the painful process of having the AOLServer properly run inside the chroot jail.
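
As an illustration of how ldd helps here, the sketch below lists the shared libraries a binary needs and copies them into a jail under the same paths. The function names are invented for this example, and it assumes library paths without spaces; it is not the exact procedure used for this install.

```shell
#!/bin/sh
# Print the absolute path of every shared library BINARY depends on.
# ldd prints lines such as "libc.so.6 => /lib/libc.so.6 (0x...)" and, for
# the dynamic loader, "/lib/ld-linux.so.2 (0x...)".
list_libs() {
    ldd "$1" |
        awk '$2 == "=>" && $3 ~ /^\// { print $3 } $1 ~ /^\// { print $1 }' |
        sort -u
}

# Copy those libraries into JAIL, recreating their directories.
# Usage: copy_libs BINARY JAIL
copy_libs() {
    for lib in $(list_libs "$1"); do
        mkdir -p "$2$(dirname "$lib")"
        cp -p "$lib" "$2$lib"
    done
}

# Example: show what the system shell depends on.
list_libs /bin/sh
```

Something like copy_libs /aolserver/bin/nsd /webroot would then populate the jail, after which the loadable modules need the same treatment.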

Ideally you want only what is strictly necessary inside the chroot jail, namely modules, libraries and the ACS distribution itself. Do not put your Oracle and AOLServer installations inside the jail. This would defeat the purpose of having a chroot jail in the first place, since the configuration files and data files would all be accessible to the web server.

In my case, I had Oracle installed in /oracle and AOLServer installed in /aolserver; the chroot jail was to be set up in /webroot with the pages residing in three different places: /webroot/web/fisicc-dev for the development server, /webroot/web/fisicc-staging for the staging server, and /webroot/web/fisicc for the live production server.

AOLServer is designed to be run outside the chroot jail and instructed to do the chroot itself after it has read the configuration file. This is a smart design since it allows the configuration file, which contains the database passwords, to reside outside the chroot jail and therefore be inaccessible to the running web server. This would be all fine, except that since some stuff is read before the call to chroot and some stuff is read after it, some of the paths don't match. For instance, if AOLServer decides that its home is /aolserver it would be very unhappy to find after the chroot call that /aolserver doesn't exist. The same goes for the Oracle stuff. The problem is finding exactly what files are needed to run the Oracle driver inside the chroot jail.

My procedure for finding this was very sophisticated and took me years to perfect. Here's the general algorithm:

Step 1: Run AOLServer inside the chroot jail
Step 2: If it works, Go to Step 6
Step 3: See where it choked (using strace)
Step 4: Fix it by finding what it tried to open that wasn't there and copying it to where it was supposed to be
Step 5: Go to Step 1
Step 6: Pat yourself on the back
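
Step 3 and the first half of Step 4 can be partly automated. The sketch below pulls out of an strace log every path whose lookup failed with "No such file or directory". The function name and the sample paths are made up, and it assumes the log was produced with something like strace -f -o nsd.trace in front of your usual nsd command line.

```shell
#!/bin/sh
# Read strace output on stdin and print each path whose lookup failed with
# ENOENT ("No such file or directory"), one per line, without duplicates.
# The path is taken to be the first double-quoted string on the line.
missing_files() {
    grep 'ENOENT' | cut -d '"' -f 2 | sort -u
}

# Sample input in the shape strace prints; the paths are made up.
missing_files <<'EOF'
open("/aolserver/modules/tcl/init.tcl", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/passwd", O_RDONLY) = 3
stat("/oracle/network/admin/tnsnames.ora", 0xbffff5a0) = -1 ENOENT (No such file or directory)
EOF
```

With a real trace, missing_files < nsd.trace gives the candidates to copy into the jail. Not every failed lookup is a problem, since programs routinely probe for optional files, so the list still needs a human eye.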

Of course, this method is extremely time consuming and could be greatly accelerated if someone was kind enough to tell you what files are needed, but I guess part of the fun of installing the AOLServer in a chroot'd environment is figuring this out by yourself.

There are a couple more things worth mentioning:

You must copy the file /oracle/network/admin/tnsnames.ora to /webroot/oracle/network/admin/tnsnames.ora and be sure it knows how to speak TCP/IP to Oracle. I know mine does.
Oracle doesn't like to be told that your database name is web.fisicc-ufm.edu and later find that your DNS doesn't know about web.fisicc-ufm.edu. Fix your DNS.
The Oracle Driver uses the IPC mechanism when its DataSource is left empty. Unfortunately, the IPC mechanism does not work inside a chroot environment since the socket used for communication is outside the chroot'd jail. Therefore, you must point your DataSource to your database name. You can take a look at an example configuration file.
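
For reference, a tnsnames.ora entry that reaches the database over TCP/IP rather than IPC looks roughly like the output of the sketch below. The alias and SID "web" and port 1521 come from this article; the host name localhost and the helper function are assumptions, and your own file may carry a domain suffix on the alias.

```shell
#!/bin/sh
# Print a tnsnames.ora entry that reaches the database over TCP/IP.
# Usage: tns_entry ALIAS HOST PORT SID
tns_entry() {
    cat <<EOF
$1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = $2)(PORT = $3))
    (CONNECT_DATA = (SID = $4))
  )
EOF
}

# Alias, SID and port as used in this article; the host is an assumption.
tns_entry web localhost 1521 web
```

The alias on the first line of the entry is the name the DataSource should point at.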

One last tip: if you have a single partition, you can save some space by making hard links instead of copying the files inside the chroot jail.
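
A minimal sketch of that tip, with an invented helper name: it hard-links a file into the jail under the same absolute path it has outside, which only works when both live on the same filesystem.

```shell
#!/bin/sh
# Hard-link FILE into JAIL under the same absolute path it has outside.
# Usage: link_into_jail FILE JAIL
link_into_jail() {
    mkdir -p "$2$(dirname "$1")"
    ln "$1" "$2$1"
}
```

For example, link_into_jail /oracle/lib/libexample.so /webroot (a hypothetical library name) creates /webroot/oracle/lib/libexample.so without using any extra disk space.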

Making it start automagically

If you've reached this point without giving up and deciding that the stupid system isn't really worth it and that FrontPage and the Personal Web Server are really all you'll ever need, you'll find that everything works as advertised. Of course, every time you restart the machine you'll have to manually do a lot of stuff to have it work again.

An alternative would be to have the whole thing come up as part of the usual system startup. This is not too hard, but there are some tricky points. First of all, you'll have to edit the file /etc/oratab to "activate" your database. Otherwise the startup and shutdown scripts for oracle (named dbstart and dbshut respectively) won't work. Once you have done this:

Kill your AOLServer (if you left it running) by issuing the command killall nsd
Stop the Oracle listener: lsnrctl stop
Shut Down your Database: dbshut
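
The /etc/oratab edit mentioned above amounts to changing the last field of your database's line from N to Y, since each line has the form SID:ORACLE_HOME:Y|N. A small sketch, with an invented function name, that does this and keeps a backup:

```shell
#!/bin/sh
# Flip the autostart flag for SID from N to Y in an oratab-style file.
# The previous version of FILE is kept as FILE.orig.
# Usage: enable_oratab SID FILE
enable_oratab() {
    cp -p "$2" "$2.orig" &&
    sed "s/^\($1:[^:]*\):N\$/\1:Y/" "$2.orig" > "$2"
}
```

Run as root, enable_oratab "$ORACLE_SID" /etc/oratab would activate the database created earlier.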

Edit this script to fit your environment and drop it in /etc/rc.d/init.d, run chkconfig --add dbora and try running /etc/rc.d/init.d/dbora start to find out that the Oracle guys messed up the dbstart script and your database doesn't come up. Fix your dbstart script and try again.
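
The script linked above is not reproduced here. The following is only a sketch of what a dbora init script typically contains, not the original: the chkconfig run levels and priorities are arbitrary, and the AS_OWNER override is a hypothetical addition that lets you preview the commands by setting AS_OWNER=echo.

```shell
#!/bin/sh
# chkconfig: 345 99 10
# description: starts and stops the Oracle listener and database

ORACLE_HOME=/oracle      # install directory used in this article
ORACLE_OWNER=oracle      # Unix account that owns the installation
# Command prefix that runs a command line as the Oracle owner.
AS_OWNER=${AS_OWNER:-"su - $ORACLE_OWNER -c"}

dbora() {
    case "$1" in
        start)
            $AS_OWNER "$ORACLE_HOME/bin/lsnrctl start"
            $AS_OWNER "$ORACLE_HOME/bin/dbstart"
            ;;
        stop)
            $AS_OWNER "$ORACLE_HOME/bin/lsnrctl stop"
            $AS_OWNER "$ORACLE_HOME/bin/dbshut"
            ;;
        *)
            echo "Usage: dbora {start|stop}" >&2
            return 1
            ;;
    esac
}

# Act only when called with an argument, as init does.
if [ $# -gt 0 ]; then
    dbora "$@"
fi
```

The two comment lines at the top are what chkconfig --add reads to decide in which run levels to link the script.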

Now that your database comes up automatically, you'll need to add the AOLServer to the /etc/inittab to make it respawn every time it dies. Here's an example of what I have in mine. Maybe you'll also like to see my startup script for the AOLServer.

Security Considerations

Once upon a time people could be really sloppy and just write

Security considerations not addressed in this memo

and get away with it. Not anymore. The Internet is no longer such a nice place.

So how do you secure your installation? Here are some tips.

Drop all the services you won't use

Most Linux boxes come preconfigured to have a wide variety of services available. While this is nice for the novice user, it's also great for crackers. Try editing your /etc/inetd.conf and removing everything you don't need. As a rule of thumb, if you don't know what it does, remove it (but make a copy in case it was really needed). Then check with netstat -a to see which daemons are listening on which ports.
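
A sketch of that routine, with invented function names: one function lists what is still enabled in an inetd.conf-style file, the other comments a service out while keeping a copy of the file as it was.

```shell
#!/bin/sh
# Print the name of every service still enabled in an inetd.conf-style file
# (every line that is neither blank nor a comment).
list_inetd_services() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# Comment out SERVICE in FILE, keeping the previous version as FILE.orig.
# Usage: disable_inetd_service FILE SERVICE
disable_inetd_service() {
    cp -p "$1" "$1.orig" &&
    sed "s/^$2[[:space:]]/#&/" "$1.orig" > "$1"
}
```

After something like disable_inetd_service /etc/inetd.conf telnet, inetd has to reread its configuration (killall -HUP inetd) before the change takes effect.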

Change all the default passwords

A student broke into the university's Oracle database once and tried to change his grades. This amazing hacker used a very sophisticated attack. He telnetted to the machine, logged in using the account oracle (password oracle), and started a sqlplus shell using the database user system (password manager). Why didn't he succeed? The data model was not very well documented and he hadn't read about using SQL, so he didn't know about the describe command in Oracle.

If the administrator for this machine had taken the time to change the default passwords this would not have happened.

Use a packet filter

Why let others connect to your server if you don't want to give them a service? If all you have is a web server, block all traffic to all ports at the IP level using something like IP Chains, IP Firewall, or IP Filter. If you only have a couple of rules, the performance impact is really negligible and the security gain could be enormous. Remember to block port 1521 (used by the Oracle Listener) and port 25 (used by your email program) in such a way that local connections are still allowed. Your AOLServer inside the chroot jail will use these ports to connect to your database, and you don't want this to fail. Also remember to leave port 80 (HTTP) and 443 (HTTPS) unblocked.
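
As a sketch of such a rule set, the function below prints commands for iptables, the later replacement for IP Chains on Linux. That choice of tool is an assumption made for this example, not what was used on this machine, and the function name is invented. It accepts loopback traffic, replies to connections the machine opened itself, and new connections to the listed ports, then drops everything else.

```shell
#!/bin/sh
# Print iptables commands implementing "allow these TCP ports, allow local
# connections, drop the rest" for incoming traffic.
# Usage: firewall_rules PORT...
firewall_rules() {
    echo "iptables -A INPUT -i lo -j ACCEPT"
    echo "iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT"
    for port in "$@"; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
    # Set the default policy last so the ACCEPT rules are already in place.
    echo "iptables -P INPUT DROP"
}

# HTTP and HTTPS stay open; 1521 and 25 remain reachable only through lo.
firewall_rules 80 443
```

Review the output and then pipe it to sh as root. If you administer the machine remotely, add the port for that (for example 22 for OpenSSH) to the list first, or you will lock yourself out.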

obonilla@galileo.edu