First draft: July 15, 1998

This is presently a working draft. The facts, such as file load times, are correct, but we certainly reserve the right to change our approach upon further testing. Note also that Corel Draw is used as an example case below, although licensing issues still prevent simple deployment of that particular application.

Waterloo Polaris Application Filers

Waterloo Polaris Phase I was primarily aimed at bringing the user environment up to modern standards so users could run the latest applications. While Phase II may appear focused on the backroom servers, it too will significantly enhance the user environment in several ways.

A general roadmap of Phase II was circulated in early 1998. Several server changes have already been made; the web and Email servers, for example, are online and in use. Further documentation will be forthcoming concerning proposed user file storage, shared file storage, and home/residence features. This document focuses solely on application file serving.

The Waterloo Polaris Phase II Application File Server (appfiler) is an extension of the traditional read-only fileservice Watstar has supplied for years. Large collections of applications software will be supplied to user stations by the appfiler, albeit with some significant differences from the past. Increasingly, Waterloo Polaris will be the glue between standard PC desktops and industry standard fileservers.

The Problems to Solve

The primary issues driving the migration to appfilers are the ever-increasing need for application filespace and a continuing strategy of economy.

Whether using Watstar or Waterloo Polaris, every few years we require significant increases in space to hold popular PC software of the day.

We initially configured Waterloo Polaris workstations with almost 1GB of local application hard disk space, and a networked $SYSTEM disk which is also about 1GB, resulting in roughly 2GB of application storage space. These choices were tradeoffs of affordability versus requirements at the time. However, it is becoming increasingly difficult to live within these confines.

Several essential applications are now expecting 600 MB to 1GB for themselves, and in a year, those numbers will seem conservative. If we are to continue using industry standard software like Corel Draw, AutoCad and MS Office, we need a new plan.

Traditionally, we have offset client application storage costs by using networked read-only storage space. This strategy means upgrading a few servers rather than upgrading every workstation. Indeed, our current gigabyte $SYSTEM disk could be expanded to 2GB (the unchangeable Watstar account size limit), but that creates complications. First, most servers would have to be upgraded because they lack sufficient spare space, which would require roughly $35,000 for campus-wide hardware upgrades. Secondly, even 2GB won't last more than a year, as it would be only a 30% increase in effective application filestorage space. Lastly, the Watstar server strategy is optimal for smaller disks, but it loses efficiency as we approach the theoretical account size limit.

Another driving issue is the problem of the dept command, which traditionally assisted users enrolled in courses from more than one faculty or department. The dept strategy relinks the K: disk from one faculty to another as appropriate to gain access to their software. Dept was fine for the DOS-based world, but it is insufficient under Windows 95.

Finally, it would be nice not to be locked into a particular client disk size. We initially configured Waterloo Polaris for a minimum 1 GB disk and suggested people buy 2 GB; now 4 to 6 GB drives are becoming affordable. An ideal strategy would optimize for any disk size: it would extend the life of the 1 GB disks to reduce hardware costs and labour, but would also exploit any additional disk space, maximizing the benefit of superior hardware.

The plan for application storage is designed to solve these problems.

The Plan

As with other parts of the Phase II plan, the strategy is to leverage today's switched networks and new hardware and software options toward per-faculty-centralized solutions. This will give us the flexibility to solve the stated problems, and will prove more cost effective over the long term.

We will start with the creation of appfilers, special high performance and scalable file servers which deliver application files to large segments of each faculty. Much like the faculty Email and Web servers, appfilers will be superservers compared to today’s systems.

The primary criteria for these devices will be:

- high performance, read-only file service
- scalability to much larger client counts
- large, expandable filespace
- industry standard hardware and protocols

There will be far fewer appfilers than we have Watstar servers today, changing the client-to-server ratio from about 20:1 to 100:1 or higher. The new ratio will underscore the need for server scalability, but will relieve the faculties and departments of upgrading many little fileservers and their disks every few years, and will ease our task of coordinating those many upgrades.

The increased client count per server will be offset by reduced responsibilities. Appfilers will generally only serve applications and appear read-only. In contrast, today's Watstar servers usually do extra duty as userspace fileservers and much more. Furthermore, server hardware and client/server software and filesystem optimizations will improve each appfiler's ability to respond quickly to user read-only requests.

Appfilers will offer much larger filespaces than their Watstar counterparts (probably 10 GB initially) and will continue to grow as needed. This change will increase our flexibility to deliver the applications users want, and to make available the many option packs which have sometimes been avoided to save space. This is another example of an economy of scale not available with the highly distributed model.

The appfilers will be faculty level devices: Engineering, for example, might have a small farm of them, operated by Engineering Computing, while one or two might suffice for other faculties. The exported filesystems on all appfilers (except the staging server) will be identical across campus, not unlike the current J: $SYSTEM disk.

In addition to general purpose applications, appfilers will export a subdirectory tree which will house all specialized faculty and departmental software. For example, \eng\civ would be fully controlled by the Civil Engineering computing staff. This hierarchical strategy will eventually replace both the need for the K: departmental disk and the current dept command. Faculties and departments will enjoy several new features with this arrangement, including improved performance in non-local labs, and the introduction of a staging area so they will no longer install directly on their users' production filespace.

There are currently about 60 Watstar servers distributed across campus. For several terms their use will likely not change. Beyond that, they will continue to operate as boot servers and printer queue managers, and to perform other local tasks. We may eventually change them to a different OS and different protocols, but the existing hardware will usually be sufficient. So while we will probably not be decommissioning local servers, faculties and departments are not likely to incur additional costs upgrading their hardware and disks in the coming years.

Finally, we will move to a workstation file cache for the biggest and most frequently used application files. This strategy will be similar to the watcache feature we used before Windows 95 arrived; it will allow small 1GB disks to remain in service for a prolonged period, while offering improved performance to those who have purchased larger hard disks. Any given lab can be optimized for its intended community: MFCF's Cygnus lab, for example, would probably cache parts of the general software and MFCF's own part of the tree. This caching strategy allows continuous growth of applications while extending the life of client disk hardware.

Appfiler Hardware and Server OS Configurations

The basic requirements are listed under 'The Plan'. Several competing technologies are being tested this summer. Hard numbers and experience (with updated server software) will determine our recommendations.

Given the Waterloo Polaris client disk cache feature, client to server ratios of at least 100:1 seem easily possible with several commercial products. Of course, we need to test to determine how far we should realistically push the ratio.

Faculties will want to consider budgeting for appfilers if possible. Hardware specifics will be made available as tests conclude; they are too detailed for this general directions statement.

Departments within faculties will not require additional hardware.

When users log onto Waterloo Polaris, the login client application will automatically connect to an appfiler. In faculties with several appfilers, the workstation will prefer a server on its own IP subnet, but will just as easily connect to another appfiler if there is an outage, or as part of a load distribution mechanism. There is no requirement for server clustering products; this behaviour will be intrinsic to our client software. It will typically mask server downtime (scheduled or not) and should not prevent a user from using the computer, though obviously there may be a performance hit on the remaining appfilers.
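
To make the selection behaviour concrete, the sketch below shows roughly the logic the login client could follow. The server names, the probe port, and the helper functions are illustrative placeholders, not the actual Waterloo Polaris implementation.

    import socket

    # Hypothetical appfiler list for one faculty; names are placeholders.
    APPFILERS = ["appfiler1.eng.uwaterloo.ca", "appfiler2.eng.uwaterloo.ca"]

    def same_subnet(addr_a, addr_b, netmask="255.255.255.0"):
        """True if two IPv4 addresses fall within the same subnet."""
        a, b, m = (list(map(int, s.split("."))) for s in (addr_a, addr_b, netmask))
        return all((x & k) == (y & k) for x, y, k in zip(a, b, m))

    def reachable(host, port=445, timeout=2.0):
        """Cheap availability probe: can we open the file service port?"""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    def choose_appfiler(my_addr):
        """Prefer an appfiler on our own subnet; otherwise fail over to
        any reachable one."""
        candidates = []
        for name in APPFILERS:
            try:
                candidates.append((name, socket.gethostbyname(name)))
            except socket.gaierror:
                continue                  # skip unresolvable servers
        # Same-subnet servers sort to the front of the list.
        candidates.sort(key=lambda c: not same_subnet(my_addr, c[1]))
        for name, _addr in candidates:
            if reachable(name):
                return name
        raise RuntimeError("no appfiler reachable")

Trying same-subnet servers first keeps traffic local; falling through the rest of the list supplies the outage and load distribution behaviour described above.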

Appfiler Filetrees Explained

Most appfilers will do nothing except export the new single large filetree in read-only mode, and accept updates overnight when new applications are distributed.

The single filetree means that software is visible no matter who is logged on, or where. The advantages of this strategy may not be obvious, but will make users’ lives much easier as they study courses in a variety of disciplines.

The filetree at this point looks like:
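
A representative slice (other faculties and departments would follow the same pattern) would be:

    \gen          general purpose applications, identical campus-wide
    \eng          Engineering faculty software
    \eng\civ      Civil Engineering departmental software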

Faculty and departmental subtrees will be entirely under the control of the appropriate subunit. The only consideration will be disk requirements, as the replication will cost other faculties a small amount of money/disk space. However, several server technologies allow distributed filesystems or re-exportable filesystems (since this is readonly data), thus offering some relief if a particular department seems exceedingly rich with other people’s budgets.

There will be staging areas so that no-one works on the live filesystem. Engineering is hosting the /eng and /gen staging areas, and the subtrees underneath them. We will have to co-ordinate with other faculties to arrive at a reasonable solution for their subtrees. A departmental administrator need only log in with the staging tree attached to add or change applications for their users. Like today's strategy, there will be control mechanisms which allow administrators to push out changes on their portions of the tree. This strategy gives one time to install software carefully, and to revert if necessary. Typically the updates will be scheduled for overnight distribution, but sometimes emergencies will require other options.

The software distribution strategy will use open protocols. To the servers, the updates will appear as ordinary file requests. To the administrator, there will be something Waterloo Polaris-specific which invokes the update.
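
As a rough sketch of what an overnight push could amount to, the fragment below mirrors a staged subtree onto each appfiler's live tree using nothing but ordinary file operations. The paths, the server list, and the mirror routine are hypothetical, standing in for whatever Waterloo Polaris-specific tool actually invokes the update.

    import filecmp
    import shutil
    from pathlib import Path

    # Illustrative locations; the real layout is decided per faculty.
    STAGING = Path(r"\\staging\eng\civ")
    LIVE_TREES = [Path(r"\\appfiler1\apps\eng\civ"),
                  Path(r"\\appfiler2\apps\eng\civ")]

    def mirror(src, dst):
        """One-way mirror: copy new or changed files to the live tree,
        then remove anything that has disappeared from staging."""
        dst.mkdir(parents=True, exist_ok=True)
        staged = {p.relative_to(src) for p in src.rglob("*") if p.is_file()}
        for rel in staged:
            target = dst / rel
            if not target.exists() or not filecmp.cmp(src / rel, target, shallow=False):
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src / rel, target)
        for p in list(dst.rglob("*")):
            if p.is_file() and p.relative_to(dst) not in staged:
                p.unlink()

    def overnight_push():
        """Scheduled update: push the staged subtree to every appfiler."""
        for tree in LIVE_TREES:
            mirror(STAGING, tree)

Because the servers see nothing but file requests, any fileserver speaking an open protocol can participate without special distribution software.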

Client Side Caching

Client side caching dramatically improves workstation performance when running applications installed on the appfiler. Small subsets of the appfiler tree are replicated on the local hard disk and subsequently accessed locally rather than from the appfiler.

This strategy significantly boosts client performance while dramatically reducing the server effort expended per client. It also lengthens the usable lifetime of client hard disks without restricting application availability. This is another win-win solution.

The local computing unit will decide which portions of the total tree are most likely to benefit its clients, and the bulkier or more frequently loaded files from that software will be copied to the workstation’s cache directory. Non-cached applications will still work, but will obviously incur some performance hit. The result is user flexibility without excessive hardware investment or upgrade cycles.

When one attempts to run Corel Draw 7, for example, only about 15 MB of the 600 MB package needs to be local to achieve near-local performance, because Corel Draw 7 relies primarily on one EXE and two DLL files which total about 15 MB. Since vendors design for the physical memory sizes of their customer base, it is rare for an application to require more than 10 to 20 MB of local files to give excellent startup performance. That number will tend to grow in step with popular workstation memory sizes.

The technology works as follows. When an application asks for a file from the appfiler tree, the workstation first checks its local cache directory; if the file has been replicated there it is opened from the local disk, and otherwise the request falls through to the appfiler. The cache is populated from the list of bulky, frequently loaded files chosen by the local computing unit.
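
In outline, the lookup could resemble the sketch below; the cache directory, the appfiler mount point, and the helper function are placeholders rather than the actual client internals.

    from pathlib import Path

    # Placeholder locations; actual names are not specified in this plan.
    CACHE_ROOT = Path(r"C:\POLARIS\APPCACHE")   # local cache directory
    APPFILER_ROOT = Path(r"\\appfiler\apps")    # mounted appfiler tree

    def open_app_file(relative_path, mode="rb"):
        """Serve a file from the local cache when it has been replicated
        there; otherwise fall through to the read-only appfiler."""
        cached = CACHE_ROOT / relative_path
        if cached.is_file():
            return open(cached, mode)                     # local disk speed
        return open(APPFILER_ROOT / relative_path, mode)  # network fallback

Since the tree is read-only and changes only through the scheduled overnight updates, cache consistency is a far simpler problem than it would be for writable data.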

Obviously, a client station with four gigabytes of spare disk can cache more files locally and will usually achieve better performance than a one gigabyte client. This approach preserves hard disk investments far longer than a constant upgrade program would, and is therefore more sustainable.

Perspective

From its inception, a stated goal of the Waterloo Polaris project was to use industry standard technologies wherever possible.

The appfiler component of our Phase II solution combines:

- industry standard fileservers and switched networks
- off-the-shelf server hardware and operating systems
- standard PC desktop software
- Waterloo Polaris client software, including client-side caching

Also as promised, Waterloo Polaris is increasingly becoming the glue which connects these various off-the-shelf products.