Multi-Threading -- The Next Level

The backend-process manager

Every paper you pick up today talks about threading as if we were working in a textile mill. Using the "thread" word is very popular. But those who know threading intimately know that multi-threading is often like a Stephen King novel -- there are things crawling around in the shadows that can bite you. This article exposes the creepy-crawlers of multi-threading, presents a solution in the form of a backend-process manager and delivers the knock-out punch with a set of Open Source projects. (2200 words)

Edward Harned (ed@coopsoft.com)
Senior Developer, Cooperative Software Systems, Inc.
November, 2002

Introduction

Multi-threading is clearly the future.

The IBM Power4 processor architecture (as well as the Power3 and PowerPC) recommends multi-threaded applications for shared storage. Now, with Intel's Hyper-Threading Technology on Xeon and the new Pentium 4 processors, more and more applications are going to need multi-threading.

While Application Servers (JBoss™) certainly lead the way in high-end server computing, there are ubiquitous small server and client machines in desperate need of multi-threading.

This article takes the basic multi-threading structures available today to the next level by making professional quality, Open Source code available to all programmers.

The Thread
You know how to start and end threads. You know how to control a single thread. But do you know how to control a multi-threading environment?

Multi-Threading Has a Shadow

When you start a thread of execution, what you are really doing is starting a backend-process (some operating systems call threads light-weight processes). Think of a backend-process as something taking place in another room of your house. You're sitting in the den and the new thread is working in the basement. What is it doing down there? Is it still alive? What happened to the last request you asked it to work on?

Every backend-process needs to address these common threading issues:

Multi-threading and its shadow have been around for a long time. Individuals and corporations have been answering these questions with in-house and proprietary software for decades. Before we can tackle this list we need to shed some light on the methods of threading.

Threading Methods

There are generally three classes of application threads:

  1. Manual Threads
  2. Thread Pools
  3. Managed Threads

Manual Threads
A manual thread is any thread an application starts and controls itself. The standard Listener thread is the most common manual thread. The application starts the Listener. The Listener registers itself with the System. When there is work, the Listener notifies the application. This is the basic scenario of one thread for one task, clean and simple.
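
As a rough illustration of the Listener scenario, here is a minimal Java sketch. The class name ManualListener, the deliver() method (which stands in for "the System" handing work to the Listener) and the callback are all hypothetical names chosen for this example; they do not come from any particular library.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical manual thread: the application creates, starts and stops it itself. */
class ManualListener implements Runnable {
    private final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
    private final Consumer<String> callback; // how the listener notifies the application
    private final Thread thread = new Thread(this, "listener");

    ManualListener(Consumer<String> callback) {
        this.callback = callback;
    }

    void start() {
        thread.start();
    }

    /** Stands in for "the System" delivering work to the registered listener. */
    void deliver(String event) {
        inbox.add(event);
    }

    /** The application keeps direct control: it interrupts and joins its own thread. */
    void stop() throws InterruptedException {
        thread.interrupt();
        thread.join();
    }

    boolean isAlive() {
        return thread.isAlive();
    }

    @Override
    public void run() {
        try {
            while (true) {
                callback.accept(inbox.take()); // block until there is work
            }
        } catch (InterruptedException e) {
            // interrupted by stop(): fall through and let the thread end
        }
    }
}
```

Because the application holds the Thread object, it can ask whether the thread is alive and shut it down directly. Note that stop() in this sketch discards any events still waiting in the queue.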

Thread Pools
Thread pools are useful for performing background tasks where there are multiple requests for services and the threads do not continually block each other.

Thread pools work well for homogeneous applications. The classic example is a word processor that uses a thread pool for background spell checking, repaginating, etc. Even server side applications are workable as long as the task that each thread does is independent of the other threads.
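
A small Java sketch of the homogeneous case may help. It assumes the java.util.concurrent executor classes; the word-counting task, the pool size of four and the class name IndependentTasks are illustrative choices only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class IndependentTasks {
    /** Counts the words of each document on a shared pool; no task depends on another. */
    static List<Integer> wordCounts(List<String> documents)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is arbitrary here
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String doc : documents) {
                tasks.add(() -> doc.trim().isEmpty() ? 0 : doc.trim().split("\\s+").length);
            }
            List<Integer> counts = new ArrayList<>();
            // invokeAll blocks until every task has finished and keeps the input order
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                counts.add(f.get());
            }
            return counts;
        } finally {
            pool.shutdown();
        }
    }
}
```

Every task here is the same kind of work and none waits on another, which is the situation in which a plain pool is at its best.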

Where thread pools often fail is in the complex application, whether client or server.

The prevalent complex application is one that has multiple components. For instance, to satisfy a request a client application must access three different files. Try to envision the logic necessary for an application to schedule three threads (one for each file access), wait for each thread to finish and concatenate the return objects from these three threads. Now add in the logic to recover from a failure on any one of these threads. Can you see that not having direct access to these threads (as you would with manual threads) may make error recovery most difficult if not impossible?
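
To make that bookkeeping concrete, here is a hedged Java sketch of the three-file request. It uses Future objects from java.util.concurrent (a later addition to Java than this article) as handles on the scheduled work. The class name ThreeFileRequest, the gather() method and the "<unavailable>" placeholder used for recovery are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ThreeFileRequest {
    /**
     * Schedules one task per reader, waits for each, and concatenates the results
     * in order. A reader that fails or times out contributes a placeholder instead.
     * Expects at least one reader.
     */
    static String gather(List<Callable<String>> readers, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(readers.size());
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Callable<String> reader : readers) {
                futures.add(pool.submit(reader));
            }
            StringBuilder out = new StringBuilder();
            for (Future<String> f : futures) {
                try {
                    // the timeout applies to each wait separately, not to the whole request
                    out.append(f.get(timeoutMillis, TimeUnit.MILLISECONDS));
                } catch (ExecutionException | TimeoutException e) {
                    f.cancel(true);              // best effort: interrupt the failed or slow task
                    out.append("<unavailable>"); // hypothetical recovery policy
                }
            }
            return out.toString();
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Even in this small form the application has to track every piece of work, decide how long to wait, and choose what a partial result means. Cancelling a task only interrupts it; a task that ignores interruption still holds its thread.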

Take the above example and put it on a server. The server has a request queue and ten threads in a pool. One of the file access methods has a bug and hangs the thread. The first request comes in and the system schedules three threads (one for each file access). Two complete and one hangs. There are only nine threads left in the pool. Each later request loses another thread the same way, until there are no threads left in the pool. The server is no longer functional.
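
The scenario can be reproduced with a short Java sketch. The class name PoolExhaustionDemo and the stillServing() probe are hypothetical; a latch that is never released stands in for the buggy file access method.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class PoolExhaustionDemo {
    /** Returns true if a healthy task can still get a thread after the given requests. */
    static boolean stillServing(int poolSize, int requests) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch never = new CountDownLatch(1); // never released: simulates the hang
        CountDownLatch hung = new CountDownLatch(Math.min(requests, poolSize));
        try {
            for (int i = 0; i < requests; i++) {
                pool.submit(() -> "file A"); // completes
                pool.submit(() -> "file B"); // completes
                pool.submit(() -> {          // hangs and never gives its thread back
                    hung.countDown();
                    never.await();
                    return "file C";
                });
            }
            hung.await(); // every buggy task that can get a thread is now holding one
            Future<String> probe = pool.submit(() -> "healthy");
            try {
                probe.get(500, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException | ExecutionException e) {
                return false; // no free thread picked up the probe in time
            }
        } finally {
            pool.shutdownNow(); // interrupts the hung tasks so the demo can exit
        }
    }
}
```

With a pool of ten, stillServing(10, 9) finds one thread still free, while stillServing(10, 10) finds none: the probe sits in the queue and is never run.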

Managed Threads
The way to manage threads that are not simple or that do not belong in a thread pool is with a backend-process manager.

The Backend-Process Manager

Backend processing is so common you probably never knew it has a name.

Did you ever place an order at a take-out restaurant? The clerk taking your order is not the person filling that order.

The benefits are:

Front-end processing is the way most software operates.

The disadvantages are:

Some developers try backend processing without the kitchen manager. They create threads (cooks) with no central management. They soon discover that:

Efficient backend software development resembles the well run restaurant. The backend-process manager is like the kitchen manager. The cooks are separate, single purpose threads. Your applications are efficient, scalable and simple.
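
The restaurant analogy can be sketched in Java as follows. This is only an illustration under stated assumptions, not Tymeac's design or API: the class name KitchenManager, the station and recipe terminology, and the "failed" marker are all invented for this example. Each station is a single-purpose worker thread with its own queue, and the manager is the one place that hands out work, waits for it and notices failures.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

/** Hypothetical manager: one single-purpose worker thread ("cook") per station. */
class KitchenManager {
    private final Map<String, ExecutorService> stations = new LinkedHashMap<>();
    private final Map<String, Function<String, String>> recipes = new LinkedHashMap<>();

    /** Registers a station: a dedicated thread with its own queue and one job to do. */
    void addStation(String name, Function<String, String> recipe) {
        stations.put(name, Executors.newSingleThreadExecutor());
        recipes.put(name, recipe);
    }

    /** Takes an order, hands a part to every station, and assembles the results. */
    List<String> order(String request, long timeoutMillis) throws InterruptedException {
        Map<String, Future<String>> pending = new LinkedHashMap<>();
        for (String name : stations.keySet()) {
            Function<String, String> recipe = recipes.get(name);
            Callable<String> task = () -> recipe.apply(request);
            pending.put(name, stations.get(name).submit(task));
        }
        List<String> results = new ArrayList<>();
        for (Map.Entry<String, Future<String>> e : pending.entrySet()) {
            try {
                results.add(e.getValue().get(timeoutMillis, TimeUnit.MILLISECONDS));
            } catch (ExecutionException | TimeoutException ex) {
                e.getValue().cancel(true);
                results.add(e.getKey() + ": failed"); // the manager knows which station failed
            }
        }
        return results;
    }

    /** Stops every station's thread. */
    void shutdown() {
        stations.values().forEach(ExecutorService::shutdownNow);
    }
}
```

A caller would register stations once, for example addStation("grill", r -> r + ":burger"), and then call order("42", 1000) for each request. A production manager needs far more than this sketch offers.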

Essentials
The key to building a backend-process manager is:

That sounds simple enough. At least until we try it. A mission-critical, backend-process manager must answer all the common threading issues, above, and have the ability to handle:

Building this backend-process manager takes a long time and a lot of talented programmers. It also requires a knowledgeable, dedicated staff to maintain the code. This brings us right back to the shadow.

The Shadow, Part II

Processors are so complex only a computer scientist can understand the logic. Instructions come from multi-level caches. They are pre-fetched and executed out of order, their results are stored out of order, and they may even run on separate processor cores. Memory synchronization is critical. Therefore, the programmer must be an expert hardware engineer to write efficient programs. This is why it takes a long time to build a good backend-process manager.

However, most software developers build in-house software for the companies at which they work. They need to understand and solve the company's problems. There are high-level and object-oriented languages to get the software developers away from the machine architecture so they can concentrate on solving business problems.

So, who is going to write the multi-threading, synchronized infrastructure? Who are the people who understand pthread libraries, context switching and such? Usually this is done by outside developers (consultants and/or Independent Software Vendors). Then, who is going to maintain the code? If the outsiders go on to greener pastures then this wonderful, efficient, multi-threaded, synchronized code is now the problem of the business oriented staff. That is the dark cloud of dread for any business unit manager.

The Acceptable Solution

The only truly acceptable approach is Open Source code. This way the standard version is maintainable by a huge base of programmers and individuals may tailor the code to their own applications.

The solution is Tymeac, a set of Open Source projects hosted on SourceForge.net (see Resources for downloading):

TymeacSE for the Java Standard Edition.
TymeacME for the Java Micro Edition.
(the original, TymeacTS for the IBM CICS Platform)
(Both .Net and C/C++ versions are in progress.)

Tymeac is a full-feature, backend-process manager that is the culmination of decades of experience with threads in a multitude of industries.

Tymeac handles all the problems of a multi-threading environment both as a server and as an embedded queuing and threading structure for clients. Naturally, it comes complete with extensive documentation.

Conclusion

Multi-threading software is necessary to take advantage of the IBM "Power series" of processors and the Intel Hyper-Threading Technology.

A backend-process manager is necessary to build effective multi-threaded software.

Open Source is the best way to build a backend-process manager.

Tymeac is the best Open Source, backend-process manager money can't buy.

Resources

About the Author

Since his academic introduction to queuing and subtasking, Edward Harned has been actively honing his multi-threading and multi-processing talents. He first led projects as an employee in major industries and then worked as an independent consultant. Today, Ed is a senior developer at Cooperative Software Systems, Inc., where, for the last five years, he has used Java programming to bring backend-process solutions to a wide range of tasks.