Somewhere, a long time ago, someone in a marketing department decided that, from that moment on, there would be no more very large companies. Nor would there be very powerful computer systems running large-scale, complex programs. From now on, there would only be enterprises, enterprise systems, and enterprise software. (Perhaps this was the same person who decreed that software companies would sell "solutions" instead of programs.)
Enterprises, whatever they are, spend a lot of money and, like all organizations that spend a lot of money, they are run by people who would spend less money if they could find a way to do so wisely. Since open source software is often free, people ask, should enterprises use open source software? After all, if you and I can save a few bucks running, say, Linux and Open Office instead of Windows and Microsoft Office, wouldn't a very, very big company save a very, very large amount of money by using free software?
It is natural to wonder why enterprises (whatever they are) continue to spend large amounts of money on expensive mainframe computers, commercial software, systems programmers, applications programmers, and support personnel. Wouldn't it make more sense to use free software running on networks of inexpensive PCs, and let the legions of volunteer open-source programmers around the world do most of the work? How do we make sense of it all?
I'm going to divide our discussion into two parts. In this article, I will lay the groundwork by discussing the questions: What is an enterprise? What is open source software? and What is Linux? I will then explain what makes enterprise computing so unique, and why mainframe computers are so important in such environments.
In the second article, I will use these basic concepts to discuss the question: What role do Linux and open source software have to play in enterprise computing?
Note: This article was written for Technical Support magazine, a publication of the Network and Systems Professionals Association (NaSPA).
The traditional meaning of an enterprise is any company organized for commercial purposes. In that sense, GM, Bank of America and Wal-Mart are all enterprises, as are the Snappy Service Delicatessen, Yvette's French Maids, and Bob & Dave's Barbershop ("Two Chairs, No Waiting"). However, in the IT (Information Technology) world, the term "enterprise" has a different meaning, and the meaning, unfortunately, is often fuzzy. In an attempt to unfuzz an important term, let me offer the following definition.
An enterprise is any organization — commercial or not — that has the following four characteristics:
Size and location: An enterprise is a very large organization, often widely distributed with hundreds to tens of thousands of locations.
Management: An enterprise is organized into divisions or departments, and managed by a large hierarchy, not by a single person or group of people. The hierarchy provides for the short-term and long-term needs of the enterprise, thereby ensuring the continued existence of the organization. With respect to computing, the IT needs of an enterprise are so complex as to be beyond the total understanding of any single person. Thus, they must be managed by a sophisticated combination of human workers and automated systems.
Software: All businesses require the software necessary to administer an organization: accounting, payroll, email, office tools, Web services, backups, and so on. However, an enterprise requires more. First, the administrative software must operate on a very large scale. Second, there are needs that are unique to extremely large organizations. For example, there must be software to support the processing of massive amounts of data (often terabytes per day), data warehousing, data mining, highly distributed transaction processing, wide-area networking, and IT management, as well as data distribution to customers, suppliers, employees, the media, and the general public.
Hardware: An enterprise requires large, complicated, interconnected computing systems that will not fail, degrade or interfere with one another. Because of the enormous expense, such systems must be designed and managed to run efficiently. Moreover, they must be upgraded on a regular basis.
Having defined an enterprise in this way, we can see that any sufficiently large organization qualifies: not only companies, but governments, universities, and large social organizations. (What do GM, Bank of America, Wal-Mart, UCLA, the Vatican, the Red Cross, and the Mormon Church all have in common?)
There are four ways we can define open source software. First, the designation can mean that anyone is allowed to look at the source code for a particular program without having to sign a special agreement. This is the case with, say, Linux, but not with Microsoft Windows. Although this sounds important, most individuals and businesses don't have the time, money, inclination or expertise to examine source code. Within an enterprise, the situation is different: the IT department is perfectly capable of reading source code and knowing what to do with it.
The larger the organization, the more complex the responsibilities of the IT department, creating a more pressing need to understand the details of how their tools work. Being able to look at source code is especially important because, in many cases, the code itself is the best (or only) documentation. Large IT departments do not like depending on proprietary black boxes whose innards are closed to scrutiny. This is why some proprietary software companies release the source code for their programs, even though the products are strictly commercial and may not be modified.
Although it seems obvious, it is worth pointing out that when an enterprise pays someone to write software, the source is available to other programmers within the enterprise. Thus, any software created by contractors or in-house programmers is, by definition, open source within the same organization.
The second definition of open source describes software that can not only be examined, but modified. This is the case with most open source software distributed on the Internet, and it is certainly the case with Linux. For an enterprise with the time, money and expertise to modify complex programs, this type of open source software is especially valuable, as it allows the IT department to integrate publicly available tools into an existing computing structure.
Such freedom, of course, comes with an important cost. Once an enterprise begins using a program, the IT department is responsible for maintaining that program. True, within many open-source communities, there are volunteers who maintain and enhance the software and, as the saying goes, "Given enough eyeballs, all bugs are shallow" (Linus's Law). However, once you start modifying an open source program to use in a proprietary environment, you are on your own when something goes wrong or when you need to apply an update. Thus, IT departments that depend on open source software can find themselves having to support large, complex programs, including all the parts that were not written in-house and were never designed to fit into their unique environment.
The next two definitions of open source software have to do with being "free". When it comes to software, there are two meanings of the word free: no cost and no restrictions. (Or, as Richard Stallman once observed, the difference between free beer and free speech.) A lot of open source software is free in that it is available at no cost. This is attractive, of course, as quality software costs money — at the enterprise level, a lot of money. If an enterprise can avoid paying for software, might it not lower its IT expenses significantly? This is an important question, to which we will return in a moment.
Second, "free" also refers to software that can be used without restriction. This is a crucial distinction because, if necessary, enterprises must be able to modify their software to work within a large, complex computing environment. IT departments have enough to do without worrying about whether a "free" program downloaded from the Internet can be used legally for commercial purposes. Unlike individuals or smaller businesses, people who work in an enterprise have no choice but to follow the law, adhere to government regulations, and work within strict internal guidelines promulgated by their legal department.
In 1991, Linus ("Lee'-nus") Torvalds was a second-year computer science student at the University of Helsinki. Like tens of thousands of other programming students who loved to tinker, Linus had read Andrew Tanenbaum's book "Operating Systems: Design and Implementation", which explained the principles behind the design of Minix, an operating system designed for teaching. As an appendix to the book, Tanenbaum included the 12,000 lines of source code that comprised the operating system, and Linus spent hours studying the details. (At this point, you may want to take a moment to think about what it would be like to read through 12,000 lines of code.)
Like many other programmers, Linus wanted a free, open-source version of Unix. On August 25, 1991, he sent a message to the Usenet discussion group that was used as a Minix forum, comp.os.minix. (In retrospect, this short message is considered to be one of the historical documents in the world of computing.) In the message, Torvalds mentioned he was working on his own free Unix-like operating system and he asked for ideas: What would other people like to see? He recognized that his main job would be to create a brand new kernel (the core of an operating system, responsible for managing resources and providing controlled access to the hardware). Once Torvalds had a kernel, he could couple it with existing software, already developed by the Free Software Foundation, to create a full operating system.
In September 1991, Linus released the first version of his kernel, which was called Linux. Programmers around the world began to join Linus: first by the tens, then by the hundreds, and, eventually, by the hundreds of thousands. By 2005, virtually every niche in the world of computing would come to be occupied by machines that could run some type of Linux, and Linux would be the most popular operating system in the world. (Windows is more widespread, but Linux is more popular.)
So, what is Linux? "Linux" refers to any operating system based on the Linux kernel. There are literally hundreds of such systems, called distributions. By the nature of the Linux license, all Linux distributions are open source and may be modified.
Let's say you work in a small- or medium-sized company. Your computer goes down because of a disk crash or an electrical failure or a network outage. The situation is frustrating, but not fatal: you wait for someone to fix the problem and you reboot. If you lose data and there is no backup, it's aggravating, but you muddle through somehow. Similarly, if your Internet access is down and you can't send email, you are inconvenienced, but all you need to do is wait until your connection starts working again. All of us have had such experiences, both as individuals and as employees or customers.
However, what would happen if an entire ATM network failed? Or an airline, hotel or car rental system were incapable of making reservations? Or a company like Costco or Wal-Mart were unable to process its payroll or reorder inventory? The results would be much more serious than mere frustration, aggravation and inconvenience.
The reason such overwhelming problems rarely happen is that enterprises make demands on their IT departments that are qualitatively different from the requirements of other companies. At its most basic level, maintaining computing resources is a matter of priorities and, for most companies and for individuals, the main priorities are cost and convenience. For an enterprise, the priorities are more diverse and much more expensive and difficult to achieve: total reliability, stability, fault tolerance, security and efficiency. At the same time, enterprise systems require immense amounts of computing power, I/O throughput, data storage and network connectivity.
As a result, enterprise computing is carried out within a very complex — and expensive — environment. There will be many programmers, software architects, testers, tech support people, admins, and managers, working on countless servers, workstations, terminals, and networks. Typically, the entire enterprise will be centered around one or more mainframe computers.
Hold on, did I just say mainframe computers? Isn't the mainframe supposed to be dead by now? And do mainframes really run Linux?
The mainframe era began on April 7, 1964, when IBM announced the System/360, the very first family of general-purpose computers. The first machines were shipped about a year later and, by 1966, they were being used in large numbers. At first, such machines were simply called computers, and the term "main frame" referred to the cabinet containing the CPU (central processing unit). In the early 1970s, however, when the much smaller "minicomputers" were brought to market, the larger, traditional machines began to be known as mainframe computers, a name that, over time, was shortened to mainframes.
For a long time, mainframes were the fastest computers in the world. However, that was incidental. The true goal of the mainframe was to offer — as much as hardware limitations would allow — efficient, well-balanced computing power to support the needs of medium to large organizations. Over the years, the role of the mainframe has evolved, becoming the computer of choice for what we would, today, call enterprises.
Ever since the late 1980s and the rise of PCs and networks, industry commentators have been predicting the imminent decline of mainframe computing. Today, in fact, there are many people who believe that the only reason to use a mainframe would be to run what you might call "legacy" applications: old programs that can't be ported efficiently to networks of PCs. In their eyes, a large organization would be better served by using loosely coupled open systems, database engines, large disk arrays, and application server farms. Indeed, many organizations use just such architectures. The truth, however, is that there will always be a market for extremely powerful, centrally managed computer systems. As a result, better and better mainframes are continually being developed with no end in sight.
To understand the importance of mainframes, you need to look at them from the point of view of an enterprise. Enterprises need much more than raw power (which can be supplied by large farms of networked PCs). As we have discussed, enterprises also require massive I/O, data storage, and networking capabilities in a package that offers a high degree of integration, reliability, fault tolerance and, above all, efficiency. Moreover, as that package scales larger and larger, it must be manageable. Otherwise, the IT department can easily lose control over its domain, resulting in a confusing mish-mash of expensive and inefficient hardware and software configurations. Once an organization gets large enough, a mainframe-centric system becomes an imperative.
If you are called upon to analyze the power of a workstation or a server, you can do so by looking at a few technical specifications: memory size, processor types, disks, built-in ports, and so on. With a mainframe, it's a whole new world. Of course you still look at the specs. Consider, for example, the IBM System z9 "Enterprise Class" computers. They can be configured with up to 54 64-bit processing units, 512 GB of main memory, 60 logical partitions (LPARs), 336 FICON (extremely fast I/O) channels, 256K I/O devices, and Gigabit Ethernet connections — all contained in a device with a footprint of 26.78 square feet (a bit more than 5' x 5').
However, to understand the true flavor of a mainframe, you need to look past the numbers. When you have a moment, take a look at the description of the z9 on the IBM Web site. Specifically, look at the last half of the "z9 Data Sheet", where you will see a long list of features with strange names such as Open Architecture Distributed Transaction Enablement; Concurrent Hardware Management Console and Support Element; Dynamic Memory Sparing; Enhanced Application Preservation; and on and on.
What you are reading is a list of well over 100 hardware and software features developed to meet the needs of an enterprise. More so than any other computer in history, a modern mainframe is capable of consolidating the data processing needs of a huge organization, while simultaneously supporting up to tens of thousands of applications and users efficiently: a computer so reliable that, for practical purposes, it will never go down. This, in a nutshell, is the type of machine around which a typical enterprise builds its IT organization.
If you are used to working with PCs, Macs, LANs and servers, almost none of this will be familiar to you, which is to be expected. And the same can be said for virtually all of the programmers around the world who contribute to Linux and other open source software. They also work with PCs, Macs, LANs and servers. Imagine a 21-year-old volunteer programmer, working on Sunday morning on a PC in his basement in Fargo, North Dakota, creating a small patch for the Linux kernel. Is it possible that the fruits of his labor can be used effectively within an enterprise? Might that patch — and the work of hundreds of thousands of other such programmers — ever find its way onto a massively complex computing system serving a multinational corporation?
The answer to both questions is yes because, the truth is, the mainframe is the ultimate open source computing platform. Indeed, as we will see, the mainframe was actually the very first personal computer, as well as the nexus of the first significant open source community.
We'll discuss these ideas, and a lot more, in the next article: "How Does Linux and Open Source Software Fit Into Enterprise Computing?"
© All contents Copyright 2021, Harley Hahn