Operating Systems Insights: March 2012

Thursday, March 29, 2012

Process Management in CP/M

So far we've seen the basics of I/O and memory management in CP/M; now it is time for process management. As CP/M is a single-process operating system, its process management is quite limited. Still, even with this limited functionality, there are several issues regarding processes that the operating system must handle. Generally, a program that is being executed is called a process. So, first we will take a brief look at program execution, and then we will move on to process management.

Executing an application program:

To execute a program, its executable code must be loaded into memory. Usually, this loading is done by the CCP (through its loader). The CCP does its work by making BDOS calls only. The CCP is a loyal friend of the BDOS: it never calls the BIOS or the hardware directly, and always reaches them through the BDOS. When a set of characters is entered at the CCP prompt, the CCP first checks whether it is a built-in command. If it is, the command is executed directly, since built-in commands are part of the CCP itself. If it isn't, the CCP looks for an executable program file on the disk by that name. If such a file is found, its executable code is loaded into memory, and the execution of the program begins.
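
The lookup order described above can be sketched in a few lines of Python. This is only an illustration, not CP/M's actual code: the function name `ccp_dispatch`, the modelling of the disk as a set of file names, and the `.COM` suffix for executable files are assumptions made for the example, and the list of built-ins is the one commonly documented for CP/M 2.2.

```python
# Hypothetical model of the CCP's command lookup; not CP/M's real code.
BUILT_INS = {"DIR", "ERA", "REN", "SAVE", "TYPE", "USER"}  # assumed CP/M 2.2 set

def ccp_dispatch(command_line, files_on_disk):
    """Return a (kind, name) pair describing what the CCP would do."""
    words = command_line.strip().upper().split()
    if not words:
        return ("empty", "")
    name = words[0]
    if name in BUILT_INS:
        # Built-in commands are handled inside the CCP itself.
        return ("built-in", name)
    program = name + ".COM"  # assumed suffix for executable files
    if program in files_on_disk:
        # Otherwise the CCP looks on the disk for a program of that name.
        return ("load-and-run", program)
    return ("not-found", name)
```

For example, `ccp_dispatch("dir", set())` reports a built-in command, while `ccp_dispatch("ws letter.txt", {"WS.COM"})` reports that `WS.COM` would be loaded and run.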

Command Processing Via CCP:
                        In CP/M, the CCP is a program, pretty much like any other program. The CCP is better structured than most programs, as it only uses OS system calls and never bypasses the operating system/BDOS. In other operating systems, a component similar to the CCP is sometimes called a shell, command interpreter, or command line interface. In CP/M, a user can invoke a built-in CCP command by typing it, or run a program by typing the name of its executable file. The CCP is configured to load at a high memory location, just below the BDOS. If a program starts executing and there is not enough free memory for it, the CCP gives up its own memory to make room for the program. When a program finishes execution, it exits by returning control to the BDOS. The BDOS then checks whether the CCP is still in memory. If it is, the BDOS returns control to the CCP. But if the CCP is no longer in memory (because it gave up its space to the program, as mentioned above), then the CCP is reloaded into memory and given control.

Basic “MULTITASKING” In CP/M:
                        Even in the early days of computing, users wanted to do some work in parallel. The most common request was to print data and edit another file on the computer at the same time, because in those days printers were very slow. It could take 30 to 60 minutes to print out a few pages, and this time was obviously wasted while the computer did nothing but print. Most users would start the print job and then go do something else, like eating, reading, or visiting a friend to discuss how many balloons it would take to hover a cat (come on, what are the odds? There must be at least one user who did this). But if something went wrong, the user would return after a long time to find that the printer needed attention and most of the printing hadn’t been done. Well, this is obviously keyboard-pounding frustrating! And it used to happen quite often.
                        Gary’s solution was a background printing process. In CP/M, when a user gives the command to print a file, a small program is loaded into memory. Let us call that program the “print handler”. This print handler is loaded at the highest location in memory, just underneath the OS. It initialises itself and then gives control back to the CCP, allowing another program to be run. The background printing process usually gets control when the CPU is idle, when the foreground process makes a system call, or when a timer it has set interrupts the foreground process. Whenever it gets control, the background process prints a little bit of the file, a line or two.
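
The idea of printing “a line or two whenever it gets control” can be sketched with a Python generator. This is a toy model under stated assumptions, not how CP/M implemented it: the names `print_handler` and `run_foreground` are made up, and handing over control after every foreground step stands in for the system calls or timer interrupts mentioned above.

```python
# Hypothetical sketch of the "print handler" idea; names are illustrative only.
def print_handler(lines, lines_per_turn=2):
    """Generator that 'prints' a few lines each time it is given control."""
    for start in range(0, len(lines), lines_per_turn):
        yield lines[start:start + lines_per_turn]

def run_foreground(work_steps, handler):
    """Run foreground steps; after each one, hand control to the handler."""
    log = []
    for step in work_steps:
        log.append(("foreground", step))
        chunk = next(handler, None)  # stands in for a system call or timer tick
        if chunk is not None:
            log.append(("printer", chunk))
    return log
```

Calling `run_foreground(["edit1", "edit2", "edit3"], print_handler(["a", "b", "c"]))` interleaves the editing steps with small bursts of printing, which is exactly the appearance of doing two things at once.
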
                        Even though the computer was not truly doing two things at once, this background processing gave the appearance that it was, something called multi-tasking. And users liked this idea of doing two things at the same time.

Monday, March 26, 2012

Overlays In CP/M

                        The concept of “overlays” is simple in CP/M, but in more advanced operating systems it gets a little trickier. The maximum amount of memory in CP/M was constrained by the amount of memory that the processor could address. Initially this was 64 KB (with the Intel 8080, 8085, and Zilog Z-80), but some processors introduced later allowed more memory to be addressed. But what if a program would not fit in the available memory space? This problem has been an issue almost since the birth of electro-mechanical computing.
                        A human can solve this kind of problem very easily: if the available space is small, bring in only what is needed at a time. But computers only do what they are told to do. So it has to be told: a program that manipulates a large amount of data brings only the required data into memory, and the remaining data is kept on the disk until it is needed. The same is done with programs that have large binary code: only the required code is brought into memory. These parts “overlay” each other at the same location in memory, and are called overlays.
                        Programs with a large amount of binary code have to be divided into different parts. The main part of the program is always memory resident. The programmer (while conceiving the program) has to identify the parts of the program that should be grouped together in an overlay. While designing this, it is important to avoid one overlay calling a different overlay that would take the place of the first one in memory. The actual loading of the overlays is done by the programming language’s runtime library, which uses CP/M system calls to load an overlay. The programmer has to tell the compiler which parts of the program (which functions and procedures) should be in each overlay. Then the compiler produces the loadable overlay code. In the following diagram, the program has one main part and three overlays. Only one of the overlays will be in memory at any particular time.
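
To make the “only one overlay in memory at a time” rule concrete, here is a small Python sketch. It is a hypothetical model, not a real CP/M runtime library: the class `OverlayManager`, its attribute names, and the use of Python dictionaries in place of code loaded from disk are all assumptions for illustration.

```python
# Hypothetical overlay manager; the class and its names are assumptions.
class OverlayManager:
    """Keeps at most one overlay in the shared overlay region at a time."""

    def __init__(self, overlays):
        self.overlays = overlays  # overlay name -> dict of callable routines
        self.resident = None      # name of the overlay currently "in memory"
        self.loads = 0            # how many times an overlay was "read from disk"

    def call(self, overlay, routine, *args):
        if self.resident != overlay:
            # Loading replaces whatever overlay occupied the region before.
            self.resident = overlay
            self.loads += 1
        return self.overlays[overlay][routine](*args)
```

Repeated calls into the same overlay cost one load, while alternating between two overlays costs a load on every switch, which is why the grouping of routines into overlays matters.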

Saturday, March 24, 2012

Memory Management In CP/M

The basic memory management in CP/M is quite simple. All programs are loaded at a fixed address in memory. In CP/M, programs are basically divided into two parts: the program executable code, and the fixed data (the values that don’t change, such as constant values in the program, character strings, and so on).
A piece of software that loads a program from disk into memory is called a loader (actually, it can load from anywhere into memory). In CP/M, this loader is a part of the CCP, and it loads the program executable code and the static data into memory. A program also needs some more space in memory to store temporary data, like temporary variables used in the program, space to pass parameters to subroutines*1, and some more stuff like that which you don’t know yet. The memory space that programs use for storing such temporary data is called the “stack”. In CP/M, this stack is placed at the highest location in memory, right underneath the OS itself (nothing has a higher location than the CP/M OS). This placement allows the stack to grow (when more temporary data is placed in it) towards the lower locations of memory. After looking at the following diagram, you will understand everything, but don’t look at it till you get there while reading, so just keep reading. If memory is available below the stack, then it can grow with no problem. But suppose the memory locations below the stack are already occupied, then what? Well, just like when two interstellar objects come too close, they collide, causing trouble for the objects nearby. In the same manner, in CP/M, when the stack tries to occupy a memory location that is already occupied by something else, a “collision” takes place at that memory location.

                   Unfortunately, CP/M did not have any method for detecting a collision between the stack and the fixed data. Such a collision usually crashes the program or produces strange results/errors (not like a disturbance in the gravitational field, but still strange enough). This happens because the processors used with CP/M don’t have memory management registers for memory protection (these processors were the Intel 8080, 8085, Zilog Z-80, and compatibles). Such memory over-writing bugs were difficult to find and to fix, and they occurred quite frequently on CP/M systems.
                   There is also a large pool of memory that can be dynamically allocated and returned. This pool is called the “heap”. It is created for programs written in high-level programming languages*2 (like Pascal or BASIC). The heap is set aside by the loader, but it is managed by routines from the runtime libraries of the high-level language. Not all programs use a heap. If a program does use one, it is allocated between the fixed data and the stack.
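
The silent nature of such a collision is easy to show with a toy model. The sketch below is only an illustration: the twelve-cell “memory”, the cell labels, and the function name `push_all` are invented, and the point is simply that nothing checks whether the stack pointer has run into the fixed data.

```python
# Toy model of a stack growing down into fixed data; all sizes are made up.
def push_all(memory, stack_top, values):
    """Push values onto a downward-growing stack with no bounds check."""
    sp = stack_top
    for value in values:
        sp -= 1
        memory[sp] = value  # silently overwrites whatever was there before
    return sp

# Four cells of code, four of fixed data, four free cells for the stack.
memory = ["code"] * 4 + ["data"] * 4 + [None] * 4
sp = push_all(memory, len(memory), range(6))  # six pushes, only four free cells
```

After the six pushes, two of the `"data"` cells have been replaced by stack values and no error was raised, which mirrors the hard-to-find bugs described above.
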
                   The program header is located in memory immediately before the executable binary code of the program being executed. The program header contains pointers to the memory addresses where the stack is located and where the fixed data is located. It also contains one more pointer, which points to the strings that are passed as parameters to the program when the user types the command and supplies arguments to it.
                   The reason CP/M was loaded at the highest location in memory was the difference in hardware. Just like today, the computer systems of the CP/M period did not all have the same amount of memory. Some computer systems might have 32 KB of memory (RAM), others 48 or 64 KB. So CP/M was configured to occupy the highest locations of memory, leaving a fixed address (always 100 hex) at which to load programs. If the OS becomes larger, it starts at a lower address in memory, but it doesn’t force any program to change its addresses, although a user program might get less memory space in which to run. This also means that when the OS is upgraded to a new version, it is not necessary to re-link all the application programs.
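
The effect of placing the OS at the top of memory can be checked with some simple arithmetic. The sketch below is illustrative only: the function name `memory_map` is made up, the OS size is the rough BIOS + BDOS + CCP figure of about 7,500 bytes used elsewhere in this series, and it assumes the CCP stays resident while the program runs.

```python
# Illustrative arithmetic only; the OS size is a rough, assumed figure.
LOAD_ADDRESS = 0x100          # programs always start here
OS_SIZE = 2000 + 3500 + 2000  # approximate BIOS + BDOS + CCP, in bytes

def memory_map(ram_kb, os_size=OS_SIZE):
    """Return (os_start, program_space) for a machine with ram_kb KB of RAM."""
    ram_bytes = ram_kb * 1024
    os_start = ram_bytes - os_size           # the OS sits at the top of memory
    program_space = os_start - LOAD_ADDRESS  # everything from 100 hex up to the OS
    return os_start, program_space

for kb in (32, 48, 64):
    os_start, space = memory_map(kb)
    print(f"{kb} KB RAM: OS starts at {os_start:#06x}, {space} bytes for programs")
```

Whatever the RAM size or OS size, `LOAD_ADDRESS` never changes; only the space left for the program does.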

Wednesday, March 21, 2012

Disk Management, and File System In CP/M

                   The major task of CP/M was to provide a standard and portable file system. About 70 to 75% of the OS system calls were related to disk management and the file system. (And the rest of the OS system calls were bypassed most of the time :D)

The Disk System:
                   Author’s Note: Before reading this post, users are kindly advised to read <<8 inch floppy disk>>. Or else, the information given in the following post might prove blinding.
                   In CP/M, there are two hardware components behind the file system: the disk drive and the disk controller. The disk drive holds and rotates the floppy disk*1. The disk controller (the device controller*2 for the disks) is usually placed on the motherboard. The disk drive has a magnetic head, which moves from track to track to read and write data. To read or write data on a particular sector of a particular track, the head must be moved to that track and then wait for that sector to come under it (unless it is already there).
                   When a user gives a read or write command to CP/M, CP/M commands the disk controller to do the task. After receiving the command, the disk drive moves the head towards the target. Sometimes, the magnetic head might miss the targeted track (it can go to the wrong track). The controller notices this and repositions the magnetic head correctly (each sector and track has its number recorded on it). Sometimes, the same thing might happen with a sector. All these activities are invisible to the BIOS and to CP/M. They are carried out by software that is embedded in the device controller’s ROM. This type of software is often called firmware.
After being formatted, such an 8” floppy disk contains:
77 (tracks) * 26 (sectors per track) * 128 (bytes per sector) * 2 (sides) = 512,512 bytes.
Disk formatting is the process of writing the control information on the disk, by an operating system.
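
The capacity figure above is easy to verify. The short Python check below uses only the numbers given in the formula; the variable names are my own.

```python
# Geometry of the 8-inch floppy as given in the text.
TRACKS, SECTORS_PER_TRACK, BYTES_PER_SECTOR, SIDES = 77, 26, 128, 2

sectors_per_side = TRACKS * SECTORS_PER_TRACK         # sectors on one side
bytes_per_side = sectors_per_side * BYTES_PER_SECTOR  # bytes on one side
total_bytes = bytes_per_side * SIDES                  # both sides together
print(f"{sectors_per_side} sectors per side, {total_bytes:,} bytes in total")
```
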
The File System of CP/M:
                   Explaining a file system, in general or in relation to any particular OS, isn’t easy. But trust me; I’ll explain CP/M’s file system in a very easy manner. CP/M’s file system is built on top of the BIOS. In CP/M’s file system the disk is divided into three parts: the disk boot area, the file directory area, and the data storage area. The BIOS has a built-in table that holds the size of each of these areas. A typical CP/M file system layout is shown in the following diagram.

Disk Boot Area:

                   The first part of CP/M’s file system is the disk boot area. It contains the OS binary code used for booting. This area is not visible from the file system (and so not from anywhere else either). Simply put, this binary code is not part of any file, and it is invisible to normal users. Just like the Men In Black, MIB, this code is invisible, but does its job promptly. The binary code of the OS contains loadable images of the BIOS, BDOS, and CCP. These images are written in the disk boot area sector by sector, track by track, starting at track 0, sector 1. Usually the BIOS is around 2000 bytes, the BDOS is around 3500 bytes, and the CCP is around 2000 bytes. So, all of them merrily fit together in the first three tracks of the floppy.
                   When the computer system is turned on, a small program stored in a ROM chip on the motherboard starts running. This program first initialises all aspects of the computer hardware, then copies the OS executable images (binary code) from the disk to memory (RAM), and then starts CP/M. This process is called OS loading or, more commonly, booting. Nowadays, a program that bears an uncanny resemblance to this ROM program is called a bootstrap loader.
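
The “first three tracks” claim can be checked from the approximate sizes quoted above. This is just arithmetic on the rough figures in the text, with variable names of my own choosing.

```python
import math

SECTORS_PER_TRACK, BYTES_PER_SECTOR = 26, 128
BIOS_BYTES, BDOS_BYTES, CCP_BYTES = 2000, 3500, 2000  # rough sizes from the text

track_bytes = SECTORS_PER_TRACK * BYTES_PER_SECTOR    # bytes on one track
image_bytes = BIOS_BYTES + BDOS_BYTES + CCP_BYTES     # the whole OS image
tracks_needed = math.ceil(image_bytes / track_bytes)  # whole tracks required
print(f"{image_bytes} bytes of OS need {tracks_needed} tracks of {track_bytes} bytes")
```
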

File Directory Area:
                   In CP/M, the size of the file directory area is fixed and recorded in a table in the BIOS. For an 8” floppy disk the file directory area holds up to 64 entries of 32 bytes each. A “file directory entry” is shown in the following diagram:

User Number: This is actually a group number from 0 to 15, which allows multiple users or groups to share a disk and collect their files into a group. The tricky thing here is that there are no actual sub-directories. Technically, all files are in one directory, and the group numbers give an illusion of single-level sub-directories. So that makes them “virtual sub-directories”.
File Name and File Type: These two items can be considered as one item. A file name can consist of 1 to 8 characters (kill or killbill, but not killthebill). A file type can consist of 0 to 3 characters (actually, it’s the job of the OS and the application to set the file type). The file name and file type are separated by a period (that’s what the computer nerds call the full stop). So, depending on the type of file, we get killbill.doc (a text document describing the various methods for killing the bill). This is something that we can still see in today’s operating systems. Sometimes, the file type is called the “extension”. Some special characters are not allowed in a file name in CP/M, including the period and the blank space. Microsoft Windows XP doesn’t allow the following characters: / \ : * ? " < > |
Extent Counter: An “extent” is the portion of a file controlled by one directory entry. If a file takes more blocks than one directory entry can point to, then that file is given additional directory entries. The extent counter is set to zero for the first part of the file, and then numbered sequentially for each of the remaining parts of the file. Large files have multiple directory entries with the same file name, but a different group of allocation pointers in each entry. As files can be deleted, and their directory entries can be re-used, the extents may not be in order in the directory.
The Number of the Records: No useful info could be found yet. (But soon, I will upload the information.)
Reserved: Please read above entry.
Allocation Map: This is a group of numbers of (or pointers to) the disk blocks that contain the data of the file. (This entry tells the OS that the data of the killbill.doc file is stored on block X of track Y.) There are eight pointers of 16 bits each. Each value (pointer) points to a sector which contains part of the file. If the file is small and can be stored in seven sectors, then seven pointers are used and the remaining one is set to zero (the remaining pointers are set to zero, no matter how many remain). If the file is too large, then additional pointers are allocated, and they are filled in another directory entry. In some CP/M systems, there were sixteen pointers of 8 bits each. (Above, we were discussing eight pointers of 16 bits each.)
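
The fields described above can be unpacked from a 32-byte entry with Python's `struct` module. Treat this as a sketch under assumptions: the exact field order and widths (one byte for the extent counter, two reserved bytes, one byte for the record count, names padded with spaces) follow the layout commonly documented for CP/M 2.2 rather than anything stated in this post, and `parse_entry` is a made-up helper name.

```python
import struct

# Assumed 32-byte layout: user number, 8-char name, 3-char type, extent
# counter, 2 reserved bytes, record count, then a 16-byte allocation map.
ENTRY = struct.Struct("<B8s3sB2sB16s")

def parse_entry(raw, pointer_bits=16):
    """Decode one directory entry; unused (zero) pointers are dropped."""
    user, name, ftype, extent, _reserved, records, alloc = ENTRY.unpack(raw)
    if pointer_bits == 16:
        pointers = struct.unpack("<8H", alloc)   # eight 16-bit pointers
    else:
        pointers = struct.unpack("<16B", alloc)  # sixteen 8-bit pointers
    return {
        "user": user,
        "name": name.decode("ascii").rstrip(),   # assumes space padding
        "type": ftype.decode("ascii").rstrip(),
        "extent": extent,
        "records": records,
        "blocks": [p for p in pointers if p != 0],
    }
```

The same sixteen bytes of allocation map are read either as eight 16-bit pointers or as sixteen 8-bit pointers, matching the two variants mentioned above.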

Data Storage Area:
                   This is the area that contains the actual data/files. When a user or application tries to access a file, the file system searches the file directory entries to determine whether a file with that name is stored on the disk. If the file name is found, then the directory entry holds the addresses of the tracks and sectors (blocks) where the file is stored, so the data of the file can be accessed.
                   By now, if you are thinking that file systems are all too easy, then let me correct you. Right now it seems easy because the platform on which it is implemented, CP/M, is not very complex with regard to its file system. But there is still a complication in CP/M’s file system. As we saw in the file directory area, there are 64 directory entries and 8 file pointers per entry (each pointer of 16 bits), which gives us 512 sectors to be used. But on the floppy, there are 26 sectors on each of the 77 tracks, which gives a total of 2002 sectors on the floppy! That simply means we would not be able to use most of the space on our floppy. But Gary was a smart guy; do you think he could have overlooked a problem that was so easy to find in theory? In practice, finding it would have been even easier. So, rather than pointing to an individual sector, Gary used a method in which consecutive sectors are grouped together as “allocation blocks”. The size of these allocation blocks is determined by the size of the disk. With the 8” floppies used with CP/M, it is eight sectors, that is, 1024 bytes. So technically, in CP/M, each directory entry points to eight allocation blocks of 1024 bytes each.
   What you studied above should give you the answer to a question that bothers everybody, but that nobody asks loudly: “Where the heck are my A: and B: drives?” If you haven’t got your answer yet, then please visit: <<post will be uploaded soon.>>
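
The numbers in this argument can be worked through directly. The snippet below is plain arithmetic on the figures given in the text (64 entries, 8 pointers per entry, 128-byte sectors, 8 sectors per allocation block); the variable names are mine.

```python
ENTRIES, POINTERS_PER_ENTRY = 64, 8
SECTOR_BYTES, SECTORS_PER_BLOCK = 128, 8
SECTORS_PER_SIDE = 77 * 26

pointers = ENTRIES * POINTERS_PER_ENTRY         # pointers in the whole directory
reach_by_sector = pointers * SECTOR_BYTES       # bytes reachable, one sector per pointer
block_bytes = SECTORS_PER_BLOCK * SECTOR_BYTES  # size of one allocation block
reach_by_block = pointers * block_bytes         # bytes reachable, one block per pointer
side_bytes = SECTORS_PER_SIDE * SECTOR_BYTES    # bytes on one side of the floppy

print(f"per-sector pointers reach {reach_by_sector:,} of {side_bytes:,} bytes")
print(f"per-block pointers reach {reach_by_block:,} bytes")
```

With one sector per pointer, the directory can describe only about a quarter of one side of the disk; with 1024-byte allocation blocks, the same 512 pointers can describe more than the whole floppy holds.
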

Sunday, March 18, 2012

I/O Management in CP/M

In the era of CP/M, because of the limited capabilities of the hardware, there was no multitasking. Even the types of I/O devices were limited at that time. So managing I/O devices was pretty easy (at least compared with today). Most application programs needed the following I/O services:

Read character(s) from the keyboard.
Write character(s) to the video display.
Print character(s) to the printer.
Utilise the file system to create a new file, read (access), write, close, and delete a file.

That means we need to discuss only three I/O devices: the keyboard, the printer, and the video display. One might ask about the floppies. But the appropriate place for discussing the floppies is “Disk Management”, rather than here.

Keyboard Input In CP/M

At that time, keyboards came in many different formats/types. They might have anywhere from 65 to 95 keys, and those keys were not always in the same places (the placement of the letters was borrowed from the typewriter, but the function keys and other keys conceived by the computer nerds did not have a permanent place to live, so manufacturers moved them wherever they wanted). Some keyboard manufacturers used a serial method to transfer the data/signals, while others used a parallel method. On top of that, some keyboards represented characters with 7 bits, while others used 8 bits. And that’s where the first component of CP/M, the BIOS, comes in.

As mentioned previously, each manufacturer of computer kits was supposed to (and did) adapt the BIOS to the set of devices included with their machines (which includes the keyboard). Put more simply, each BIOS is customised by the manufacturer for their particular keyboard. So that particular BIOS contains all the information about the keyboard connected to the computer. That information includes whether the data transfer is done in parallel or serial mode, whether the characters are represented by 7 bits or 8 bits, and some more stuff like that. In short, the BIOS knows the keyboard completely, inside-out. Just like Chandler and Joey know each other, except that here it is a one-way connection: the keyboard doesn’t know much about the BIOS.

No matter what the characteristics of the keyboard are, the BIOS provides the same set, “the standard set of BIOS interface functions”, to the rest of the OS. Just for a moment, assume Mr. Barack Obama is the CP/M OS, his interpreter is the BIOS, different countries are different hardware, and the people are the keyboard. Now let’s take Mr. Obama on a world tour. Mr. Obama goes to France. The French government (the hardware manufacturer) gives him an interpreter (BIOS), who knows everything about the French people (keyboard). So, whatever the French people say to Obama in French, Obama (CP/M) gets it in English (the standard interface) through the interpreter (BIOS). With this method, no matter where Obama goes, he will get the “data” from the native people in his own language through the interpreter provided by that particular government. That means Obama doesn’t have to carry around a bunch of people who speak foreign languages. In the same manner, no matter what the underlying keyboard/hardware is, the BIOS gives the information to the OS in one standard format. That is, the same set of BIOS interface functions, aka “OS system calls”. In the time of CP/M there were two system calls for the keyboard.

Check if a key has been pressed.
Read the character from the key which has been pressed.
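
The “same two calls, whatever the keyboard” idea can be sketched in Python. Everything here is hypothetical: the class names, the method names `key_pressed` and `read_char`, and the way the 7-bit keyboard is modelled (masking off the top bit of each raw code) are assumptions chosen for illustration, not the actual BIOS entry points.

```python
# Hypothetical model: two different keyboards behind one BIOS-style interface.
class SevenBitKeyboardBios:
    def __init__(self, raw_codes):
        self.raw = list(raw_codes)

    def key_pressed(self):
        return bool(self.raw)

    def read_char(self):
        return chr(self.raw.pop(0) & 0x7F)  # keep only the low 7 bits

class EightBitKeyboardBios:
    def __init__(self, raw_codes):
        self.raw = list(raw_codes)

    def key_pressed(self):
        return bool(self.raw)

    def read_char(self):
        return chr(self.raw.pop(0))  # the code is already a full 8-bit value

def read_pending(bios):
    """Code above the BIOS relies only on the two standard calls."""
    chars = []
    while bios.key_pressed():
        chars.append(bios.read_char())
    return "".join(chars)
```

`read_pending` works unchanged with either class, which is the whole point of the interpreter analogy: only the BIOS needs to know how a particular keyboard encodes its characters.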

This was sufficient for most programs/applications. But some applications needed more than that. Suppose that in a text processing application Alt>F>S saves the file (like it does in MS-Office applications). But suppose some programmer created a similar application in which the only way to save the file is to press the Ctrl and S keys at the same time. If a user pressed these two keys together to save the file, the BIOS would obediently tell the BDOS that two keys had been pressed at the same time. But our BDOS doesn’t know what that means. So the BDOS doesn’t pass this information on to the application program (in this example, the text processor). Hence, the program doesn’t get its required input.

But it is possible to bypass the BDOS. This means the application program reads the keystrokes from the BIOS itself, without going through the BDOS. But this is possible only if the application is programmed to bypass the BDOS. If an application is not programmed to bypass it, a user can’t do it. This kind of bypassing is possible because there is no memory protection in CP/M. So any application can address any part of memory, and it is just as easy for an application to make BIOS calls as it is to make BDOS calls. But there is a side-effect of bypassing the BDOS: applications that bypass it are not portable.

Friday, March 16, 2012

Characteristics of Early PC System

            Before learning more about CP/M, it is time to learn something about the hardware that was available in the period of CP/M.
            For convenience, we will discuss computers with a video monitor (CRT) (yes, that means not all computers had video output, but a video monitor was one of the requirements for CP/M). As one might guess, early personal computers had a motherboard too. That motherboard had a microprocessor chip, RAM (Random Access Memory), some ROM (Read Only Memory) chips, and several ICs responsible for making all these chips work in harmony. Furthermore, the motherboard used to have some empty slots for inserting additional “expansion circuit boards”, which were, and still are, simply called “cards”. These cards included a video controller card, which was used to connect a video monitor to the computer. Other expansion cards included additional RAM, and floppy and hard disk controllers. The standard devices for input/output were a printer or video display, or both, and a keyboard. The keyboard was plugged directly into the motherboard, which already had a keyboard controller chip built into it.
            At this point, one might ask: what is so different from today’s hardware? Isn’t everything almost the same, except for the video output being optional? Well, this is a fair question. The answer is that all this hardware had very limited capabilities. The main memory ranged from a few bytes to several kilobytes. The latest microprocessors were 8-bit microprocessors.
            And yet, there were some advantages in those days. For example, the disk block size and format were fixed for floppies, and for hard disks as well (though hard disks were rarely used). This allowed a standardised file system design, which was based on this standard disk format.
            As for interrupt* handling, it was an easy thing in those days. As only one application was supposed to run at a time (no multi-tasking), there was no need to switch between applications. For the same reason, CPU scheduling was not required in the OS. Interrupts were mainly used for handling the I/O devices.
            That’s all we need to know about the early hardware for now. Interrupts and CPU scheduling are two new terms here. At this point, we don’t need to know about CPU scheduling, but interrupts are going to prove kind of important in “I/O Management”. An “interrupt” is a mechanism used to signal the OS that some event has occurred which requires attention. An interrupt is signaled either by hardware or by software. Hardware may trigger an interrupt by sending a signal to the CPU, usually by way of the system bus. Software may trigger an interrupt by executing a special instruction, called a “system call”.