Troubleshooting, Maintaining & Repairing PCs Stephen Bigelow $54.95 0-07-913732-6 |
Contact Bet@books © 1998 The McGraw-Hill Companies, Inc. All rights reserved. Any use of this Beta Book is subject to the rules stated in the Terms of Use. |
CHAPTER 13
Conflict troubleshooting
The incredible acceptance and popularity of the PC is largely due to the use of an "open architecture". An open architecture allows any manufacturer to develop new devices (e.g. video boards, modems, sound boards, and so on) that will work in conjunction with the PC. When a new expansion board is added to the PC, the board makes use of various system resources in order to obtain CPU time and transfer data across the expansion bus. Ultimately, each board that is added to the system requires unique resources. No two devices can use the same resources - otherwise, a hardware conflict will result. Low-level software (such as device drivers and TSRs) that uses system resources can also conflict with other software during normal operation. This chapter explains system resources, then shows you how to detect and correct conflicts that can arise in both hardware and software.
Understanding system resources
The key to understanding and eliminating conflicts is to understand the importance of each system resource that is available to you. PCs provide three types of resources: interrupts, DMA channels, and I/O areas. Many controllers and network devices also utilize an on-board BIOS, which requires memory space. Do not underestimate the importance of these areas - conflicts can occur anywhere, and carry dire consequences for a system.
Interrupts
An interrupt is probably the best-known and best-understood type of resource. Interrupts are used to demand attention from the CPU. This allows a device or sub-system to work in the background until a particular event occurs which requires system processing. Such an event may include receiving a character at the serial port, striking a key on the keyboard, or any number of other real-world situations. An interrupt is invoked by asserting a logic level on one of the physical interrupt request (IRQ) lines accessible through any of the motherboard’s expansion bus slots. AT-compatible PCs provide 16 IRQ lines (noted IRQ 0 to IRQ 15). Table 13-1 illustrates the IRQ assignments for classic XT and current AT systems. These lines run from pins on the expansion bus connector or key ICs on the motherboard to Programmable Interrupt Controllers (PICs) on the motherboard. The output signals generated by a PIC trigger the CPU interrupt. Keep in mind that Table 13-1 covers hardware interrupts only. There is also a proliferation of processor and software-generated interrupts.
The use of IRQ 2 in an AT system deserves a bit of explanation. An AT uses IRQ 2 right on the motherboard, which means the expansion bus pin for IRQ 2 is now empty. Instead of leaving this pin unused, IRQ 9 from the AT extended slot is wired to the pin previously occupied by IRQ 2. In other words, IRQ 9 is being redirected to IRQ 2. Any AT expansion device set to use IRQ 2 is actually using IRQ 9. Of course, the interrupt vector table is adjusted to compensate for this sleight of hand.
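The IRQ 2 redirection can be expressed as a one-line rule. The following is only a sketch; the function name effective_irq is a hypothetical helper, not part of any real utility.

```python
def effective_irq(jumpered_irq: int) -> int:
    """Return the IRQ line an AT actually sees for a board's jumper setting.

    Hypothetical helper for illustration only.
    """
    if not 0 <= jumpered_irq <= 15:
        raise ValueError("AT-compatible PCs provide IRQ 0 through IRQ 15")
    # On an AT, the bus pin once occupied by IRQ 2 is wired to IRQ 9.
    return 9 if jumpered_irq == 2 else jumpered_irq
```

When listing the resources in use, a board jumpered for IRQ 2 in an AT would therefore be recorded against IRQ 9.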
After an interrupt is triggered, an interrupt handling routine saves the current CPU register states to a small area of memory (called the stack), then directs the CPU to the interrupt vector table. The interrupt vector table is a list of program locations that correspond to each interrupt. When an interrupt occurs, the CPU will jump to the interrupt handler routine at the location specified in the interrupt vector table and execute the routine. In most cases, the interrupt handler is a device driver associated with the board generating the interrupt. For example, an IRQ from a network card will likely call a network device driver to operate the card. For a hard disk controller, an IRQ calls the BIOS ROM code that operates the drive. When the handling routine is finished, the CPU's original register contents are "popped" from the stack, and the CPU picks up from where it left off without interruption.
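The save, dispatch, and restore sequence described above can be modeled in a few lines. This is a toy model that follows the chapter's simplified description, not real x86 behavior; the class ToyCPU, its register names, and the network_driver handler are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


class ToyCPU:
    """Toy model of interrupt dispatch (illustrative only)."""

    def __init__(self) -> None:
        self.registers: Dict[str, int] = {"AX": 0, "BX": 0}
        self.stack: List[Dict[str, int]] = []
        # Interrupt vector table: IRQ number -> handler routine.
        self.vector_table: Dict[int, Callable[["ToyCPU"], None]] = {}
        self.log: List[str] = []

    def install_handler(self, irq: int, handler: Callable[["ToyCPU"], None]) -> None:
        self.vector_table[irq] = handler

    def interrupt(self, irq: int) -> None:
        # Save the current register state on the stack.
        self.stack.append(dict(self.registers))
        try:
            handler = self.vector_table.get(irq)
            if handler is None:
                self.log.append(f"IRQ {irq}: no handler installed")
            else:
                handler(self)  # jump to the routine named in the table
        finally:
            # Pop the saved state so the CPU resumes where it left off.
            self.registers = self.stack.pop()


def network_driver(cpu: ToyCPU) -> None:
    """Hypothetical handler standing in for a network device driver."""
    cpu.registers["AX"] = 0x1234  # the handler is free to use registers
    cpu.log.append("network driver serviced the card")
```

Installing network_driver on a line and calling interrupt() for that line runs the handler, after which the registers hold their original values again.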
As a technician, it is not vital that you understand precisely how interrupts are initialized and enabled, but you should know the basic terminology. The term "assigned" simply means that a device is set to produce a particular IRQ signal. For example, a typical hard drive controller board is assigned to IRQ 14. Assignments are usually made with one or more jumpers or DIP switches, or are configured automatically through the use of Plug-and-Play (PnP). Next, interrupts can be selectively enabled or disabled under software control. An "enabled" interrupt is an interrupt where the PIC has been programmed to pass on an IRQ to the CPU. Just because an interrupt is enabled does not mean that there are any devices assigned to it. Finally, an "active" interrupt is a line where real IRQs are being generated. Note that active does not mean assigned or enabled.
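The difference between an "enabled" line and an "active" one can be sketched with a mask register. The class below is a deliberate simplification: it treats all 16 lines as one 16-bit mask, whereas an AT uses two cascaded controllers, and the name ToyPIC is an assumption for illustration.

```python
class ToyPIC:
    """Toy interrupt controller with a single 16-bit mask (illustrative only)."""

    def __init__(self) -> None:
        # A set bit means the line is masked (disabled). All lines start disabled.
        self.mask = 0xFFFF

    def enable(self, irq: int) -> None:
        self.mask &= ~(1 << irq) & 0xFFFF

    def disable(self, irq: int) -> None:
        self.mask |= 1 << irq

    def is_enabled(self, irq: int) -> bool:
        return not (self.mask >> irq) & 1

    def request(self, irq: int) -> bool:
        """A device raises an IRQ (the line is active).

        Returns True only if the line is enabled, so the request
        would be passed on to the CPU.
        """
        return self.is_enabled(irq)
```

A request on a disabled line goes nowhere, and an enabled line with no board assigned to it simply never sees a request.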
Interrupts are an effective and reliable means of signaling the CPU, but the conventional ISA bus architecture - used in virtually all PCs - does not provide a means of determining which slot contains the board that called the interrupt. As a result, interrupts cannot be shared by multiple devices. In other words, no two devices can be actively generating interrupt requests on the same IRQ line at the same time. If more than one device is assigned to the same interrupt line, a hardware conflict can occur. In most circumstances, a conflict may prevent the newly installed board (or other previously installed boards) from working. In some cases, a hardware conflict can hang up the entire system.
The MCA (Micro Channel Architecture) and EISA (Extended ISA) buses overcome this IRQ sharing limitation, but MCA was never widely accepted in the PC industry because the slots are not backward compatible with the well-established base of ISA boards. EISA bus slots are backward compatible with ISA boards, but an ISA board in an EISA slot is still faced with the same IRQ limitations.
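Because ISA interrupts cannot be shared, checking a list of assignments for conflicts amounts to looking for any IRQ claimed by more than one device. The sketch below assumes the assignments have already been collected by hand; the function name and the device names in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def find_irq_conflicts(assignments: Dict[str, int]) -> Dict[int, List[str]]:
    """Map each doubly-claimed IRQ to the sorted list of devices assigned to it."""
    by_irq: Dict[int, List[str]] = defaultdict(list)
    for device, irq in assignments.items():
        by_irq[irq].append(device)
    # Only lines with two or more devices represent a potential conflict.
    return {irq: sorted(devs) for irq, devs in by_irq.items() if len(devs) > 1}
```

For example, find_irq_conflicts({"board X": 5, "board Y": 5, "hard drive controller": 14}) reports IRQ 5 as claimed by both boards and leaves IRQ 14 out of the result.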
DMA channels
The CPU is very adept at moving data. It can transfer data between memory locations, I/O locations, or from memory to I/O and back with equal ease. However, PC designers realized that transferring large amounts of data (one word at a time) through the CPU is a hideous waste of CPU time. After all, the CPU really isn't processing anything during a data move, just shuttling data from one place to another. If there were a way to "off-load" such redundant tasks from the CPU, data could be moved faster than would be possible with CPU intervention. Direct Memory Access (DMA) is a technique designed to move large amounts of data from memory to an I/O location, or vice versa, without direct intervention by the CPU. In theory, the DMA controller IC acts as a stand-alone "data processor", leaving the CPU free to handle other tasks.
A DMA transfer starts with a DMA Request (DRQ) signal generated by the requesting device (such as the floppy disk controller board). If the channel has been previously enabled through software drivers or BIOS routines, the request will reach the corresponding DMA controller IC on the motherboard. The DMA controller will then send a HOLD request to the CPU, which responds with a Hold Acknowledge (HLDA) signal. When the DMA controller receives the HLDA signal, it instructs the bus controller to effectively disconnect the CPU from the expansion bus and allow the DMA controller IC to take control of the bus itself. The DMA controller sends a DMA Acknowledge (DACK) signal to the requesting device, and the transfer process may begin. Up to 64KB can be moved during a single DMA transfer. After the transfer is done, the DMA controller will reconnect the CPU and drop its HOLD request - the CPU then continues with whatever it was doing without interruption.
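The handshake above is a fixed sequence of signals, which makes it easy to lay out step by step. The following sketch only lists the steps as text and applies the 64KB-per-transfer limit stated above; the function names are hypothetical, and nothing here touches real hardware.

```python
from typing import List

MAX_DMA_BYTES = 64 * 1024  # the stated limit for a single DMA transfer


def dma_handshake(channel_enabled: bool) -> List[str]:
    """Return the ordered signal steps for one DMA transfer (illustrative only)."""
    steps = ["device asserts DRQ"]
    if not channel_enabled:
        steps.append("channel not enabled: request never reaches the DMA controller")
        return steps
    steps.extend([
        "DMA controller sends HOLD to CPU",
        "CPU answers with HLDA",
        "bus controller disconnects CPU from expansion bus",
        "DMA controller sends DACK to device",
        "data transfer runs",
        "DMA controller reconnects CPU and drops HOLD",
    ])
    return steps


def transfers_needed(total_bytes: int) -> int:
    """Number of separate DMA transfers needed to move total_bytes."""
    if total_bytes < 0:
        raise ValueError("byte count cannot be negative")
    return -(-total_bytes // MAX_DMA_BYTES)  # ceiling division
```

Moving more than 64KB therefore means repeating the whole handshake once per block.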
Table 13-2 illustrates the use of DMA channels for both classic XT and current AT systems. There are twice as many DMA channels available in an AT as in an XT, but you may wonder why the AT commits fewer channels. The issue is DMA performance. DMA was developed when CPUs ran at 4.77MHz, and is artificially limited to 4MHz operation. When CPUs began to work at 8MHz and higher, CPU transfers (redundant as they are) actually became faster than a DMA channel. As a result, the AT has many channels available, but only the floppy drive controller and other limited-performance devices (such as sound cards) continue to use DMA. In an AT system, DMA channel 4 serves as a cascade line linking DMA controller ICs.
As with interrupts, a DMA channel is selected by setting a physical jumper or DIP switch on the particular expansion board (or through Plug-and-Play). When the board is installed in an expansion slot, the channel setting establishes a connection between the board and DMA controller IC. Often, accompanying software drivers must use a command line switch that points to the corresponding hardware DMA assignment. Also, DMA channels cannot be shared between two or more devices. Although DMA sharing is possible in theory, it is extremely difficult to implement in actual practice. If more than one device attempts to use the same DMA channel at the same time, a conflict will result.
I/O areas
Both XT and AT computers provide space for I/O (input/output) ports. An I/O port acts very much like a memory address, but it is not for storage. Instead, an I/O port provides the means for a PC to communicate directly with a device - allowing the PC to efficiently pass commands and data between the system and various expansion devices. Each device must be assigned to a unique address (or address range). Table 13-3 lists the typical I/O port assignments for classic XT and classic AT systems. PS/2 systems use many of the same address assignments, but also add some wrinkles of their own (as shown in Table 13-4). Finally, the I/O scheme for a modern Pentium system (with a 430TX-based motherboard) is listed in Table 13-5.
I/O assignments are generally made manually by setting jumpers or DIP switches on the expansion device itself, or automatically through the use of Plug-and-Play. As with other system resources, it is vitally important that no two devices use the same I/O port(s) at the same time. If one or more I/O addresses overlap, a hardware conflict will result. Commands meant for one device may be erroneously interpreted by another. Keep in mind that while many expansion devices can be set at a variety of addresses, some devices cannot.
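Since each device occupies an address or an address range, an overlap check comes down to an interval test on every pair of devices. The sketch below assumes ranges are given as inclusive (first, last) port numbers; the function names, device names, and addresses used in the usage note are made up for illustration and are not taken from the chapter's tables.

```python
from typing import Dict, List, Tuple

PortRange = Tuple[int, int]  # inclusive (first_port, last_port)


def ranges_overlap(a: PortRange, b: PortRange) -> bool:
    """True if two inclusive port ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


def find_io_conflicts(devices: Dict[str, PortRange]) -> List[Tuple[str, str]]:
    """Return every pair of device names whose I/O ranges overlap."""
    names = sorted(devices)
    conflicts = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if ranges_overlap(devices[first], devices[second]):
                conflicts.append((first, second))
    return conflicts
```

With a hypothetical "card A" at (0x220, 0x22F), "card B" at (0x228, 0x22B), and "card C" at (0x300, 0x31F), only the pair ("card A", "card B") is reported. The same interval test could be applied to the address ranges of on-board BIOS ROMs.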
Memory assignments
Memory is another vital resource for the PC. While early devices relied on the assignment of IRQs, DMA channels, and I/O ports, a growing number of modern devices (e.g. SCSI controllers, network cards, video boards, modems, and so on) are demanding memory space for the support of each device’s on-board BIOS ROM. No two ROMs can overlap in their addresses - otherwise, a conflict will occur. Table 13-6 lists a memory map for a modern PC using an Intel 430 or 440 type of chipset (or equivalent).
Index of typical assignments
Now that you’ve got a handle on the way resources are allocated, it’s time to put some of that information to work. Table 13-7 presents a cross-section of typical devices and ports found in today’s PCs, and lists the standard resource assignments most often associated with them - and may assist you in spotting potential conflicts before installing new devices. Note that you may encounter ANY combination of resources listed in the table for a given device.
Recognizing and dealing with conflicts
Fortunately, conflicts are almost always the result of a PC upgrade gone awry. Thus, a technician can be alerted to the possibility of a system conflict by applying the Last Upgrade rule. The rule consists of three parts:
If all three of these common-sense factors are true, chances are very good that you are faced with a hardware or software conflict. Unlike most other types of PC problems which tend to be specific to the faulty sub-assembly, conflicts usually manifest themselves as much more general and perplexing problems. The following symptoms are typical of serious hardware or software conflicts:
What makes these problems so generic is that the severity and frequency of a fault, as well as the point at which the fault occurs, depends on such factors as the particular devices that are conflicting, the resource(s) that are conflicting among the devices (i.e. IRQs, DMAs, or I/O addresses), and the function being performed by the PC when the conflict manifests itself. Since every PC is equipped and configured a bit differently, it is virtually impossible to predict a conflict's symptoms more precisely.
Confirming and resolving conflicts
Recognizing the possibility of a conflict is one thing; proving and correcting it is another issue entirely. However, there are some very effective tactics at your disposal. The first rule of conflict resolution is Last In First Out (or LIFO). The LIFO principle basically says that the fastest means of overcoming a conflict problem is to remove the hardware or software that resulted in the conflict. In other words, if you install board X and board Y ceases to function, board X is probably conflicting with the system, so removing board X should restore board Y to normal operation. The same concept holds true for software. If you add a new application to your system, then find that an existing application fails to work properly, the new application is likely at fault. Unfortunately, removing the offending element is not enough. You still have to install the new device or software in such a way that it will no longer conflict in the system.
Companion CD: There are several "system reporting tools" on the Companion CD which can help you identify the resources in use (and avoid accidental resource conflicts due to configuration mistakes). Try CONF810E.ZIP, SNOOP330.ZIP, or SYSINF.ZIP on the Companion CD. You may also try some more specific tools like CHKIO.ZIP or IRQINFO.ZIP.
Dealing with software conflicts
There are two types of software that can cause conflicts in a typical PC: TSRs and device drivers. A TSR (sometimes called a popup utility) loads into memory, usually during initialization, and waits until a system event occurs (e.g. a modem ring or a keyboard "hot key" combination). There are no DOS or system rules that define how such utilities should be written. As a result, many tend to conflict with application programs (and even DOS itself). If you suspect that such a popup utility is causing the problem, find its reference in the AUTOEXEC.BAT file and disable it by placing the command REM in front of its command line (e.g. REM C:\UTILS\NEWMENU.EXE /A:360 /D:3). The REM command turns the line into a "REMark" which can easily be removed later if you choose to restore the line. Remember to reboot the computer so that your changes will take effect.
Device drivers present another potential problem. Most hardware upgrades require the addition of one or more device drivers. Such drivers are called from the CONFIG.SYS file during system initialization (or loaded with Windows), and use a series of command line parameters to specify the system resources that are being used. This is often necessary to ensure that the driver operates its associated hardware properly. If the command line options used for the device driver do not match the hardware settings (or overlap the settings of another device driver), system problems can result. If you suspect that a device driver is causing the problem, find its reference in the CONFIG.SYS file and disable it by placing the command REM in front of its command line (e.g. REM DEVICE = C:\DRIVERS\NEWDRIVE.SYS /A360 /I:5). The REM command turns the line into a "REMark" which can easily be removed later if you choose to restore the line. Remember that disabling the device driver in this fashion will prevent the associated hardware from working, but if the problem clears, you can work with the driver settings until the problem is resolved. Remember to reboot the computer so that your changes will take effect.
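The REM technique is normally applied by hand in a text editor, but the same edit can be sketched as a small text transformation. The function rem_out below is a hypothetical helper that works on the file's text only; reading and writing the actual AUTOEXEC.BAT or CONFIG.SYS file (and keeping a backup copy) is left to the caller. It matches without regard to case, on the assumption that DOS file names are not case-sensitive.

```python
def rem_out(text: str, needle: str) -> str:
    """Prefix every line that mentions `needle` with REM (illustrative only).

    Lines that are already remarks are left alone, so running the
    function twice gives the same result as running it once.
    """
    result = []
    for line in text.splitlines():
        already_remark = line.lstrip().upper().startswith("REM ")
        if needle.upper() in line.upper() and not already_remark:
            result.append("REM " + line)
        else:
            result.append(line)
    # Lines are rejoined with "\n"; convert line endings as needed when saving.
    return "\n".join(result)


# Hypothetical CONFIG.SYS contents used only to demonstrate the helper.
EXAMPLE_CONFIG = "\n".join([
    r"DEVICE = C:\DRIVERS\OLDDRIVE.SYS",
    r"DEVICE = C:\DRIVERS\NEWDRIVE.SYS /A360 /I:5",
])

if __name__ == "__main__":
    print(rem_out(EXAMPLE_CONFIG, "NEWDRIVE.SYS"))
```

Only the NEWDRIVE.SYS line gains the REM prefix; restoring the driver later means deleting that prefix again.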
Finally, consider the possibility that the offending software has a bug. Try contacting the software manufacturer. There may be a fix or undocumented feature that you are unaware of. There may also be a patch or update that will solve the problem.
Dealing with hardware conflicts
A PC user recently added a CD-ROM and adapter board to their system. The installation went flawlessly using the defaults - a 10-minute job. Several days later when attempting to back up the system, the user noticed that the parallel port tape backup did not respond (although the printer that had been connected to the parallel port was working fine). The user tried booting the system from a "clean" bootable floppy disk (no CONFIG.SYS or AUTOEXEC.BAT files to eliminate the device drivers), but the problem remained. After a bit of consideration, the user powered down the system, removed the CD-ROM adapter board, and booted the system from a "clean" bootable floppy disk. Sure enough, the parallel port tape backup started working again.
Stories such as this remind technicians that hardware conflicts are not always the monstrous, system-smashing mistakes that they are made out to be. In many cases, conflicts have subtle, non-catastrophic consequences. Since the CD-ROM was the last device to be added, it was the first to be removed. It took about 5 minutes to realize and remove the problem. However, removing the problem is only part of conflict troubleshooting - re-installing the device without a conflict is the real challenge.
Ideally, the way to correct a conflict would be to alter the conflicting setting. That's dynamite in theory, but another thing in practice. The trick is that you need to know what resources are in use and which ones are free. Unfortunately, there are only two ways to find out. On one hand, you can track down the user manual for every board in the system, then inspect each board individually to find their settings, then work accordingly. This will work (assuming you have the documentation) but it is cumbersome and time-consuming. As an alternative, you can use a resource testing tool such as the Discovery Card (by ForeFront Group). The Discovery Card plugs into a 16-bit ISA slot and uses a series of LEDs to display each IRQ and DMA channel in use. Any LED not illuminated is an available resource. It is then simply a matter of setting your expansion hardware to an IRQ and DMA channel that is not illuminated. Remember that you may have to alter the command line switches of any device drivers. The only resources not illustrated by the Discovery Card are I/O addresses, but since most I/O ports are reserved for particular functions (as you saw in Tables 13-3 to 13-5), you can typically locate an unused I/O port with a minimum of effort.
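Whichever way the in-use resources are gathered, finding a free setting is a matter of subtracting them from the full set. The sketch below assumes an AT-class system with IRQ 0 to 15 and DMA channels 0 to 7; the function name and the in-use values in the usage note are hypothetical.

```python
from typing import Iterable, List, Tuple

AT_IRQ_LINES = range(16)    # IRQ 0 through IRQ 15
AT_DMA_CHANNELS = range(8)  # DMA 0 through DMA 7 (channel 4 is the cascade)


def free_resources(
    irqs_in_use: Iterable[int], dma_in_use: Iterable[int]
) -> Tuple[List[int], List[int]]:
    """Return (free IRQs, free DMA channels) given the ones already in use."""
    used_irqs = set(irqs_in_use)
    used_dma = set(dma_in_use)
    free_irqs = [irq for irq in AT_IRQ_LINES if irq not in used_irqs]
    free_dma = [ch for ch in AT_DMA_CHANNELS if ch not in used_dma]
    return free_irqs, free_dma
```

For instance, with IRQs {0, 1, 2, 6, 8, 13, 14} and DMA channels {2, 4} recorded as in use, the free IRQs are 3, 4, 5, 7, 9, 10, 11, 12, and 15, and the free DMA channels are 0, 1, 3, 5, 6, and 7. Whether a given board can actually be set to one of those values still depends on its jumpers or configuration utility.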
Conflict troubleshooting with Windows 95
One of the biggest problems with conflict troubleshooting is that every conflict situation is a bit different. Variations in PC equipment and available resources often reduce conflict troubleshooting to a "hit or miss" process. Fortunately, conflict troubleshooting can be accomplished quickly and easily using the tools provided by Windows 95 (namely the Device Manager). This part of the chapter provides a step-by-step process that you can use for conflict resolution under Windows 95.
NOTE: The steps described below should be read like a flow chart, and you’ll find many references that will take you back and forth to various steps throughout this section.
Step 1: Getting started
Start the Device Manager in Windows 95:
NOTE: If the hardware that has the conflict isn't visible in the list, click the plus sign (+) next to the type of hardware.
Determine if the device was installed twice. Is the device you were installing (or that suffers from the conflict) listed twice in Device Manager?
Step 2: Device listed only once
View the resource settings for the conflicting device:
Do you see a box with resource settings (as in Fig. 13-3)?
Step 3: Device listed twice
Remove ALL the duplicated device(s), and install again:
Did this fix the problem?
Step 4: Resource settings appear
Identify exactly which resources are causing the conflict:
Is more than one resource conflict listed?
Step 5: Manual button appears
Determine why the resources are not displayed:
Which text message do you see?
Step 6: There is no Resources tab
You have probably chosen the wrong device. Select the correct device:
Do you see a box with resource settings now?
Step 7: More than one conflict is listed
At this point, you should determine just how many devices are listed as being conflicting.
Step 8: Only one conflict is listed
Look for a resource setting that doesn't conflict:
Did you find a setting that doesn't conflict with any other hardware?
Step 9: No conflicts are listed
If there are no conflicts listed in the Conflicting Device List box, either you are not viewing resources for the correct device, or the conflict has already been resolved (you need to restart your computer to allow Windows 95 to configure the hardware). Look at the top of the dialog box to see if you are viewing resources for the correct device.
There is no further solution to this problem. If restarting Windows 95 does not clear the problem, you may simply need to remove the conflicting device.
Step 10: The device is conflicting
Now you need to identify which hardware is conflicting:
Is more than one resource conflict listed?
Step 11: Only one device is conflicting
Do you want to disable the device that is causing all the conflicts?
Step 12: More than one device is conflicting
Look for resource settings that don't conflict:
Did you find a free setting for each conflicting resource?
Step 13: There is a free setting
When a free setting is available, change the configuration:
NOTE: Depending on the type of hardware you have, you may have to change the jumpers on your hardware card to match the new setting(s), or you may have to run a configuration utility provided by your hardware manufacturer. If the jumper settings on your card aren't set properly, your hardware will not work, even if you resolved the conflict correctly. Refer to your hardware documentation for instructions on changing jumpers.
Restart your computer:
This should correct the problem, and the hardware conflict should now be resolved once the PC is restarted.
Step 14: All other settings conflict
Identify hardware you no longer need:
Can you identify a hardware device that you no longer need to use?
Step 15: Resource settings cannot be modified
View the resources for the other device:
Does this device have a Resources tab?
Step 16: There is more than one conflict
How many devices are listed as conflicting?
Step 17: There is only one conflict
Look for a resource setting that doesn't conflict:
Did you find a setting that doesn't conflict with any other hardware?
Step 18: Disable conflicting hardware
Determine how to best disable the conflicting hardware:
Do you see a Set Configuration Manually button?
Step 19: Resources now set without conflicts
Print out a report for each device you changed:
This should correct the problem, and you should be done.
Step 20: Some resources are still conflicting
Set resources to conflict with only one device:
Are all conflicts with one device?
Step 21: Disable the unneeded device
Determine whether the hardware you want to disable is Plug-and-Play:
Do you see a Set Configuration Manually button?
Step 22: All devices are in use
Write down a list of all devices using resources:
Rearrange resource settings for conflicting hardware:
Did you find a free resource setting?
Step 23: Resource information is available
Check to see if the device can use a different resource:
Did you find a free resource setting?
Step 24: Resource information is not available
Decide which device you should disable. Because both devices need to use the same resource setting, you must decide which device you want to use. You must disable and/or remove the other device.
It probably is easier to remove the device that had the original conflict. If you choose to remove the other device, you may see a message telling you that you still have a conflict after completing the procedure. Just restart the procedure and continue resolving the conflict.
Which device would you like to disable?
Step 25: Manual button not available
Disable the conflicting hardware by removing it:
Go to Step 19.
Step 26: Free resources found
Change the resource settings to utilize the free resources:
Go to Step 19.
Step 27: No free resources available
You must disable some hardware to relieve the conflict. Do you want to disable the hardware that caused the original conflict?
Step 28: Free setting found
Determine whether there are any remaining conflicts:
NOTE: If the conflict you just resolved is listed, you can ignore it. It will no longer conflict after you restart your computer later.
Are there still conflicts listed?
Step 29: No free setting found
You must decide which device to disable. Because both devices need to use the same resource setting, you must decide which device you want to use. You must disable and remove the other device.
It probably is easier to remove the device that had the original conflict at this point. If you choose to remove the other device, you may see a message telling you that you still have a conflict after you finish and restart your computer. Just restart this procedure and continue resolving the conflict.
Which device would you like to disable?
Step 30: Disable original conflicting device
Determine whether you have to remove the card to disable the hardware:
When you see a Set Configuration Manually button:
If you do see that button, and there are no resource settings listed in the box, you’ll need to restart your computer.
When you don’t see a Set Configuration Manually button:
If no button is available, you’ll need to disable the physical hardware by removing it from the system.
This should correct the conflict, and complete your troubleshooting procedure.
Step 31: Disable other conflicting device
Determine whether you have to remove the card to disable the hardware:
Do you see a Set Configuration Manually button?
Step 32: Disable the other device
Determine whether there are any remaining conflicts:
Are there still conflicts listed?
Step 33: Remove the other device
Disable hardware by removing it:
Go to Step 19.
Step 34: There are still some conflicts
Try setting resources to conflict with only one device:
Are all conflicts now with one device?
The role of Plug-and-Play (PnP)
Traditional PCs used devices that required manual configuration - each IRQ, DMA, I/O port, and memory address space had to be specifically set through jumpers on the particular device. If you accidentally configured two or more devices to use the same resource, a conflict would result. This would require you to isolate the offending device(s), identify available resources, and reconfigure the offending device(s) manually. Taken together, this was often a cumbersome and time-consuming process.
In the early 1990s, PC designers realized that it was possible to automate the process of resource allocation each time the system initializes. This way, a device need only be installed, and the system would handle its configuration without the assistance or intervention of the installer. This concept became known as "Plug-and-Play" (PnP), and is now standard in the PC arena. PnP systems require three elements in order to function:
When the PnP system works properly, a PnP device can be installed in an available expansion slot on a PnP-supported motherboard (with a PnP BIOS). When Windows 95 starts, it recognizes the new PnP device, assigns resources, then attempts to install the proper protected-mode driver (which could be installed from manufacturer’s floppy disk or a Windows 95 installation CD). Thereafter, the system "remembers" the new device, and reconfigures it each time the system starts. Ideally, if the PnP device is ever removed, Windows 95 would automatically clear the device from its "system", and free the resources for other devices.
However, if any of these elements is missing, devices will not be "auto-configured". For example, PnP won’t work under DOS (though there are DOS PnP drivers which can be used to initialize PnP devices). Older, jumper-configured devices (called legacy devices) also won’t support PnP, and resources need to be reserved for legacy devices in order to prevent the PnP system from ignoring them entirely.
NOTE: PnP "auto-configuration" information is stored in the Extended System Configuration Data (ESCD) area, and is cleared when the CMOS RAM is cleared or lost.
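The reason for reserving resources for legacy devices can be seen in a small allocation sketch. This is not how any real PnP BIOS or Windows 95 allocates resources; it is a simple first-fit pass under the assumption that each PnP device offers a list of acceptable IRQs in order of preference. The function name and the device names in the usage note are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def assign_pnp_irqs(
    pnp_devices: Dict[str, List[int]], reserved_for_legacy: Iterable[int]
) -> Dict[str, Optional[int]]:
    """First-fit IRQ assignment that skips lines reserved for legacy boards."""
    taken = set(reserved_for_legacy)
    assignment: Dict[str, Optional[int]] = {}
    for name, acceptable in pnp_devices.items():
        assignment[name] = None  # stays None if no acceptable line is free
        for irq in acceptable:
            if irq not in taken:
                assignment[name] = irq
                taken.add(irq)
                break
    return assignment
```

Given a hypothetical "PnP sound card" that accepts IRQ 5, 7, or 10 and a "PnP network card" that accepts IRQ 5, 10, or 11, reserving IRQ 5 for a jumper-set legacy board steers the sound card to IRQ 7 and the network card to IRQ 10. Without the reservation, the sound card would be given IRQ 5 and collide with the legacy board.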
Keep your notes
Once you have determined the IRQ, DMA, and I/O settings that are in use, a thorough technician will note each setting on paper, then tape the notes inside the system's enclosure. This extra step will greatly ease future expansion and troubleshooting. To make your note taking process even faster, you can photocopy and use the System Setup Form included in the Appendix of this book.
Further study
That concludes Chapter 13. Be sure to review the glossary and chapter questions on the accompanying CD. If you have access to the Internet, take some time to review a few of the resources listed below:
Data Depot: http://www.datadepo.com/datadepo.htm
Download MSD 2.11: http://support.microsoft.com/download/support/mslfiles/GA0363.EXE
The Discovery Card: http://www.ffg.com/pcproducts/discover.html
Windsor Technologies: http://www.windsortech.com/