Sunday, July 4, 2010

Troubleshooting WinPE and task sequence issues

You can troubleshoot some common WinPE and task sequence issues.

WinPE never starts the task sequence

Check the SMSTS.LOG file at X:\windows\temp\smstslog\smsts.log. If a package never downloaded, it is likely that you simply do not have the appropriate network drivers installed, which prevents the machine from communicating with Configuration Manager.

Check your driver catalog to ensure you have the right network drivers available and installed into the boot image, and update the boot image to your distribution points.

Additional network or storage drivers might be needed in the boot image to enable the WinPE boot to function correctly. You should add those through Drivers in the Operating System Deployment node.

The right drivers have been added to the boot image, but are not loading

The original boot.wim file (WinPE boot image) created during Configuration Manager installation is copied and modified with IBM-specific drivers and other files. Your task sequences that use the IBM Deployment Pack must use this boot image or the tools might not work properly.

Check to make sure the image into which you loaded the drivers is the same image being used by the task sequence.

This is a common error for administrators who maintain multiple boot images.

Servers will not boot using PXE

PXE is an extension of DHCP, which uses a broadcast type of communication. Broadcast communication uses standard timeout values that are not readily changeable. As a result, a computer waits for a default timeframe to receive a DHCP or PXE response before timing out and causing a failure condition.

Each time a server is rebooted, it must renegotiate the connection to the switch. Some network switches arrive configured with default settings that might incur connectivity delays. That is, the settings on the switch might cause a DHCP or PXE timeout because they fail to negotiate a connection in time.

One of the features that can be affected by this issue is Spanning Tree Protocol (STP). STP is a protocol that prevents loops and provides redundancy within a network. A networking device using this algorithm might experience some latency as it collects information about other network devices. During this period of information collection, servers might boot to PXE and time out while waiting for a response from Windows® Deployment Services. Disable the STP or enable PortFast on end-node ports for the target server to prevent such occurrences. Refer to the manufacturer’s user guide for further information.

Another feature that can be affected by this issue is the EtherChannel or Port Aggregation Protocol (PAgP). EtherChannel allows multiple links between devices to act as one fast link that shares the load between the links. Running the EtherChannel Protocol in automatic mode can cause a connectivity delay of up to 15 seconds. Switch to a manual mode or turn off this feature to eliminate this delay.

Speed and duplex negotiation can also play a role in negotiation timeouts. If auto-negotiation on the switch is set to off, and the server is not configured to that speed and duplex setting, the switch will not negotiate with that server.

For more information, see the Cisco Web site.

Default boot order does not allow PXE to boot when a valid drive exists

When an active partition is created on a hard drive, it automatically becomes a bootable device if a valid operating system has been installed. If your PXE NIC is after the hard drive in the boot order, the hard drive tries to boot before PXE and boots to Windows, or causes an Invalid System Partition error if Windows is not installed.

To resolve this issue, be sure that PXE is placed before the hard drive in the boot order. Keep in mind that even if PXE is first in the boot order, the computer does not actually boot to PXE unless Configuration Manager has a task sequence for it to run.

When using a “Reboot” action after initializing an array controller, the task sequence fails

Configuration Manager 2007 does not allow a task sequence to reboot back to PXE. It can reboot back to WinPE or to an installed operating system, both of which require a disk partition and the appropriate installed software.

Without a disk partition, Configuration Manager will fail when attempting to reboot during a task sequence because it expects to copy WinPE to the disk. Additionally, the management point tracks when a machine has booted to PXE to run a task sequence, and once a machine has booted to PXE for a task sequence, it cannot use PXE as a boot method again for that task sequence unless the advertisement is reset.

If you need to reboot to PXE within a task sequence, use the custom action called “Reboot To PXE.” This custom action, written using C# and VBScript, connects to the Configuration Manager 2007 SDK, and contains custom code to drive actions in the admin console as well as on the machine being deployed. It performs all the steps necessary to reboot to PXE and allows for proper program flow when the reboot occurs.

The only other way to accomplish a reboot to PXE is to use more than one task sequence, let the computer “fall off the end” of the first task sequence and manually reset the PXE advertisement for the computer.

Task sequence fails with “Failed to Download Policy” and code 0x80093102 or 0x80004005

This error code typically refers to a certificate validation issue.

The SMSTS.LOG file will show an entry with the following text:

CryptDecryptMessage ( &DecryptParams, pbEncrypted, 
nEncryptedSize, 0, &nPlainSize, 0 ), HRESULT=80093102

or

no cert available for policy decoding
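As a rough aid for spotting these entries, the following Python sketch counts the HRESULT values in a copy of the log and checks for the missing-certificate message. It assumes only what the excerpts above show: that failures appear as `HRESULT=` followed by eight hexadecimal digits. The function names are hypothetical, and the script is meant to run on a workstation against a copied log, not inside WinPE.

```python
import re
from collections import Counter

# Assumption: failures are logged as "HRESULT=" plus eight hex digits,
# as in the excerpt above. Other log formats are not handled.
HRESULT_PATTERN = re.compile(r"HRESULT=([0-9A-Fa-f]{8})")


def summarize_hresults(log_text):
    """Count each HRESULT value that appears in the log text."""
    return Counter(code.upper() for code in HRESULT_PATTERN.findall(log_text))


def has_missing_cert_message(log_text):
    """Report whether the missing-certificate message appears in the log text."""
    return "no cert available for policy decoding" in log_text.lower()


if __name__ == "__main__":
    sample = (
        "CryptDecryptMessage ( &DecryptParams, pbEncrypted, "
        "nEncryptedSize, 0, &nPlainSize, 0 ), HRESULT=80093102\n"
        "no cert available for policy decoding\n"
    )
    print(dict(summarize_hresults(sample)))
    print(has_missing_cert_message(sample))
```

To use it on a real log, read the copied file into a string and pass that string to both functions.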

Possible causes are:


  • Misconfiguration of your domain or a site server, such as DNS not pointing to the site server, or the site server not specifying a valid FQDN (which is referred to by the DNS listing).

    If your site server does not specify a FQDN (and only specifies the NETBIOS name), and your DNS server refers to the FQDN, a faulty lookup might cause this error.


  • The certificate being used for PXE and boot media.

    Check the certificates under the Site Settings node and see if any certificates are blocked or missing. Open the certificates and ensure that they are actually installed into the certificate store. If not, install them.
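The name-resolution cause in the first bullet can be checked quickly from any machine on the same network. The sketch below is a minimal illustration in Python: it tests whether a configured server name looks like a FQDN and whether DNS resolves it. The function names are hypothetical, and the check is a heuristic, not a test that Configuration Manager itself performs.

```python
import socket


def looks_like_fqdn(name):
    """Return True if the name has at least two non-empty dot-separated labels."""
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


def resolve(name):
    """Return the IPv4 addresses DNS reports for the name, or an empty list."""
    try:
        _, _, addresses = socket.gethostbyname_ex(name)
        return addresses
    except OSError:
        # Covers lookup failures such as an unknown host.
        return []


def check_site_server(name):
    """Summarize both checks for one configured server name."""
    return {
        "name": name,
        "is_fqdn": looks_like_fqdn(name),
        "addresses": resolve(name),
    }
```

Calling `check_site_server` with the name exactly as it is entered in the site properties shows at a glance whether a short NETBIOS-style name is in use and whether it resolves at all.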


If these actions do not work, try removing the package from the distribution point (via Manage Distribution Points) and adding the package again to regenerate the package hash.


Task sequence fails with “Failed to Download Policy” and code 0x80004005


This error code typically refers to a certificate validation issue.

The SMSTS.LOG file will show an entry with the following text:

failed to download policy

Check the certificates under the Site Settings node to see if any certificates are blocked or missing. Open the certificates to ensure that they are installed into the certificate store. If not, install them.

Task sequence fails because the package is not downloading


In WinPE, the default option of “Download content locally when needed by running task sequence” will not work. When in WinPE, the task sequence engine will ignore (and fail) all actions that have packages set for this option.

Set all packages needed for use in WinPE to “Access content directly from a distribution point when needed by the running task sequence.”

Task sequence does not run again even after clearing the PXE advertisement


You must set the advertisement to “Always rerun” so that any time you reset the PXE advertisement, the advertisement is applied to the computer regardless of whether it ran the task sequence before.

Task sequences fail or act incorrectly after an upgrade


When upgrading from a previous version of this product, existing task sequences using these custom actions are not automatically updated.

For these task sequences to function correctly, open each step that uses a custom action in the task sequence editor. Add a “.” to the description and remove it to enable the Apply button. Click Apply to refresh the properties of the custom action and save any new automatic data or formatting that the new version requires.

Files and logs are not being returned from the client


Several issues can prevent the task sequence from returning files or logs from the client:


  • Failure of the client-side script prior to the file copy, which is usually evident in the log file.

    Repeat the task and press F8 during the task to get to a command prompt, if you selected the check box for Enable command support on the boot image properties > Windows PE page.

    Then open the SMSTS.LOG file. The location varies. In WinPE via PXE, the location is at X:\Windows\Temp\Smstslog\smsts.log.


  • Malformed XML in the IBM Deployment Pack configuration file.
  • The command being executed actually has an error but exits with code 0.

    This can occur when the script is set to ignore errors and rely on programmatic error handling, but that error handling fails to catch a severe error.

    Report such issues to the IBM® support site, as described in Getting help and technical assistance.


  • The task sequence cannot access the share or mapped drive that is the target drive for copying the files or logs.
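Malformed XML, one of the causes listed above, can be ruled out before rerunning the task sequence. The following Python sketch checks only that a piece of XML is well formed. It knows nothing about the elements the configuration file is expected to contain, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if the text parses as XML, else (False, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The message includes the line and column where parsing stopped.
        return False, str(err)
    return True, None


if __name__ == "__main__":
    # Hypothetical fragments; these element names are not from the real file.
    print(check_well_formed("<config><item/></config>"))
    print(check_well_formed("<config><item></config>"))
```

A file that passes this check can still contain wrong or missing parameters, so it narrows the search and does not validate the configuration.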

Logs are being returned but not output files


Several issues can prevent the task sequence from returning output files even though it returns log files:


  • No return file parameters are specified in the configuration XML.
  • Return file parameters in the configuration XML are incorrect.
  • An error is occurring with the operation of the utility that generates the output file.
  • A null variable is causing an error in the file name of the file to be returned.

Task step execution does not automatically change after a change to the configuration XML file


If you change the configuration XML, previously existing task steps do not automatically change unless you edit them.

To fix the existing task steps, open the task sequence editor and make a minor edit to each custom action step in the sequence. You can simply add a “.” to the description and then delete it to enable the Apply button. Click Apply. The task sequence steps are now saved with the automatically updated information from the new XML file.


Task sequence fails at “Apply Operating System” with “Failed to make volume X:\ bootable”


Several problems can cause this error.

This issue is indicated by log content similar to the following text:

MakeVolumeBootable( pszVolume ), 
HRESULT=80004005
(e:\nts_sms_fre\sms\client\osdeployment\applyos\installcommon.cpp,759)

Failed to make volume E:\ bootable.
Please ensure that you have set an active partition on the boot
disk before installing the operating system.

Unspecified error (Error: 80004005; Source: Windows)

ConfigureBootVolume(targetVolume),
HRESULT=80004005
(e:\nts_sms_fre\sms\client\osdeployment\applyos\applyos.cpp,326)

Process completed with exit code 2147500037
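The decimal exit code on the last line of the excerpt is the same value as the hexadecimal HRESULT shown in the earlier entries. A small Python sketch of the conversion, with a hypothetical function name, makes the relationship explicit:

```python
def exit_code_to_hresult(exit_code):
    """Format an exit code as an eight-digit hexadecimal HRESULT string."""
    # Masking to 32 bits also handles codes reported as negative signed integers.
    return format(exit_code & 0xFFFFFFFF, "08X")


if __name__ == "__main__":
    print(exit_code_to_hresult(2147500037))  # 80004005
```

This makes it easier to match an exit code reported in decimal against the hexadecimal error codes used elsewhere in the log.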

This issue can be related to two different scenarios:


  • If you are using a Format & Partition action in your task sequence to partition the hard drives, make sure that you select the check box for Make this the boot partition on one of the partitions.

    If you do not make a drive bootable and the computer has only the single drive, the task sequence engine automatically makes one of the partitions the boot partition. But if there are multiple drives, the task sequence engine cannot determine which drive should be bootable, and you see this error.


  • If you upgraded from the Configuration Manager RTM to SP1, you might have a problem if both hard drives are completely raw. If you have never partitioned the drives, a known bug in Windows PE prevents Windows PE from determining the drive where it was booted, and you see this error.

    This situation is likely on a server with a RAID controller where you have just formed two or more RAID sets. The new RAID sets are completely raw because they have never existed before.

    The only workaround to the problem of multiple raw drives is to manually boot into Windows PE and run "diskpart" to partition at least one of the drives. Then run the task sequence again. The task sequence should work.

    The known problem with Windows PE is fixed in Windows Vista SP1 and hence in the Windows PE that is derived from Vista SP1.



Install Configuration Manager 2007 SP1
Configuration Manager 2007 SP1 includes the SP1 version of the Windows Automated Installation Kit (WAIK). Download and install Configuration Manager SP1 to get the new version.

Upgrading to Configuration Manager 2007 SP1 automatically updates your default boot images, but does not automatically upgrade the IBM boot images.

Upgrade the IBM boot images by rerunning the IBM Deployment Pack installer and selecting “Modify”. You must then update your distribution points so that the new images are used. Update the distribution points for the default boot images as well.

The product installer detects the version of WinPE that is currently in use by the default boot images. If the default boot images are not Vista SP1, the product cannot install.


How to tell if your boot images are upgraded to Vista SP1
Boot image properties contain an identifier for “OS Version.”

Perform this procedure to see the version of WinPE in your boot images:


  1. Click Computer Management > Operating System Deployment > Boot Images > IBM Deployment.
  2. Right-click the boot image and select Properties.
  3. Click Images.
  4. Check the OS Version property for a value of 6.0.6001.18000 or greater.
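The comparison in step 4 has to be made field by field, because a plain string comparison can order dotted version numbers incorrectly. The Python sketch below shows one way to do it, assuming the OS Version property is always four dot-separated integers. The function names are hypothetical.

```python
def parse_version(text):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


# Threshold taken from step 4 above.
MINIMUM_VERSION = parse_version("6.0.6001.18000")


def is_vista_sp1_or_later(os_version):
    """Return True if the OS Version value meets the threshold."""
    return parse_version(os_version) >= MINIMUM_VERSION


if __name__ == "__main__":
    # Sample values for illustration only.
    for value in ("6.0.6000.16386", "6.0.6001.18000", "6.0.10000.0"):
        print(value, is_vista_sp1_or_later(value))
```

The last sample value shows why integers matter: compared as text, "6.0.10000.0" would sort before "6.0.6001.18000".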


What to do if your boot images are not upgraded to Vista SP1
You can manually recreate your boot images using the Windows AIK and following the steps listed in Technet: How to Add a Boot Image to Configuration Manager.

If your Configuration Manager processes permit, you might find it easier to remove the old boot image packages using the Admin Console, delete the files in the OSD\boot directories, and rerun the SP1 upgrade installation.


How to tell if WAIK was upgraded to Vista SP1


  1. Click Start > Run; then run the Regedit command.
  2. Navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ComponentStudio.
  3. Check the subkeys of this key. There should be a single subkey, named with the Windows AIK version number.

    Note: Only one version of Windows AIK can be installed. However, an uninstall operation might have failed to remove the registry key.

    In that case, the subkey with the highest version number should identify the installed version.
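The same check can be scripted on the site server. The Python sketch below lists the subkeys of the key named in step 2 and picks the one with the highest version number. It assumes the subkey names are dotted numeric version strings, which the text above implies but does not state. The function names are hypothetical, and the registry function works only on Windows.

```python
def version_key(name):
    """Turn a dotted numeric name into a tuple of integers, or None if it is not one."""
    try:
        return tuple(int(part) for part in name.split("."))
    except ValueError:
        return None


def highest_version(subkey_names):
    """Return the name with the highest version number, or None if there is none."""
    candidates = [(version_key(name), name) for name in subkey_names]
    candidates = [item for item in candidates if item[0] is not None]
    return max(candidates)[1] if candidates else None


def list_component_studio_subkeys():
    """List the subkey names under the ComponentStudio key (Windows only)."""
    import winreg  # imported here so the rest of the sketch loads on any platform

    path = r"SOFTWARE\Microsoft\ComponentStudio"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        subkey_count = winreg.QueryInfoKey(key)[0]
        return [winreg.EnumKey(key, index) for index in range(subkey_count)]
```

On the site server, `highest_version(list_component_studio_subkeys())` returns the subkey name to compare against the expected Windows AIK version.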


What to do if Windows AIK was not upgraded to Vista SP1
Configuration Manager is supposed to automatically upgrade the Windows AIK version during an upgrade to Configuration Manager SP1. If that did not occur, try manually uninstalling Windows AIK and rerunning the Configuration Manager SP1 upgrade.

To download Windows AIK, see the Microsoft Download Center: AIK page.


System environment variables are not carried over to the next action in the task sequence


When a task sequence runs, commands run in a command shell. When the task ends, so does the command shell environment, which causes the loss of any system variables that are defined in the task.

To pass variables between tasks, set the variables as “Task Sequence variables,” “Collection variables,” or “Machine variables.”
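The behavior described above is ordinary process isolation, and it can be reproduced outside Configuration Manager. The Python sketch below is an analogy only: two child processes stand in for two task sequence steps, and the variable name DEMO_VALUE is hypothetical. A value set inside the first child is gone in the second unless the parent passes it along explicitly, which is the role the task sequence variables play.

```python
import os
import subprocess
import sys

SET_AND_PRINT = "import os; os.environ['DEMO_VALUE'] = '42'; print(os.environ['DEMO_VALUE'])"
PRINT_ONLY = "import os; print(os.environ.get('DEMO_VALUE', 'unset'))"


def run_step(code, env):
    """Run one step as a separate child process and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def demo():
    """Return the output of three steps: set, read without hand-off, read with hand-off."""
    base_env = {k: v for k, v in os.environ.items() if k != "DEMO_VALUE"}
    first = run_step(SET_AND_PRINT, base_env)
    # The second child starts from the parent's environment, so the value is lost.
    second = run_step(PRINT_ONLY, base_env)
    # Explicit hand-off by the parent, loosely analogous to a task sequence variable.
    third = run_step(PRINT_ONLY, {**base_env, "DEMO_VALUE": first})
    return first, second, third


if __name__ == "__main__":
    print(demo())
```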
