Windows XP SP3 - Remove the 4GB physical address / RAM limit and use up to 64GB RAM using PAE

The limits of 32-bit memory addressing were addressed back in 1995, when a technology called PAE (Physical Address Extension) was developed to extend physical addressing to 36 bits (64GB).
A PAE-capable OS can address up to 64GB of RAM (note: each process can still only address 4GB of virtual address space).
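
The figures above follow directly from the address width. A minimal sketch in C that checks the arithmetic (the helper names addressable_bytes and bytes_to_gb are made up for this illustration and are not part of any Windows API):

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes reachable with a physical address of the given width.
   Hypothetical helper for illustration; only valid for widths below 64. */
static uint64_t addressable_bytes(unsigned address_bits)
{
    assert(address_bits < 64);
    return (uint64_t)1 << address_bits;
}

/* Convert a byte count to whole gigabytes (1GB = 2^30 bytes). */
static uint64_t bytes_to_gb(uint64_t bytes)
{
    return bytes >> 30;
}
```

With these helpers, bytes_to_gb(addressable_bytes(32)) evaluates to 4 and bytes_to_gb(addressable_bytes(36)) evaluates to 64, which matches the 4GB and 64GB figures.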

All x86 editions of Windows starting from Windows 2000 are PAE capable. Unfortunately, Microsoft introduced limits that prevent a user from utilizing more than 4GB of RAM, and in XP SP2/SP3 went even further: a user with a dedicated GPU with 1GB of RAM can only utilize up to 3GB of system RAM. This is due to newly introduced physical address limits. Geoff Chappell has done a brilliant analysis of those limits.

As Geoff Chappell notes, those limits are enforced in the PAE-capable kernel (ntkrnlpa.exe / ntkrpamp.exe). In addition, the HAL and the USB 2.0 port driver (usbport.sys) assume that all physical addresses will be < 4GB.

Removing the limit(s) from the kernel without updating the HAL will result in hard drive data corruption and BSODs.

It's easy enough to identify and remove the limit(s) from the kernel (they are enforced when the kernel function ExVerifySuite() returns false).

Fixing the HAL:
Geoff Chappell notes:
"The HAL models DMA operations in terms of [..] map registers according to whether double-buffering may be needed for transfers [..] involve physical memory that is too high for the device to access".
The problem with the XP SP2/SP3 HAL is that "the modified HAL assumes for itself that double buffering is never required for any DMA operations on a device that can handle 32-bit physical addresses".

The problematic function is HalGetAdapter():
The correct way (Windows Server 2003 SP2):
if ( bScatterGather
	&& (!HalpPhysicalMemoryMayAppearAbove4GB || DevDesc->Dma64BitAddresses)
	&& (LessThan16Mb || InterfaceType == Eisa || InterfaceType == PCIBus) )
{
	MapRegistersNeeded = 0;
}
else // double buffering might be needed, map registers must be allocated
{
	..
}
				
The Windows XP SP3 way:
if ( bScatterGather
	&& (LessThan16Mb || InterfaceType == Eisa || InterfaceType == PCIBus) )
{
	MapRegistersNeeded = 0;
}
else // double buffering might be needed, map registers must be allocated
{
	..
}
				

The problem with XP SP3 is that when you have physical addresses > 4GB, map registers must be used (and their buffers allocated) for double buffering to function properly, yet the XP SP3 HAL skips them for any scatter/gather device on these buses.
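
To make the double-buffering condition concrete, here is a minimal sketch in C of the check involved. It is not HAL code: the function name needs_double_buffering and its parameters are assumptions made for this illustration, and arithmetic overflow of start + length is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FOUR_GB ((uint64_t)1 << 32)

/* Returns true when a transfer of `length` bytes starting at physical
   address `start` touches memory that a 32-bit-only device cannot reach,
   so the data has to go through a buffer located below 4GB.
   Hypothetical helper, for illustration only. */
static bool needs_double_buffering(uint64_t start, uint64_t length,
                                   bool device_is_64bit)
{
    assert(length > 0);
    if (device_is_64bit)
        return false;                     /* device can reach any address */
    return start + length - 1 >= FOUR_GB; /* last byte at or above 4GB?   */
}
```

A transfer that ends exactly at the 4GB boundary needs no intermediate buffer, while one that crosses it by a single byte does. That intermediate buffer is what the map registers provide, which is why they must be allocated up front.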

The proper fix would be to make it just like Windows Server 2003, in which map registers are always used unless one of the following is true:
1. All RAM is mapped to physical addresses < 4GB.
2. The device supports 64-bit physical addresses.
However, in all likelihood we will only apply the fixed HAL to machines where the first condition is never true.
A quick and dirty approach would be to always assume map register buffers will be needed, even when the device supports 64-bit addresses.
This might carry a negligible performance penalty compared to Windows Server 2003 (due to unnecessary allocations of map registers), but it's an easy modification to the HAL DLL.
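
The three behaviours can be compared side by side. The following is a sketch in C of my reading of the conditions above, not actual HAL code: struct dma_request and the three function names are hypothetical, and bus_ok stands in for the (LessThan16Mb || InterfaceType == Eisa || InterfaceType == PCIBus) test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the inputs HalGetAdapter() examines. */
struct dma_request {
    bool scatter_gather; /* bScatterGather                          */
    bool dma_64bit;      /* DevDesc->Dma64BitAddresses              */
    bool bus_ok;         /* LessThan16Mb || Eisa || PCIBus          */
};

/* Windows Server 2003 SP2: skip map registers only when no RAM can
   appear above 4GB or the device handles 64-bit addresses. */
static bool skips_map_registers_2003(const struct dma_request *r,
                                     bool memory_above_4gb)
{
    return r->scatter_gather
        && (!memory_above_4gb || r->dma_64bit)
        && r->bus_ok;
}

/* Windows XP SP3: the memory check is missing entirely. */
static bool skips_map_registers_xp(const struct dma_request *r)
{
    return r->scatter_gather && r->bus_ok;
}

/* Quick and dirty patch: never skip, always allocate map registers. */
static bool skips_map_registers_patched(const struct dma_request *r)
{
    assert(r != 0);
    return false;
}
```

For a 32-bit scatter/gather PCI device on a machine with RAM above 4GB, the Server 2003 logic and the patched logic both allocate map registers, while the XP SP3 logic does not. The patched variant differs from Server 2003 only for 64-bit capable devices, where it allocates map registers that are not strictly needed.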

Fixing USB 2.0:
The only quick and dirty fix I'm aware of at the moment is using usbport.sys from Server 2003 SP2.
None of the methods I tested worked well with VMware USB passthrough.

Download:
You can download the patched files here.

* The patched files can be used in one of two ways:
1. Modify C:\boot.ini to include the /kernel and /hal switches, pointing to the modified files.
2. In-place replacement of the existing ntkrnlpa.exe and hal.dll (make sure you know what you are doing; it's best to perform the replacement offline to avoid Windows File Protection issues).
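
For the first method, a boot.ini along the following lines keeps the stock entry available as a fallback. This is only a sketch: the ARC path is the common default for a single-disk install and may differ on your machine, and the file names ntkrnlpx.exe and halx.dll are hypothetical placeholders for whatever the patched files are called. The files named by /kernel and /hal are expected to be in the system32 directory.

```ini
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows XP (stock)" /noexecute=optin /fastdetect
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows XP (patched kernel and HAL)" /noexecute=optin /fastdetect /PAE /kernel=ntkrnlpx.exe /hal=halx.dll
```

Keeping both entries means a failed boot with the patched files can be recovered by simply picking the stock entry from the boot menu.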

Additional references:
Memory Limits for Windows and Windows Server Releases
Driver Basics - DMA Concepts
What is DMA (Part 3) - DMA Translation & Map Registers