Unlocking the Mystery of Keyloggers with Some Key Information

See what we did there? Didn’t even use ChatGPT for that title! Keyloggers are interesting for several reasons. For starters, they’re a useful type of malware to experiment with if you want to get into maldev. They also involve a tactile interaction between the user and the malware, since they depend on the user typing. A quick Google search will turn up plenty of keylogger proofs-of-concept, but have you ever stopped to consider how they work, or how the keyboard interacts with the computer in a way that lets you capture the keys pressed? This writeup looks to answer those two questions by following the journey of keyboard input from the hardware through to the OS.
NOTE: This writeup focuses only on keylogging on Windows.
The Keyboard
At its foundational level, the keyboard is a grid of switches that sit under the keys. When a key is pressed down, an electric current travels the grid to the keyboard’s PCB, where it is converted into a scan code. Scan codes signal which physical key was pressed. When the key is released, a break code is sent to indicate that the key is no longer being pressed. This information is then passed over USB to the OS, where the I/O drivers begin to interpret it. Pretty cool to think that when a keylogger runs, the data it collects is the result of this grid of switches!
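To make the press/release distinction concrete, here is a minimal Go sketch. It assumes the classic PC "Set 1" scan code convention, in which a key's break code is its make (press) code with the high bit, 0x80, set. The helper name decodeScanCode is hypothetical and exists only for illustration.

```go
package main

import "fmt"

// breakBit is set on a single-byte Set 1 scan code when the key is released.
// Assumption: Set 1 encoding; other scan code sets signal release differently.
const breakBit = 0x80

// decodeScanCode splits a raw scan code byte into the key's make code and a
// flag saying whether the event is a release (break) rather than a press (make).
func decodeScanCode(b byte) (makeCode byte, released bool) {
	return b &^ breakBit, b&breakBit != 0
}

func main() {
	// 0x1E is the make code of the key in the "A" position; 0x9E is its break code.
	for _, b := range []byte{0x1E, 0x9E} {
		code, released := decodeScanCode(b)
		fmt.Printf("raw=0x%02X make=0x%02X released=%v\n", b, code, released)
	}
}
```

Both bytes resolve to the same physical key (0x1E); only the high bit tells a press from a release.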
From Keyboard to Kernel Mode
How, then, does the scan code travel to the OS? The following diagram provides a visual aid for this next stage: transferring the keyboard input to the OS. NOTE: A more detailed version can be found here: Overview of Windows Components
For the sake of illustration, the keyboard in our diagram is connected to the PC via USB. When the keyboard gets plugged in, the USB hub driver (usbhub.sys) notifies the Plug and Play Manager (PnP) that a new device has been connected. In short, Plug and Play is a Microsoft feature that allows peripherals to be easily connected. The PnP receives the device descriptors from the USB driver and identifies the keyboard as a Human Interface Device (HID). After the HID has been identified, the PnP loads the correct kernel-mode drivers. Common drivers for a USB keyboard are hidusb.sys and hidclass.sys. Working in tandem with these, it also loads the keyboard-specific drivers kbdhid.sys and kbdclass.sys. For those curious, you can view these drivers within Device Manager.
The driver kbdclass.sys is one of the main components in capturing the entered keyboard data. Working with the I/O Manager, it routes the scan codes up to the next stage of processing: the Win32k.sys subsystem.
The Win32k.sys subsystem operates in kernel mode and serves as the main driver responsible for managing and interpreting keyboard input. Here, the scan code is translated into what is known as a virtual-key code, using a keyboard layout DLL. Scan codes are generally the same across keyboards because they identify a physical key position on the grid rather than a character. Virtual-key codes, however, can depend on the OS and on the keyboard layout DLL in use.
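The layout dependence can be illustrated with a small Go sketch. The two maps below are tiny hypothetical stand-ins for the tables a real keyboard layout DLL provides, covering only two keys. The sketch assumes Set 1 scan codes (0x10 and 0x1E) and the documented virtual-key codes for A (0x41) and Q (0x51).

```go
package main

import "fmt"

// layout maps scan codes to virtual-key codes. It is an illustrative
// stand-in for a keyboard layout DLL, not a real Windows structure.
type layout map[uint16]uint16

const (
	vkA = 0x41 // virtual-key code for the A key
	vkQ = 0x51 // virtual-key code for the Q key
)

var (
	usQwerty = layout{0x10: vkQ, 0x1E: vkA}
	frAzerty = layout{0x10: vkA, 0x1E: vkQ} // A and Q swap positions on AZERTY
)

// toVirtualKey looks up the virtual-key code for a scan code under a layout.
// The second return value is false when the layout has no entry for the key.
func toVirtualKey(l layout, scanCode uint16) (uint16, bool) {
	vk, ok := l[scanCode]
	return vk, ok
}

func main() {
	const scanCode = 0x10 // first letter key on the top letter row
	layouts := []struct {
		name string
		l    layout
	}{{"US QWERTY", usQwerty}, {"French AZERTY", frAzerty}}

	for _, e := range layouts {
		if vk, ok := toVirtualKey(e.l, scanCode); ok {
			fmt.Printf("%-13s scan=0x%02X -> vk=0x%02X (%c)\n", e.name, scanCode, vk, rune(vk))
		}
	}
}
```

The same physical key (scan code 0x10) yields a different virtual-key code depending on which layout is active.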
From Kernel-Mode to User-Mode
Once the virtual-key code has been determined, the keyboard input is posted to the message queue as a message such as WM_KEYDOWN or WM_KEYUP. You might be thinking, “What if I have Teams and Notepad open at the same time? How does the system know which one I am typing into?” Great question! Windows uses User32.dll to determine which window should receive the keyboard input by tracking the keyboard focus. The keyboard focus is queried and set with two functions declared in winuser.h: GetFocus and SetFocus.
Also declared in winuser.h is GetAsyncKeyState, a function that applications use to determine the current state of a key. It is fairly straightforward. Given a virtual-key code, GetAsyncKeyState returns a 16-bit value indicating whether the key is currently down (most significant bit) and whether it has been pressed since the previous call (least significant bit).
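As a rough sketch of how that 16-bit value is read: per Microsoft's documentation, the most significant bit means the key is down right now, and the least significant bit means it was pressed since the previous call (the documentation cautions against relying too heavily on the latter). The decodeKeyState helper and keyState type are hypothetical names, and the return value is modeled as a uint16 for simplicity.

```go
package main

import "fmt"

const (
	keyDownMask    = 0x8000 // most significant bit: key is down right now
	keyPressedMask = 0x0001 // least significant bit: pressed since the previous call
)

// keyState is a decoded view of a GetAsyncKeyState return value.
type keyState struct {
	Down                 bool
	PressedSinceLastCall bool
}

// decodeKeyState interprets the two documented bits of the return value.
func decodeKeyState(ret uint16) keyState {
	return keyState{
		Down:                 ret&keyDownMask != 0,
		PressedSinceLastCall: ret&keyPressedMask != 0,
	}
}

func main() {
	for _, ret := range []uint16{0x0000, 0x0001, 0x8000, 0x8001} {
		fmt.Printf("0x%04X -> %+v\n", ret, decodeKeyState(ret))
	}
}
```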
As Microsoft’s documentation shows, keyboard messages are fed into a continuous message loop that constantly processes keyboard activity. From there, each keyboard message is passed to the window procedure, where it is handled by the application in use and the result is displayed on screen.
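To show what a window procedure actually receives, here is a hedged Go sketch that decodes a keystroke message. It relies on the documented layout of WM_KEYDOWN and WM_KEYUP: wParam carries the virtual-key code, and lParam packs the repeat count (bits 0-15), the scan code (bits 16-23), the extended-key flag (bit 24), the previous key state (bit 30), and the transition state (bit 31). The keyMessage and keyInfo types and the decodeKeyMessage function are hypothetical, and no real message loop is involved.

```go
package main

import "fmt"

const (
	wmKeyDown = 0x0100 // WM_KEYDOWN
	wmKeyUp   = 0x0101 // WM_KEYUP
)

// keyMessage mimics the fields of a window message that matter here.
type keyMessage struct {
	Msg    uint32
	WParam uintptr
	LParam uintptr
}

// keyInfo is the decoded content of a keystroke message.
type keyInfo struct {
	VirtualKey  uint16
	ScanCode    uint8
	RepeatCount uint16
	Extended    bool // bit 24: extended key (for example, right-hand Ctrl)
	WasDown     bool // bit 30: key was already down before this message
	Released    bool // bit 31: key is being released
}

// decodeKeyMessage unpacks wParam and lParam; ok is false for other messages.
func decodeKeyMessage(m keyMessage) (info keyInfo, ok bool) {
	if m.Msg != wmKeyDown && m.Msg != wmKeyUp {
		return keyInfo{}, false
	}
	return keyInfo{
		VirtualKey:  uint16(m.WParam),
		RepeatCount: uint16(m.LParam & 0xFFFF),
		ScanCode:    uint8((m.LParam >> 16) & 0xFF),
		Extended:    m.LParam&(1<<24) != 0,
		WasDown:     m.LParam&(1<<30) != 0,
		Released:    m.LParam&(1<<31) != 0,
	}, true
}

func main() {
	// Pressing, then releasing, the A key (virtual-key 0x41, scan code 0x1E).
	msgs := []keyMessage{
		{Msg: wmKeyDown, WParam: 0x41, LParam: 0x001E0001},
		{Msg: wmKeyUp, WParam: 0x41, LParam: 0xC01E0001},
	}
	for _, m := range msgs {
		if info, ok := decodeKeyMessage(m); ok {
			fmt.Printf("msg=0x%04X %+v\n", m.Msg, info)
		}
	}
}
```

Note that the scan code survives all the way up to user mode inside lParam, alongside the virtual-key code.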
User-Mode to Malware
When we say the above was a quick introduction to the internal workings of a keyboard, we genuinely mean it. We could have gone deeper on Raw Input Threads, additional DLLs, etc. However, for the sake of attention spans everywhere, we wanted to keep it as concise as possible.
Now, the fun part – building a basic keylogger.
For this keylogger we will be using Go and running it on Windows 11. The keylogger we build will only display the scan code and virtual-key code, to give you a foundation to expand on.
NOTE: To better understand how Go handles and interacts with Windows, we recommend Black Hat Go (specifically Chapter 12), as mentioned in the resources.
Looking at the above code, you should recognize some elements we discussed in the previous sections, but let’s walk through it in chunks.
First, the following constants and variables are defined:
– KEY_STATE_PRESSED: set to 0x01 to match the least significant bit of the GetAsyncKeyState output
– TRUE: to keep in line with C code
– MAPVK_VK_TO_VSC: tells Windows to convert the virtual-key code to a scan code.
– User32: performs the syscall to lazily load in user32.dll (again, Black Hat Go for more information)
– procGetAsyncKey: utilizes GetAsyncKeyState to check whether a key is being pressed
– procMapVirtual: utilizes MapVirtualKeyW to convert virtual-key codes into scan codes.
Next, we define two functions: GetAsyncKeyState and MapVirtualKey. These two functions are our application’s entry points into the Windows subsystem.
Our main function uses our fmt and time imports and is the heart of the application, connecting our previously defined variables, constants, and functions. To help break it down, let’s highlight a few components:
– Next, the code looks at the state of the key. If the key has been pressed, its scan code is obtained by mapping the virtual-key code with MAPVK_VK_TO_VSC.
– “fmt.Printf” is used to print our output to the terminal when the application is running.
– Without Sleep, the loop would eat up our CPU. “time.Sleep(40 * time.Millisecond)” pauses the loop so that the key states are re-checked every 40 milliseconds, often enough that a key is not missed, without using up our CPU.
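The polling logic described in the bullets above can be sketched as follows. This is a portable illustration, not the listing from this writeup: the keyboard interface, the keyEvent type, pollOnce, and the fakeKeyboard stand-in are all hypothetical names introduced so the loop can run on any OS. On Windows, the two interface methods would instead wrap the GetAsyncKeyState and MapVirtualKeyW procedures loaded lazily from user32.dll, as the walkthrough describes.

```go
package main

import (
	"fmt"
	"time"
)

const (
	keyStatePressed = 0x0001 // low bit of GetAsyncKeyState: pressed since last poll
	mapVKToVSC      = 0      // MAPVK_VK_TO_VSC: virtual-key code -> scan code
)

// keyboard abstracts the two user32.dll calls (assumption: an interface is
// used here only to keep the sketch runnable without Windows).
type keyboard interface {
	GetAsyncKeyState(vk int) uint16
	MapVirtualKey(code, mapType uint32) uint32
}

// keyEvent pairs a virtual-key code with the scan code it maps to.
type keyEvent struct{ VirtualKey, ScanCode uint32 }

// pollOnce checks every virtual-key code (0x01 to 0xFE) once and returns an
// event for each key whose "pressed" bit is set.
func pollOnce(kb keyboard) []keyEvent {
	var events []keyEvent
	for vk := 0x01; vk <= 0xFE; vk++ {
		if kb.GetAsyncKeyState(vk)&keyStatePressed != 0 {
			sc := kb.MapVirtualKey(uint32(vk), mapVKToVSC)
			events = append(events, keyEvent{VirtualKey: uint32(vk), ScanCode: sc})
		}
	}
	return events
}

// fakeKeyboard simulates key presses so the loop can be exercised anywhere.
type fakeKeyboard struct {
	pending   map[int]bool      // keys pressed since the last poll
	scanCodes map[uint32]uint32 // virtual-key code -> scan code
}

func (f *fakeKeyboard) GetAsyncKeyState(vk int) uint16 {
	if f.pending[vk] {
		delete(f.pending, vk) // reading consumes the "pressed since last call" bit
		return keyStatePressed
	}
	return 0
}

func (f *fakeKeyboard) MapVirtualKey(code, mapType uint32) uint32 {
	return f.scanCodes[code]
}

func main() {
	kb := &fakeKeyboard{
		pending:   map[int]bool{0x41: true},
		scanCodes: map[uint32]uint32{0x41: 0x1E},
	}
	for i := 0; i < 3; i++ { // a real logger would loop until stopped
		for _, ev := range pollOnce(kb) {
			fmt.Printf("virtual-key: 0x%02X scan code: 0x%02X\n", ev.VirtualKey, ev.ScanCode)
		}
		time.Sleep(40 * time.Millisecond) // avoid pegging the CPU between polls
	}
}
```

Keeping the Windows calls behind an interface is a design choice for this sketch only: it lets the loop be tested with simulated input, and the real bindings can be dropped in without changing pollOnce.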
Upon execution, our Go program captures the keyboard input and outputs it as it should:
Naturally, on a Red Team engagement you would want to capture key input in its ASCII form but this walkthrough was meant to give you a deeper understanding of how keyboards and keyloggers work at their foundational level. With this information, you can better understand keylogger malware writeups or possibly build out your own with more functionality! Either way, keep hacking the planet.
Resources
– Keyboard Input Overview
– USB Device Class Drivers in Windows
– Microsoft Virtual-Key Codes
– Get-Focus Function
– Windows Internals pt 1, 7th Edition. Pavel Yosifovich
– “Anti-Keylogging with Noise”, POC||GTFO Volume III. Mike Myers
– Code. Charles Petzold
– Chapter 12, “Windows System Interaction and Analysis”, Black Hat Go. Tom Steele, Chris Patten, and Dan Kottmann
Dean Clinton
Dean Clinton is a penetration tester who chose to walk the way of the Mandalore which is why he can’t remove his helmet. Just kidding. He’s actually a normal dude who loves computers, hacking, and learning how things work.