Professional Diary: June 2012

Thread Synchronization with Kernel Objects

Event Kernel Objects

When a manual-reset event is signaled, all threads waiting on the event become schedulable. When an auto-reset event is signaled, only one of the threads waiting on the event becomes schedulable.

Waitable Timer Kernel Objects

Semaphore Kernel Objects

They contain a usage count, as all kernel objects do, but they also contain two additional signed 32-bit values: a maximum resource count and a current resource count.

The rules for a semaphore are as follows

· If the current resource count is greater than 0, the semaphore is signaled.

· If the current resource count is 0, the semaphore is nonsignaled.

· The system never allows the current resource count to be negative.

· The current resource count can never be greater than the maximum resource count.

When you use a semaphore, do not confuse the semaphore object's usage count with its current resource count.

Test-and-set operation is done atomically; that is, when you request a resource from a semaphore, the operating system checks whether the resource is available and decrements the count of available resources without letting another thread interfere

There is just no way to get the current resource count of a semaphore without altering it.

Mutex Kernel Objects

A mutex object contains a usage count, a thread ID, and a recursion counter.

The rules for a mutex are as follows:

· If the thread ID is 0 (an invalid thread ID), the mutex is not owned by any thread and is signaled.

· If the thread ID is nonzero, a thread owns the mutex and the mutex is nonsignaled

· Unlike all the other kernel objects, mutexes have special code in the operating system that allows them to violate the normal rules.

o The system checks to see whether the thread attempting to acquire the mutex has the same thread ID as recorded inside the mutex object. If the thread IDs match, the system allows the thread to remain schedulable—even though the mutex was nonsignaled.

o When a mutex becomes abandoned, the system automatically resets the mutex object's thread ID to 0 and its recursion counter to 0.

Other Thread Synchronization Functions

WaitForInputIdle

This function waits until the process identified by hProcess has no input pending in the thread that created the application's first window.

SignalObjectAndWait

The SignalObjectAndWait function signals a kernel object and waits on another kernel object in a single atomic operation:

Thread Pooling

The new thread pooling functions let you do the following:

· Call functions asynchronously

· Call functions at timed intervals

· Call functions when single kernel objects become signaled

· Call functions when asynchronous I/O requests complete

To accomplish these tasks, the thread pool consists of four separate components. Table 11-1 shows the components and describes the rules that govern their behavior.

Table 11-1. Thread pool components and their behavior

Component
	Timer	Wait	I/O	Non-I/O
Initial Number of Threads	Always 1	1	0	0
When a Thread Is Created	When first thread pool timer function is called	One thread for every 63 registered objects	The system uses heuristics, but here are some factors that affect the creation of a thread: Some time (in seconds) has passed since the thread was added The WT_EXECUTELONGFUNCTION flag is used The number of queued work items exceeds a certain threshold
When a Thread is Destroyed	When a process terminates	When the number of registered wait objects is 0	When the thread has no pending I/O requests and has been idle for a threshold period (about a minute)	When the thread is idle for a threshold period (about a minute)
How a Thread Waits	Alertable	WaitForMultipleObjectsEx	Alertable	GetQueuedCompletionStatus
What Wakes Up a Thread	Waitable timer is signaled queuing a user APC	Kernal object becomes signaled	Queued user APC or completed I/O request	Posted completion status or completed I/O request (The completion port allows at most 2 * number of CPUs threads to run concurrently)

Windows Memory Architecture

How a Virtual Address Space Is Partitioned

· NULL-Pointer Assignment

· DOS/16-bit Windows Application Compatibility

· User-Mode

· 64-KB Off-Limits (Windows 2000 Only)

· Shared Memory-Mapped File (MMF)(only Windows 98)

· Kernel-Mode

Using the /3GB switch reduces the number of threads, stacks, and other resources that the system can create. In addition, the system can only use a maximum of 16 GB of RAM vs. the normal maximum of 64 GB because there isn't enough virtual address space in kernel mode to manage the additional RAM.

when you invoke an application, the system opens the application's .exe file and determines the size of the application's code and data. Then the system reserves a region of address space and notes that the physical storage associated with this region is the .exe file itself. That's right—instead of allocating space from the paging file, the system uses the actual contents, or image, of the .exe file as the program's reserved region of address space. This, of course, makes loading an application very fast and allows the size of the paging file to remain small.

Protection Attributes

you should not pass either PAGE_WRITECOPY or PAGE_ EXECUTE_WRITECOPY when you are reserving address space or committing physical storage using the VirtualAlloc function. Doing so will cause the call to VirtualAlloc to fail; calling GetLastError returns ERROR_ INVALID_PARAMETER. These two attributes are used by the operating system when it maps .exe and DLL file images.

Special Access Protection Attribute Flags

· PAGE_NOCACHE : it exists mostly for hardware device driver developers who need to manipulate memory buffers

· PAGE_WRITECOMBINE : is also for device driver developers. It allows multiple writes to a single device to be combined together in order to improve performance

· PAGE_GUARD : allows an application to receive a notification (via an exception) when a byte on a page has been written to.

The Importance of Data Alignment

you can still tell the system to silently correct misaligned data accesses for all threads in your process by having one of your process's threads call the SetErrorMode (SEM_NOALIGNMENTFAULTEXCEPT) function.

Changing this flag affects all threads contained within the process that owns the thread that makes the call.

You should also note that a process's error mode flags are inherited by any child processes. Therefore, you might want to temporarily reset this flag before calling the CreateProcess function (although you usually don't do this).

For x86 systems, this flag is always on and cannot be turned off. For Alpha systems, you can turn this flag off only if the EnableAlignmentFaultExceptions registry value is set to 1.

Microsoft's C/C++ compiler for the Alpha supports a special keyword called _ _unaligned

You use the _ _unaligned modifier just as you would use the const or volatile modifiers except that the _ _unaligned modifier is only meaningful when applied to pointer variables. When you access data via an unaligned pointer, the compiler generates code that assumes that the data is not aligned properly and adds the additional CPU instructions necessary to access the data.

the _ _unaligned keyword is not supported by the x86 version of the Visual C/C++ compiler.

So, if you are trying to create a single source code base for your application, you'll want to use the UNALIGNED macro instead of the _ _unaligned keyword.

Exploring Virtual Memory

System Information

The GetSystemInfo function retrieves the values relevant to the host machine.

Virtual Memory Status

A Windows function called GlobalMemoryStatus retrieves dynamic information about the current state of memory

If you anticipate that your application will run on machines with more than 4 GB of RAM or if the total swap file size might be larger than 4 GB, you may want to use the new GlobalMemoryStatusEx function:

Determining the State of an Address Space

function is called VirtualQuery

Windows also offers a function that allows one process to query memory information about another process: VirtualQueryEx

Using Virtual Memory in Your Own Applications

Windows offers three mechanisms for manipulating memory:

· Virtual memory, which is best for managing large arrays of objects or structures

· Memory-mapped files, which are best for managing large streams of data (usually from files) and for sharing data between multiple processes running on a single machine

· Heaps, which are best for managing large numbers of small objects

Reserving a Region in an Address Space

You reserve a region in your process's address space by calling VirtualAlloc:

System rounds that address down to a multiple of 64 KB

You cannot use the protection attribute flags PAGE_GUARD, PAGE_NOCACHE, or PAGE_WRITECOMBINE when reserving regions—they can be used only with committed storage.

Committing Storage in a Reserved Region

you can specify a different protection attribute than the ones used at the time of reserving

Decommitting Physical Storage and Releasing a Region

Call the VirtualFree function

When releasing a region, you must release all the address space that was reserved by the region.

You must pass 0 for the dwSize parameter or the call to VirtualFree will fail.

Changing Protection Attributes

You can alter the protection rights of a page of memory by calling VirtualProtect:

flNewProtect can represent any one of the PAGE_* protection attribute identifiers except for PAGE_WRITECOPY and PAGE_EXECUTE_WRITECOPY.

Address Windowing Extensions (Windows 2000 only)

Microsoft had two goals when creating AWE:

· Allow applications to allocate RAM that is never swapped by the operating system to or from disk.

· Allow an application to access more RAM than fits within the process's address space.

One limitation of AWE is that all storage mapped to the address window must be readable and writable.

A Thread's Stack

By default, the system reserves 1 MB of address space and commits 2 pages of storage.

Memory-Mapped Files

Injecting a DLL Using Remote Threads

Injecting a DLL with a Trojan DLL

Injecting a DLL as a Debugger

Injecting Code with CreateProcess

Termination Handlers

With SEH, you can separate the main job from the error-handling chores.

Don't confuse structured exception handling with C++ exception handling. C++ exception handling is a different form of exception handling, a form that makes use of the C++ keywords catch and throw. Microsoft's Visual C++ also supports C++ exception handling and is implemented internally by taking advantage of the structured exception handling capabilities already present in the compiler and in Windows operating systems.

SEH really consists of two main capabilities: termination handling and exception handling.

A termination handler guarantees that a block of code (the termination handler) will be called and executed regardless of how another section of code (the guarded body) is exited.

The _ _try and _ _finally keywords delineate the two sections of the termination handler.

A call to ExitThread or ExitProcess will immediately terminate the thread or process without executing any of the code in a finally block.

Code in a finally block always starts executing as a result of one of these three situations

· Normal flow of control from the try block into the finally block

· Local unwind: premature exit from the try block (goto, longjump, continue, break, return, and so on) forcing control to the finally block

· a global unwind—occurred without explicit identification

To determine which of the three possibilities caused the finally block to execute, you can call the intrinsic function AbnormalTermination:

This intrinsic function can be called only from inside a finally block and returns a Boolean value indicating whether the try block associated with the finally block was exited prematurely.

It is impossible to determine whether a finally block is executing because of a global or a local unwind

Exception Handlers and Software Exceptions

Whenever you create a try block, it must be followed by either a finally block or an except block. A try block can't have both a finally block and an except block, and a try block can't have multiple finally or except blocks. However, it is possible to nest try-finally blocks inside try-except blocks and vice versa.

Unlike termination handlers (discussed in the previous chapter), exception filters and exception handlers are executed directly by the operating system—the compiler has little to do with evaluating exception filters or executing exception handlers.

When this exception is raised, the system will locate the beginning of the except block and evaluate the exception filter expression, an expression that must evaluate to one of the following three identifiers

- EXCEPTION_EXECUTE_HANDLER

- EXCEPTION_CONTINUE_SEARCH

- EXCEPTION_CONTINUE_EXECUTION

EXCEPTION_EXECUTE_HANDLER

The system performs a global unwind,jumps to the code inside the except block. After the code inside the except block finishes executing, control resumes at the first instruction after the except block.

Global Unwinds

When an exception filter evaluates to EXCEPTION_EXECUTE_HANDLER, the system must perform a global unwind. The global unwind causes all of the outstanding try-finally blocks that started executing below the try-except block that handles the exception to resume execution.

Halting Global Unwinds

It's possible to stop the system from completing a global unwind by putting a return statement inside a finally block. It means that the code inside exception block is not executed.

EXCEPTION_CONTINUE_EXECUTION

When the system sees EXCEPTION_CONTINUE_EXECUTION, it jumps back to the instruction that generated the exception and tries to execute it again.

EXCEPTION_CONTINUE_SEARCH

This identifier tells the system to walk up to the previous try block that's matched with an except block and call this previous try block's exception filter.

Any try blocks that are matched with finally blocks instead of except blocks are skipped by the system while it walks up the chain. The reason for this should be pretty obvious: finally blocks don't have exception filters and therefore give the system nothing to evaluate.

GetExceptionCode

The GetExceptionCode intrinsic function can be called only in an exception filter (between the parentheses following _ _except) or inside an exception handler.

You cannot call GetExceptionCode from inside an exception filter function.

GetExceptionInformation

When an exception occurs, the operating system pushes the following three structures on the stack of the thread that raised the exception:

- EXCEPTION_RECORD structure

- CONTEXT structure

- EXCEPTION_POINTERS structure

GetExceptionInformation can be called only in an exception filter. Once control has been transferred to the exception handler, the data on the stack is destroyed.

Software Exceptions

You simply call RaiseException to raise a software exception.

Unfortunately, most developers do not get into the habit of using exceptions for error handling.

There are two basic reasons for this. The first reason is that most developers are unfamiliar with SEH. Even if one developer is acquainted with it, other developers might not be. If one developer writes a function that raises an exception but other developers don't write SEH frames to trap the exception, the process will be terminated by the operating system.

The second reason why developers avoid SEH is that it is not portable to other operating systems. Many companies target multiple operating systems and would like to have a single source code base for their products, which is certainly understandable. SEH is a Windows-specific technology.

Unhandled Exceptions and C++ Exceptions

Windows offers another function, SetUnhandledExceptionFilter, which allows you to wrap all your thread functions in an SEH frame:

The UnhandledExceptionFilter function is a fully documented Windows function that you can call directly from within your own code.

C++ Exceptions Versus Structured Exceptions

If you're writing a C++ application, you should use C++ exceptions instead of structured exceptions. The reason is that C++ exceptions are part of the language and therefore the compiler knows what a C++ class object is. This means that the compiler automatically generates code to call C++ object destructors in order to guarantee object cleanup.

C++ exceptions can never be re-executed, and it would be an error for a filter diagnosing a C++ exception to return EXCEPTION_CONTINUE_EXECUTION.

Catching Structured Exceptions with C++

Window Messaging

The thread that creates a window must be the thread that handles all messages for the window.

This also means that every thread that creates at least one window is assigned a message queue by the system.

A Thread's Message Queue

When a thread is first created, the system assumes that the thread will not be used for any user interface_related tasks. This reduces the system resources required by the thread. However, as soon as the thread calls a graphical UI-related function (such as checking its message queue or creating a window) the system automatically allocates some additional resources for the thread so that it can perform its UI-related tasks. Specifically, the system allocates a THREADINFO structure and associates this data structure with the thread.

The THREADINFO structure is an internal, undocumented data structure that identifies the thread's posted-message queue, send-message queue, reply- message queue, virtualized-input queue, and wake flags, as well as a number of variables that are used for the thread's local input state.

Posting Messages to a Thread's Message Queue

When a thread calls PostMessage function, the system determines which thread created the window identified by the hwnd parameter.

You can determine which thread created a window by calling GetWindowThreadProcessId

PostQuitMessage doesn't really post a message to any of the THREADINFO structure's queues. Internally PostQuitMessage just turns on the QS_QUIT wake flag (which I'll discuss later) and sets the nExitCode member of the THREADINFO structure.

Sending Messages to a Window

Here is how SendMessage works. If the thread calling SendMessage is sending a message to a window created by the same thread, SendMessage is simple: it just calls the specified window's window procedure as a subroutine. When the window procedure is finished processing the message, it returns a value back to SendMessage. SendMessage returns this value to the calling thread.

However, if a thread is sending a message to a window created by another thread, the internal workings of SendMessage are far more complicated.

Your thread is suspended while the other thread is processing the message. The system must perform the actions discussed next.

First, the sent message is appended to the receiving thread's send-message queue, which has the effect of setting the QS_SENDMESSAGE flag for that thread.

Second, if the receiving thread is already processing a message, system will not interrupt it. And when it is waiting for messages, the system first checks to see whether the QS_SENDMESSAGE wake flag is set, and if it is, the system scans the list of messages in the send-message queue to find the first sent message. It is possible that several sent messages could pile up in this queue.

While the receiving thread is processing the message, the thread that called SendMessage is sitting idle, waiting for a message to appear in its reply-message queue.

SendNotifyMessage places a message in the send-message queue of the receiving thread and returns to the calling thread immediately.

SendNotifyMessage differs from PostMessage in two ways.

1. Sent message has higher priority than the posted message.

2. It works same as SendMessage if the Window is created by the calling thread.

ReplyMessage is called by the thread receiving the window message. This is to inform the system that Window has completed enough to know the result of the message processing. This needs to be called by the Window Procedure.

ReplyMessage does nothing if you call it while processing a message sent from the same thread.

The function InSendMessage returns TRUE if the thread is processing an interthread sent message and FALSE if it is processing an intrathread sent or posted message.

Waking a Thread

When a thread is running, it can query the status of its queues by calling the DWORD GetQueueStatus(UINT fuFlags) function:

the fuFlags parameter tells GetQueueStatus the types of messages to check for in the queues and result can be found in can be found in the high word of the return value. The low word indicates the types of messages that have been added to the queue and that haven't been processed since the last call to GetQueueStatus, GetMessage, or PeekMessage.

The Algorithm for Extracting Messages from a Thread's Queue

When a thread calls GetMessage or PeekMessage, the system examines the state of thread’s queue status flags and determine which message should be processed.

Professional Diary

Wednesday, 27 June 2012

Data Structure and Algorithms

Monday, 4 June 2012

Programming Applications for Microsoft Windows