[cdac] ExecutionManager contract and RangeSectionMap lookup unit tests (dotnet#108685)

RangeSectionMap is a data structure for coarse-grained lookups that maps native code pointers to managed method descriptors and covers the entire addressable memory.

This PR adds unit tests for the cdac reader RangeSectionMap lookup algorithm and some unit tests for the ExecutionManager contract APIs that use a RangeSectionMap and NibbleMap together to do a full TargetCodePointer->MethodDesc lookup.

Contributes to dotnet#108553

* [cdac] ExecutionManager contract and lookup map tests

*  add RangeSectionMap docs

*  exhaustively test RangeSectionMap.GetIndexForLevel

*  fix lookup in RangeSection.Find

    RangeSectionLookupAlgorithm.FindFragment finds the slot containing the range section fragment pointer.
    Have to dereference it once to get the actual RangeSectionFragment pointer from the slot.

    This "worked" before because RangeSectionFragment.Next is at offset 0, so the first lookup would have a garbage range, so we would follow the "next" field and get to the correct fragment

*   make a testable RangeSectionMap.FindFragmentInternal

*   brief nibble map summary

*   [cdac] Implement NibbleMap lookup and tests

    The execution manager uses a nibble map to quickly map program counter
    pointers to the beginnings of the native code for the managed method.

    Implement the lookup algorithm for a nibble map.

    Start adding unit tests for the nibble map

    Also for testing in MockMemorySpace simplify ReaderContext, there's nothing special about the descriptor HeapFragments anymore.  We can use a uniform reader.

*   NibbleMap: fix bug and add test

    Previously we incorrectly computed the prior map index when doing the backward linear search

*   [testing] display Target values in hex in debugger views

*   MockMemorySpace: simplify ReaderContext

    there's nothing special about the descriptor HeapFragments anymore.  We can use a uniform reader

*   refactor ExecutionManager

*   ExecutionManager contract

    the presence of RangeSection.R2RModule is a discriminator for whether we're looking at EE code or R2R code

* use better cdac_data friends

* markdown cleanup

* document the ExecutionManager methods

* cache CodeBlock based on given code pointer, not start of the method

   The CodeBlock includes the relative offset (given ip - start of method) so it's not ok to share for different code pointers into the same method

* Make TestPlaceholderTarget data cache more useful; make TestRegistry lazy

* bugfix: StubCodeBlockLast is uint8

* add a simple bump allocator to MockMemorySpace

* Add a field layout algorithm to TargetTestHelpers

* move all the test builders to a separate class

* cleanup test infra

* remove a few more magic constants

* EECodeInfo -> CodeBlock

* EffectiveBitsForLevel -> GetIndexForLevel

* add documentation to managed contract impl

* RSLATestTarget -> RSMTestTarget

---------

Co-authored-by: Elinor Fung <[email protected]>
2 people authored and mikelle-rogers committed Dec 4, 2024
1 parent b48d530 commit 3475298
Showing 28 changed files with 1,854 additions and 121 deletions.
179 changes: 175 additions & 4 deletions docs/design/datacontracts/ExecutionManager.md
@@ -6,11 +6,183 @@ managed method corresponding to that address.

## APIs of contract

**TODO**
```csharp
struct CodeBlockHandle
{
    public readonly TargetPointer Address;
    // no public constructor
    internal CodeBlockHandle(TargetPointer address) => Address = address;
}
```

```csharp
// Collect execution engine info for a code block that includes the given instruction pointer.
// Return a handle for the information, or null if an owning code block cannot be found.
CodeBlockHandle? GetCodeBlockHandle(TargetCodePointer ip);
// Get the method descriptor corresponding to the given code block
TargetPointer GetMethodDesc(CodeBlockHandle codeInfoHandle);
// Get the instruction pointer address of the start of the code block
TargetCodePointer GetStartAddress(CodeBlockHandle codeInfoHandle);
```
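
As a rough illustration of how a consumer might call these APIs, here is a minimal, self-contained sketch. Every type in it is a simplified stand-in: `FakeExecutionManager` and `ExecutionManagerUsage` are hypothetical names invented for this example, and the stand-in `TargetPointer`, `TargetCodePointer` and `CodeBlockHandle` structs only mimic the shape of the real cdac types.

```csharp
#nullable enable
using System;

// Simplified stand-ins for the cdac types (assumption: the real types carry more behavior).
public readonly struct TargetPointer { public readonly ulong Value; public TargetPointer(ulong v) => Value = v; }
public readonly struct TargetCodePointer { public readonly ulong Value; public TargetCodePointer(ulong v) => Value = v; }
public readonly struct CodeBlockHandle { public readonly TargetPointer Address; public CodeBlockHandle(TargetPointer a) => Address = a; }

public interface IExecutionManager
{
    CodeBlockHandle? GetCodeBlockHandle(TargetCodePointer ip);
    TargetPointer GetMethodDesc(CodeBlockHandle codeInfoHandle);
    TargetCodePointer GetStartAddress(CodeBlockHandle codeInfoHandle);
}

// Hypothetical fake that knows about exactly one method occupying [start, end).
public sealed class FakeExecutionManager : IExecutionManager
{
    private readonly ulong _start, _end, _methodDesc;
    public FakeExecutionManager(ulong start, ulong end, ulong methodDesc)
    {
        _start = start; _end = end; _methodDesc = methodDesc;
    }

    public CodeBlockHandle? GetCodeBlockHandle(TargetCodePointer ip)
    {
        if (ip.Value < _start || ip.Value >= _end)
            return null; // no owning code block
        return new CodeBlockHandle(new TargetPointer(ip.Value));
    }

    public TargetPointer GetMethodDesc(CodeBlockHandle codeInfoHandle) => new TargetPointer(_methodDesc);
    public TargetCodePointer GetStartAddress(CodeBlockHandle codeInfoHandle) => new TargetCodePointer(_start);
}

public static class ExecutionManagerUsage
{
    // Describes the method that owns ip, or returns null when ip is not inside a known code block.
    public static string? Describe(IExecutionManager em, TargetCodePointer ip)
    {
        CodeBlockHandle? handle = em.GetCodeBlockHandle(ip);
        if (handle is null)
            return null;
        TargetPointer methodDesc = em.GetMethodDesc(handle.Value);
        TargetCodePointer start = em.GetStartAddress(handle.Value);
        return $"MethodDesc 0x{methodDesc.Value:x}, code start 0x{start.Value:x}";
    }
}
```

The point of the sketch is the calling pattern: obtain a handle first, check it for null, and only then ask for the method descriptor or start address.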

## Version 1

**TODO** Methods
The execution manager uses two data structures to map the entire target address space to native executable code.
The range section map is used to partition the address space into large chunks which point to range section fragments. Each chunk is relatively large. If there is any executable code in the chunk, the chunk will contain one or more range section fragments that cover subsets of the chunk. Conversely, if a massive method is JITed, a single range section fragment may span multiple adjacent chunks.

Within a range section fragment, a nibble map structure is used to map arbitrary IP addresses back to the start of the method (and to the code header which immediately precedes the entrypoint to the code).

Data descriptors used:
| Data Descriptor Name | Field | Meaning |
| --- | --- | --- |
| RangeSectionMap | TopLevelData | pointer to the outermost RangeSection |
| RangeSectionFragment| ? | ? |
| RangeSection | ? | ? |
| RealCodeHeader | ? | ? |
| HeapList | ? | ? |



Global variables used:
| Global Name | Type | Purpose |
| --- | --- | --- |
| ExecutionManagerCodeRangeMapAddress | TargetPointer | Pointer to the global RangeSectionMap |
| StubCodeBlockLast | uint8 | Maximum sentinel code header value identifying a stub code block |

Contracts used:
| Contract Name |
| --- |

The bulk of the work is done by the `GetCodeBlockHandle` API that maps a code pointer to information about the containing jitted method.

```csharp
private CodeBlock? GetCodeBlock(TargetCodePointer jittedCodeAddress)
{
    RangeSection range = RangeSection.Find(_topRangeSectionMap, jittedCodeAddress);
    if (range.Data == null)
    {
        return null;
    }
    JitManager jitManager = GetJitManager(range.Data);
    if (jitManager.GetMethodInfo(range, jittedCodeAddress, out CodeBlock? info))
    {
        return info;
    }
    else
    {
        return null;
    }
}

CodeBlockHandle? IExecutionManager.GetCodeBlockHandle(TargetCodePointer ip)
{
    TargetPointer key = ip.AsTargetPointer;
    if (/*cache*/.ContainsKey(key))
    {
        return new CodeBlockHandle(key);
    }
    CodeBlock? info = GetCodeBlock(ip);
    if (info == null || !info.Valid)
    {
        return null;
    }
    /*cache*/.TryAdd(key, info);
    return new CodeBlockHandle(key);
}
```

Here `RangeSection.Find` implements the range section lookup, summarized below.

There are two `JitManager`s: the "EE JitManager" for jitted code and "R2R JitManager" for ReadyToRun code.

The EE JitManager `GetMethodInfo` implements the nibble map lookup, summarized below, followed by returning the `RealCodeHeader` data:

```csharp
bool GetMethodInfo(RangeSection rangeSection, TargetCodePointer jittedCodeAddress, [NotNullWhen(true)] out CodeBlock? info)
{
    info = null; // an out parameter must be assigned on every return path
    TargetPointer start = FindMethodCode(rangeSection, jittedCodeAddress); // nibble map lookup
    if (start == TargetPointer.Null)
    {
        return false;
    }
    TargetNUInt relativeOffset = jittedCodeAddress - start;
    // the pointer to the real code header sits one pointer-sized slot before the method start
    int codeHeaderOffset = Target.PointerSize;
    TargetPointer codeHeaderIndirect = start - codeHeaderOffset;
    if (RangeSection.IsStubCodeBlock(Target, codeHeaderIndirect))
    {
        return false;
    }
    TargetPointer codeHeaderAddress = Target.ReadPointer(codeHeaderIndirect);
    Data.RealCodeHeader realCodeHeader = Target.ProcessedData.GetOrAdd<Data.RealCodeHeader>(codeHeaderAddress);
    info = new CodeBlock(jittedCodeAddress, codeHeaderOffset, relativeOffset, realCodeHeader, rangeSection.Data!.JitManager);
    return true;
}
```
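
The address arithmetic in `GetMethodInfo` can be illustrated with plain integers. The sketch below is only an illustration: `CodeHeaderArithmetic` and `Compute` are hypothetical names, and raw `ulong` values stand in for `TargetPointer` and `TargetCodePointer`.

```csharp
using System;

// Hypothetical helper; mirrors only the pointer arithmetic, not any target memory reads.
public static class CodeHeaderArithmetic
{
    // methodStart: result of the nibble map lookup; ip: the code pointer being resolved.
    public static (ulong RelativeOffset, ulong CodeHeaderSlot) Compute(ulong methodStart, ulong ip, int pointerSize)
    {
        if (ip < methodStart)
            throw new ArgumentException("ip must not precede the method start", nameof(ip));

        // Offset of ip from the first instruction of the method.
        ulong relativeOffset = ip - methodStart;

        // Address of the slot that holds the pointer to the real code header;
        // it still has to be dereferenced to reach the header itself.
        ulong codeHeaderSlot = methodStart - (ulong)pointerSize;

        return (relativeOffset, codeHeaderSlot);
    }
}
```

For example, with a method starting at 0x1000, an instruction pointer of 0x1010 and 8-byte pointers, the relative offset is 0x10 and the code header slot is at 0xFF8.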

The `CodeBlock` encapsulates the `RealCodeHeader` data from the target runtime together with the start of the jitted method:

```csharp
class CodeBlock
{
    private readonly int _codeHeaderOffset;

    public TargetCodePointer StartAddress { get; }
    // note: this is the address of the pointer to the "real code header", you need to
    // dereference it to get the address of _codeHeaderData
    public TargetPointer CodeHeaderAddress => StartAddress - _codeHeaderOffset;
    private Data.RealCodeHeader _codeHeaderData;
    public TargetPointer JitManagerAddress { get; }
    public TargetNUInt RelativeOffset { get; }

    public CodeBlock(TargetCodePointer startAddress, int codeHeaderOffset, TargetNUInt relativeOffset, Data.RealCodeHeader codeHeaderData, TargetPointer jitManagerAddress)
    {
        _codeHeaderOffset = codeHeaderOffset;
        StartAddress = startAddress;
        _codeHeaderData = codeHeaderData;
        RelativeOffset = relativeOffset;
        JitManagerAddress = jitManagerAddress;
    }

    public TargetPointer MethodDescAddress => _codeHeaderData.MethodDesc;
    public bool Valid => JitManagerAddress != TargetPointer.Null;
}
```

The remaining contract APIs extract fields of the `CodeBlock`:

```csharp
TargetPointer IExecutionManager.GetMethodDesc(CodeBlockHandle codeInfoHandle)
{
    /* find the cached CodeBlock info for codeInfoHandle.Address */
    return info.MethodDescAddress;
}

TargetCodePointer IExecutionManager.GetStartAddress(CodeBlockHandle codeInfoHandle)
{
    /* find the cached CodeBlock info for codeInfoHandle.Address */
    return info.StartAddress;
}
```

### RangeSectionMap

The range section map logically partitions the entire 32-bit or 64-bit addressable space into chunks.
The map is implemented with multiple levels, where the bits of an address are used as indices into an array of pointers. The upper levels of the map point to the next level down. At the lowest level of the map, the pointers point to the first range section fragment containing addresses in the chunk.
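
Once the lowest level yields the first fragment for a chunk, the lookup still has to find the fragment whose range contains the address. A minimal in-memory sketch of that walk follows. `FragmentSketch` and `FragmentListSketch` are hypothetical names; the fields mirror the `RangeBegin`, `RangeEndOpen` and `Next` descriptor fields, and the sketch assumes, as the name suggests, that `RangeEndOpen` is an exclusive bound. The real reader reads these fields from target memory rather than from managed objects.

```csharp
#nullable enable
using System;

// In-memory stand-in for a range section fragment (hypothetical).
public sealed class FragmentSketch
{
    public ulong RangeBegin { get; }
    public ulong RangeEndOpen { get; } // assumed exclusive
    public FragmentSketch? Next { get; }

    public FragmentSketch(ulong rangeBegin, ulong rangeEndOpen, FragmentSketch? next)
    {
        RangeBegin = rangeBegin;
        RangeEndOpen = rangeEndOpen;
        Next = next;
    }

    public bool Contains(ulong address) => address >= RangeBegin && address < RangeEndOpen;
}

public static class FragmentListSketch
{
    // Follows the Next links from the first fragment of a chunk until one contains the address.
    public static FragmentSketch? Find(FragmentSketch? first, ulong address)
    {
        for (FragmentSketch? fragment = first; fragment != null; fragment = fragment.Next)
        {
            if (fragment.Contains(address))
                return fragment;
        }
        return null; // no executable code registered at this address
    }
}
```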

On 32-bit targets a 2-level map is used:

| 31-24 | 23-16 | 15-0 |
|:----:|:----:|:----:|
| L2 | L1 | chunk |

That is, level 2 in the map has 256 entries pointing to level 1 maps (or null if there's nothing allocated). Each level 1 map has 256 entries, each covering a 64 KiB chunk and pointing to a linked list of range section fragments that fall within that 64 KiB chunk.

On 64-bit targets, we take advantage of the fact that most architectures don't support a full 64-bit addressable space: arm64 supports 52 bits of addressable memory and x86-64 supports 57 bits. The runtime ignores the top bits 63-57 and uses 5 levels of mapping:

| 63-57 | 56-49 | 48-41 | 40-33 | 32-25 | 24-17 | 16-0 |
|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:----:|
| unused | L5 | L4 | L3 | L2 | L1 | chunk |

That is, level 5 has 256 entries pointing to level 4 maps (or nothing if there's no code allocated in that address range), level 4 entries point to level 3 maps, and so on. Each level 1 map has 256 entries, each covering a 128 KiB chunk and pointing to a linked list of range section fragments that fall within that 128 KiB chunk.
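
The two bit layouts above reduce to the same shift-and-mask computation. The sketch below is a hypothetical illustration derived only from the tables, not the reader's actual `GetIndexForLevel`; the names `RangeSectionMapLayout`, `MaxLevel` and `ChunkBits` are assumptions made for this example.

```csharp
using System;

// Hypothetical model of the layouts in the tables above.
public readonly struct RangeSectionMapLayout
{
    public const int BitsPerLevel = 8; // every level has 256 entries

    public int MaxLevel { get; }  // 2 on 32-bit targets, 5 on 64-bit targets
    public int ChunkBits { get; } // 16 (64 KiB chunks) or 17 (128 KiB chunks)

    private RangeSectionMapLayout(int maxLevel, int chunkBits)
    {
        MaxLevel = maxLevel;
        ChunkBits = chunkBits;
    }

    public static RangeSectionMapLayout For32Bit => new RangeSectionMapLayout(2, 16);
    public static RangeSectionMapLayout For64Bit => new RangeSectionMapLayout(5, 17);

    // Index into the map at the given level (1 is the lowest level, MaxLevel the topmost).
    public int GetIndexForLevel(ulong address, int level)
    {
        if (level < 1 || level > MaxLevel)
            throw new ArgumentOutOfRangeException(nameof(level));

        // Skip the chunk bits, then one 8-bit group per level below the requested one.
        int shift = ChunkBits + (level - 1) * BitsPerLevel;
        return (int)((address >> shift) & 0xFF);
    }
}
```

On a 32-bit target the address 0x12345678 gives index 0x12 at level 2 and 0x34 at level 1. On a 64-bit target the mask discards bits 63-57, so two addresses that differ only in those bits land in the same level 5 entry.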

### NibbleMap

@@ -40,8 +212,7 @@ Suppose there is code starting at address 304 (0x130)

* Then the map index will be 304 / 32 = 9 and the byte offset will be 304 % 32 = 16
* Because addresses are 4-byte aligned, the nibble value will be 1 + 16 / 4 = 5 (we reserve 0 to mean no method).
* So the map unit containing index 9 will contain the value 0x5 << 24 (the map index 9 means we want the second nibble in the second map unit, and we number the nibbles starting from the most significant) , or
0x05000000
* So the map unit containing index 9 will contain the value 0x5 << 24 (the map index 9 means we want the second nibble in the second map unit, and we number the nibbles starting from the most significant) , or 0x05000000
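
The arithmetic of this worked example can be written out as a short sketch. `NibbleMapExample` and its members are hypothetical names, the constants are taken from the bullets above (32 bytes per map index, 4-byte alignment, eight nibbles per 32-bit map unit), and addresses are treated as offsets relative to the start of the mapped range, which is an assumption of this illustration.

```csharp
using System;

// Hypothetical helper that reproduces the worked example; it is not the reader's lookup code.
public static class NibbleMapExample
{
    private const uint BytesPerMapIndex = 32; // each nibble describes 32 bytes of code
    private const uint CodeAlignment = 4;     // method starts are 4-byte aligned
    private const int NibblesPerMapUnit = 8;  // a map unit is a 32-bit value

    // Computes which nibble describes a method start and the value stored in it.
    public static (uint MapIndex, uint Nibble) Encode(uint relativeAddress)
    {
        uint mapIndex = relativeAddress / BytesPerMapIndex;
        uint byteOffset = relativeAddress % BytesPerMapIndex;
        uint nibble = 1 + byteOffset / CodeAlignment; // 0 is reserved to mean "no method"
        return (mapIndex, nibble);
    }

    // Places the nibble into its map unit, numbering nibbles from the most significant.
    public static (int MapUnitIndex, uint MapUnitValue) ToMapUnit(uint mapIndex, uint nibble)
    {
        int mapUnitIndex = (int)(mapIndex / NibblesPerMapUnit);
        int nibbleInUnit = (int)(mapIndex % NibblesPerMapUnit);
        int shift = 28 - 4 * nibbleInUnit;
        return (mapUnitIndex, nibble << shift);
    }
}
```

For address 304, `Encode` yields map index 9 and nibble value 5, and `ToMapUnit` places that nibble in the second map unit (index 1) as 0x05000000, matching the bullets above.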


Now suppose we do a lookup for address 306 (0x132)
2 changes: 2 additions & 0 deletions src/coreclr/debug/runtimeinfo/.editorconfig
@@ -0,0 +1,2 @@
[contracts.jsonc]
indent_size = 2
1 change: 1 addition & 0 deletions src/coreclr/debug/runtimeinfo/contracts.jsonc
@@ -12,6 +12,7 @@
"DacStreams": 1,
"EcmaMetadata" : 1,
"Exception": 1,
"ExecutionManager": 1,
"Loader": 1,
"Object": 1,
"RuntimeTypeSystem": 1,
43 changes: 43 additions & 0 deletions src/coreclr/debug/runtimeinfo/datadescriptor.h
@@ -340,6 +340,47 @@ CDAC_TYPE_INDETERMINATE(DynamicMethodDesc)
CDAC_TYPE_FIELD(DynamicMethodDesc, /*pointer*/, MethodName, cdac_data<DynamicMethodDesc>::MethodName)
CDAC_TYPE_END(DynamicMethodDesc)

CDAC_TYPE_BEGIN(CodePointer)
CDAC_TYPE_SIZE(sizeof(PCODE))
CDAC_TYPE_END(CodePointer)

CDAC_TYPE_BEGIN(RangeSectionMap)
CDAC_TYPE_INDETERMINATE(RangeSectionMap)
CDAC_TYPE_FIELD(RangeSectionMap, /*pointer*/, TopLevelData, cdac_data<RangeSectionMap>::TopLevelData)
CDAC_TYPE_END(RangeSectionMap)

CDAC_TYPE_BEGIN(RangeSectionFragment)
CDAC_TYPE_INDETERMINATE(RangeSectionFragment)
CDAC_TYPE_FIELD(RangeSectionFragment, /*pointer*/, RangeBegin, cdac_data<RangeSectionMap>::RangeSectionFragment::RangeBegin)
CDAC_TYPE_FIELD(RangeSectionFragment, /*pointer*/, RangeEndOpen, cdac_data<RangeSectionMap>::RangeSectionFragment::RangeEndOpen)
CDAC_TYPE_FIELD(RangeSectionFragment, /*pointer*/, RangeSection, cdac_data<RangeSectionMap>::RangeSectionFragment::RangeSection)
CDAC_TYPE_FIELD(RangeSectionFragment, /*pointer*/, Next, cdac_data<RangeSectionMap>::RangeSectionFragment::Next)
CDAC_TYPE_END(RangeSectionFragment)

CDAC_TYPE_BEGIN(RangeSection)
CDAC_TYPE_INDETERMINATE(RangeSection)
CDAC_TYPE_FIELD(RangeSection, /*pointer*/, RangeBegin, cdac_data<RangeSection>::RangeBegin)
CDAC_TYPE_FIELD(RangeSection, /*pointer*/, RangeEndOpen, cdac_data<RangeSection>::RangeEndOpen)
CDAC_TYPE_FIELD(RangeSection, /*pointer*/, NextForDelete, cdac_data<RangeSection>::NextForDelete)
CDAC_TYPE_FIELD(RangeSection, /*pointer*/, JitManager, cdac_data<RangeSection>::JitManager)
CDAC_TYPE_FIELD(RangeSection, /*int32_t*/, Flags, cdac_data<RangeSection>::Flags)
CDAC_TYPE_FIELD(RangeSection, /*pointer*/, HeapList, cdac_data<RangeSection>::HeapList)
CDAC_TYPE_FIELD(RangeSection, /*pointer*/, R2RModule, cdac_data<RangeSection>::R2RModule)
CDAC_TYPE_END(RangeSection)

CDAC_TYPE_BEGIN(RealCodeHeader)
CDAC_TYPE_INDETERMINATE(RealCodeHeader)
CDAC_TYPE_FIELD(RealCodeHeader, /*pointer*/, MethodDesc, offsetof(RealCodeHeader, phdrMDesc))
CDAC_TYPE_END(RealCodeHeader)

CDAC_TYPE_BEGIN(CodeHeapListNode)
CDAC_TYPE_FIELD(CodeHeapListNode, /*pointer*/, Next, offsetof(HeapList, hpNext))
CDAC_TYPE_FIELD(CodeHeapListNode, /*pointer*/, StartAddress, offsetof(HeapList, startAddress))
CDAC_TYPE_FIELD(CodeHeapListNode, /*pointer*/, EndAddress, offsetof(HeapList, endAddress))
CDAC_TYPE_FIELD(CodeHeapListNode, /*pointer*/, MapBase, offsetof(HeapList, mapBase))
CDAC_TYPE_FIELD(CodeHeapListNode, /*pointer*/, HeaderMap, offsetof(HeapList, pHdrMap))
CDAC_TYPE_END(CodeHeapListNode)

CDAC_TYPES_END()

CDAC_GLOBALS_BEGIN()
@@ -369,6 +410,7 @@ CDAC_GLOBAL(DirectorySeparator, uint8, (uint8_t)DIRECTORY_SEPARATOR_CHAR_A)
CDAC_GLOBAL(MethodDescAlignment, uint64, MethodDesc::ALIGNMENT)
CDAC_GLOBAL(ObjectHeaderSize, uint64, OBJHEADER_SIZE)
CDAC_GLOBAL(SyncBlockValueToObjectOffset, uint16, OBJHEADER_SIZE - cdac_data<ObjHeader>::SyncBlockValue)
CDAC_GLOBAL(StubCodeBlockLast, uint8, STUB_CODE_BLOCK_LAST)
CDAC_GLOBAL_POINTER(ArrayBoundsZero, cdac_data<ArrayBase>::ArrayBoundsZero)
CDAC_GLOBAL_POINTER(ExceptionMethodTable, &::g_pExceptionClass)
CDAC_GLOBAL_POINTER(FreeObjectMethodTable, &::g_pFreeObjectMethodTable)
@@ -378,6 +420,7 @@ CDAC_GLOBAL_POINTER(StringMethodTable, &::g_pStringClass)
CDAC_GLOBAL_POINTER(SyncTableEntries, &::g_pSyncTable)
CDAC_GLOBAL_POINTER(MiniMetaDataBuffAddress, &::g_MiniMetaDataBuffAddress)
CDAC_GLOBAL_POINTER(MiniMetaDataBuffMaxSize, &::g_MiniMetaDataBuffMaxSize)
CDAC_GLOBAL_POINTER(ExecutionManagerCodeRangeMapAddress, cdac_data<ExecutionManager>::CodeRangeMapAddress)
CDAC_GLOBALS_END()

#undef CDAC_BASELINE