DonaldW's github pages
Unreal Source Explained (USE) is an Unreal source code analysis, based on profilers.
For more information, see the repo on GitHub.
See the Table of Contents for the complete content list.
To observe resource creation, the best view is the Allocation profiler. Below is the memory allocation summary of IOAccelResourceCreate(), which is the entry point of Metal’s resource allocations. We can see that GPU memory is allocated for various kinds of resources, e.g., textures, encoders, (vertex and index) buffers, and shaders.
Unreal uses mtlpp, a C++ Metal wrapper, to glue its RHI code and the Metal APIs together. We can see that almost all calls to IOAccelResourceCreate() come indirectly from mtlpp.
We can dig deeper and summarize the findings in the following table.
Resource Allocation | Reference Variable | Engine API | RHI API High Level | RHI API Low Level | Wrapper API (mtlpp) | Graphic API (Metal) |
---|---|---|---|---|---|---|
Texture | UTexture::TextureReference (link) | FTexture2DResource::InitRHI() (link), etc. | CreateTexture2D() (link) | FDynamicRHI::RHICreateTexture2D()’s implementation (link) | mtlpp::Device::NewTexture() | [MTLIOAccelTexture initWithDevice] |
Vertex Buffer | FVertexBuffer::VertexBufferRHI (link) | FPositionVertexBuffer::InitRHI() (link), FColorVertexBuffer::InitRHI() (link), FStaticMeshVertexBuffer::InitRHI() (link), etc. | RHICreateVertexBuffer() (link) | FDynamicRHI::RHICreateVertexBuffer()’s implementation (link) | mtlpp::Device::NewBuffer() | [MTLIOAccelBuffer initWithDevice] |
Index Buffer | FIndexBuffer::IndexBufferRHI (link) | FRawStaticIndexBuffer::InitRHI(), FParticleIndexBuffer::InitRHI(), etc. | RHICreateIndexBuffer() (link) | FDynamicRHI::RHICreateIndexBuffer()’s implementation (link) | same as above | same as above |
Uniform Buffer | FPrimitiveSceneProxy::UniformBuffer (link), FMaterialRenderProxy::UniformExpressionCache[] (link), etc. | FPrimitiveSceneProxy::UpdateUniformBuffer() (link), FMaterialRenderProxy::EvaluateUniformExpressions() (link), TUniformBufferRef<TBufferStruct>::CreateUniformBufferImmediate() (link), etc. | CreateUniformBuffer() (link) | FDynamicRHI::RHICreateUniformBuffer()’s implementation (link) | same as above | same as above |
Shader | GGraphicsPipelineCache (link) | FMeshDrawCommand::SubmitDraw() (link), etc. | SetGraphicsPipelineState() (link) | FDynamicRHI::RHICreateGraphicsPipelineState() (link) | mtlpp::Device::NewRenderPipelineState() | [MTLCompiler compileFunctionInternal] |
RenderPass | GRHICommandList (link) | FMobileSceneRenderer::Render() (link), etc. | BeginRenderPass() (link) | IRHICommandContext::RHIBeginRenderPass()’s implementation (link) | mtlpp::CommandBuffer::RenderCommandEncoder() | MTLIOAccelCommandBufferStorageAllocResourceAtIndex |
All resources are allocated inside the render thread. Vertex buffers, index buffers, and textures are allocated from their respective InitRHI() methods, since they all override the virtual method FRenderResource::InitRHI() (link).
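The InitRHI() pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions, not engine code: RenderResourceSketch, VertexBufferSketch, Texture2DSketch, and CreatedResourceLog() are hypothetical names, and the strings pushed into the log merely stand in for the RHI creation calls listed in the table.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log of the fake "RHI create" calls made by the resources below.
inline std::vector<std::string>& CreatedResourceLog() {
    static std::vector<std::string> Log;
    return Log;
}

// Hypothetical stand-in for FRenderResource: a base class exposing an
// InitRHI() hook that each concrete resource type overrides.
class RenderResourceSketch {
public:
    virtual ~RenderResourceSketch() = default;

    // Runs the InitRHI() hook once; later calls are no-ops.
    void InitResource() {
        if (!bInitialized) {
            InitRHI();
            bInitialized = true;
        }
    }
    bool IsInitialized() const { return bInitialized; }

protected:
    virtual void InitRHI() {}

private:
    bool bInitialized = false;
};

// Each derived resource decides which creation call it needs.
class VertexBufferSketch : public RenderResourceSketch {
protected:
    void InitRHI() override { CreatedResourceLog().push_back("RHICreateVertexBuffer"); }
};

class Texture2DSketch : public RenderResourceSketch {
protected:
    void InitRHI() override { CreatedResourceLog().push_back("RHICreateTexture2D"); }
};
```

Calling InitResource() on a VertexBufferSketch and a Texture2DSketch appends one entry each to the log, which mirrors how different resource types reach different RHI creation functions through the same virtual hook.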
For reference, below are the full call stacks for the various allocations.
So far, we know that Unreal allocates various resources and uses FRenderResource’s derived implementations to hold the resource references. But how does it pass resources from the material shader to the GPU?
Recall in the Rendering Basics chapter that Unreal’s material has lots of expressions, such as texture sample node. So the key question is, how does one expression relate to the resource?
In FUniformExpressionSet::FillUniformBuffer() (link), the expressions’ values are extracted and written into TempBuffer.
void FUniformExpressionSet::FillUniformBuffer(const FMaterialRenderContext& MaterialRenderContext, uint8* TempBuffer, ...) const {
    ...
    void* BufferCursor = TempBuffer;
    ...
    // Dump vector expression into the buffer.
    for(int32 VectorIndex = 0;VectorIndex < UniformVectorExpressions.Num();++VectorIndex) {
        FLinearColor VectorValue(0, 0, 0, 0);
        UniformVectorExpressions[VectorIndex]->GetNumberValue(MaterialRenderContext, VectorValue);
        FLinearColor* DestAddress = (FLinearColor*)BufferCursor;
        *DestAddress = VectorValue;
        BufferCursor = DestAddress + 1;
    }
    // Cache 2D texture uniform expressions.
    for(int32 ExpressionIndex = 0;ExpressionIndex < Uniform2DTextureExpressions.Num();ExpressionIndex++) {
        const UTexture* Value;
        Uniform2DTextureExpressions[ExpressionIndex]->GetTextureValue(MaterialRenderContext,MaterialRenderContext.Material,Value);
        void** ResourceTableTexturePtr = (void**)((uint8*)BufferCursor + 0 * SHADER_PARAMETER_POINTER_ALIGNMENT);
        void** ResourceTableSamplerPtr = (void**)((uint8*)BufferCursor + 1 * SHADER_PARAMETER_POINTER_ALIGNMENT);
        BufferCursor = ((uint8*)BufferCursor) + (SHADER_PARAMETER_POINTER_ALIGNMENT * 2);
        ...
        *ResourceTableTexturePtr = Value->TextureReference.TextureReferenceRHI;
        FSamplerStateRHIRef* SamplerSource = &Value->Resource->SamplerStateRHI;
        *ResourceTableSamplerPtr = *SamplerSource;
    }
    ...
}
Where does TempBuffer come from? We can find out from the call stack below.
The expressions’ values are recorded in FMaterialRenderProxy::UniformExpressionCache[] (link). That makes sense, because FMaterialRenderProxy is the render proxy of a material.
/**
 * A material render proxy used by the renderer.
 */
class ENGINE_VTABLE FMaterialRenderProxy : public FRenderResource {
public:
    /** Cached uniform expressions. */
    mutable FUniformExpressionCache UniformExpressionCache[ERHIFeatureLevel::Num];
    /** Cached external texture immutable samplers */
    mutable FImmutableSamplerState ImmutableSamplerState;
    /**
     * Evaluates uniform expressions and stores them in OutUniformExpressionCache.
     * @param OutUniformExpressionCache - The uniform expression cache to build.
     * @param MaterialRenderContext - The context for which to cache expressions.
     */
    void ENGINE_API EvaluateUniformExpressions(FUniformExpressionCache& OutUniformExpressionCache, const FMaterialRenderContext& Context, class FRHICommandList* CommandListIfLocalMode = nullptr) const;
    virtual bool GetVectorValue(const FMaterialParameterInfo& ParameterInfo, FLinearColor* OutValue, const FMaterialRenderContext& Context) const = 0;
    virtual bool GetScalarValue(const FMaterialParameterInfo& ParameterInfo, float* OutValue, const FMaterialRenderContext& Context) const = 0;
    virtual bool GetTextureValue(const FMaterialParameterInfo& ParameterInfo,const UTexture** OutValue, const FMaterialRenderContext& Context) const = 0;
    // FRenderResource interface.
    ENGINE_API virtual void InitDynamicRHI() override;
    ENGINE_API virtual void ReleaseDynamicRHI() override;
    ENGINE_API virtual void ReleaseResource() override;
    ...
private:
    /**
     * Tracks all material render proxies in all scenes, can only be accessed on the rendering thread.
     * This is used to propagate new shader maps to materials being used for rendering.
     */
    ENGINE_API static TSet<FMaterialRenderProxy*> MaterialRenderProxyMap;
    ...
};
FUniformExpressionCache wraps an FUniformBufferRHIRef, which is essentially a reference to an FRHIUniformBuffer (link).
So, the key question is, what is a uniform buffer?
Unreal uses a uniform buffer to represent both the constant buffer and the resource table in the RHI. Each graphics API implements FRHIUniformBuffer to create the actual constant buffer and resource table.
From FUniformExpressionSet::FillUniformBuffer() above, we know that it writes a binary stream as the content of the FRHIUniformBuffer. A color is written by value as an FLinearColor, and a texture is written as a void* pointer taken from TextureReferenceRHI. Note that every FXXXRHIRef is of type TRefCountPtr<XXX>, and TRefCountPtr is a plain object with only one (RHI object) pointer (link) and no virtual methods, hence no vtable, so what gets copied into the binary stream is exactly this pointer. See also C++ Object Model for the C++ memory layout details.
This also means that if we ever need to modify TRefCountPtr (what?!), the first word of its memory layout must remain this RHI pointer: do not add virtual functions or new data members before it.
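The layout constraint above can be checked at compile time. The following is a minimal sketch, assuming a simplified smart pointer: RefCountPtrSketch, FakeRHITexture, WritePointer(), and ReadPointer() are hypothetical names invented for illustration and are not the engine's TRefCountPtr or any RHI function.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical RHI object with an intrusive reference count.
struct FakeRHITexture {
    int RefCount = 0;
    void AddRef() { ++RefCount; }
    void Release() { --RefCount; }
};

// Hypothetical stand-in for TRefCountPtr: exactly one data member, no virtuals.
template <typename T>
class RefCountPtrSketch {
public:
    explicit RefCountPtrSketch(T* InPtr) : Ptr(InPtr) { if (Ptr) Ptr->AddRef(); }
    RefCountPtrSketch(const RefCountPtrSketch& Other) : Ptr(Other.Ptr) { if (Ptr) Ptr->AddRef(); }
    RefCountPtrSketch& operator=(const RefCountPtrSketch&) = delete;
    ~RefCountPtrSketch() { if (Ptr) Ptr->Release(); }
    T* Get() const { return Ptr; }

private:
    T* Ptr;  // must stay the first (and only) word of the object
};

// Adding a virtual function or another member would break these checks.
static_assert(sizeof(RefCountPtrSketch<FakeRHITexture>) == sizeof(void*),
              "the wrapper must be exactly one pointer wide");
static_assert(std::is_standard_layout<RefCountPtrSketch<FakeRHITexture>>::value,
              "the wrapper must not gain a vtable or mixed-access members");

// Copies the raw pointer held by Ref into the stream and returns the advanced cursor.
// The stream holds a non-owning copy, so the reference count is not changed.
inline std::uint8_t* WritePointer(std::uint8_t* Cursor, const RefCountPtrSketch<FakeRHITexture>& Ref) {
    FakeRHITexture* Raw = Ref.Get();
    std::memcpy(Cursor, &Raw, sizeof(Raw));
    return Cursor + sizeof(Raw);
}

// Reads a raw pointer back from the stream.
inline FakeRHITexture* ReadPointer(const std::uint8_t* Cursor) {
    FakeRHITexture* Raw = nullptr;
    std::memcpy(&Raw, Cursor, sizeof(Raw));
    return Raw;
}
```

The sketch uses memcpy rather than a cast through void** so that it stays well defined regardless of the buffer's alignment; the engine snippet shown earlier relies on aligned cursors instead.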
After the uniform binary stream is filled, its content is passed to the RHI to either create a new uniform buffer or update an existing one (link).
void FMaterialRenderProxy::EvaluateUniformExpressions(FUniformExpressionCache& OutUniformExpressionCache, const FMaterialRenderContext& Context, ...) const
{
    ...
    const FRHIUniformBufferLayout& UniformBufferLayout = UniformExpressionSet.GetUniformBufferLayout();
    FMemMark Mark(FMemStack::Get());
    uint8* TempBuffer = FMemStack::Get().PushBytes(UniformBufferLayout.ConstantBufferSize, SHADER_PARAMETER_STRUCT_ALIGNMENT);
    UniformExpressionSet.FillUniformBuffer(Context, OutUniformExpressionCache, TempBuffer, UniformBufferLayout.ConstantBufferSize);
    ...
    if (IsValidRef(OutUniformExpressionCache.UniformBuffer)) {
        RHIUpdateUniformBuffer(OutUniformExpressionCache.UniformBuffer, TempBuffer);
    }
    else {
        OutUniformExpressionCache.UniformBuffer = RHICreateUniformBuffer(TempBuffer, UniformBufferLayout, UniformBuffer_MultiFrame);
    }
    ...
    OutUniformExpressionCache.bUpToDate = true;
}
Note that on creation, the layout of the binary content is also passed to the RHI so that it can deserialize the uniform buffer.
Each graphics API has its own implementation of FRHIUniformBuffer, which holds the actual GPU buffer.
class FRHIUniformBuffer : public FRHIResource {
    ...
    /** Layout of the uniform buffer. */
    const FRHIUniformBufferLayout* Layout;
    uint32 LayoutConstantBufferSize;
};
The Resource member of FD3D11UniformBuffer (link) holds the actual D3D11 buffer pointer:
/** Uniform buffer resource class. */
class FD3D11UniformBuffer : public FRHIUniformBuffer {
public:
    /** The D3D11 constant buffer resource */
    TRefCountPtr<ID3D11Buffer> Resource;
    ...
    /** Resource table containing RHI references. */
    TArray<TRefCountPtr<FRHIResource> > ResourceTable;
    ...
private:
    class FD3D11DynamicRHI* D3D11RHI;
};
The Buffer member of FMetalUniformBuffer (link) holds the actual Metal buffer reference, which stores the constant values.
class FMetalRHIBuffer;
class FMetalUniformBuffer : public FRHIUniformBuffer, public FMetalRHIBuffer {
    ...
    /** Resource table containing RHI references. */
    TArray<TRefCountPtr<FRHIResource> > ResourceTable;
};
class FMetalRHIBuffer {
    ...
    // balsa buffer memory
    FMetalBuffer Buffer;
    // A temporary shared/CPU accessible buffer for upload/download
    FMetalBuffer CPUBuffer;
    /** Buffer for small buffers < 4Kb to avoid heap fragmentation. */
    FMetalBufferData* Data;
    // Initial buffer size.
    uint32 Size;
    ...
};
In Direct3D, the uniform buffer is created via FD3D11DynamicRHI::RHICreateUniformBuffer() (link).
On iOS, FMetalUniformBuffer::FMetalUniformBuffer() (link) allocates about 120MB of memory, which is huge.
To sum up, Unreal uses a uniform buffer to reference both the constant value buffer and the FRHIResource table. The constant value buffer is allocated in GPU memory. The ResourceTable is a TArray<TRefCountPtr<FRHIResource>> allocated on the heap that stores pointers to the FRHIResource objects in use, such as FRHITexture2D (or rather its actual implementation, e.g. FMetalTexture2D).
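To make the split between constant values and the resource table concrete, here is a rough sketch of how a layout could drive the creation of a uniform buffer from the packed content. All names (ResourceSketch, UniformBufferLayoutSketch, UniformBufferSketch, CreateUniformBufferSketch) are hypothetical, and a std::vector stands in for the GPU-side constant buffer; the real RHI implementations differ per platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an FRHIResource-derived object (texture, sampler, ...).
struct ResourceSketch {
    int Id;
};

// Hypothetical layout: total size of the packed content plus the byte offset
// of every resource pointer slot inside it.
struct UniformBufferLayoutSketch {
    std::size_t ConstantBufferSize;
    std::vector<std::size_t> ResourceOffsets;
};

// Hypothetical uniform buffer: constant bytes plus a table of resource pointers.
struct UniformBufferSketch {
    std::vector<std::uint8_t> Constants;         // stands in for the GPU constant buffer
    std::vector<ResourceSketch*> ResourceTable;  // heap-side table of referenced resources
};

// Copies the constant bytes and uses the layout to pull each resource pointer
// out of the packed content into the resource table.
inline UniformBufferSketch CreateUniformBufferSketch(const std::uint8_t* Contents,
                                                     const UniformBufferLayoutSketch& Layout) {
    UniformBufferSketch Result;
    Result.Constants.assign(Contents, Contents + Layout.ConstantBufferSize);
    for (std::size_t Offset : Layout.ResourceOffsets) {
        ResourceSketch* Resource = nullptr;
        std::memcpy(&Resource, Contents + Offset, sizeof(Resource));
        Result.ResourceTable.push_back(Resource);
    }
    return Result;
}
```

Without the layout, the packed bytes would be opaque: nothing in the content itself says which words are color values and which are resource pointers.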
Now we know how the various resources are allocated and how they are recorded in the uniform buffer. The next question is: how is the uniform buffer chosen, and how are its resources bound to the GPU? Who passes the actual resource arguments to the shader?
Shader (resource) binding is the mechanism that binds resources to a shader.
Shader bindings are stored in FMeshDrawCommand; see the ownership chains below.
Shader binding data:
FMeshDrawShaderBindings FMeshDrawCommand::ShaderBindings
FData FMeshDrawShaderBindings::Data (link)
Shader binding layout:
FMeshDrawShaderBindings FMeshDrawCommand::ShaderBindings
TArray<FMeshDrawShaderBindingsLayout> FMeshDrawShaderBindings::ShaderLayouts (link)
class FMeshDrawSingleShaderBindings : public FMeshDrawShaderBindingsLayout
const FShaderParameterMapInfo& FMeshDrawShaderBindingsLayout::ParameterMapInfo
While building the mesh draw commands, FMeshPassProcessor::BuildMeshDrawCommands<..>() pulls the shader binding data from the shader, as follows.
FShaderParameterMapInfo (link) describes the layout of a shader’s parameters. It contains the base index and size of each parameter for the various resource kinds (e.g., uniform buffers, texture samplers, SRVs, and loose parameter buffers). It’s serialized into FShaderResource::ParameterMapInfo during game startup or when a new level is streamed in.
class FShaderParameterInfo
{
public:
    uint16 BaseIndex;
    uint16 Size;
    ...
};
class FShaderParameterMapInfo
{
public:
    TArray<FShaderParameterInfo> UniformBuffers;
    TArray<FShaderParameterInfo> TextureSamplers;
    TArray<FShaderParameterInfo> SRVs;
    TArray<FShaderLooseParameterBufferInfo> LooseParameterBuffers;
    ...
};
FMeshDrawShaderBindingsLayout (link) references one FShaderParameterMapInfo and provides some additional layout accessors.
/** Stores the number of each resource type that will need to be bound to a single shader, computed during shader reflection. */
class FMeshDrawShaderBindingsLayout
{
public:
    const FShaderParameterMapInfo& ParameterMapInfo;
    ...
protected:
    inline uint32 GetUniformBufferOffset() const { return 0; }
    inline uint32 GetSamplerOffset() const
    {
        return ParameterMapInfo.UniformBuffers.Num() * sizeof(FRHIUniformBuffer*);
    }
    ...
    friend class FMeshDrawShaderBindings;
};
FMeshDrawSingleShaderBindings (link) inherits from FMeshDrawShaderBindingsLayout and does the actual resource binding according to the layout. Its Data member points into FMeshDrawShaderBindings::Data (link), a binary stream that records the references to the shader’s resources.
class FMeshDrawSingleShaderBindings : public FMeshDrawShaderBindingsLayout
{
private:
    uint8* Data;
public:
    template<typename UniformBufferStructType>
    void Add(const TShaderUniformBufferParameter<UniformBufferStructType>& Parameter, const TUniformBufferRef<UniformBufferStructType>& Value)
    {
        ...
        // writes value to `Data` with correct offset
        WriteBindingUniformBuffer(Value.GetReference(), Parameter.GetBaseIndex());
    }
    ...
    void AddTexture(
        FShaderResourceParameter TextureParameter,
        FShaderResourceParameter SamplerParameter,
        FRHISamplerState* SamplerStateRHI,
        FRHITexture* TextureRHI)
    {
        ...
        // writes value to `Data` with correct offset
        WriteBindingTexture(TextureRHI, TextureParameter.GetBaseIndex());
        ...
        WriteBindingSampler(SamplerStateRHI, SamplerParameter.GetBaseIndex());
    }
    ...
};
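The relationship between the layout accessors and the writes into Data can be sketched as follows. This is a simplified illustration, not the engine's implementation: ParameterCountsSketch, BindingsLayoutSketch, and SingleShaderBindingsSketch are hypothetical names, only three resource kinds are modeled, and a plain slot index is assumed where the engine looks a parameter up by its BaseIndex.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical per-kind counts, standing in for FShaderParameterMapInfo.
struct ParameterCountsSketch {
    std::size_t NumUniformBuffers;
    std::size_t NumSamplers;
    std::size_t NumTextures;
};

// Hypothetical layout: each resource kind occupies a contiguous run of
// pointer-sized slots, one kind after another.
class BindingsLayoutSketch {
public:
    explicit BindingsLayoutSketch(const ParameterCountsSketch& InCounts) : Counts(InCounts) {}

    std::size_t GetUniformBufferOffset() const { return 0; }
    std::size_t GetSamplerOffset() const { return Counts.NumUniformBuffers * sizeof(void*); }
    std::size_t GetTextureOffset() const { return GetSamplerOffset() + Counts.NumSamplers * sizeof(void*); }
    std::size_t GetDataSizeBytes() const { return GetTextureOffset() + Counts.NumTextures * sizeof(void*); }

protected:
    ParameterCountsSketch Counts;
};

// Hypothetical writer: stores resource pointers into a caller-owned byte stream.
class SingleShaderBindingsSketch : public BindingsLayoutSketch {
public:
    SingleShaderBindingsSketch(const ParameterCountsSketch& InCounts, std::uint8_t* InData)
        : BindingsLayoutSketch(InCounts), Data(InData) {}

    void WriteUniformBuffer(const void* Value, std::size_t Slot) {
        Write(GetUniformBufferOffset(), Slot, Counts.NumUniformBuffers, Value);
    }
    void WriteSampler(const void* Value, std::size_t Slot) {
        Write(GetSamplerOffset(), Slot, Counts.NumSamplers, Value);
    }
    void WriteTexture(const void* Value, std::size_t Slot) {
        Write(GetTextureOffset(), Slot, Counts.NumTextures, Value);
    }

    // Reads back the pointer stored at an absolute byte offset in the stream.
    const void* ReadPointer(std::size_t ByteOffset) const {
        const void* Value = nullptr;
        std::memcpy(&Value, Data + ByteOffset, sizeof(Value));
        return Value;
    }

private:
    void Write(std::size_t BaseOffset, std::size_t Slot, std::size_t Count, const void* Value) {
        assert(Slot < Count);  // the slot must belong to this resource kind
        std::memcpy(Data + BaseOffset + Slot * sizeof(void*), &Value, sizeof(Value));
    }

    std::uint8_t* Data;
};
```

Because the stream is just pointers at computed offsets, the draw command can store bindings for any shader in one flat allocation, sized by GetDataSizeBytes() in this sketch.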
But since BuildMeshDrawCommands<>() (link) is the only place that does the shader bindings, how can it handle so many different bindings? In fact, it is a template method, and its template arguments make this possible; see the snippet below.
template<typename PassShadersType, typename ShaderElementDataType>
void FMeshPassProcessor::BuildMeshDrawCommands(..., PassShadersType PassShaders, const ShaderElementDataType& ShaderElementData)
{
    ...
    if (PassShaders.VertexShader)
    {
        FMeshDrawSingleShaderBindings ShaderBindings = SharedMeshDrawCommand.ShaderBindings.GetSingleShaderBindings(SF_Vertex);
        PassShaders.VertexShader->GetShaderBindings(..., ShaderElementData, ShaderBindings);
    }
    if (PassShaders.PixelShader) { ... }
    ...
    const int32 NumElements = MeshBatch.Elements.Num();
    for (int32 BatchElementIndex = 0; BatchElementIndex < NumElements; BatchElementIndex++)
    {
        if ((1ull << BatchElementIndex) & BatchElementMask)
        {
            const FMeshBatchElement& BatchElement = MeshBatch.Elements[BatchElementIndex];
            FMeshDrawCommand& MeshDrawCommand = DrawListContext->AddCommand(SharedMeshDrawCommand);
            if (PassShaders.VertexShader)
            {
                FMeshDrawSingleShaderBindings VertexShaderBindings = MeshDrawCommand.ShaderBindings.GetSingleShaderBindings(SF_Vertex);
                PassShaders.VertexShader->GetElementShaderBindings(..., ShaderElementData, VertexShaderBindings);
            }
            if (PassShaders.PixelShader) { ... }
            ...
        }
    }
}
Based on the template arguments PassShadersType and ShaderElementDataType, BuildMeshDrawCommands<>() can handle different passes and different shader bindings by calling the template arguments’ own methods, such as GetShaderBindings() and GetElementShaderBindings().
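The compile-time dispatch can be reduced to a small standalone sketch. The names here (BindingsSketch, DepthVertexShader, BasePassVertexShader, PassShadersSketch, BuildMeshDrawCommandsSketch, and the two element-data structs) are all hypothetical, and strings stand in for real resource references; the point is only that one template function serves passes whose shaders expect different element data.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sink standing in for FMeshDrawSingleShaderBindings.
struct BindingsSketch {
    std::vector<std::string> Bound;
    void Add(const std::string& Name) { Bound.push_back(Name); }
};

// Two hypothetical element-data types, one per pass.
struct DepthElementData {};
struct LightmapElementData {
    std::string LightmapBufferName;
};

// Each shader type knows how to bind resources from its own element data.
struct DepthVertexShader {
    void GetShaderBindings(const DepthElementData&, BindingsSketch& Out) const {
        Out.Add("View");
    }
};
struct BasePassVertexShader {
    void GetShaderBindings(const LightmapElementData& Data, BindingsSketch& Out) const {
        Out.Add("View");
        Out.Add(Data.LightmapBufferName);
    }
};

// Hypothetical bundle of the shaders used by one pass (vertex shader only here).
template <typename VertexShaderType>
struct PassShadersSketch {
    const VertexShaderType* VertexShader = nullptr;
};

// One function template for every pass: the compiler resolves the matching
// GetShaderBindings() overload from the template arguments.
template <typename PassShadersType, typename ShaderElementDataType>
BindingsSketch BuildMeshDrawCommandsSketch(const PassShadersType& PassShaders,
                                           const ShaderElementDataType& ShaderElementData) {
    BindingsSketch VertexBindings;
    if (PassShaders.VertexShader) {
        PassShaders.VertexShader->GetShaderBindings(ShaderElementData, VertexBindings);
    }
    return VertexBindings;
}
```

Passing a depth-pass shader bundle with DepthElementData binds only the view buffer, while a base-pass bundle with LightmapElementData also binds the lightmap buffer, with no virtual call or runtime type switch involved.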
Take TMobileBasePassPSPolicyParamType<FUniformLightMapPolicy>::GetShaderBindings() (link) as an example. It is the most common pixel shader parameter type, and it handles lightmaps.
void FUniformLightMapPolicy::GetPixelShaderBindings(
    const FPrimitiveSceneProxy* PrimitiveSceneProxy,
    const ElementDataType& ShaderElementData,
    const PixelParametersType* PixelShaderParameters,
    FMeshDrawSingleShaderBindings& ShaderBindings)
{
    FRHIUniformBuffer* PrecomputedLightingBuffer = nullptr;
    FRHIUniformBuffer* LightmapResourceClusterBuffer = nullptr;
    FRHIUniformBuffer* IndirectLightingCacheBuffer = nullptr;
    SetupLCIUniformBuffers(PrimitiveSceneProxy, ShaderElementData, PrecomputedLightingBuffer, LightmapResourceClusterBuffer, IndirectLightingCacheBuffer);
    ShaderBindings.Add(PixelShaderParameters->PrecomputedLightingBufferParameter, PrecomputedLightingBuffer);
    ShaderBindings.Add(PixelShaderParameters->IndirectLightingCacheParameter, IndirectLightingCacheBuffer);
    ShaderBindings.Add(PixelShaderParameters->LightmapResourceCluster, LightmapResourceClusterBuffer);
}
Its ShaderElementData is of type const TMobileBasePassShaderElementData<FUniformLightMapPolicy>&, so it can handle the custom shader bindings for lightmaps, i.e., it calls FUniformLightMapPolicy::GetPixelShaderBindings() with ShaderElementData.LightMapPolicyElementData.
Note that the lightmap shader resources are recorded in a “uniform buffer”, which records resource “handles”, not the resource data; see the image below.
After the uniform buffer pointers are collected, FUniformLightMapPolicy::GetPixelShaderBindings() adds them to the shader bindings, keyed by shader parameter.
The actual lightmap shader parameters are declared as below (link).
BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT(FLightmapResourceClusterShaderParameters,ENGINE_API)
    SHADER_PARAMETER_TEXTURE(Texture2D, LightMapTexture)
    SHADER_PARAMETER_TEXTURE(Texture2D, SkyOcclusionTexture)
    SHADER_PARAMETER_SAMPLER(SamplerState, LightMapSampler)
    SHADER_PARAMETER_SAMPLER(SamplerState, SkyOcclusionSampler)
    ...
END_GLOBAL_SHADER_PARAMETER_STRUCT()
That’s the boilerplate for declaring shader parameters in C++; see the macros BEGIN_SHADER_PARAMETER_STRUCT (link) and BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT (link) for more details.
In the end, these parameters are translated into the actual shader parameters of the graphics API in use, e.g., Metal on iOS: