Author Topic: Rendering performance for large batches

Presently, for batch rendering with the command-line tools, I've been modifying an SBS file (looking up the resource bitmaps and adjusting the externally linked file paths), exporting the SBSAR with sbscooker, and rendering with sbsrender. The performance is a bit slow and GPU/CPU usage doesn't seem to be that high, so I'm assuming the I/O of writing the modifications plus creating a new SBSAR for each render is responsible? The per-item loop looks roughly like the sketch below.
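For reference, a minimal Python sketch of that loop. The .sbs is XML; the resource/filepath element names are what I see in my own files, so verify them against yours, and double-check the tool flags with sbscooker --help / sbsrender --help:

```python
import subprocess
import xml.etree.ElementTree as ET

def retarget_bitmaps(sbs_in, sbs_out, new_bitmap_path):
    # The .sbs is XML; in my files each bitmap resource is a
    # <resource> element with a <filepath v="..."/> child holding
    # the external link. Verify these names against your own .sbs.
    tree = ET.parse(sbs_in)
    for res in tree.iter("resource"):
        fp = res.find("filepath")
        if fp is not None:
            fp.set("v", new_bitmap_path)
    tree.write(sbs_out)

def cook_and_render(sbs_file, sbsar_dir, out_dir):
    # One cook + one render per item -- this is the I/O-heavy part.
    subprocess.run(["sbscooker", "--inputs", sbs_file,
                    "--output-path", sbsar_dir], check=True)
    sbsar = sbs_file.rsplit("/", 1)[-1].replace(".sbs", ".sbsar")
    subprocess.run(["sbsrender", "render",
                    "--input", f"{sbsar_dir}/{sbsar}",
                    "--output-path", out_dir], check=True)
```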

I have seen that the Python demos using batchtools_utilities.py can split the cooking/rendering into a queue of tasks to make more use of the CPU. To make more use of the GPU, would it be wise to convert the Substance graph to use input nodes (instead of direct bitmap nodes), with a main graph containing multiple instances of it and the bitmap nodes connected into those instances, so that I can render a chunk of a sequence at a time? Is there an advisable number of instances? (I'd guess not, as it would depend on the available GPUs and their memory/performance?)

I'm not at a point where I can give strong advice around this quite yet, but I do see performance improvements when running things in parallel, so some of these tasks are certainly I/O bound as well. I'm not sure what the perfect balance is for every machine, but there are I/O and serial stages in some of the processes, and you can make significant gains by running more things at once.
Another performance benefit you can get in sbsrender is making sure you are using the DirectX 10 engine (on Windows). Add something like this to the sbsrender command line: --engine d3d10pc.
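To make that concrete, here's a minimal sketch of a task queue that keeps several sbsrender processes in flight. The image-input identifier "input" is hypothetical (use whatever your graph exposes), and the flags are worth confirming against sbsrender --help on your install:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def render_one(sbsar, frame_png, out_dir, idx):
    # Threads are enough here: the real work happens in the external
    # sbsrender process, so we are just keeping the queue full.
    subprocess.run([
        "sbsrender", "render",
        "--input", sbsar,
        "--set-entry", f"input@{frame_png}",  # hypothetical input id
        "--output-path", out_dir,
        "--output-name", f"frame_{idx:04d}",
        "--engine", "d3d10pc",                # DirectX 10 engine on Windows
    ], check=True)

frames = [f"frames/{i:04d}.png" for i in range(1000)]
with ThreadPoolExecutor(max_workers=4) as pool:
    for i, frame in enumerate(frames):
        pool.submit(render_one, "material.sbsar", frame, "out", i)
```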

Let me know if this helps or if you need more details.

As mentioned, I would modify an SBS file, export the SBSAR, and then render, repeating the process several thousand times. The SBSAR files would be several MB with the current substances. I'm guessing that beyond the I/O, there might be some warmup/init time that could be avoided or reduced by having a substance that instances multiple copies of the graph I'm modifying/rendering, so that I could reduce the number of I/O and render calls by 5-10x, say; roughly along the lines of the sketch below.
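To illustrate: if the main graph exposed one image input per instanced copy (input_0 ... input_9 are hypothetical identifiers here), a chunk of frames could go through a single render call, with the package cooked only once up front:

```python
import subprocess

CHUNK = 10  # instances per main graph; tune to GPU memory/performance

def render_chunk(sbsar, frame_paths, out_dir):
    # One sbsrender call feeds every instanced copy at once;
    # input_0..input_9 are hypothetical image-input identifiers that
    # the main graph would expose, one per instance. Output naming
    # is left to sbsrender's defaults here.
    args = ["sbsrender", "render",
            "--input", sbsar,
            "--output-path", out_dir]
    for i, path in enumerate(frame_paths[:CHUNK]):
        args += ["--set-entry", f"input_{i}@{path}"]
    subprocess.run(args, check=True)
```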

It depends on whether the approach will have Substance use more of the GPU's resources (and I note sbsrender has a default VRAM limit of 1000 MB that I'd need to be aware of as well; see the snippet below). I do know that one of the demos I saw for SAT rendered many texture outputs at once (our substances take several inputs, but only one output is rendered by each graph).
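For reference, the budget looks adjustable on the command line; the flag name is from the sbsrender docs as I remember them, so confirm with sbsrender --help:

```python
import subprocess

# Raise the engine's memory budget (value in MB; the default is 1000).
subprocess.run([
    "sbsrender", "render",
    "--input", "material.sbsar",
    "--memory-budget", "4000",
    "--engine", "d3d10pc",
    "--output-path", "out",
], check=True)
```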

I'll be giving it a try soon to see if it reduces rendering time; if not, I'm sure utilizing more cores will help too :) I have an idea of how to scale the resource usage to make the most of the given hardware, although I'm curious how sbsrender chooses which GPU(s) to use and whether I can influence that priority/selection.

Yes, I'm already using the appropriate engine parameter (d3d10pc on Windows); I found out about it in the past when I was confused about why the outputs were being capped at a 2048px render size.

Working at your scale, these are all valid points.
We are working on a way of telling sbsbaker to bake multiple maps in one call, to reduce the overhead of FBX loading etc., and it might make sense to add that type of mode to sbsrender too.
If I allow myself to speculate a bit here (I don't know what your data looks like), I would recommend working with uncompressed texture formats for the input images if you can take the hit in disk space, since I would assume a lot of the overhead is in image loading, and at your scale it might make a real difference. A sketch of that pre-conversion step is below.
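As a sketch of the pre-conversion I mean (using Pillow, and BMP as the uncompressed container; whether BMP, uncompressed TGA, or something else suits your pipeline depends on your data):

```python
from pathlib import Path
from PIL import Image

def decompress_inputs(src_dir, dst_dir):
    # Trade disk space for load time: decode each compressed input
    # once up front and store it uncompressed, so the renderer does
    # not pay the PNG decode cost on every render.
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for png in Path(src_dir).glob("*.png"):
        Image.open(png).save(dst / (png.stem + ".bmp"), format="BMP")

decompress_inputs("frames", "frames_bmp")
```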