Hi,
Thanks for reaching out to Microsoft Q&A.
In Synapse pipelines, you cannot choose a Spark pool dynamically at runtime using the built-in features of the Notebook activity. The Spark pool is statically defined in the Notebook activity configuration. However, there are practical workarounds to achieve your goal of dynamic pool selection and improved parallelism.
Workarounds that you can try:
- Use Web Activity + REST API to trigger Notebooks
Instead of using the built-in Notebook activity, use a Web activity to call the Synapse REST API and start the notebook programmatically. This lets you pass the Spark pool name as a parameter.
Steps:
- Use the Synapse REST API. Endpoint:
POST https://<workspace-name>.dev.azuresynapse.net/pipelines/<pipeline-name>/createRun?api-version=2020-12-01
This endpoint starts a pipeline run, and the pipeline parameters (which the pipeline can forward to its notebook) go in the request body.
- Create multiple pipelines, each tied to a different Spark pool (each hardcoded).
- From the "controller" pipeline, use logic (If Condition + Web Activity) to route to one of those child pipelines depending on some dynamic condition (like iteration ID, load, etc.).
This is indirect dynamic routing.
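As a rough illustration of the routing described above, here is a minimal Python sketch that picks a child pipeline by iteration ID and builds the createRun request. The pipeline names, the round-robin rule, and the helper names are made up for the example, and it assumes you already have a valid bearer token for the workspace. It only builds the request; it does not send it.

```python
import json
import urllib.request

API_VERSION = "2020-12-01"

# Hypothetical child pipelines; each one is assumed to contain a Notebook
# activity that is hardcoded to a different Spark pool.
CHILD_PIPELINES = ["pl_notebook_pool_a", "pl_notebook_pool_b", "pl_notebook_pool_c"]


def pick_child_pipeline(iteration_id: int) -> str:
    """Round-robin routing: iteration 0 -> first pipeline, 1 -> second, and so on."""
    return CHILD_PIPELINES[iteration_id % len(CHILD_PIPELINES)]


def build_create_run_request(workspace: str, pipeline: str,
                             parameters: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the createRun endpoint."""
    url = (
        f"https://{workspace}.dev.azuresynapse.net"
        f"/pipelines/{pipeline}/createRun?api-version={API_VERSION}"
    )
    # The request body carries the pipeline parameters as a JSON object.
    body = json.dumps(parameters).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

To actually start the run you would pass the returned request to `urllib.request.urlopen`. Inside a Synapse pipeline, the Web activity plays the same role, with the URL and body built from pipeline expressions.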
- Create Multiple Pipelines or Activities with Different Pools
Define three separate Notebook activities in the pipeline, each configured with a different Spark pool, and use an If Condition activity (or a Switch activity, since an If Condition has only two branches) to select which one to run based on input parameters.
- Use Azure Function to Orchestrate
Create an Azure Function that:
- Receives the pipeline input
- Applies logic to choose the Spark pool
- Calls the Synapse REST API to run the notebook on the selected pool
You call the function from the Synapse pipeline using a Web activity. This is more flexible but adds Azure Functions as a dependency.
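The selection logic inside such a function could look something like the sketch below. This is plain Python, not tied to the Azure Functions SDK, and the payload fields (`active_runs`, `pipelines_by_pool`, `parameters`) and the "least busy pool" rule are assumptions for illustration only. How you obtain the active run counts is not shown.

```python
from typing import Mapping


def choose_spark_pool(active_runs: Mapping[str, int]) -> str:
    """Return the pool with the fewest active runs; ties go to the alphabetically first name."""
    if not active_runs:
        raise ValueError("at least one Spark pool is required")
    # sorted() fixes the order, and min() keeps the first of equal minima.
    return min(sorted(active_runs), key=lambda pool: active_runs[pool])


def handle_request(payload: dict) -> dict:
    """Turn the pipeline input into a decision about what to run.

    Hypothetical payload fields:
      active_runs       -> {pool name: number of runs currently on that pool}
      pipelines_by_pool -> {pool name: pipeline hardcoded to that pool}
      parameters        -> parameters to forward to the chosen pipeline
    """
    pool = choose_spark_pool(payload["active_runs"])
    return {
        "sparkPool": pool,
        "pipeline": payload["pipelines_by_pool"][pool],
        "parameters": payload.get("parameters", {}),
    }
```

In a real function you would wrap `handle_request` in an HTTP trigger and then call the createRun endpoint for the returned pipeline.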
- Partition Input and Run Multiple Pipelines
If you control the orchestration:
- Split the input data/workload into parts
- Start three separate pipelines, each tied to a different Spark pool
- Use parallel Execute Pipeline activities
This lets the runs proceed in parallel and avoids a bottleneck on a single pool.
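To make the splitting step concrete, here is a small Python sketch that divides a list of work items round-robin across the available pipelines. The function names and the idea of passing each chunk as a pipeline parameter are assumptions for the example; in practice the split could equally be done in a pipeline expression or upstream of Synapse.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def partition(items: Sequence[T], parts: int) -> List[List[T]]:
    """Split items round-robin into `parts` chunks of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    # Chunk i takes items i, i + parts, i + 2 * parts, ...
    return [list(items[i::parts]) for i in range(parts)]


def plan_runs(items: Sequence[T], pipelines: Sequence[str]) -> List[dict]:
    """Pair each non-empty chunk with one pipeline (one pipeline per Spark pool)."""
    chunks = partition(items, len(pipelines))
    return [
        {"pipeline": name, "items": chunk}
        for name, chunk in zip(pipelines, chunks)
        if chunk  # skip pipelines that would receive no work
    ]
```

Each entry in the plan would then correspond to one Execute Pipeline activity (or one createRun call), with the chunk passed as a parameter.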
Limitations
- No native support for setting the Spark pool name as a parameter inside a Notebook activity.
- You cannot change the pool used by a notebook at runtime unless you use the REST API.
Recommendation
If your goal is to maximize parallelism with multiple pipelines, the simplest solution is:
- Create multiple pipelines with different pools
- Use a controller pipeline to decide which to run
- Or, split Notebook activities in the same pipeline with If Condition blocks
If you need more flexibility, go with the REST API approach or Azure Function orchestration.
Please 'Upvote' (Thumbs-up) and 'Accept' as answer if the reply was helpful. This will benefit other community members who face the same issue.