Deployment of resource type MachineLearning fails using Lighthouse

Question

Ivelin Andreev 25

We are deploying a large number of Azure resources to a customer subscription using Azure Lighthouse. The ARM templates work well when executed by a member or guest user directly in the subscription, but fail when we use resource delegation with Lighthouse:

The error is AccessDenied
The deployment stalls and, after several hours, fails with a timeout.

As requested during our interaction with the support team (Ref# 2504240050002808), we are raising the topic with the Lighthouse PG for further action. The ticket was handled by the Machine Learning service support team. During our online meeting, they checked internal documentation and stated that Lighthouse does not support the ML service.

Could you please verify this and provide a workaround or an estimated time for support?

Cheers

Ashok Gandhi Kotnana 10,430 Reputation points Microsoft External Staff Moderator

2025-05-16T15:43:14.2033333+00:00

Hi @Ivelin Andreev,

The key error in your message is:

'Empty oid in access token claim'

This means:

The identity (token) used in the deployment request does not contain an oid (Object ID).

This typically happens when:

A managed identity or service principal from a different tenant (via Lighthouse) is used

And Azure Machine Learning (Microsoft.MachineLearningServices) internally requires a user or service principal with a valid oid in the token

But Lighthouse delegates don't always pass valid oid claims for internal service requests

Azure ML (and Azure AI Hub built on it):

Uses authorization checks against Entra ID directly

Often makes internal calls from the ML control plane to services like authorization.vienna-swedencentral.svc

These calls fail if the access token from a delegated principal (via Lighthouse) lacks a proper oid

In contrast, most ARM-based services (VMs, Storage, etc.) only check the RBAC assignment — and work fine through Lighthouse.

Unfortunately, there is no full support for Azure ML workspace creation via delegated identities through Lighthouse today, due to internal API calls requiring an oid in the token — which delegated identities don’t provide.

If you found the answer helpful, please consider upvoting it. Let me know if the issue is resolved. If you encountered any issues or need further clarification, please feel free to let us know, as we are always here to help whenever you need us.
Ashok Gandhi Kotnana 10,430 Reputation points Microsoft External Staff Moderator

2025-05-19T03:27:42.5133333+00:00

Hi @Ivelin Andreev,

I just wanted to check if the above provided information helped you or if you need any further assistance?

Please feel free to let us know, as we are always here to help whenever you need us.

Additionally, if you have a moment, please do upvote it if the information provided has been helpful. This can be beneficial to other community members and would be greatly appreciated.

Thank you.
Ashok Gandhi Kotnana 10,430 Reputation points Microsoft External Staff Moderator

2025-05-20T06:36:18.9433333+00:00

Hi @Ivelin Andreev,

We haven't heard back from you. Please reply if you have any questions in this matter and we will gladly continue the discussion.

Please do not forget to upvote wherever the information provided helps you; this can be beneficial to other community members and would be greatly appreciated.
Anonymous

2025-05-26T16:59:46.75+00:00

Hi Ivelin Andreev,

We haven’t heard from you on the last response and were just checking back to see whether you have a resolution yet. If you do, please share it with the community, as it can be helpful to others. Otherwise, please respond with more details and we will try to help.
Ivelin Andreev 25 Reputation points

2025-05-26T20:23:19.3066667+00:00

@Ashok Gandhi Kotnana, I was away for a week and did not receive an email notification about the response, hence my delayed reply. Thank you for the swift reply!
From your comment, it appears we have hit a limitation and there is no full support for Azure ML workspace creation using resource delegation. Could you please specify whether there is a workaround other than running the ARM template within the same tenant?

Greetings
Anonymous

2025-05-28T07:59:28.8766667+00:00

Hi Ivelin Andreev,

We haven’t heard from you on the last response and were just checking back to see whether you have a resolution yet. If you do, please share it with the community, as it can be helpful to others. Otherwise, please respond with more details and we will try to help.
Ivelin Andreev 25 Reputation points

2025-05-28T23:12:52.2866667+00:00

@Vinay B, thank you for the frank statement that what we need is not supported. As mentioned, we are using the workaround with member users, but it does not work for all of our environments, because we are not always granted such access. Instead, we have to ask customers to run the deployment themselves, which is quite uncomfortable and leaves a bad impression of poor UX. I also understand that, due to the fundamental architectural challenge, such support should not be expected in the near future.

In this respect, it appears that this should be converted into a feature request (or linked to an existing one).
Anonymous

2025-05-29T10:57:57.8933333+00:00

Hi Ivelin Andreev,

Thank you again for the thoughtful and transparent follow-up. You're absolutely right: relying on customer-executed steps not only introduces friction but also undermines the very goals of automation, delegation, and scalability that Azure Lighthouse is meant to deliver. Your point about the user experience is well taken and entirely valid.

Given the architectural constraints around oid handling in cross-tenant contexts, it’s clear that this isn’t a simple implementation gap but a deeper integration challenge that requires coordinated changes across multiple Azure control planes.

That said, your scenario strongly reinforces the need for first-class support of Azure Machine Learning in Lighthouse-based deployments.

Please let us know if it was helpful, and feel free to reach out if you have any further queries.
Ivelin Andreev 25 Reputation points

2025-05-29T22:33:13.4333333+00:00

@Vinay B, thank you as well. Beyond the comment, it is not clear whether a ticket already exists or whether I should create one that the PG could spot and potentially take into account.
Anonymous

2025-05-30T11:12:32.8066667+00:00

Hi Ivelin Andreev,

Please review the private message, where I explained in detail the ticket and how we will handle this issue privately from now on.

Thanks

Answer 1

Hi Ivelin Andreev,

The challenge you’re encountering stems from a fundamental architectural constraint in how Azure Machine Learning (AML) handles identity when used in cross-tenant scenarios via Azure Lighthouse. Specifically, AML requires access tokens that include a valid oid (Object ID) claim, which delegated service principal tokens commonly used via Lighthouse often lack. As a result, internal control plane operations fail authorization, leading to long-running deployments that eventually time out.

At this time, Azure Machine Learning does not fully support creation or management through delegated identities enabled via Azure Lighthouse. This limitation has been confirmed internally through Microsoft support engagements and aligns with how the AML backend validates identity during resource provisioning.

Recommended Workarounds:

To proceed while staying within supported boundaries, here are a few alternatives you can adopt:

Use a service principal or managed identity native to the customer tenant (i.e., where the ML workspace is being deployed) to ensure tokens contain the correct oid claims.

Run deployment tasks using Azure DevOps agents or Azure Automation hosted in the customer tenant, where full Entra ID context is available.

Split the deployment into stages: deploy delegation-supported resources through Lighthouse, and deploy AML workspaces using an identity that exists within the customer’s tenant.
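The staged approach in the last workaround can be sketched as a small pre-processing step over the ARM template. This is an illustration under stated assumptions, not a supported tool: the function name split_template and the IN_TENANT_PREFIXES constant are hypothetical, only the Microsoft.MachineLearningServices namespace named in this thread is routed to the in-tenant stage, and only top-level resources are inspected. Any dependsOn entries or reference() expressions that cross the two stages are not rewritten and would need manual handling.

```python
from typing import Tuple

# Assumption: only this provider namespace must be deployed in-tenant.
IN_TENANT_PREFIXES = ("microsoft.machinelearningservices/",)


def split_template(template: dict) -> Tuple[dict, dict]:
    """Split one ARM template into (lighthouse_stage, in_tenant_stage).

    Top-level resources whose type starts with an in-tenant prefix go to the
    second template; everything else stays in the first. All other template
    sections (parameters, variables, and so on) are copied to both.
    """
    lighthouse, in_tenant = [], []
    for resource in template.get("resources", []):
        rtype = resource.get("type", "").lower()
        if rtype.startswith(IN_TENANT_PREFIXES):
            in_tenant.append(resource)
        else:
            lighthouse.append(resource)
    shared = {k: v for k, v in template.items() if k != "resources"}
    return (
        {**shared, "resources": lighthouse},
        {**shared, "resources": in_tenant},
    )


if __name__ == "__main__":
    demo = {
        "contentVersion": "1.0.0.0",
        "resources": [
            {"type": "Microsoft.Storage/storageAccounts", "name": "st"},
            {"type": "Microsoft.MachineLearningServices/workspaces", "name": "ws"},
        ],
    }
    stage1, stage2 = split_template(demo)
    print([r["name"] for r in stage1["resources"]])
    print([r["name"] for r in stage2["resources"]])
```

The first template would then be deployed through the Lighthouse delegation as before, and the second by an identity that lives in the customer's tenant.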

To ensure this gap is visible to the Product Group (PG) and prioritized accordingly, I’ll take the following steps:

Because there is no public channel for reporting this, I will surface this scenario internally through Microsoft’s private feedback channels for the Azure ML and Lighthouse product teams.

In parallel, I strongly encourage submitting this as a public request via link under the Azure Machine Learning category. This allows the broader community and PG to gauge interest and urgency.

If you do create a feedback item, feel free to share the link. I’ll make sure it gets amplified on our side as well.

I hope this has been helpful!

If anything remains unclear, or you’d like further clarification, feel free to drop a comment below.

Please click Accept if this response answers your question. It helps the community discover solutions more quickly.

Answer 2

Ivelin Andreev hi ))) u’re stuck in that awkward spot where u have to rely on customers to run the ARM template because u can’t get member access everywhere, and it’s clunky as heck )) yeah, that’s a rough UX look for sure.

bad news first: u’re right, this isn’t just a "flip a switch and it’ll work tomorrow" thing. the way azure ml workspace hooks into identity and resource management makes it a real headache for lighthouse delegation. microsoft’s docs don’t sugarcoat it either: cross-tenant ml workspace deployment isn’t officially supported (check the "limitations" section).

let’s brainstorm workarounds together?

service principal + custom role in customer tenant – if u can get the customer to set up a service principal with just the permissions u need (ml workspace contributor + network/keys/etc.), u could automate deployments without full member access. still a bit of back-and-forth, but less than asking them to run ARM manually. guide here.

azure devops pipelines in their tenant – if they let u drop a pipeline in their subscription, u could trigger deployments remotely. not ideal, but better than "hey customer, click this button for us" )

bicep/terraform + approval flows – if u’re using infra-as-code, u could package the template and have customers approve/reject via azure blueprints or a devops pipeline. still manual, but at least it’s a one-click thing for them.

would u like a feature request? ABSOLUTELY. u’re spot-on – this needs to be a loud voice in microsoft’s ear. upvote existing requests (or make a new one) on azure feedback. the more noise, the better. reference ur support case when u post – it shows real-world pain, not just "wouldn’t it be nice". tag it with both machine learning and lighthouse so the right teams see it.

it’s a bummer, but u’re not alone: this trips up a lot of folks trying to do multi-tenant ml ops. for now, the service principal hack is the least-worst option unless microsoft surprises us with a lighthouse update )) keep pushing for that feature request though! the more of us yell, the faster they’ll prioritize it. and yes I know microsoft u’re listening, so… pretty please?

)))))) have a good friday!

Best regards,

Alex

and "yes" if you would follow me at Q&A - personally thx.
P.S. If my answer helped you, please Accept my answer
PPS That is my Answer and not a Comment

https://ctrlaltdel.blog/

Ivelin Andreev 25 Reputation points

2025-06-01T17:09:00.6233333+00:00

@Alex Burlachenko, your response is exceptional, thorough and really good, though still a workaround. Some of our customers create their Azure subscriptions because of us, after having GCP subscriptions, for example. Even now we spend a few meetings going through some scripts and explaining what is what (i.e. Graph API and consents, enterprise applications). Our philosophy is to guarantee security but also to minimize the time we spend, and pipelines would really be too much. We do have pipelines in our own subscriptions for each of the environments customers are paying for, though, so the workflow is familiar to us. I promise I will ask a couple of customers about the pipelines, but please note that this conversation would come on top of the conversations and explanations we already have to convince them to use Lighthouse. You are probably aware how reluctant organizations are to provide any access to anything; they prefer being completely in control. I submitted feedback, which was also Vinay's proposal: https://feedback.azure.com/d365community/idea/d2833d2e-0c3f-f011-a2d9-7c1e5243a09b
Don van Meel | Twyzer 40 Reputation points

2025-06-18T06:17:13.2866667+00:00

We have the same problem deploying our solution via Lighthouse. We want to create an AI Hub; the problem also exists when we want to deploy our solution via the Azure Marketplace.

https://learn.microsoft.com/en-us/answers/questions/2278707/problem-deploying-ai-foundry-hub-via-bicep-to-mana#message-1749711890730

Have you heard of any progress on the problem?
