Hi Janice Chi
Is Databricks Repos recommended for multi-developer teams?
Yes, Databricks Repos is the recommended approach for teams working on shared notebooks, especially when collaboration, version control, and traceability are important. It enables Git integration (Azure Repos, GitHub, Bitbucket, etc.), so developers can use familiar workflows like branching, pull requests, and code reviews directly from the Databricks UI.
More info: Docs: CI/CD techniques with Git folders (Repos)
Are there Microsoft-supported alternatives to Repos for version control and CI/CD?
Yes, but with trade-offs:
- You can manage notebooks in external Git repos (outside Databricks) and sync them manually using the Databricks CLI or REST API.
- You can also build and deploy Python wheels or JARs to Databricks jobs using pipelines in Azure DevOps or GitHub Actions.
These alternatives work well for code-based (non-notebook) projects but can be harder to manage for notebook-heavy pipelines and lack native collaboration support.
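To make the manual-sync route concrete, here is a minimal sketch of pushing a locally versioned notebook into the workspace through the Workspace Import REST API (the workspace URL, token, and paths below are placeholders, not values from your environment):

```python
import base64
import json


def build_import_payload(workspace_path: str, local_source: str) -> dict:
    """Build the request body for POST /api/2.0/workspace/import,
    which expects base64-encoded notebook source."""
    return {
        "path": workspace_path,
        "format": "SOURCE",
        "language": "PYTHON",
        "overwrite": True,
        "content": base64.b64encode(local_source.encode("utf-8")).decode("ascii"),
    }


def import_notebook(host: str, token: str, workspace_path: str, local_file: str) -> None:
    """Sync one local notebook file into the workspace.
    host/token are hypothetical; requests is a third-party dependency."""
    import requests  # pip install requests

    with open(local_file, encoding="utf-8") as f:
        payload = build_import_payload(workspace_path, f.read())
    resp = requests.post(
        f"{host}/api/2.0/workspace/import",
        headers={"Authorization": f"Bearer {token}"},
        data=json.dumps(payload),
    )
    resp.raise_for_status()
```

A pipeline would loop this over every changed file on each merge, which is exactly the per-file bookkeeping (and error surface) that Repos handles for you.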
Risks of not using Repos in an enterprise data pipeline
Some potential limitations of skipping Repos:
- No native version control inside Databricks: you lose commit history and rollback options
- Higher risk of overwrite conflicts between developers
- Manual syncing between notebooks and Git introduces errors
- Harder to implement automated CI/CD and enforce governance
For enterprise-scale ingestion pipelines with audit, traceability, and reliability needs, these risks can become blockers over time.
More info: Docs: Software engineering best practices for notebooks
Does using ADF with job-scoped clusters change this recommendation?
Not at all. Your setup, using ADF to trigger Databricks jobs on job-scoped clusters, works very well with Repos. You can store your production-ready notebooks in a Git-connected Repo, reference them in Databricks jobs, and call those jobs from ADF.
This structure supports:
- Clear separation of orchestration (ADF) and logic (Databricks)
- Code promotion through environments (dev/test/prod)
- CI/CD pipelines to automate deployment
More info: Docs: CI/CD on Azure Databricks (Overview)
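As a concrete illustration of "reference them in Databricks jobs": a job can point its notebook task straight at a Git branch via the Jobs API's git_source field, so what runs is always what was committed. A sketch (the repo URL, notebook path, and cluster sizing are placeholder values):

```python
def build_git_job_spec(job_name: str, repo_url: str, branch: str, notebook_path: str) -> dict:
    """Job spec for POST /api/2.1/jobs/create: the notebook is fetched
    from Git at run time (source="GIT") rather than from the workspace."""
    return {
        "name": job_name,
        "git_source": {
            "git_url": repo_url,
            "git_provider": "azureDevOpsServices",  # or "gitHub", etc.
            "git_branch": branch,
        },
        "tasks": [
            {
                "task_key": "ingest",
                "notebook_task": {
                    "notebook_path": notebook_path,  # path inside the repo
                    "source": "GIT",
                },
                # Job-scoped (ephemeral) cluster, matching the ADF setup above;
                # spark_version and node_type_id here are example values.
                "new_cluster": {
                    "spark_version": "15.4.x-scala2.12",
                    "node_type_id": "Standard_DS3_v2",
                    "num_workers": 2,
                },
            }
        ],
    }
```

ADF then simply triggers this job by ID; orchestration stays in ADF while the logic is versioned in Git.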
In summary:
- Use Databricks Repos for collaborative notebook development
- Integrate Git (Azure Repos, GitHub, etc.) to track changes and manage PRs
- Continue using ADF to orchestrate job runs using job-scoped clusters
- Set up CI/CD pipelines with Azure DevOps or GitHub Actions to promote code across environments
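The promotion step in such a pipeline can be as small as pointing the target environment's Repo at the release branch. A sketch using the Repos REST API (host, token, and repo ID are placeholders; requests is a third-party dependency):

```python
import json


def build_repo_update_body(branch: str) -> dict:
    """Request body for PATCH /api/2.0/repos/{repo_id}: checks the
    workspace Repo out to the given branch."""
    return {"branch": branch}


def promote(host: str, token: str, repo_id: int, branch: str = "release") -> None:
    """Typically called from an Azure DevOps or GitHub Actions stage
    after tests pass, once per environment (dev -> test -> prod)."""
    import requests  # pip install requests

    resp = requests.patch(
        f"{host}/api/2.0/repos/{repo_id}",
        headers={"Authorization": f"Bearer {token}"},
        data=json.dumps(build_repo_update_body(branch)),
    )
    resp.raise_for_status()
```

Each environment gets its own Repo (and repo_id), so promotion is a branch checkout rather than a file copy, and rollback is just checking out the previous release branch.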
Hope this helps. If this answers your query, do click "Accept Answer" and "Yes" for "Was this answer helpful". And if you have any further query, do let us know.