To update labels in a Custom Text Classification project without recreating it each time, it's important to understand how Azure Machine Learning handles file ingestion and caching. Initially, your labeling worked correctly because the system treated the uploaded .txt files and JSON annotations as new. However, when you tried to update the data using the same filenames and structure, Azure likely cached the original versions, so the updates were ignored and the old labels persisted.
To ensure updates are recognized, follow a structured approach. First, modify your .txt files, either by renaming them or by slightly changing their content, so that Azure treats them as new assets. Then generate a fresh JSON annotation file that adheres to the required schema, including the project version, string index type, metadata, and asset definitions.
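As a rough illustration of those two preparation steps, here is a minimal Python sketch that renames documents with a version suffix and builds a matching annotation file. The top-level field names follow the Custom Text Classification data format as I understand it; the version string "2022-05-01", the container and project names, the language code, and the helpers `versioned_name` and `build_labels_file` are placeholders I made up for illustration, so please check them against the accepted data formats page before relying on this.

```python
import json
import os


def versioned_name(filename: str, version: int) -> str:
    """Return a new filename with a version suffix, e.g. doc1.txt -> doc1_v2.txt."""
    stem, ext = os.path.splitext(filename)
    return f"{stem}_v{version}{ext}"


def build_labels_file(labels: dict[str, str], version: int) -> dict:
    """Build a single-label annotation document.

    `labels` maps each original .txt filename to its class name.
    All literal values below are placeholders (assumptions), not confirmed settings.
    """
    documents = [
        {
            "location": versioned_name(name, version),
            "language": "en-us",  # assumed language code
            "class": {"category": category},
        }
        for name, category in labels.items()
    ]
    classes = [{"category": c} for c in sorted(set(labels.values()))]
    return {
        "projectFileVersion": "2022-05-01",  # assumed version string
        "stringIndexType": "Utf16CodeUnit",
        "metadata": {
            "projectKind": "CustomSingleLabelClassification",
            "storageInputContainerName": "my-container",  # placeholder
            "projectName": "my-project",  # placeholder
            "multilingual": False,
            "description": "",
            "language": "en-us",
        },
        "assets": {
            "projectKind": "CustomSingleLabelClassification",
            "classes": classes,
            "documents": documents,
        },
    }


if __name__ == "__main__":
    labels_file = build_labels_file({"a.txt": "Spam", "b.txt": "Ham"}, version=2)
    print(json.dumps(labels_file, indent=2))
```

The renamed .txt files in storage must match the "location" values exactly, so rename the files with the same helper before uploading.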
After preparing your files, upload them to Azure storage and confirm that they were ingested successfully. Next, import the new JSON annotation file using Azure Language Studio or Machine Learning Studio. If the updates still do not appear, Microsoft recommends troubleshooting steps such as clearing your browser cache, verifying file permissions, avoiding duplicate filenames, and making sure the JSON strictly follows the required format.
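Two of those checks (duplicate filenames and JSON structure) can be done locally before importing. The sketch below is only an assumption-based sanity check, not an official validator: the list of required keys and the helper name `check_labels_file` are mine, and it assumes the single-label layout where each document has one "class" entry.

```python
from collections import Counter

# Assumed set of required top-level keys; confirm against the data format docs.
REQUIRED_TOP_LEVEL = ("projectFileVersion", "stringIndexType", "metadata", "assets")


def check_labels_file(data: dict) -> list[str]:
    """Return a list of human-readable problems found in a parsed annotation file."""
    problems = [
        f"missing top-level key: {key}"
        for key in REQUIRED_TOP_LEVEL
        if key not in data
    ]

    assets = data.get("assets", {})
    declared = {c.get("category") for c in assets.get("classes", [])}
    documents = assets.get("documents", [])

    # The same filename listed twice is a common reason an import misbehaves.
    counts = Counter(doc.get("location") for doc in documents)
    for location, n in counts.items():
        if n > 1:
            problems.append(f"duplicate filename: {location}")

    # Every document's class should appear in the declared class list.
    for doc in documents:
        category = doc.get("class", {}).get("category")
        if category not in declared:
            problems.append(f"{doc.get('location')}: undeclared class {category!r}")

    return problems
```

Load your file with `json.load` and pass the result in; an empty list means none of these particular problems were found, not that the import is guaranteed to succeed.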
To avoid similar issues in the future, implement best practices like version control for annotation files, documenting schema changes, testing updates on a small subset of data, and cleaning up unused files. This ensures smooth label management and eliminates the need to recreate your project for every update.
References: Label data with Azure ML, Custom Text Classification accepted data formats, Manage labeling projects
Hope it Helps
Thanks