When you're working with Serverless SQL Pool in Synapse Analytics connected to Azure Cosmos DB Analytical Store, encountering non-JSON objects is a common issue, especially when:
- The source data has inconsistencies in formatting.
- Certain items in the Analytical Store are malformed or not valid JSON.
- There are differences in expected schema structures (e.g., missing or extra properties).
- The Synapse view over the Cosmos DB container cannot infer a proper schema due to data anomalies.
The Synapse Serverless SQL pool queries the Analytical Store using a defined schema, and it expects all items to be valid JSON documents. If even a single record doesn't conform to JSON standards (e.g., trailing commas, incorrect escaping, invalid characters), it will be interpreted as a non-JSON object, and you'll see issues like empty rows, null values, failure to read specific rows, or _raw (or similar) columns showing unparsed strings.
When you paste the document into Notepad++ and it looks like JSON, you're seeing visually valid JSON, but Synapse requires technically valid JSON per the JSON spec. You might want to check for unescaped quotes, backslashes, or control characters, numeric fields using non-standard values (e.g., NaN, Infinity), or comments and trailing commas, which JSON doesn't support.
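If you can copy a suspicious document out as text, a quick way to check it against the strict rules above is a small script. This is only a sketch using Python's standard json module; the function name check_strict_json and the sample documents are made up for illustration, and Python's parser is not the one Synapse uses, so treat a pass here as a hint rather than a guarantee.

```python
import json


def _reject_constant(name):
    # json.loads calls this hook for NaN, Infinity and -Infinity, which
    # Python accepts by default but the JSON spec does not allow.
    raise ValueError(f"non-standard constant {name!r}")


def check_strict_json(text):
    """Return None if text parses as strict JSON, else a short error message."""
    try:
        json.loads(text, parse_constant=_reject_constant)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        return str(exc)
    return None


# Hypothetical sample documents, one per problem mentioned above.
samples = [
    '{"id": "1", "price": 9.5}',          # valid
    '{"id": "2", "price": NaN}',          # NaN is not JSON
    '{"id": "3", "tags": ["a", "b",]}',   # trailing comma
    '{"id": "4" /* note */}',             # comment
    '{"id": "5", "text": "line\nbreak"}', # raw control character in a string
]

for sample in samples:
    print(check_strict_json(sample) or "ok", "<-", repr(sample))
```

Anything that does not print "ok" is worth a closer look in the source container.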
To validate:
A. Use OPENROWSET with the CosmosDB provider - you can query the container directly to inspect problematic rows (<credential> is a placeholder for a server credential that holds the account key):
SELECT TOP 10 *
FROM OPENROWSET(
    PROVIDER = 'CosmosDB',
    CONNECTION = 'Account=<cosmos-db>;Database=<db>',
    OBJECT = '<collection>',
    SERVER_CREDENTIAL = '<credential>'
) AS rows
Or try:
SELECT TOP 100 *
FROM [CosmosDB].[database].[collection]
WHERE ISJSON(_raw) = 0
This assumes _raw is available or you're querying through a view over the Analytical Store.
B. Add an ISJSON() check - you can filter for only valid JSON rows like this:
SELECT *
FROM [CosmosDB].[database].[collection]
WHERE ISJSON([your_column]) = 1
This should help temporarily skip malformed records.
If possible, fix the malformed documents in Cosmos DB directly. You can:
- Query for invalid JSON items using the Cosmos DB SDK or Data Explorer in the Azure portal.
- Rewrite or delete problematic entries.
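To decide which entries to rewrite, it can help to see which properties are missing or change type from item to item. The sketch below assumes the items have already been fetched (for example through an SDK query) into plain Python dicts; the function names and the sample items are hypothetical, and it only looks at top-level properties.

```python
from collections import defaultdict


def profile_property_types(items):
    """Map each top-level property to the type names seen and the ids using them."""
    seen = defaultdict(lambda: defaultdict(list))
    for item in items:
        for key, value in item.items():
            seen[key][type(value).__name__].append(item.get("id"))
    return seen


def find_inconsistencies(items):
    """Return properties that are missing from some items or have mixed types."""
    items = list(items)
    report = {}
    for key, by_type in profile_property_types(items).items():
        present = sum(len(ids) for ids in by_type.values())
        missing = len(items) - present
        if len(by_type) > 1 or missing:
            report[key] = {
                "types": {t: ids for t, ids in by_type.items()},
                "missing_in": missing,
            }
    return report


# Hypothetical sample items; real ones would come from a container query.
items = [
    {"id": "1", "price": 9.5, "tags": ["a"]},
    {"id": "2", "price": "9.5", "tags": ["b"]},  # price stored as a string
    {"id": "3", "tags": ["c"]},                  # price missing
]

for prop, details in find_inconsistencies(items).items():
    print(prop, details)
```

The ids listed under each type point at the items that disagree with the rest, which are the ones to rewrite or delete.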
In Synapse, you can use TRY_CAST or TRY_CONVERT to avoid query failures:
SELECT
TRY_CAST(JSON_VALUE([your_column], '$.someProperty') AS VARCHAR(100)) AS SomeValue
FROM [CosmosDB].[database].[collection]
If the JSON structure is inconsistent, you might use Azure Data Factory or Synapse Dataflows to flatten or clean the data before querying it.
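To give an idea of what "flatten" means here, the following is a rough Python sketch that turns a nested document into a single level of dotted column names. It is not how Azure Data Factory or Synapse Dataflows do it internally; the flatten function, the dotted naming scheme, and the sample document are assumptions made for illustration.

```python
def flatten(doc, parent="", sep="."):
    """Flatten nested dicts/lists into a single-level dict with dotted keys."""
    if isinstance(doc, dict):
        pairs = doc.items()
    elif isinstance(doc, list):
        pairs = enumerate(doc)
    else:
        # A scalar has nothing to flatten; return it under the current name.
        return {parent: doc}

    flat = {}
    for key, value in pairs:
        name = f"{parent}{sep}{key}" if parent else str(key)
        if isinstance(value, (dict, list)) and value:
            # Recurse into non-empty containers, extending the key path.
            flat.update(flatten(value, name, sep))
        else:
            # Scalars and empty containers are kept as they are.
            flat[name] = value
    return flat


# Hypothetical nested document.
doc = {
    "id": "1",
    "address": {"city": "Oslo", "geo": {"lat": 59.9}},
    "tags": ["a", "b"],
}
print(flatten(doc))
```

Once every item has the same flat set of keys, fixed column definitions and casts on the SQL side become much more predictable.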
If the above response helps answer your question, remember to "Accept Answer" so that others in the community facing similar issues can easily find the solution. Your contribution is highly appreciated.
hth
Marcin