I'm getting valid JSON out of gpt-3.5-turbo without trouble. I supply an example via the assistant context, and tell it to output JSON with specific fields I name.
It does fail roughly 1/10th of the time, but it does work.
10% failure rate is too damn high for a production use case.
What production use case, you ask? You could do zero-shot entity extraction using ChatGPT if it were more reliable. Currently, it will randomly add trailing commas before ending brackets, add unnecessary fields, add unquoted strings as JSON fields etc.
Which is why this is just an experiment. I’ve gone back to standard translation APIs for everything except the final summarizing (and even them I might go there as well).
It does fail roughly 1/10th of the time, but it does work.