So even if GPT-4 "understands" what needs to be drawn (in the sense that the higher-level concepts and relations between them are embedded in its weights), it's difficult to get DALL-E to draw it correctly because DALL-E doesn't have that same "understanding". OpenAI will need to combine these models somehow, or train DALL-E so that it has the same conceptual understanding.