OpenAIās current documentation draws a clear operational line between three kinds of AI work: cache-friendly interactive requests, long-running background jobs, and offline batch runs. If a team still pushes all of that through one synchronous path, it is usually paying more than necessary in latency, API cost, andę²»ē?