angelo giacco

AI-ing fast and slow

Suppose you want to build a language learning application that uses an AI model's realtime voice mode as a language tutor.

In order for it to become an effective tutor, you need two things: fast, low-latency responses in the moment, and slower, deliberate reasoning over the learner's full history.

This maps quite nicely onto the System 1 and System 2 paradigm from *Thinking, Fast and Slow*.

System 1 thinking can be provided by the realtime API itself. System 2 can be mimicked by tool calls. The response to a tool call could itself be the output of another model, for instance one with a large context window and strong performance (Gemini 1.5 Pro?). That tool call can generate insights from the entire conversation history, synthesise them into small context objects, and pass those back to the realtime voice API without incurring crazy costs.
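A minimal sketch of that split, in Python. Everything here is illustrative: the tool name `deep_analysis`, the handler shape, and the stubbed `long_context_model` function stand in for whatever realtime API and long-context model you actually wire up — the point is just that the full transcript goes to the slow model and only a small synthesised object comes back to the voice loop.

```python
import json

def long_context_model(prompt: str) -> dict:
    """Stand-in for a call to a long-context model (e.g. Gemini 1.5 Pro).
    In practice this would be an API request over the whole transcript;
    here it returns canned insights so the sketch is runnable."""
    return {
        "weak_areas": ["past tense conjugation"],
        "suggested_focus": "drill irregular preterite verbs",
    }

def handle_tool_call(name: str, transcript: list[dict]) -> str:
    """Dispatch a tool call made by the realtime (System 1) session.

    The entire conversation history is handed to the System 2 model;
    only a compact JSON context object is returned, keeping the
    realtime model's context (and cost) small."""
    if name == "deep_analysis":
        prompt = "Summarise this learner's progress:\n" + json.dumps(transcript)
        insights = long_context_model(prompt)
        return json.dumps(insights)
    raise ValueError(f"unknown tool: {name}")

# Hypothetical snippet of a tutoring session
transcript = [
    {"role": "user", "content": "Yo... uh, yo iba al tienda ayer"},
    {"role": "assistant", "content": "Close! 'Fui a la tienda' — 'ir' is irregular."},
]
print(handle_tool_call("deep_analysis", transcript))
```

In a real deployment, the realtime session would register `deep_analysis` in its tool schema and invoke the handler asynchronously, so the voice conversation never blocks on the slow model.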

I have written before about how a built-up relationship (read: context) is the moat for AI applications. This is how I would implement it.

#post #posts