Ollama streaming response on Ignition

I’m trying to integrate Ollama (a local LLM server) with Ignition’s Web Dev module to get real-time streaming responses instead of waiting for the full reply. My current script works fine with "stream": false using HttpURLConnection, but I can’t get "stream": true working. How can I safely implement asynchronous or streaming HTTP responses in Ignition so clients receive data progressively?
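A stripped-down sketch of the kind of blocking call I mean (the endpoint URL, model name, and payload below are placeholders rather than my exact script):

```python
from java.net import URL
from java.io import BufferedReader, InputStreamReader, OutputStreamWriter

def askOllama(prompt):
	# Plain HttpURLConnection POST to Ollama's generate endpoint.
	conn = URL("http://localhost:11434/api/generate").openConnection()
	conn.setRequestMethod("POST")
	conn.setRequestProperty("Content-Type", "application/json")
	conn.setDoOutput(True)

	payload = system.util.jsonEncode({"model": "llama3", "prompt": prompt, "stream": False})
	writer = OutputStreamWriter(conn.getOutputStream(), "UTF-8")
	writer.write(payload)
	writer.close()

	# Blocks here until Ollama has generated the entire reply.
	reader = BufferedReader(InputStreamReader(conn.getInputStream(), "UTF-8"))
	lines = []
	line = reader.readLine()
	while line is not None:
		lines.append(line)
		line = reader.readLine()
	reader.close()
	return system.util.jsonDecode("".join(lines))
```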

I'm confused.

WebDev is used to write HTTP APIs, for consumption by other HTTP clients.

If you're trying to talk to Ollama, presumably Ignition is the HTTP client (and Ollama is the API), so the WebDev module is not involved.

Assuming I'm not completely misunderstanding you, what you want is system.net.httpClient (not HttpURLConnection - that API is unpleasant to use). If you call system.net.httpClient().getJavaClient() you get access to the underlying java.net.http.HttpClient.
You can invoke an HTTP request manually and provide your own "body handler" that does whatever you want with the incoming response as it arrives; see BodyHandlers.
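Untested sketch of what that might look like in Jython - the Ollama URL, model name, and payload are assumptions; the key piece is BodyHandlers.ofLines(), which hands you the response line by line as it arrives instead of buffering the whole thing:

```python
from java.net.http import HttpRequest, HttpResponse
from java.net import URI

def streamFromOllama(prompt, onLine):
	# Underlying Java 11+ HttpClient behind Ignition's system.net.httpClient().
	client = system.net.httpClient().getJavaClient()

	body = system.util.jsonEncode({"model": "llama3", "prompt": prompt, "stream": True})
	request = HttpRequest.newBuilder() \
		.uri(URI.create("http://localhost:11434/api/generate")) \
		.header("Content-Type", "application/json") \
		.POST(HttpRequest.BodyPublishers.ofString(body)) \
		.build()

	# ofLines() returns a lazy java.util.stream.Stream of lines; iterating it
	# consumes the body progressively rather than waiting for the full reply.
	response = client.send(request, HttpResponse.BodyHandlers.ofLines())
	it = response.body().iterator()
	while it.hasNext():
		onLine(it.next())

# Example usage: log each raw line/chunk as it arrives.
# streamFromOllama("Why is the sky blue?", system.util.getLogger("ollama").info)
```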

This is a case where an LLM would actually probably do okay, with guardrails - ask your LLM du jour for a Jython script, using Python 2.7 syntax and preferring Java imports, that uses Java's standard-library HTTP client (java.net.http) to consume a streaming response.

I don't know anything about Ollama's API or how it separates chunks of data - that will be another thing for you to figure out.
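But if it turns out to be the usual newline-delimited JSON (one object per line), a per-line handler like this would plug straight into the ofLines() sketch above - the field names here are guesses to verify against Ollama's docs:

```python
def handleOllamaLine(line, onText, onDone):
	# Skip any blank keep-alive lines.
	if not line.strip():
		return
	chunk = system.util.jsonDecode(line)
	if chunk.get("done"):
		onDone()
	else:
		# Assumed: "response" carries the partial text in each streamed chunk.
		onText(chunk.get("response", ""))
```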

Good luck.
