Ollama streaming response in Ignition

I’m trying to integrate Ollama (a local LLM server) with Ignition’s Web Dev module to get real-time streaming responses instead of waiting for the full reply. My current script works fine with "stream": false using HttpURLConnection, but I can’t get "stream": true working. How can I safely implement asynchronous or streaming HTTP responses in Ignition so clients receive data progressively?

I'm confused.

WebDev is used to write HTTP APIs, for consumption by other HTTP clients.

If you're trying to talk to Ollama, presumably Ignition is the HTTP client (and Ollama is the API), so the WebDev module is not involved.

Assuming I'm not completely misunderstanding you, then what you want is system.net.httpClient (not HttpURLConnection - that API is unpleasant to use). If you use system.net.httpClient().getJavaClient() you get access to the underlying java.net.HttpClient.
You can invoke an HTTP request manually and provide your own "body handler" that does whatever you want with the input response; see BodyHandlers.
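
For example, a minimal sketch of the shape of that in Jython, using the stock BodyHandlers.ofInputStream() handler (the URL and payload here are placeholders - substitute whatever your endpoint actually expects):

from java.net import URI
from java.net.http import HttpRequest, HttpResponse
from java.io import BufferedReader, InputStreamReader

# Underlying java.net.http.HttpClient behind Ignition's wrapper.
client = system.net.httpClient().getJavaClient()

# Placeholder endpoint and body - adjust to your actual API.
payload = '{"prompt": "hello", "stream": true}'
request = HttpRequest.newBuilder() \
    .uri(URI.create("http://localhost:11434/api/generate")) \
    .header("Content-Type", "application/json") \
    .POST(HttpRequest.BodyPublishers.ofString(payload)) \
    .build()

# ofInputStream() returns as soon as the response headers arrive; the
# body can then be read incrementally while the server is still writing.
response = client.send(request, HttpResponse.BodyHandlers.ofInputStream())
reader = BufferedReader(InputStreamReader(response.body(), "UTF-8"))
line = reader.readLine()
while line is not None:
    print line  # each line is one chunk; handle it however you like
    line = reader.readLine()
reader.close()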

If you want a fuller script, this is a case where an LLM would actually probably do okay, with guardrails - ask your LLM du jour for a Jython script, using Python 2.7 syntax, preferring Java imports, that uses Java's standard library HttpClient (java.net.http) to consume a streaming response.

I don't know anything about Ollama's API or how it separates chunks of data - that will be another thing for you to figure out.

Good luck.


Thanks for the response.

The main underlying issue is that Java's HttpClient.send() is blocking: it waits until the full response is complete before returning control to Jython.

  • It buffers the response internally.

  • You get all the lines after the HTTP connection closes.

  • So you can iterate over resp.body() — but that’s post-stream, not while the stream is coming in.


What True Streaming Would Require

To get actual live streaming (i.e., each chunk arrives as soon as Ollama emits it), you must:

  1. Use sendAsync() instead of send(), and

  2. Provide a custom body subscriber that processes chunks as they come in.

For example (in Java):

client.sendAsync(request, HttpResponse.BodyHandlers.ofInputStream())
    .thenAccept(response -> {
        InputStream is = response.body();
        // read from InputStream continuously
    });

But…

⚠️ Ignition's Jython (2.7 + JVM sandbox) cannot easily consume async Java Futures or attach reactive callbacks.

So I need a Java thread that continuously reads the InputStream and prints chunks as they arrive (sketched below).
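
Roughly this shape, reusing the client and request from above (an untested sketch - system.util.invokeAsynchronous just moves the blocking reads off the calling thread):

from java.io import BufferedReader, InputStreamReader
from java.net.http import HttpResponse

def readChunks(response):
    # Runs on a background thread; prints each chunk as it arrives.
    reader = BufferedReader(InputStreamReader(response.body(), "UTF-8"))
    line = reader.readLine()
    while line is not None:
        print line
        line = reader.readLine()
    reader.close()

response = client.send(request, HttpResponse.BodyHandlers.ofInputStream())
system.util.invokeAsynchronous(readChunks, [response])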
Thanks!

I'm not going to debate further with your large language model.

Anything you can write in Java you can write in Jython. You can absolutely consume a streaming response in Jython by translating that exact same Java code you have.
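
For instance, the sendAsync() snippet above translates almost mechanically - a sketch only, assuming client and request are built as usual; a Jython class implementing java.util.function.Consumer stands in for the Java lambda:

from java.io import BufferedReader, InputStreamReader
from java.net.http import HttpResponse
from java.util.function import Consumer

# Jython stand-in for the Java lambda passed to thenAccept().
class PrintChunks(Consumer):
    def accept(self, response):
        reader = BufferedReader(InputStreamReader(response.body(), "UTF-8"))
        line = reader.readLine()
        while line is not None:
            print line  # arrives chunk by chunk, live
            line = reader.readLine()
        reader.close()

# The callback fires on one of the client's executor threads once the
# response headers arrive; the read loop then runs as the body streams in.
client.sendAsync(request, HttpResponse.BodyHandlers.ofInputStream()) \
    .thenAccept(PrintChunks())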


Thanks a lot for the pointer to the java.net.HttpClient documentation; it helped me connect to my server with a streaming response.

Thanks again for the help!
