Yeah, you can try bumping the connection size up from 500 bytes to 4000, but it's still not going to get to 0 with that many tags. Also, do you need them all at 3500ms? Are the tags structured in UDTs or arrays? Is this using PlantPAx?
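To put rough numbers on why a bigger connection size and slower scan rates both help, here's a back-of-the-envelope sketch. It's a toy model, not how any driver actually packs requests: the tag count and the per-tag byte cost are made-up values, and `requests_per_second` is just a helper name I picked for illustration.

```python
import math

def requests_per_second(tag_count, bytes_per_tag, connection_size, scan_rate_ms):
    """Rough estimate of request packets per second for one scan class.

    Toy model: assumes every tag costs the same number of bytes and that
    requests are packed full. Real drivers and controllers differ.
    """
    tags_per_request = max(1, connection_size // bytes_per_tag)
    requests_per_scan = math.ceil(tag_count / tags_per_request)
    return requests_per_scan * 1000 / scan_rate_ms

# Hypothetical numbers, for illustration only.
TAGS = 20000
BYTES_PER_TAG = 20  # assumed average cost of one tag in a request

small_conn = requests_per_second(TAGS, BYTES_PER_TAG, 500, 3500)
large_conn = requests_per_second(TAGS, BYTES_PER_TAG, 4000, 3500)
slow_scan = requests_per_second(TAGS, BYTES_PER_TAG, 4000, 30000)

print(f"500 B @ 3500 ms:   {small_conn:.1f} req/s")
print(f"4000 B @ 3500 ms:  {large_conn:.1f} req/s")
print(f"4000 B @ 30000 ms: {slow_scan:.1f} req/s")
```

Even in this simplified model the bigger connection size cuts the request count a lot but never to nothing, and moving tags that don't need fast updates to a slower rate cuts it further.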
As an example, here's my most recent project (missing from the screenshot is my 30000ms scan rate for string data). It isn't using Phil's driver, but due to the structure of the tags, etc., it performs well:
Although the Ethernet comms on my processor are pushed to their limit:

