Tag/View json format - Could arrays be replaced with dictionaries for better efficiency and speed?

nminchin · February 15, 2022, 2:05am

Having dealt a lot with tag json in scripting, it occurred to me that it would be easier to deal with, especially for large numbers of tags, if the json were formatted differently, replacing arrays with dictionaries when housing tags and folders. What defines the tag structure is the names, not their positions in an array.

For example:

JSON Now:

{
  "name": "New Folder",
  "tagType": "Folder",
  "tags": [
    {
      "name": "Area 2",
      "tagType": "Folder",
      "tags": [
        {
          "name": "Sub Area 2",
          "tagType": "Folder",
          "tags": [
            {
              "valueSource": "memory",
              "name": "Tag 1",
              "value": 10,
              "tagType": "AtomicTag"
            }
          ]
        },
        {
          "name": "Sub Area 1",
          "tagType": "Folder",
          "tags": [
            {
              "valueSource": "memory",
              "name": "Tag 3",
              "value": 20,
              "tagType": "AtomicTag"
            },
            {
              "valueSource": "memory",
              "name": "Tag 1",
              "value": 10,
              "tagType": "AtomicTag"
            },
            {
              "valueSource": "memory",
              "name": "Tag 2",
              "value": 20,
              "tagType": "AtomicTag"
            }
          ]
        }
      ]
    },
    {
      "name": "Area 1",
      "tagType": "Folder",
      "tags": [
        {
          "name": "Sub Area 1",
          "tagType": "Folder",
          "tags": [
            {
              "valueSource": "memory",
              "name": "Tag 1",
              "value": 10,
              "tagType": "AtomicTag"
            },
            {
              "valueSource": "memory",
              "name": "Tag 2",
              "value": 20,
              "tagType": "AtomicTag"
            }
          ]
        },
        {
          "name": "Sub Area 2",
          "tagType": "Folder",
          "tags": [
            {
              "valueSource": "memory",
              "name": "Tag 1",
              "value": 10,
              "tagType": "AtomicTag"
            }
          ]
        }
      ]
    }
  ]
}

My new JSON:

{
    "name": "New Folder",
    "tagType": "Folder",
    "tags": {
        "Area 2": {
            "name": "Area 2",
            "tagType": "Folder",
            "tags": {
                "Sub Area 2": {
                    "name": "Sub Area 2",
                    "tagType": "Folder",
                    "tags": {
                        "Tag 1": {
                            "valueSource": "memory",
                            "name": "Tag 1",
                            "value": 10,
                            "tagType": "AtomicTag"
                        }
                    }
                },
                "Sub Area 1": {
                    "name": "Sub Area 1",
                    "tagType": "Folder",
                    "tags": {
                        "Tag 3": {
                            "valueSource": "memory",
                            "name": "Tag 3",
                            "value": 20,
                            "tagType": "AtomicTag"
                        },
                        "Tag 1": {
                            "valueSource": "memory",
                            "name": "Tag 1",
                            "value": 10,
                            "tagType": "AtomicTag"
                        },
                        "Tag 2": {
                            "valueSource": "memory",
                            "name": "Tag 2",
                            "value": 20,
                            "tagType": "AtomicTag"
                        }
                    }
                }
            }
        },
        "Area 1": {
            "name": "Area 1",
            "tagType": "Folder",
            "tags": {
                "Sub Area 1": {
                    "name": "Sub Area 1",
                    "tagType": "Folder",
                    "tags": {
                        "Tag 1": {
                            "valueSource": "memory",
                            "name": "Tag 1",
                            "value": 10,
                            "tagType": "AtomicTag"
                        },
                        "Tag 2": {
                            "valueSource": "memory",
                            "name": "Tag 2",
                            "value": 20,
                            "tagType": "AtomicTag"
                        }
                    }
                },
                "Sub Area 2": {
                    "name": "Sub Area 2",
                    "tagType": "Folder",
                    "tags": {
                        "Tag 1": {
                            "valueSource": "memory",
                            "name": "Tag 1",
                            "value": 10,
                            "tagType": "AtomicTag"
                        }
                    }
                }
            }
        }
    }
}

E.g. instead of snippet:

 "tags": [
    {
      "name": "Area 2",

it becomes:

 "tags": { <<<==
    "Area 2": { <<<===
      "name": "Area 2", # this could essentially be removed and the dictionary key be used for the name

To access the tag: New Folder/Area 2/Sub Area 2/Tag 1.tagType
Now:
json['tags'][0]['tags'][0]['tags'][0]['tagType']
Mine:
json['tags']['Area 2']['tags']['Sub Area 2']['tags']['Tag 1']['tagType']

Consider that right now you don’t know the indexes, within each folder’s tags array, of the folders/tags you’re looking for. So at every level you have to loop through each item in the tags array, read its name field and compare it to what you’re looking for, then keep looping inside of that until you eventually get to what you want.
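As a rough sketch of what that per-level looping looks like in a script (the helper name `find_by_path` and the trimmed `sample` are made up for illustration, not anything Ignition provides):

```python
# Trimmed copy of the array-based export above (illustrative data only).
sample = {
    "name": "New Folder",
    "tagType": "Folder",
    "tags": [
        {"name": "Area 2", "tagType": "Folder", "tags": [
            {"name": "Sub Area 2", "tagType": "Folder", "tags": [
                {"name": "Tag 1", "tagType": "AtomicTag",
                 "valueSource": "memory", "value": 10},
            ]},
        ]},
    ],
}


def find_by_path(root, path):
    """Walk a path such as 'Area 2/Sub Area 2/Tag 1' below root (hypothetical helper)."""
    node = root
    for part in path.split("/"):
        match = None
        # No index is known up front, so every level is a linear scan by name.
        for child in node.get("tags", []):
            if child.get("name") == part:
                match = child
                break
        if match is None:
            raise KeyError("nothing named %r under %r" % (part, node.get("name")))
        node = match
    return node


tag_type = find_by_path(sample, "Area 2/Sub Area 2/Tag 1")["tagType"]
```

Each path segment costs a scan of that folder's whole tags array, which is where the time goes on large exports.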

Using dictionaries would significantly reduce the complexity of code as well as increase the speed of execution, especially when you have 350k+ tags, or much larger systems with millions of tags.

The additional benefit would be that now tag json would be able to be compared using standard json compare tools. Currently this is impossible and custom compare scripts must be written so that like tag paths are compared and not just items of arrays whose order isn’t guaranteed to match (read: rarely matches). I actually convert these raw json arrays into dictionaries already whenever I need to do any comparisons but it’s far from efficient and speed is hampered.
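For reference, the conversion I mean is roughly the following. This is a minimal sketch, assuming sibling names are unique within a folder; `tags_to_dict` and the two small exports are illustrative names and data, not part of any Ignition API.

```python
def tags_to_dict(node):
    """Return a copy of node with every 'tags' array replaced by a dict keyed on name."""
    converted = dict((k, v) for k, v in node.items() if k != "tags")
    if "tags" in node:
        # Assumes names are unique within a folder; a duplicate would be overwritten.
        converted["tags"] = dict(
            (child["name"], tags_to_dict(child)) for child in node["tags"]
        )
    return converted


# Two exports of the same folder whose arrays come back in a different order.
export_a = {"name": "Sub Area 1", "tagType": "Folder", "tags": [
    {"name": "Tag 1", "tagType": "AtomicTag", "valueSource": "memory", "value": 10},
    {"name": "Tag 2", "tagType": "AtomicTag", "valueSource": "memory", "value": 20},
]}
export_b = {"name": "Sub Area 1", "tagType": "Folder", "tags": [
    {"name": "Tag 2", "tagType": "AtomicTag", "valueSource": "memory", "value": 20},
    {"name": "Tag 1", "tagType": "AtomicTag", "valueSource": "memory", "value": 10},
]}

arrays_equal = export_a == export_b                             # False: list order differs
dicts_equal = tags_to_dict(export_a) == tags_to_dict(export_b)  # True: keyed by name
```

The conversion itself has to visit every node once, which is the overhead I'd like to avoid on big tag sets.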

Really, this applies to Views as well, since their components all have names that could be used as the dictionary keys, which would provide the same benefits.

Am I overlooking something?

justin.brzozoski · February 15, 2022, 7:39pm

I agree with what you're trying to do, but it gets sticky when some parts of Ignition/Perspective expect the view to be "proper" JSON, which enforces more restrictions on keys than Python dicts do...

Among other things, they must start with a letter (no numbers or other symbols) and are not supposed to have spaces in them. Some bits of Ignition will start complaining if you break these rules.

Again, I really like your idea and wish IA could make it happen, because I hate traversing tag config trees and/or view trees in their JSON format right now. Having to walk and check each member of a list looking for a matching name is annoying. I'm just pointing out the reason I'm aware of that would explain why they did it the way they did.

And while it's far less pretty than your "new" style, this is currently possible:

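def lookup(array, name, key='name'):
    # First item whose key field equals name; raises StopIteration if none match
    return next(x for x in array if x[key] == name)

# sample is the array-based tag json from the first post
tagType = lookup(lookup(lookup(sample['tags'], 'Area 2')['tags'], 'Sub Area 2')['tags'], 'Tag 1')['tagType']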

Carl.Gould · February 16, 2022, 3:22pm

I think that there are two things here. One is that the json’s order and structure should be deterministic so that it can be reliably compared using json compare tools. The dictionary vs. array question doesn’t really come into play here - just deterministic ordering. This is something we should definitely do, and for reference we are tracking this internally as IGN-4488.
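Until something like that exists in the export itself, a script-side approximation is possible. The following is only a sketch of the general idea and says nothing about how IGN-4488 will be implemented; `normalize` and `canonical_json` are hypothetical helper names, and sorting siblings by name is an assumption about what "deterministic" should mean here.

```python
import json


def normalize(node):
    """Copy node, sorting every 'tags' array by name so sibling order is stable."""
    out = dict((k, v) for k, v in node.items() if k != "tags")
    if "tags" in node:
        out["tags"] = sorted(
            (normalize(child) for child in node["tags"]),
            key=lambda child: child["name"],
        )
    return out


def canonical_json(node):
    """Serialize with sorted keys so identical configs produce identical text."""
    return json.dumps(normalize(node), sort_keys=True, indent=2)


# Same content, different array order and different key order.
first = {"name": "New Folder", "tagType": "Folder", "tags": [
    {"name": "Area 2", "tagType": "Folder"},
    {"name": "Area 1", "tagType": "Folder"},
]}
second = {"tagType": "Folder", "name": "New Folder", "tags": [
    {"tagType": "Folder", "name": "Area 1"},
    {"tagType": "Folder", "name": "Area 2"},
]}

same_text = canonical_json(first) == canonical_json(second)  # True
```

Text produced this way can be fed to an ordinary json or text diff tool, as long as both sides contain the same set of tags.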

Now, as for using arrays vs dictionaries, I understand that it is a bit inconvenient but I actually think it’s better to use arrays, at least for us. As @justin.brzozoski points out, by doing it this way, we aren’t constricted by the json spec as to what a valid name is. Furthermore, changing it would be a backwards-compatibility nightmare.

Maybe what we should be talking about is adding built-in support for something like JsonPath in order to more easily work with document structures without having to actually traverse them…

justin.brzozoski · February 16, 2022, 4:39pm

I've used something similar while scripting AWS CLI commands. A quick search suggests that AWS uses JMESPath and not JsonPath, but they look very similar at first glance.

That would probably help with some of the issues I bump into, depending on what functionality it provides. I took a quick peek at their API and like the option to have it return a list of matching paths instead of matching values. If you do move towards adding this sort of functionality to Ignition, please try to keep that available.
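To illustrate what I mean by getting paths back instead of values, here is a plain-Python sketch. It is not JsonPath or JMESPath syntax, and `matching_paths` is a made-up helper that takes an ordinary predicate function rather than a query expression.

```python
def matching_paths(node, predicate, prefix=""):
    """Return the tag paths of every node for which predicate(node) is true."""
    path = prefix + "/" + node["name"] if prefix else node["name"]
    found = []
    if predicate(node):
        found.append(path)
    # Recurse into children in array order, carrying the path built so far.
    for child in node.get("tags", []):
        found.extend(matching_paths(child, predicate, path))
    return found


# Small illustrative tree in the current array-based format.
sample = {"name": "New Folder", "tagType": "Folder", "tags": [
    {"name": "Area 1", "tagType": "Folder", "tags": [
        {"name": "Tag 1", "tagType": "AtomicTag", "valueSource": "memory", "value": 10},
        {"name": "Tag 2", "tagType": "AtomicTag", "valueSource": "memory", "value": 20},
    ]},
    {"name": "Area 2", "tagType": "Folder", "tags": [
        {"name": "Tag 3", "tagType": "AtomicTag", "valueSource": "memory", "value": 20},
    ]},
]}

paths = matching_paths(sample, lambda n: n.get("value") == 20)
# paths == ["New Folder/Area 1/Tag 2", "New Folder/Area 2/Tag 3"]
```

Having the paths means a follow-up step can read or modify those tags, rather than only seeing the values that matched.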

nminchin · February 16, 2022, 7:53pm

This would definitely help in situations where you're comparing two tag sets that have the same tags defined but with differences in config, but it still won't help to compare tag sets with additional/removed tags, as the array indices will then not match up between the same-named tags. I understand now, though, that it's the json naming constraints that make using dictionaries impossible in a way that suits everyone. So I guess the answer is: I did overlook this and forgot about json naming constraints.
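For the added/removed case, what I end up doing is comparing by tag path rather than by array position. A minimal sketch, with `flatten` and `diff_tags` as illustrative helper names and assuming tag paths are unique:

```python
def flatten(node, prefix=""):
    """Map each tag path to that node's own config (everything except 'tags')."""
    path = prefix + "/" + node["name"] if prefix else node["name"]
    flat = {path: dict((k, v) for k, v in node.items() if k != "tags")}
    for child in node.get("tags", []):
        flat.update(flatten(child, path))
    return flat


def diff_tags(old, new):
    """Return (added, removed, changed) lists of tag paths between two exports."""
    old_flat, new_flat = flatten(old), flatten(new)
    added = sorted(set(new_flat) - set(old_flat))
    removed = sorted(set(old_flat) - set(new_flat))
    changed = sorted(p for p in set(old_flat) & set(new_flat)
                     if old_flat[p] != new_flat[p])
    return added, removed, changed


old = {"name": "F", "tagType": "Folder", "tags": [
    {"name": "Tag 1", "tagType": "AtomicTag", "valueSource": "memory", "value": 10},
    {"name": "Tag 2", "tagType": "AtomicTag", "valueSource": "memory", "value": 20},
]}
new = {"name": "F", "tagType": "Folder", "tags": [
    {"name": "Tag 3", "tagType": "AtomicTag", "valueSource": "memory", "value": 30},
    {"name": "Tag 1", "tagType": "AtomicTag", "valueSource": "memory", "value": 11},
]}

added, removed, changed = diff_tags(old, new)
# added == ["F/Tag 3"], removed == ["F/Tag 2"], changed == ["F/Tag 1"]
```

Here Tag 1 sits at index 0 in one export and index 1 in the other, and it still lines up because the comparison key is the path.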