Extract substrings from string (parsing expression to get all tag references within)

I'm using system.tag.getConfiguration() to get the expression out of an expression tag. From this, I want to extract a list of all the tags referenced in the expression. For example, if the expression is

{[.]../FolderA/Tag1}||{[.]../Tag2}||{[.]../FolderB/FolderC/Tag3}

I would like to return

{"[.]../FolderA/Tag1","[.]../Tag2","[.]../FolderB/FolderC/Tag3"}

I was hoping to find a script function that would search a string using regex to find a substring match, but I'm not seeing anything comparable to the IndexOf expression function.

Any recommendations on how to do this?

Searching Google for "regex to find substrings enclosed in special characters", the first result is this:

Replace their characters with yours and you should be okay.

edit: meh. This regex is a bit too specific to the user's case. Should have read BEFORE replying.
I'll try to make it work

edit2:
maybe this ?
re.findall(r"\{[^}]*\}", s)

  • \{ matches the opening {
  • [^}]* matches everything until the closing }
  • \} matches the closing '}'

if you want to make sure there's a provider: re.findall(r"\{\[[^]]*\][^}]*\}", s)
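
To see how the two patterns differ, here is a small sketch. The first two tag references come from the question; `{noProvider/Tag4}` is a made-up reference without a provider, added only for illustration.

```python
import re

# Two references from the question, plus a hypothetical brace group
# without a [provider] prefix, added only to show the difference.
s = "{[.]../FolderA/Tag1}||{[.]../Tag2}||{noProvider/Tag4}"

# Any text enclosed in braces (braces included in each match)
any_braces = re.findall(r"\{[^}]*\}", s)

# Only brace groups whose content starts with a [provider] part
with_provider = re.findall(r"\{\[[^]]*\][^}]*\}", s)

print(any_braces)
print(with_provider)
```

The first call should return all three brace groups, while the second should skip the one that has no `[...]` prefix.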

edit3:
actually... something like this ?
re.findall(r"(?<={)[^}]*", s)

  • (?<={) selects things AFTER the opening {
  • [^}]* selects everything up to the closing }

This way the {} are not included in the results

I group the look-aheads and look-behinds. To me, anyway, the pattern is more readable.

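import re

# Lazily capture everything between a "{" (look-behind) and the next "}" (look-ahead)
regex = r"(?<={)(.*?)(?=})"

test_str = "{[.]../FolderA/Tag1}||{[.]../Tag2}||{[.]../FolderB/FolderC/Tag3}"

# findall returns only the captured group, so the braces are excluded
matches = re.findall(regex, test_str)

print matches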

Output:

['[.]../FolderA/Tag1', '[.]../Tag2', '[.]../FolderB/FolderC/Tag3']
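
If the same tag can appear more than once in an expression, the pattern above could be wrapped in a small helper that drops repeats but keeps first-seen order. This is only a sketch: `extract_tag_paths` and `TAG_REF` are hypothetical names, not Ignition functions, and the input is assumed to be the expression string you already pulled out of the tag configuration.

```python
import re

# Look-behind/look-ahead pattern from the post above: text between { and }
TAG_REF = re.compile(r"(?<={)(.*?)(?=})")

def extract_tag_paths(expression):
    """Return referenced paths in first-seen order, without duplicates.

    Hypothetical helper for illustration; not part of any Ignition API.
    """
    seen = set()
    paths = []
    for path in TAG_REF.findall(expression):
        if path not in seen:
            seen.add(path)
            paths.append(path)
    return paths

# Tag1 is referenced twice here on purpose
expr = "{[.]../FolderA/Tag1}||{[.]../Tag2}||{[.]../FolderA/Tag1}"
print(extract_tag_paths(expr))
```

An expression with no brace groups at all simply yields an empty list.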

Just use Java's String.split()

I find that it's easier if you keep the delimiters in the split result; that way you can skip any expression text that sits between tags.

I assume from your expected output that you were wanting a set, which would eliminate duplicate tag entries. A list could just as easily be used.

There is a java.util.StringTokenizer which will do this same thing; however, it is a legacy class whose use is discouraged in new code. The recommended method is now String.split()

from java.lang import String

test = "{[.]../FolderA/Tag1}||{[.]../Tag2}||{[.]../FolderB/FolderC/Tag3}"

# Split just before and just after every brace, so each brace becomes its own token
tokens = String(test).split(r"((?=\{|\})|(?<=\{|\}))")

tags = set()
tokenIter = iter(tokens)

for token in tokenIter:
	if token == "{":
		# start of a tag path: the next token is the path itself
		tags.add(next(tokenIter))

		# consume the closing brace
		next(tokenIter)

print tags

output:

>>> 
set([u'[.]../Tag2', u'[.]../FolderA/Tag1', u'[.]../FolderB/FolderC/Tag3'])
>>> 
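
The same keep-the-delimiters idea can also be sketched with Python's own `re` module instead of `java.lang.String`. This is an alternative illustration, not the code from the post above: `re.split` keeps the delimiters in its result when the pattern contains a capturing group, and the `inside` flag is just a name chosen for this sketch.

```python
import re

test = "{[.]../FolderA/Tag1}||{[.]../Tag2}||{[.]../FolderB/FolderC/Tag3}"

# A capturing group in re.split keeps the braces as separate tokens
tokens = re.split(r"([{}])", test)

tags = set()
inside = False  # True while we are between "{" and "}"
for token in tokens:
    if token == "{":
        inside = True
    elif token == "}":
        inside = False
    elif inside and token:
        # non-empty text between braces is a tag path
        tags.add(token)

print(sorted(tags))
```

Text outside the braces, such as the `||` operators, is ignored because `inside` is false when those tokens are seen.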