here is a shell command you can run to test whether the value of MAYBE_NCNAME is an ncname or not; it returns exit status 0 if it is and 1 otherwise
printf '%s\n' '<transform xmlns="http://www.w3.org/1999/XSL/Transform" xmlns:exsldyn="http://exslt.org/dynamic" version="1.0"><param name="thing"/><template match="/"><choose><when test="/self::node()[translate(normalize-space($thing), &quot; /([,*&quot;, &quot;&quot;)=string($thing) and exsldyn:evaluate(concat(&quot;not(self::exsldyn:&quot;, $thing, &quot;)&quot;))]">ok</when><otherwise>ng</otherwise></choose></template></transform>' | xsltproc --stringparam thing "${MAYBE_NCNAME}" --html --novalid - /dev/null 2>/dev/null | grep -F -q -x 'ok'
@Lady Can't do it in sed?
@aschmitz you could do it in sed but i’m not sure the regex would be shorter and personally i would rather not worry that i may have a bug in my regular expression
@aschmitz (grep actually would probably be more appropriate but same difference)
@aschmitz (and if we want to be fully honest, unicode support in grep/sed is dicey so the answer might actually be no)
@Lady At least cross-platform, I guess.
@aschmitz i would like to think you could set LC_ALL=POSIX and write some extended regular expressions which just manually manage the UTF-8 bytes but (a) that sounds terrible and (b) i’m not actually sure that all implementations allow this in practice
i need anything i write to be cross-platform between at least macOS and debian, and unicode support and error handling are among those things that are liable to be subtly different between those platforms
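for illustration, here is a minimal sketch of the grep route discussed above, restricted to ascii so that it sidesteps the unicode question entirely. the function name is_ascii_ncname is made up for this example, and it only covers the ascii subset of ncname (a letter or underscore, followed by letters, digits, '.', '-', or '_'). it will wrongly reject any valid ncname that contains non-ascii characters, so treat it as an approximation and not as a replacement for the xsltproc command.

```shell
# hypothetical helper: ascii-only approximation of an ncname check.
# exit status 0 if $1 looks like an ascii ncname, nonzero otherwise.
is_ascii_ncname() {
  # grep works line by line, so reject embedded newlines up front;
  # otherwise a value like "a<newline>b" would match on each line.
  case $1 in
    *'
'*) return 1 ;;
  esac
  # LC_ALL=C makes the bracket ranges plain byte ranges on both
  # macOS and debian; -x requires the whole line to match.
  printf '%s\n' "$1" | LC_ALL=C grep -E -q -x '[A-Za-z_][A-Za-z0-9._-]*'
}
```

usage would look like `is_ascii_ncname "${MAYBE_NCNAME}" && echo ok`. the colon is deliberately left out of the pattern, since an ncname is a name with no colon in it.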