@Lady No fan of U+FFFC?
@aschmitz usually the best‐practice as i understand it when using noncharacters (which FFFE and FFFF are) is to first search for them in the string and replace any existing ones with FFFD
this would need to happen to make the XML valid anyway, so that seems acceptable to me; i agree that in the general case you probably shouldn’t assume valid input tho
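(a rough sketch of that replace‐first step in Python, for anyone following along. the name `sanitize_sentinels` and the choice of U+FFFE/U+FFFF as the program's private markers are assumptions for illustration, not anything from a real library)

```python
# Hypothetical helper: before the program starts using U+FFFE / U+FFFF as its
# own internal markers, replace any that already occur in the input with
# U+FFFD (REPLACEMENT CHARACTER), so later markers are unambiguous.
SENTINELS = "\uFFFE\uFFFF"   # assumed set of noncharacters used internally
REPLACEMENT = "\uFFFD"

# Translation table mapping each sentinel to the replacement character.
_SENTINEL_TABLE = str.maketrans({ch: REPLACEMENT for ch in SENTINELS})


def sanitize_sentinels(text: str) -> str:
    """Return `text` with pre-existing sentinel noncharacters replaced."""
    return text.translate(_SENTINEL_TABLE)
```

this only covers the two code points discussed here; Unicode defines other noncharacters too, and a program using a different one would need to add it to `SENTINELS`.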
@aschmitz the other best‐practice with noncharacters is to never store them in a place where anyone other than the program which understands their meaning will see them
having the noncharacters produce XML which isn’t valid provides a bit of a guarantee against that; a downstream recipient SHOULD error out if it receives a document where the noncharacter wasn’t handled/removed
@Lady Ideally! (In my world, most XML parsers are extremely far from validating, but a final check that things are valid as they depart is feasible enough, at least.)
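(A minimal sketch of that kind of departure check, assuming Python and XML 1.0. The function name `check_xml_chars` is made up for illustration. It only tests each character against the XML 1.0 `Char` production, which excludes U+FFFE and U+FFFF; it is not a full well-formedness or validity check.)

```python
import re

# Anything outside the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_NOT_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def check_xml_chars(text: str) -> None:
    """Raise ValueError if `text` contains a character XML 1.0 forbids."""
    match = _NOT_XML_CHAR.search(text)
    if match is not None:
        code_point = ord(match.group())
        raise ValueError(
            f"U+{code_point:04X} at index {match.start()} is not allowed in XML 1.0"
        )
```

Running this on the serialized output just before it leaves the program means a leftover internal marker fails loudly instead of reaching a lenient downstream parser.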
@aschmitz i am very disappointed in the state of XML parsers as well
@Lady Fair enough I suppose, though expecting that your input will always be valid feels like asking for a certain kind of trouble. But if you're the one writing it you're probably okay. (And yeah, though FFFE is theoretically allowed I'd avoid it for the reason you say unless you can guarantee it won't show up early.)