« Ontologies processed in 265036 ms by HermiT » 😬

« Ontologies which previously required minutes or hours to classify can often by classified in seconds by HermiT, and HermiT is the first reasoner able to classify a number of ontologies which had previously proven too complex for any available system to handle. » well i guess we’re quickly trucking into that second category huh

this ontology ISN’T complex it’s just BIG

one of the big bottlenecks apparently is the fact that i have multiple classes with

“rdf:value exactly 1
rdf:value only xsd:string”

i have no idea why THIS is the thing making everything slow, but when i remove it the speed increases are Quite Substantial

anyway i don’t really care; i’m not expecting anyone to actually use this Full Ontology for anything, it’s supposed to be subsetted, but i do need to like, do a basic sanity check to make sure it isn’t inconsistent, at least

anyway we’re ALMOST done i just need to add AWOL

(i was saying “we’re ALMOST done i just need to add Web Annotations” up until yesterday, and then i realized i hated how Web Annotations do TextualBody’s so now we need AWOL as well)

programmers!! stop defining Json‐L·D syntaxes which encode text as plain strings with separate properties for language, direction, and media type!! Json‐L·D explicitly supports tagging direction AND language directly on the string itself!! and rdf:XMLLiteral exists for data which is serializable to X·M·L!

YES it is more processing work to explicitly handle all those possibilities but it is BETTER i tell you!!

(actually it’s probably best to keep direction as a separate property unless and until they update all of the downstream specifications to support R·D·F 1.2, since 1.1 only supports language, but shhhhh)
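To make the contrast concrete, here is a minimal sketch of the two encodings, written as Python dicts that serialize to JSON‐L·D. The `ex:` vocabulary and the property names `ex:body`, `ex:text`, `ex:language`, and `ex:direction` are made up for illustration; only `@value`, `@language`, and `@direction` are actual JSON‐L·D 1.1 keywords.

```python
import json

# Hypothetical vocabulary, for illustration only.
CONTEXT = {"@version": 1.1, "ex": "http://example.org/vocab#"}

# The style being objected to: the text is a plain string and its
# language and direction live in separate sibling properties.
separate = {
    "@context": CONTEXT,
    "ex:body": {
        "ex:text": "مرحبا",
        "ex:language": "ar",
        "ex:direction": "rtl",
    },
}

# A JSON-LD 1.1 value object: language and direction are attached
# directly to the string itself.
tagged = {
    "@context": CONTEXT,
    "ex:body": {"@value": "مرحبا", "@language": "ar", "@direction": "rtl"},
}

print(json.dumps(tagged, ensure_ascii=False, indent=2))
```

In the second form a consumer that understands value objects gets the language and direction without needing to know any vocabulary-specific property names.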

Json‐L·D supports two different R·D·F 1.1 equivalences for strings with both language and direction, but unfortunately both are basically unusable in Owl (as one is a custom datatype, which cannot be reasoned about, and the other is a blank node, which cannot be the value of a data property)
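For reference, a rough sketch of what those two equivalences look like, assuming they are the `i18n-datatype` and `compound-literal` values of the JSON-LD 1.1 `rdfDirection` processing option. The helper names and the representation of triples as plain Python tuples are assumptions made for this example, not part of any library.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
I18N = "https://www.w3.org/ns/i18n#"


def i18n_datatype_literal(text, language, direction):
    """Custom-datatype form: one literal whose datatype IRI encodes
    the lowercased language tag, an underscore, and the direction."""
    datatype = f"{I18N}{language.lower()}_{direction}"
    return (text, datatype)


def compound_literal(text, language, direction, bnode="_:b0"):
    """Blank-node form: the string, its language, and its direction
    become three triples hanging off one blank node."""
    return [
        (bnode, RDF + "value", text),
        (bnode, RDF + "language", language.lower()),
        (bnode, RDF + "direction", direction),
    ]


print(i18n_datatype_literal("مرحبا", "ar-EG", "rtl"))
for triple in compound_literal("مرحبا", "ar-EG", "rtl"):
    print(triple)
```

The first form puts everything in one literal with a datatype that varies per language and direction; the second replaces the literal with a node, which is why it cannot sit where a data property value is expected.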

@Lady Is the reason to not just add U+200E/F to strings so you don't modify them while encoding them? (This seems lightly silly if they are *strings*, but not if they are actually defined as byte or codepoint sequences, I suppose.)

@aschmitz “string” in this context means <w3.org/TR/2012/REC-xmlschema11>, « the set of finite-length sequences of zero or more characters (as defined in [XML]) that match the Char production from [XML]. »

as far as your actual question, i’ll point you to <w3.org/TR/string-meta/> which discusses strengths and drawbacks of various approaches in depth

@aschmitz (as an aside, this definition of string means R·D·F strings cannot contain U+0000, U+FFFE, or U+FFFF, despite those being perfectly usable Unicode characters in other contexts)
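A small sketch of that restriction, assuming the XML 1.0 form of the Char production (`#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]`). The function names are invented for this example.

```python
def is_xml_char(ch):
    """True if a single code point matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xsd_string(s):
    """True if every character of s is allowed in an xsd:string
    under the assumption above."""
    return all(is_xml_char(ch) for ch in s)


for sample in ("plain text", "nul\u0000byte", "\ufffe", "\uffff", "\U0001F431"):
    print(repr(sample), is_xsd_string(sample))
```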

@Lady That's fair enough I suppose, though the list of issues with just whacking in a RLM/LRM marker feels like it's pretty much paralleled by doing it in metadata. (With the exception of needing to detect an existing mark, which is a reasonable complaint. That said, it's not clear to me whether the additional work to do so is greater or lesser than that required to handle separate metadata, and as you - and the W3C - point out, doing so isn't always an option provided in any given spec.)

@aschmitz i think the main advantage to doing it in metadata is it’s easier to bestpractice it, in the same way that language tags are bestpractice, whereas doing it with unicode characters leads to situations where a lot of people don’t bother because it works for them in their languages on their machines

@Lady Yeah, that's a reasonably compelling argument. (I'm also not a particularly huge fan of the W3C's first approach of first-strong property detection because it appears to be more or less the same thing you point out is bad, not to mention that it requires a bunch of extra processing for each viewer even if it actually works consistently for everyone.)

@aschmitz (also, for things like books, you can’t identify the directionality of a book from the directionality of its first characters. but you also probably shouldn’t be conveying books as plain strings.)
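For anyone following along, first-strong detection can be sketched roughly like this. It is a simplification that only looks at the Unicode bidirectional class of each character and ignores embeddings, isolates, and everything else in the full bidi algorithm; the function name is made up for this example.

```python
import unicodedata


def first_strong_direction(text):
    """Guess base direction from the first strongly directional character.

    Returns "ltr", "rtl", or None if the text has no strong characters.
    """
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            return "ltr"
        if bidi in ("R", "AL"):
            return "rtl"
    return None


print(first_strong_direction("hello"))
print(first_strong_direction("123 مرحبا"))
# An Arabic sentence that happens to open with a Latin-script word
# is guessed as left-to-right, which is the failure mode discussed above.
print(first_strong_direction("HTML هي لغة ترميز"))
```

Every consumer has to run a scan like this on every string, and the guess is only as good as the first strong character, which is the weakness both posts point at.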
