can anyone tell me what the difference between carriage return (U+000D), line feed (U+000A), and next line (U+0085) is

@wallhackio sure, the difference is that none of them are line separator (U+2028)

@Lady you revealed the existence of this additional, seemingly redundant code point to me and therefore i will blame you for its existence.

@wallhackio they all come from a confusion between control characters, which control the display of terminals, and “meaningful” characters, which are intended for transmission and display

ASCII confused these quite heavily and said things like “computers can support O, then <backspace>, then " to get Ö”; it was really a standard of codepoints for text display on terminals and only ancillarily a standard of text encoding for storage; this of course meant that when computers needed to decide how to represent linebreaks in plain text, they differed on how to do it

some picked CR, some picked LF, and some picked CR+LF

well then ASCII was extended into 8-bit space and folks decided that having these control characters with confusing and overloaded meanings was a problem, so they introduced a new character, NEL, which almost nobody actually adopted. some weirdos decided to do CR+NEL just to be spicy

then Unicode came along. Unicode decided that mandating the meaning of control characters was a pointless pursuit so they refused to specify behaviours for them. but of course it would be silly to not have a linebreak character in Unicode, so they added LSEP. they also added Paragraph Separator at U+2029 as a substitute for form feed i guess?
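
for a rough sense of how this plays out in practice, here is a minimal Python sketch. it assumes Python 3, where `str.splitlines()` treats all of the characters mentioned in this thread as line boundaries while `bytes.splitlines()` only knows about CR, LF, and CR+LF. the names `SEPARATORS`, `split_counts`, and `normalize` are made up for illustration.

```python
# Line-break conventions discussed above, keyed by their usual abbreviations.
SEPARATORS = {
    "CR": "\r",         # U+000D carriage return
    "LF": "\n",         # U+000A line feed
    "CR+LF": "\r\n",    # the two-character sequence
    "NEL": "\x85",      # U+0085 next line
    "LSEP": "\u2028",   # U+2028 line separator
    "PSEP": "\u2029",   # U+2029 paragraph separator
}


def split_counts():
    """Return how many lines str.splitlines() sees in 'a<sep>b' for each separator."""
    return {name: len(f"a{sep}b".splitlines()) for name, sep in SEPARATORS.items()}


def normalize(text):
    """Rewrite every recognised line boundary as LF.

    Note: this drops a trailing line break, and it also splits on the other
    characters str.splitlines() recognises (VT, FF, FS, GS, RS).
    """
    return "\n".join(text.splitlines())


if __name__ == "__main__":
    print(split_counts())
    print(repr(normalize("a\r\nb\x85c\u2028d")))
    # bytes.splitlines() only splits on CR, LF, and CR+LF,
    # so an encoded NEL stays inside a single line.
    print(b"a\xc2\x85b".splitlines())
```

other languages and libraries draw the line in different places, so treat the set of characters here as Python's choice rather than a universal one.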
