oh, it still doesn't work with pleroma and i don't understand why
well, at least my research led me to one interesting idea: if you have a personal blog, you can create redirects like domain.tld/:some_custom_emoji:/ -> domain.tld/blog/2020-12-04, and use the cute emojified links in your mastodon posts :p
i chose a different approach for @meowViewer: https://infosec.exchange/@leip4Ier/105277917679284776. it most likely is an example of over-engineering, but hey, my code will work in all cases, it won't break even for posts that contain <script> tags! (mastodon's code would) (which will never happen, because all <script>, <style>, etc tags are stripped by the sanitizer..)
i talked about other implementations here: https://infosec.exchange/@leip4Ier/105298240582645224. mastodon uses a state machine and only replaces emoji shortcodes in text (even if that text is inside a link!), it doesn't touch html attributes. friendica does the same thing as pleroma, misskey is too complicated for me to understand, and the language barrier doesn't help. i can only tell that it doesn't replace shortcodes inside the link text. here's how it displays the first post in this thread: https://misskey.io/notes/8fcvichcs2.
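to make the "only replace in text, never in attributes" idea concrete, here's a rough python sketch. it is not mastodon's actual code: `emojify_text_only`, `img_tag` and the sample post are made up for illustration, and it assumes the html is already sanitized and has no raw `>` inside attribute values.

```python
import re

SHORTCODE = re.compile(r":([a-zA-Z0-9_]+):")


def img_tag(match, emoji_urls):
    """Return an <img> for known shortcodes, leave unknown ones untouched."""
    name = match.group(1)
    if name not in emoji_urls:
        return match.group(0)
    return f'<img src="{emoji_urls[name]}" alt=":{name}:" class="emoji">'


def emojify_text_only(html, emoji_urls):
    """Two-state scanner: copy tags verbatim, replace shortcodes in text only."""
    out = []
    pos = 0
    while pos < len(html):
        if html[pos] == "<":
            # state: inside a tag, copy everything up to the closing '>'
            end = html.find(">", pos)
            if end == -1:
                end = len(html) - 1
            out.append(html[pos:end + 1])
            pos = end + 1
        else:
            # state: inside text, replace shortcodes up to the next '<'
            end = html.find("<", pos)
            if end == -1:
                end = len(html)
            text = html[pos:end]
            out.append(SHORTCODE.sub(lambda m: img_tag(m, emoji_urls), text))
            pos = end
    return "".join(out)


emoji_urls = {"blobcat": "https://domain.tld/emoji/blobcat.png"}
post = '<a href="https://domain.tld/:blobcat:/">:blobcat: blog</a> :nope:'
result = emojify_text_only(post, emoji_urls)
```

the shortcode in the href survives, the one in the link text becomes an image, and `:nope:` is left alone because it isn't in the emoji list.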
strip_tags replaces quotes with escape codes using this function: https://github.com/elixir-plug/plug/blob/be7f5d93e36a5f16870861751ae7909a308ac988/lib/plug/html.ex#L36-L39. still looks weird to me, but well, it works.
the explanation: pleroma-fe uses the list of emoji attached to a post, and just replaces every emoji shortcode in raw html with an img tag. you can see it here: https://git.pleroma.social/pleroma/pleroma-fe/-/blob/42c747a342cd7d435dcbe411276ac4999ff92395/src/services/entity_normalizer/entity_normalizer.service.js#L223-232. it works most of the time, because you rarely have an emoji shortcode both inside the url and in attached metadata. but still, links can get broken.
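a tiny python sketch of that failure mode, in case it helps. this is not pleroma-fe's code (that one is javascript), just the same idea under my own assumptions: `naive_emojify`, the post and the emoji list are all hypothetical.

```python
def naive_emojify(html, emojis):
    """Replace every :shortcode: in the raw html string, attributes included."""
    for emoji in emojis:
        img = f'<img src="{emoji["url"]}" alt=":{emoji["shortcode"]}:" class="emoji">'
        html = html.replace(f':{emoji["shortcode"]}:', img)
    return html


# hypothetical post: the same shortcode appears in the url and in the text
post = '<a href="https://domain.tld/:blobcat:/">my blog</a> :blobcat:'
emojis = [{"shortcode": "blobcat", "url": "https://domain.tld/emoji/blobcat.png"}]

broken = naive_emojify(post, emojis)
```

`broken` now has an img tag sitting in the middle of the href, so the link no longer points where it should.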
you may notice that it doesn't sanitize emoji shortcodes or image urls. it is safe, because that's done on the backend: https://git.pleroma.social/pleroma/pleroma/-/blob/1d04bd08944f281644d9f6a002ed4cea825a95d8/lib/pleroma/web/mastodon_api/views/status_view.ex#L518-523.
function overloading looks weird to me
(i'm just glad i didn't spend my time researching custom emoji replacement for nothing x.x)
@leip4Ier Hey, I remember that when it recently turned out that Apple is spying on what apps users launch, you pointed out that OCSP basically does the same thing. I just learned about CRLite, Mozilla's initiative to fix this by building a Bloom filter of revoked certificates and shipping that to Firefox users: https://blog.mozilla.org/security/2020/01/09/crlite-part-2-end-to-end-design/ Pretty cool, eh?
i thought that with most tags not being nestable, you only need to do a bunch of simple regexes (`\[tag\](.*)\[\/tag\]`) and then sanitize the attributes that go into html. but, the latter isn't that obvious, at least not for friendica devs: they sanitized some attributes, but not the others. and it seems like an easy mistake to make, so in the end you're better off running a sanitizer on the resulting html code anyway. at which point, why not use html in the first place?
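here's roughly what i mean, as a python sketch. it's a made-up converter for a single `[url=...]` tag, not friendica's code; `bbcode_naive`, `bbcode_escaped` and the payloads are my own illustration.

```python
import html
import re

URL_TAG = re.compile(r"\[url=(.*?)\](.*?)\[/url\]", re.S)


def bbcode_naive(text):
    """Drop the captured values straight into the html, no escaping at all."""
    return URL_TAG.sub(r'<a href="\1">\2</a>', text)


def bbcode_escaped(text):
    """Same regex, but escape both the attribute value and the link text."""
    def repl(m):
        href = html.escape(m.group(1), quote=True)
        label = html.escape(m.group(2), quote=True)
        return f'<a href="{href}">{label}</a>'
    return URL_TAG.sub(repl, text)


payload = '[url=" onmouseover="alert(1)]hi[/url]'
naive_out = bbcode_naive(payload)      # attribute injection goes through
escaped_out = bbcode_escaped(payload)  # quotes become &quot;, no new attribute

# escaping alone still lets a javascript: url through
js_out = bbcode_escaped("[url=javascript:alert(1)]x[/url]")
```

the escaped version stops the quote breakout, but `js_out` is still a clickable `javascript:` link, which is exactly the kind of thing that's easy to miss without a real sanitizer on the output.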
Now back to #BBCode. It being seemingly simple means that most implementations don’t bother with HTML sanitization. Instead, the expectation is that you run a bunch of regexps to produce HTML code and it will just be fine. Except that usually it’s not: https://jeffchannell.com/Other/bbcode-xss-howto.html