Formally define the goals of Codex #5

emilyyyylime · 2024-11-17T21:02:46Z

I think we should have some agreed-upon document (even just as part of the README) directly stating which characters are or are not in scope for inclusion in this project, and how simple a given character should be to access.

This would also include some guiding principles for assigning names to characters, such as when abbreviations are okay and when they aren't, when a character can be accessible through multiple distinct names (and whether one of them should be considered "canonical"), and whether we strive to describe the usage of characters, their origin/formal meaning, or their visual appearance (or rather, when we do which).

In my opinion this will greatly help prioritise new additions to the repo and help reviewers decide what changes to approve.

(side note: do we have a preference for how to stylise the name; e.g. codex, Codex, or CodeX?)

dccsillag · 2024-11-17T21:03:46Z

Strong agree. Maybe we can pull some points from the recent Discord conversations on this?

(Re. name styling: all sound good to me.)

MDLC01 · 2024-11-17T21:38:54Z

Regarding the scope, I would say any Unicode character that is not part of a natural writing system may be considered for inclusion. But this may be too broad, and it is also immediately contradicted by most Greek and Hebrew letters being already included (and rightfully so).

> (Re. name styling: all sound good to me.)

Please not CoDeX! I would say Codex (or CODEX in a capitalized context) is fine.

emilyyyylime · 2024-11-17T22:13:57Z

Are there any specific Unicode (or other) categorisations we could use to help refine this definition? I think it should firstly focus on the needs of Typst users (⇒ symbols used in academic writing and other typesetting settings that are often not easily accessible in the keyboard layouts of users wishing to type them), and expand from there with more specific examples.

mkorje · 2024-11-18T14:28:38Z

I'm of the opinion that Codex's scope should be fairly broad: to assign names to most Unicode characters. Whilst I agree that we should focus on the needs of Typst users first, I wouldn't want this to discourage adding names for symbols that fall outside of this. For example, I think #3 is a fine addition. Die face symbols aren't a part of any natural writing system (afaik...), and the name is clear-cut.

Regardless, I think any scope we set is destined to end up being too broad/restrictive. So I share @MDLC01's view that anything not part of a natural writing system may be considered, and things that are part of a natural writing system may be allowed if their inclusion makes sense. I'd suggest that the criterion for this inclusion is something along the lines of substantial usage and poor accessibility in users' keyboard layouts. (And this would rightly justify the inclusion of most Greek and Hebrew characters, I believe.)

With regards to naming guidelines, I think writing out the existing implicit "rules" would be a great start. For example, .rev and .not being the standard modifiers for the reversed variant of a character and the variant of a character with a forward slash through it.

MDLC01 · 2024-11-18T16:38:57Z

> With regards to naming guidelines, I think writing out the existing implicit "rules" would be a great start. For example, .rev and .not being the standard modifiers for the reversed variant of a character and the variant of a character with a forward slash through it.

I don't have any issue with something that would present itself as a sort of cheat-sheet, or as a general guideline. However, laying out precise normative rules is not a good idea, because most implicit rules have legitimate counter-examples.

emilyyyylime · 2024-11-19T15:06:11Z

+1 to @MDLC01

It seems that there's consensus that any non-deprecated Unicode character is theoretically in scope for Codex. Is coverage of all Unicode characters (that fit our criteria) a goal? That is, should we aim to map every character that could be useful to a convenient name?

One other question to consider that came to my mind is: should there be more namespaces than sym and emoji? I remember @laurmaedje has shown interest in keeping mathematical symbols in their own module (though of course that would be a very major breaking change and impractical to simply implement). A few new namespaces have already been suggested in The Symbols Document, but they've for the most part been contained under sym (e.g. see #2).

MDLC01 · 2024-11-19T15:18:22Z

> Is coverage of all Unicode characters (that fit our criteria) a goal?

I wouldn't say so. Some characters don't make sense to include. For example, control characters, or characters that are meant to be used as part of greater clusters.
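As a rough illustration of these exclusions, one could filter on Unicode general categories. This is only a sketch: the category set and the `is_candidate` name are assumptions made up for illustration, not anything Codex implements or has agreed on, and passing the filter would be necessary rather than sufficient for inclusion.

```python
import unicodedata

# Assumption: general categories serve as a rough proxy for "does not make
# sense to include". Cc/Cf are control and format characters, Cs/Co/Cn are
# surrogates, private-use and unassigned code points, and Mn/Mc/Me are
# combining marks, which are meant to be used as part of larger clusters.
EXCLUDED_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn", "Mn", "Mc", "Me"}


def is_candidate(ch: str) -> bool:
    """Return True if `ch` is not ruled out by its Unicode general category."""
    return unicodedata.category(ch) not in EXCLUDED_CATEGORIES


if __name__ == "__main__":
    for ch in ["\u2680", "\u2208", "\u0007", "\u0301", "\u200d"]:
        print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_candidate(ch))
```

Whether a character that passes this filter "could be useful" would still be a judgment call for reviewers.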

> should we aim to map every character that could be useful to a convenient name?

I may agree with this more. Specifically, the "that could be useful" part is important.

emilyyyylime · 2024-11-19T16:09:04Z

Alright, "could be useful" is part of our criteria for including characters then.

What I was trying to get at with "full-coverage" is: should we keep looking for new characters that fit our criteria throughout the Unicode planes, and then call the project "complete" until a new version of Unicode is released? Or perhaps we could take more of an 'on-demand' approach to adding new characters, only adding characters when the need for them presents itself (which could still include someone finding a character and figuring we should include it).

MDLC01 · 2024-11-19T16:41:11Z

I would say a long-term goal is to eventually consider every character. Short term, it of course makes sense to start with blocks that contain more useful characters, and to take people's suggestions into account. Also, deciding not to add a character at some point should not prevent us from adding it later based on demand.

If I understand correctly, this means: short term, take an "on-demand" approach; long term, take the "full coverage" approach.

emilyyyylime · 2024-11-19T19:13:09Z

Alright. Would anyone like to begin work on writing down the goals? Possibly we could draft it in a new file in the Proposals document and, once everyone is happy with it, create a PR to add it to the README or a specific guidelines.md file.

MDLC01 · 2024-11-19T20:12:16Z

This should probably be in a separate document. I can create a new document in the Codex team on the webapp (formerly Symbols team, I just renamed it) if you want.

MDLC01 · 2024-11-19T20:16:16Z

Alright, I created a document anyone can write to: https://typst.app/project/wfsdJgobtek11i1cZVXqIe. We should also be able to use the webapp's comment feature.

mkorje · 2024-11-20T02:28:27Z

> > With regards to naming guidelines, I think writing out the existing implicit "rules" would be a great start. For example, .rev and .not being the standard modifiers for the reversed variant of a character and the variant of a character with a forward slash through it.

> I don't have any issue with something that would present itself as a sort of cheat-sheet, or as a general guideline. However, laying out precise normative rules is not a good idea, because most implicit rules have legitimate counter-examples.

Fair point. I agree, then, that a general guideline would be the way to go (where we make clear that these are not normative).

MDLC01 · 2024-11-20T14:03:27Z

There is also another question, which is whether Codex should be considered independent from Typst, or if Typst is the only use case we should have in mind.

For example, we do not define names for mathematical calligraphic letters, because they are already accessible in Typst using other means. This is not compatible with the idea that Codex should be an independent library.

If Codex wants to be independent, some functions that are currently implemented in the Typst codebase should be moved here. Otherwise, we may prevent other use cases.

The fact that it is maintained separately from Typst, and has a unique name, makes me lean toward Codex being usable outside of Typst. But in the end this is probably the Typst team's decision.

emilyyyylime · 2024-11-20T17:33:40Z

Yeah, I'd love to hear input from one of them

laurmaedje · 2024-12-08T16:52:28Z

> If Codex wants to be independent, some functions that are currently implemented in the Typst codebase should be moved here. Otherwise, we may prevent other use cases.

How would that look for calligraphic etc.?

dccsillag · 2024-12-08T17:37:45Z

I strongly think that Codex should not be independent from Typst. It's a cute idea, but it would probably cause a lot of problems down the road. By keeping it 'tied' to Typst, we have clear end-users in mind, which should help us make decisions, especially the more subjective ones.

That said, I think we should strive to make it easily usable from outside the Typst codebase. But I believe that having a clear notion of our end-users is essential.

P.S.: also, calligraphic letters are another font, right? Feels like there are things that a calligraphic font enables that just the Unicode symbols don't, but I'm not sure.

MDLC01 · 2024-12-09T06:54:23Z

> P.S.: also, calligraphic letters are another font, right? Feels like there are things that a calligraphic font enables that just the Unicode symbols don't, but I'm not sure.

For use in maths, Unicode defines a set of calligraphic counterparts of Latin letters: https://unicode.org/charts/PDF/U1D400.pdf.

MDLC01 · 2024-12-09T20:44:06Z

> How would that look for calligraphic etc.?

I'm not sure. I also realize that I was probably too affirmative in my message. It seems reasonable to consider moving some functions here, but I no longer believe we should do it.

laurmaedje · 2024-12-09T22:21:52Z

Okay. I think I'm in agreement with @dccsillag that the main focus should be on Typst, at least for now. We can always expand things here later, especially should there be interest in usage outside of Typst.

emilyyyylime added the "meta" label (Discussion about the structure of this repo) Nov 17, 2024

emilyyyylime mentioned this issue Nov 17, 2024

Add die face symbols (Proposal 2) #3

Merged

MDLC01 mentioned this issue Dec 16, 2024

Add math styling module #20

Open
