Description
Is your feature request related to a problem? Please describe.
Current Definition of "Name" Token
Only ASCII letters (a-z, A-Z), ASCII digits, '_' and '$' are allowed, and a name may not start with a digit.
token Name =
    ( 'a'..'z' | 'A'..'Z' | '_' | '$' )
    ( 'a'..'z' | 'A'..'Z' | '_' | '0'..'9' | '$' )*;
Limitation for UML-Languages
The Name token is used in nearly all MontiCore languages. Its restrictions make it harder for users to describe their problem domain in their own language.
e.g. CDs:
class Käse { // "ä" not allowed
bool flüßig; // "ü" and "ß"
}
or ODs:
object Époisses: Käse { // "É"
flüßig = false;
}
and so on.
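To make the restriction concrete, the current token rule can be mirrored as a Java regular expression. This is only a sketch: `AsciiNameCheck` and `matchesCurrentName` are hypothetical names, not part of MontiCore, and the regex is a hand translation of the token definition quoted above. The non-ASCII letters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of MontiCore:
// a regex that mirrors the current Name token.
class AsciiNameCheck {
    // First character: ASCII letter, '_' or '$'.
    // Following characters: the same set plus ASCII digits.
    private static final Pattern CURRENT_NAME =
        Pattern.compile("[a-zA-Z_$][a-zA-Z_0-9$]*");

    static boolean matchesCurrentName(String s) {
        return CURRENT_NAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "Cheese",
            "K\u00e4se",          // Käse
            "fl\u00fc\u00dfig",   // flüßig
            "\u00c9poisses"       // Époisses
        };
        for (String s : candidates) {
            System.out.println(s + " -> " + matchesCurrentName(s));
        }
    }
}
```

Of the four candidates, only `Cheese` matches; the names from the CD and OD examples are rejected because of their non-ASCII letters.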
Limitation for General Languages
Other languages define names much more broadly. A MontiCore grammar for such a language is either more restrictive than the language itself and cannot parse all valid instances, or it redefines the Name token, which makes it hard to combine with other MontiCore languages.
Java:
https://docs.oracle.com/javase/specs/jls/se23/html/jls-3.html#jls-3.8
Letters and digits may be drawn from the entire Unicode character set, which supports most writing scripts in use in the world today, including the large sets for Chinese, Japanese, and Korean. This allows programmers to use identifiers in their programs that are written in their native languages.
αρετη is explicitly listed in the Java specification as an example of a valid identifier.
XML:
XML also allows Unicode characters in names. As a consequence, the MontiCore XML language overrides the Name token:
https://github.com/MontiCore/xml/blob/ed432849540eab55c952aabfa748b923c541b55c/src/main/grammars/de/monticore/lang/XMLBasis.mc4#L22-L47
Describe the solution you'd like?
Allow Unicode characters in the Name token in MCBasis.mc4. This lets developers write models closer to their native language and ensures that general-purpose languages such as Java and XML can be parsed without overriding the Name token.
The Unicode identifier standard (UAX #31) can serve as a language-independent basis: https://www.unicode.org/reports/tr31/
The Java runtime library also offers checks based on this standard, for example Character.isUnicodeIdentifierStart(int): https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/lang/Character.html#isUnicodeIdentifierStart(int)
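As a rough illustration of what a Unicode-aware name check could look like, here is a minimal Java sketch built on `Character.isUnicodeIdentifierStart` and `Character.isUnicodeIdentifierPart`. The class `UnicodeNameCheck` and its methods are hypothetical and not part of MontiCore. One assumption is made explicit: `'_'` and `'$'` are allowed in addition, because the Unicode start check rejects both and the part check rejects `'$'`, so without them names valid under the current token would stop being valid.

```java
// Hypothetical helper, not part of MontiCore: a possible Unicode-based name check.
class UnicodeNameCheck {
    // Assumption: keep '_' and '$' as legal first characters for
    // compatibility with the current Name token.
    static boolean isNameStart(int codePoint) {
        return codePoint == '_' || codePoint == '$'
            || Character.isUnicodeIdentifierStart(codePoint);
    }

    // Assumption: keep '$' as a legal following character;
    // '_' is already accepted by isUnicodeIdentifierPart.
    static boolean isNamePart(int codePoint) {
        return codePoint == '$'
            || Character.isUnicodeIdentifierPart(codePoint);
    }

    static boolean isUnicodeName(String s) {
        if (s.isEmpty()) {
            return false;
        }
        // Iterate by code point so characters outside the BMP are handled.
        int first = s.codePointAt(0);
        if (!isNameStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isNamePart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }
}
```

With this check, names such as Käse, flüßig, Époisses and αρετη are accepted, while a name starting with a digit or containing a space is still rejected. How such a rule would be expressed in the MCBasis.mc4 token definition itself is left open here.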
Describe alternatives you've considered
No response
Additional context
No response