Try to parse changelog lines and format them on the web #36
When loading repository data from local files using the --repo-data-dir option, use the same file names that are used when the files are downloaded. These became outdated by 0da67c5.
Parse the lines of a changelog entry using the somewhat standardised method of writing changelog messages. The parsed messages will be displayed in a nicely formatted way. If the parser cannot make any sense out of it, it will fall back to displaying the changelog in a good, old <pre>-block.
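A minimal Python sketch of the parse-or-fall-back idea described above. It assumes, hypothetically, that a "somewhat standardised" entry is a series of items starting with `- `, optionally followed by indented continuation lines. The function name `parse_changelog_lines` and this exact line format are illustrative assumptions, not taken from the PR's code.

```python
from typing import List, Optional


def parse_changelog_lines(text: str) -> Optional[List[str]]:
    """Return the items of a changelog entry, or None when the entry does not
    follow the assumed "- item" convention (the caller then uses a <pre> block)."""
    items: List[str] = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue  # ignore blank lines between items
        if line.startswith("- "):
            items.append(line[2:].strip())
        elif line[0].isspace() and items:
            # indented line: treat it as a continuation of the previous item
            items[-1] += " " + line.strip()
        else:
            return None  # unknown shape: give up so the caller can fall back
    return items or None
```

A caller could render a list when the function returns items and keep the existing `<pre>` block when it returns `None`.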
@HenkKalkwater, I briefly tried to determine where the changelog is sourced from … and failed. Hence I just denote the usual sources of changelogs I can think of:
These three sources are consolidated by the … Luckily the …
P.S.: Another source for a changelog which some developers use elaborately is the …
So far, I'm parsing the changelogs from …:
<?xml version="1.0" encoding="UTF-8"?>
<otherdata xmlns="http://linux.duke.edu/metadata/other" packages="1869">
  <package pkgid="47853ce7329ddae40cd5bb61b2bd1e8a2c5350ff6e43bc13d623260bc02dfc15" name="AtomicParsley" arch="src">
    <version epoch="0" ver="0.9.6.20221229+obs1" rel="1.4.1.bso"/>
    <changelog author="Nephros <[email protected]> 0.9.0" date="1594728000">- Package for SailfishOS</changelog>
    <changelog author="Nephros <[email protected]> 0.9.0-2" date="1594728001">- rename binary so youtube-dl can find it</changelog>
    <!-- More entries follow … -->
  </package>
  <!-- More packages follow… -->
</otherdata>
I thought this was generated by the changelog from the …
The state right now is that I parse out the version number from the "author" attribute (who decided that it should be put in there?) with the following regex, which will work 99% of the time:
(?P<author>.*) *<(?P<email>.*)>[ -]*(?P<version>.*)
and copy the inner text of the …
When the parsing of this fails due to the changelog not adhering to the formatting guidelines, it simply drops them in a preformatted text block, like the current situation. I considered formatting them as below, but that may rearrange items in the changelog, which can make it harder to follow, which is why I did not implement it.
Affected part 1
Affected part 2
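The approach described in this comment could look roughly like the following Python sketch. The regex and the XML namespace are copied from the comment. The sample document, the function name `read_changelogs` and the returned dictionary layout are made up for illustration. The sample escapes `<` and `>` in the `author` attribute, as well-formed XML requires.

```python
import re
import xml.etree.ElementTree as ET

NS = {"o": "http://linux.duke.edu/metadata/other"}
# Regex quoted from the comment above; the "author" group keeps a trailing
# space, so it is stripped below.
AUTHOR_RE = re.compile(r"(?P<author>.*) *<(?P<email>.*)>[ -]*(?P<version>.*)")

# Made-up sample data in the same shape as the metadata shown above.
SAMPLE = """<otherdata xmlns="http://linux.duke.edu/metadata/other" packages="1">
  <package pkgid="abc" name="example" arch="src">
    <version epoch="0" ver="1.0" rel="1"/>
    <changelog author="Jane Doe &lt;jane@example.org&gt; 0.9.0-2" date="1594728001">- rename binary</changelog>
    <changelog author="no version here" date="1594728002">free text</changelog>
  </package>
</otherdata>"""


def read_changelogs(xml_text):
    """Yield one dict per <changelog> element; unparsable author fields are
    passed through unchanged with email and version set to None."""
    root = ET.fromstring(xml_text)
    for pkg in root.findall("o:package", NS):
        for entry in pkg.findall("o:changelog", NS):
            raw_author = entry.get("author", "")
            m = AUTHOR_RE.fullmatch(raw_author)
            yield {
                "package": pkg.get("name"),
                "date": int(entry.get("date", "0")),
                "author": m.group("author").strip() if m else raw_author,
                "email": m.group("email") if m else None,
                "version": m.group("version") if m else None,
                "text": entry.text or "",
            }


entries = list(read_changelogs(SAMPLE))
```

Entries whose `version` is `None` are the ones that would end up in the preformatted fallback.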
A
Yes, IMO this is the easiest way to obtain that data nicely pre-parsed into specific fields / variables.
IMO any further "parsing" is futile: The old thread on the …
Basis for this consideration: …
Relevant part
But I might have misunderstood your intention, which may not be to further analyse the "[<epoch>:]<version>[-<release>]" string, but only to dissect the author field. Then my reply is "Yes, I think your RegEx points in the right direction" (slightly enhanced to also accept horizontal tabs as whitespace characters, for better anchoring and to avoid duplicate namings): …
In cases this RegEx does not match (which focuses on anchoring of the email address within …
Notes: …
I do not care about parsing the semantics of the versioning; all I want is to split the author from the version number :). I order the changelog by the order of appearance (which happens to be in order of date ascending) and straight up display the version number to the end user.
Do not restrict it to word characters (\w)
For that I marked the start of the part of my prior message which is relevant for you with "Relevant part", now.
P.S.:
> all I want to split the author from the version number
… and keep the email address as part of the author field, or to extract it separately?
HTH & Cheers
Aha, I tried your regexes, but since I'm using Python it does not understand things like …
^(?P<contributor>.+?)[-\s]+(?P<versionstring>\S+)$
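As a quick check of how the Python-compatible pattern above behaves, here is a small sketch using the standard `re` module. The helper name `split_author_field` and the sample strings are hypothetical.

```python
import re

# Pattern quoted from the comment above: a lazy contributor part, a run of
# hyphens/whitespace as separator, then a final token without whitespace.
SPLIT_RE = re.compile(r"^(?P<contributor>.+?)[-\s]+(?P<versionstring>\S+)$")


def split_author_field(field):
    """Return (contributor, versionstring); versionstring is None if no match."""
    field = field.strip()
    m = SPLIT_RE.match(field)
    if m is None:
        return field, None
    return m.group("contributor"), m.group("versionstring")
```

One caveat of this pattern: a field without any version, such as a plain "Jane Doe", still matches and is split into "Jane" and "Doe", because the last whitespace-free token is always taken as the version string.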
I researched a basis to answer your question:
I will try to look at it tomorrow during a long train ride.
P.S.: I hate these trivial "shorthand character classes", because then more modern ones (…
Close, and I refrained from re-evaluating my suggestions, because they likely are "close but not really doing it", too. Constructing a RegEx, again: …
Yes, it works like that (pseudocode - last edited 2025-05-28): …
If you also want the email address separated, I think this is best done in a staged fashion, from the extracted …
Edit: IMO it is better to first strictly dissect …
HTH!
P.S.: …
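The staged idea mentioned above could be sketched in Python as follows: first split the field into contributor and version string, then try to pull an email address out of the contributor part. The stage-1 pattern is the one quoted earlier in this thread. The stage-2 pattern, the function name `dissect` and the result layout are assumptions made for illustration, not the pseudocode from this comment.

```python
import re

# Stage 1: pattern quoted earlier in this thread.
STAGE1 = re.compile(r"^(?P<contributor>.+?)[-\s]+(?P<versionstring>\S+)$")
# Stage 2: hypothetical pattern for a trailing "<email>" in the contributor part.
STAGE2 = re.compile(r"^(?P<name>.*?)\s*<(?P<email>[^<>]*)>$")


def dissect(field):
    """Split an author field into name, email and version in two stages."""
    field = field.strip()
    result = {"name": field, "email": None, "version": None}
    m1 = STAGE1.match(field)
    if m1 is None:
        return result  # nothing recognisable: keep the raw field as the name
    contributor = m1.group("contributor")
    result["name"] = contributor
    result["version"] = m1.group("versionstring")
    m2 = STAGE2.match(contributor)
    if m2:
        # only refine the name when an email address is actually present
        result["name"] = m2.group("name")
        result["email"] = m2.group("email")
    return result
```

Keeping the stages separate means a contributor without an email address still yields a usable name and version.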
@HenkKalkwater, does the pseudo-code below (copied from my two preceding messages ([1] & [2]) and transformed to JavaScript) fulfil your needs and work well, or is there anything left I can contribute to this PR?
This pseudocode requires the variable …
Hi @Olf0, currently I'm a little busy, which is why I haven't had the time to look at your suggestions, implement and test them. I'll try to look at it on one of the upcoming weekends.
That is absolutely fine, please take your time! It was just a kind reminder, in case it slipped off your radar; which it did not, as I understand now.