directive %r containing \n #1
Comments
This works fine for me:
from apachelogs import LogParser
line = '[23/Jul/2020:11:21:48 +0100] 66.240.192.138 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "\\n" 226'
parser = LogParser('%t %h %{tls}x %{encr}x \"%r\" %b')
entry = parser.parse(line)
print(vars(entry))
Note that I doubled the backslash before the n.
Thanks for your reply!
When you say "imported from an apache access.log file", is your Python code opening & reading from the file (which is what I would generally recommend & expect and which requires no special processing to deal with backslashes), or are you copying & pasting the contents of the file into the Python program? If the latter, making the log file entries into raw strings (If you have a whole big block of them, by enclosing the block in
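The difference between pasting a log entry into source code and reading it from a file can be illustrated with a small sketch. It only uses the tail of the entry from this issue, and the `io.StringIO` object is a stand-in for an opened access.log file, not how the reporter's script actually works.

```python
import io

# The same log text written three ways as Python literals.
pasted = '"\n" 226'     # plain literal: \n is turned into one real newline character
doubled = '"\\n" 226'   # doubled backslash: a backslash followed by n, as in the log file
raw = r'"\n" 226'       # raw string: the same two characters, no doubling needed

print(len(pasted), len(doubled), len(raw))
print(doubled == raw)

# Reading from a file needs no special handling: the backslash and the n
# arrive as two ordinary characters. StringIO stands in for the real file.
log_file = io.StringIO(raw + "\n")
line = log_file.readline().rstrip("\n")  # drop only the line terminator
print(line == raw)
```

Only the pasted, non-raw literal contains an actual newline, which is why an entry copied into a script can fail to match while the same entry read from the file parses.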
I am simply opening/reading from a file. Indeed, I tried enclosing it in something like
And here are some lines of the log file
Try this: Wrap the
Then re-run your script and tell me what exactly it outputs.
Hi, I have tested again and here is what I can tell:
case A) using
case B) using
Case B) is not my use case, so we can ignore it for the moment. Let's focus on case A).
That's not going to change. When Apache writes out \+n in an access log entry, it means that the respective field (
Why exactly do you want the newline to be escaped? A literal newline in the middle of a field in a CSV file isn't a problem as long as the field is quoted. If the problem is that whatever's reading the CSV file can't handle such newlines, you may be better served by configuring the CSV dialect into something compatible with that program.
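The point about quoted CSV fields can be checked with the standard `csv` module. This is a minimal sketch: the row layout (IP, request line, bytes) is an assumption made for illustration, and the in-memory buffer stands in for the real output file.

```python
import csv
import io

# Hypothetical parsed values: the request line is a single real newline.
request_line = "\n"

# newline="" follows the csv module's guidance for file-like objects.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
writer.writerow(["66.240.192.138", request_line, 226])

# Reading the data back yields one row, with the newline intact in its field.
buffer.seek(0)
rows = list(csv.reader(buffer))
print(rows)
```

Because the field is quoted, the reader treats the embedded newline as data rather than as the end of the record.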
Thanks for the tip with the
I want the result to be printed exactly as it was in the input file because I am doing some forensics. The output file will contain all requests sent from suspicious IPs to my servers (more than 20 access.log files, actually). I can compute how many requests were sent, from which IPs, and on which dates, but I cannot easily say which top 5 requests generated the most bytes out. These requests are sent by bots scanning reachable IPs for vulnerabilities. The requests are crafted on purpose to get some data out (e.g.
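The "top 5 requests by bytes out" figure mentioned above could be computed along these lines. This is only a sketch under stated assumptions: the `entries` list is made-up sample data standing in for (request line, bytes sent) pairs taken from parsed log entries, and none of the values come from the real logs.

```python
from collections import Counter

# Hypothetical sample data: (request_line, bytes_sent) pairs.
entries = [
    ("GET / HTTP/1.1", 226),
    ("GET /admin HTTP/1.1", 1500),
    ("GET / HTTP/1.1", 226),
    ("POST /login HTTP/1.1", 90),
]

# Sum the bytes sent per distinct request line.
bytes_by_request = Counter()
for request_line, bytes_sent in entries:
    bytes_by_request[request_line] += bytes_sent

# The five request lines that produced the most outgoing bytes.
top5 = bytes_by_request.most_common(5)
for request_line, total in top5:
    print(total, repr(request_line))
```

Printing with `repr()` keeps any newline inside a request line visible as `\n` instead of breaking the output across lines.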
By the way, .rstrip() is not documented:
I don't understand why you want one representation of the request line and not another, but OK.
Ah OK! I thought it was part of your library. I'm not used to Python enough yet ^^ Well, I just want the request line to be printed out as it was sent, so that I can say, for instance, that request
Thanks for your quick replies.
Probably for the same reason the Apache project stores the request this way in access.log. I cannot get any better than this:
What about adding an option to enable/disable writing special characters as mentioned above?
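Without any change to the library, the escaped form could be restored on the caller's side after parsing. The following is a minimal sketch: `reescape` is a hypothetical helper, not part of apachelogs, and it only approximates the escaping described in Apache's mod_log_config documentation (a backslash before `"` and `\`, C-style notation for whitespace, `\xhh` for other non-printable characters). It handles ASCII control characters only.

```python
# Hypothetical helper, not part of apachelogs: turn a parsed value back
# into an escaped, single-line form similar to what the access log shows.
_C_STYLE = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}


def reescape(value: str) -> str:
    out = []
    for ch in value:
        if ch in _C_STYLE:
            out.append(_C_STYLE[ch])             # whitespace in C-style notation
        elif ch in ('"', "\\"):
            out.append("\\" + ch)                # quote and backslash get a backslash
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append(f"\\x{ord(ch):02x}")      # other ASCII control characters
        else:
            out.append(ch)
    return "".join(out)


print(reescape("\n"))
print(reescape('GET /a"b HTTP/1.1'))
```

Applied to the parsed request line before writing the CSV, a helper like this would keep each record on one line while leaving the parser's own behavior unchanged.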
Hi,
Thanks for your parser, working great.
Please look at this access log entry where "%r" contains \n. I get an "InvalidEntryError : Could not match log entry..."
[23/Jul/2020:11:21:48 +0100] 66.240.192.138 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "\n" 226
LogParser('%t %h %{tls}x %{encr}x \"%r\" %b')
Is there something wrong with my regexp or with the way apachelogs handles \n?