@plusls plusls commented Sep 2, 2025

In Python 2.7, a JUMP_FORWARD sometimes follows RAISE_VARARGS and sometimes does not, so we should check whether we are at the end of the if block.

JUMP_FORWARD example:

                    Object Name: _check_closed
                    Arg Count: 1
                    Locals: 1
                    Stack Size: 2
                    Flags: 0x00000043 (CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE)
                    [Names]
                        'closed'
                        'ValueError'
                    [Var Names]
                        'self'
                    [Free Vars]
                    [Cell Vars]
                    [Constants]
                        'Raises a ValueError if the underlying file object has been closed.\n\n        '
                        'I/O operation on closed file.'
                        None
                    [Disassembly]
                        0       LOAD_FAST                       0: self
                        3       LOAD_ATTR                       0: closed
                        6       POP_JUMP_IF_FALSE               24
                        9       LOAD_GLOBAL                     1: ValueError
                        12      LOAD_CONST                      1: 'I/O operation on closed file.'
                        15      CALL_FUNCTION                   1
                        18      RAISE_VARARGS                   1
                        21      JUMP_FORWARD                    0 (to 24)
                        24      LOAD_CONST                      2: None

Example without JUMP_FORWARD:

                        404     STORE_ATTR                      25: compress
                        407     JUMP_FORWARD                    17 (to 427)
                        410     LOAD_GLOBAL                     26: IOError
                        413     LOAD_CONST                      14: 'Mode '
                        416     LOAD_FAST                       2: mode
                        419     BINARY_ADD                      
                        420     LOAD_CONST                      15: ' not supported'
                        423     BINARY_ADD                      
                        424     RAISE_VARARGS                   2
                        427     LOAD_FAST                       4: fileobj
                        430     LOAD_FAST                       0: self
                        433     STORE_ATTR                      27: fileobj
                        436     LOAD_CONST                      8: 0

This crash is caused by #561.
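
A minimal Python sketch of the check described above, assuming the decompiler already knows the offset where the enclosing block ends (24 in the first listing, 427 in the second). The function name and the fixed 3-byte instruction size are illustrative assumptions, not pycdc code.

```python
# Hypothetical model of the proposed check; this is not pycdc's actual code.
# In Python 2.7, an opcode that takes an argument occupies 3 bytes.
INSTR_SIZE_WITH_ARG = 3

def pos_after_raise(raise_offset, block_end):
    """Return the offset at which decoding resumes after RAISE_VARARGS.

    If the block does not end right after the raise, one trailing
    instruction (assumed here to be a 3-byte jump) is skipped.
    """
    pos = raise_offset + INSTR_SIZE_WITH_ARG
    if pos != block_end:
        pos += INSTR_SIZE_WITH_ARG  # skip the dead JUMP_FORWARD
    return pos

# First listing: RAISE_VARARGS at 18, the if block ends at 24.
print(pos_after_raise(18, 24))    # 24
# Second listing: RAISE_VARARGS at 424, the block ends at 427.
print(pos_after_raise(424, 427))  # 427
```

In both cases decoding resumes at the first instruction after the block, which is the behavior the patch aims for.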

@plusls plusls force-pushed the fix_py27_raise_ret branch from 8227fdc to 8133b2a Compare September 2, 2025 07:36
Comment on lines 1830 to 1835
// Sometimes a JUMP instruction follows the raise,
// so check whether we are at the end of the current block.
// If we are not at the block end, skip one instruction.
if (prev->end() != pos) {
    bc_next(source, mod, opcode, operand, pos);
}
Contributor

  • I am not sure I understand why this needs to be handled in the RAISE_VARARGS opcode and not in the JUMP opcode.
  • We should only skip JUMP opcodes. More interestingly, in the example you have provided, JUMP_FORWARD jumps to the very next instruction and therefore does not seem like it should lead to any crashes!
  • If this is only for Python 2.7, we should not perform this logic for all Python versions. The changes should be wrapped in a version check.
  • Tests should be added to avoid future regressions.
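
One way to read the points above, as a hedged Python sketch: skip the instruction after a raise only if it is a jump opcode, and only for Python 2.7. The helper names, the opcode set, and the version tuple are assumptions for illustration. pycdc itself is written in C++ and may structure this differently.

```python
# Hypothetical helpers; the names and the opcode set are assumptions,
# not part of pycdc's API.
JUMP_OPS = frozenset(["JUMP_FORWARD", "JUMP_ABSOLUTE"])

def is_noop_jump(offset, size, target):
    """True if the jump lands on the instruction immediately after itself."""
    return target == offset + size

def should_skip_after_raise(next_opname, version):
    """Skip the instruction after a raise only for jumps on Python 2.7."""
    return version == (2, 7) and next_opname in JUMP_OPS

# JUMP_FORWARD at offset 21 (3 bytes) targets 24, the very next instruction.
print(is_noop_jump(21, 3, 24))                          # True
print(should_skip_after_raise("JUMP_FORWARD", (2, 7)))  # True
print(should_skip_after_raise("LOAD_FAST", (2, 7)))     # False
```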

Author

  • I put it in RAISE_VARARGS because of "Fix bug in RAISE_VARARGS" (#561).
  • I’m not really sure what the most appropriate handling would be; this is just a mitigation. Since I’m not very familiar with pycdc itself, I haven’t figured out exactly where it should be added.
  • Also, I only encountered this issue in Python 2.7, so I’m not sure whether other versions have similar cases.

Contributor

Sure thanks, for now let's add some tests with some compiled files. With those I will be more easily able to understand if we need to put it somewhere else.

Author

> Sure thanks, for now let's add some tests with some compiled files. With those I will be more easily able to understand if we need to put it somewhere else.

OK, I added some tests, and I also found some other bugs...

After reading the source code, I realized that pycdc doesn’t have the concept of basic blocks. In fact, the root cause of this bug lies in the incorrect handling of dead instructions. However, pycdc doesn’t care about which instructions will be executed and which won’t.
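
To illustrate what a dead instruction means here, the following Python sketch runs a simple reachability pass over the `_check_closed` listing from the description. The instruction tuples are transcribed from that listing (which is cut off after offset 24); the function name and the set of opcodes without fall-through are assumptions for this example, not pycdc code.

```python
# (offset, opname, jump target or None), transcribed from the listing above.
CODE = [
    (0, "LOAD_FAST", None),
    (3, "LOAD_ATTR", None),
    (6, "POP_JUMP_IF_FALSE", 24),
    (9, "LOAD_GLOBAL", None),
    (12, "LOAD_CONST", None),
    (15, "CALL_FUNCTION", None),
    (18, "RAISE_VARARGS", None),
    (21, "JUMP_FORWARD", 24),
    (24, "LOAD_CONST", None),
]

# Opcodes after which control never falls through to the next instruction.
NO_FALLTHROUGH = {"RAISE_VARARGS", "RETURN_VALUE", "JUMP_FORWARD", "JUMP_ABSOLUTE"}

def reachable_offsets(code):
    """Return the set of offsets reachable from the first instruction."""
    index = {off: i for i, (off, _, _) in enumerate(code)}
    seen = set()
    work = [code[0][0]]
    while work:
        off = work.pop()
        if off in seen or off not in index:
            continue
        seen.add(off)
        i = index[off]
        _, op, target = code[i]
        if target is not None:
            work.append(target)  # follow the jump edge
        if op not in NO_FALLTHROUGH and i + 1 < len(code):
            work.append(code[i + 1][0])  # follow the fall-through edge
    return seen

live = reachable_offsets(CODE)
dead = [off for off, _, _ in CODE if off not in live]
print(dead)  # [21]
```

The only unreachable offset is 21, the JUMP_FORWARD that follows RAISE_VARARGS.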

Contributor

Yeah, I mean pycdc is a decompiler and not an interpreter. Its goal is to understand the general flow of the instructions and construct reasonable Python code out of it.

Author

I’m not very familiar with decompilation, but from what I see, decompilers like IDA and Ghidra all build basic blocks, so I think this should be a necessary step in the decompilation process?

Contributor

In a broad sense I think pycdc does it, e.g. for try-except blocks or if-else blocks. Apart from that, I am not sure.

ASTree.cpp Outdated
curblock->append(prev.cast<ASTNode>());

bc_next(source, mod, opcode, operand, pos);
if (prev->end() != pos) {
Contributor

  • Again, as per above, this should be handled in the JUMP opcode instead of in RAISE_VARARGS or RETURN_VALUE.
  • Tests should be added to avoid future regressions.

@plusls plusls force-pushed the fix_py27_raise_ret branch from 542a941 to 36e0491 Compare September 2, 2025 08:57
Contributor

whoami730 commented Sep 2, 2025

Please avoid force pushes and rebases. Otherwise, old comments are marked as outdated, which makes it harder for reviewers to keep track of which changes have been reviewed and which haven't.
