Draft
28 commits
a2d7ec1
pyscg restructuring to adress #894
myteron Nov 17, 2025
fc739ae
adding missing rules to main readme:209, 404, 460, 1335, 366, 472\nfi…
myteron Dec 9, 2025
298c19c
added missing links for CWE-362 and CWE-584
myteron Dec 9, 2025
87138ec
Merge branch 'ossf:main' into pyscg_new_layout
myteron Dec 9, 2025
53cef48
pyscg changes for 01_introduction
myteron Dec 9, 2025
4ca8cbf
batch of commits for '02 Encoding and Strings'
myteron Dec 9, 2025
df3ec82
adding 03_numbers
myteron Dec 9, 2025
66fa649
adding 04_neutralization
myteron Dec 9, 2025
e142535
adding 05_exception_handling
myteron Dec 9, 2025
9c3a473
adding 06 Logging
myteron Dec 9, 2025
74c2052
adding 08 concurrency
myteron Dec 9, 2025
edda291
adding 09 conding standards
myteron Dec 9, 2025
6a889f4
adding 10 cryptography
myteron Dec 9, 2025
ac740c1
Update README.md fixing linter issue
myteron Dec 9, 2025
95d969b
Update README.md, fixing linter issue
myteron Dec 9, 2025
db05df3
Update README.md 03_numbers 0003 fixing linting
myteron Dec 9, 2025
4573f90
Update README.md 03 numbers 0004 fix lint
myteron Dec 9, 2025
223062b
Update README.md 03 numbers fixing 0005 linting
myteron Dec 9, 2025
e51418e
Update README.md 03 numbers 0006 linting
myteron Dec 9, 2025
fb38a47
Update README.md 03 numbers 0007 linting
myteron Dec 9, 2025
80e4e70
Update README.md 04 neutralization 0008 linting
myteron Dec 9, 2025
e7a1934
Updated 03 numbers section to fix linter issues
myteron Dec 9, 2025
9b16f21
updating commit, new titles
myteron Dec 9, 2025
4466eeb
updating commit 01 introduction, new titles
myteron Dec 12, 2025
527667d
shorted section titles to do's. moved 0023 into 04. moved section ids…
myteron Dec 12, 2025
460ba99
updated contrib guide
myteron Dec 12, 2025
2969481
fixed linting issue and link to 07_concurrency
myteron Dec 12, 2025
d61045c
rat...
myteron Dec 12, 2025
@@ -1,4 +1,4 @@
# CWE-501: Trust Boundary Violation
# pyscg-0040: Respect Trust Boundaries

Python's trust boundaries rely on explicit process isolation, rather than in-process access control within a single interpreter.

@@ -1,4 +1,4 @@
# CWE-798: Use of hardcoded credentials
# pyscg-0041: Avoid Hardcoded Credentials

Ensure that unique keys or secrets can be replaced or rejected at runtime, and never hard-code sensitive information, such as passwords and encryption keys, in a component.
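
As a minimal sketch of keeping a secret out of the source, the snippet below reads it from the environment at runtime so it can be rotated without a code change. The helper `load_secret` and the variable name `APP_DB_PASSWORD` are hypothetical and not taken from the guide.

```python
import os


def load_secret(name: str) -> str:
    """Return a secret from the environment and fail loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        # Refuse to fall back to a built-in default credential.
        raise RuntimeError(f"required secret {name!r} is not set")
    return value


if __name__ == "__main__":
    try:
        # APP_DB_PASSWORD is a hypothetical variable name used for illustration.
        password = load_secret("APP_DB_PASSWORD")
        print("secret loaded, length:", len(password))
    except RuntimeError as err:
        print("startup aborted:", err)
```

Failing at startup when the value is absent is one possible design choice; a real deployment might use a secrets manager instead.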

@@ -1,4 +1,4 @@
# CWE-783: Operator Precedence Logic Error
# pyscg-0042: Mind Operator Precedence

Failing to understand the order of precedence in expressions that read and write to the same object can lead to unintended side effects.

@@ -1,4 +1,4 @@
# CWE-472: External Control of Assumed-Immutable Web Parameter
# pyscg-0055: Validate Web Parameters

Ensuring user roles are determined on the server side prevents attackers from manipulating permissions through client-side data.

@@ -1,4 +1,4 @@
# CWE-175: Improper Handling of Mixed Encoding
# pyscg-0043: Handle Mixed Character Encoding

Locale-dependent programs may produce unexpected behavior or security bypasses in an environment whose locale is unset, or not set to an appropriate value.

@@ -1,4 +1,4 @@
# CWE-180: Incorrect Behavior Order: Validate Before Canonicalize
# pyscg-0044: Validate Before Canonicalize

Normalize/canonicalize strings before validating them to prevent risky strings such as `../../../../passwd` from enabling directory traversal attacks, and to reduce `XSS` attacks.
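
The ordering can be sketched as follows: normalize with `unicodedata.normalize("NFKC", ...)` first, then validate the result against an allow list. The function name `canonical_file_name` and the allow-list pattern are assumptions made for this illustration, not part of the guide.

```python
import re
import unicodedata

# Hypothetical allow list: a simple base name with an optional extension.
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)?")


def canonical_file_name(raw: str):
    """Canonicalize first (NFKC), then validate; return None if rejected."""
    canonical = unicodedata.normalize("NFKC", raw)
    if _ALLOWED_NAME.fullmatch(canonical):
        return canonical
    return None


# Fullwidth full stops (U+FF0E) are not ASCII dots until normalized.
tricky = "\uff0e\uff0e/passwd"
print(".." in tricky)                  # check on the raw string
print(canonical_file_name(tricky))     # check after canonicalization
print(canonical_file_name("report.txt"))
```

A check on the raw string misses the traversal sequence because the fullwidth characters only become `..` after normalization.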

@@ -7,7 +7,7 @@ The need for supporting multiple languages requires the use of an extended list
Character encoding systems such as `ASCII`, `Windows-1252`, or `UTF-8` consist of an agreed mapping between byte values and human-readable characters, known as code points. Each code point ties together a fixed number such as "`\u002e`", its graphical representation "`.`", and its name "`FULL STOP`" [[Batchelder 2022]](https://www.youtube.com/watch?v=sgHbC6udIqc). Using the same encoding ensures that equivalent strings have a unique binary representation, as described in Unicode Standard _Annex #15, Unicode Normalization Forms_ [[Davis 2008]](https://wiki.sei.cmu.edu/confluence/display/java/Rule+AA.+References#RuleAA.References-Davis08). Different or unexpected changes in encoding can allow attackers to work around validation or input sanitization efforts.

> [!WARNING]
> Use allow lists to avoid having to maintain a deny list on a continuous basis (exclusion lists are a moving target), as per [CWE-184: Incomplete List of Disallowed Input - Development Environment](../../CWE-693/CWE-184/README.md).
> Use allow lists to avoid having to maintain a deny list on a continuous basis (exclusion lists are a moving target), as per [pyscg-0047: Use Allow Lists Over Deny Lists](../../04_neutralization/pyscg-0047/README.md).

<table>
<tr>
@@ -1,8 +1,8 @@
# CWE-182: Collapse of Data into Unsafe Value
# pyscg-0045: Enforce Consistent Encoding

Handling data between different encodings or while filtering out untrusted characters and strings can cause malicious content to slip through input sanitation.

Encoding changes, such as changing from `UTF-8` to pure `ASCII`, can result in turning non-functional payloads, such as `<script生>`, into functional `<script>` tags. Mixed encoding modes [CWE-180: Incorrect Behavior Order: Validate Before Canonicalize - Development Environment](../../CWE-707/CWE-180/) can also play a role. The recommendation by [Batchelder 2022](https://www.youtube.com/watch?v=sgHbC6udIqc) to use a single type of encoding and mode is only applicable for a single project or supplier. The recommendation to always choose the `UTF-8` by [W3c.org 2025](https://www.w3.org/International/questions/qa-what-is-encoding) provides no guarantee and is already flawed by Windows having `Windows-1252` encoding for some Python installations.
Encoding changes, such as changing from `UTF-8` to pure `ASCII`, can result in turning non-functional payloads, such as `<script生>`, into functional `<script>` tags. Mixed encoding modes [pyscg-0044: Validate Before Canonicalize](../pyscg-0044/README.md) can also play a role. The recommendation by [Batchelder 2022](https://www.youtube.com/watch?v=sgHbC6udIqc) to use a single type of encoding and mode is only applicable for a single project or supplier. The recommendation to always choose the `UTF-8` by [W3c.org 2025](https://www.w3.org/International/questions/qa-what-is-encoding) provides no guarantee and is already flawed by Windows having `Windows-1252` encoding for some Python installations.

The `example01.py` is a crudely simplified version of two methods simulating two completely different systems using different encodings. We are simulating the data at rest and data in transit part in a variable named `floppy`. The `write_message` and `read_message` methods would be delivered independently in a real-world scenario, each with their own encoding.
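
The collapse itself can be reproduced in a few lines. This is an independent sketch, not the guide's `example01.py`; the helper `to_ascii_strict` is a hypothetical name used to contrast lossy and strict conversion.

```python
payload = "<script生>"

# Lossy conversion: errors="ignore" silently drops the non-ASCII character,
# which turns the harmless string into a functional tag.
collapsed = payload.encode("ascii", errors="ignore").decode("ascii")
print(collapsed)


def to_ascii_strict(text: str) -> str:
    """Refuse lossy conversion; str.encode raises UnicodeEncodeError by default."""
    return text.encode("ascii").decode("ascii")


try:
    to_ascii_strict(payload)
except UnicodeEncodeError as err:
    print("rejected:", err.reason)
```

Rejecting input that cannot be represented is one option; whether to reject or escape depends on the application.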

@@ -64,8 +64,8 @@ The `example01.py` turns a non-functional `UTF-8` encoded message `<script生

A compliant solution will have to adhere to at least:

* [CWE-180: Incorrect Behavior Order: Validate Before Canonicalize](../../CWE-707/CWE-180/)
* [CWE-184: Incomplete List of Disallowed Input - Development Environment](../CWE-184/README.md)
* [pyscg-0044: Validate Before Canonicalize](../pyscg-0044/README.md)
* [pyscg-0047: Use Allow Lists Over Deny Lists](../../04_neutralization/pyscg-0047/README.md)

Reduction of data into a subset is not limited to strings and characters.

@@ -83,8 +83,8 @@ Reduction of data into a subset is not limited to strings and characters.
|[MITRE CWE](http://cwe.mitre.org/)|Pillar: CWE-693, Protection Mechanism Failure \[online\], available from <https://cwe.mitre.org/data/definitions/693.html> \[Accessed April 2025\]|
|[MITRE CWE](http://cwe.mitre.org/)|Base: CWE-182: Collapse of Data into Unsafe Value \[online\], available from <https://cwe.mitre.org/data/definitions/182.html> \[Accessed April 2025\]|
|[SEI CERT Coding Standard for Java](https://wiki.sei.cmu.edu/confluence/display/java/SEI+CERT+Oracle+Coding+Standard+for+Java)|IDS11-J. Perform any string modifications before validation\[online\], available from: <https://wiki.sei.cmu.edu/confluence/display/java/IDS11-J.+Perform+any+string+modifications+before+validation> \[Accessed April 2025\]|
|[OpenSSF Secure Coding in Python](https://github.com/ossf/wg-best-practices-os-developers/tree/main/docs/Secure-Coding-Guide-for-Python)|CWE-180: Incorrect Behavior Order: Validate Before Canonicalize \[online\], available from <https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/CWE-707/CWE-180> \[Accessed April 2025\]|
|[OpenSSF Secure Coding in Python](https://github.com/ossf/wg-best-practices-os-developers/tree/main/docs/Secure-Coding-Guide-for-Python)|CWE-184: Incomplete List of Disallowed Input \[online\], available from <https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/CWE-693/CWE-184/README.md> \[Accessed April 2025\]|
|[OpenSSF Secure Coding in Python](https://github.com/ossf/wg-best-practices-os-developers/tree/main/docs/Secure-Coding-Guide-for-Python)|pyscg-0044: Validate Before Canonicalize \[online\], available from <https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/02_encoding_and_strings/pyscg-0044/README.md> \[Accessed April 2025\]|
|[OpenSSF Secure Coding in Python](https://github.com/ossf/wg-best-practices-os-developers/tree/main/docs/Secure-Coding-Guide-for-Python)|pyscg-0047: Use Allow Lists Over Deny Lists \[online\], available from <https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/04_neutralization/pyscg-0047/README.md> \[Accessed April 2025\]|

## Bibliography

@@ -1,8 +1,8 @@
# CWE-838: Inappropriate Encoding for Output Context
# pyscg-0046: Context-Appropriate Output Encoding

Inappropriate handling of an encoding from untrusted sources or unexpected encoding can lead to unexpected values, data loss, or become the root cause of an attack.

Mixed encoding can lead to unexpected results and become a root cause for attacks as showcased in [CWE-180: Incorrect behavior order: Validate before Canonicalize](https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/CWE-707/CWE-180) and [CWE-175: Improper Handling of Mixed Encoding](https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/CWE-707/CWE-175/README.md). This rule showcases how to capture the original binary data from an untrusted source for forensics without compromising the logging system.
Mixed encoding can lead to unexpected results and become a root cause for attacks as showcased in [pyscg-0044: Validate Before Canonicalize](../pyscg-0044/README.md) and [pyscg-0043: Handle Mixed Character Encoding](../pyscg-0043/README.md). This rule showcases how to capture the original binary data from an untrusted source for forensics without compromising the logging system.

> [!CAUTION]
> Processing any type of forensic data requires an environment that is sealed off to an extent that prevents any exploit from reaching other systems, including hardware!
@@ -1,4 +1,4 @@
# CWE-1339: Insufficient Precision or Accuracy of a Real Number
# pyscg-0001: Control Numeric Precision

Avoid floating-point and use integers or the `decimal` module to ensure precision in applications that require high accuracy, such as in financial or banking computations.
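
A small sketch of the difference, using amounts chosen for illustration: the same sum is computed with `float`, with `decimal.Decimal` built from strings, and with integers in the smallest currency unit.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_total = 0.1 + 0.2

# Decimal built from strings keeps the exact decimal values.
decimal_total = Decimal("0.1") + Decimal("0.2")

# Integers in the smallest unit (for example cents) are exact as well.
cents_total = 10 + 20

print(float_total == 0.3)
print(decimal_total == Decimal("0.3"))
print(cents_total == 30)
```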

@@ -110,4 +110,5 @@ print(
|:---|:---|
|[Bloch 2008]|Item 48, "Avoid `float` and `double` If Exact Answers Are Required"|
|[Bloch 2005]|Puzzle 2, "Time for a Change"|

|[IEEE 754]||
@@ -1,4 +1,4 @@
# CWE-191, Integer Underflow (Wrap or Wraparound)
# pyscg-0002: Handle Integer Overflow

Ensure that integer overflow is properly handled in order to avoid unexpected behavior. Python data types can be divided into two categories:

@@ -1,4 +1,4 @@
# CWE-1335: Promote readability and compatibility by using mathematical written code with arithmetic operations instead of bit-wise operations
# pyscg-0003: Use Arithmetic Over Bitwise Operations

Avoid using bitwise operations for calculations; write math as math to ensure code clarity, compatibility, and maintainability.
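
One way the shift form can surprise is operator precedence: in Python, `+` binds tighter than `<<`. The sketch below uses illustrative values and variable names that are not taken from the guide's example files.

```python
# Intended: multiply 8 by 4, then add 10.
shifted = 8 << 2 + 10          # parsed as 8 << (2 + 10)
parenthesized = (8 << 2) + 10  # works, but hides the intent
arithmetic = 8 * 4 + 10        # states the intent directly

print(shifted, parenthesized, arithmetic)
```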

@@ -104,7 +104,7 @@ The statement in `compliant01.py` clarifies the programmer's intention.
print(8 * 4 + 10)
```

It is recommended by *[CWE-191, Integer Underflow (Wrap or Wraparound)](../../CWE-191/README.md)* to also check for under or overflow.
It is recommended by *[pyscg-0002: Handle Integer Overflow](../pyscg-0002/README.md)* to also check for under or overflow.

## Non-compliant Code Example (Right Shift)

@@ -1,8 +1,8 @@
# CWE-197: Numeric Truncation Error
# pyscg-0004: Use Integer Loop Counters

Ensure predictable outcomes in loops by using `int` instead of `float` variables as counters.

Floating-point arithmetic can only represent a finite subset of real numbers [[IEEE Std 754-2019](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8766229)], such as `0.555....` represented by `0.5555555555555556` also discussed in [CWE-1339: Insufficient Precision or Accuracy of a Real Number](https://github.com/ossf/wg-best-practices-os-developers/tree/main/docs/Secure-Coding-Guide-for-Python/CWE-682/CWE-1339). Code examples in this rule are based on [Albing and Vossen, 2017].
Floating-point arithmetic can only represent a finite subset of real numbers [[IEEE Std 754-2019](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8766229)], such as `0.555....` represented by `0.5555555555555556` also discussed in [pyscg-0001: Control Numeric Precision](../pyscg-0001/README.md). Code examples in this rule are based on [Albing and Vossen, 2017].

Side effects of using `float` as a counter are demonstrated in `example01.py`, which showcases that calculating `0.1 + 0.2` does not end up as `0.3`.
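
The effect on a loop can be sketched independently of `example01.py`. Both function names and the safety limit below are hypothetical; the limit exists only so the float version terminates.

```python
def count_float_steps(limit: int = 20) -> int:
    """Count iterations of a loop that waits for a float counter to hit 1.0."""
    counter = 0.0
    steps = 0
    # Accumulated rounding error means counter never equals exactly 1.0,
    # so only the safety limit stops the loop.
    while counter != 1.0 and steps < limit:
        counter += 0.1
        steps += 1
    return steps


def count_int_steps() -> int:
    """Same loop with an integer counter; the float is derived per iteration."""
    steps = 0
    for tenth in range(10):
        _value = tenth / 10  # derived value, not accumulated
        steps += 1
    return steps


print(count_float_steps(), count_int_steps())
```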

@@ -1,4 +1,4 @@
# CWE-197: Control rounding when converting to less precise numbers
# pyscg-0005: Control Rounding Behavior

While defensive coding requires enforcing types, it is important to make conscious design decisions on how conversions are rounded.
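
To make the design decision concrete, the sketch below contrasts three conversions: `int()` truncates toward zero, the built-in `round()` rounds halves to the nearest even integer, and `Decimal.quantize` takes an explicit rounding mode. The values are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# int() drops the fractional part (truncation toward zero).
truncated = int(2.9)
truncated_negative = int(-2.9)

# round() uses round-half-to-even for exactly representable halves.
bankers_low = round(2.5)
bankers_high = round(3.5)

# Decimal.quantize makes the rounding mode an explicit, visible choice.
half_up = Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP)

print(truncated, truncated_negative, bankers_low, bankers_high, half_up)
```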

@@ -1,4 +1,4 @@
# CWE-681: Incorrect Conversion between Numeric Types
# pyscg-0006: Avoid Float String Comparisons

String representations of floating-point numbers must not be compared or inspected outside of specialized modules such as `decimal` or `math`.
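
A short sketch of why comparing the strings directly goes wrong: strings compare character by character, not numerically. The literals are illustrative only.

```python
import math
from decimal import Decimal

# Lexicographic comparison: "1" sorts before "9", so this is True.
string_less = "10.0" < "9.0"

# Numeric comparison through Decimal gives the expected answer.
decimal_less = Decimal("10.0") < Decimal("9.0")

# Equal values with different spellings differ as strings.
string_equal = "1.0" == "1.00"
decimal_equal = Decimal("1.0") == Decimal("1.00")

# For binary floats, math.isclose compares within a tolerance.
close = math.isclose(0.1 + 0.2, 0.3)

print(string_less, decimal_less, string_equal, decimal_equal, close)
```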

@@ -1,4 +1,4 @@
# CWE-681: Avoid an uncontrolled loss of precision when passing floating-point literals to a Decimal constructor
# pyscg-0007: Avoid Float Literals

When working with decimal numbers in Python, using floating-point literals as input to the `Decimal` constructor can lead to unintended imprecision due to the limitations of `IEEE 754` [Wikipedia 2025](https://en.wikipedia.org/wiki/IEEE_754) floating-point representation; therefore, to ensure accurate decimal representation, it is advisable to avoid using floating-point literals.
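
The difference shows up as soon as the two constructors are compared. This sketch reuses the value `0.45` that appears in the rule's own example output.

```python
from decimal import Decimal

# The float literal is converted to binary first, so Decimal receives
# the nearest binary fraction rather than exactly 0.45.
from_float = Decimal(0.45)

# The string is parsed as decimal digits and stays exact.
from_string = Decimal("0.45")

print(from_float == from_string)
print(from_string)
```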

@@ -50,4 +50,5 @@ print(Decimal("0.45"))
|||
|:---|:---|
|[Wikipedia 2025](https://en.wikipedia.org)|IEEE 754 [online]. Available from: [https://en.wikipedia.org/wiki/IEEE_754](https://en.wikipedia.org/wiki/IEEE_754)|

|[Python docs](https://docs.python.org/3/)|decimal — Decimal fixed-point and floating-point arithmetic [online]. Available from: [https://docs.python.org/3/library/decimal.html](https://docs.python.org/3/library/decimal.html) [accessed 2 February 2025]|
@@ -1,4 +1,4 @@
# CWE-1335: Incorrect Bitwise Shift of Integer
# pyscg-0053: Handle Bitwise Shifts Safely

Avoid mixing bitwise shifts with arithmetic operations; use clear mathematical expressions instead to maintain predictable behavior, readability, and compatibility.

@@ -166,7 +166,7 @@ print("Will never reach here")

## Compliant Solution

Bit-shifting is an optimization pattern that works better for languages closer to the CPU than Python. Math in Python is better done by arithmetical functions in Python as stated by *CWE-1335: Promote readability and compatibility by using mathematical written code with arithmetic operations instead of bit-wise operations* [[OpenSSF Secure Coding in Python 2025]](https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/CWE-682/CWE-1335/01/README.md).
Bit-shifting is an optimization pattern that works better for languages closer to the CPU than Python. Math in Python is better done by arithmetical functions in Python as stated by *CWE-1335: Promote readability and compatibility by using mathematical written code with arithmetic operations instead of bit-wise operations* [[pyscg-0003: Use Arithmetic Over Bitwise Operations]](../pyscg-0003/README.md).
Understanding `ctypes` or `C` requires understanding the *CERT C Coding Standard* [[SEI CERT C 2025]](https://www.securecoding.cert.org/confluence/display/seccode/CERT+C+Coding+Standard) and setting boundaries manually in Python.

## Automated Detection
@@ -181,7 +181,7 @@ Not available
<a href="https://github.com/ossf/wg-best-practices-os-developers/tree/main/docs/Secure-Coding-Guide-for-Python">[OpenSSF Secure Coding in Python 2025]</a>
</td>
<td>
<a href="https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/CWE-682/CWE-1335/01/README.md">CWE-1335: Promote readability and compatibility by using mathematical written code with arithmetic operations instead of bit-wise operations</a>
<a href="https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/03_numbers/pyscg-0003/README.md">pyscg-0003: Use Arithmetic Over Bitwise Operations</a>
</td>
</tr>
<tr>
@@ -1,4 +1,4 @@
# CWE-134: Use of Externally-Controlled Format String
# pyscg-0008: Prevent Format String Injection

Ensure that all format string functions are passed a static string which cannot be controlled by the user [[MITRE 2023]](https://cwe.mitre.org/data/definitions/134.html).
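
A minimal sketch of the risk with `str.format`: when the template comes from the user, its replacement fields can reach attributes the application never meant to expose. The `User` class, its `password_hash` attribute, and both function names are hypothetical.

```python
class User:
    def __init__(self, name: str, password_hash: str) -> None:
        self.name = name
        self.password_hash = password_hash


def greet_unsafe(template: str, user: User) -> str:
    # Unsafe: a caller-supplied template can reach any attribute of `user`.
    return template.format(user=user)


def greet_safe(user: User) -> str:
    # Safer: the format string is a constant; user data is only an argument.
    return "Hello, {}!".format(user.name)


victim = User("{user.password_hash}", "hash123")
print(greet_unsafe("{user.password_hash}", victim))  # leaks the attribute
print(greet_safe(victim))                            # prints the name as data
```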

@@ -1,4 +1,4 @@
# CWE-78: Improper Neutralization of Special Elements Used in an OS Command ("OS Command Injection")
# pyscg-0009: Prevent OS Command Injection

Avoid using input from untrusted sources directly as part of an OS command; use specialized Python modules where possible instead.
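
As one possible illustration of "specialized Python modules", a directory listing can be produced with `os.listdir` so that no shell ever parses the path. The function name `list_dir_names` is hypothetical and this is not the guide's `FileOperations` class.

```python
import os


def list_dir_names(path: str) -> list:
    """List directory entries without invoking a shell.

    The path is passed to the operating system as data, so characters
    such as ';' or '&' are never interpreted as command separators.
    """
    return sorted(os.listdir(path))


if __name__ == "__main__":
    print(list_dir_names("."))
    try:
        # The whole string is treated as one (nonexistent) path.
        list_dir_names("nonexistent-dir; echo injected")
    except OSError as err:
        print("rejected:", type(err).__name__)
```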

@@ -31,7 +31,7 @@ Any variation of using input from a lesser trusted source as part of a command l

* *CWE-184: Incomplete List of Disallowed Input.*
* *CWE-209: Generation of Error Message Containing Sensitive Information.*
* *[CWE-501: Trust Boundary Violation](https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/CWE-664/CWE-501/README.md)*
* *[pyscg-0040: Respect Trust Boundaries](../../01_introduction/pyscg-0040/README.md)*

## Non-Compliant Code Example (Read Only)

@@ -75,7 +75,7 @@ The `FileOperations().list_dir()` method allows an attacker to add commands via

The attack surface increases if a user is also allowed to upload or create files or folders.

The `noncompliant02.py` example demonstrates the injection via file or folder name that is created prior to using the `list_dir()` method. We assume here that an untrusted user is allowed to create files or folders named `& calc.exe or ;ps aux` as part of another service such as upload area, submit form, or as a result of a zip-bomb as per *[CWE-409: Improper Handling of Highly Compressed Data](../../CWE-664/CWE-409/README.md) (Data Amplification)*. Encoding issues as described in *[CWE-180: Incorrect Behavior Order: Validate Before Canonicalize](../CWE-180/README.md)* must also be considered.
The `noncompliant02.py` example demonstrates the injection via file or folder name that is created prior to using the `list_dir()` method. We assume here that an untrusted user is allowed to create files or folders named `& calc.exe or ;ps aux` as part of another service such as upload area, submit form, or as a result of a zip-bomb as per *[pyscg-0012: Handle Data Amplification](../pyscg-0012/README.md)*. Encoding issues as described in *[pyscg-0044: Validate Before Canonicalize](../../02_encoding_and_strings/pyscg-0044/README.md)* must also be considered.

The issue occurs when mixing shell commands with data from a lesser trusted source.

@@ -1,4 +1,4 @@
# CWE-89: Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')
# pyscg-0010: Prevent SQL Injection

To prevent SQL injections, use input sanitization and parameterized queries instead of `executescript()`.
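
A parameterized query with the standard `sqlite3` module might look like the sketch below. The `students` table, the sample rows, and the helper names are assumptions made for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical students table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT)")
    conn.executemany("INSERT INTO students VALUES (?)", [("Alice",), ("Bob",)])
    return conn


def find_student(conn: sqlite3.Connection, name: str) -> list:
    # The '?' placeholder sends `name` as data, never as SQL text.
    cursor = conn.execute("SELECT name FROM students WHERE name = ?", (name,))
    return [row[0] for row in cursor.fetchall()]


if __name__ == "__main__":
    db = make_demo_db()
    print(find_student(db, "Alice"))
    print(find_student(db, "' OR '1'='1"))  # treated as a literal name
```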

@@ -1,7 +1,7 @@
# CWE-843: Access of Resource Using Incompatible Type ('Type Confusion')
# pyscg-0011: Prevent Type Confusion

When operating on unsigned values coming from external sources, such as `C` or `C++` applications, they should be unpacked using variable types that can represent their entire value range.
This rule is related to [CWE-197: Control rounding when converting to less precise numbers](https://github.com/ossf/wg-best-practices-os-developers/blob/main/docs/Secure-Coding-Guide-for-Python/CWE-664/CWE-197/01/README.md).
This rule is related to [pyscg-0005: Control Rounding Behavior](../../03_numbers/pyscg-0005/README.md).

The scenario in `example01.py` demonstrates what can go wrong when Python needs to interact with `C` or `C++` data types using the `struct` module [[Python docs](https://docs.python.org/3/library/struct.html)]. This can be either over the network, via file, or an interaction with the operating system. A file or stream is simulated with `io.BytesIO`.
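
The underlying effect can be sketched separately from `example01.py`: the same two bytes yield different numbers depending on whether `struct` unpacks them as signed (`h`) or unsigned (`H`). The value 65535 is chosen for illustration.

```python
import io
import struct

# Simulate a stream carrying a C `unsigned short` with its maximum value.
stream = io.BytesIO(struct.pack("<H", 65535))
raw = stream.read(2)

# Unpacking as a signed short cannot represent the value range.
as_signed = struct.unpack("<h", raw)[0]

# Unpacking with the matching unsigned format preserves the value.
as_unsigned = struct.unpack("<H", raw)[0]

print(as_signed, as_unsigned)
```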
