A High-Performance, Production-Ready Python Implementation of C# LINQ with Deferred Execution.
Data manipulation in Python often leads to highly nested comprehensions, unreadable functional chains (map, filter, reduce), or unnecessary memory overhead when processing large data streams.
linqex brings the elegance and power of C# LINQ (Language Integrated Query) directly into the Python ecosystem. It allows you to query, transform, and manipulate iterable sequences using a fluent, declarative syntax while maintaining absolute type safety and phenomenal execution speeds.
Standard Python list comprehensions compute the entire result set in memory at once. If you only need the first 3 matching elements from a 10 GB log file, loading it all into memory is disastrous.
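The cost difference can be sketched with the standard library alone. The example below is a minimal illustration, not linqex code: the read_lines helper and its synthetic log lines are hypothetical stand-ins for a large file, and a counter records how many lines each approach actually pulls from the source.

```python
from itertools import islice

def read_lines(total, counter):
    """Hypothetical stand-in for a huge log file: yields synthetic lines lazily."""
    for i in range(total):
        counter["read"] += 1  # track how many lines were actually pulled
        yield f"line {i}: {'ERROR' if i % 10 == 0 else 'INFO'}"

# Eager: the list comprehension walks the whole source before slicing.
eager_counter = {"read": 0}
eager = [line for line in read_lines(1000, eager_counter) if "ERROR" in line][:3]

# Lazy: the generator expression stops as soon as islice has three matches.
lazy_counter = {"read": 0}
lazy_matches = (line for line in read_lines(1000, lazy_counter) if "ERROR" in line)
lazy = list(islice(lazy_matches, 3))

print(eager == lazy)          # True: both approaches find the same three lines
print(eager_counter["read"])  # 1000
print(lazy_counter["read"])   # 21
```

The lazy version stops reading right after the third match, which is the behavior a deferred pipeline relies on.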
linqex is built on a pure lazy-evaluation architecture using native Python yield generators and the C-based itertools library. The data pipeline you define (e.g., .where().select().order_by()) is never executed until a terminal operation like .to_list(), .first(), or .count() is invoked. This results in an
- 100% C# LINQ Parity: Supports almost all LINQ operators from .NET 8, including modern additions like .chunk(), .max_by(), and .distinct_by().
- Deferred Execution: Chain as many operations as you want. The engine only computes exactly what it needs, exactly when it needs it.
- Pythonic Fast-Paths: If you pass an in-memory sequence (like a list or tuple), methods like .count(), .element_at(), and .reverse() bypass O(N) iterations and execute instantly in O(1) constant time leveraging Python's __len__ and __getitem__.
- Zero Overhead Memory: Utilizes strict __slots__ across all classes, eliminating dynamic dictionary allocations and keeping memory usage razor-thin even when spawning millions of groups or ordered states.
- Strict Exception Parity: Replicates C#'s robust exception behavior. Operations like .single() throw exceptions on duplicates, and .to_dict() fiercely guards against silent key overwrites, ensuring data integrity.
- Absolute Type Safety: Meticulously annotated with Python typing generics (Generic[T], TypeVar). It provides flawless IDE autocomplete (VS Code, PyCharm) and fully supports static analyzers like mypy.
- Stable Multi-Level Sorting: Offers .order_by().then_by_descending() chaining without re-evaluating the source, natively leveraging Python's lightning-fast Timsort algorithm.
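The fast-path idea from the list above can be sketched as follows. This is an illustration under assumptions, not the library's source: the names fast_count and fast_element_at are hypothetical, and the checks use collections.abc.Sized and Sequence to decide whether __len__ or __getitem__ can replace iteration.

```python
from collections.abc import Sequence, Sized
from itertools import islice

def fast_count(source):
    """Count items, using len() when the source supports it (hypothetical helper)."""
    if isinstance(source, Sized):
        return len(source)         # O(1) for list, tuple, dict, set, ...
    return sum(1 for _ in source)  # O(N) fallback; consumes a generator

def fast_element_at(source, index):
    """Return the item at index, using indexing when available (hypothetical helper)."""
    if index < 0:
        raise IndexError("index must be non-negative")
    if isinstance(source, Sequence):
        return source[index]       # O(1) via __getitem__
    # Fallback: skip ahead lazily and take a single item.
    for item in islice(source, index, index + 1):
        return item
    raise IndexError("index out of range")

print(fast_count([1, 2, 3]))                   # 3
print(fast_count(x for x in range(5)))         # 5
print(fast_element_at(("a", "b", "c"), 1))     # b
print(fast_element_at((c for c in "abc"), 2))  # c
```

Note that the fallback branches consume the iterator they are given, so they are only safe to call once per generator.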
Engineering facts developers need to know when using this library:
- The Generator Exhaustion Reality: Python generators can only be traversed once. If you pass a generator expression (x for x in ...) into Enumerable and execute a terminal operation like .count(), the generator is consumed. A subsequent .to_list() will return an empty list. To perform multiple terminal operations, pass an in-memory collection (like a list) to the engine or explicitly call .to_list() first.
- Terminal vs. Intermediate Operations: Methods like where, select, and skip are Intermediate (they return a new Enumerable and do no work). Methods like to_list, count, sum, and first are Terminal (they force the evaluation of the pipeline).
- Lookup vs. Dictionary: In LINQ, a Dictionary maps one key to one value, while a Lookup maps one key to a collection of values. linqex strictly follows this. Furthermore, requesting a non-existent key from a .to_lookup() result returns an empty Enumerable instead of throwing a KeyError, making grouped data access incredibly safe.
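The first and third points can be reproduced with plain Python, independent of linqex. The sketch below shows a generator yielding nothing on a second pass, and a hypothetical build_lookup helper (an assumption for illustration, not the library's to_lookup implementation) that returns an empty tuple for a missing key instead of raising KeyError.

```python
from collections import defaultdict

# 1. Generators are single-use: the second pass finds nothing left.
gen = (x * x for x in range(4))
first_pass = sum(1 for _ in gen)  # consumes the generator
second_pass = list(gen)           # already exhausted
print(first_pass, second_pass)    # 4 []

# Materializing once makes the data safe to traverse repeatedly.
data = [x * x for x in range(4)]
print(len(data), data)            # 4 [0, 1, 4, 9]

# 2. One key maps to many values, with a safe default for missing keys.
def build_lookup(items, key):
    """Hypothetical helper: group items by key and return a lookup function."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    frozen = {k: tuple(v) for k, v in groups.items()}
    return lambda k: frozen.get(k, ())  # missing key -> empty tuple

lookup = build_lookup(["apple", "avocado", "banana"], key=lambda s: s[0])
print(lookup("a"))  # ('apple', 'avocado')
print(lookup("z"))  # ()
```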
- No external dependencies.
- Only Python Standard Library (itertools, collections, functools, typing).
- Fully compatible with Python 3.9+.
The library has zero external dependencies and works natively with Python's core toolkit.
- Clone the repository:
git clone https://github.com/TahsinCr/python-linqex.git
- Install via PIP:
pip install linqex
Cleanly filter, sort, and project data without nested comprehensions.
from linqex import Enumerable
data = [
    {"name": "Alice", "age": 28, "role": "Dev"},
    {"name": "Bob", "age": 35, "role": "HR"},
    {"name": "Charlie", "age": 42, "role": "Dev"},
    {"name": "Dave", "age": 22, "role": "Dev"}
]
# Pipeline is lazy. No iteration happens yet.
devs = (Enumerable(data)
    .where(lambda x: x["role"] == "Dev")
    .where(lambda x: x["age"] > 25)
    .order_by_descending(lambda x: x["age"])
    .select(lambda x: x["name"]))
# Terminal operation executes the pipeline
print(devs.to_list())
# Output: ['Charlie', 'Alice']
Finding the maximum element based on a specific property, similar to .MaxBy() in C#.
from linqex import Enumerable
inventory = [
    {"id": 1, "product": "Laptop", "price": 1200},
    {"id": 2, "product": "Mouse", "price": 45},
    {"id": 3, "product": "Monitor", "price": 300}
]
stream = Enumerable(inventory)
# Finds the actual dictionary object of the most expensive item
most_expensive = stream.max_by(lambda x: x["price"])
print(most_expensive["product"]) # Output: Laptop
# O(1) Fast-path count execution since the source is a List
total_items = stream.count()
Process millions of records in chunks for database batch inserts without blowing up the RAM.
from linqex import Enumerable
def massive_database_stream():
    # Simulates a large source by yielding records one at a time
    for i in range(1, 1000000):
        yield {"id": i, "status": "pending"}
stream = Enumerable(massive_database_stream())
# Groups data into lists of 500 items lazily
batches = stream.chunk(500)
for batch in batches.take(3):  # Only process the first 3 batches
    print(f"Executing SQL bulk insert for {len(batch)} items...")
Easily group data by a specific key and perform aggregate calculations on the sub-groups.
from linqex import Enumerable
orders = [
    {"customer": "C1", "amount": 100},
    {"customer": "C2", "amount": 50},
    {"customer": "C1", "amount": 200},
    {"customer": "C3", "amount": 300}
]
report = (Enumerable(orders)
    .group_by(lambda o: o["customer"])
    .select(lambda group: {
        "customer": group.key,
        "total_spent": group.sum(lambda x: x["amount"]),
        "order_count": group.count()
    })
    .to_list())
# [{'customer': 'C1', 'total_spent': 300, 'order_count': 2}, ...]
Merge two disparate data sources safely and efficiently.
from linqex import Enumerable
employees = [{"id": 1, "name": "Alice", "dept_id": 10}, {"id": 2, "name": "Bob", "dept_id": 20}]
departments = [{"id": 10, "name": "Engineering"}, {"id": 20, "name": "Sales"}]
joined_data = Enumerable(employees).join(
    inner=departments,
    outer_key=lambda e: e["dept_id"],
    inner_key=lambda d: d["id"],
    selector=lambda e, d: f"{e['name']} works in {d['name']}"
).to_list()
# ['Alice works in Engineering', 'Bob works in Sales']
The open-source community is the perfect place to push the boundaries of high-performance libraries. Any contributions you make to render linqex faster, safer, or more capable are greatly appreciated!
We are especially looking forward to your contributions in the following areas:
- Algorithmic Optimizations: New Fast-Path bypasses for specific data types.
- New Operators: Expanding the API with niche LINQ operators.
- Edge-Case Testing: Expanding the already comprehensive unit test suite.
If you have a great idea or solution, please follow the steps below to create a Pull Request (PR). You can also open an Issue with the "enhancement" tag to suggest a new feature.
Don't forget to give the project a Star (⭐) at the top right if you found it useful. Thanks for your support!
- Fork the project to your own account.
- Create your Feature Branch:
git checkout -b feature/AmazingFeature
- Commit your changes (Make sure to use descriptive messages):
git commit -m 'feat: Added a new Fast-Path for Tuple evaluations'
- Push to the Branch:
git push origin feature/AmazingFeature
- Open a Pull Request on this repository.
⚠️ Important Developer Note: The linqex architecture relies heavily on generators and iterator logic. Before opening a PR, please run the full unit test suite to ensure 100% Code Coverage is maintained and your code complies with Python 3.9+ standards.
This project is fully open-source under the MIT License.
- PyPI: linqex on PyPI
- Source Code: Tahsincr/python-linqex
If you find any bugs or want to make an architectural contribution, feel free to open an Issue or submit a Pull Request on GitHub!
X: @TahsinCrs
Linkedin: @TahsinCr
Email: [email protected]