diff --git a/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss1.png b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss1.png
new file mode 100644
index 0000000..7d543d6
Binary files /dev/null and b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss1.png differ
diff --git a/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss2.png b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss2.png
new file mode 100644
index 0000000..3747eed
Binary files /dev/null and b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss2.png differ
diff --git a/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss3.png b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss3.png
new file mode 100644
index 0000000..4365134
Binary files /dev/null and b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss3.png differ
diff --git a/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss4.png b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss4.png
new file mode 100644
index 0000000..60fbcad
Binary files /dev/null and b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_activity_monitor_ss4.png differ
diff --git a/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_heatmap_1.png b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_heatmap_1.png
new file mode 100644
index 0000000..7b429f1
Binary files /dev/null and b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_heatmap_1.png differ
diff --git a/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_heatmap_2.png b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_heatmap_2.png
new file mode 100644
index 0000000..e0a38b2
Binary files /dev/null and b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/aditij_heatmap_2.png differ
diff --git a/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/index.md b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/index.md
new file mode 100644
index 0000000..03f3702
--- /dev/null
+++ b/content/posts/networkx/nx_parallel/Contri_Phase_Outreachy/index.md
@@ -0,0 +1,133 @@
+---
+title: "Starting Contributing to NetworkX"
+date: 2024-03-11T19:33:00+0530
+tags: ["nx-parallel", "NetworkX", "Outreachy", "first-time-contributor"]
+draft: false
+description: "An open-source newbie's journey of contributing to NetworkX during the contribution phase of Outreachy's December 2023 cohort."
+author: ["Aditi Juneja"]
+displayInList: true
+---
+
+I started contributing to the [NetworkX repository](https://github.com/networkx/networkx) in October'23 as part of the contribution stage of Outreachy's December 2023 cohort. I chose to contribute to the NetworkX project because, after reading through the project details, it was one of the very few projects that seemed interesting to me, and I felt I would be able to complete it during the 10-11 weeks of the internship. The project was to add 2-3 parallel implementations of NetworkX algorithms to the [nx-parallel repository](https://github.com/networkx/nx-parallel). In this blog, I'll walk you through the first 2.5-3 months of my contribution journey with NetworkX.
+
+## TSP-related PRs
+
+I started by exploring the TSP section of the repo (because I looove TSP!). After quickly going through all the TSP-related docs and code, I searched through all the issues, PRs, and discussions with the "TSP" keyword, and I ended up creating two documentation changes, thereby closing two issues. I also went through some of the [GSoC blogs by Matt Schwennesen](https://blog.scientific-python.org/tags/traveling-salesman-problem/) on the Scientific Python Blog website. At the end of week 2, I also wrote a [blog](https://schefflera-arboricola.github.io/Schefflera-Arboricola/TSP-approximation-algorithms) on TSP approximation algorithms.
+
+One of the TSP-related [issues](https://github.com/networkx/networkx/issues/6748) was about getting unexplained outputs from the `nx.approximation.traveling_salesman_problem` algorithm when the edge weights in a graph don't satisfy the triangle inequality. I proposed three possible fixes: first, add a note in the documentation about how the NetworkX API currently handles this case; second, raise a warning when a graph violating the triangle inequality is passed; and third (proposed by the issue's author), raise an error and not accept such graphs at all. But in [the PR](https://github.com/networkx/networkx/pull/6995) I only implemented the first one, as it felt best at the time because it didn't change the NetworkX API but rather documented what the API did. This PR took a long while to get merged, so if you are also starting out, just be patient and keep contributing. The [other TSP-related PR](https://github.com/networkx/networkx/pull/7013) was a very small documentation change.
+
+## Parallelizing algorithms and contributing to nx-parallel
+
+There were 5 initial tasks that our mentor ([@MridulS](https://github.com/MridulS)) wanted us, as Outreachy's potential interns, to do to get familiar with the codebase and set up the development environment. One of the tasks was to run the [timing_individual_function.py](https://github.com/networkx/nx-parallel/blob/main/timing/timing_individual_function.py) script in nx-parallel and generate a heatmap depicting how much faster the nx-parallel implementation of the `betweenness_centrality` function is compared to the standard, sequential NetworkX implementation. [At first](https://github.com/networkx/outreachy/pull/206), I wasn't able to observe any speedup values greater than 1 in my heatmap, which led me to explore how the timing script was working and dig deeper into how the repository was set up.
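+
+In essence, the timing script compares the wall-clock time of the sequential and the parallel implementations on random graphs of varying sizes and edge probabilities. Here's a minimal sketch of that idea (this is not the actual script; the graph size and the use of `nxp.ParallelGraph` here are just illustrative):
+
+```.py
+import time
+
+import networkx as nx
+import nx_parallel as nxp
+
+G = nx.fast_gnp_random_graph(300, 0.5, seed=42)
+
+t0 = time.perf_counter()
+nx.betweenness_centrality(G)  # standard sequential implementation
+t_seq = time.perf_counter() - t0
+
+t0 = time.perf_counter()
+nx.betweenness_centrality(nxp.ParallelGraph(G))  # dispatches to nx-parallel
+t_par = time.perf_counter() - t0
+
+print(f"speedup: {t_seq / t_par:.2f}")
+```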
+
+
+ {{< image >}}
+ src = 'aditij_heatmap_1.png'
+ alt = 'heatmap for betweenness_centrality with no speedups'
+ width = 800
+ align = 'center'
+ {{< /image >}}
+ Heatmap for the betweenness_centrality algorithm with negligible speedups
+
+
+While trying to figure this out, I enjoyed observing and playing around with all the stats in the Activity Monitor (on Mac). At the time, I didn't know how the parallel implementation worked, so when I observed the number of threads and the CPU% increase for the `kernel_task` process, I thought that maybe the parallel implementation did threading. And it made sense to me because I was running the timing script through my terminal, and I had sometimes heard people refer to the terminal as a kernel.
+
+
+ {{< image >}}
+ src = 'aditij_activity_monitor_ss1.png'
+ alt = 'While the sequential implementation is running (low %CPU for kernel_task)'
+ width = 800
+ align = 'center'
+ {{< /image >}}
+ While the sequential implementation is running (low %CPU for kernel_task)
+
+
+
+ {{< image >}}
+ src = 'aditij_activity_monitor_ss2.png'
+ alt = 'While the parallel implementation is running (the number of threads and %CPU for kernel_task increase)'
+ width = 800
+ align = 'center'
+ {{< /image >}}
+ While the parallel implementation is running (the number of threads and %CPU for kernel_task increase)
+
+
+But later, when I looked at the code, I discovered that the default backend (`loky`) of `joblib.Parallel` creates parallel processes, not threads, and then I did see that new python3.11 processes were also being created. I also observed the usage of all the CPU cores (ref. the small window in the screenshots).
+
+
+ {{< image >}}
+ src = 'aditij_activity_monitor_ss3.png'
+ alt = 'Only one python3.11 process with 99.9% CPU while the sequential implementation is running'
+ width = 800
+ align = 'center'
+ {{< /image >}}
+ Only one python3.11 process with 99.9% CPU while the sequential implementation is running
+
+
+
+ {{< image >}}
+ src = 'aditij_activity_monitor_ss4.png'
+ alt = '8 more python3.11 processes using the CPU while the parallel implementation is running'
+ width = 800
+ align = 'center'
+ {{< /image >}}
+ 8 more python3.11 processes using the CPU while the parallel implementation is running
+
+
+But later I figured out that there were no speedups because I hadn't set up the development environment properly. I also ended up [fixing a bug](https://github.com/networkx/nx-parallel/pull/13) in the timing script, related to edge probability. Here is the final heatmap:
+
+
+ {{< image >}}
+ src = 'aditij_heatmap_2.png'
+ alt = 'heatmap for betweenness_centrality with speedups'
+ width = 800
+ align = 'center'
+ {{< /image >}}
+ Heatmap for the betweenness_centrality algorithm with speedups
+
+
+### 1. Parallelizing `all_pairs_bellman_ford_path` algorithm [ref. [PR](https://github.com/networkx/nx-parallel/pull/14)]
+
+I was going through the discussions and there was [this discussion](https://github.com/networkx/networkx/discussions/7000), about shortest paths and multigraphs, that I felt I could answer. While going down the rabbit hole of functions to figure out what happens to the `weights` when the shortest path function is called on a multigraph, I encountered the comment `#TODO This can be trivially parallelized.` in the `all_pairs_bellman_ford_path` algorithm. It felt very much related to the internship project, and I thought it was an issue. I later discovered there are several of these comments all over the codebase, and also several issues/PRs related to parallelization - the 2 main ones were [issue 4064](https://github.com/networkx/networkx/issues/4064) and [PR 6306](https://github.com/networkx/networkx/pull/6306). I started working on this "issue", implemented a skeleton parallel implementation, and made my first significant PR in nx-parallel, which got merged around the start of my internship (end of Dec'23). By working on this, I got a deeper understanding of how generators work, learned [how to implement embarrassingly parallel algorithms using `joblib.Parallel`](https://joblib.readthedocs.io/en/latest/parallel.html), and learned how we can parallelize functions that return generators.
+
+### 2. ASV benchmarking [ref. [PR](https://github.com/networkx/nx-parallel/pull/24)]
+
+While I was exploring the nx-parallel repository, there was only [this one issue](https://github.com/networkx/nx-parallel/issues/12), about building a benchmarking infrastructure, so I started working on it at the start of week 2 of the contribution stage, and by the end of week 3, I made a PR with a skeleton benchmarking infrastructure for the nx-parallel repo. Setting up the config file for the benchmarks took me some time because I was quite new to how projects are set up, so I didn't know much about it. In hindsight, I think I should probably have asked my mentor about it. But this helped me get a better understanding of the nx-parallel project, and of things like hatch builds, wheels, the NetworkX backend architecture and dispatching, tests and workflows, Sphinx docs, etc.
+
+While configuring and structuring the benchmarks, I also enjoyed exploring how the benchmarks were configured and structured in other projects like NetworkX, NumPy, SciPy, pandas, etc. By the end of Dec'23, I had implemented benchmarks for almost all the algorithms present in the nx-parallel repository, using the backend, the number of nodes, and the edge probability as the benchmarking parameters.
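+
+To give an idea of the structure, here is a minimal sketch of what an ASV benchmark parametrized this way could look like (the class name, parameter values, and setup here are illustrative, not the exact code from the PR):
+
+```.py
+import networkx as nx
+
+
+class BetweennessCentralityBenchmark:
+    # ASV times every combination of (backend, num_nodes, edge_prob)
+    params = [["parallel", None], [50, 100, 300], [0.3, 0.6]]
+    param_names = ["backend", "num_nodes", "edge_prob"]
+
+    def setup(self, backend, num_nodes, edge_prob):
+        # graph construction happens in setup so it isn't included in the timing
+        self.G = nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=42)
+
+    def time_betweenness_centrality(self, backend, num_nodes, edge_prob):
+        _ = nx.betweenness_centrality(self.G, backend=backend)
+```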
+
+While working on both of the PRs and communicating with my mentors, I got a better understanding of how the project needs to proceed and how to implement things so that they could be scaled in the future.
+
+## Some maintenance and documentation PRs
+
+I was looking for things to contribute to, I guess... I was mostly just worried that I hadn't made a PR in a week because I was working on the benchmarking and didn't feel like I was making any progress on it, so I made [this PR](https://github.com/networkx/networkx/pull/7027) to update the algorithms in `core.py` (btw, if you are looking for somewhere to start contributing, there are a lot of places in NetworkX that require some kind of maintenance), but I started actively working on it only after I was satisfied with my work on the benchmarking PR. I later made a [similar PR](https://github.com/networkx/networkx/pull/7110) updating the documentation and tests of most of the functions in `similarity.py` and `link_prediction.py`. I also made [another PR](https://github.com/networkx/networkx/pull/7121) deprecating some behavior of functions in `core.py`, which gave me insights into the deprecation process of NetworkX - and again, if you are looking for somewhere to start, deprecating things from this [deprecations.rst](https://github.com/networkx/networkx/blob/main/doc/developer/deprecations.rst) is usually pretty straightforward and easy to implement. I opened some documentation PRs as well while I was exploring the codebase. [Here](https://github.com/networkx/networkx/pulls?q=is%3Apr+author%3ASchefflera-Arboricola+is%3Aclosed) are all my PRs.
+
+## Drafting the Final Application
+
+I went through all the algorithms in the NetworkX repo and all the past commits in the nx-parallel repo (there were only 30 commits) to get an idea of where the project was going. I also went through a few papers on parallel graph algorithms. I don't think [my proposal](https://docs.google.com/document/d/1xLr4_kqxWU3dB2GB5AAsSvr3VDtKMssQs5iYoxJhffg/edit?usp=sharing) was "perfect", but I think contributing and communicating with my mentor throughout was what mattered more and built trust, in my case. I had a hard time making a week-wise timeline; Mridul was fine with what I proposed, but I wasn't, and I kept updating my own personal timeline as the internship progressed, adding more ideas and estimates of how much time they would take. The more I understood things, the better I got at estimating how long they would take. I'm still learning, I guess, and trying to get better at it.
+
+Some of the resources I found useful during the initial application and while writing a proposal:
+
+1. [Outreachy's website (Applicant's guide)](https://www.outreachy.org/docs/applicant/)
+2. [Outreachy Blogs by Princiya](https://princiya.com/blog/ace-your-outreachy-application/)
+3. [Vani Chitkara's blogs](https://vanichitkara.wordpress.com/blog/)
+4. [A walkthrough of an Outreachy proposal](https://youtu.be/Mr5lFGlB8d0?si=lVYmXU-SPoUmTFGF)
+
+## After the Final Application
+
+I continued contributing to my PRs after submitting the proposal. I didn't know this, but you can keep adding new contributions to the projects on the Outreachy website even a few days after the final proposal deadline. I also discovered these [blogs](https://github.com/20kavishs/GSOCBlogNetworkX/blob/main/index.md) and the [GSoC final submission doc](https://docs.google.com/document/d/1aGnYhUlbT970HkipDZ4K7R4qqTtid29tmSFYT98hYyw/edit#heading=h.y43pkwhqzzp2) by [@20kavishs](https://github.com/20kavishs) (the person who worked on nx-parallel before me), which gave me some more context. After the results were announced, I wasn't feeling very... great. But I still sent an email to my mentor asking for feedback on my performance and how I could have done better. In the reply, I got to know that I wasn't selected because I wasn't eligible for the Dec'23 cohort, as I was a student at a university in the northern hemisphere. My mentor also informed me that they (NetworkX) might be interested in hiring me as an intern outside of Outreachy, but nothing was for sure. So, it was a bit uncertain for a few days. However, I decided to keep on contributing because I felt like I was getting value out of it and learning something, and I didn't want to abandon my open PRs. Around one week later, I got an email from Mridul to discuss the NetworkX internship further. After some discussions, I filled out the Independent Contractor form and the W-8BEN form, and signed the ICA with NumFOCUS (NetworkX's fiscal sponsor) to work primarily on nx-parallel, and NetworkX in general, starting 1st Jan'24. I was paid from NetworkX's CZI grant funds, and my primary mentors were [Dan Schult](https://github.com/dschult) and [Mridul Seth](https://github.com/MridulS). Yay!
+
+## A few extra points/tips
+
+- **Exploit ChatGPT as much as you can**: Sometimes, as a beginner, going through 2-3 issues in which you are bombarded with a ton of words you don't understand can seem overwhelming. So, what I used to do was copy-paste all the conversations under an issue/PR into ChatGPT and ask it to summarise or explain them to me, and once I had an overview of the discussion, I'd start reading through the actual conversations. It might also come in handy for understanding the codebase, legal documents, financial terms, etc. Perplexity is better for understanding the latest developments, and also for understanding very large code files (because you can upload code files to it). Also, there is a limit to the number of Copilot queries you can make, so use that wisely! Perplexity's results are based on the search results of your query, and you can also see the resources it referred to, which can lead you somewhere good as well. It was very useful to me in understanding the things that people were discussing in NetworkX's dispatching meets, as those things were really new, and Perplexity could refer me to the specific meeting notes and the discussions on NetworkX's discourse. And I think it's good to ask things of something like ChatGPT or Perplexity before you ask your mentor, unless it's something on which you need an experienced human's perspective.
+
+- **Outreachy specific**:
+ - I wrote my essays and submitted my initial application just 1-2 minutes before the deadline because I "knew" I wouldn't get selected, as I was applying for the first time and barely had any experience or a good background. I hadn't even checked Outreachy's website for the selected organizations or the organizations of past cohorts. But when I got the email that I'd been selected, I didn't waste any time, because it was the first good thing that had happened to me in ages and I didn't want to lose it. Having said that, there were times when I just didn't feel good enough to be able to make it into the final internship, but then I thought that there was someone at Outreachy who believed that I could, so I just didn't want to let them down and tried to be better. Now, though, I think it's better if I'm that person for myself and don't rely on others too much. Anyway, in the first 2-3 days, I went through all the projects and listed 4-5 projects I was interested in. I categorized all the projects into 3 categories - "Not Interested", "Interested but not skilled enough", and "Interested and skilled" - and concentrated on the projects in the last category. So, to sum it up, JUST APPLY... even if you think that you really really really can't get selected, because you might, and even if you don't, you will at least learn how "not" to do it :) All the best!
+ - One of the points mentioned by Mridul in the "Mentoring Style" section on the project page was to give weekly updates on the work. So, I started emailing him my work updates during the contribution stage, and it also helped me ask him all my questions at the end of each week. So, you can also try to follow your mentor's mentoring style, if it doesn't annoy them too much. And just try to understand what's really expected of you, beyond what's written on the project page, by communicating with your mentor and the community members. And I didn't know this, but in the case of NetworkX (or any other project in the Scientific Python ecosystem), you can also join the [weekly community meetings](https://scientific-python.org/calendars/), if you like (or go through the meeting notes, if you are meeting-shy!), just to get an idea of what's happening and how things work in the community. They are open to all :)
+ - Outreachy requires all organizations to secure funding for at least one intern. So, if you were contributing through Outreachy and later got hired by the organization instead, like in my case, then you should probably get paid that secured amount. (ref. [Outreachy - intern funding](https://www.outreachy.org/docs/community/#intern-funding))
+
+[All of my involvements in NetworkX](https://github.com/search?q=repo%3Anetworkx%2Fnetworkx+involves%3ASchefflera-Arboricola&type=issues)!
+
+Check out the next blog - [NetworkX Internship - Working on nx-parallel](../networkx_internship/)!
+
+Thank you :)
diff --git a/content/posts/networkx/nx_parallel/NetworkX_Internship/aditij_backend_box_ss.png b/content/posts/networkx/nx_parallel/NetworkX_Internship/aditij_backend_box_ss.png
new file mode 100644
index 0000000..cf813c3
Binary files /dev/null and b/content/posts/networkx/nx_parallel/NetworkX_Internship/aditij_backend_box_ss.png differ
diff --git a/content/posts/networkx/nx_parallel/NetworkX_Internship/aditij_nxp_illustration.png b/content/posts/networkx/nx_parallel/NetworkX_Internship/aditij_nxp_illustration.png
new file mode 100644
index 0000000..c1ef0ae
Binary files /dev/null and b/content/posts/networkx/nx_parallel/NetworkX_Internship/aditij_nxp_illustration.png differ
diff --git a/content/posts/networkx/nx_parallel/NetworkX_Internship/index.md b/content/posts/networkx/nx_parallel/NetworkX_Internship/index.md
new file mode 100644
index 0000000..a970536
--- /dev/null
+++ b/content/posts/networkx/nx_parallel/NetworkX_Internship/index.md
@@ -0,0 +1,131 @@
+---
+title: "NetworkX Internship - Working on nx-parallel"
+date: 2024-03-11T19:34:00+0530
+tags: ["nx-parallel", "NetworkX", "NetworkX-Internship", "CZI"]
+draft: false
+description: "Two months of contributing to nx-parallel as an independent contractor."
+author: ["Aditi Juneja"]
+displayInList: true
+resources:
+ - name: featuredImage
+ src: "aditij_nxp_illustration.png"
+ params:
+ description: "Illustration"
+ showOnTop: false
+---
+
+In the [last blog](../contri_phase_outreachy/), I walked you through my journey of contributing to NetworkX as an open-source newbie. In this blog, I’ll share with you my journey of working primarily on [nx-parallel](https://github.com/networkx/nx-parallel) and NetworkX in general, during the NetworkX Internship as an Independent Contractor. I hope you will find this blog helpful if you are someone thinking of contributing to nx-parallel in the future.
+
+I won't be concentrating too much on the details of the work I did in this blog, but I will be sharing the resources I found helpful, the problems I faced, the solutions I came up with, the things I learned, and the future todos for nx-parallel. If you are interested in the exact work that I did, then going through these [internship meeting notes](https://hackmd.io/@Schefflera-Arboricola/HkHl6IPaa) (weekly work updates included) might be insightful.
+
+## Docs and backend configurations
+
+In the first week, I started off with maintenance and documentation stuff. I also added `n_jobs` and other joblib-related kwargs (ref. [`joblib.Parallel` docs](https://joblib.readthedocs.io/en/latest/generated/joblib.Parallel.html)), which would let the user decide things like how many CPU cores to use, which joblib `backend` to use, whether to enable `verbose` (lets you see logs), etc. I started off by digging into how joblib's `Parallel` class and [sklearn](https://scikit-learn.org/stable/computing/parallelism.html) handle this. I initially implemented it so that these arguments were taken from the user in the function call and then passed into the `joblib.Parallel` call. But having them as backend configurations rather than keyword arguments (as suggested by Mridul) made more sense (to me), because that way we wouldn't be populating all the function headers (and calls) with the same arguments; instead, there would be a global variable or file storing all the backend configurations. We also decided that, in the future, when we add this config dictionary, we wouldn't allow the user to change certain joblib-related kwargs that can potentially break the API, like `return_as` (ref. [PR](https://github.com/networkx/networkx/pull/7225) - adds `config` for nx backends).
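+
+For plain joblib, the analogous "global settings instead of per-call kwargs" mechanism is the `joblib.parallel_config` context manager (available in joblib >= 1.3). Here's a sketch of that general mechanism; whether these settings reach nx-parallel's internals depends on how it invokes `Parallel`:
+
+```.py
+from joblib import Parallel, delayed, parallel_config
+
+# settings applied once here, instead of being repeated in every Parallel() call
+with parallel_config(backend="loky", n_jobs=4, verbose=10):
+    results = Parallel()(delayed(lambda x: x**2)(n) for n in range(8))
+```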
+
+For the nx-parallel website, I started looking into [how the docs were set up for nx-guides](https://github.com/networkx/nx-guides/pull/21) and into the configs in the `doc` folder in networkx. But we decided that we could have a separate website for nx-parallel once a good number of algorithms were added, and until then, we could just have a few lines about the parallel implementation and any additional parameters in the "Additional Backend implementation" box at the end of an algorithm's docs page on the main NetworkX docs, something like this:
+
+
+ {{< image >}}
+ src = 'aditij_backend_box_ss.png'
+ alt = "backend implementation box at the end of algorithm's docs page"
+ width = 800
+ align = 'center'
+ {{< /image >}}
+ ref. `nx.square_clustering`
+
+
+For that, I updated the docstrings and added a `get_info` function to the `backend_info` [entry point](https://packaging.python.org/en/latest/specifications/entry-points/#entry-points), which would automatically extract the additional docs and parameters from an algorithm's docstring (ref. [PR](https://github.com/networkx/nx-parallel/pull/27)).
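+
+The rough idea is that the entry point exposes a function returning a dictionary, which NetworkX reads when rendering the backend box in the docs. A minimal sketch of the shape such a function might have (the key names here are illustrative, not necessarily the exact ones nx-parallel uses):
+
+```.py
+def get_info():
+    # NetworkX discovers this via the `backend_info` entry point and uses it
+    # to render the backend box on each algorithm's docs page
+    return {
+        "backend_name": "parallel",
+        "url": "https://github.com/networkx/nx-parallel",
+        "short_summary": "Parallel backend for NetworkX algorithms.",
+        "functions": {
+            "square_clustering": {
+                "additional_docs": "The nodes are chunked and processed in parallel.",
+                "additional_parameters": {
+                    "get_chunks": "A function to divide all the nodes into chunks.",
+                },
+            },
+        },
+    }
+```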
+
+Initially, I had created one PR for all of these things, and I later had to split it into 4-5 different PRs. So, it's better to discuss first and then implement, and not to create one PR for different things just because they seem like "small" changes at the moment - they might become quite big and hard to review later on.
+
+## Chunking and adding more algorithms
+
+Over the past two months, I implemented around 15 parallel algorithms in nx-parallel, out of which three (`global_reaching_centrality`, `local_reaching_centrality`, and `all_pairs_shortest_path_length`) didn't show any speedups, even after increasing the graph's size (ref. [PR](https://github.com/networkx/nx-parallel/pull/44), [PR](https://github.com/networkx/nx-parallel/pull/33)). Before going into why they didn't show any speedups, let's take a step back and understand what "chunking" is, and even before that, let's quickly see how we can use joblib to create and run parallel processes.
+
+So, joblib is a high-level parallel computing library, and it makes it really easy to parallelize a simple `for` loop with independent iterations, like this:
+
+```.py
+>>> from joblib import Parallel, delayed
+>>> num = [1, 2, 3, 4, 5, 6, 7, 8]
+>>> squares = Parallel(n_jobs=-1)(delayed(lambda x: x ** 2)(n) for n in num)
+>>> squares
+[1, 4, 9, 16, 25, 36, 49, 64]
+```
+
+Here, `n_jobs=-1` indicates that we want to use all the available CPU cores, and `Parallel(n_jobs=-1)` creates an object of the [`Parallel` class in joblib](https://github.com/joblib/joblib/blob/master/joblib/parallel.py#L937). `(delayed(lambda x: x ** 2)(n) for n in num)` is a generator that is passed into the `Parallel` object. `delayed` is a decorator function provided by joblib that captures the arguments of a function (ref. [1](https://stackoverflow.com/questions/42220458/what-does-the-delayed-function-do-when-used-with-joblib-in-python), [2](https://github.com/joblib/joblib/blob/master/joblib/parallel.py#L654C7-L654C7), [3](https://joblib.readthedocs.io/en/latest/parallel.html#joblib.delayed)). From the docs: "The delayed function is a simple trick to be able to create a tuple (function, args, kwargs) with a function-call syntax." So, `delayed` delays the execution of the lambda function until it is called by `Parallel` when running the tasks in parallel.
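+
+You can see this tuple trick directly in an interpreter, with something like the following (the exact repr of the function object will differ on your machine):
+
+```.py
+>>> from joblib import delayed
+>>> def square(x):
+...     return x**2
+>>> delayed(square)(3)  # nothing is computed yet, just (function, args, kwargs)
+(<function square at 0x...>, (3,), {})
+```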
+
+Now, often in parallel computing, we "chunk" the data before putting it into any kind of parallel machinery (like `joblib.Parallel`), like this:
+
+```.py
+>>> l_chunks = [[1, 2], [3, 4], [5, 6], [7, 8]]
+>>> square_chunks = Parallel(n_jobs=-1)(delayed(lambda chunk: [n**2 for n in chunk])(chunk) for chunk in l_chunks)
+>>> squares = [square for chunk in square_chunks for square in chunk]
+>>> squares
+[1, 4, 9, 16, 25, 36, 49, 64]
+```
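+
+The chunks themselves can be built with a few lines of Python; here's one simple contiguous-chunking helper as an illustration (nx-parallel has its own chunking utility, so this is just to show the idea):
+
+```.py
+>>> def make_chunks(lst, n_chunks):
+...     size = -(-len(lst) // n_chunks)  # ceiling division
+...     return [lst[i : i + size] for i in range(0, len(lst), size)]
+>>> make_chunks([1, 2, 3, 4, 5, 6, 7, 8], 4)
+[[1, 2], [3, 4], [5, 6], [7, 8]]
+```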
+
+And chunking seems to usually improve the speedups. I had only seen chunking in nx-parallel, so I didn't know how [common](https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.map) it was. So, I started looking into why chunking was improving the speedups and when it wasn't (ref. [PR](https://github.com/networkx/nx-parallel/pull/42)). The main takeaway was that chunking usually improves the speedups, probably because it reduces the overhead of creating and managing parallel processes. And sometimes, when the processing time of a chunk is too small (like in `number_of_isolates`) or too large (like in `all_pairs_node_connectivity`), chunking might not improve the speedups. In the case of `all_pairs_node_connectivity`, changing the default chunking and reducing the size of the default chunk helped in getting speedups (ref. [PR](https://github.com/networkx/nx-parallel/pull/33)). But all these conclusions were based on experimenting and benchmarking on random graphs generated by the `nx.fast_gnp_random_graph(num, p, seed=42)` function, so I added a `get_chunks` kwarg to most of the algorithms, which lets the user define the chunks according to their specific needs. For example, if the user knows (based on some other centrality measure) that calculating the `betweenness_centrality` of some particular nodes will take a lot of time, then they can make sure that those nodes don't all end up in one chunk, by defining the chunks through the `get_chunks` kwarg (ref. [PR](https://github.com/networkx/nx-parallel/pull/29)), as sketched below.
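+
+Here's a hypothetical sketch of what passing a custom `get_chunks` could look like (the round-robin strategy and the exact invocation here are illustrative; see the PR above for the real interface):
+
+```.py
+import networkx as nx
+import nx_parallel as nxp
+
+G = nx.fast_gnp_random_graph(200, 0.5, seed=42)
+
+
+def custom_chunks(nodes):
+    # hypothetical strategy: distribute the nodes round-robin into 4 chunks,
+    # so that "expensive" nodes are less likely to pile up in a single chunk
+    nodes = list(nodes)
+    return [nodes[i::4] for i in range(4)]
+
+
+bc = nxp.betweenness_centrality(G, get_chunks=custom_chunks)
+```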
+
+Now, the reason for `global_reaching_centrality` and `local_reaching_centrality` not showing speedups might be that I haven't tried chunking in them yet. Another possible reason, for `global_reaching_centrality`, is that it uses the parallel implementation of the `local_reaching_centrality` algorithm, which was probably creating a lot of parallel processes and hence increasing the overhead of managing them. (A little side note: this one got me curious about what happens when we recursively create chunks and run them in parallel, so I played around with that a little bit (ref. [1](https://stackoverflow.com/questions/78016770/why-does-recursively-running-joblib-parallel-increases-the-computation-time)).) Also, I might get speedups for `all_pairs_shortest_path_length` (and also `local_reaching_centrality`) by using even bigger chunks, because their sequential implementations are already pretty fast.
+
+### Chunking in functions that `yield` instead of `return`
+
+Many `all_pairs_` functions in networkx `yield` results instead of returning a dictionary, and `joblib.Parallel` has a kwarg `return_as` which, when set to `"generator"`, returns a generator object instead of a list. So, in the very first `all_pairs_` algorithm ([all_pairs_bellman_ford_path](https://github.com/networkx/nx-parallel/pull/14)) that I implemented, I didn't use any chunking and just returned the generator object returned by `joblib.Parallel`. But when I was playing around with chunking (ref. [PR](https://github.com/networkx/nx-parallel/pull/42)), it took me a while to figure out how to implement chunking with generators. I finally did something like this:
+
+```.py
+from joblib import Parallel, delayed
+
+# one delayed call per chunk of nodes (node_chunks comes from the enclosing function)
+paths_chunk_generator = (
+    delayed(_process_node_chunk)(node_chunk) for node_chunk in node_chunks
+)
+
+# run the chunks in parallel and flatten the results back into individual paths
+for path_chunk in Parallel(n_jobs=-1)(paths_chunk_generator):
+    for path in path_chunk:
+        yield path
+```
+
+So first, we create a generator expression that generates delayed function calls to `_process_node_chunk`, one for each `node_chunk`. The `delayed` defers the execution of `_process_node_chunk` until later. The generator expression `paths_chunk_generator` is then passed as input to `Parallel`, which executes the delayed function calls in parallel, and we then yield each `path` using a simple `for` loop. The `_process_node_chunk` function returns a list of shortest paths for all the nodes in the given chunk:
+
+```.py
+def _process_node_chunk(node_chunk):
+    # G and weight come from the enclosing function's scope
+    return [
+        (node, single_source_bellman_ford_path(G, node, weight=weight))
+        for node in node_chunk
+    ]
+```
+
+Yielding instead of returning a generator object, and adding chunking, both improved the speedups (ref. [PR](https://github.com/networkx/nx-parallel/pull/49)). I also added the `get_chunks` kwarg to the `all_pairs_bellman_ford_path` algorithm and improved all the other `all_pairs_` algorithms similarly (ref. [PR](https://github.com/networkx/nx-parallel/pull/33)).
+
+While working on nx-parallel, I found [this playlist/course](https://youtube.com/playlist?list=PLp6ek2hDcoNBAyEJmxsOowMYNTKsUmTZ8&si=-JYAMYM0pC4NJRye) helpful for understanding many of the theoretical concepts in parallel computing. It's a bit old but the concepts are still the same and I used to refer to it sometimes.
+
+## Future todos for nx-parallel
+
+- Right now, joblib's default backend, `loky`, is used for all the algorithms in nx-parallel. A potential issue with that is that when `n` parallel processes are created, `n` copies of the same graph (and all other variables in the namespace) are also created, because all processes run independently. So, it won't be very memory-efficient when the graph is really huge. Using a different joblib backend like Dask or Ray, etc., and having something like distributed graph algorithms, wherever possible, might be helpful. Then, having something like a [MapReduce model](https://en.wikipedia.org/wiki/MapReduce) in nx-parallel would also make sense, as proposed in [this PR](https://github.com/networkx/nx-parallel/pull/7). Implementing a distributed algorithm for `number_of_isolates` might be an easy starting point.
+- adding `config` to nx-parallel once [this PR](https://github.com/networkx/networkx/pull/7225) gets merged in the networkx repository (and also adding `config`-related tests and benchmarks). This will let the user play around with a lot of stuff using nx-parallel!
+- having consistent heatmaps and improving the timing script (ref. [issue](https://github.com/networkx/nx-parallel/issues/51))
+- cleaning up processes once the algorithm is done running and seeing if that improves the performance in any way. (ref. [issue](https://github.com/joblib/joblib/issues/945))
+- [Erik](https://github.com/eriknw) brought up in one of the meetings that using `bellman-ford` as the `method` (instead of the default `dijkstra`) in the `all_pairs_all_shortest_paths` algorithm improves speedups in nx-cugraph. So, I want to try and see if that's also the case in nx-parallel.
+- Question: should we keep algorithms in nx-parallel whose sequential implementation is already really fast and that don't show any speedups for the random graphs we use in the current timing script (like `number_of_isolates`, `all_pairs_shortest_path_length`, etc.)?
+- experiment and try to get speedups in `global_reaching_centrality` and `local_reaching_centrality` algorithms.
+- and obviously, adding more algorithms, and a lot more... :)
+
+For more, have a look at [my GSoC'24 proposal for expanding on nx-parallel](https://docs.google.com/document/d/1xF8dW5-1OAapsTnvsdp9EBEuWAf-Jbxq3kC9kTroAiE/edit?usp=sharing).
+
+## A few other contributions
+
+While I was working on nx-parallel, I also worked on and reviewed a few PRs in networkx as well. For details, refer to the meeting notes above, but the two PRs that I think are worth mentioning here are [adding and updating the "Backend" section docs](https://github.com/networkx/networkx/pull/7305) and [adding `sort_neighbors` to DFS functions](https://github.com/networkx/networkx/pull/7196). The first one was quite fun and the second one was quite simple. Also, various tests were failing in nx-parallel because the workflow tests in nx-parallel run on a different OS, so I created some issues/PRs related to those as well (most of them got fixed by adding a `seed` value). I also created some issues/PRs in other projects like joblib, asv, etc., related to issues I encountered while working on nx-parallel. And reviewing PRs really helped me create better PRs that were easier to review.
+
+[All of my involvements in nx-parallel](https://github.com/search?q=repo%3Anetworkx%2Fnx-parallel+involves%3ASchefflera-Arboricola&type=issues).
+
+
+
+**A quick side note**: One thing that you do as an Independent Contractor that you don't do as an Outreachy intern is keep track of your hours of service and submit them at the end of the month in an invoice document through [Open Collective](https://opencollective.com/how-it-works). It can be challenging at times to determine what constitutes a "service" when you're learning most things on the go. To simplify things, I organized my work items such that each week was one work item, and the number of hours was an estimate of the average hours I worked that week. While creating the invoice document, I would add more details to the weekly work updates that I was already emailing to my mentors. But I don't think that was the best way to do it, because the invoice document is not visible to everyone, and it would have been nicer if I had added the exact work items instead of just 4 work items (work done in week1, week2, week3, week4). But I have all my work updates in [these meeting notes](https://hackmd.io/@Schefflera-Arboricola/HkHl6IPaa), if you want to see them.
+
+## An illustration summarising nx-parallel (so far) :)
+
+{{< image >}}
+src = 'aditij_nxp_illustration.png'
+alt = 'illustration summarising nx-parallel (so far)'
+width = 800
+align = 'center'
+{{< /image >}}
+
+Thank you :)
diff --git a/content/posts/networkx/nx_parallel/_index.md b/content/posts/networkx/nx_parallel/_index.md
new file mode 100644
index 0000000..a57b955
--- /dev/null
+++ b/content/posts/networkx/nx_parallel/_index.md
@@ -0,0 +1,3 @@
+---
+title: Working on nx-parallel
+---