[WIP] import experiment-base #47

Draft: wants to merge 9 commits into main

Conversation


@lgarithm commented Jan 22, 2025

  • migrate k8s cluster tools from experiments-base
  • clean up README docs to refer to only one virtual environment terminal
  • fix broken figure links
  • try more experiments to make sure they are still working
    • kernels_mpi (Fig.9b)
    • kernels_omp (Fig.10)
    • lammps (Fig.9a) (stuck at 7 MPI processes with the network workload; did not finish in 2 hours, tried twice)
    • elastic (Fig.12)
    • lulesh
    • makespan (fails with the traceback below; see the sketch after this list)
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lg/code/repos/github.com/lgarithm/granny-experiments/tasks/makespan/scheduler.py", line 353, in thread_pool_thread
    run_kubectl_cmd("makespan", exec_cmd)
UnboundLocalError: local variable 'exec_cmd' referenced before assignment

  • migration (Fig.11)
  • motivation
  • openmpi
  • polybench
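
For reference, the UnboundLocalError hit in the makespan run is the usual Python pattern of a local name that is only assigned inside some conditional branches. Below is a minimal sketch of that failure mode, assuming thread_pool_thread picks exec_cmd per workload; the workload names and commands are hypothetical, only thread_pool_thread, run_kubectl_cmd and exec_cmd come from the traceback.

# Hypothetical reduction of the failure mode in scheduler.py's thread_pool_thread:
# exec_cmd is only bound inside certain branches, so a workload that matches
# none of them reaches run_kubectl_cmd with the name still unbound.
def run_kubectl_cmd(namespace, cmd):
    # Stand-in for the real helper; just prints what would be executed.
    print("kubectl -n {} exec ... -- {}".format(namespace, cmd))

def thread_pool_thread(workload):
    if workload == "mpi":
        exec_cmd = "mpirun -np 4 ./bench"
    elif workload == "omp":
        exec_cmd = "./bench --threads 4"
    # No fallback branch: any other workload leaves exec_cmd unassigned,
    # so the next line raises UnboundLocalError.
    run_kubectl_cmd("makespan", exec_cmd)

thread_pool_thread("network")  # UnboundLocalError: local variable 'exec_cmd' ...

If that is indeed the cause, an explicit else branch that raises a descriptive error (or a default assignment before the conditional) would surface the unsupported workload instead of failing inside the worker process.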

@lgarithm marked this pull request as draft on January 22, 2025, 10:50
@csegarragonz (Contributor)

hey @lgarithm could you please also update the readme so that the links to the plots contain a reference to the figure number in the paper?

many thanks!!

@csegarragonz (Contributor)

hey @lgarithm many thanks for pushing on this, it's definitely getting there!!

i would say we don't need to re-run all the plots, just make sure that each experiment runs for some cluster size.
you can discard the changes in the data files so we keep the old results.

@lgarithm (Author) commented Feb 5, 2025

> i would say we don't need to re-run all the plots, just make sure that each experiment runs for some cluster size. you can discard the changes in the data files so we keep the old results.

What do you mean by the old results? Currently this repo completely ignores the results and the plots folder.

@csegarragonz (Contributor)

> What do you mean by the old results? Currently this repo completely ignores the results and the plots folder.

That is true, maybe let's keep it that way. I may try to commit the results and the plots at some point later.

@lgarithm (Author) commented Feb 6, 2025

The network workload of LAMMPS always got stuck at 7 MPI processes:

Running LAMMPS on Granny with 7 MPI processes (workload: network, run: 1/1)

This happened multiple times.
