
RegisterClientDuration differs significantly between two scheduler machines #120

Closed
sonyafenge opened this issue Aug 2, 2022 · 2 comments

@sonyafenge (Collaborator)

Repro Steps:

08/01/2022 [grs][730Release][test-5.2] Same Region - 20 schedulers * 25K machines per scheduler - metrics disabled, RP down pattern

https://github.com/yb01/arktos/wiki/730-test

Extra config

--enable_metrics=false
SCHEDULER_REQUEST_MACHINE=25000
SCHEDULER_NUM=20

Build

sonyali@sonya-grs-resourcemanagement:~/go/src/global-resource-service$ git log --oneline
d27161b (HEAD -> main, upstream/main) Delay 2 hours when one RP down for one region (#117)
e98b0fc Fix redis batch persist issue, add tests in UT. (#115)
45c9c49 Add resource region simulator 'Daily' data pattern and simulate method makeOneRPdown (#113)
0e3988c Add p50/90/99 to client watch report (#111)
sonyali@sonya-grs-test-template:~/go/src/global-resource-service$ git log --oneline
d27161b (HEAD -> main, upstream/main) Delay 2 hours when one RP down for one region (#117)
e98b0fc Fix redis batch persist issue, add tests in UT. (#115)
45c9c49 Add resource region simulator 'Daily' data pattern and simulate method makeOneRPdown (#113)
0e3988c Add p50/90/99 to client watch report (#111)

########
Env setup
########

sonyali@sonya-grs-test1:~/go/src/global-resource-service$ export GRS_INSTANCE_PREFIX=grs-down-dismt AUTORUN_E2E=true SIM_NUM=5 CLIENT_NUM=2 SERVER_NUM=1
sonyali@sonya-grs-test1:~/go/src/global-resource-service$ export SERVER_ZONE=us-central1-a   SIM_ZONE=us-central1-a CLIENT_ZONE=us-central1-a
sonyali@sonya-grs-test1:~/go/src/global-resource-service$ export SERVICE_EXTRA_ARGS="--enable_metrics=false"
sonyali@sonya-grs-test1:~/go/src/global-resource-service$ export SIM_REGIONS="Beijing,Shanghai,Wulan,Guizhou,Reserved1" SIM_RP_NUM=10 NODES_PER_RP=20000 SCHEDULER_REQUEST_MACHINE=25000 SCHEDULER_REQUEST_LIMIT=26000 SCHEDULER_NUM=20
sonyali@sonya-grs-test1:~/go/src/global-resource-service$ export SIM_DATA_PATTERN=Outage SIM_WAIT_DOWN_TIME=3,8,13,18,23
sonyali@sonya-grs-test1:~/go/src/global-resource-service$ ./hack/test-setup.sh
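
As a sanity check after test-setup.sh finishes (not part of the original repro), the provisioned VMs can be listed by the GRS_INSTANCE_PREFIX. This assumes the setup script creates GCE instances in the configured zone, which the instance names in the logs below suggest:

# Hypothetical verification step: list VMs created under the prefix above (assumes GCE, zone us-central1-a)
gcloud compute instances list --filter="name~'grs-down-dismt'" --zones=us-central1-a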

Expected results:

  1. The two scheduler machines report almost the same "RegisterClientDuration".

Actual results (logs at sonyadev4: /home/sonyali/grs/logs/1se5si2cl/080122-235326): the dsd9 client machine reports ~314-568ms while the rgsc client machine reports only ~5-19ms.

file name RegisterClientDuration
grs-down-dismt-client-us-central1-a-mig-dsd9.log.0 392.368136ms
grs-down-dismt-client-us-central1-a-mig-dsd9.log.1 568.114784ms
grs-down-dismt-client-us-central1-a-mig-dsd9.log.10 314.150714ms
grs-down-dismt-client-us-central1-a-mig-dsd9.log.2 369.051398ms
grs-down-dismt-client-us-central1-a-mig-dsd9.log.3 353.756672ms
grs-down-dismt-client-us-central1-a-mig-dsd9.log.4 338.66497ms
grs-down-dismt-client-us-central1-a-mig-dsd9.log.5 360.367899ms
grs-down-dismt-client-us-central1-a-mig-dsd9.log.6 364.995705ms
grs-down-dismt-client-us-central1-a-mig-dsd9.log.7 357.518766ms
grs-down-dismt-client-us-central1-a-mig-dsd9.log.8 359.053327ms
grs-down-dismt-client-us-central1-a-mig-dsd9.log.9 341.675423ms
grs-down-dismt-client-us-central1-a-mig-rgsc.log.0 18.804064ms
grs-down-dismt-client-us-central1-a-mig-rgsc.log.1 6.844991ms
grs-down-dismt-client-us-central1-a-mig-rgsc.log.2 5.535613ms
grs-down-dismt-client-us-central1-a-mig-rgsc.log.3 5.131245ms
grs-down-dismt-client-us-central1-a-mig-rgsc.log.4 5.062162ms
grs-down-dismt-client-us-central1-a-mig-rgsc.log.5 5.258271ms
grs-down-dismt-client-us-central1-a-mig-rgsc.log.6 5.09831ms
grs-down-dismt-client-us-central1-a-mig-rgsc.log.7 5.17747ms
grs-down-dismt-client-us-central1-a-mig-rgsc.log.8 4.816153ms
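
For reference, a per-log summary like the one above can be pulled from the client logs with a small shell loop along these lines (a sketch only; it assumes each client log contains a line mentioning RegisterClientDuration with the duration as the last whitespace-separated token):

# Hypothetical extraction of RegisterClientDuration per client log file
for f in grs-down-dismt-client-*.log.*; do
  printf '%s %s\n' "$f" "$(grep -m1 RegisterClientDuration "$f" | awk '{print $NF}')"
done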
sonyafenge self-assigned this Aug 3, 2022
@yb01 (Collaborator) commented Aug 8, 2022

Test with 10 schedulers per machine on the test infra first.
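
If that follow-up run reuses the setup from this issue, it would presumably look something like the re-export below (a sketch, not from the report; in particular CLIENT_NUM=4, chosen to keep 20 schedulers total across 4 client machines, is an assumption):

export GRS_INSTANCE_PREFIX=grs-down-dismt AUTORUN_E2E=true SIM_NUM=5 CLIENT_NUM=4 SERVER_NUM=1
export SCHEDULER_NUM=10 SCHEDULER_REQUEST_MACHINE=25000 SCHEDULER_REQUEST_LIMIT=26000
./hack/test-setup.sh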

@yb01 (Collaborator) commented Oct 14, 2022

No repro anymore.

yb01 closed this as completed Oct 14, 2022