Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some deals won't index #1768

Closed
8 of 12 tasks
cryptowhizzard opened this issue Oct 23, 2023 · 10 comments
Closed
8 of 12 tasks

Some deals won't index #1768

cryptowhizzard opened this issue Oct 23, 2023 · 10 comments
Labels

Comments

@cryptowhizzard
Copy link
Contributor

Checklist

  • This is not a question or a support request. If you have any boost related questions, please ask in the discussion forum.
  • This is not a new feature request. If it is, please file a feature request instead.
  • This is not an enhancement request. If it is, please file a improvement suggestion instead.
  • I have searched on the issue tracker and the discussion forum, and there is no existing related issue or discussion.
  • I am running the Latest release, or the most recent RC(release canadiate) for the upcoming release or the dev branch(master), or have an issue updating to any of these.
  • I did not make any code changes to boost.

Boost component

  • boost daemon - storage providers
  • boost client
  • boost UI
  • boost data-transfer
  • boost index-provider
  • Other

Boost Version

Latest master

Describe the Bug

Some deals won't index properly.

The initial error visible is:

Paused at 'Sealer: Proving': failed to add index and announce deal: failed to add deal to piece metadata store: adding index for piece baga6ea4seaqdk66rv6raydpdiajbucyianrta6yisykbxlmnbbqzi25psn6widq: adding CAR index for piece baga6ea4seaqdk66rv6raydpdiajbucyianrta6yisykbxlmnbbqzi25psn6widq: add index with piece cid baga6ea4seaqdk66rv6raydpdiajbucyianrta6yisykbxlmnbbqzi25psn6widq: inserting into PayloadToPieces: gocql: no response received from cassandra within timeout period

However , when taking a closer look on the deal I see:

Scherm­afbeelding 2023-10-23 om 21 33 52

lotus state get-deal 59168307
{
"Proposal": {
"PieceCID": {
"/": "baga6ea4seaqdk66rv6raydpdiajbucyianrta6yisykbxlmnbbqzi25psn6widq"
},
"PieceSize": 34359738368,
"VerifiedDeal": true,
"Client": "f02144497",
"Provider": "f02366527",
"Label": "bafyreiaomri2xbxcwlqx2mbhn5vajjp2jiuwhdqzqpgf2t7uh3wyjt67oi",
"StartEpoch": 3334674,
"EndEpoch": 4803474,
"StoragePricePerEpoch": "0",
"ProviderCollateral": "9040517304960330",
"ClientCollateral": "0"
},
"State": {
"SectorStartEpoch": 3315429,
"LastUpdatedEpoch": -1,
"SlashEpoch": -1,
"VerifiedClaim": 41468960
}
}

boostd lid gen-index baga6ea4seaqdk66rv6raydpdiajbucyianrta6yisykbxlmnbbqzi25psn6widq
Generating index for piece baga6ea4seaqdk66rv6raydpdiajbucyianrta6yisykbxlmnbbqzi25psn6widq
Error: getting piece deals: listing deals for piece baga6ea4seaqdk66rv6raydpdiajbucyianrta6yisykbxlmnbbqzi25psn6widq: getting piece metadata: piece baga6ea4seaqdk66rv6raydpdiajbucyianrta6yisykbxlmnbbqzi25psn6widq: not found

Logging Information

See above

Repo Steps

  1. Run '...'
  2. Do '...'
  3. See error '...'
    ...
@cryptowhizzard
Copy link
Contributor Author

cryptowhizzard commented Oct 23, 2023

When I hit retry , the status of the deal changes to:

Deal not found in sector.

However:

lotus state get-deal 59168307
{
"Proposal": {
"PieceCID": {
"/": "baga6ea4seaqdk66rv6raydpdiajbucyianrta6yisykbxlmnbbqzi25psn6widq"
},
"PieceSize": 34359738368,
"VerifiedDeal": true,
"Client": "f02144497",
"Provider": "f02366527",
"Label": "bafyreiaomri2xbxcwlqx2mbhn5vajjp2jiuwhdqzqpgf2t7uh3wyjt67oi",
"StartEpoch": 3334674,
"EndEpoch": 4803474,
"StoragePricePerEpoch": "0",
"ProviderCollateral": "9040517304960330",
"ClientCollateral": "0"
},
"State": {
"SectorStartEpoch": 3315429,
"LastUpdatedEpoch": -1,
"SlashEpoch": -1,
"VerifiedClaim": 41468960
}
}

lotus-miner sectors status 21862
SectorID: 21862
Status: Proving
CIDcommD: baga6ea4seaqdk66rv6raydpdiajbucyianrta6yisykbxlmnbbqzi25psn6widq
CIDcommR: bagboea4b5abcaswk4x6wo6hkz2qjlmdirsgad5exqzs64sexbf57mj6km44viukz
Ticket: 21f62fa3c16f348d42d1b28b1bf484f2e8f29034943744bb996bc6dd4f242605
TicketH: 3313695
Seed: 5bcb66dc684b75f53313443d183b417a898c6fa80a09cc609bc4443504ed9166
SeedH: 3315286
Precommit: bafy2bzacea3yg6hqaytzjqhxhgacliydqtegn4moafmtx54isqzeyecnvcklm
Commit: bafy2bzacedhkw5zd4qci5c2vu7ikxx7rv7vs4ns35vuxopm2e346nmxpj2bqe
Deals: [59168307]
Retries: 0

@LexLuthr LexLuthr added this to the Boost v2 stability milestone Oct 24, 2023
@nonsense
Copy link
Member

nonsense commented Oct 25, 2023

Looks like the full piece was not added to the Local Index Directory. Could you try to recover the sector, with

boostd recover lid --api-fullnode=$FULLNODE_API_INFO  --api-storage=$MINER_API_INFO --sector-id  21862

The command above should re-insert all deals and indices for the sector into the Local Index Directory.


Additionally could you give a bit more information about your setup? The root cause of this issue is: gocql: no response received from cassandra within timeout period, so we should figure out why you were getting timeouts from YugabyteDB in the first place.

  1. What hardware are you using for your YugabyteDB installation?
  2. Is it the only service running on that machine?

@cryptowhizzard
Copy link
Contributor Author

Hello,

boostd recover lid --api-fullnode=$FULLNODE_API_INFO --api-storage=$MINER_API_INFO --api-lid=ws://127.0.0.1:8044 --sector-id=36389
2023-10-25T23:01:10.901+0200 WARN boostd/recover.go:357 recovery directory exists, so will continue from where recovery left off previously
2023-10-25T23:01:10.901+0200 INFO boostd/recover.go:158 running recovery tool on a single sector {"sector": "36389"}
2023-10-25T23:01:10.901+0200 INFO boostd lib/api.go:44 Using full node API at ws://XXXXX:1236/rpc/v1
2023-10-25T23:01:10.907+0200 INFO boostd lib/api.go:74 Using storage API at ws://XXXXXXX:2345/rpc/v0
2023-10-25T23:01:10.909+0200 INFO boostd lib/api.go:74 Using storage API at ws://XXXXXX:2345/rpc/v0
2023-10-25T23:01:10.910+0200 INFO boostd lib/api.go:194 Miner address: f01786XXX
2023-10-25T23:01:12.250+0200 DEBUG boostd/recover.go:685 processing sector {"sector": "36389", "deals": [59869675]}
2023-10-25T23:01:12.262+0200 DEBUG boostd/recover.go:528 processing piece {"sector": "36389", "piececid": "baga6ea4seaqdj2jiytqr6smuxihdhakx4hr24atz4eanvckimy5fsotnmos2ama", "piecesize": 17179869184, "offset": 0, "label": "bafyreie7hfbal2szg3t2fdjiq3beice76wmiiok4426nwggukx7pwvod5u"}
2023-10-25T23:01:12.337+0200 DEBUG boostd/recover.go:507 sa.IsUnsealed return true {"sector": "36389"}

It's sitting here for 12 hours now. No response.

I'm running latest boost master ( Okt 25 ) with timeout patch in it and dealfilters disabled.

Yugabyte running locally and lotus-miner. The rest is external.
Hardware 32 core Intel(R) Xeon(R) Silver 4215R CPU @ 3.20GHz / 384 G RAM , 384 GB Swap ( Emtpy ). Everything on NVME , 6,2 TB samsung.

@LexLuthr
Copy link
Collaborator

@cryptowhizzard Can you please dump boostd logs here?

@cryptowhizzard
Copy link
Contributor Author

@cryptowhizzard Can you please dump boostd logs here?

I sure can , but can you be more specific on what you need from the logs?
I need to censor them from security sensitive content + I don't want to spam you guys with GB's of logfile.

@cryptowhizzard
Copy link
Contributor Author

cat -A100 -B100 from the log on the pieceCiD. Perhaps it helps

logindex.txt

@LexLuthr
Copy link
Collaborator

@cryptowhizzard The supplied logs do not contain any log line about indexing. I would suggest turning off the debug logs for piecedoctor and turn ON debug for "piecedirectory". After that search for logs log.Debugw("add index: get index", "pieceCid", pieceCid)

@nonsense
Copy link
Member

nonsense commented Oct 26, 2023

@cryptowhizzard boostd recover should be quite fast (a few minutes max) after your sector is unsealed, which it is according to your logs. Are you sure the connection to storage is working? The unsealed sector should be transferred and indexed and stored on LID.

As @LexLuthr requested, please send all logs so we can figure out what is wrong with your deployment, but this sounds like a configuration problem.

@cryptowhizzard
Copy link
Contributor Author

cryptowhizzard commented Oct 28, 2023

Scherm­afbeelding 2023-10-28 om 19 25 07

@nonsense

I understand that you ask this question. However this is not a configuration problem I think.

This index errors only start with one client ( The one where the wallet starts with f1ps34em )
Out of the 200K deals I have never expierenced this with other deals.

I scanned a part of my log for the phrases you suggested. Here it is:

2023-10-28T17:38:30.610+0200 DEBUG piecedoc piecedirectory/doctor.go:238 unflagging piece {"piece": "baga6ea4seaqa3eeorklk4jj7okuecauwxse6gase6g2e5n77fqeg5fzppsfboea"}
2023-10-28T17:38:30.733+0200 DEBUG piecedirectory piecedirectory/piecedirectory.go:237 add index: completed {"pieceCid": "baga6ea4seaqmhruynrnh32hfw2pza6r2p4zrq24cr3s3blfx2i6i5cs4cy2tmbq"}
2023-10-28T17:38:30.733+0200 INFO boost-storage-deal logs/log.go:40 deal successfully added to LID {"id": "60914118-c9c5-40c0-9182-4ef5ff67ae9a"}
2023-10-28T17:38:30.733+0200 INFO boost-storage-deal logs/log.go:40 cleaning up deal {"id": "60914118-c9c5-40c0-9182-4ef5ff67ae9a"}
2023-10-28T17:38:30.733+0200 DEBUG piecedirectory piecedirectory/piecedirectory.go:243 add index: get index {"pieceCid": "baga6ea4seaqg3slyxyihrx2bxatq5su4sb67jypqffhrgbtsgsqhnqigrqhvgfa"}
2023-10-28T17:38:30.736+0200 INFO boost-storage-deal logs/log.go:40 deal finished {"id": "60914118-c9c5-40c0-9182-4ef5ff67ae9a"}
2023-10-28T17:38:30.738+0200 INFO boost-storage-deal logs/log.go:40 finished cleaning up deal {"id": "60914118-c9c5-40c0-9182-4ef5ff67ae9a"}
2023-10-28T17:38:30.739+0200 INFO boost-storage-deal logs/log.go:40 finished deal cleanup after successful execution {"id": "60914118-c9c5-40c0-9182-4ef5ff67ae9a"}
2023-10-28T17:38:30.826+0200 DEBUG piecedirectory piecedirectory/piecedirectory.go:251 add index: read index {"pieceCid": "baga6ea4seaqg3slyxyihrx2bxatq5su4sb67jypqffhrgbtsgsqhnqigrqhvgfa"}
2023-10-28T17:38:31.481+0200 DEBUG piecedirectory piecedirectory/doctor.go:111 checkPiece processing {"took": 2.752763905}
2023-10-28T17:38:31.513+0200 INFO index-provider-wrapper indexprovider/wrapper.go:546 skipping ingestion: piece not found {"propCid": "bafyreiaczjyhvu67if5fnxhituu4bcz5cjpjphc5uirmohyuuzrcy2owcy", "piece": "baga6ea4seaqmhruynrnh32hfw2pza6r2p4zrq24cr3s3blfx2i6i5cs4cy2tmbq", "err": "failed to get iterable index: getting piece metadata: piece baga6ea4seaqmhruynrnh32hfw2pza6r2p4zrq24cr3s3blfx2i6i5cs4cy2tmbq: not found"}
2023-10-28T17:38:31.514+0200 INFO boost-storage-deal logs/log.go:40 announced deal to network indexer {"id": "60914118-c9c5-40c0-9182-4ef5ff67ae9a", "announcement-cid": "b"}
2023-10-28T17:38:31.516+0200 INFO boost-storage-deal logs/log.go:40 updated deal checkpoint in DB {"id": "60914118-c9c5-40c0-9182-4ef5ff67ae9a", "old checkpoint": "AddedPiece", "new checkpoint": "IndexedAndAnnounced"}

I checked on this deal and it shows like this:

Scherm­afbeelding 2023-10-28 om 19 34 04

For completeness i also executed : boostd recover lid --api-fullnode=$FULLNODE_API_INFO --api-storage=$MINER_API_INFO --api-lid=ws://127.0.0.1:8044 --sector-id=37807

I attached log.txt for you with all lines in the log around this.

Let me know if you or Lex need the full log. I can zip it and send it over via slack.

log.txt

@nonsense nonsense removed this from the Boost v2 stability milestone Oct 30, 2023
@cryptowhizzard
Copy link
Contributor Author

This problem is solved upgrading to the latest RC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

No branches or pull requests

3 participants