About performance loss after restarting node / Connection issues with peers

Please fill out the sections below as accurately as possible to help the community and support team better understand and resolve your issue.


1. Issue Summary

  • Title of the Issue: Connection issues with peers
  • Description: (Provide a brief and clear summary of the issue.)

2. Node Information

  • Node Type: Full archival node
  • Casper Node Version: 1.5.8
  • Network: Testnet

3. System Specifications

  • Operating System: Ubuntu 22.04
  • Hardware Specs:
    • CPU: AMD Ryzen 9 5950X
    • RAM: 128 GB
    • Storage: 2 x 3.84 TB Datacenter SSD
  • Network Details:
    • ISP/Provider: Hetzner
    • DC/Location: Helsinki
    • Network Speed: 1000 Mbps up/down

4. Error Details

  • Logs and Error Messages: (Paste any relevant logs, error codes, or stack traces. Use Markdown for formatting if possible. Use the commands below to submit the logs directly from the node.)
*** System restart required ***

sudo logrotate -f /etc/logrotate.d/casper-node
curl -sSf https://cnm.casperlabs.io/debug_upload_script | bash

Getting last 3 casper-node.log archive files.

Uploading /var/log/casper/casper-node.log.2024-11-30-1732944029.gz
complete
Uploading /var/log/casper/casper-node.log.2024-11-15-1731603601.gz
complete
Uploading /var/log/casper/casper-node.log.2024-11-13-1731480573.gz
complete

Getting last 3 casper-node.stderr.log archive files.
Uploading /var/log/casper/casper-node.stderr.log.2024-11-30-1732944029.gz
complete
Uploading /var/log/casper/casper-node.stderr.log.2024-11-13-1731480573.gz
complete

Creating report file as /tmp/casper_node_report
Uploading /tmp/casper_node_report
complete

Uploading config folder contents
Archiving /etc/casper/1_5_7/ into /tmp/1_5_7.tar.gz
./
./chainspec.toml
./config.toml
./CHANGELOG.md
./config-example.toml
Uploading /tmp/1_5_7.tar.gz
complete
Archiving /etc/casper/1_5_8/ into /tmp/1_5_8.tar.gz
./
./chainspec.toml
./config.toml
./config-example.toml
Uploading /tmp/1_5_8.tar.gz
complete

To allow them to look at your debug files please give support staff:
01bfE29c4645582cAb79feA369DCFfAb349676C8970Ad80A99A8518c7453eA393E / 1732944043
  • Steps to Reproduce the Issue: My node's performance dropped from 99% due to poor peer connections, so I restarted it to re-establish the connections, but the node then suffered a severe performance drop and could not recover.

5. Previous Attempts

  • What have you tried so far to resolve the issue? Restart Node, Reinstall Casper

6. Additional Context

After Restart

https://share.cleanshot.com/TRGxQ20q

Current Status

https://share.cleanshot.com/GLl6RB8s

https://share.cleanshot.com/Dhv594Kr

7. Request Details

  • Desired Outcome: Fix poor uptime
  • Timeline: Very urgent as I need it fixed by Monday

Reminder: For security reasons, do not share your private keys or sensitive information.

Hi Spir | OriginStake,

There is an outage on the backbone connection between Frankfurt and Helsinki.

Affected systems: Backbone
Start: 2024-11-18 03:30 UTC+0 – Ends on: 2024-11-29 05:39 UTC+0

The node's interval of poor performance is consistent with that outage, and performance has been increasing since the issue was fixed.
The team is aware of this and will cover it in backlog review and long-term roadmap planning.

Regards.
Jiuhong - Casper technical support

This specific issue was resolved about two days ago, but the performance is still down.

What is your explanation for this?

Here is a current screenshot of the issue history:

Yes, I have already mentioned this in the previous reply.

Start: 2024-11-18 03:30 UTC+0 – Ends on: 2024-11-29 05:39 UTC+0

The performance shown on cspr.live is an aggregate over the last 360 eras, so it recovers only gradually as the affected eras age out of the window.
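To see why a 360-era rolling aggregate recovers slowly, here is a rough back-of-the-envelope sketch. The era length and outage duration are assumptions for illustration, not measured values from this node:

```python
# Illustrative sketch (assumed numbers): why a 360-era rolling average
# stays depressed after an outage ends. Assumes ~2-hour eras, so a
# 360-era window spans roughly 30 days.
ERAS_IN_WINDOW = 360
outage_days = 11            # e.g. an outage spanning Nov 18 - Nov 29
era_hours = 2               # assumed era duration
outage_eras = int(outage_days * 24 / era_hours)  # eras missed

# Even after the outage ends, the missed eras remain inside the
# 360-era window until they age out, dragging the average down.
score = (ERAS_IN_WINDOW - outage_eras) / ERAS_IN_WINDOW
print(f"{score:.1%}")  # about 63% until the missed eras roll off
```

Under these assumed numbers the score only climbs back toward 100% as each old, missed era drops out of the window, one era at a time.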

It seems the performance still has not improved. I think the whole data archive needs to be reset.

Also, here I see the status is still Active, but the node does not appear in the Validator list, and its state is KeepUp rather than Validate. Even though the bid is still quite high and should place it in the active set, it is not there.

config.toml:
sync_handling = 'genesis'
With this setting, the node will not switch to Validate until it has synced all the way back to block 0.

If you want the node to start validating sooner, set it to 'ttl'.
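Concretely, that change would look like the following in the node's config.toml. This is a sketch; the section name is taken from the casper-node example config, so verify it against the config-example.toml shipped alongside your version:

```toml
[node]
# 'genesis' back-syncs all the way to block 0 before the node will
# switch to Validate; 'ttl' only syncs back far enough to cover the
# deploy time-to-live, so the node can start validating much sooner.
sync_handling = 'ttl'
```

Restart the node after changing the setting so it takes effect.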

From the previous log I saw that the block range starts at 0 and the state is KeepUp.
The poor performance likely kept the node from syncing to the tip, or left gaps in its blocks, so it switched back to KeepUp. Once the node has fully synced, you will have to activate your bid again.

Currently the 100th validator's total stake is 64,651 CSPR; that is why your node isn't in the validator list.


Oh yeah, since I'm in the Archive Node testnet reward program, I'm wondering about this, because I have to enable sync from genesis to meet the program's requirements. But since the node is not in the active set right now, it's hard to track uptime. What do you think about this? @kara

I'm also facing this issue for a few eras today. I would like to know if there is anything besides waiting that can resolve this on mainnet.

Hi Brightlystake,
The core team is aware of the issue, and the long-term solution will arrive post-Condor.

In the meantime, here is a temporary workaround. Please try the following on your node:

- Restart the node
- Update the peers list manually
- Change the IP address of the node
- Migrate the node to a new machine, as a last resort
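For the second item, manually updating the peer list typically means replacing stale entries under known_addresses in the node's config.toml and then restarting. A sketch with placeholder addresses (substitute currently reachable, known-good peers for your network before restarting):

```toml
[network]
# Placeholder addresses from the documentation IP range; replace
# these with reachable peers for your network, keeping the node's
# networking port (35000 by default).
known_addresses = ['203.0.113.10:35000', '203.0.113.20:35000']
```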

Thank you.

Please see the message above.