### [comment:1](\#comment:1) follow-ups: [2](\#comment:2) [3](\#comment:3) Changed [8 years ago](/timeline?from=2013-09-07T15%3A56%3A13Z&precision=second "See timeline at Sep 7, 2013 3:56:13 PM") by zzz
Component: unspecified → router/general
Milestone: 0.9.8 → 0.9.9
This could be due to clock skew (although I've taken several stabs at fixing it over the years) or something else. In the past I tested by just running Java in the foreground (no wrapper) and then suspending it with ^Z. I think the 'check network' message is just a result of zero connected peers.
If you have any relevant logs please include them here.
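
For context, a clock-skew trigger of this sort can be spotted with a periodic task that compares the wall clock against the time it expected to wake up and flags anything far beyond its scheduling interval. A minimal sketch of that idea in Java (class name and threshold are illustrative, not the actual net.i2p.util.Clock implementation):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only, not I2P's actual clock code: a periodic task that
// compares the wall clock against the time it expected to wake up at, and flags
// a "large clock shift" when the difference is well beyond the scheduling
// interval (e.g. after hibernation or a paused VM).
public class ClockShiftDetector implements Runnable {
    static final long INTERVAL_MS  = 60_000;       // check once a minute
    static final long MAX_SHIFT_MS = 10 * 60_000;  // arbitrary 10-minute threshold

    private long _lastCheck = System.currentTimeMillis();

    @Override
    public void run() {
        long now   = System.currentTimeMillis();
        long drift = now - _lastCheck - INTERVAL_MS;  // time beyond the expected wakeup
        _lastCheck = now;
        if (Math.abs(drift) > MAX_SHIFT_MS) {
            // Roughly where a router would log "Large clock shift ... by Xm"
            // and decide whether anything needs restarting or adjusting.
            System.out.println("Large clock shift " + (drift > 0 ? "forward" : "backward")
                               + " by " + (Math.abs(drift) / 60_000) + "m");
        }
    }

    public static void main(String[] args) {
        Executors.newSingleThreadScheduledExecutor()
                 .scheduleAtFixedRate(new ClockShiftDetector(),
                                      INTERVAL_MS, INTERVAL_MS, TimeUnit.MILLISECONDS);
    }
}
```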
### [comment:2](\#comment:2) in reply to: [1](\#comment:1) Changed [8 years ago](/timeline?from=2013-09-13T15%3A41%3A13Z&precision=second "See timeline at Sep 13, 2013 3:41:13 PM") by DISABLED
Replying to [zzz](/ticket/1014#comment:1 "Comment 1"):
> This could be due to clock skew (although I've taken several stabs at fixing it over the years) or something else. In the past I tested by just running Java in the foreground (no wrapper) and then suspending it with ^Z. I think the 'check network' message is just a result of zero connected peers.
>
> If you have any relevant logs please include them here.
I stopped the VM for a while. This time I2P restarted once the VM came back up; it doesn't usually do this, and instead remains in a dead state.
```
13/09/13 16:37:51 CRIT [leTimer2 3/4] net.i2p.util.Clock : Large clock shift forward by 36m
13/09/13 16:37:51 ERROR [uterWatchdog] client.ClientManagerFacadeImpl: Client XXXX has a leaseSet that expired 27m
13/09/13 16:37:51 ERROR [uterWatchdog] client.ClientManagerFacadeImpl: Client XXXX has a leaseSet that expired 29m
13/09/13 16:37:51 ERROR [uterWatchdog] client.ClientManagerFacadeImpl: Client XXXX has a leaseSet that expired 33m
13/09/13 16:37:51 ERROR [uterWatchdog] client.ClientManagerFacadeImpl: Client XXXX has a leaseSet that expired 32m
13/09/13 16:37:51 ERROR [uterWatchdog] 2p.router.tasks.RouterWatchdog: Ready and waiting jobs: 90
13/09/13 16:37:51 ERROR [uterWatchdog] 2p.router.tasks.RouterWatchdog: Job lag: 2202476
13/09/13 16:37:51 ERROR [uterWatchdog] 2p.router.tasks.RouterWatchdog: Participating tunnel count: 2438
13/09/13 16:37:51 ERROR [uterWatchdog] 2p.router.tasks.RouterWatchdog: 1minute send processing time: 175.88627912006055
13/09/13 16:37:51 ERROR [uterWatchdog] 2p.router.tasks.RouterWatchdog: Outbound send rate: 309949.405059494 Bps
13/09/13 16:37:51 ERROR [uterWatchdog] 2p.router.tasks.RouterWatchdog: Memory: 123.76M/328.69M
13/09/13 16:37:51 CRIT [uterWatchdog] 2p.router.tasks.RouterWatchdog: Router appears hung, or there is severe network congestion. Watchdog starts barking!
13/09/13 16:37:51 ERROR [leTimer2 3/4] net.i2p.router.Router : Restarting after large clock shift forward by 36m
13/09/13 16:37:52 WARN [ handler 1/1] er.transport.udp.PacketHandler: NTP failure, UDP adjusting clock by 36m
13/09/13 16:37:52 ERROR [uter Restart] net.i2p.router.Router : Stopping the router for a restart...
13/09/13 16:37:52 WARN [uter Restart] net.i2p.router.Router : Stopping the client manager
```
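Worth noting: the reported job lag of 2202476 ms is about 36.7 minutes, almost exactly the size of the 36-minute clock shift, so most of the "lag" the watchdog sees here is likely the suspension itself rather than real congestion. A quick check of the arithmetic (values copied from the log above):

```java
public class JobLagCheck {
    public static void main(String[] args) {
        long loggedLagMs = 2_202_476L;       // "Job lag: 2202476" (milliseconds)
        long shiftMs     = 36 * 60 * 1000L;  // "Large clock shift forward by 36m"
        System.out.printf("logged lag  = %.1f min%n", loggedLagMs / 60_000.0);      // ~36.7 min
        System.out.printf("lag - shift = %d s%n", (loggedLagMs - shiftMs) / 1000);  // ~42 s
    }
}
```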
### [comment:3](\#comment:3) in reply to: [1](\#comment:1) Changed [8 years ago](/timeline?from=2013-09-14T13%3A10%3A16Z&precision=second "See timeline at Sep 14, 2013 1:10:16 PM") by DISABLED
Replying to [zzz](/ticket/1014#comment:1 "Comment 1"):
> I think the 'check network' message is just a result of zero connected peers.
Maybe. It stays this way forever until I hit restart, whether for hours or days. It's annoying for users on laptops who hibernate and want to use their client tunnels immediately, or within a minute or three of waking up. In some cases it recovers; I'm told that in some instances (20-30s) it won't. The instance above was a shift of over 30 minutes.
### [comment:4](\#comment:4) Changed [6 years ago](/timeline?from=2015-01-03T14%3A03%3A25Z&precision=second "See timeline at Jan 3, 2015 2:03:25 PM") by str4d
Keywords: _hang_ _time_ added
Milestone: 0.9.9 →
### [comment:5](\#comment:5) Changed [6 years ago](/timeline?from=2015-06-19T22%3A15%3A15Z&precision=second "See timeline at Jun 19, 2015 10:15:15 PM") by dg
This is still an issue on Windows 8.1 with 0.9.20. A lot of laptops will hibernate when their lid is shut or they are left on overnight.
The console shows "Check network connection and NAT/firewall".
Relevant (and only) logs:
```
19/06/15 16:07:12 CRIT [NTCP Pumper ] net.i2p.util.Clock : Large clock shift forward by 13h
18/06/15 16:01:42 CRIT [uildExecutor] net.i2p.util.Clock : Large clock shift forward by 29m
18/06/15 05:10:34 CRIT [uterWatchdog] 2p.router.tasks.RouterWatchdog: Router appears hung, or there is severe network congestion. Watchdog starts barking!
18/06/15 05:09:57 CRIT [NTCP Pumper ] net.i2p.util.Clock : Large clock shift forward by 2h
17/06/15 16:03:04 CRIT [NTCP Pumper ] net.i2p.util.Clock : Large clock shift forward by 7h
17/06/15 05:28:20 CRIT [uterWatchdog] 2p.router.tasks.RouterWatchdog: Router appears hung, or there is severe network congestion. Watchdog starts barking!
17/06/15 05:27:45 CRIT [ Establisher] net.i2p.util.Clock : Large clock shift forward by 2h
```
### [comment:6](\#comment:6) Changed [6 years ago](/timeline?from=2015-09-27T19%3A22%3A25Z&precision=second "See timeline at Sep 27, 2015 7:22:25 PM") by dg
Owner: set to _dg_
Status: new → accepted
Attempting a fix for this.
```
<+dg> zzz: Would you be able to look at #1014 at some point? Or any thoughts so that I can have a stab. It's a recurring problem that I come across
<&zzz> dg all I can give you is some debugging ideas
<&zzz> set default log level to warn
<&zzz> try to figure out if it's a transport issue or a tunnel issue or a timer issue or a soft-restart issue or ...
<&zzz> forward clock shifts are generally an easier problem than backwards
<&zzz> see #1634 for a backwards problem, and how far i got before I got stuck
<&zzz> suspend problems may be easier to debug on linux if you can just suspend (^Z) the process (all da threads) somehow
```
### [comment:7](\#comment:7) Changed [6 years ago](/timeline?from=2015-09-27T19%3A41%3A58Z&precision=second "See timeline at Sep 27, 2015 7:41:58 PM") by dg
Milestone: → 0.9.23
### [comment:8](\#comment:8) Changed [6 years ago](/timeline?from=2015-09-27T21%3A18%3A29Z&precision=second "See timeline at Sep 27, 2015 9:18:29 PM") by dg
Status: accepted → testing
Committed in 0.9.22-11 8dec222a2c0e619fd455367b3647b87bf349c6dc.
Needs some testing, especially on Windows.
### [comment:9](\#comment:9) Changed [6 years ago](/timeline?from=2015-10-03T14%3A46%3A07Z&precision=second "See timeline at Oct 3, 2015 2:46:07 PM") by zzz
Status: testing → needs\_work
As discussed on IRC:
The commenting out of the active peer count check is a good fix.
The uncommenting of the peer manager and job queue restarts is troublesome.
The peer manager restart saves and reloads all the profiles, which is pointless. Let's figure out what the real change required to peer manager is. Adjusting the time in the stats? Rerunning the organizer? or?
The job queue restart removes all pending jobs, including ones like changing the netdb rotation key at midnight. As with the peer manager, it seems like some adjustment to the job queue, or something similar, is required.
I said that I thought the job queue was a ClockShiftListener, but it's not, it's a ClockUpdateListener. That should be investigated; perhaps that's how we can fix it.
tl;dr NACK the changes; we need to come up with something more fine-grained, not the restart sledgehammer (see the sketch below for one possible direction). Holler if you need help.
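
A minimal sketch of what that more fine-grained approach might look like, assuming the job queue were wired up as a clock-shift listener and simply re-based its pending jobs by the shift delta instead of being restarted (interface, class, and method names below are illustrative, not the actual I2P JobQueue or ClockShiftListener API):

```java
// Rough sketch of the fine-grained alternative: instead of restarting the job
// queue (and dropping pending jobs), have it listen for clock shifts and re-base
// each pending job's target time by the shift delta. Names are illustrative,
// not the actual I2P JobQueue / ClockShiftListener API.
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class ClockShiftSketch {

    interface ClockShiftListener {
        /** delta > 0 means the clock jumped forward by that many milliseconds */
        void clockShift(long deltaMs);
    }

    static class PendingJob {
        final String name;
        long startAfter;   // absolute wall-clock time the job should run at
        PendingJob(String name, long startAfter) { this.name = name; this.startAfter = startAfter; }
    }

    static class SimpleJobQueue implements ClockShiftListener {
        private final PriorityQueue<PendingJob> _timedJobs =
            new PriorityQueue<>(Comparator.comparingLong((PendingJob j) -> j.startAfter));

        synchronized void add(PendingJob job) { _timedJobs.add(job); }

        /**
         * Keep every pending job but move its target time by the shift delta,
         * preserving the remaining relative delay across the shift. Whether
         * absolute-time jobs (like the midnight netdb rotation mentioned above)
         * should be exempt is exactly the kind of detail that needs investigating.
         */
        @Override
        public synchronized void clockShift(long deltaMs) {
            List<PendingJob> pending = new ArrayList<>(_timedJobs);
            _timedJobs.clear();
            for (PendingJob job : pending) {
                job.startAfter += deltaMs;
                _timedJobs.add(job);
            }
        }
    }

    public static void main(String[] args) {
        SimpleJobQueue queue = new SimpleJobQueue();
        queue.add(new PendingJob("rebuild expired leaseSets", System.currentTimeMillis() + 10 * 60_000));
        queue.clockShift(36 * 60_000L);   // simulate the 36m forward shift from comment 2
    }
}
```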