Client idles and stops serving requests #3

Open
Saturnix opened this issue Mar 4, 2020 · 18 comments

@Saturnix

Saturnix commented Mar 4, 2020

(Thanks for making this awesome piece of software!)

I have a little problem: I run my media server on Windows, but I'm behind NAT. I also have a Linux VPS, which I use to expose my local machine thanks to your script.

Linux runs in server mode, Windows runs in client mode.

I run the client by simply opening a cmd.exe tab and running the script as in the example in the docs:

Python27\python.exe natsrv.py --mode client --secret pass123 --local 192.168.1.117:8096 --admin example.com:80

However, if I leave cmd.exe open for a while, example.com:80 eventually stops serving requests. There are no error messages or anything.

I know the error is on the client because if I simply close cmd.exe, reopen it and rerun the client, it starts working again.

This happens even after just a few minutes of inactivity (less than 20mins).

Have any idea what might be causing this?

Many thanks again!

@rofl0r
Owner

rofl0r commented Mar 15, 2020

thanks for your report. have you tried running the client on a linux machine already? i wonder whether this is a bug in the windows version of python.
trying python3 might also be an option.

@Saturnix
Author

Saturnix commented Mar 15, 2020

Hi, thanks for the answer. Still have to try running the client on Linux: will try and let you know here.

Python 3 doesn’t work, I thought it wasn’t supported so I downgraded to 2.7. Will post you the precise error I get on Python 3 on a separate issue asap.

EDIT: done. Opened a separate issue for Win.

@meetcshah19

@rofl0r Same issue for me. I am running both client and server on linux.

@rofl0r
Owner

rofl0r commented Oct 6, 2020

> @rofl0r Same issue for me. I am running both client and server on linux.

sigh. maybe using python for this project was a bad choice. anyway, maybe you can help debugging to find out what's wrong. for example does netstat output give some hint about the state of connections ?

@meetcshah19

Yup, that's what I have been trying. It looks like once the socket connection between the client and the server breaks (maybe due to unstable internet), the client is unable to establish the connection again.
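
For reference, a minimal sketch of what re-establishing a broken connection could look like, assuming Python 3. `connect_with_retry` and its parameters are hypothetical names for illustration and are not part of natsrv.py:

```python
import socket
import time

def connect_with_retry(addr, attempts=5, delay=1.0, timeout=5.0):
    """Try to open a TCP connection, retrying on failure.

    Returns a connected socket, or None if every attempt failed.
    The parameter names and defaults are illustrative assumptions.
    """
    for attempt in range(attempts):
        try:
            return socket.create_connection(addr, timeout=timeout)
        except OSError:
            # Refused, timed out, or unreachable: wait, then try again.
            if attempt < attempts - 1:
                time.sleep(delay)
    return None
```

A caller would invoke this whenever a send or receive on the existing connection fails, instead of giving up after the first error.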

@rofl0r
Owner

rofl0r commented Oct 7, 2020

the way the program currently works is that it

  0. establishes the control channel connection
  1. establishes an idle data connection to the server
  2. waits for a request
  3. as soon as a request is served, instantiates a new conn (i.e. goto 1)

this is done so there's no latency for establishing the conn after client connect.
can you figure out whether the control connection or the data connection gets interrupted?
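
Steps 1 to 3 can be sketched roughly as follows. This is a simplified illustration in Python 3, not the actual natsrv.py code: `data_connection_cycle`, `handle_request` and `cycles` are hypothetical, and the control channel (step 0) is left out entirely.

```python
import socket

def data_connection_cycle(server_addr, handle_request, cycles):
    """Run the connect / wait / serve cycle `cycles` times (illustrative only)."""
    for _ in range(cycles):
        # Step 1: open the data connection ahead of time; it sits idle.
        conn = socket.create_connection(server_addr)
        try:
            # Step 2: block until the server forwards a request.
            request = conn.recv(4096)
            if request:
                conn.sendall(handle_request(request))
        finally:
            # Step 3: done with this connection; the loop opens a fresh one.
            conn.close()
```

Note that the `recv` in step 2 has no timeout. If something between the two hosts silently drops the idle connection, this call can block indefinitely, which would match the symptom reported in this issue.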

@meetcshah19

It seems like the data connection is getting interrupted.

@ariririos

Same issue here between two enterprise ethernet connections (MIT local and AWS lightsail remote) so I doubt it's anything to do with the actual internet connection. Maybe some sort of keep-alive isn't being properly set?
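
If keep-alive is the missing piece, enabling it on a socket might look like the sketch below (Python 3). `enable_keepalive` and its default values are assumptions chosen for illustration. The `TCP_KEEP*` tuning options are platform-specific, so they are applied only where available:

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keep-alive probes for an existing socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These options exist on Linux; other platforms may lack some of them,
    # so each is applied only if this Python build exposes it.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            pass  # option known to Python but rejected by the OS
```

With these values the kernel would start probing after 60 seconds of silence and report the connection as dead after 5 unanswered probes sent 10 seconds apart.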

@rofl0r
Owner

rofl0r commented Mar 10, 2021

that's possible. maybe the "prepare a connection in advance" thing wasn't so smart after all. do you feel capable of changing the code so the data connection is only done after a client connects?

@ariririos

> that's possible. maybe the "prepare a connection in advance" thing wasn't so smart after all. do you feel capable of changing the code so the data connection is only done after a client connects?

Yes I think so! I'll take a look this weekend.

@ariririos

> that's possible. maybe the "prepare a connection in advance" thing wasn't so smart after all. do you feel capable of changing the code so the data connection is only done after a client connects?

The socket code here is pretty far above what I'm familiar with, and I don't want to break anything, so I don't think I'll be able to make the necessary edits to resolve this issue. Sorry about this!

@schtritoff

I got a working solution for the idle client connection dropping by using a systemd unit and a watchdog script.

Client systemd unit (for example in /etc/systemd/system/nat-tunnel-01.service):

[Unit]
Description=nat-tunnel for ssh access
After=network.target

[Service]
Type=simple
WatchdogSec=20
NotifyAccess=all
Restart=always
RestartSec=60
Environment=WATCHDOG_USEC=2000000
ExecStart=/usr/bin/python3 /root/nat-tunnel/natsrv.py --mode client --secret $MYSECRET --local localhost:22 --admin $MYVPSHOST:ADMINPORT
ExecStartPost=/root/nat-tunnel/watchdog.sh

[Install]
WantedBy=multi-user.target

A similar unit could be used on the server side, but without the watchdog-related parameters (ExecStartPost, WatchdogSec, Environment, NotifyAccess).

watchdog.sh contents:

#!/usr/bin/env bash

# src: https://www.medo64.com/2019/01/systemd-watchdog-for-any-service/

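watchdog() {
    while true; do

        # src: https://stackoverflow.com/a/19866239
        # Probe the public port. MYVPSHOST and PUBLICPORT must be exported
        # in the environment (or replaced with literal values), because the
        # single quotes defer expansion to the inner bash.
        if timeout 1 bash -c 'cat < /dev/null > /dev/tcp/$MYVPSHOST/$PUBLICPORT'; then
            # Port reachable: tell systemd the service is still alive.
            /bin/systemd-notify WATCHDOG=1
            sleep $(($WATCHDOG_USEC / 2000000))
        else
            # Probe failed: skip the notification so the watchdog can expire.
            sleep 1
        fi
    done
}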

watchdog &

Now it works reliably for me, on restart or any other occasion - the service is always available in my case. Maybe this could be somewhere in docs / readme for this project.
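
For setups without bash's `/dev/tcp`, the same reachability probe could be written in Python 3. This is only a sketch; `port_is_reachable` is a hypothetical helper, and like the shell version it checks only that the port accepts a TCP connection, not that the tunnel forwards traffic end to end:

```python
import socket

def port_is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```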

@rofl0r
Owner

rofl0r commented Nov 18, 2021

> Maybe this could be somewhere in docs / readme for this project.

your solution is not universal - it depends on systemd, which i despise.
i'd rather find out what's going wrong and fix the bug than documenting bug workarounds.

@schtritoff

Regarding the systemd workaround: it works for me but not for everyone, I agree. Most popular Linux distros ship systemd out of the box, so this comment might be of value to some users.

On topic: it could be that some router (local gateway or ISP) is dropping or closing connections because there is no activity. I didn't test the local (same subnet) client-server variant. If it works in the same subnet, it could be that some third party is closing inactive connections. Maybe some heartbeat data flow is needed to keep the connection alive.
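
A heartbeat of this kind might look like the following Python 3 sketch. `start_heartbeat`, the interval and the payload byte are all assumptions. The receiving side would have to recognise and discard the heartbeat, so this would need to run on a channel whose protocol allows for it, not on a raw proxied data stream:

```python
import socket
import threading

def start_heartbeat(sock, interval=30.0, payload=b"\x00"):
    """Send `payload` on `sock` every `interval` seconds.

    Returns (stop_event, thread). Set the event to stop the heartbeat.
    """
    stop = threading.Event()

    def beat():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            try:
                sock.sendall(payload)
            except OSError:
                break  # connection is gone; the caller should reconnect

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    return stop, thread
```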

@rofl0r
Owner

rofl0r commented Nov 25, 2021

> Maybe some heartbeat data flow is needed to keep connection alive.

this could only mitigate the problem to some degree. the fundamental issue here is that an existing connection is rendered nonfunctional (disconnected?) but the code fails to detect that event properly.

having an strace dump available for when this happens would help a lot in figuring out what actually happens.
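
As an illustration of the detection side, here is a minimal Python 3 sketch that checks whether the peer has closed a connection. `peer_has_closed` is a hypothetical helper. It only catches closes and resets that the local TCP stack has been told about. A connection silently dropped by a NAT or firewall still looks open here until a keep-alive probe, a heartbeat or a timeout surfaces the failure:

```python
import select
import socket

def peer_has_closed(sock, timeout=0.0):
    """Return True if the peer closed or reset the connection.

    MSG_PEEK is used so pending application data is not consumed.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return False  # nothing pending: still open as far as we can tell
    try:
        # A readable socket that yields zero bytes means orderly shutdown.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionError:
        return True  # reset or aborted
```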

@therealergo

I was having the same issue while serving from a Linux machine behind a NAT through a remote AWS Linux machine. Everything would work flawlessly for hours, until something went wrong with the connection and it all stopped working.

I couldn't definitively pin down what was going on, but to me it looked like the ISP or NAT was dropping all connections after I attempted to make lots of requests as required by my server.

I reworked the core part of this script so that it proxies all of the data through a single connection that is maintained between the two machines. Each end then opens/closes connections as needed to keep up the external appearance of connections being maintained between the server and client. The changes for this are in my fork here. The script should otherwise have completely identical CLI arguments and behavior.
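
To illustrate the general idea of carrying several logical connections over one TCP connection, here is a small Python 3 framing sketch. The header layout (4-byte stream id, 2-byte length) and the names `encode_frame` and `decode_frames` are invented for this example and are not taken from the fork:

```python
import struct

# Hypothetical header: 4-byte stream id, 2-byte payload length, network order.
HEADER = struct.Struct("!IH")

def encode_frame(stream_id, payload):
    """Prefix `payload` with the id of the logical connection it belongs to."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for one frame")
    return HEADER.pack(stream_id, len(payload)) + payload

def decode_frames(buffer):
    """Split `buffer` into complete frames.

    Returns (frames, remainder): frames is a list of (stream_id, payload),
    remainder holds the trailing bytes of any incomplete frame.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        stream_id, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # frame not fully received yet
        frames.append((stream_id, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

Each end would keep the remainder and prepend it to the next chunk read from the shared connection, then route each payload to the local socket that matches its stream id.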

This completely fixed the issue for me. It's now been serving for a few weeks without issue on my end.

@rofl0r
Owner

rofl0r commented Nov 10, 2022

@therealergo nice effort. i see you removed the threading part of the code, do you only support a single client being served ?

@therealergo

@rofl0r It supports multiple clients by selecting on the list of client ports and handling available data from any client in a single thread. I don't think that threading adds much with this implementation since it's all going through one connection anyways.
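
A single-threaded loop of that kind can be sketched with `select` as below (Python 3). `poll_clients` and the socket-to-id mapping are hypothetical and only show the pattern, not the fork's actual code:

```python
import select
import socket

def poll_clients(clients, timeout=1.0):
    """Read whatever is available from any client socket, in one thread.

    `clients` maps each socket to a caller-chosen id. Returns a list of
    (client_id, data); data == b"" means that client disconnected.
    """
    readable, _, _ = select.select(list(clients), [], [], timeout)
    events = []
    for sock in readable:
        try:
            data = sock.recv(4096)
        except ConnectionError:
            data = b""  # treat a reset like a disconnect
        events.append((clients[sock], data))
    return events
```

The caller would run this in a loop, forward each non-empty chunk over the shared connection, and remove sockets that report a disconnect.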
