UserAgent in parameters #170

Open
wants to merge 4 commits into base: master

Conversation


@cyber01 cyber01 commented Sep 23, 2020

A possible solution to access restriction problems (HTTP 403 and 502-504 responses) caused by servers blocking the default User-Agents of most HTTP clients (curl, Python libraries, Ruby). With this parameter, you can "disguise" the client as a browser and bypass the restriction. Using this approach, 350 thousand pages of one site were previously downloaded (full history from 2008).
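For illustration, here is a minimal sketch of what overriding the User-Agent looks like with Ruby's standard `Net::HTTP`. It is not the code from this PR: `build_request` is a hypothetical helper, the URL is a placeholder, and the User-Agent string is an assumed example. The request is only built, not sent.

```ruby
require "net/http"
require "uri"

# Hypothetical helper (not part of the PR): builds a GET request
# whose User-Agent header replaces Ruby's default value.
def build_request(url, user_agent)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request["User-Agent"] = user_agent
  request
end

# Assumed example of a browser-like User-Agent string.
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0"
request = build_request("http://example.com/", browser_ua)
puts request["User-Agent"]
```

To actually send it, something like `Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }` would be needed. Whether a given server then stops returning 403 depends on how it filters clients.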


@mathieu-aubin mathieu-aubin left a comment


I'm not sure it's a great idea to have the default UA as the name of the app. It seems too direct/easy to shut down/prevent default settings from working by just setting a ban on it, forcing users to change UA.

I believe a Firefox/Chrome version would be best as the default, and it could be changed if the user wants/needs to.

Does this make any sense?

@cyber01
Author

cyber01 commented Nov 5, 2020

Sounds logical. Changed the default user agent to Firefox 80 on Windows 10.
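A rough sketch of how such a flag with a browser default could be wired up with `OptionParser`. This is an assumption-laden illustration, not the PR's actual code: `parse_options` and `DEFAULT_USER_AGENT` are hypothetical names, and the default string is a typical Firefox 80 on Windows 10 User-Agent, not one copied from the commits.

```ruby
require "optparse"

# Assumed default: a typical Firefox 80 on Windows 10 User-Agent string.
DEFAULT_USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:80.0) Gecko/20100101 Firefox/80.0"

# Hypothetical helper: parses argv and returns the options hash.
def parse_options(argv)
  options = { user_agent: DEFAULT_USER_AGENT }
  parser = OptionParser.new do |opts|
    opts.on("-u", "--user-agent STRING", String,
            "User-Agent for connection (default: Firefox 80 on Windows 10)") do |t|
      options[:user_agent] = t
    end
  end
  parser.parse(argv) # returns the non-option arguments; argv itself is not modified
  options
end

puts parse_options(["http://example.com", "--user-agent", "Custom/1.0"])[:user_agent]
```

Keeping the default in one constant means a banned default can be swapped in a single place, while users can still override it per run.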

README.md Outdated
@@ -58,6 +58,10 @@ option_parser = OptionParser.new do |opts|
    options[:list] = true
  end

  opts.on("-u", "--user-agent STRING", String, "UserAgent for connection (Default is WayBack Machine Downloader)") do |t|


default UA

README.md Outdated

Example:

wayback_machine_downloader http://example.com --user-agent "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:77.0) Gecko/20190101 Firefox/77.0"


Maybe use a UA other than Firefox for the example; some smart TV UAs have long lifespans.

@mathieu-aubin

Maybe, another suggestion... How about adding 'DNT: 1' headers by default? Not sure if it's something that IA_ARCHIVER cares about thi

@cyber01
Author

cyber01 commented Nov 6, 2020

> Maybe, another suggestion... How about adding 'DNT: 1' headers by default? Not sure if it's something that IA_ARCHIVER cares about thi

A good suggestion, but I think it's better to do it in a separate MR, where you can make some more adjustments to privacy, or to bypass locks.
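As a sketch of what that follow-up could look like, the snippet below builds a header set that includes `DNT: 1` alongside the User-Agent. `request_headers` is a hypothetical helper, not something in this PR, and whether the server honors the header is not established anywhere in this thread.

```ruby
# Hypothetical helper (an assumption for illustration, not part of this PR):
# returns the request headers, adding "DNT: 1" unless it is switched off.
def request_headers(user_agent, dnt: true)
  headers = { "User-Agent" => user_agent }
  headers["DNT"] = "1" if dnt
  headers
end

puts request_headers("Custom/1.0").inspect
```

A hash like this can be passed as the header argument of most Ruby HTTP clients, so the opt-out (`dnt: false`) stays a one-line change.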

@sww1235

sww1235 commented Nov 20, 2023

Would it make sense to add a command-line flag to set a user agent, along with defaulting to something like Firefox or Chrome?

@mathieu-aubin

I believe it does.
