Titles are not fetched correctly for some links #20

Open
Anyexyz opened this issue Nov 14, 2024 · 3 comments
Labels
good first issue: Denotes an issue ready for a new contributor, according to the "help wanted" guidelines.
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/bug: Categorizes issue or PR as related to a bug.

Comments

Anyexyz commented Nov 14, 2024

Test link: https://work.weixin.qq.com/mail/
Displayed result: (image)

@JohnNiang

I found that this site declares its charset as GB18030 in the response headers. See the request below:

❯ https https://work.weixin.qq.com/mail/ -p h
HTTP/1.1 200 OK
Cache-control: max-age=0
Connection: keep-alive
Content-Encoding: gzip
Content-Security-Policy: script-src 'self' https://tongji.baidu.com https://hm.baidu.com http://hm.baidu.com *.google-analytics.com http://mat1.gtimg.com https://mat1.gtimg.com http://*.soso.com https://*.soso.com http://*.qq.com https://*.qq.com http://*.qqmail.com  https://*.qqmail.com http://*.qmail.com https://*.qmail.com https://midas.gtimg.cn http://midas.gtimg.cn http://pub.idqqimg.com https://captcha.gtimg.com blob: 'unsafe-inline' 'unsafe-eval'; report-uri https://mail.qq.com/cgi-bin/report_cgi?r_subtype=csp&nocheck=false
Content-Type: text/html; charset=GB18030
Date: Mon, 18 Nov 2024 07:24:17 GMT
Referrer-Policy: origin
Server: Wwebsvr
Set-Cookie: sms_id=ArfWBAWVhnWdGrt+Cw+UXlp/UdIFS6irWi8Fkru2gvso6kv7Wllpf8KwDRhvUQVViXGeq+1wnQY1AbwfWgR3ww==; Domain=.exmail.qq.com; Path=/; HttpOnly
Set-Cookie: activity=EXPIRED; Domain=.exmail.qq.com; Path=/; Expires=Sun, 17-Nov-2024 07:24:17 GMT
Set-Cookie: ssl_edition=mail.qq.com; Domain=.exmail.qq.com; Path=/; HttpOnly
Set-Cookie: sms_id=1wwuXzLI61E19rCYEpAommK4pXSOHlEcgTyUykR+JELxTyAo2k4GVuUDqp6Z/cuvsA+8oWJ2b0Y2t6raZYkjgg==; Domain=.exmail.qq.com; Path=/; HttpOnly
Transfer-Encoding: chunked
Vary: Accept-Encoding
X-Frame-Options: SAMEORIGIN
X-W-No: 2

However, the response body is decoded as UTF-8 when it is parsed:

String content = dataBuffer.toString(StandardCharsets.UTF_8);

I suggest parsing the Content-Type header, extracting the charset, and decoding the body with that charset.
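
A minimal sketch of that suggestion, using only the JDK. The class name `ContentTypeCharset` and its methods `resolve` and `decode` are hypothetical and not part of this project. It assumes the raw Content-Type header value and the body bytes are already available, and that falling back to UTF-8 is acceptable when the charset is missing or unsupported. The parameter parsing is deliberately simple and does not handle semicolons inside quoted values.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper for illustration only; not an existing class in this project.
public final class ContentTypeCharset {

    private ContentTypeCharset() {
    }

    /**
     * Returns the charset named in a Content-Type header value,
     * or UTF-8 when the parameter is absent, malformed, or unsupported.
     */
    public static Charset resolve(String contentType) {
        if (contentType == null) {
            return StandardCharsets.UTF_8;
        }
        // parts[0] is the media type (e.g. "text/html"); the rest are parameters.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Strip optional surrounding quotes: charset="utf-8"
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Charset.forName(value);
            } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
                // Assumption: fall back to UTF-8 rather than fail the whole request.
                return StandardCharsets.UTF_8;
            }
        }
        return StandardCharsets.UTF_8;
    }

    /** Decodes the response body using the charset declared in the Content-Type value. */
    public static String decode(byte[] body, String contentType) {
        return new String(body, resolve(contentType));
    }
}
```

With something like this in place, the decoding line could pass the resolved charset instead of the hard-coded one, for example `dataBuffer.toString(ContentTypeCharset.resolve(contentTypeHeader))`, where `contentTypeHeader` is an assumed variable holding the header value. If the header is already available as a Spring `MediaType`, its `getCharset()` method may make the manual parsing unnecessary.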

/kind bug

@f2c-ci-robot f2c-ci-robot bot added the kind/bug Categorizes issue or PR as related to a bug. label Nov 18, 2024
@JohnNiang

/good-first-issue

f2c-ci-robot bot commented Nov 18, 2024

@JohnNiang:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.

In response to this:

/good-first-issue

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@f2c-ci-robot f2c-ci-robot bot added good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Nov 18, 2024