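After a lot of digging, I found that although this kind of bot traffic keeps rotating its IP addresses, it has one fatal weakness: it does not execute JavaScript or any other scripts, so its requests claim to be a browser but typically arrive without any cookies. We can filter on exactly that. When such a request comes in, the nginx site configuration cuts it off: return 444 or 499, redirect it to Baidu, or anything along those lines.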
# Classify the User-Agent. nginx uses the first regex that matches, in file order,
# so specific patterns are listed before generic ones.
map $http_user_agent $ua_type {
    default "unknown";
    # Crawlers (specific names first, generic keywords last)
    "~*bingbot" "bingbot";
    "~*googlebot" "googlebot";
    "~*baiduspider" "baiduspider";
    "~*360spider" "360spider";
    "~*yandex" "yandexbot";
    "~*sogou" "sogouspider";
    "~*spider" "spider";
    "~*crawler" "crawler";
    "~*bot" "bot";
    # Browsers
    "~*mozilla" "browser";
    "~*chrome" "browser";
    "~*safari" "browser";
    "~*firefox" "browser";
    "~*edge" "browser";
    "~*opera" "browser";
    # Command-line tools
    "~*curl" "curl";
    "~*wget" "wget";
    "~*httpie" "httpie";
    # Debugging tools
    "~*postman" "postman";
    "~*insomnia" "insomnia";
    # Programming-language clients
    "~*python-requests" "python-requests";
    "~*python" "python-script";
    "~*java" "java-client";
    "~*okhttp" "java-okhttp";
    "~*php" "php-client";
    "~*perl" "perl-script";
    "~*ruby" "ruby-script";
    "~*go-http-client" "golang";
    # App requests
    # (these only match user agents that do not also contain a browser token
    # such as "Mozilla", because the browser rules above are checked first)
    "~*android" "android-app";
    "~*iphone" "iphone-app";
    # Mini programs / WeChat
    "~*micromessenger" "wechat";
    "~*miniprogram" "wechat-miniapp";
}
log_format custom '$remote_addr - $remote_user [$time_local] '
'"$scheme://$host$request_uri" "$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent" "$ua_type" '
'"$http_cookie" "$http_x_forwarded_for"';
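# Define a combined variable first; it belongs in the http{} context, outside any server{} block
map "$ua_type:$http_cookie" $block_access {
    default 0;
    "~^browser:(-)?$" 1; # $http_cookie is an empty string when the request carries no cookie
}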
Put the code above at the top of the site configuration, that is, before the server block.
location / {
    if ($block_access = 1) {
        return 403;
    }
    try_files $uri $uri/ /index.php?$args;
}
Put the code above inside the server block.
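The location block above answers with 403. As a minimal sketch of the other responses mentioned at the start (444 or a redirect), the same block could be written as follows instead. It replaces the location block above rather than being added next to it, and the redirect target is only an example.

```nginx
location / {
    if ($block_access = 1) {
        # 444 is a non-standard nginx code: the connection is closed
        # without sending any response header to the client.
        return 444;

        # Alternative (use instead of the line above): redirect elsewhere.
        # return 302 https://www.baidu.com/;
    }
    try_files $uri $uri/ /index.php?$args;
}
```

With 444 the bot receives nothing at all, while the request still shows up in the access log with status 444.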
access_log /www/wwwlogs/srzxkj.com.log custom;
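Then change the site's access log directive to the line above and you are done. Bots that lack the traits of a real browser session (a browser User-Agent with no cookies) now get the 403 set in the location block, and you can follow the traffic, including each request's classified type, in the site log.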
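Since a genuine first-time visitor also arrives without cookies, it can help to see exactly which requests the rule catches. A minimal sketch, assuming a second log file (the path below is hypothetical): the `if=` parameter of `access_log` writes an entry only when the condition is neither empty nor "0", so only requests with `$block_access` set to 1 end up in it.

```nginx
# Inside server{}: keep the full log and add one that holds blocked requests only.
access_log /www/wwwlogs/srzxkj.com.log custom;
# Hypothetical file name; logged only when $block_access evaluates to 1.
access_log /www/wwwlogs/srzxkj.com.blocked.log custom if=$block_access;
```

Reviewing the second file for a day or two before relying on the rule shows whether the matches really are bot traffic.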
