获取网页中所有的文字

# encoding=utf8

import sys

reload(sys)

sys.setdefaultencoding('utf8')

import re
import requests
from bs4 import BeautifulSoup


html = requests.get('https://mp.weixin.qq.com/s?src=11×tamp=1533887718&ver=1051&signature=Xszdx5nmmHyebcH0MXxyHi7-jDwGoNDUDXCHJzPVic68tXGRSTiM3CStUDfSR*aALaC3nK3Ez4e33uLR5ir1pLgy3vEvWXWOvVXgAbsXMn5fB-HWboOW26GH*KMRVhgX&new=1')
soup = BeautifulSoup(html.text, "html5lib")
data = soup.findAll(text=True)


def visible(element):
    if element.parent.name in ['style', 'script', '[document]', 'head', 'title']:
        return False
    elif re.match('', str(element.encode('utf-8'))):
        return False
    return True


result = filter(visible, data)

with open('res.txt', "w+") as p:
    for i in result:
        print(str(i))
        p.write(str(i))


print list(result)

CommentView Plugin for IDAPro7.0

自从ida升级7.0 之后,hexrays做了很多的改动,以前的插件基本都废掉了。于是想要找个插件就变得很困难,最近分析一个文件需要获取所有的注释,但是那个针对低版本开发的commentview已经无力回天了。虽然晚上有开源的代码,但是实际修改起来比较蛋疼,不知道是不是ida的问题,编译的插件获取的地址基本都是错误的。还是按照以前的使用区段枚举,和inf信息获取的方法获取到的地址都错了,着tm就很尴尬了,测试代码如下:

for (int i = 0; i < get_segm_qty(); i++) {
        segment_t *seg = getnseg(i);
        qstring segname;
        get_segm_name( &segname,seg, 1024);
        msg("segname: %s, start_ea= %08x, end_ea= %08x , size=%08x \n", segname.c_str(), seg->start_ea, seg->end_ea, seg->size());
    }
msg("Database Info: start_ea= %08x, min_ea= %08x, max_ea= %08x, omin_ea= %08x, omax_ea= %08x \n", inf.start_ea, inf.min_ea, inf.max_ea, inf.omin_ea, inf.omax_ea);
    msg("lowoff= %08x, highoff= %08x, main= %08x \n", inf.lowoff, inf.highoff, inf.main);

实际获取到的数据如下,测试环境为OSX + IDA 7.0,如果谁看到了这篇文章还获取到了正确的地址麻烦通知我一声(感谢匿名用户的评论反馈:那个基址问题应该是IDA的BUG,在新的IDA 7.0.171130 (SP1)里已经修正了的,如果是正版的话就升级一下吧。)。

segname: .text, start_ea= 10001000, end_ea= 00000001 , size=effff001 
segname: .idata, start_ea= 10005000, end_ea= 00000006 , size=efffb006 
segname: .rdata, start_ea= 1000513c, end_ea= 00000003 , size=efffaec7 
segname: .data, start_ea= 10006000, end_ea= 00000005 , size=efffa005 
Database Info: start_ea= 10007000, min_ea= ff000000, max_ea= 00000000, omin_ea= 0006000f, omax_ea= 06400007 
lowoff= 00500046, highoff= 00000301, main= 10007000 

获取到的end_ea都是错的。

Continue Reading

杂谈nginx 301 重定向在非常规破解中的利用

在某些特定的情况下,如果软件采用本地加服务器校验的方式进行注册时候。单纯的本地破解可能很快就是失效,而服务器破解就成了一个可行的方式。例如pycharm系列的软件,但是有的时候认证服务器和资源服务器在同一个机器上,那么如果直接劫持校验服务器,资源也会无法下载,例如某editor。网上的破解脚本很多,基本都是基于文章最后的python代码。

但是这个东西虽然屏蔽掉了破解校验,但是无法下载服务器的模版和脚本。因为所有的资源都被劫持了,于是通过nginx进行重定向就成了一个选择。可以直接参考下面的代码:

location /***editor {
proxy_pass http://www.***.com/****editor/;
proxy_redirect off ;

proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header REMOTE-HOST $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

至于如何跳过序列号校验,参考这个代码吧:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
 
HOST = "127.0.0.1"
PORT = 80
 
class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write("valid")
 
def run_server():
    server = HTTPServer((HOST, PORT), RequestHandler)
    server.serve_forever()
 
if __name__ == "__main__":
    # redirect www.sweetscape.com to 127.0.0.1 in hosts
    run_server()

certbot-auto 阿里云配置安装

在国内的服务器上配置任何的服务都要费一番折腾,首先,国内的服务器连接国外的更新源本身就有问题。效率低还可以忍受,但是直接连不上服务器这个就没什么办法了,只能更换更新源。在《阿里云蛋疼的pip》一文中为了能够正常的使用pip,替换了默认的服务器,修改成了http://mirrors.aliyun.com/pypi/simple/ 这个源,但是也是忧郁这个蛋疼的源出现了各种问题。

在执行

./path/to/certbot-auto renew --quiet --no-self-upgrade

的时候会提示很多错误信息,找不到各种lib。

Had a problem while installing Python packages.

pip prints the following errors: 
=====================================================
Collecting argparse==1.4.0 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 11))
  Downloading http://mirrors.aliyun.com/pypi/packages/f2/94/3af39d34be01a24a6e65433d19e107099374224905f1e0cc6bbe1fd22a2f/argparse-1.4.0-py2.py3-none-any.whl
Collecting pycparser==2.14 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 17))
  Downloading http://mirrors.aliyun.com/pypi/packages/6d/31/666614af3db0acf377876d48688c5d334b6e493b96d21aa7d332169bee50/pycparser-2.14.tar.gz (223kB)
Collecting cffi==1.4.2 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 21))
  Downloading http://mirrors.aliyun.com/pypi/packages/92/00/c8670a2898ab7121cdac3b59f4307977a86e08a59efd662f6c05200a2f11/cffi-1.4.2.tar.gz (365kB)
Collecting ConfigArgParse==0.10.0 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 38))
  Downloading http://mirrors.aliyun.com/pypi/packages/d0/b8/8f7689980caa66fc02671f5837dc761e4c7a47c6ca31b3e38b304cbc3e73/ConfigArgParse-0.10.0.tar.gz
Collecting configobj==5.0.6 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 40))
  Downloading http://mirrors.aliyun.com/pypi/packages/64/61/079eb60459c44929e684fa7d9e2fdca403f67d64dd9dbac27296be2e0fab/configobj-5.0.6.tar.gz
Collecting cryptography==1.5.3 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 42))
  Downloading http://mirrors.aliyun.com/pypi/packages/6c/c5/7fc1f8384443abd2d71631ead026eb59863a58cad0149b94b89f08c8002f/cryptography-1.5.3.tar.gz (400kB)
Collecting enum34==1.1.2 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 65))
  Downloading http://mirrors.aliyun.com/pypi/packages/6f/e9/08fd439384b7e3d613e75a6c8236b8e64d90c47d23413493b38d4229a9a5/enum34-1.1.2.tar.gz (46kB)
Collecting funcsigs==0.4 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 68))
  Downloading http://mirrors.aliyun.com/pypi/packages/5e/9f/025d4c92c6a1a94313cdf0813cd76f5700f8e5434fa15165090a6446ae22/funcsigs-0.4-py2.py3-none-any.whl
Collecting idna==2.0 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 71))

  Downloading http://mirrors.aliyun.com/pypi/packages/7c/75/b566d769455929ee6ab308d8a1c6c5aecc4928e72b25d42dd019c99f7015/idna-2.0-py2.py3-none-any.whl (61kB)
Collecting ipaddress==1.0.16 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 74))
  Downloading http://mirrors.aliyun.com/pypi/packages/23/6a/813ac29a01e4c33c19c2bded98ac3d4266ebbf0bd2c0eb0020e1c969958d/ipaddress-1.0.16-py27-none-any.whl
Collecting linecache2==1.0.0 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 77))
  Downloading http://mirrors.aliyun.com/pypi/packages/c7/a3/c5da2a44c85bfbb6eebcfc1dde24933f8704441b98fdde6528f4831757a6/linecache2-1.0.0-py2.py3-none-any.whl
Collecting ordereddict==1.1 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 80))
  Downloading http://mirrors.aliyun.com/pypi/packages/53/25/ef88e8e45db141faa9598fbf7ad0062df8f50f881a36ed6a0073e1572126/ordereddict-1.1.tar.gz
Collecting parsedatetime==2.1 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 82))
  Downloading http://mirrors.aliyun.com/pypi/packages/85/1f/13fc06097e516f6259d62cea502b116451321c96e18a9d0fff9da3442e02/parsedatetime-2.1-py2-none-any.whl
Collecting pbr==1.8.1 (from -r /tmp/tmp.eLMZJv8t9Y/letsencrypt-auto-requirements.txt (line 85))
Continue Reading

阿里云蛋疼的pip

qq20161103-1

阿里云的主机确实蛋疼,连pip都更新不了也是醉了。参考这个帖子的办法,修改更新服务器吧:

 
mkdir ~/.pip
cat > ~/.pip/pip.conf < < EOF
[global]
trusted-host=mirrors.aliyun.com
index-url=http://mirrors.aliyun.com/pypi/simple/
EOF

Apache2 Django {“detail”:”Authentication credentials were not provided.”}

其实项目已经是很久之前就完成了,部署到服务器上去之后后续的工作由于懒散一致没做,近几天开始进行重新继续项目之后发现一个很蛋疼的问题,在iOS端提交数据的时候提示:

{“detail”:”Authentication credentials were not provided.”},搜索之后发现原来是mod_wsgi转发数据的时候将authorization header 去掉了,所以会导致认证失败。可以参考链接:

http://stackoverflow.com/questions/26906630/django-rest-framework-authentication-credentials-were-not-provided

修复也很简单,修改/etc/apache2/apache2.conf文件添加如下一行即可:

WSGIPassAuthorization On