February 10, 2010

Python and PHP benchmark


Hi!
It's been a long time since the last time I wrote a post.

I'm setting up a web environment and I want it to be as efficient as possible,
so I decided to benchmark PHP against Python.
I tried these two web environments:

PHP: default configuration (php5 + Apache)
Python: Cherokee web server + FastCGI + Psyco

Cherokee is a lightweight web server like nginx. I chose it because, according to several
benchmarks I've read, it is more efficient than Apache.

Psyco is a JIT compiler for Python that compiles chunks of code at run time so that they execute faster.

The benchmark consists of a fuzz of 1000 requests using 50 parallel threads.
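
A load like this can be approximated with a thread pool. The sketch below is not the tool used for this benchmark: `fetch`, `run_load` and the URL in the comment are hypothetical names chosen for illustration, and the code targets Python 3.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url):
    """Issue one GET request and return the HTTP status code."""
    with urlopen(url, timeout=10) as response:
        response.read()
        return response.status


def run_load(task, total=1000, workers=50):
    """Call task() `total` times from `workers` threads.

    Returns the list of results and the elapsed wall-clock seconds.
    """
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task) for _ in range(total)]
        results = [future.result() for future in futures]
    return results, time.perf_counter() - started


# Against a real server (hypothetical URL) this would be:
#   results, seconds = run_load(lambda: fetch("http://localhost/primes"))
```

Passing the request as a callable keeps the load loop independent of the HTTP client, so the same function can drive either environment.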

This was the result (Y axis: system load, X axis: time):



I know that PHP can be optimized, but this was a simple benchmark that I made to test
my new environment, so it only evaluates a default PHP installation.

I hope you begin to port to Python ;D

Let's see how HipHop works

Bye!

PS: These are the scripts I used.

Python:

  def func():
      primeNumbers = []
      output = []
      for i in xrange(2, 30000):
          divisible = False
          # No early exit: i is tested against every prime found so far
          for number in primeNumbers:
              if i % number == 0:
                  divisible = True
          if not divisible:
              primeNumbers.append(i)
              output.append(str(i))
      print "".join(output)

  func()

PHP:

  <?php
  $primeNumbers = array();
  $output = '';
  for ($i = 2; $i < 30000; $i++) {
      $divisible = false;
      // No early exit: $i is tested against every prime found so far
      foreach ($primeNumbers as $number) {
          if ($i % $number == 0) {
              $divisible = true;
          }
      }
      if ($divisible == false) {
          $primeNumbers[] = $i;
          $output .= $i;
      }
  }
  echo $output;

June 16, 2009

iSearch tool

Hi everybody

I'll continue by introducing iSearch, which was developed as a tool to fetch results from any search engine, such as Google, live.com, Yandex, etc.

iSearch consists of a SearchEngine superclass and several subclasses that work against specific
search engines:

GoogleSearch(SearchEngine)
YouTubeSearch(SearchEngine)
MsnSearch(SearchEngine)
YahooSearch(SearchEngine)
YandexSearch(SearchEngine)
TorrentzSearch(SearchEngine)
MininovaSearch(SearchEngine)
ScrapeTorrentSearch(SearchEngine)
BaiduSearch(SearchEngine)
Figator(SearchEngine)

Here is an example that fetches all the Google results for a query like 'hello':


from iSearch import *

a=GoogleSearch("hello")
for i in a:
    print i

You can also get only 10 results:

a=GoogleSearch("Hello")
a.getNResults(10)

It's very simple!

Making your own search wrapper is not very difficult either. Below you can see the GoogleSearch implementation:


Search pages usually carry the query string and the page number in the URL, so you have to mark where they go
with the placeholders "{query}" and "{startvar}".

"{query}" is replaced with the first parameter of the constructor, the query.
"{startvar}" is replaced with the result offset; in the Google example it starts at 0 and grows in steps of 100, with no upper bound.

To control "{startvar}" you define two attributes:

  • self.startIndex : the first index used in {startvar}
  • self.increment : value added to the last index to get the next page of results

And finally you have to define two regular expressions:
  • self.urlRegexp : RegExp that matches the desired information on the page (wrap that part in a capturing group; see Python's re module for details)
  • self.nextRegexp : RegExp that matches the "Next" link, or anything else that reveals there is another page with more results. When this regular expression does not match, SearchEngine finishes.
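
To make the roles of these attributes concrete, here is a minimal sketch of how such a paging loop could work. It is not the real iSearch code: the `TemplateSearch` class, the `fetch` parameter, the `search.example` URL and both regular expressions are assumptions made up for illustration, and the pages are canned strings rather than real HTTP responses. The sketch targets Python 3.

```python
import re
from urllib.parse import quote_plus


class TemplateSearch:
    """Hypothetical stand-in for SearchEngine; not the real iSearch code."""

    def __init__(self, query, fetch):
        self.query = query
        self.fetch = fetch  # assumed helper: takes a URL, returns the page text
        self.url = "http://search.example/?q={query}&start={startvar}"
        self.startIndex = 0   # first value used in {startvar}
        self.increment = 100  # added to the index to reach the next page
        self.urlRegexp = r'<a class="result" href="([^"]+)"'
        self.nextRegexp = r">Next<"

    def __iter__(self):
        index = self.startIndex
        while True:
            page_url = (self.url
                        .replace("{query}", quote_plus(self.query))
                        .replace("{startvar}", str(index)))
            page = self.fetch(page_url)
            for match in re.finditer(self.urlRegexp, page):
                yield match.group(1)  # the parenthesized part of urlRegexp
            if not re.search(self.nextRegexp, page):
                break  # no "Next" link: this was the last page
            index += self.increment


# Two canned pages instead of real HTTP requests
PAGES = {
    "http://search.example/?q=hello&start=0":
        '<a class="result" href="http://a.example/">A</a> <a href="#">Next</a>',
    "http://search.example/?q=hello&start=100":
        '<a class="result" href="http://b.example/">B</a>',
}

results = list(TemplateSearch("hello", PAGES.__getitem__))
print(results)
```

The loop stops on the second page because nextRegexp no longer matches there, which is the stopping rule described above.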

You can get iSearch at http://proxystrike.googlecode.com/svn/trunk/iSearch.py

Of course, you can use iSearch in many different ways. For example, below you can see a simple crawler built on iSearch by inheriting from SearchEngine:



You can find extended documentation in the ProxyStrike wiki.

Enjoy it!

June 3, 2009

Introducing tools

Hi everybody, 

In the next posts I'd like to explain some very useful modules that I've developed to improve my tools.

Below you can see a summary:
reqresp: Library to work with HTTP requests and responses
TextParser: Library to parse text
iSearch: Library to perform searches using search engines
console: Library for easily creating simple command-line interfaces

To begin with, I'll introduce reqresp:
You can access the API in the ProxyStrike wiki

Below you can see an example of the use of the reqresp module:


I hope this is useful for you. Enjoy it.

April 24, 2009

ProxyStrike plugins


Recently I released the latest version of ProxyStrike, but I didn't write anything about plugin development.

A howto on writing plugins is now available.

The plugin engine is shown in the following picture:

You can see an example of an email gatherer plugin below:



It's very easy! I recommend visiting the howto page for more information.

See you soon!
deepbit

April 22, 2009

Detecting encodings

I'd like to write about chardet. This library tells you the encoding of a text, for example a web page (HTML) or any file. This is very useful when you connect several tools that exchange information.

In Python versions before Python 3, however, working with encodings is horrible, so you often run into trouble when trying to guess the encoding of a source.

chardet reports the most likely encoding for a given source together with a confidence value. Below you can see an example of how to use chardet. It's very easy!


>>> import urllib

>>> urlread = lambda url: urllib.urlopen(url).read()

>>> import chardet
>>> chardet.detect(urlread("http://google.cn/"))
{'encoding': 'GB2312', 'confidence': 0.99}

>>> chardet.detect(urlread("http://yahoo.co.jp/"))
{'encoding': 'EUC-JP', 'confidence': 0.99}
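
Once you have a guess, a natural next step is to decode the bytes with it. The sketch below assumes a dictionary shaped like the chardet output above; the `decode_guess` helper, the 0.5 confidence threshold and the UTF-8 fallback are my own choices for illustration and not part of chardet. It targets Python 3 and uses a literal dictionary so that it runs without chardet installed; in real use the guess would come from chardet.detect(raw).

```python
def decode_guess(raw, guess, min_confidence=0.5, fallback="utf-8"):
    """Decode raw bytes using a chardet-style guess, or fall back if it is weak."""
    encoding = guess.get("encoding")
    if encoding is None or guess.get("confidence", 0.0) < min_confidence:
        # Replace undecodable bytes rather than raising on a bad guess
        return raw.decode(fallback, errors="replace")
    return raw.decode(encoding)


# "\u4f60\u597d" is a two-character Chinese greeting, encoded here as GB2312
raw = "\u4f60\u597d".encode("gb2312")
text = decode_guess(raw, {"encoding": "GB2312", "confidence": 0.99})

# No usable guess: the fallback encoding is used instead
weak = decode_guess(b"plain ascii", {"encoding": None, "confidence": 0.0})
```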



April 19, 2009

weBreak, breaking trends...

To start with some fresh news, I'd like to present my new tool. OK, it's not really a new tool: it's another interface to wfuzz,
but it's built on a new interface concept that I began to study in order to develop new tools with a web-based GUI.

weBreak has a RIA (Rich Internet Application) interface based on ExtJS. It's very useful and cross-browser, so it
makes my tool more portable and more standard.

In my opinion, the future of application interfaces lies in merging with the web browser (e.g. Metasploit, VMware Server, etc.).
It implies a new design pattern and new technologies to become familiar with, but it improves compatibility and the app
becomes more standard. Moreover, you can run the tool on one computer and use it collaboratively.

Nevertheless, the programming isn't so comfortable, since you have to code (in my case) in both Python and JavaScript
and connect them properly using a web server (I used CherryPy).

So I plan to develop a framework that joins the three technologies (core + web server + JavaScript),
in my case Python + CherryPy + ExtJS. I hope it can be ported to other platforms easily.

Next, some screenshots are shown:


April 13, 2009

Deesec!

Welcome to Deesec.

Deesec is a blog that serves the purpose of publishing my own
projects, related to security and other stuff. They are
mainly focused on Python development.

See you soon!

Deepbit