ejes consulting

Technical Consulting Design and Automation

Archive for November 2011

I call Phoney


So today I was stumbling around on the internet, and found this kid’s site:

http://cyberfreax.in/2011/11/15/how-to-create-a-virus-2/

which features “how to create a virus.”  Who could help but read it?

It turns out that this kid is completely full of it.  He tells you to copy this:

01100110011011110111001001101101011000010111010000
100000011000110011101001011100 0010000000101111010100010010111101011000

into a text file and rename it to something.exe and then run it.

Of course, anyone with a bit of understanding of how the binary loader works would know that the loader wouldn’t recognize this as an executable program.  ALL .exe programs in Windows start with the two bytes “MZ”; the newer PE executables still begin with an MZ header, which points to a “PE” signature further into the file.  These are the “magic numbers” that tell the binary loader that the file is, in fact, executable.

There is a lot going on behind the scenes here so let me explain WHY this won’t work.

Inside of a regular “exe” program is a structure to help the operating system determine how to load this program.  The structure looks like this (in C notation):

(info from: http://www.delorie.com/djgpp/doc/exe/)

struct EXE {
  unsigned short signature; /* == 0x5a4D */
  unsigned short bytes_in_last_block;
  unsigned short blocks_in_file;
  unsigned short num_relocs;
  unsigned short header_paragraphs;
  unsigned short min_extra_paragraphs;
  unsigned short max_extra_paragraphs;
  unsigned short ss;
  unsigned short sp;
  unsigned short checksum;
  unsigned short ip;
  unsigned short cs;
  unsigned short reloc_table_offset;
  unsigned short overlay_number;
};

The first short integer, ‘signature’, is always 0x5A4D in MZ executables (which are far less complex than PE executables).  On disk these are the bytes 0x4D 0x5A, the ASCII letters “MZ”, and this is how the loader knows that this is a valid executable.
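Here’s a minimal sketch in C of that signature test.  The function name has_mz_signature is my own invention for illustration, and this is only the first check a real loader would make, not the whole validation:

```c
#include <stddef.h>

/* Returns 1 if the buffer begins with the ASCII bytes 'M','Z'
 * (0x4D, 0x5A), which read as the little-endian 16-bit value 0x5A4D.
 * Returns 0 otherwise, including for buffers shorter than two bytes.
 * has_mz_signature is a hypothetical helper name, not a Windows API. */
int has_mz_signature(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < 2)
        return 0;
    return buf[0] == 0x4D && buf[1] == 0x5A;
}
```

A text file full of typed ones and zeros starts with the bytes 0x30 or 0x31 (the characters ‘0’ and ‘1’), so a check like this rejects it immediately.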

The next 16-bit integer is the number of bytes used in the last block, unless it’s set to zero, which means the whole last block (512 bytes) is used.

The next 16-bit integer is the total number of 512-byte blocks in the executable file; if the previous short integer is not zero, only that many bytes of the last block are used.
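Those two fields together give the size of the program image.  A small sketch of the arithmetic, assuming the MZ format’s 512-byte blocks (the function name mz_image_size is made up for this example):

```c
/* Size in bytes of the MZ image described by the two header fields.
 * Every block is 512 bytes; a non-zero bytes_in_last_block means the
 * final block is only partly used. Assumes bytes_in_last_block <= 511.
 * mz_image_size is a hypothetical helper name. */
unsigned long mz_image_size(unsigned short blocks_in_file,
                            unsigned short bytes_in_last_block)
{
    unsigned long size = (unsigned long)blocks_in_file * 512UL;

    if (blocks_in_file != 0 && bytes_in_last_block != 0)
        size -= 512UL - bytes_in_last_block;   /* trim the unused tail */
    return size;
}
```

For example, 3 blocks with 100 bytes in the last block describes 2 full blocks plus 100 bytes, or 1124 bytes.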

The next short is the number of relocation entries in the header, and the one after that is the size of the header in “paragraphs” (a paragraph is 16 bytes).  It is followed by the minimum number of paragraphs of additional memory the program needs (that is, if there aren’t at least this many paragraphs free, the loader will not try to load the program); most programmers know this as the BSS size.  Finally, following that, is the maximum number of paragraphs of additional memory the program would like.

The next part is the relative value of the stack segment.  This value is added to the segment the program is loaded into, and used to initialize the SS (stack segment) register.

The next value is the initial value of the SP (stack pointer) register.  Then a word which is a checksum, which is usually not used.

The next is the initial value of the IP (instruction pointer) register, and then the CS (code segment) register (which is relative to the segment of the program loaded).  Then the offset of the first relocation item in the file, and finally ending with the overlay number.
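To make the “relative to the load segment” part concrete, here is a rough sketch of how the starting registers could be derived from the header values.  The struct mz_start, both function names and the load_segment parameter are assumptions of mine for illustration; a real DOS loader does considerably more (relocations, memory allocation and so on):

```c
#include <stdint.h>

/* Initial register values for a 16-bit real-mode program.
 * mz_start is a hypothetical type for this example. */
struct mz_start {
    uint16_t cs, ip, ss, sp;
};

/* load_segment is an assumed input: the segment where the loader put
 * the program image. The header's cs and ss are relative to it, while
 * ip and sp are used unchanged. Segment sums wrap at 16 bits. */
struct mz_start mz_initial_registers(uint16_t load_segment,
                                     uint16_t hdr_cs, uint16_t hdr_ip,
                                     uint16_t hdr_ss, uint16_t hdr_sp)
{
    struct mz_start r;

    r.cs = (uint16_t)(load_segment + hdr_cs);
    r.ip = hdr_ip;
    r.ss = (uint16_t)(load_segment + hdr_ss);
    r.sp = hdr_sp;
    return r;
}

/* A paragraph is 16 bytes, so the header occupies
 * header_paragraphs * 16 bytes at the start of the file. */
uint32_t mz_header_bytes(uint16_t header_paragraphs)
{
    return (uint32_t)header_paragraphs * 16u;
}
```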

If you examine the “binary” that Srivathsan provided, obviously none of this structure “fits.”

So what IS Srivathsan trying to pull?  Let’s take the binary, and bring it to a binary-to-ASCII conversion site.  I used this one:

http://www.roubaixinteractive.com/PlayGround/Binary_Conversion/Binary_To_Text.asp

I pasted the “binary”, and pressed “To Text” and it comes back with:

format c:\ /Q/X

Oh!!  So he just encoded a “format” command and expected it to run.
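If you’d rather not trust a web page, the same conversion is easy to sketch in C.  This is my own illustration, not the site’s code: it reads the ‘0’ and ‘1’ characters eight at a time, most significant bit first, and skips whitespace.  The name binary_text_to_ascii is hypothetical.

```c
#include <stddef.h>

/* Decodes a string of '0'/'1' characters into bytes, eight bits at a
 * time, most significant bit first. Whitespace is skipped. Any other
 * character, a trailing partial byte, or a too-small output buffer
 * makes the function return -1. On success the output is
 * NUL-terminated and the number of decoded bytes is returned. */
int binary_text_to_ascii(const char *bits, char *out, size_t out_size)
{
    size_t n = 0;        /* bytes written so far */
    unsigned value = 0;  /* bits accumulated for the current byte */
    int count = 0;       /* how many bits are in value */

    if (bits == NULL || out == NULL || out_size == 0)
        return -1;

    for (; *bits != '\0'; ++bits) {
        if (*bits == ' ' || *bits == '\n' || *bits == '\r' || *bits == '\t')
            continue;
        if (*bits != '0' && *bits != '1')
            return -1;
        value = (value << 1) | (unsigned)(*bits - '0');
        if (++count == 8) {
            if (n + 1 >= out_size)
                return -1;      /* no room for the byte plus terminator */
            out[n++] = (char)value;
            value = 0;
            count = 0;
        }
    }
    if (count != 0)
        return -1;              /* leftover bits: not a whole byte */
    out[n] = '\0';
    return (int)n;
}
```

Feeding it the two lines of “binary” from the kid’s post yields the same 15-character format command.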

This will NOT work.

So, what will work then?

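There’s an older format, called the “.COM” format, that does still run in Windows (XP tested; it relies on the 16-bit DOS subsystem, which 64-bit editions of Windows do not include).  A COM file (http://en.wikipedia.org/wiki/COM_file) is far less complex: it contains no header information and no relocations, so the loader simply copies the whole file into a single segment and starts executing at the first byte.

So it looks to me like you CAN use a .COM file in this way.  The next step is to find some executable code, made up only of characters you can type, to place in this .com file.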

To do this, I did a quick Google search for “printable shellcode” and came back with a whole slew of stuff.  I chose this one (I got it here: http://r00tsecurity.org/forums/topic/12019-16-bit-printable-shellcode-hello-world/):

X5))%@IP5YI5Y@5P!%PAP[55!5e 5O!54(P^)7CC)7SZBBXPSRABCABCABCABCABCABCABCABCABCZ[XH+H*hello world!$

As you might suspect from the final string, this is simply a “hello world” program; in printable ASCII!!

So, all you have to do is copy the above code, paste it into a text file, and rename the .txt extension to .com and ‘ta-da’ instant executable binary.

Nice try http://cyberfreax.in LOL


Written by ejes

November 17, 2011 at 1:30 pm

pSearch Source!!


So I had some trouble getting my source put onto WordPress.  I can understand their point: they don’t want to host .zip, .tar or any other archive container formats.

In the interest of brevity, I decided to just use a free file host.  I chose MediaFire; it was the top result on Google when I checked.

http://www.mediafire.com/?u01bf1nwbemata9

That is where you can find the historical archives of my search development.  I did my best to ensure that it could be compiled on Windows (32-bit XP via MinGW) or Linux (Ubuntu 64-bit server), and sometimes OpenBSD.  You’ll likely need sqlite (http://www.sqlite.org/) and libcurl (http://curl.haxx.se/), and you’ll probably need the pcre libraries as well (http://www.pcre.org/).  If you try to compile something that looks like it should work but doesn’t, let me know and I’ll see if there are any libraries that I might be missing, or at least I can let you know if it SHOULD compile.

All the above source is released under the original BSD Licence.

Copyright (c) 2010-, Evan Stawnyczy (ejes consulting) ejes@torfree.net
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. All advertising materials mentioning features or use of this software
   must display the following acknowledgement:
   This product includes software developed by ejes consulting.
4. Neither the names ejes consulting, Evan Stawnyczy nor the
   names of its additional contributors may be used to endorse or promote
   products derived from this software without specific prior written
   permission.

THIS SOFTWARE IS PROVIDED BY EVAN STAWNYCZY AND EJES CONSULTING ''AS IS''
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

As of right now, I’m working on a complete rewrite.

I’m hoping this rewrite will serve as good working reference code.  It will be clear, easy to understand and most likely SLOW.  I am now using ODBC instead of limiting users to sqlite, which will hopefully also allow enterprises to adopt it without too much trouble.  I decided to scrap the web-server aspect; hopefully someone will want to bundle search and a web server for home use.  Especially with the wide adoption of IPv6, a single workstation could easily share information with others, all indexed through a private search network.  At the very least, someone should write nginx (http://nginx.org/), lighttpd (http://www.lighttpd.net/) and Apache (http://www.apache.org/) modules that index static and cached content and publish the search results.

That leads me to trying to build this as an IP-agnostic application: I want it to run on both IPv4 and IPv6 networks.  Of course, this has its own challenges as well.  I’m trying to maintain ANSI compliance wherever I possibly can so that the code is easily portable, mostly so that it can run on Windows or Unix without too much trouble.
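As a rough idea of what “IP-agnostic” can look like in code, here is a sketch that lets getaddrinfo decide whether an address is IPv4 or IPv6 instead of hard-coding either.  Note the assumptions: this uses the POSIX sockets API, which is not part of ANSI C, and on Windows the Winsock equivalent needs different headers and initialization.  The function name address_family_of is made up for this example and is not taken from the pSearch source.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>

/* Returns AF_INET or AF_INET6 for a numeric address string, or -1 if
 * the string is not a numeric IPv4/IPv6 address.
 * address_family_of is a hypothetical helper name. */
int address_family_of(const char *numeric_host)
{
    struct addrinfo hints;
    struct addrinfo *result = NULL;
    int family;

    if (numeric_host == NULL)
        return -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;       /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;   /* parse only, never query DNS */

    if (getaddrinfo(numeric_host, NULL, &hints, &result) != 0)
        return -1;

    family = result->ai_family;
    freeaddrinfo(result);
    return family;
}
```

The same AF_UNSPEC hint, without AI_NUMERICHOST, is what would let a peer connect to another peer by name over whichever protocol is available.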

Written by ejes

November 11, 2011 at 2:54 pm

pSearch – a peer to peer, distributed search engine


Foreword

So I haven’t been posting very much for the last while, and this is mainly because I’ve been very busy.

I always have several projects on the go, and I don’t have enough time to devote to all of these things at once, so usually the least interesting project gets placed on the back burner.

That is what happened to this blog.

Now I’ve spent a great deal of time on this, and have produced some very good design documents as well as a bunch of source code.  So, without further ado:

This is my distributed, peer-to-peer search engine.

Attached to this post you’ll find a couple of architecture documents: a PDF with a visual diagram of how this engine is supposed to work, and another PDF with a long-winded, half-written description of why and how I expect this to run conceptually.

I’m not a writer, and am mostly a technical person; however, I am actively updating and modifying this project, so expect updates as it goes.

The first document is the “pSearch – Document”.

In this document I attempt to explain the strategy and reasons for this project, and what I hope it will accomplish.  This document is incomplete, but I encourage you to read it anyway.

The second document is the “pSearch – Drawing”.

In this document I have detailed the major aspects of the distributed search.  Hopefully it’s easy to follow; I don’t expect this diagram to change very much.

And I have a LOT of source code that I still have to organize; much of it will be posted here, though some of it is too embarrassing.

Summary

So, without delving into my documentation in too much detail (I posted the documents above, feel free to read them), a simplified “how does this work” seems appropriate.

Each peer will accept connections from the internet.  Each search request is forwarded to other peers as defined in its database.

While this happens, the peer also uses a second task to search its own internal database.  On a private home machine, the internal crawler keeps a small collection of sites and keywords based on several configurable data collection points (such as your browser cache or installed programs), which would automatically include a lot of data specific to you.  A public internet site would index its own pages (this isn’t mandatory, but preferred).

After that, it’s a simple case of matching the keywords and publishing the results to the connected client.

Peers who respond quickly, and with a lot of results are flagged as “experts” when it comes to this set of terms.  This way, when you search for a similar set of terms again, the “expert” peers will be consulted first.

This way, common search terms will be responded to by clients who have a lot of information on these terms.  For example, a site that indexes movies (like IMDb) would respond with a lot of results for movie titles and information about films, but would probably have very little to respond with when a query is specifically about cars.
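To give a feel for the “expert” idea, here is a small sketch of ranking peers for one set of terms.  Everything in it is an assumption for illustration: the peer_stats struct, the function names and especially the scoring rule (results per millisecond) are mine, not the rule used in the pSearch design documents.

```c
#include <stdlib.h>

/* Hypothetical per-peer statistics for one set of search terms. */
struct peer_stats {
    const char *name;        /* peer identifier (illustrative only) */
    unsigned response_ms;    /* how long the peer took to answer */
    unsigned result_count;   /* how many results it returned */
};

/* Assumed scoring rule: results per millisecond, so more results and
 * faster answers both raise the score. The +1 avoids dividing by 0. */
static double expert_score(const struct peer_stats *p)
{
    return (double)p->result_count / ((double)p->response_ms + 1.0);
}

/* qsort comparator: highest score first. */
static int by_score_desc(const void *a, const void *b)
{
    double sa = expert_score((const struct peer_stats *)a);
    double sb = expert_score((const struct peer_stats *)b);

    return (sa < sb) - (sa > sb);
}

/* Reorders peers so the likeliest "experts" are consulted first. */
void rank_peers(struct peer_stats *peers, size_t count)
{
    qsort(peers, count, sizeof peers[0], by_score_desc);
}
```

With a rule like this, a movie site that answers a film query quickly with dozens of hits sorts ahead of a peer that returns only a couple of results, or one that takes a long time to answer.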

Expect more as I develop more.  I encourage anyone to read and comment about my designs.

Written by ejes

November 7, 2011 at 11:43 am