Detective Story

So the other day was exciting, something I will always remember: becoming a detective. What, you say? Well, I will try to keep this mostly anonymous as I think that’s more appropriate given the circumstances. A close friend of mine created a website for buying and selling cars in a foreign country where his brother lives. All was going well and the website was fine; I just helped my friend as system admin for the server, maintaining Apache, MySQL, Postfix etc. The website was fairly busy.

But over time fake accounts were created to post fake ads selling cars. Most, in fact all, were very obvious scams, but one of these ads was created 5 times on the website by 5 different users with 5 different emails, and each listing was exactly the same. Long and short of it: one person fell for this listing and ended up wiring ~£8000 to the scammer. So the victim contacted the local police, who contacted us to try and find as many details as possible on the scammer, like emails, IPs etc.

As the scammer was trying to get more money out of the victim, the police suggested playing along with the scammer so we could catch him in the act. My job was to find out as much information as I could. And by god, the amount of information you can figure out about someone based off server logs is scary if you’re clever.

So we cross-referenced the times at which the scammer signed up to the website, to get his IP and email each time, and checked the POST data to be 100% sure we had the right IP. There were 5 instances we had to go through. 3 of the IPs originated from exactly the same place, a major city, and the same ISP; the other 2 were in a more local rural town. And if you pay more attention to the dates, the first 2 sign-ups by this scammer originated from this city and occurred around June, then nothing until October or so, when the next 3 sign-ups occurred, so this could give rise to the idea that this person moved house. We know it’s the same scammer each time because the listing is exactly the same each time and the traces on the IPs go to the same places each time.
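
A minimal sketch of that kind of cross-referencing in C, assuming the access log is in Apache’s Common Log Format. The actual log layout and sign-up URL are not given here, so the `/signup` path, the `log_entry` struct and both function names are hypothetical and only there for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One request pulled out of an access-log line (hypothetical layout). */
typedef struct log_entry {
  char ip[46];         /* room for a textual IPv6 address */
  char timestamp[32];  /* e.g. 10/Oct/2010:13:55:36 +0000 */
  char method[8];      /* GET, POST, ... */
  char path[256];      /* requested URL path */
} log_entry;

/* Parse one line of Common Log Format:
     host ident user [time] "METHOD path protocol" status bytes
   Returns 1 when all four fields were read, 0 otherwise. */
static int parse_log_line (const char *line, log_entry *out)
{
  assert (line != NULL && out != NULL);
  return sscanf (line, "%45s %*s %*s [%31[^]]] \"%7s %255s",
                 out->ip, out->timestamp, out->method, out->path) == 4;
}

/* True when the entry is a POST to the given sign-up path. */
static int is_signup_post (const log_entry *e, const char *signup_path)
{
  assert (e != NULL && signup_path != NULL);
  return strcmp (e->method, "POST") == 0
         && strcmp (e->path, signup_path) == 0;
}
```

Feeding every line of the log through parse_log_line and keeping only the entries where is_signup_post is true leaves a short list of IPs and timestamps to compare against the account creation times stored by the website.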

So yes, this was an interesting day, and it’s very, very interesting what you can figure out about people from their IP.

 

Update #1

So I plan to try and post an update about me every 2 weeks on my blog, to get more routine into my life. Over the last few years I realized I was very depressed and frustrated. At university I would go in and do the mathematics side of my degree and go to my Computer Science classes, hoping to learn enough from these to understand what the updates in my Debian system meant, like changes to libc for security reasons. Then once 2nd year hit, it became very obvious that university wasn’t going to teach me the skills I wanted or hoped for. So I took things into my own hands and threw myself into ~4 years of solid learning on my own, and university kind of took a back seat.

I have no regrets about the choices I made. Doing what I have done has opened so many more doors for me than anything else in my life, including university. These days so many students go to university, especially for science-related subjects, and don’t get a related job from it in the end. I can say that I know I am good at programming, I know I love it, and I know I want to do this for the rest of my life as a career; not many people at my age can say anything like that about themselves. I never wanted to blame university for not getting where I wanted to be in life. If it wasn’t for me taking so much time to learn on my own, I wouldn’t have become a decent GCC hacker with server access etc., mentored by Ian Taylor, a well-known free software hacker. It took me until June this year to break a lot of the depression I had in my life and to accept my failing at university. I felt like I had let down my whole family so much when I failed a module. It was when I tried the absolute hardest I possibly could at an applied math module and it didn’t pay off, and there didn’t seem to be anyone to help or to talk to about it. That was in my 2nd year, and that was when I really pushed myself into open source and free software.

It feels good to just talk about it, and it doesn’t make me feel sad anymore. I am back at university this year. I should have been finished by now, but I failed all my modules last year, which really took the biscuit if I am honest. I realised even more that there is no chance students can learn anything useful from university, well, at least from my university if you compare it with what others do. But yeah, the long and short of it is I only need to pass 4 modules this year to finish my degree. It’s hard for me but I will do it. There is so much I want to do this year, so I am pushing myself into a lot of work to keep my mind active and myself in a routine.

 

It scares me how much power your mind has over your body; having got over all my depression I feel so much better in everyday life. It’s like I can see again. But anyway, enough of me blabbing on: I have lots of gccpy updates to post, so I will do that over the coming weeks.

GccPy – Dynamic Typing

Hey guys, time for an overview of Gccpy:

Gccpy is a Python front-end to GCC from last year’s GSoC (2010). I was mentored by Ian Lance Taylor, who continues to be a great inspiration to me with his work on gccgo and his help to the community of users and developers on gcc-help.

The project aims to create an AOT (ahead-of-time) compiled version of Python, using GCC as a framework for middle-end and back-end optimization as well as portable code generation. AOT compilation has generally been aimed at more ‘low-level’ languages such as C/C++/Fortran, where the language requires static typing and other kinds of declarative features, which leaves far fewer of the dynamic features that languages like Python/PHP/Perl take for granted. The reason these more ‘high-level’ languages are able to do such things is that they are generally implemented as interpreted languages, which allows much of this dynamic logic to take place at runtime when a program is passed through their respective interpreters. Gccpy tackles this by generating code which allows for an efficient runtime as well as the dynamic features required.

Personally I feel this is a very strong and important style of implementing dynamic languages, one which needs to be shown and proven to be an effective way of implementing new and up-and-coming languages, and to show why GCC is the right platform to do so.

So OK, this was the overview from my GSoC application, but I think it’s about the best I can word it. Now let’s discuss how this actually works: over several blog posts I will demonstrate some of the basic ideas which let us compile Python. As the title of this post suggests, I will quickly demonstrate the ideas behind dynamic typing.

Let’s look at how easily we can generate code for some C, for example:

    int foobar (void)
    {
      int x = 1, y = 2;
      return x + y;
    }

We could generate an IR of:

    int foobar (void)
    {
      int x, y, T.1;

      x = 1;
      y = 2;

      T.1 = x + y;

      return T.1;
    }

And the i386 code of:

    .globl foobar
    foobar:
      subl $12, %esp          # get some stack space
      movl $1, (%esp)         # x
      movl $2, 4(%esp)        # y
      movl (%esp), %eax       # setting up a very high-level/*un-optimized* addition
      addl 4(%esp), %eax      # x + y
      movl %eax, 8(%esp)      # T.1
      movl 8(%esp), %eax      # the return
      addl $12, %esp          # fix the stack
      ret

This is where some interpreters/runtimes start to become much more like a ‘virtual machine’, like Java: they implement their language by having a runtime which runs code in a virtual instruction set. When they parse their language with a front-end they generate this virtual instruction set for the given program, but then they ‘compile/assemble’ it to bytecode, which is akin to how in C we assemble the target code to object code before linking it into an executable format. But really the bytecode is nothing more than a binary form of the instruction set, designed to make executing that instruction set efficient.
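
To make the bytecode idea concrete, here is a minimal sketch of a stack-based virtual machine in C. The instruction set (OP_PUSH, OP_ADD, OP_RET), the vm_run function and the byte encoding are all made up for illustration; they are not taken from Java, CPython or any real runtime.

```c
#include <assert.h>
#include <stddef.h>

/* A made-up instruction set, just large enough for `return 1 + 2`. */
enum { OP_PUSH, OP_ADD, OP_RET };

/* Run bytecode on a small operand stack.
   Returns the value on top of the stack when OP_RET is reached. */
static int vm_run (const unsigned char *code, size_t len)
{
  int stack[16];
  size_t sp = 0;   /* number of values on the operand stack */
  size_t pc = 0;   /* index of the next byte to decode */

  while (pc < len) {
    switch (code[pc++]) {
    case OP_PUSH:
      assert (pc < len && sp < 16);
      stack[sp++] = code[pc++];   /* the operand byte follows the opcode */
      break;
    case OP_ADD:
      assert (sp >= 2);
      stack[sp - 2] += stack[sp - 1];
      sp--;
      break;
    case OP_RET:
      assert (sp >= 1);
      return stack[sp - 1];
    default:
      assert (!"unknown opcode");
      return -1;
    }
  }
  return 0;   /* fell off the end without an OP_RET */
}

/* foobar () from the C example, hand-assembled into the made-up bytecode. */
static const unsigned char foobar_code[] = {
  OP_PUSH, 1, OP_PUSH, 2, OP_ADD, OP_RET
};
```

Calling vm_run (foobar_code, sizeof foobar_code) walks the six bytes and hands back the sum. The array really is just a binary form of the three instructions, which is the point made above.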

An example of why many people believe generating efficient code for dynamic languages can be difficult:

    def foo (x):
      x.append (1)
      return x + [ 1, 2, 3 ]

So what happens here? From an abstract point of view you can’t assume anything about this code, because of dynamic typing, compared to something like:

    def List foo (List x):
      x.append (1)
      return x + [ 1, 2, 3 ]

Having type specifiers instantly makes this code much more declarative and allows many more assumptions to be made, which in turn gives a compiler more ‘hints’ on what it can do to generate code. Of course the example above is in a hypothetical language, just to demonstrate the idea. So to implement dynamic typing we have to analyse what it actually is.

Let’s take a normal Python session:

    >>> x = 1
    >>> x = "string"
    >>> y = x
    >>> x = 2

So here is what is actually happening, line by line, showing which data each identifier is assigned to:

    >>> x = 1                 # x = 1        | y = NULL
    >>> x = "string"          # x = "string" | y = NULL
    >>> y = x                 # x = "string" | y = "string"
    >>> x = 2                 # x = 2        | y = "string"

So why is this actually a problem? Traditionally, take for example code like:

    int x = 1;
    x = 1.5555;
    x = "string";

When a C compiler runs over that code it will give all manner of warnings about type conversion and invalid types being assigned. But why is this? Since a compiler wants to generate efficient code, it reserves exactly the space an integer needs on the stack, which on a 32-bit i386 processor is usually 32 bits, and it would use subl $4, %esp to make space on the stack for that integer. The problem arises if we then want to store data which is larger than the space allocated for the initial data: you get overflow and corruption of data.

So how can you combat that to make dynamic typing work? The approach I have taken for gccpy takes much inspiration from how object orientation works. Every piece of static data given in a program is wrapped in a gpy_object_t structure at runtime, so in turn every type in gccpy is implemented via the gpy_object_t type. For example, the previous Python session could be represented in GIMPLE via something like:

    gpy_object_t * x = fold_integer (1)
    incr_ref_count (x)

    decr_ref_count (x)
    x = fold_string ("string")
    incr_ref_count (x)

    gpy_object_t * y = x
    incr_ref_count (y)

    decr_ref_count (x)
    x = fold_integer (2)
    incr_ref_count (x)
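
What such calls might do at runtime can be sketched in plain C. This is only an illustration under loud assumptions: the `sk_` names, the cut-down `sk_object` struct and the `sk_live_objects` counter are hypothetical stand-ins, not gccpy’s real runtime, and only integers and strings are handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down object; gccpy's real gpy_object_t is richer. */
typedef enum { SK_INTEGER, SK_STRING } sk_kind;

typedef struct sk_object {
  sk_kind kind;
  long ref_count;
  union {
    long integer;
    char *string;
  } data;
} sk_object;

static long sk_live_objects = 0;   /* bookkeeping for the demonstration */

/* Allocate an empty object that nothing refers to yet. */
static sk_object *sk_new (sk_kind kind)
{
  sk_object *obj = malloc (sizeof *obj);
  if (obj == NULL)
    abort ();
  obj->kind = kind;
  obj->ref_count = 0;
  sk_live_objects++;
  return obj;
}

/* Box an integer literal on the heap. */
static sk_object *sk_fold_integer (long value)
{
  sk_object *obj = sk_new (SK_INTEGER);
  obj->data.integer = value;
  return obj;
}

/* Box a copy of a string literal on the heap. */
static sk_object *sk_fold_string (const char *value)
{
  sk_object *obj = sk_new (SK_STRING);
  obj->data.string = malloc (strlen (value) + 1);
  if (obj->data.string == NULL)
    abort ();
  strcpy (obj->data.string, value);
  return obj;
}

static void sk_incr_ref_count (sk_object *obj)
{
  assert (obj != NULL);
  obj->ref_count++;
}

/* Drop one reference; free the object once nothing points to it. */
static void sk_decr_ref_count (sk_object *obj)
{
  assert (obj != NULL && obj->ref_count > 0);
  if (--obj->ref_count == 0) {
    if (obj->kind == SK_STRING)
      free (obj->data.string);
    free (obj);
    sk_live_objects--;
  }
}

/* Replays x = 1; x = "string"; y = x; x = 2.
   Returns how many objects are alive once the session has run. */
static long sk_replay_session (void)
{
  sk_object *x, *y;
  long live;

  x = sk_fold_integer (1);
  sk_incr_ref_count (x);

  sk_decr_ref_count (x);            /* the integer 1 is freed here */
  x = sk_fold_string ("string");
  sk_incr_ref_count (x);

  y = x;                            /* same object, one more reference */
  sk_incr_ref_count (y);

  sk_decr_ref_count (x);            /* "string" survives: y still holds it */
  x = sk_fold_integer (2);
  sk_incr_ref_count (x);

  assert (x->kind == SK_INTEGER && y->kind == SK_STRING);
  live = sk_live_objects;           /* the integer 2 and the string */

  sk_decr_ref_count (x);            /* end of session: release both names */
  sk_decr_ref_count (y);
  return live;
}
```

The identifiers x and y are always the same size, one pointer each, no matter what they refer to, so rebinding a name never overflows its storage.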

The basic idea of dynamic typing is that what matters is not what an identifier with a fixed type specifier holds, but what an identifier points to. So given `x = fold_integer (1)`, we should look at what the gpy_object_t structure looks like. Currently it is in many ways similar to how PyObject works in the CPython implementation, but it is a little more specific and streamlined to gccpy’s needs.

    typedef struct gpy_object_t {
      enum GPY_OBJECT_T T;
      union {
        gpy_object_state_t * object_state;
        struct gpy_callable__t * call;
        gpy_literal_t * literal;
      } o ;
    } gpy_object_t ;

This structure is open enough to be used in many areas of how gccpy works, but what we are interested in is:

    gpy_object_state_t * object_state;

This is the part which stores the static data defined, be it an integer or a class defined in the source code.

    typedef struct gpy_rr_object_state_t {
      char * obj_t_ident;
      signed long ref_count;
      void * self;
      struct gpy_typedef_t * definition;
    } gpy_object_state_t ;

This structure is what’s used to hold the object state. It holds the object type identifier as a string; the reference count for the garbage collector; a pointer to the structure in memory which is the actual data, for example an int or a FILE *; and a pointer to the object’s definition structure. Each object has its own definition, and each definition requires several hooks:

    typedef struct gpy_typedef_t {
      char * identifier;
      size_t builtin_type_size;
      gpy_object_t * (*init_hook)(struct gpy_typedef_t *, gpy_object_t **);
      void (*destroy_hook)(gpy_object_t *);
      void (*print_hook)(gpy_object_t * , FILE *, bool);
      struct gpy_number_prot_t * binary_protocol;
      struct gpy_builtin_method_def_t * methods;
    } gpy_typedef_t ;

It has the identifier; the size of the structure which holds the actual data; the initialization hook, which returns the object state when you initialize an object; the destroy hook for the garbage collector; and a print hook for the print keyword to print the data. Now things get more interesting: the binary protocol is what’s used to allow dynamic binary operators, so that things like:

    x = 2 + 1.5
    concat = "foo" + "bar"

can allow for mixed-type binary operations, by having a hook for each type of binary operation, be it addition, subtraction etc. Finally there is a table of member methods, which allows for dot accessors like:

    list.append ()
    list.index ()

As append and index are both member methods of the built-in list type.
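
To tie the last two pieces together, here is a rough sketch of how a binary-operation hook and a method table could be looked up through a type definition. Everything in it is an assumption for illustration: the `sk_` structs are a heavily cut-down stand-in for gpy_typedef_t, the payload is kept as a plain double for brevity, and the Int/Float types with an `is_integer` method are examples, not gccpy’s actual built-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sk_type;

/* Simplified stand-in for a runtime object: a type pointer plus a number. */
typedef struct sk_value {
  const struct sk_type *type;
  double num;
} sk_value;

typedef sk_value (*sk_binary_hook) (sk_value lhs, sk_value rhs);
typedef sk_value (*sk_method_hook) (sk_value self);

/* One named member method; a table of these ends with a NULL name. */
typedef struct sk_method {
  const char *name;
  sk_method_hook call;
} sk_method;

/* Cut-down type definition: identifier, one binary hook, method table. */
typedef struct sk_type {
  const char *identifier;
  sk_binary_hook add;
  const sk_method *methods;
} sk_type;

static sk_value sk_int_add (sk_value lhs, sk_value rhs);
static sk_value sk_float_add (sk_value lhs, sk_value rhs);
static sk_value sk_float_is_integer (sk_value self);

static const sk_method sk_no_methods[] = { { NULL, NULL } };
static const sk_method sk_float_methods[] = {
  { "is_integer", sk_float_is_integer },
  { NULL, NULL }
};

static const sk_type sk_int_type = { "Int", sk_int_add, sk_no_methods };
static const sk_type sk_float_type = { "Float", sk_float_add, sk_float_methods };

static sk_value sk_make (const sk_type *type, double num)
{
  sk_value v;
  v.type = type;
  v.num = num;
  return v;
}

/* Float addition: the result is always a Float. */
static sk_value sk_float_add (sk_value lhs, sk_value rhs)
{
  return sk_make (&sk_float_type, lhs.num + rhs.num);
}

/* Int addition: stays an Int unless the right operand is a Float. */
static sk_value sk_int_add (sk_value lhs, sk_value rhs)
{
  if (rhs.type == &sk_float_type)
    return sk_float_add (lhs, rhs);
  return sk_make (&sk_int_type, lhs.num + rhs.num);
}

/* Member method: 1 if the float has no fractional part, else 0.
   Assumes the value is small enough to fit in a long. */
static sk_value sk_float_is_integer (sk_value self)
{
  return sk_make (&sk_int_type, self.num == (double) (long) self.num);
}

/* `lhs + rhs` dispatches through the left operand's type definition. */
static sk_value sk_add (sk_value lhs, sk_value rhs)
{
  assert (lhs.type != NULL && lhs.type->add != NULL);
  return lhs.type->add (lhs, rhs);
}

/* Look a member method up by name; NULL when the type does not have it. */
static sk_method_hook sk_find_method (const sk_type *type, const char *name)
{
  const sk_method *m;
  assert (type != NULL && name != NULL);
  for (m = type->methods; m->name != NULL; m++)
    if (strcmp (m->name, name) == 0)
      return m->call;
  return NULL;
}
```

With this, sk_add on an Int 2 and a Float 1.5 ends up in the Float hook, and a dot access such as x.is_integer () becomes a call to sk_find_method followed by a call through the returned pointer. Adding a new type means filling in one more definition, without touching the code that performs the dispatch.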