Hey guys, I’ve been pretty quiet this summer, or so it may seem, but I’ve been working frigging hard, and things are starting to come together! So what am I doing? I’ve been doing a little freelance server work, and I’ve been working on CRULES, which is coming along nicely: a lot of the back end works pretty well now and I just need to keep at it. I am also trying to start a new front end for GCC. So what does that mean? GCC, the GNU Compiler Collection, is a collection of compilers (I am writing a small paper on the architecture of GCC and how to build a new front end, but more on that later). Each language in GCC, be it Ada, Fortran, C, C++ or Java, is a front end to GCC: the part that parses the plain-text source code and does the semantic analysis, producing the tree representation that is lowered to GIMPLE for middle-end optimization and, from there, code generation.
So I am making a Python compiler (front end). Well, starting to: it’s going quite slowly because I am in the middle of finishing up some automake work and the CRULES definitions. GCC is quite complex in its development, but it’s the most amazing project I have ever seen. Its source consists of many little files written in different languages that aren’t even front ends in GCC. For example:
A .md machine description is written in a Lisp-like syntax and describes the target processor, which makes it easier to port GCC to new systems.
GIMPLE, the IR, grew out of an IR called SIMPLE, so GNU + SIMPLE = GIMPLE; this gives the IR a better formalization.
lang.opt files declare the command-line options, so you don’t have to do any command-line parsing in your front end; that’s up to the gcc driver program (I’ll explain this some other time).
Make-lang.in is GNU Make and says how your code should be compiled.
config-lang.in is a shell fragment that does all the configuration of how the front end is built with respect to stage-x of the bootstrap, whether it requires libraries to be built (as Java needs libjava), and whether it’s an optional language.
That’s a lot of stuff people probably don’t care about!
But it’s interesting: what other project uses as cool a setup as that? I mean Lisp: there isn’t even a Lisp compiler in there, but there is a bit of one in the back end. There are some exciting things coming from GCC at the moment; LTO and Graphite spring to mind. They aren’t ready yet. LTO means link-time optimization. When you compile your program with gcc, in whatever language, you compile each file to object code individually, and you use gcc to link the object files and libraries together instead of calling ld explicitly, because it works nicer. With link-time optimization, while all the code is being linked together, another pass of the optimizers goes over one huge IR of the whole program to restructure it, remove dead code, and do all the usual optimizations.
And with OpenMP becoming pretty popular, Graphite lets you pass flags like -floop-parallelize-all to pick out loops, even loops with dependences, and parallelize them the way OpenMP pragmas would, giving you parallelized code without having to lift a finger.
I’m working quite slowly on writing 3 papers in total: a long thesis on CRULES, which you can check out in my git repos over at http://code.redbrain.co.uk, a thesis on the Python front end, and a little paper on the GCC architecture. There is a tagged branch with a front-end skeleton here in my git repos: http://code.redbrain.co.uk/cgit.cgi/gcc-python/
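To make the parallelization point a bit more concrete, here is a minimal sketch of what the hand-written OpenMP version of a simple loop looks like, which is the kind of annotation automatic parallelization is meant to spare you. The function name `scale_array` is made up for illustration; it is not taken from GCC or Graphite.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: multiply every element of `a` by `k`.
 * Each iteration touches a different element, so the iterations
 * are independent and safe to run in parallel. */
void scale_array(double *a, int n, double k)
{
    assert(a != NULL || n == 0);

    /* With -fopenmp the iterations are shared out across threads.
     * Without it the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        a[i] *= k;
    }
}
```

Built with `gcc -fopenmp` the loop runs on several threads; built without it you get the same result from a plain serial loop, which makes this an easy thing to experiment with.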
So there is a lot of work there. I chose to do a Python front end because everyone can say Python is an awesome language, and no one can deny that it has been very well designed. Language design is something a lot of people either overlook or take too far, and Python is a good mix in my opinion. What I mean is that if you spend too long on language design, your language becomes very messy and hard for users to get comfortable with, something that has happened to Haskell in my opinion, although the way Haskell implements non-determinism is very clever. On the other hand, if you don’t spend enough time designing your language, it is going to be very messy too, with not much re-use of syntax. One thing to remember in design: there is always a reason for the smallest of things, be it how an identifier is written, using ‘.’ for member access or ‘->’ for pointer access; keywords have to make sense, and so on.
I was invited to join the Graphite project after I posted a quick patch to fix part of their build system, but I am not sure what I can really contribute, because the project has been going for some time now and I would need to work on it full-time. Some of the work is pretty exciting. Later I am going to see if I can maybe merge the actual Python parser from Python into my GCC front end, because this would make a very nice setup for mirroring the language features of Python, to an extent. There will be differences and GNU-isms, though, from the mere fact that this will be compiled: handling things like IO will rely on linking against libraries, and Python’s import machinery (import, from and so on) will be fiddly to get working. But I should be able to take a lot from the GNU Java (GCJ) front end, as there are similarities in how certain things work.
Anyways, from my googling I haven’t been able to find an implementation of Python that is compiled properly, I mean natively compiled. Some people mix up the fact that Python creates byte-code, like Java does, and think that is compiled code, and it really isn’t. I remember a lecturer at university telling us that by compiling Java code you were creating machine code to run natively on a computer, and he got the whole pipeline for creating a binary so wrong. They always made it seem that a compiler was a piece of magic that made a binary, and Java is NOT a natively compiled language unless you use GCJ. Personally I find linkers are the piece of magic these days; a lot of the know-how in how linkers are built is quite complicated in some ways. Although what you are essentially doing is taking the object code output from an assembler and creating an ELF file with the code and the data each in the correct segments. On Windows it’s PE, which is based on COFF, I think. But even at that, what are you actually doing, and what does object code look like? Is it literally the binary encoding of the instructions, in order, from the corresponding assembly code?!
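One small thing that can be checked by hand: on Linux the object file the assembler produces is already an ELF file (a relocatable one), so the encoded instructions sit inside a section of it, alongside a header, a symbol table and relocation entries. Every ELF file starts with the same four magic bytes. Here is a minimal sketch that tests for them; the names `has_elf_magic` and `file_is_elf` are made up for this post, and it only looks at the magic number rather than parsing the rest of the header.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The four bytes every ELF file starts with: 0x7f 'E' 'L' 'F'. */
static const unsigned char ELF_MAGIC[4] = { 0x7f, 'E', 'L', 'F' };

/* Returns 1 if the buffer starts with the ELF magic number, else 0. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= sizeof ELF_MAGIC &&
           memcmp(buf, ELF_MAGIC, sizeof ELF_MAGIC) == 0;
}

/* Returns 1 if `path` is an ELF file, 0 if it is not,
 * and -1 if the file cannot be opened. */
int file_is_elf(const char *path)
{
    unsigned char head[4];
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    size_t got = fread(head, 1, sizeof head, f);
    fclose(f);
    return has_elf_magic(head, got);
}
```

Pointing `file_is_elf` at a `.o` file and at the linked executable on a Linux box should give the same answer for both, which is a nice reminder that the linker’s input and output share one container format.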
I would love to understand more about how that all works. I mean, an object code file is a binary file, not plain text, so yeah. I even posted a thread to comp.compilers on managing the JIT (you can look for it in the archives), but it’s something that is still quite hidden, and only the few people who have implemented one really understand it. I partly see how it all works because of the libjit project, and the definitions of a.out and ELF were quite understandable, I guess. But with a JIT, say you are able to JIT a function like:
int multiply( int x, int y )
{
    return x*y;
}
If you are able to generate target assembly code for this, and you assemble and link it properly, you have a binary. But how do you actually execute it in a way that is useful for an interpreter?! Do you keep adding new functions to this image, treat the symbols like a dynamic library, and execute them through dlopen? This is what the LLVM project hides from its users very well. But anyways, enough of that.
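For what it’s worth, one common answer that avoids dlopen entirely is to copy the generated machine code into a chunk of memory, mark that memory executable, and call it through a function pointer. Here is a minimal sketch of that idea for the multiply function above. It is an assumption-heavy toy and not a description of how libjit or LLVM do it: the bytes are hand-assembled for x86-64 with the System V calling convention, it is only enabled on Linux, and `jit_multiply` is a made-up name.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__linux__) && defined(__x86_64__)
#include <sys/mman.h>
#endif

typedef int (*binary_fn)(int, int);

/* Hypothetical helper: "JIT" multiply(x, y) by copying pre-assembled
 * machine code into executable memory and calling it.
 * Returns 0 and stores the product in *out on success, -1 otherwise
 * (including on platforms this sketch does not support). */
int jit_multiply(int x, int y, int *out)
{
    assert(out != NULL);
#if defined(__linux__) && defined(__x86_64__)
    static const unsigned char code[] = {
        0x89, 0xf8,        /* mov  eax, edi  : eax = x (1st argument) */
        0x0f, 0xaf, 0xc6,  /* imul eax, esi  : eax *= y (2nd argument) */
        0xc3               /* ret            : result is in eax */
    };

    /* Get a writable page, fill it, then flip it to executable. */
    void *mem = mmap(NULL, sizeof code, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return -1;
    }
    memcpy(mem, code, sizeof code);
    if (mprotect(mem, sizeof code, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, sizeof code);
        return -1;
    }

    /* Copy the address into a function pointer and call it. */
    binary_fn fn;
    memcpy(&fn, &mem, sizeof fn);
    *out = fn(x, y);

    munmap(mem, sizeof code);
    return 0;
#else
    (void)x;
    (void)y;
    return -1;
#endif
}
```

An interpreter built this way would keep the memory around and store the function pointer in its own symbol table instead of unmapping straight away, so no linker or dlopen call is involved at run time.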
Anyways, that’s what I’ve been doing for the summer. What have you been up to?!
You can follow some of my stuff on several mailing lists: bison, autoconf, automake, gcc-help, gcc and comp.compilers. I helped a guy a while back with his bison parser. Bison is a very useful tool, but I’ve found it is a pain for more free-form grammars, and I like to think I know YACC pretty well now after using it for about a year.
By the way, I’ll be posting wiki links on CRULES soon, covering how the language works and why. I would really like it if people read my thesis when it’s finished, because I designed this language for a very specific set of reasons and a specific application in the beginning, but its design has now become fully multi-paradigm, and that is what my thesis describes.
Anyways, I’m going back to get some work done, and I have BLUG tomorrow!
Oh hey, I found a new favourite band and bought their album as soon as I heard this. It’s a Polish metal band, and they are probably the new Nightwish, I guess! They are very good!
I think that’s enough for now. I have some photos to upload, but I’ll do that next time.










