First steps to Reverse Engineering an Executable

Reverse engineering has always been an obsession of mine. As a child, I used to go around garage sales, looking for old electronics for the sole purpose of opening them up to mess around with the insides. There is just something gratifying about opening up a closed system to see how it works, where it cuts corners, how it could be modified to work better. Software is no different. This is the main reason I love open source: the code is available, and for someone like me, who has an affinity for finding (read: causing) bugs, available source makes it easy to find areas which need fixing.

But nothing easy is ever fun.

I have developed a strong interest in reverse engineering projects such as: wine, OpenMW, Freeablo, OpenTTD, OpenRCT2, Mono, Monogame, limetext and many others. It’s the sort of thrill that says: Anything you can do I can do better and make it free.

Personally, I’ve been interested in reverse engineering 3DO’s 1999 classic Heroes of Might and Magic 3, even though it is set for re-release by Ubisoft in 2015. The plan is to use LÖVE and take advantage of its high portability and simple 2D game API.

The first step I took was reversing the map file format. This is a tedious process, as I could not find any documentation about it, unlike the abundance of documentation you will find for the Elder Scrolls series. Luckily, the game comes with a map editor, and with the help of a hex editor and a tool I wrote to document the data structures, I was able to identify many parts of the data structures used. That, however, is the subject of another post.

The subject of this post is another strategy I am working on: reverse engineering the map editor’s code in order to extract the map loading logic. This means diving into the compiled code and changing the Portable Executable (PE) structure and inserting our code before the call to the main function.

The tools I used in this guide are Ollydbg, a hex editor, the mingw x86 compiler and CFF Explorer, the only non-open source software.

Adding a dll to the import table

The strategy I am using is the exact same that is used in OpenRCT2. This involves adding a custom dll for which we have full control of the functions and source code to the compiled executable’s dll import table.

First, we create a dll. Let’s call it divertedmain.dll:

__declspec(dllexport) int __cdecl DivertedMain()
{
    return 42;
}

We only want something dead simple. A main function which returns 42 won’t add more than it needs to and will return a 42 return code, letting us know that it has indeed worked.

To compile and link the dll:


i686-w64-mingw32-gcc -c -o divertedmain.o divertedmain.c

i686-w64-mingw32-gcc -o divertedmain.dll -s -shared divertedmain.o -Wl,--subsystem,windows

Now let’s fire up CFF Explorer and add our new dll with the import adder, making sure it’s in the same directory as the executable and importing by name the DivertedMain function.

Rebuild the import table and save the executable. To make sure it’s importing the dll’s exported function you can check the Import Directory in CFF Explorer. Another way would be to simply remove the divertedmain.dll file and there should be an error message when trying to load the executable.


$ wine executable-with-imported-dll.exe    
err:module:import_dll Library divertedmain.dll (which is needed by L"executable-with-imported-dll.exe") not found
err:module:LdrInitializeThunk Main exe initialization for L"executable-with-imported-dll.exe" failed, status c0000135

Overriding main() function

This part is a little more tricky and requires some trial and error.

We return 42 in the DivertedMain function. Once compiled, this number can be easily found. To see what the function will look like, we can compile the dll source to assembly. Note that gcc emits AT&T style assembly by default while Ollydbg uses Intel style assembly; the -masm=intel flag fixes this:

// i686-w64-mingw32-gcc -S -masm=intel -c -o divertedmain.s divertedmain.c
    .file    "divertedmain.c"
    .intel_syntax noprefix
    .text
    .globl    _DivertedMain
    .def    _DivertedMain;    .scl    2;    .type    32;    .endef
_DivertedMain:
    push    ebp
    mov    ebp, esp
    mov    eax, 42
    pop    ebp
    ret
    .ident    "GCC: (GNU) 4.9.2"
    .section .drectve
    .ascii " -export:\"DivertedMain\""

This snippet will be important to spot in the disassembly.

Finding exported function

Loading the modified executable in Ollydbg, we will go to the divertedmain section in the Executable modules section (Alt+E).

This section tells us that the divertedmain code is located in the 0x66B00000 - 0x66B0B000 range. Also note that the executable address space starts at 0x00400000 and that all addresses will be offset by that much. In the divertedmain range, we should find the assembly representation of our DivertedMain() function.

Indeed, in this case it is at the address 0x66B014B0:

It is easily spotted thanks to the assembly listing from earlier and the byte 2A, which is 42 in hexadecimal. Again, keep the difference between AT&T style and Intel style assembly in mind when comparing the two listings.
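To make the 2A marker concrete, here is a minimal Python sketch of how `mov eax, 42` is laid out in memory: the opcode B8 followed by the 32-bit immediate in little-endian order. The helper name is mine, purely for illustration.

```python
import struct

def encode_mov_eax_imm32(value: int) -> bytes:
    """Encode `mov eax, imm32`: opcode B8 plus a little-endian 32-bit immediate."""
    return b"\xB8" + struct.pack("<I", value)

encoded = encode_mov_eax_imm32(42)
print(encoded.hex(" ").upper())  # B8 2A 00 00 00
```

Searching the dll's code for the byte sequence B8 2A 00 00 00 is therefore another way to land on the function.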

Finding the main() function

We’re looking for the main function, which usually would take the command line parameters as arguments; it is also the function which does most of the logic, so a call which starts the whole program execution is likely to be the main function.

Using Ollydbg, we will step through the functions one instruction at a time. Setting arguments to the call can make it easy to find the function which uses them. In File|Set new arguments…, set “Look for me” as the new arguments.

Using the Step over button (F8), we will go through each call, looking at the registers for clues.

At about 0x004E7F6A we find a call to KERNEL32.GetCommandLineA. This is important to note, since this is where the argument list (i.e. argv) comes from. Indeed, at the next instruction, Ollydbg shows us the executable name and the arguments “Look for me” in EAX. This is very important since it means one of the next instructions will call main(). Some of the following calls strip the executable name from the arguments, which is the default behaviour ahead of a call to WinMain().

Around 0x004E7FAF, we start seeing a lot of PUSH instructions and a call to KERNEL32.GetModuleHandleA, leading up to a call to executable-with-imported-dll.004FBD57. This is a clue that a large number of parameters is being passed to a function. Since our DivertedMain() function doesn’t take any parameters yet, these will be lost. This, however, should not be a problem for us yet, since the parameters can be added later.

The call to executable-with-imported-dll.004FBD57 starts the application, so we know that this is the call which needs to be edited. For me, it was at address 0x004E7FBF.

Replacing the address of the call to main()

At address 0x004E7FBF, note the current binary form for the call. This will be important to call the original function later.

At that line in Ollydbg, assemble some new code by right clicking on the line and choosing “Assemble…”, or by pressing space. Replace the call address with the address of my DivertedMain() function which was found earlier to be 0x66B014B0. The disassembled view substitutes the address with CALL DivertedMain which is a good sign.
CALL DivertedMain

Note the binary form: E8 EC946166
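The binary form can be checked by hand. A near CALL is the opcode E8 followed by a 32-bit displacement measured from the end of the 5-byte instruction, stored little-endian. The sketch below (function and constant names are my own) recomputes it from the two addresses found above.

```python
import struct

CALL_REL32_OPCODE = 0xE8
CALL_LENGTH = 5  # one opcode byte plus a 4-byte displacement

def encode_call(call_site: int, target: int) -> bytes:
    """Encode a near relative CALL located at call_site that lands on target."""
    # The displacement is relative to the address of the next instruction.
    rel32 = (target - (call_site + CALL_LENGTH)) & 0xFFFFFFFF
    return bytes([CALL_REL32_OPCODE]) + struct.pack("<I", rel32)

patched = encode_call(0x004E7FBF, 0x66B014B0)
print(patched.hex(" ").upper())  # E8 EC 94 61 66
```

Because the displacement is relative, these bytes are only valid while the dll is loaded at the same base address and DivertedMain() stays at the same offset inside it.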

When stepping into it (F7), execution enters the DivertedMain() assembly and returns 42.

Saving the changes to the call to DivertedMain()

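We now have the address of the call to main() and the binary form of the call to DivertedMain() given by Ollydbg: 0x004E7FBF and E8 EC946166. As we noted earlier, the addresses of the executable are offset by 0x00400000; therefore, in a hex editor, the call is actually at the address 0x000E7FBF. This plain subtraction works for this executable; in general, a PE section’s offset in the file can differ from its virtual offset, so it is worth checking the section headers.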

Opening a hex editor, the byte at 0x000E7FBF is E8, which is the first byte of our new instruction and the x86 opcode for Call Procedure. Looking at the following 4 bytes, we can confirm that it is indeed the same call we had in Ollydbg before we edited it. Using the hex editor, we replace the five bytes at 0x000E7FBF - 0x000E7FC3 with E8 EC946166. Be careful to overwrite the bytes rather than insert them, so that the file size stays the same.
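The same patch can be scripted instead of done by hand. This is only a sketch under the assumption made in this post, namely that the file offset is the virtual address minus 0x00400000; the function names are hypothetical.

```python
def va_to_file_offset(virtual_address: int, image_base: int = 0x00400000) -> int:
    """Subtract the image base, as done in this post.

    Assumption: the section's file offset equals its virtual offset,
    which is not guaranteed for every PE file.
    """
    return virtual_address - image_base

def patch_call(image: bytearray, offset: int, new_bytes: bytes) -> None:
    """Overwrite bytes in place at offset; the image length never changes."""
    if offset + len(new_bytes) > len(image):
        raise ValueError("patch would run past the end of the file")
    if image[offset] != 0xE8:
        raise ValueError("expected a CALL (E8) opcode at offset 0x%X" % offset)
    image[offset:offset + len(new_bytes)] = new_bytes

def patch_file(src_path: str, dst_path: str, offset: int, new_bytes: bytes) -> None:
    """Read src_path, patch the call and write the result to dst_path."""
    with open(src_path, "rb") as src:
        image = bytearray(src.read())
    patch_call(image, offset, new_bytes)
    with open(dst_path, "wb") as dst:
        dst.write(image)

# Example call, using the file names and values from this post:
# patch_file("executable-with-imported-dll.exe",
#            "executable-with-divertedmain.exe",
#            va_to_file_offset(0x004E7FBF),
#            bytes.fromhex("E8EC946166"))
```

Checking for the E8 opcode before writing is a cheap guard against patching the wrong offset.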

We save it as executable-with-divertedmain.exe and run it:

$ wine executable-with-divertedmain.exe
$ echo $?
42

The program didn’t start, but instead quit with the return code 42. This is perfect and it means the main function was diverted into our dll which returns 42. There is no more need to edit the executable and we can now rely on our own code as long as DivertedMain() is the first defined function in our dll.

To demonstrate this, we can get the DivertedMain() function to print out “DIVERTED!” to the console.

#include <stdio.h>

__declspec(dllexport) int __cdecl DivertedMain()
{
    printf("DIVERTED!\n");
    return 42;
}

Recompiling the dll and running the executable should give us:

$ i686-w64-mingw32-gcc -g -c -o divertedmain.o divertedmain.c
$ i686-w64-mingw32-gcc -g -o divertedmain.dll -s -shared divertedmain.o -Wl,--subsystem,windows
$ wine executable-with-divertedmain.exe
DIVERTED!
$ echo $?
42

Calling the original main function from inside our divertedmain

Our entry point puts itself between the call to the main function and the program execution, prematurely exiting. In order to reverse engineer, we need to place ourselves between those points without actually stopping the execution. To do that, our first step will be to call the actual main function from the diverted main.

Earlier, we spotted command line parameters being pushed in the disassembled view. These parameters are those of the WinMain function which we replaced.

Before we can call the original main function, we need to get its parameters.

Getting the command line parameters

We simply add the parameters to the function signature and to make sure it works, print out the arguments:

#include <windows.h>
#include <stdio.h>

__declspec(dllexport) int __cdecl DivertedMain(
    HINSTANCE hInstance,
    HINSTANCE hPrevInstance,
    LPSTR lpCmdLine,
    int nCmdShow)
{
    printf("DIVERTED: %s\n", lpCmdLine);
    return 42;
}

Recompiling the dll and running the executable should give us:

$ i686-w64-mingw32-gcc -g -c -o divertedmain.o divertedmain.c
$ i686-w64-mingw32-gcc -g -o divertedmain.dll -s -shared divertedmain.o -Wl,--subsystem,windows
$ wine executable-with-divertedmain.exe some arguments
DIVERTED: some arguments
$ echo $?
42

The fact that the executable name was missing seemed odd to me, but that’s just the way WinMain’s lpCmdLine parameter works.

Fetching and calling the original main Function

Earlier, when we replaced the call to main, we overwrote the call target with the address of our diverted main. The original target (0x004FBD57 in my case) is the address of WinMain, and we will use it to get a pointer to the function. We then call that function.
Replace 0x00000000 with the original address:

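#include <windows.h>
#include <stdio.h>

// Address of the original WinMain, i.e. the target of the call we replaced
#define WINMAINADDR 0x00000000

// WinMain is declared WINAPI (__stdcall) in the Windows headers and returns an int
typedef int (WINAPI *WinMainFunc)(HINSTANCE, HINSTANCE, LPSTR, int);

__declspec(dllexport) int __cdecl DivertedMain(
    HINSTANCE hInstance,
    HINSTANCE hPrevInstance,
    LPSTR lpCmdLine,
    int nCmdShow)
{
    printf("Diverted: %s\n", lpCmdLine);
    // Turn the raw address into a callable function pointer
    WinMainFunc OriginalWinMain = (WinMainFunc)WINMAINADDR;
    OriginalWinMain(hInstance, hPrevInstance, lpCmdLine, nCmdShow);
    return 42;
}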

And there we go. The original WinMain function is called and we can use this technique to call any other function in the original executable, provided we know the address.
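If only the raw bytes of a call were noted down, the absolute address can be recovered from them. The sketch below decodes a 5-byte `E8 rel32` instruction; the helper name is mine, and the sample bytes are derived from the addresses quoted in this post rather than copied from the debugger.

```python
import struct

def decode_call_target(call_site: int, instruction: bytes) -> int:
    """Return the absolute address targeted by a 5-byte `E8 rel32` CALL."""
    if len(instruction) != 5 or instruction[0] != 0xE8:
        raise ValueError("not a near relative CALL")
    (rel32,) = struct.unpack("<i", instruction[1:])  # signed displacement
    # The displacement counts from the end of the 5-byte instruction.
    return (call_site + 5 + rel32) & 0xFFFFFFFF

# Derived from 0x004E7FBF (call site) and 0x004FBD57 (original target)
original_call = bytes.fromhex("E8933D0100")
print(hex(decode_call_target(0x004E7FBF, original_call)))  # 0x4fbd57
```

The value it prints is what would go into WINMAINADDR for this particular executable.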

Acknowledgments

I couldn’t have figured this out without the helpful tips of IntelOrca and the detailed articles of Ashkbiz Danehkar.
I would also like to thank my friend Sophy for proofreading.

Using Odoo’s runbot to test OCA addons

On the heels of another conference comes another blog post.

OpenDays 2014 was amazing. It was a week packed with good talks, fun code sprints and useful exchanges of ideas. On top of that, I got to meet many of the people behind the OCA in person.

One of the topics that I discussed a lot was testing. I had set up an example OCA addon repo which integrates travis testing and coveralls test coverage reporting, and it seemed to be very much sought after. Code reviews on launchpad prior to the move to github were lacking automated tests. I often found myself repeatedly checking out merge proposals and running pep8 on them. Using travis has the advantage of automating a lot of the repetitive tests, leaving useful and constructive code reviews to the code reviewers. Travis is amazing at what it does and I have no doubt using it will increase the efficiency of both code reviews and contributions. There is, however, more that could be done.

At OpenDays, I attended Olivier Dony’s talk on Using runbot to test developments. The talk was about how runbot is being used to do continuous integration on the branches and merge proposals on Odoo’s github. This adds quality control to Odoo development which was not there in the previous version. With the old runbot strategy, tests ran after commits, so more often than not bugs were caught only after they had been committed.

What makes runbot different from travis, other than being yet another automated testing server geared towards testing Odoo, is that after building a certain branch, it keeps the server running so manual tests can be performed by a code reviewer. That addition makes a world of difference to reviewing code, both for technically inclined reviewers and for more functional reviewers, who can now test whether the workflow works or the interface is ergonomic. It also saves the effort of creating a new instance of odoo, branching the pull request, merging the code and installing the addons, a process which can take up to 20 minutes.

Oh, and did I mention runbot is nothing more than a simple addon to Odoo 8.0? That’s actually quite amazing. It uses the web frontend features of 8.0 and a little bit of subprocess magic.

Runbot, while still under active development at the time of this writing, has plans to support community addons. I spent a weekend setting up an example repo like the one I did with travis. I also submitted a few fixes and improvements on runbot itself. This leads us to the question at the heart of this post.

How do I test community Odoo addons with runbot?

Odoo setup for runbot

Instead of a virtualenv, like we did in my last post, we will use a Dockerfile. For those who do not use docker, follow the lines in the Dockerfile as commands to run.

# Dockerfile
FROM ubuntu:14.04

# Set the locale
RUN locale-gen en_US.UTF-8  
ENV LANG en_US.UTF-8  
ENV LANGUAGE en_US:en  
ENV LC_ALL en_US.UTF-8  

# Install dependencies
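# git is needed for the clone below; the package lists must be refreshed first
RUN apt-get update && \
    apt-get install -y python-dev python-pip python-lxml python-ldap \
                       python-imaging postgresql \
                       postgresql-server-dev-9.3 postgresql-client \
                       postgresql-contrib-9.3 libgeoip-dev git sudo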

# Add user
RUN useradd odoo -m

# Install odoo using pip
RUN pip install GeoIP
RUN pip install http://download.gna.org/pychart/PyChart-1.39.tar.gz
RUN sudo -u odoo HOME=/home/odoo pip install https://github.com/savoirfairelinux/odoo/archive/setuptools-addons.tar.gz --user

# Get runbot
RUN apt-get install -y python-matplotlib
RUN sudo -u odoo mkdir -p /home/odoo/.local/share/OpenERP/addons
#RUN sudo -u odoo git clone https://github.com/odoo/odoo-extra.git /home/odoo/.local/share/OpenERP/addons/8.0
RUN sudo -u odoo git clone https://github.com/bwrsandman/odoo-extra.git /home/odoo/.local/share/OpenERP/addons/8.0 -b community-addons

# Prepare database
RUN /etc/init.d/postgresql start && pg_dropcluster --stop 9.3 main ; pg_createcluster --start --locale en_US.UTF-8 9.3 main
RUN /etc/init.d/postgresql start && sudo -u postgres createuser --superuser --createdb --username postgres --no-createrole -w odoo
RUN /etc/init.d/postgresql start && sudo -u postgres createdb -O odoo odoo

# Run
EXPOSE 8069
CMD /etc/init.d/postgresql start && \
    su odoo -c "/home/odoo/.local/bin/odoo.py -d odoo -i runbot"

What this does is fetch a base Ubuntu image, install odoo’s dependencies and odoo itself (note that for now this is a fork, due to a setuptools bug), fetch odoo-extra, which contains runbot (my as yet unmerged version), create a database, run odoo and expose port 8069.

To build it yourself, you can run the following in the same directory as Dockerfile:

sudo docker build -t bwrsandman/odoo-runbot-community .

You can also get the image off of docker hub:

sudo docker pull bwrsandman/odoo-runbot-community

Run the image using:

sudo docker run -t -i -p 8069:8069 bwrsandman/odoo-runbot-community

You should then be able to connect to the container at http://localhost:8069

About runbot

Runbot is an Odoo module. It runs other instances of Odoo using the subprocess python module. It will clone a bare git repository into odoo-extra/runbot/static/repos and will run its build in odoo-extra/runbot/static/build. The build is initiated by a cronjob rather than a github webhook. Logs are stored in extra/runbot/static/build/*/log
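To give an idea of what running an instance through subprocess looks like, here is a minimal sketch. It is not runbot’s actual code: the function name, the log location and the stand-in command are all assumptions made for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def start_instance(command, log_path):
    """Start a child process and send its stdout and stderr to log_path."""
    with open(log_path, "wb") as log_file:
        # The child inherits the open file, so the parent can close its copy.
        return subprocess.Popen(command, stdout=log_file, stderr=subprocess.STDOUT)

# Stand-in for an Odoo server command line; any command would do here.
command = [sys.executable, "-c", "print('instance up')"]
log_path = Path(tempfile.mkdtemp()) / "instance.log"

instance = start_instance(command, log_path)
instance.wait(timeout=60)  # a real build would be left running for reviewers
print(log_path.read_text().strip())  # instance up
```

Keeping the Popen object around is what allows a supervisor to leave the server alive after the tests and kill it later.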

Runbot does not test pep8, nor does it do coverage reporting. The tests it performs are hardcoded in the module so they cannot be altered via configuration. I will be working on a module which adds that level of configuration.

Runbot offers options regarding concurrent builds and concurrent running. If you find that your builds have stopped being scheduled, it could be that you have reached your limit of concurrent running instances.

Configuring runbot

Known Bug: If, when connecting to http://localhost:8069, you get redirected to http://localhost:8069/runbot and get a 500: Internal Server Error, it’s not as bad as it seems. This is a bug which only appears on the frontend when there are no repositories set up. The bug is tracked as #4 and #5 with a fix as #7. To work around this bug, simply log in via http://localhost:8069/web.

When you log in to the backend, you should be able to see the runbot menu.

Adding Odoo

To run community addons, you need to register odoo core so that it may be tested using the official server or ocb. We will go ahead and add odoo.

All that’s needed is the github url of odoo: https://github.com/odoo/odoo
We will also disable Auto so we don’t start testing every MP to odoo. We will, however, need to clone the repo manually by pressing Update*. This will take a while as runbot clones the whole Odoo history.

* There may be a better way of doing this, but I haven’t explored it yet. An alternative could be to use OCB and not disable Auto as OCA would want to test this as well as community addons. Another option would be to set concurrent running tests to 0. The downside of my method is that Odoo won’t be updated periodically.

After hitting save, we can go to the runbot page at http://localhost:8069/runbot to see that it’s been added. The branch table is empty due to auto being turned off.

Adding our first repos

I have made two addon repositories, one from lp:openerp-hr and one from lp:partner-contact-management. For purposes of demonstration, they have been stripped of the modules which don’t pass tests on their own.

The main difference from the odoo repository is that we leave Auto on and we specify odoo as our Fallback Repo. We therefore add a repo for https://github.com/bwrsandman/openerp-hr and for https://github.com/bwrsandman/partner-contact-management, both with odoo as the Fallback Repo.

Soon after adding them, the runbot cron should clone and start testing these repos.
Head over to http://localhost:8069/runbot to see. You may need to select a repo under Switch repository.


Bug: Logs aren’t always reachable. I haven’t looked into this.

Bug: When connecting to a build, the url given is for runbot.odoo.com. There is no way to change this via the interface; however, it may be stored in runbot.domain of ir.config_parameter. The way around it is to copy the URL and replace runbot.odoo.com with localhost:8069.
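The manual URL fix can be scripted. This is a small sketch using only the standard library; the function name and the sample build URL are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

def point_to_local_runbot(url, local_netloc="localhost:8069"):
    """Replace runbot.odoo.com in the host part of a build URL."""
    parts = urlsplit(url)
    netloc = parts.netloc.replace("runbot.odoo.com", local_netloc)
    return urlunsplit(parts._replace(netloc=netloc))

# Hypothetical build URL, only for illustration
print(point_to_local_runbot("http://runbot.odoo.com/runbot/build/1"))
# http://localhost:8069/runbot/build/1
```

Only the host part is touched, so the path and query string of the build link are preserved.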

Adding a more complicated repo

I have a repo which has modules which depend on modules found in other repos. Without those repos, the tests will fail and the runbot will not be able to serve anything useful. In a case such as this, specify these repos in Extra dependencies.

Add a new repository with https://github.com/bwrsandman/openerp-travel as the address, odoo as the fallback repo and openerp-hr and partner-contact-management as the Extra dependencies.

This repo has more branches than the others, as well as a few pull requests. Runbot will automatically use the odoo branch which corresponds to the closest matching branch of this repo.
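I have not read how runbot picks the closest branch, so the following is only one plausible heuristic and not runbot’s actual logic: take the core branch whose name is the longest prefix of the addon branch name, and fall back to a default otherwise. All names here are hypothetical.

```python
def closest_branch(addon_branch, core_branches, default="master"):
    """Pick the core branch whose name is the longest prefix of addon_branch."""
    if addon_branch in core_branches:
        return addon_branch
    candidates = [b for b in core_branches if addon_branch.startswith(b)]
    return max(candidates, key=len) if candidates else default

core = ["master", "7.0", "8.0", "saas-4"]
print(closest_branch("7.0", core))             # 7.0
print(closest_branch("8.0-travel-fix", core))  # 8.0
print(closest_branch("feature-x", core))       # master
```

Under this assumption, naming addon branches after the series they target (7.0, 8.0-something) is what makes the matching work.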

 

After letting the tests run for a while, we can see a few green tests with running Odoo instances which anyone can test given proper network rights.

Continuation

As is demonstrated on Odoo’s github page, if a Pull request is successful, runbot is able to set a message saying so with a link to the build. I haven’t explored that yet, but it would seem to be linked to the Github token option in the repo configuration.

Here are a few features and points of improvement which I feel runbot needs:

  • Better test customization.
  • Security
    • Proper sandboxing
    • Prevent unittests from affecting the filesystem
    • Prevent running arbitrary code
    • Right now if I proposed a merge which replaced openerp-server with a bitcoin miner, runbot would run it and keep it alive
  • Pep8 testing option with the following options:
    • Whole repo
    • Diff
    • Off
    • Max length = ___
  • Coverage reports. Coverage can render HTML reports, which are good enough on their own, or XML reports, which can be parsed by runbot’s web frontend to give a more personalized view. Alternatively, coveralls.io can be used.
  • Gitlab support.
  • Custom build areas
  • Cleanup options
  • Virtualenv support
  • runbot will run closed pull requests
  • runbot has no way of directly getting pull request information such as
    • target revision
    • PR name
    • PR status

Running OpenERP 8.0-trunk with pypy and psycopg2cffi

Returning from PyCon 2014, my enthusiasm for python is at an all time high.
Many of the talks I attended centered around the advantages of python3 and pypy. Being an OpenERP developer, this really stings since OpenERP stable (7.0) is configured to run on python2.6.

Knowing that pypy is limited by its incompatibility with python ctypes, the question I asked myself today is:

Does OpenERP run in pypy?

In order to answer this question, we must first set up an environment which works in python2. For this experiment, I will use the trunk of the (presently) unreleased version 8.0 of OpenERP. I will also be doing this inside a virtualenv, which is different from how I usually deploy OpenERP at Savoir-faire Linux, where we use a buildout recipe.

Setting up a virtualenv

At the time of this writing, OpenERP 8.0 is still in the trunk repository.
After its planned release in June 2014, it should be available at lp:~openerp/openobject-server/8.0 and lp:~openerp/openerp-web/8.0

$ cd ~/openerp-to-pypy
$ virtualenv2 python2-openerp/
$ virtualenv-pypy pypy-openerp/
$ # the following command takes a while
$ bzr branch --stacked lp:~openerp/openobject-server/trunk ./openerp-server
$ bzr branch --stacked lp:~openerp/openerp-web/trunk ./openerp-web

For the purposes of this experiment, we won’t touch lp:~openobject-addons

For both virtualenvs, we need to install some dependencies

python2

$ source python2-openerp/bin/activate

$ pip install lxml pyyaml python-dateutil babel pillow simplejson \
              unittest2 psutil werkzeug reportlab mako Jinja2 \
              docutils mock
$ pip install http://download.gna.org/pychart/PyChart-1.39.tar.gz
$ pip install psycopg2  # to be replaced with psycopg2cffi

$ deactivate

pypy

For pypy, we skip psycopg2 because it is not as efficient under pypy. We substitute psycopg2cffi, a cffi implementation of psycopg2 which is faster and more elegant than its ctypes counterpart.

$ source pypy-openerp/bin/activate

$ pip install lxml pyyaml python-dateutil babel pillow simplejson \
              unittest2 psutil werkzeug reportlab mako Jinja2 \
              docutils mock
$ pip install http://download.gna.org/pychart/PyChart-1.39.tar.gz
$ pip install psycopg2cffi psycopg2cffi-compat

$ deactivate

Run comparison

python2

Nothing much to say about running python2, OpenERP is meant to run with python2 so there should be no issues with our setup. We will revisit this virtualenv, though.

$ createdb python2
$ source python2-openerp/bin/activate

$ ./openerp-server/openerp-server -d python2 --addons-path=./openerp-web/addons/

$ deactivate
$ dropdb python2

pypy

This is where the issues start popping up and you can see some of the legacy code of OpenERP holding it back.

$ createdb pypy
$ source pypy-openerp/bin/activate

$ ./openerp-server/openerp-server -d pypy --addons-path=./openerp-web/addons/

$ deactivate
$ dropdb pypy

Output

Traceback (most recent call last):
  File "app_main.py", line 72, in run_toplevel
  File "./openerp-server/openerp-server", line 2, in <module>
    import openerp
  File "openerp-server/openerp/__init__.py", line 70, in <module>
    import cli
  File "openerp-server/openerp/cli/__init__.py", line 5, in <module>
    from openerp import tools
  File "openerp-server/openerp/tools/__init__.py", line 27, in <module>
    from convert import *
  File "openerp-server/openerp/tools/convert.py", line 35, in <module>
    import openerp.workflow
  File "openerp-server/openerp/workflow/__init__.py", line 22, in <module>
    from openerp.workflow.service import WorkflowService
  File "openerp-server/openerp/workflow/service.py", line 21, in <module>
    from helpers import Session
  File "openerp-server/openerp/workflow/helpers.py", line 1, in <module>
    import openerp.sql_db
  File "openerp-server/openerp/sql_db.py", line 38, in <module>
    from psycopg2.psycopg1 import cursor as psycopg1cursor
ImportError: No module named psycopg2.psycopg1

Indeed, openerp uses a legacy psycopg1 style cursor in its orm. Unfortunately, the cffi implementation of psycopg2 does away with this backwards compatibility.

The use of this, as far as I can tell, is to have access to the dictfetchall() function which is used quite a bit in server, as well as addons:

$ grep dictfetchall -r ./openerp-server | wc -l
44

Before we lose hope, let’s have a look at psycopg2.psycopg1.cursor

As we can see, it is a small wrapper around psycopg2.extensions.cursor, less than 35 lines of code. We could conceivably move this legacy code into openerp-server/openerp/sql_db.py itself.

=== modified file 'openerp/sql_db.py'
--- openerp/sql_db.py 2014-04-14 07:59:06 +0000
+++ openerp/sql_db.py 2014-04-25 01:47:09 +0000
@@ -35,7 +35,46 @@
 import psycopg2.extensions
 from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT, ISOLATION_LEVEL_READ_COMMITTED, ISOLATION_LEVEL_REPEATABLE_READ
 from psycopg2.pool import PoolError
-from psycopg2.psycopg1 import cursor as psycopg1cursor
+from psycopg2.extensions import cursor as _2cursor
+
+
+class psycopg1cursor(_2cursor):
+    """psycopg 1.1.x cursor.
+
+    Note that this cursor implements the exact procedure used by psycopg 1 to
+    build dictionaries out of result rows. The DictCursor in the
+    psycopg.extras modules implements a much better and faster algorithm.
+
+    source: https://github.com/psycopg/psycopg2/blob/5a6a303d43f385f885c12d87240e86d6cb421463/lib/psycopg1.py#L61
+    """
+
+    def __build_dict(self, row):
+        res = {}
+        for i in range(len(self.description)):
+            res[self.description[i][0]] = row[i]
+        return res
+
+    def dictfetchone(self):
+        row = _2cursor.fetchone(self)
+        if row:
+            return self.__build_dict(row)
+        else:
+            return row
+
+    def dictfetchmany(self, size):
+        res = []
+        rows = _2cursor.fetchmany(self, size)
+        for row in rows:
+            res.append(self.__build_dict(row))
+        return res
+
+    def dictfetchall(self):
+        res = []
+        rows = _2cursor.fetchall(self)
+        for row in rows:
+            res.append(self.__build_dict(row))
+        return res
+
 
 psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
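The logic being copied over is small enough to try in isolation. The sketch below uses sqlite3 from the standard library as a stand-in for psycopg2, since both expose a DB-API cursor with a description attribute; the table and values are invented for the example.

```python
import sqlite3

def build_dict(description, row):
    """Map column names (first item of each description entry) to row values."""
    return {column[0]: value for column, value in zip(description, row)}

def dictfetchall(cursor):
    """Return every remaining row of a DB-API cursor as a dictionary."""
    return [build_dict(cursor.description, row) for row in cursor.fetchall()]

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE res_partner (id INTEGER, name TEXT)")
cursor.execute("INSERT INTO res_partner VALUES (1, 'Agrolait')")
cursor.execute("SELECT id, name FROM res_partner")
print(dictfetchall(cursor))  # [{'id': 1, 'name': 'Agrolait'}]
connection.close()
```

Nothing in it depends on psycopg1, which is why moving the class into sql_db.py is enough to get past the import error.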

With reimplemented cursor

Now if we run, we run into a wall

$ ./openerp-server/openerp-server -d pypy --addons-path=./openerp-web/addons/

2014-04-25 01:48:35,012 18986 INFO ? openerp.modules.loading: init db
2014-04-25 01:48:35,671 18986 INFO pypy openerp.modules.loading: loading 1 modules...
2014-04-25 01:48:36,271 18986 INFO pypy openerp.modules.module: module base: creating or updating database tables
2014-04-25 01:48:37,856 18986 INFO pypy openerp.osv.orm: Computing parent left and right for table ir_ui_menu...
2014-04-25 01:48:45,002 18986 INFO pypy openerp.osv.orm: Computing parent left and right for table res_partner_category...
2014-04-25 01:48:48,702 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'display_name'
2014-04-25 01:48:48,709 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'views_by_module'
2014-04-25 01:48:48,735 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'reports_by_module'
2014-04-25 01:48:48,763 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'menus_by_module'
2014-04-25 01:48:48,783 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'web_icon_data'
2014-04-25 01:48:48,784 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'needaction_enabled'
2014-04-25 01:48:48,784 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'web_icon_hover_data'
2014-04-25 01:48:48,785 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'global'
2014-04-25 01:48:48,785 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'report_file'
2014-04-25 01:48:48,785 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'image_medium'
2014-04-25 01:48:48,786 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'image_small'
2014-04-25 01:48:48,787 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'commercial_partner_id'
2014-04-25 01:48:48,797 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'crud_model_name'
2014-04-25 01:48:48,798 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'wkf_model_name'
2014-04-25 01:48:48,798 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'logo_web'
2014-04-25 01:48:48,801 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'email'
2014-04-25 01:48:48,808 18986 INFO pypy openerp.osv.orm: storing computed values of fields.function 'phone'
2014-04-25 01:48:48,824 18986 INFO pypy openerp.modules.loading: module base: loading base_data.xml
2014-04-25 01:48:49,481 18986 INFO pypy openerp.modules.loading: module base: loading res/res_currency_data.xml
2014-04-25 01:48:53,814 18986 INFO pypy openerp.modules.loading: module base: loading res/res_country_data.xml
2014-04-25 01:48:57,337 18986 INFO pypy openerp.modules.loading: module base: loading security/base_security.xml
2014-04-25 01:48:57,693 18986 INFO pypy openerp.modules.loading: module base: loading base_menu.xml
2014-04-25 01:48:57,900 18986 INFO pypy openerp.modules.loading: module base: loading res/res_security.xml
2014-04-25 01:48:57,957 18986 INFO pypy openerp.modules.loading: module base: loading res/res_config.xml
2014-04-25 01:48:58,009 18986 INFO pypy openerp.modules.loading: module base: loading res/res.country.state.csv
2014-04-25 01:48:58,261 18986 INFO pypy openerp.modules.loading: module base: loading ir/ir_actions.xml
2014-04-25 01:48:59,740 18986 INFO pypy openerp.modules.loading: module base: loading ir/ir_config_parameter_view.xml
2014-04-25 01:48:59,998 18986 INFO pypy openerp.modules.loading: module base: loading ir/ir_cron_view.xml
[1] 18986 segmentation fault (core dumped) ./openerp-server/openerp-server -d pypy --addons-path=./openerp-web/addons/

I am not sure what causes this fault: pypy or OpenERP itself. A likely culprit would be lxml.

After a bit of digging, it would seem the segfault happens sometimes when calling the function on line 923 of openerp-server/openerp/tools/convert.py

We will skip this step for now by generating the pypy database with our python2 virtualenv

$ dropdb pypy
$ deactivate

$ createdb pypy
$ source python2-openerp/bin/activate

$ ./openerp-server/openerp-server -d pypy --addons-path=./openerp-web/addons/ --stop-after-init

$ deactivate

$ source pypy-openerp/bin/activate
$ ./openerp-server/openerp-server -d pypy --addons-path=./openerp-web/addons/

And there you have it! OpenERP running under pypy with the help of psycopg2cffi!

But wait…

… there’s more!

Edit: The following issue has been fixed in upstream psycopg2cffi.

Everything seems to be all well and good, besides the segfaulting when loading some models.

Let’s try creating a French database, forcing unicode into the environment. OpenERP is notorious for not handling unicode, after all.

Logging out and selecting manage databases, we create a new database with a default language other than English.


2014-04-25 02:10:07,804 20716 INFO None openerp.service.db: Create database `new`.
2014-04-25 02:10:09,565 20716 ERROR None openerp.service.db: CREATE DATABASE failed:
Traceback (most recent call last):
  File "openerp-server/openerp/service/db.py", line 33, in _initialize_db
    with closing(db.cursor()) as cr:
  File "openerp-server/openerp/sql_db.py", line 585, in cursor
    return Cursor(self.__pool, self.dbname, serialized=serialized)
  File "openerp-server/openerp/sql_db.py", line 216, in __init__
    self._cnx = pool.borrow(dsn(dbname))
  File "openerp-server/openerp/sql_db.py", line 478, in _locked
    return fun(self, *args, **kwargs)
  File "openerp-server/openerp/sql_db.py", line 541, in borrow
    result = psycopg2.connect(dsn=dsn, connection_factory=PsycoConnection)
  File "pypy-openerp/site-packages/psycopg2cffi/__init__.py", line 109, in connect
    conn = _connect(dsn, connection_factory=connection_factory, async=async)
  File "pypy-openerp/site-packages/psycopg2cffi/_impl/connection.py", line 873, in _connect
    return connection_factory(dsn)
  File "pypy-openerp/site-packages/psycopg2cffi/_impl/connection.py", line 130, in __init__
    self._connect_sync()
  File "pypy-openerp/site-packages/psycopg2cffi/_impl/connection.py", line 135, in _connect_sync
    self._pgconn = libpq.PQconnectdb(self.dsn)
TypeError: initializer for ctype 'char *' must be a str or list or tuple, not unicode

This is a new error for me. It would seem the cffi declaration for libpq.PQconnectdb did not consider unicode strings (wchar *).
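The same class of error can be reproduced without a database. The sketch below uses ctypes under Python 3 as a stand-in for cffi under Python 2: a C `char *` wants bytes, so a text string has to be encoded first. The helper is my own and is not the fix that went into psycopg2cffi.

```python
import ctypes

def to_c_string(value, encoding="utf-8"):
    """Return bytes suitable for a C `char *` parameter."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value

dsn = "dbname=new"  # a text string, like the unicode DSN in the traceback

try:
    ctypes.c_char_p(dsn)  # text is refused, bytes are expected
except TypeError as error:
    print("rejected:", error)

c_dsn = ctypes.c_char_p(to_c_string(dsn))
print(c_dsn.value)  # b'dbname=new'
```

Encoding at the boundary, right before the call into the C library, keeps the rest of the code free to pass text around.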

Take away

Switching to pypy was not too complicated thanks to the existence of psycopg2cffi and psycopg2cffi-compat.

Some legacy code needed to be copied over, and this should serve as a note to core developers to drop psycopg1.cursor and use the DictCursor in the psycopg2.extras module, which implements a much better and faster algorithm.

Something went terribly wrong with xml loading in pypy causing segfaults. While I haven’t had the time to fully investigate, it would be beneficial to know the root cause to be able to fix either OpenERP, pypy or the other libraries used.

Finally, psycopg2cffi, while beautiful and fast, needs to patch unicode support for libpq.PQconnectdb.

Edit: Thank you to lopuhin who merged my fix to unicode support in psycopg2cffi.