
Because Python is an interpreted language, securing its source code is a challenging task: in order to execute the code, the interpreter must have it available in some form.

Throughout this article, I’ll walk through one approach to the challenge of protecting a Python-based codebase: compiling modules with Cython.

Cython is a static compiler for both the Python and Cython programming languages that simplifies the job of writing Python C extensions. Cython allows us to compile Python code into dynamic libraries that can be used as Python modules too.

When Python imports a module, it can load it from any of the following forms, and a compiled shared library takes precedence:

  • shared library (.so, .pyd)
  • python bytecode (.pyo, .pyc)
  • python file (.py)
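
The suffixes that the running interpreter recognizes for each of these forms can be listed with the standard importlib.machinery module. A minimal sketch (the module_suffixes helper is a hypothetical name, and the exact extension suffixes vary by platform and Python version):

```python
import importlib.machinery as machinery


def module_suffixes():
    """Return the module file suffixes this interpreter recognizes, grouped by kind."""
    return {
        "extension": list(machinery.EXTENSION_SUFFIXES),  # compiled shared libraries
        "source": list(machinery.SOURCE_SUFFIXES),        # plain Python files
        "bytecode": list(machinery.BYTECODE_SUFFIXES),    # compiled bytecode files
    }


if __name__ == "__main__":
    for kind, suffixes in module_suffixes().items():
        print(kind, suffixes)
```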

So… what are the benefits of using Cython compiled modules?

  • Binary modules make it much harder to recover the original Python code; reverse-engineering techniques must be used to do so.
  • The Cython-generated C code can be modified to introduce changes, improve protection, etc.
  • GCC optimization flags can be used while compiling the library.
  • Tracebacks won’t reveal code, just line numbers (unless disabled).
  • Cython takes Python code and translates it to C, which is then compiled by GCC (or a similar compiler); the compiled code can run faster than the pure Python version.
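
To make the "binary module" idea concrete, here is a minimal sketch of building a shared library from a Python file with setuptools and Cython's cythonize function. It assumes that Cython, setuptools and a C compiler are installed; the build_extension helper is a hypothetical name, and the language_level directive is an illustrative choice rather than something this article prescribes.

```python
from pathlib import Path


def build_extension(source="hello.py"):
    """Compile `source` into an importable extension module placed next to it."""
    # Imported lazily so that this file can be loaded without the build tools.
    from setuptools import setup
    from Cython.Build import cythonize

    setup(
        name=Path(source).stem,
        # Translate the Python file to C and describe it as an extension module.
        ext_modules=cythonize(source, compiler_directives={"language_level": "3"}),
        # Equivalent to running: python setup.py build_ext --inplace
        script_args=["build_ext", "--inplace"],
    )
```

Calling build_extension("hello.py") should leave a shared library (.so or .pyd) beside the source file, which Python can then import in place of the .py file.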

Let’s review the basic functionality of Cython

Remember the hello.py script from the HashiCorp Vault Secret Manager article? Pulling secrets from HashiCorp Vault is great, but if you think about it… a user who can access and modify the code can add a simple print statement to reveal the secrets (see lines #19 – #21).

import getpass
import hvac

VAULT_ADDR = 'http://127.0.0.1:8200'
VAULT_TOKEN = getpass.getpass('Hashicorp Vault Token ID: ')

# Create a Vault client that authenticates with the token entered above
client = hvac.Client(
    url=VAULT_ADDR,
    token=VAULT_TOKEN
)

response = client.secrets.kv.read_secret_version(path='ap')

client_id = response['data']['data']['client_id']
client_secret = response['data']['data']['client_secret']
repo_token = response['data']['data']['repo_token']

print("Client ID: " + client_id)
print("Client Secret: " + client_secret)
print("Repo Token: " + repo_token)
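
The nested response['data']['data'] lookups reflect the shape of a KV version 2 read, where the secret's fields sit under an inner data key. A minimal offline sketch of that extraction, assuming a stand-in dictionary in place of a live Vault response (SAMPLE_RESPONSE and extract_secrets are hypothetical names, and the values are the demo values printed later in this article):

```python
# Stand-in for the dictionary returned by read_secret_version (assumed shape:
# KV version 2, with the secret's fields nested under data -> data).
SAMPLE_RESPONSE = {
    "data": {
        "data": {
            "client_id": "123456789",
            "client_secret": "987654321",
            "repo_token": "a1b2c3d4e5",
        },
        "metadata": {"version": 1},
    }
}


def extract_secrets(response, keys=("client_id", "client_secret", "repo_token")):
    """Pull the named fields out of a KV v2 style response dictionary."""
    payload = response["data"]["data"]
    missing = [key for key in keys if key not in payload]
    if missing:
        raise KeyError(f"secret is missing fields: {missing}")
    return {key: payload[key] for key in keys}


if __name__ == "__main__":
    secrets = extract_secrets(SAMPLE_RESPONSE)
    # Print only the field names, never the values.
    print(sorted(secrets))
```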

Hmmm… we need to prevent others from modifying the file. Let’s see how Cython can help with that.

1. For the sake of this POC, let’s leave the three print statements (lines #19 – #21) in place. Preferably, these lines should be removed.

2. Make sure the “python3-devel” package is installed (e.g., sudo yum install python3-devel)

3. Install Cython – sudo pip3 install Cython

$ sudo pip3 install Cython
Collecting Cython
Downloading https://files.pythonhosted.org/packages/40/67/36322cf0387cf65e6be80ba2d9a33db227ecbc624902f0cb2e4bf456261f/Cython-0.29.23-cp38-cp38-manylinux1_x86_64.whl (1.9MB)
|████████████████████████████████| 1.9MB 23.3MB/s
Installing collected packages: Cython
Successfully installed Cython-0.29.23

4. Convert the Python code into C code – cython hello.py --embed (note: the --embed flag is needed to create a standalone program. Without --embed, the generated C code has no main function, because it is meant to be built as a shared object rather than a standalone executable. After the following command runs, a C source file, hello.c, should be created in the same directory)

$ cython hello.py --embed
/usr/local/lib64/python3.8/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /home/ec2-user/hello.py
tree = Parsing.p_module(s, pxd, full_module_name)

5. Compile the C code into an executable – gcc `python3-config --cflags --ldflags` hello.c -o hello (note: the Python include and library paths must be specified. On Python 3.8 and later, python3-config may also need the --embed option so that libpython is linked. Running the following command should create an executable file, hello, which is the distributable binary)

$ gcc `python3-config --cflags --ldflags` hello.c -o hello
$ [NO OUTPUT]
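
Steps 4 and 5 can also be scripted. The sketch below only assembles the two shell commands shown above and prints them; it assumes cython, gcc and python3-config are on the PATH, and the build_commands and run_build helpers are hypothetical names. The python_config_embed switch adds the --embed option that python3-config accepts on Python 3.8 and later.

```python
import shlex
import subprocess


def build_commands(source="hello.py", python_config_embed=False):
    """Return the shell commands for steps 4 and 5 for the given source file."""
    if not source.endswith(".py"):
        raise ValueError("expected a .py source file")
    stem = source[:-3]
    # Step 4: translate Python to C, including a main() entry point.
    cython_cmd = f"cython {shlex.quote(source)} --embed"
    # Step 5: compile and link using the flags reported by python3-config.
    config_flags = "--cflags --ldflags" + (" --embed" if python_config_embed else "")
    gcc_cmd = (
        f"gcc $(python3-config {config_flags}) "
        f"{shlex.quote(stem + '.c')} -o {shlex.quote(stem)}"
    )
    return [cython_cmd, gcc_cmd]


def run_build(source="hello.py", python_config_embed=False):
    """Run both commands in order, stopping at the first failure."""
    for command in build_commands(source, python_config_embed):
        subprocess.run(command, shell=True, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_build() to actually execute them.
    for command in build_commands():
        print(command)
```

Adding -3 to the cython command sets the language level to Python 3, which should also silence the FutureWarning shown in step 4.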

6. Check the folder content – ls -rtl 

$ ls -rtl
total 276
-rw-rw-r--. 1 ec2-user ec2-user    545 Jul 11 16:06 hello.py
-rw-rw-r--. 1 ec2-user ec2-user 139572 Jul 11 17:27 hello.c
-rwxrwxr-x. 1 ec2-user ec2-user 132312 Jul 11 17:29 hello

7. Run the hello executable – ./hello (when asked, enter the “Root Token” from step #4 of the HashiCorp Vault Secret Manager article)

$ ./hello
Hashicorp Vault Token ID: [ --> Root Token: s.4Gl4TLJb1D82OWxxxxxxxxxx]
Client ID: 123456789
Client Secret: 987654321
Repo Token: a1b2c3d4e5

8. View the content of the hello file – cat hello (note: the output below is truncated)

$ cat hello
ELF>?J@@?@8
@'&@@@@@h??@?@@@HUHU 0]0]`0]`?
?HX[cBE??j??@?@ Cֻ?|??V?T?????@?@ P?td`P`P@`P@??Q?tdR?td0]0]`0]`??/lib64/ld-linux-x86-64.so.2GNU?GNUGNU?M?;>P??¸ܿ???ȡX?d!
:?
@h`(?E@F @b`5`L@??F@<J?J@/?h`Q?K@ea@L@libpython3.6m.so.1.0_ITM_deregisterTMCloneTable__gmon_start___ITM_registerTMCloneTablelibpthread.so.0libdl.so.2libutil.so.1libm.so.6_PyThreadState_UncheckedGetPyFrame_NewPyEval_EvalFrameExPyObject_GetAttrPyObject_CallPyThreadState_Get_Py_CheckRecursionLimit_Py_CheckRecursiveCallPyErr_OccurredPyExc_SystemErrorPyErr_SetStringPyObject_GetAttrString_Py_NoneStructPyDict_SetItemStringPyExc_AttributeErrorPyErr_ExceptionMatchesPyErr_ClearPyExc_ImportErrorPyModule_NewObjectPyModule_GetDictPyDict_GetItemWithErrorPyTuple_PackPyExc_KeyErrorPyErr_SetObjectPyExc_NameErrorPyErr_Format_PyDict_GetItem_KnownHashPyList_NewPyDict_NewPyImport_ImportModuleLevelObjectPyExc_RuntimeErrorPyOS_snprintfPy_GetVersionPyErr_WarnExPyFrame_TypePyTuple_NewPyBytes_FromStringAndSizePyUnicode_FromStringAndSizePyImport_AddModulePyObject_SetAttrStringPyUnicode_InternFromStringPyUnicode_DecodePyObject_HashPyObject_SetAttrPyImport_GetModuleDictPyDict_GetItemStringPyDict_SetItem_PyObject_GetDictPtrPyObject_Not_Py_FalseStruct_Py_TrueStructPyUnicode_FromStringPyFunction_TypePyEval_EvalCodeExPyCFunction_TypePyDict_TypePyObject_GetItemPyNumber_AddPyUnicode_FromFormatPyCode_NewPyMem_MallocPyMem_ReallocPyTraceBack_HerePyModuleDef_InitPyModule_TypePyType_IsSubtypePyModule_ExecDefPyErr_PrintPy_FinalizeExPyMem_RawFreePy_InitializePy_SetProgramNamePySys_SetArgvlibc.so.6setlocalembrtowcmbstowcs__stack_chk_failstrdupstrlenmallocstderrfwrite__libc_start_mainfree_edata__bss_start_end__pyx_module_is_main_helloPyInit_hello_IO_stdin_used__data_start__libc_csu_init__libc_csu_finiquiBC_2.{`_`BC_2h_`5?ii
p_`
x_`?_`?_`?_`!?_`#?_`)?_`8?_`;?_`@?_`A?_`G?_`J?_`K?_`N?_`P?_`R?_`S`` ``(``0``8``@``H``P`X``
h``p``x``?``?``?``?``?``?``?``?````?``?``?``?`` ?``"?``$?``%a`a`'a`(a`* a`+(a`,0a`-8a`.@a`/Ha`0Pa`1Xa`2`a`3ha`4pa`5xa`6?a`7?a`9?a`:?a`<?a`=?a`>?a`??a`B?a`C?a`D?a`E?a`F?a`H?a`I?a`L?a`Mb`b`Qb`Tb`U b`V(b`W??H?H?AC H??t??H???5BC ??%CC ??h?????????h?????????h?????????h????????h????????h????????h????????h??q????????a??????h ??Q??????h
??A??????h
??1??????h
????????h????????h?????????h?????????h?????????h?????????h????????h????????h????????h????????h??q??????h??a??????h??Q??????h??A??????h?1??????h??!??????h????????h????????h?????????h ?????????h!?????????h"?????????h#????????h$????????h%????????h&????????h'??q??????h(??a??????h)??Q??????h*??A??????h+??1??????h,??!??????h-????????h.????????h/?????????h0?????????h1?????????h2?????????h3????????h4????????h5????????h6????????h7??q????? D????%? D????%?> D????%?> D????%?> D????%?> D????%?> D????%?> D????%?> D????%?> D????%?> D????%?> D????%?> D????%?> D????%?> D????%?> D????%?>> D????%> D????%?= D????%?= D????%?= D????%?= D????%?= D????%?= D????%?= D????%?= D????%?= D????%?= D????%?= D????%?= D????%?= D????%?= D????%?== D????%= D????%?< DA????A??A??xIc?H??D9|D1?1?A?D9?}%D??)șA???Hc?H??D9}?H??A????Hc?H??D9}???AVI??AUI??ATI??USH??????H??1?L??H??H???????H??H??tGH?D 1?H?L9?}I?T?H?H??H????1?H???????E H?
I??u
[]A\A]A^????H??H??I???O???L?%?9 ?H ???H A;AVAUATUSH?L???M??u
$3H??H??L??A??H???&????p ?V??P A?$?H?=????
@?H?=?%?{?????t?1??59?} ??????@$H??u#?|???H??H??u?H??8 H?5?%H?8????H??[]A\A]A^?AVE??AUI??ATI??H??US?y???H??t5H;]8 H??1?A??tH??L??L????????H?
u)H?H???P0?H?08 ???H?8?]?????t?????1?[??]A\A]A^???AUI??ATUSQ?$???H?PH??@ H??u H??@ ?"H9?tH?'8 H?%1?H?8?????H?-?B H??t H?E??H?5?%L??????I??H????H???A???I?
$H??u
w%L??L??H?s%?I?????xH???H? I?DL???P0H????H??????I??H????A?H?
u
H?H???P0ZH??[]A\A]?USH??Q?-???H??H??ub?`???H??u[H?H?????t7?1??o???H??H??t7H??H?E6 H?8?e???H?
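
The output above begins with the letters ELF, the signature of a native Linux executable. A minimal sketch for checking that signature programmatically instead of reading the raw bytes (is_elf is a hypothetical helper name; it inspects only the first four bytes of the file):

```python
ELF_MAGIC = b"\x7fELF"  # the first four bytes of every ELF file


def is_elf(path):
    """Return True if the file at `path` starts with the ELF magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(len(ELF_MAGIC)) == ELF_MAGIC


if __name__ == "__main__":
    import sys

    # Usage: python check_elf.py hello hello.py
    for path in sys.argv[1:]:
        print(path, "ELF binary" if is_elf(path) else "not ELF")
```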

Final thoughts

This article explored one possible solution to the problem of protecting Python source code. Cython seems like a promising option to consider. It is true that any user will still have access to binaries that can be reverse engineered, but doing so takes a good amount of time and work.

Disclaimers

  1. This article aims to cover the basic functionality of Cython.
  2. It’s also possible to combine the different approaches to provide an even more secure environment.
  3. Want to learn more about Cython? Please contact the Cross-Domain TAB team (mailto: xdc-amer-tab).

 



 



Authors

Yossi Meloch

Senior Software Architect

Customer Experience (CX)