Perthon FAQ

(Q) When will it be done?
(A) The last 5% will likely take 95% of the time. Perthon is not even near the last 5%.
(Q) Is Python-to-Perl translation possible in all cases?
(A) It is "possible": in the extreme case, we could port the standard Python interpreter to Perl and generate this code alongside the Python source one wants to run. This would execute, but the generated source code certainly won't look like hand-crafted Perl. We might as well embed the standard Python interpreter in Perl instead.
We could go further and convert all the constructs (e.g. if-statements) in the Python source into corresponding Perl code. The problem is that the data types and libraries differ (e.g. Python string objects have an "endswith" method, which Perl strings lack). We might solve this with tied hashes in the Perl source code, or write our own Pythonic string class for use throughout the generated code. This will run, but it also certainly won't look like hand-crafted Perl: a typical Perl program does not create its own string class in order to invoke an "endswith" method; it uses an appropriate regular expression instead. Another example is Python exception handling and the Python exception class hierarchy, which are quite different from their more unstructured Perl counterparts.
If we want the generated Perl code to conform to all the Perlisms, we may have to sacrifice completeness (i.e. translate only a subset of Python) and/or absolute correctness of the translation. To improve this situation, we could require the programmer to "decorate" the Python source code (undesirable) wherever the translator sees an ambiguity. There are also subtle differences between Python and Perl that are hard to translate efficiently in the general case, so we might not attempt to translate them at all. For example, Python classes and Perl packages (used for Perl classes) have different scoping rules, so we could simply reject, or warn about, Python code that relies on the more obscure rules (e.g. classes defined locally within code blocks).
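To make the "endswith" point concrete, here is a minimal sketch of what an idiomatic translation of that one method might look like. It is not Perthon's code: the function name endswith_to_perl and the pattern ENDSWITH_CALL are hypothetical, and the sketch assumes the receiver is a plain variable and the suffix is a simple string literal without quotes or backslashes. Anything else is reported as untranslatable, which is the "sacrifice completeness" trade-off in miniature.

```python
import re

# Hypothetical pattern: matches only `name.endswith("literal")` with a
# plain variable receiver and a literal free of quotes and backslashes.
ENDSWITH_CALL = re.compile(
    r"""^(?P<var>[A-Za-z_]\w*)\.endswith\((?P<q>["'])(?P<lit>[^"'\\]*)(?P=q)\)$"""
)

def endswith_to_perl(expr):
    """Return a Perl regex match for a simple endswith call, or None."""
    m = ENDSWITH_CALL.match(expr.strip())
    if m is None:
        return None  # outside the supported subset; caller must reject or warn
    # Backslash-escape every non-word character so it is literal in a Perl regex.
    pattern = re.sub(r"(\W)", r"\\\1", m.group("lit"))
    # \z anchors at the very end of the string, like Python's endswith.
    return "$%s =~ /%s\\z/" % (m.group("var"), pattern)

print(endswith_to_perl('name.endswith(".py")'))
```

The result reads like something a Perl programmer would write by hand, but only because the input was restricted; a call such as obj.attr.endswith(suffix) returns None here and would need either the Pythonic string class or a programmer-supplied hint.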
(Q) How is language translation achieved?
(A) We use Parse::RecDescent and Perl regular expressions as the underlying parsing technology. Parse::RecDescent is somewhat atypical, and unlike the Python interpreter itself, in that it merges the lexing and parsing steps. Although Perthon uses this default behavior, it currently also preprocesses the Python source code before the regular lexing/parsing process in order to insert the INDENT/DEDENT tokens (i.e. to implement Python's infamous whitespace indenting rules). The Python BNF grammar (given in the Links section) is being converted into Parse::RecDescent format, which requires the removal of left recursion; implementation-specific optimizations can be made as well. The grammar is also decorated with actions, coded in Perl, that convert each recognized Python structure into corresponding Perl code, one Python construct at a time.
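The INDENT/DEDENT preprocessing step can be sketched as follows. This is an illustration of the general technique rather than Perthon's actual preprocessor (which is written in Perl); the function name insert_indent_tokens is made up, and the sketch assumes indentation uses spaces only and ignores line continuations and brackets that span lines.

```python
def insert_indent_tokens(source):
    """Return the logical lines of `source` with INDENT/DEDENT markers added.

    Assumes spaces-only indentation and no multi-line brackets or
    backslash continuations.
    """
    out = []
    stack = [0]  # indentation widths of the currently open blocks
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank and comment-only lines do not affect nesting
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            out.append("INDENT")
        while width < stack[-1]:
            stack.pop()
            out.append("DEDENT")
        if width != stack[-1]:
            raise IndentationError("dedent does not match any outer level")
        out.append(stripped)
    # Close every block that is still open at end of input.
    while len(stack) > 1:
        stack.pop()
        out.append("DEDENT")
    return out

example = "if x:\n    y = 1\n    if y:\n        z = 2\nw = 3\n"
for token in insert_indent_tokens(example):
    print(token)
```

Once the markers are explicit, the grammar can treat a block as INDENT statement(s) DEDENT and no longer needs to reason about whitespace.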
(Q) Why Parse::RecDescent? Why not Parse::Yapp or <insert your tool here>?
(A) It is what I'm most familiar with, it's implemented in Perl, it's well documented, it's remarkably flexible, and it works fairly well. On the other hand, it does a top-down parse (which means that the left-recursive Python grammar has to be translated), and it is reportedly noticeably slower than Parse::Yapp. Parse::RecDescent seems to be working fine so far for Perthon, but comments on moving Perthon to something other than Parse::RecDescent are welcome.
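As an illustration of why a top-down parser forces the grammar to be translated: a left-recursive rule such as "a_expr: a_expr '-' term | term" would make a recursive-descent routine call itself forever before consuming any input. The usual rewrite is "a_expr: term (('+' | '-') term)*", with the tree folded to the left inside the loop so that associativity is preserved. The toy Python parser below is only a sketch of that rewrite; the rule names, tokenize, and parse_a_expr are hypothetical and are not taken from the Perthon grammar.

```python
import re

def tokenize(text):
    """Split a toy expression into integer and +/- tokens."""
    return re.findall(r"\d+|[-+]", text)

def parse_a_expr(tokens, pos=0):
    """Parse  a_expr: term (('+' | '-') term)*  where term is an integer.

    This is the iterative form of the left-recursive rule
        a_expr: a_expr ('+' | '-') term | term
    No error handling: the input is assumed to be well formed.
    """
    tree = int(tokens[pos])
    pos += 1
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right = int(tokens[pos + 1])
        tree = (op, tree, right)  # fold left: (1 - 2) - 3, not 1 - (2 - 3)
        pos += 2
    return tree, pos

tree, _ = parse_a_expr(tokenize("1 - 2 - 3"))
print(tree)
```

A bottom-up tool such as Parse::Yapp accepts the left-recursive form directly, which is part of the trade-off described above.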