Tuesday, June 17, 2008

Here's why dynamic languages are slow and how to fix it

Dynamic languages are emerging as the Next Big Thing. They are known for making development faster, being more powerful and more flexible. Today, more and more people are using them in production environments. However, one problem stands in the way of mass adoption: SPEED. There is an urban legend that dynamic programs are way slower than their static counterparts. Here's my take on it.

Why are dynamic languages slow TODAY?

The purpose of a dynamic language is to have as few static elements as possible. The idea is that this offers more flexibility. For example, in Python method calls are never static. This means that the actual code that will be executed is known only at run time. This is what makes monkey patching possible. This is what allows you to have great unit testing frameworks.
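To make the point about run-time method lookup concrete, here is a minimal sketch of monkey patching. The names `Greeter`, `call_greet` and `shout` are made up for illustration; they do not come from any library.

```python
class Greeter:
    def greet(self):
        return "hello"


def call_greet(obj):
    # The method is looked up by name every time this line runs,
    # so the code that executes is only known at run time.
    return obj.greet()


g = Greeter()
before = call_greet(g)


# Monkey patch: replace the method on the class while the program runs.
def shout(self):
    return "HELLO"


Greeter.greet = shout
after = call_greet(g)

print(before, after)
```

The same call site runs two different functions, which is exactly what a test framework exploits when it swaps in a stub.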

# globals.py
A = 2

# main.py
from globals import A
A = []
Dynamic languages leave as many decisions as possible to run time. What is the type of A? You can only know for sure when the code runs, because the name can be rebound at any point in the program.

As a result, dynamic languages are hard to analyse and therefore hard to optimize, whereas static languages offer plenty of opportunities for optimization. This is why their implementations are usually slow.

The problem with dynamic languages is that it isn't trivial to optimize even an addition. You can hardly know what '+' will be bound to at run time. You probably can't even infer the types of the operands. This is the result of mutation. In Python, almost everything is mutable. This leaves little information the compiler can rely on.
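As a rough illustration of why '+' is hard to optimize, the sketch below calls one function with four kinds of operands. `Money` is a hypothetical class invented for this example.

```python
def add(a, b):
    # Which code runs here depends entirely on the run-time types of a and b.
    return a + b


class Money:
    """Hypothetical user-defined type that supplies its own '+'."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


results = [
    add(1, 2),                        # integer addition
    add("a", "b"),                    # string concatenation
    add([1], [2]),                    # list concatenation
    add(Money(150), Money(50)).cents  # user-defined __add__
]
print(results)
```

A compiler looking only at `add` cannot pick a single machine instruction for `a + b`, because every call above dispatches to different code.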

Does mutability hurt performance and why?

It can, depending on the case. Let me illustrate how by comparing the factorial function in C and Python. Don't think of this as a benchmark. This is just an example.

Compiling the factorial function in C with LLVM-GCC will generate efficient machine code.

// Factorial in C
int fac(int n) {
    if (n == 0) return 1;
    return n * fac(n - 1);
}

int main() {
    return fac(30);
}

; Assembly generated by LLVM-GCC
movl $1, %eax
xorl %ecx, %ecx
movl $30, %edx
.align 4,0x90
LBB1_1: ## bb4.i
imull %edx, %eax
decl %edx
incl %ecx
cmpl $30, %ecx
jne LBB1_1 ## bb4.i
LBB1_2: ## fac.exit

The compiler was able to infer many properties from the source code. For example, it concluded that the fac function referenced in main was the fac defined at compile time. This allowed the compiler to replace the assembly call instruction with fac's code. The function was then specialized for the call site and, thanks to static typing, the compiler was able to transform each arithmetic operation into a direct machine instruction.
Can you notice the other optimizations?

Let's look at how CPython executes the factorial.

# fac.py
def fac(n):
    return 1 if n == 0 else n * fac(n - 1)

First, fac.py is parsed and translated to bytecode instructions. Then the bytecode instructions are interpreted by the CPython Virtual Machine.

# CPython Bytecode for fac.py
# Think of this as an interpreted language which Python is translated into.
# See http://docs.python.org/lib/bytecodes.html
# fac
11 0 LOAD_FAST 0 (n)
3 LOAD_CONST 1 (0)
6 COMPARE_OP 2 (==)
9 JUMP_IF_FALSE 7 (to 19)
13 LOAD_CONST 2 (1)
16 JUMP_FORWARD 18 (to 37)
>> 19 POP_TOP
20 LOAD_FAST 0 (n)
23 LOAD_GLOBAL 0 (fac)
26 LOAD_FAST 0 (n)
29 LOAD_CONST 2 (1)
# main
14 0 LOAD_GLOBAL 0 (fac)
3 LOAD_CONST 1 (30)
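If you want to inspect the bytecode yourself, the standard `dis` module can disassemble a function. This is only a sketch: recent CPython versions emit different opcodes than the 2008 listing above, so expect the output to differ in detail.

```python
import dis


def fac(n):
    return 1 if n == 0 else n * fac(n - 1)


# Collect (opcode name, argument) pairs. Exact opcodes vary between versions.
instructions = [(ins.opname, ins.argval) for ins in dis.get_instructions(fac)]
for opname, argval in instructions:
    print(opname, argval)

# The recursive call still goes through a global name look-up.
global_loads = [arg for op, arg in instructions if op == "LOAD_GLOBAL"]
```

The `LOAD_GLOBAL` for `fac` is the part that matters here: the function finds itself by name on every call.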

CPython could not inline the call to fac because this would violate the language's semantics. In Python, fac.py could be imported at run time by another module. It cannot inline fac into main because a sub-module could change the binding of fac and thus invalidate main. And because main doesn't have its own copy of fac, the code cannot be specialized for this particular call. This hurts because it would be very beneficial to specialize the function for an integer argument.
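Here is a small sketch of the rebinding problem, kept in a single module for brevity. In a real program the rebinding could come from any other module; `original_fac` is just a name chosen for this example.

```python
def fac(n):
    return 1 if n == 0 else n * fac(n - 1)


original_fac = fac
normal = original_fac(5)   # 120: the recursion resolves to the original fac


# Rebind the global name, as other code could do at run time.
def fac(n):
    return 1


# The original function body still looks up the global name 'fac',
# so its recursive call now reaches the replacement.
rebound = original_fac(5)  # 5 * fac(4) == 5 * 1 == 5

print(normal, rebound)
```

Had the compiler inlined the recursive call, the second result would still be 120, which is not what Python's semantics require.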

Notice that there are no references to machine addresses. CPython adds a layer of indirection to access every object in order to implement the dynamism of Python. For example, main is found by a look-up in a table. Even constant numbers are found through look-ups. This adds a significant number of slow memory reads, writes and indirect jumps.

Python doesn't even contain any explicit hints you can give to help the compiler. This makes the problem of optimizing Python non-trivial.

What about type inference?

The problem of type inference in dynamic languages remains unsolved. Type inference is a form of static analysis. Static analysis is the analysis of source code at compile time to derive some "truths" about it. You can imagine how this falls short for dynamic languages.

Michael Salib attempted to solve this problem with Starkiller. The compiler manages type inference by collecting more information than usual and using the Cartesian Product Algorithm (CPA). Instead of compiling each module separately, like most compilers, the whole program is analyzed and compiled in one pass. The knowledge of the complete program opens the door to more optimizations. The fac function of the previous example can be specialized by Starkiller because it knows how it will be used.

Though the work seems very promising, it has three major flaws. First, the compiler accepts only a subset of the Python language. Advanced functions like eval and exec aren't supported. Second, whole-program analysis doesn't scale with bigger projects. Compiling 100,000 LOC would take a prohibitive amount of time. Third, the compiler violates Python's semantics by doing whole-program analysis. Like most dynamic languages, the import mechanism of Python is done at runtime. The language doesn't guarantee that the module available at compile time is the same as the module available at run time.

Read this for more.

What about VMs?

Virtual Machines are a natural fit for dynamic languages. VMs with JIT compilers are able to optimize a dynamic program without having to guess its behavior in advance. This saves a lot of heavy lifting. Programs are optimized simply by observing their behavior while they run. This is known as dynamic analysis. For instance, noticing that fac is often called with an integer argument, the VM could create a new version of that function specialized for integers and use it instead.
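The idea of specializing on observed types can be mimicked in plain Python. This is a toy sketch, not how any real JIT works: the `Specializer` class and its threshold are invented here, and the integer version is written by hand, whereas a JIT would generate it.

```python
from collections import Counter


def fac_generic(n):
    # Works for any type that supports ==, * and -.
    return 1 if n == 0 else n * fac_generic(n - 1)


def fac_int(n):
    # Hand-written stand-in for code a JIT might generate for ints.
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


class Specializer:
    """Hypothetical dispatcher that switches to a fast path after observing ints."""

    def __init__(self, generic, int_version, threshold=3):
        self.generic = generic
        self.int_version = int_version
        self.threshold = threshold
        self.seen = Counter()      # observed argument types
        self.specialized = False

    def __call__(self, n):
        # Guard: the fast path is only valid for the type it was built for.
        if self.specialized and type(n) is int:
            return self.int_version(n)
        self.seen[type(n)] += 1
        if self.seen[int] >= self.threshold:
            self.specialized = True
        return self.generic(n)


fac = Specializer(fac_generic, fac_int)
results = [fac(5) for _ in range(5)]  # later calls take the int fast path
fallback = fac(3.0)                   # the guard fails, so the generic path runs
print(results, fallback)
```

The type guard is the important part: the specialized version is used only while the observation that justified it still holds.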

In my opinion Virtual Machines are not a long-term solution.

  1. Self-hosting a VM is prohibitive.
  2. A VM sets a limit on the kinds of programs you can make. No Operating Systems, no drivers, no real-time systems, etc.
  3. Optimizing a program run through a VM is hard because you cannot know exactly what is going on under the hood. There are many layers and many corners where performance can slip away.

For most projects, these problems aren't an issue. But I believe their existence would restrain dynamic languages. They are enough to prevent a dynamic language from being a general purpose tool. And that is what people want: no restrictions, no surprises, pure freedom.

How would I make them faster?

from types import ModuleType
import re
declare(re, type=ModuleType, constant=True, inline=True)

A compiler helped by static annotations is the way to go. Please don't put all static annotations in the same bag. Static annotations like type declarations don't have to be as painful as Java's. Annotations are painful in Java because they are pervasive and often useless. They restrict the programmer. Programmers have to fight them. Annotations can be just the opposite. They can give the programmer more freedom! With them, programmers can set constraints on their code where it matters. Because they have the choice, static annotations become a tool that offers MORE flexibility.

A savvy programmer could reduce the dynamism of their code at a few key points, just enough to allow type inference and the like to do their job well. Optimizing some code would usually just become a matter of explicitly expressing the natural constraints that apply to it.

# Just an example.
def fac(n):
    assert type(n) in [float, int]
    return 1 if n == 0 else n * fac(n - 1)
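One possible way to express such a constraint, sketched with a decorator. `constrain` is a hypothetical helper written for this example, and it only checks at call time; the point is that the same declaration could in principle be read by a compiler.

```python
import functools
import inspect


def constrain(func):
    """Hypothetical decorator: check annotated parameters at call time."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            # Parameters without an annotation stay fully dynamic.
            if expected is inspect.Parameter.empty:
                continue
            if not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@constrain
def fac(n: int):
    return 1 if n == 0 else n * fac(n - 1)


print(fac(5))
```

Only `fac` is constrained; the rest of the program keeps its full dynamism, which is the trade-off argued for above.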

There are many ways to implement static annotations in dynamic languages. I believe the flexibility of dynamic languages can allow static annotations to be very convenient. How would you do it?


Reilly said...

Just one quibble. Static types don't prevent kick-ass testing frameworks. QuickCheck for Haskell is the equal of any of the testing frameworks for dynamically typed languages.

Maht said...

As interesting as your analysis is, it is not true that you can't make a VM that's also an OS.

Here's one that's mature and is in use today :



Kid Meier said...

To further Maht's point: while you can't get down to the hardware directly in a VM, it's really a moot point since that is only a concern in a small number of applications/systems.

The widespread success of the Java VM is proof of this. Yes, I don't see any OSes with significant market share written in Java, but I would have to question if that is really a bad thing.

You will get almost nowhere trying to make a single tool that is designed to solve every single problem in computing.

Yann said...

Yes, there are some good testing frameworks for static languages but they are very different from what I was talking about.
Quickcheck automatically generates tests, right? I don't know any dynamic language with a framework like this. But with monkey patching - for instance - dynamic languages offer other kinds of possibilities. I really recommend you take a look at the link I mentioned:

Very cool link. Also I didn't mean to say it was impossible to make an OS with a VM.
Microsoft is even supporting similar effort:

michaelw said...

Quickcheck automatically generates tests, right? I don't know any dynamic language with a framework like this.

Quviq QuickCheck is a port to Erlang, with some interesting added capabilities (shrinking test cases).

Another framework for automatically generating test cases is an RT extension by pfdietz.

riffraff said...

there are quickcheck ports for ruby (rushcheck) and for perl (test::lectrotest, IIRC).
I recall one for python too, but not sure about the name.

But I can't see why you could not have testing DSLs a-la BDD in haskell, type classes seem enough, and Scala seem to have a nice framework for that.

Luis said...

You mentioned Starkiller, which is just vaporware (announced with fanfare but never released). For the real thing check Mark Dufour's Shedskin (http://shed-skin.blogspot.com/). It is a python to c++ compiler which can compile a whole program or generate extensions for cpython. It works right now and is very usable.
It works by restricting your coding style in a static way (not changing the type of variables at any time within your code), it performs a type inference analysis and generates equivalent c++ code.

As for your preferred way of doing it, you may want to check Boo (http://boo.codehaus.org/) or Cobra (http://cobra-language.com/). Both languages are very similar and are python-like languages for the .NET framework. They are static, but with type inference, allowing you to write code without declaring types.
