Why is Java's HashMap much slower than Python's dictionary?
If you run the test using the Java and Python code the asker posted, with default settings in a typical environment, you will indeed find that the Python version is faster than the Java version. This is a genuine Java performance pitfall. There has never been a rule that "Java should be faster than Python in theory". Rather, once you understand the common performance pitfalls and best practices, a pure Java program can outperform a pure Python program on large-scale workloads and tends to scale better.

On the other hand, in this example Java can be made to run faster than Python without changing any code, just by slightly adjusting Java's startup parameters. I also suspect that this small example does not fully reflect the performance problem of the Java program that takes more than 3 hours to run. That program might speed up considerably with only a small change to the startup parameters, or it might need code changes to eliminate some bad practices.

So what exactly makes Java so much slower than Python here? Is HashMap not as well written as dict? Is it because HashMap is implemented in Java while dict is implemented in C? Is it the cost of JIT warm-up? Or is it something else?

Assuming the asker ran Java and Python in a fairly typical environment, such as Oracle JDK versus stock CPython, the discussion on the Java side is about the performance of the HotSpot VM in Oracle JDK.

The answer: the asker did not tune the GC parameters, and HotSpot VM's default GC parameters are a poor fit for this workload, so the Java version performs badly under the defaults. Java's HashMap itself is not slow, especially after JIT compilation. It is not slower than CPython's dict, or at the very least it is not the cause of the performance difference in this example.
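One way to check this claim for yourself is to measure how much of the run time goes to garbage collection rather than to HashMap operations. The following is a minimal sketch, not the asker's original benchmark: the class name `HashMapGcProbe`, the helper methods, and the default entry count are all made up for illustration. It uses the standard `GarbageCollectorMXBean` API to read accumulated GC time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

// Hypothetical probe class: fills a HashMap and reports wall time vs. GC time.
public class HashMapGcProbe {

    /** Sums the accumulated collection time (ms) over all collectors that report one. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fills a HashMap with n boxed Integer entries; every put allocates objects. */
    public static Map<Integer, Integer> fill(int n) {
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(i, i);
        }
        return map;
    }

    public static void main(String[] args) {
        // Entry count is an assumption; pass a larger value to stress the heap more.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;

        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        Map<Integer, Integer> map = fill(n);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        long gcMs = totalGcMillis() - gcBefore;

        System.out.println("entries=" + map.size()
                + " elapsedMs=" + elapsedMs
                + " gcMs=" + gcMs);
    }
}
```

Run it once with the defaults (`java HashMapGcProbe 10000000`) and once with a larger fixed heap, for example `java -Xms2g -Xmx2g HashMapGcProbe 10000000`. The heap sizes here are illustrative assumptions, not the values from the original answer. If `gcMs` accounts for a large share of `elapsedMs` under the defaults and shrinks with the larger heap, that would be consistent with GC settings, rather than HashMap, being the bottleneck. Adding `-verbose:gc` prints each collection as it happens.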