Blog | Taichi Docs

How I created the tranquil autumn air within 99 lines of Python code

May 16, 2022 · 10 min read

CEO @ Taichi Graphics | Ph.D MIT | Tsinghua Alumni

On a Sunday afternoon a couple of months ago, when Ye (https://github.com/k-ye) and I were on our way back from a long week of travel, we decided to do something relaxing on the train (to kill time). Since we happened to mention Minecraft and MagicaVoxel, we decided to hold a hackathon in which we would use Taichi Lang to create a GPU path tracing voxel renderer. Before we got home, we had our prototype:

Our first ray-tracing renderer prototype

Taichi Lang is embedded in Python, runs on any operating system, and interacts easily with the rest of the Python ecosystem. As far as I know, apart from Taichi Lang, there is no tooling in the Python ecosystem for generating GPU path-traced voxel renders. With Taichi Lang, one can easily create such a renderer (https://github.com/taichi-dev/voxel-challenge/blob/main/renderer.py) in around 300 lines of code.

Is Taichi Lang comparable to or even faster than CUDA?

May 6, 2022 · 5 min read

Haidong Lan

Ph.D High Performance Computing & Bioinformatics, Shandong University

In our recently published blog post, Taichi AOT, the solution for deploying kernels in mobile devices [1], we demonstrated how to deploy a gravity-based, interactive physical simulation on an Android mobile phone. As we know, the computing capability of mobile devices is limited by hardware and production costs and is barely satisfactory. The question then arises: Can Taichi Lang make better use of the underlying hardware than native, low-level programming languages? With this question in mind, we kick-started a benchmark project in an attempt to provide a comprehensive and accurate performance evaluation of Taichi Lang.

Taichi Lang is a domain-specific language (DSL) that can solve a great many numerical computing problems with just a few lines of code. We established a test set of frequently used algorithms and, on every benchmark, compared Taichi Lang with the top performer in the field (CUDA). To put it differently, we compared Taichi Lang's implementations of these algorithms with top-notch third-party implementations. The aim is to evaluate the effectiveness of the language's built-in optimization mechanisms and to look for room for improvement. Further, the comparison between Taichi Lang and CUDA runs on Nvidia GPUs to ensure that we evaluate Taichi Lang against highly optimized CUDA code.

Taichi AOT, the solution for deploying kernels in mobile devices

April 20, 2022 · 6 min read

Ye Kuang

CTO @ Taichi Graphics | Former SWE @ Google | Tsinghua Alumni

Physical simulation, which Taichi Lang is best at, has wide applications on mobile devices, such as real-time physically-based interactions in mobile games or cool visual effects in short videos. This is thanks to Taichi's features such as fast prototyping and cross-platform GPU acceleration.

However, Taichi is currently a language embedded in a Python frontend. Python is not an ideal language when it comes to deployment, because its heavy virtual machine design often makes it hard to embed in other host languages. Therefore, we've been constantly thinking about how to let Taichi's users enjoy both the rapid iteration of Python and seamless deployment in real industrial scenarios.

AST refactoring

April 13, 2022 · 15 min read

Lin Jiang

Intern @ Taichi Graphics | Tsinghua CS graduate

Simple is better than complex.

In the previous blog post, we quoted this sentence, which is part of the Zen of Python. In this post, we will show you how we simplified Taichi's code. Over the past few months, we have refactored Taichi's Abstract Syntax Tree (AST) transformer, and the error reporting experience has improved dramatically as a result. This post mainly explains why and how we did that.

Taichi & Torch 02: Data containers in Torch & Taichi

March 26, 2022 · 4 min read

Ailing Zhang

SWE @ Taichi Graphics | Former PyTorch dev @ Facebook | UIUC CS graduate

In this blog post I'll briefly talk about the data containers in Taichi and Torch. As you may already know, both Taichi and Torch have a core concept of a multi-dimensional array container, called taichi.field and torch.Tensor respectively. They share a lot in common with each other and with NumPy arrays (numpy.ndarray), so users might think they're exactly the same. Therefore, we want to share a few interesting differences in this post so that new users don't get confused by similar names or usages.

Taichi & Torch 01: Resemblances and Differences

March 18, 2022 · 5 min read

Ailing Zhang

SWE @ Taichi Graphics | Former PyTorch dev @ Facebook | UIUC CS graduate

"What is the advantage of Taichi over PyTorch or TensorFlow when running on GPU?" Not surprisingly, this is one of the top questions we've received in the Taichi user community. In this blog series we will walk you through some major concepts and components in Taichi and Torch, elaborating on the analogies and differences that might not be obvious at first sight.

Let’s start with a simple fact. Except for some minor intersections, Taichi and Torch target almost completely different users and applications. Torch is always your first choice for deep learning tasks like computer vision, natural language processing and so on. Taichi, on the other hand, specializes in parallel high-performance numerical computational tasks and really comes in handy when it comes to physics simulation and visual computing tasks.

From a high-level perspective, Taichi looks very similar to Torch in that both aim to lower the bar for their users. Compared to the static computation graph-based TensorFlow 1.0, Torch eager mode changes the game by building the graph on the fly as your Python program runs. Similarly, Taichi aims to enable more people to write high-performance parallel programs that used to require a lot of domain knowledge in CUDA, OpenGL, or Vulkan.

Head First Taichi: A Beginner's Guide to High Performance Computing in Python

October 12, 2021 · 15 min read

Dunfan Lu

SWE @ Facebook | Oxford CS graduate | Taichi Alumni

Ever since the Python programming language was born, its core philosophy has been to maximize the readability and simplicity of code. In fact, the pursuit of readability and simplicity runs so deep in Python's roots that if you type import this in a Python console, it will recite a little poem:

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
...
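
For the curious, the poem can also be read programmatically. The minimal sketch below relies on a CPython implementation detail rather than a documented API: the `this` module keeps the text ROT13-encoded in its `s` attribute. The variable names are illustrative.

```python
import codecs
import contextlib
import io

# Importing `this` prints the poem as a side effect on first import,
# so capture stdout to keep the console quiet.
with contextlib.redirect_stdout(io.StringIO()):
    import this

# CPython stores the poem ROT13-encoded in `this.s`; decoding recovers the text.
zen = codecs.decode(this.s, "rot13")
zen_lines = zen.splitlines()

# Pick out the aphorisms that mention simplicity or readability.
highlights = [
    line for line in zen_lines
    if "Simple" in line or "Readability" in line
]
```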

Simple is better than complex. Readability counts. No doubt, Python has indeed been quite successful at achieving these goals: it is by far the most friendly language to learn, and an average Python program is often 5-10 times shorter than equivalent C++ code. Unfortunately, there is a catch: Python's simplicity comes at the cost of reduced performance. In fact, it is almost never surprising for a Python program to be 10-100 times slower than its C++ counterpart. It thus appears that there is a perpetual trade-off between speed and simplicity, and no programming language shall ever possess both.

But don't you worry, all hope is not lost.