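By | James
Musk：Thank you for the online interview. I hope to have the chance to attend in person next year. I really enjoy coming to China. China always surprises me. China has many smart, hardworking people, China is full of positive energy, and people in China are full of hope for the future. I will make the future a reality, so I very much look forward to coming back.
Host：Hello, Elon. Even though you cannot be in Shanghai right now, it’s nice to have you at the 2020 World Artificial Intelligence Conference over video.
Musk：Thanks for having me. Yes, it is great to be here again. I look forward to attending in person in the future.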
Q：Great. Let’s get started with a couple of questions. First, in terms of Tesla products, we know that Autopilot is one of its most popular features. How does it work in China?
A：Tesla Autopilot does work reasonably well in China. It does not work quite as well in China as it does in the US, because most of our engineering is still in the US, so that tends to be where the optimization is focused. So Autopilot tends to work best in California because that is where the engineers are. And then once it works in California, we extend it to the rest of the world. But we are building up our engineering team in China. And so if you’re interested in working at Tesla China as an engineer, we would love to have you work there. That would be great.
I really want to emphasize that we are going to be doing a lot of original engineering in China. It’s not just converting stuff from America to work in China; we will be doing original design and engineering in China. So please do consider Tesla China if you’re thinking about working somewhere.
Q：Great. How confident are you that level five autonomy will eventually be with us? And when do you think we will reach full level five autonomy?
A：I’m extremely confident that level five, or essentially complete autonomy, will happen, and I think it will happen very quickly.
I think at Tesla, I feel like we are very close to level five autonomy. I remain confident that we will have the basic functionality for level five autonomy complete this year. The thing to appreciate for level five autonomy is: what level of safety is acceptable for public streets relative to human safety? So is it enough to be twice as safe as humans? I do not think that the regulators will accept equivalent safety to humans.
So the question is, will the requirement be twice as safe, three times as safe, five times as safe, 10 times as safe? So you can really think of level five autonomy as kind of like a march of 9s. Do you have 99.99% safety? 99.99999%? How many 9s do you want? What is the acceptable level? And then what amount of data is required to convince regulators that it is sufficiently safe? Those are the actual in-depth questions, I think, to be asking about level five autonomy. That it will happen is a certainty.
So yes, I think there are no fundamental challenges remaining for level five autonomy. There are many small problems. And then there’s the challenge of solving all those small problems, putting the whole system together, and continuing to address the long tail of problems. So you’ll find that you’re able to handle the vast majority of situations. But then there will be something very odd. And then you have to have the system figure it out and train to deal with these very odd situations. This is why you need real-world situations. Nothing is more complex and weird than the real world. Any simulation we create is necessarily a subset of the complexity of the real world.
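The "march of 9s" can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes each trial (for example, one mile or one trip) is independent, and asks how many failure-free trials would be needed before one could claim, at a given confidence, that the failure rate is below the target. The function names and the independent-trial framing are assumptions made for this example, not anything stated in the interview or specified by a regulator.

```python
import math

def failure_rate(nines: int) -> float:
    """Failure probability per trial for a reliability with `nines` nines.

    Example: 4 nines means 99.99% success, i.e. a failure rate of 1e-4.
    """
    return 10.0 ** (-nines)

def trials_needed(nines: int, confidence: float = 0.95) -> int:
    """Failure-free trials needed to support a failure rate below 10**-nines.

    Assumes independent trials and solves (1 - p)**n <= 1 - confidence for n.
    """
    p = failure_rate(nines)
    # log1p(-p) computes log(1 - p) accurately even when p is tiny.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p))

for n in range(2, 8):
    print(f"{n} nines: failure rate {failure_rate(n):.0e}, "
          f"about {trials_needed(n):,} failure-free trials")
```

Under these assumptions, each additional 9 multiplies the required evidence by roughly ten, which is one way to see why the amount of data matters as much as the safety level itself.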
So we are really deeply enmeshed in dealing with the tiny details of level five autonomy. But I’m absolutely confident that this can be accomplished with the hardware that is in Tesla today, and simply by making software improvements, we can achieve level five autonomy.
Q：Great. If we look at the three building blocks of AI and robotics: perception, cognition, and action, how would you assess the progress respectively so far?
A：I am not sure I totally agree with dividing it into those categories: perception, cognition, and action. But if you do use those categories, I’d say that in perception, if you mean the recognition of objects, we’ve made incredible progress. In fact, I think it would probably be fair to say that advanced image recognition systems today are better than almost any human, even in an expert field.
So it is really a question of how much compute power, how many computers were required to train it? How many compute hours? What was the efficiency of the image training system? But in terms of image recognition or sound recognition, and really any signal, generally speaking any byte stream: can an AI system recognize things accurately with a given byte stream? Extremely well.
Cognition. This is probably the weakest area. Do you understand concepts? Are you able to reason effectively? And can you be creative in a way that makes sense? You have so many advanced AIs that are very creative, but they do not curate their creative actions very well. We look at it and it is not quite right. It will become right, though.
And then action, with things like games as maybe part of the action category. Obviously at this point, AI will be superhuman at any game with an understandable set of rules, essentially any game below a certain degree-of-freedom level. Let us say at this point you would be hard-pressed to think of a game where, if enough attention were paid to it, we could not make a superhuman AI that could play it. That’s not even taking into account the faster reaction time of AI.
Q：In what ways does Autopilot stimulate the development of AI algorithms and chips? And how does it refresh our understanding of AI technology?
A：In developing AI chips for Autopilot, what we found was that there was no system on the market that was capable of doing inference within a reasonable cost or power budget. So if we had gone with conventional GPUs, CPUs and that kind of thing, we would have needed several hundred watts, and we would have needed to fill up the trunk with computers and GPUs and a big cooling system. It would have been costly and bulky and would have consumed too much power, which is important for range in an electric car.
So we developed our own AI chip, the Tesla Full Self-Driving computer, with dual systems-on-chip and eight-bit accelerators for doing the dot products. I think probably a lot of people in this audience are aware of it. But AI consists of doing a great many dot products. If you know what a dot product is, it’s just a lot of dot products, which effectively means that our brain must be doing a lot of dot products. We still actually haven’t fully explored the power of the Tesla Full Self-Driving computer. In fact, we only turned on the second system-on-chip a few months ago. So making full use of the Tesla Full Self-Driving computer will probably take us at least another year or so.
Then we also have the Tesla Dojo system, which is a training system. And that’s intended to be able to process vast amounts of video data to improve the training for the AI system. The Dojo system is like an FP16 training system, and it is primarily constrained by heat and by communication between the chips. We are developing new buses and new heat rejection or cooling systems that enable a very high operation computer that will be able to process video data effectively.
How do we see the evolution of AI algorithms? I’m not sure of the best way to understand it, except that what a neural net seems to mostly do is take a massive amount of information from reality, primarily passive optical, and create a vector space, essentially compressing a massive amount of photons into a vector space. I was just thinking, actually, on the drive this morning: have you tried accessing the vector space in your mind? We normally take reality for granted in a kind of analog way. But you can actually access the vector space in your mind and understand what your mind is doing to take in all the world data. What we are actually doing is trying to remember the least amount of information possible.
So it’s taking a massive amount of information, filtering it down, and saying what is relevant. And then how do you create a vector space world that is a very tiny percentage of that original data? Based on that vector space representation, you make decisions. It is really like compression and decompression going on at a massive scale, which is kind of what physics is like. You can think of physics algorithms as essentially compression algorithms for reality.
That is what physics does. Those physics formulas are compression algorithms for reality, which may sound very obvious. But if you simplify what it means, we are the proof points of this. If you simply ran a true physics simulation of the universe, it would take a lot of compute, but given enough time, eventually you would have sentience. The proof of that is us. And if you believe in physics and the arc of the universe, it started out as sort of quarks and electrons. And there was hydrogen for quite a while, and then helium and lithium. And then there were supernovas, the heavy elements formed, and billions of years later some of those heavy elements learned to talk. We are essentially evolved hydrogen. If you just leave hydrogen out for a while, it turns into us. I think people don’t quite appreciate this. So if you ask, where does the specialness come in? Where does sentience come in? Either the whole universe is sentient and special, or nothing is. Or you could ask, at what point from hydrogen to us did it become sentient?
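To illustrate what "eight-bit accelerators for doing the dot products" could mean in practice, here is a minimal sketch of a quantized dot product. The symmetric fixed-scale quantization scheme, the function names, and the sample numbers are all assumptions chosen for illustration; the interview does not describe how Tesla's hardware quantizes or accumulates values.

```python
def quantize(values, scale):
    """Map floats to signed 8-bit integers in [-128, 127] using a fixed scale."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def int8_dot(a_q, b_q):
    """Integer dot product of two quantized vectors.

    Python integers do not overflow, so this stands in for the wide
    accumulator that integer hardware typically uses.
    """
    return sum(x * y for x, y in zip(a_q, b_q))

def quantized_dot(a, b, scale_a, scale_b):
    """Approximate the float dot product of a and b with 8-bit operands."""
    return int8_dot(quantize(a, scale_a), quantize(b, scale_b)) * scale_a * scale_b

# Hypothetical weights and inputs, purely for demonstration.
weights = [0.5, -0.25, 0.75, 0.1]
inputs = [1.0, 2.0, -1.5, 0.3]

exact = sum(w * x for w, x in zip(weights, inputs))
approx = quantized_dot(weights, inputs, scale_a=1 / 127, scale_b=2 / 127)
print(f"float dot product: {exact:.4f}, 8-bit approximation: {approx:.4f}")
```

The point of the sketch is the trade-off: the multiplications happen on small integers, which are cheap in silicon and power, at the cost of a small rounding error in the result.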
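As a toy illustration of "filter it down, then decide on the compact representation", the sketch below uses simple block averaging as a stand-in for lossy compression. This is not how a neural network builds its representation; the signal, block size, threshold, and function names are all hypothetical and chosen only to show the shape of the idea.

```python
def compress(signal, block):
    """Lossy compression: replace each block of samples with its mean."""
    chunks = [signal[i:i + block] for i in range(0, len(signal), block)]
    return [sum(chunk) / len(chunk) for chunk in chunks]

def decompress(summary, block, length):
    """Approximate reconstruction: repeat each mean across its block."""
    out = []
    for value in summary:
        out.extend([value] * block)
    return out[:length]

def something_bright(summary, threshold=0.5):
    """Make a decision using only the compact representation."""
    return any(value > threshold for value in summary)

# Hypothetical 1,000-sample brightness signal with one bright region.
signal = [0.1] * 600 + [0.9] * 200 + [0.1] * 200
summary = compress(signal, block=100)
restored = decompress(summary, block=100, length=len(signal))

print(f"kept {len(summary)} of {len(signal)} values")
print("bright region detected:", something_bright(summary))
```

Here the decision is made from one percent of the original values, and the reconstruction is only an approximation of the input, which is the sense in which the compression is lossy.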
Q：Great. Our last question, congratulations on an incredible year so far at Tesla. How are things going at Gigafactory Shanghai? Is there any application of AI to manufacturing specifically at Giga Shanghai?
A：Thank you. Things are going really well at Giga Shanghai. I’m incredibly proud of the Tesla team. They’re doing an amazing job. And I look forward to visiting Giga Shanghai as soon as possible. It’s really impressive work that’s been done. I really can’t say enough good things. Thank you to the Tesla China team.
We expect over time to use more AI and essentially smarter software in our factory. But I think it will take a while to really employ AI effectively in a factory situation. You can think of a factory as a complex, cybernetic collective involving humans and machines. This is actually how all companies are, really, but especially manufacturing companies, where the robot component is much higher. So the interesting thing about this is that I think over time there will be both more jobs and having a job will be optional.
One of the false premises people sometimes have about economics is that there’s a finite number of jobs. There is definitely not a finite number of jobs. An obvious, reductive example: if the population increased tenfold in a century and there were a finite number of jobs, then 90% of people would be unemployed. Or think of the transition from an agrarian to an industrial society. In an agrarian society, 90% of people or more would be working on farms. Now we have 2% or 3% of people working on farms. So at least in the short to medium term, my biggest concern about growth is being able to find enough humans. That is the biggest constraint on growth.
Host：Thanks again, Elon, for your time and for joining us at this year’s World Artificial Intelligence Conference. We hope to see you next year in person.
Musk：Thank you for having me in virtual form. I look forward to visiting physically next year, and I always enjoy visiting China. I am always amazed by how many smart, hardworking people there are in China, by how much positive energy there is, and that people are really excited about the future. I want to make things happen. I cannot wait to be back.